
Does using the --model and --chat-model flags at the same time load both models into the GPU simultaneously?

RiQuY is asking whether using the --model and --chat-model flags simultaneously will load both models into the GPU at the same time, so that the VRAM cost of each model adds up.


RiQuY

Asked on Apr 10, 2024

Yes, both models are loaded into the GPU at the same time, so their VRAM requirements add up (see the sketch below).

Apr 10, 2024
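A minimal back-of-the-envelope sketch of why the VRAM adds up when both flags are used, assuming hypothetical model sizes; the real numbers depend on the specific models, quantization, and runtime overhead:

```python
# Rough estimate of combined GPU memory when serving a completion model
# (--model) and a chat model (--chat-model) together.
# All sizes below are hypothetical placeholders, not measurements.

completion_model_vram_gb = 2.0   # e.g. a small code-completion model
chat_model_vram_gb = 8.0         # e.g. a larger instruct/chat model
runtime_overhead_gb = 1.0        # KV cache, CUDA context, etc. (rough guess)

total_vram_gb = completion_model_vram_gb + chat_model_vram_gb + runtime_overhead_gb
print(f"Estimated VRAM needed: ~{total_vram_gb:.1f} GB")
```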