Does using the --model and --chat-model flags at the same time load both models into the GPU simultaneously?
RiQuY is asking whether using the --model and --chat-model flags simultaneously will load both models into the GPU at the same time, causing the VRAM costs of the two models to add up.
RiQuY
Asked on Apr 10, 2024
Yes
Answered on Apr 10, 2024
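For context, a minimal sketch of the kind of invocation being discussed; the thread only shows the two flags, so the serve command and model names below are placeholders, not taken from the question or answer:

    # hypothetical invocation; only --model and --chat-model appear in the thread
    <serve-command> --model <completion-model-id> --chat-model <chat-model-id>

With both flags set, both models are kept resident in GPU memory, so total VRAM usage is roughly the sum of what each model needs on its own.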