What are some ways to increase the speed of Tabby?

Is increasing the --parallelism flag helpful?


Sam Tenenholz

Asked on Dec 19, 2023

Increasing the --parallelism flag can raise Tabby's throughput when several requests arrive at the same time, but it will not reduce latency in a single-user setup, where there is usually only one request in flight. To reduce latency, try a smaller model or run on faster hardware such as a GPU.
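
To make the throughput vs. latency distinction concrete, here is a rough toy model in Python. It is not Tabby code: the `estimate` function, the fixed 0.5 second per-request time, and the assumption that parallel requests do not slow each other down are all made up for illustration. Latency here means the time to serve one request once it starts, ignoring any time spent waiting in a queue.

```python
def estimate(per_request_seconds, parallelism, concurrent_users):
    """Toy model (hypothetical, not Tabby's actual scheduler).

    Assumes every request takes the same fixed time and that up to
    `parallelism` requests can run at once without slowing each other down.
    Returns (latency_seconds, throughput_requests_per_second).
    """
    # Only as many requests run at once as there are users sending them.
    in_flight = min(parallelism, concurrent_users)
    # Serving a single request takes the same time regardless of parallelism.
    latency = per_request_seconds
    throughput = in_flight / per_request_seconds
    return latency, throughput


if __name__ == "__main__":
    for users in (1, 4):
        for parallelism in (1, 4):
            latency, throughput = estimate(0.5, parallelism, users)
            print(
                f"users={users} parallelism={parallelism} "
                f"latency={latency}s throughput={throughput} req/s"
            )
```

In this model, a single user sees the same latency and throughput at any parallelism setting, while four concurrent users get higher throughput as parallelism goes up. Lowering `per_request_seconds`, which is what a smaller model or a GPU would roughly correspond to, is the only thing that reduces latency.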

