I would like to host Tabby on my own server but I do not want to buy an expensive GPU. Can I run Tabby with just the CPU?
Claudio Piguet
Asked on Jan 15, 2024
Unfortunately, CPU inference for LLMs is generally slow, so running Tabby on a CPU alone is not ideal for everyday development.
A CPU-only setup can work for a proof-of-concept tryout, but completion latency will likely be too high for interactive use. If you don't want to buy a GPU, it is recommended to use a cloud hosting solution like Hugging Face or Modal Labs, which provide GPU resources for running machine learning models. If you still want to try CPU-only hosting first, see the sketch below.
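For a quick CPU-only trial, a Docker-based run is the simplest path. The command below is a minimal sketch, assuming the official tabbyml/tabby image and a small model such as TabbyML/StarCoder-1B (about the largest you'd want on CPU); verify the flags and model names against the current Tabby documentation before relying on them.

```bash
# Minimal CPU-only trial run of Tabby via Docker.
# Note the absence of a --gpus flag; assumes the tabbyml/tabby image
# and the TabbyML/StarCoder-1B model (check the Tabby docs for
# current flag names and available models).
docker run -it \
  -p 8080:8080 \
  -v $HOME/.tabby:/data \
  tabbyml/tabby serve \
  --model TabbyML/StarCoder-1B \
  --device cpu
```

Sticking to a small model keeps per-token latency down, but even then, expect completions to take noticeably longer than on a GPU.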