
What hardware setup is recommended for serving CodeLlama-70B models in TabbyML?

I am looking to integrate CodeLlama-70B with TabbyML for code completion. What hardware setup is recommended for serving it?

Merxhan Bajrami

Asked on May 08, 2024

  • For serving CodeLlama-70B models in TabbyML, an 8×A100 or 8×H100 setup is recommended, combined with a dedicated model-serving backend (e.g., vLLM).
  • See the GPU recommendations for various models: TabbyML GPU Recommendations
  • For a worked setup, refer to: CodeLlama-70B Model Serving Example
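As a sketch of the vLLM-backed setup described above: the command below launches vLLM's OpenAI-compatible server with the model sharded across 8 GPUs via tensor parallelism. The Hugging Face model id shown is an assumption for illustration; adjust it and the parallelism degree for your deployment.

```shell
# Serve CodeLlama-70B with vLLM, sharded across 8 GPUs (A100/H100).
# "codellama/CodeLlama-70b-hf" is an assumed model id; substitute your own.
python -m vllm.entrypoints.openai.api_server \
    --model codellama/CodeLlama-70b-hf \
    --tensor-parallel-size 8
```

TabbyML can then be pointed at this endpoint as its completion backend; see the linked serving example for the exact TabbyML-side configuration.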
Answered on May 09, 2024