What hardware setup is recommended for serving CodeLlama-70B models in TabbyML?
I am looking to integrate CodeLlama-70B with TabbyML for code completion. What kind of hardware setup is recommended for serving CodeLlama-70B models in TabbyML?
Merxhan Bajrami
Asked on May 08, 2024
- For serving CodeLlama-70B models in TabbyML, an 8x A100 / H100 setup is recommended, with the model served through a dedicated inference backend (e.g., vLLM).
- Check the following link for GPU recommendations for various models: TabbyML GPU Recommendations
- Refer to this example for more details: CodeLlama-70B Model Serving Example
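As a rough sketch of such a setup (the command, flags, and model ID below come from vLLM's public OpenAI-compatible server, not from the answer above, and should be adapted to your deployment):

```shell
# Sketch: launch vLLM's OpenAI-compatible server for CodeLlama-70B,
# sharding the model across 8 GPUs via tensor parallelism.
# Model ID and flags are illustrative; requires an 8-GPU host.
python -m vllm.entrypoints.openai.api_server \
  --model codellama/CodeLlama-70b-hf \
  --tensor-parallel-size 8 \
  --port 8000
```

TabbyML can then be pointed at the resulting HTTP endpoint, as shown in the model serving example linked above.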
Answered on May 09, 2024