Peter Ahlers
Asked on Nov 06, 2023
Yes, it is possible to use the codellama/CodeLlama-34b-hf
model. However, for the code completion use case, the model may be too large to meet latency requirements. If you still want to use it, you can refer to the MODEL_SPEC.md
file in the https://github.com/TabbyML/tabby repository to learn how to load a model from a local directory. You can acquire the model file itself from https://huggingface.co/TheBloke/CodeLlama-34B-GGUF/blob/main/codellama-34b.Q8_0.gguf.
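As a rough sketch of what that setup could look like, the Python script below downloads the GGUF file and arranges it into a local model directory. The directory layout (a ggml/ subdirectory for the weights plus a tabby.json with a prompt_template) is my reading of MODEL_SPEC.md; the exact weights filename and the right prompt template vary by Tabby version and model, so verify both against that file before relying on this.

```python
import json
import urllib.request
from pathlib import Path

# Hypothetical local model directory; adjust the path to taste.
model_dir = Path("models/CodeLlama-34B")
ggml_dir = model_dir / "ggml"
ggml_dir.mkdir(parents=True, exist_ok=True)

# Direct-download URL for the quantized weights (~35 GB, so this takes a while).
url = (
    "https://huggingface.co/TheBloke/CodeLlama-34B-GGUF"
    "/resolve/main/codellama-34b.Q8_0.gguf"
)
# Assumption: MODEL_SPEC.md expects the weights under ggml/ with a
# version-specific filename (e.g. q8_0.v2.gguf); check the spec first.
urllib.request.urlretrieve(url, ggml_dir / "q8_0.v2.gguf")

# Assumption: a minimal tabby.json carrying only a prompt_template.
# "{prefix}" is a plain-completion placeholder; MODEL_SPEC.md documents
# the template format, including the fill-in-the-middle variants.
(model_dir / "tabby.json").write_text(
    json.dumps({"prompt_template": "{prefix}"}, indent=2)
)
```

With the directory in place, you would then point Tabby at it, e.g. `tabby serve --model ./models/CodeLlama-34B` (again, confirm the exact invocation against the Tabby docs for your version).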