How can I run models at half precision with --compute-type deprecated?
I see that --compute-type was deprecated, and I'm looking for a way to run models at half precision.
Asked on Apr 30, 2024
- Check the MODEL_SPEC.md file in the TabbyML repository.
- Look there for the option to place a quantized GGUF file in the model directory.
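
To make the second step concrete, here is a rough sketch of staging a quantized GGUF file into a model directory. The function name `stage_gguf_model`, the file names `ggml/model.gguf` and `tabby.json`, and the `prompt_template` field are all assumptions for illustration, not something confirmed in this thread; MODEL_SPEC.md is the authority on the real layout, so check it before copying anything.

```python
import json
import shutil
from pathlib import Path


def stage_gguf_model(gguf_file: Path, model_dir: Path, prompt_template: str) -> Path:
    """Copy a quantized GGUF file into a model directory (hypothetical layout)."""
    # ASSUMPTION: weights live at <model_dir>/ggml/model.gguf.
    # Verify the expected path and file name against MODEL_SPEC.md.
    weights_dir = model_dir / "ggml"
    weights_dir.mkdir(parents=True, exist_ok=True)
    target = weights_dir / "model.gguf"
    shutil.copyfile(gguf_file, target)

    # ASSUMPTION: a tabby.json metadata file with a prompt_template field
    # sits next to the weights directory. Field names may differ in the spec.
    metadata = {"prompt_template": prompt_template}
    (model_dir / "tabby.json").write_text(json.dumps(metadata, indent=2))
    return target
```

With a layout like this, the precision is decided by which quantized file you stage rather than by a command-line flag, which is presumably why the answer points at the model directory instead of a replacement for --compute-type.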
Apr 30, 2024 (edited)