How can I optimize Metal usage when running Tabby on a Mac M1?
I'm running Tabby on my Mac M1 with the --device metal option, but GPU usage never goes above 8% and each result takes 20-30 seconds. Is there any optimization I could apply to improve this?