Tabby Community - Is the slow response and timeout issue on the NVIDIA A2000 GPU caused by the GPU or by other factors?

The user sees slow responses and timeouts when running the TabbyML/StarCoder-3B model on an NVIDIA A2000 GPU, while another GPU (NVIDIA RTX 3060) does not show the same problem. They also noticed that the Docker container's CPU usage stays at 100% after a request times out. Is this caused by the GPU, or by other factors?

yuhui pang

Asked on Mar 13, 2024

The slow responses and timeouts on the NVIDIA A2000 GPU may be related to the GPU's age and its VRAM limits.
Docker CPU usage staying at 100% after a request times out suggests that resources are not being released, or that the service is stuck in an infinite loop.
The recommended next step is to verify the service's cancellation logic and to check the service code for resource leaks or inefficiencies.
Meng Zhang suggested revisiting the issue in 2-3 weeks because of bandwidth constraints.
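
To make the point about cancellation logic concrete, here is a minimal sketch of what "work stops and resources are released when a request times out" can look like. It is not Tabby's actual implementation: `generate`, `request_with_timeout`, and the `state` dictionary are hypothetical names used only for illustration, and the short sleep stands in for one decoding step.

```python
import asyncio


async def generate(tokens: int, state: dict) -> str:
    """Hypothetical stand-in for a completion request (not Tabby's code)."""
    try:
        for i in range(tokens):
            # Awaiting here gives the event loop a chance to deliver a
            # cancellation; a loop that never awaits would keep the CPU busy.
            await asyncio.sleep(0.01)
            state["steps"] = i + 1
        return "done"
    finally:
        # Runs on normal completion and on cancellation alike, so this is
        # where buffers, slots, or handles would be released.
        state["released"] = True


async def request_with_timeout(tokens: int, timeout_s: float) -> dict:
    """Run one request and record whether it timed out and cleaned up."""
    state = {"steps": 0, "released": False, "timed_out": False}
    try:
        # wait_for cancels the inner coroutine when the timeout expires.
        await asyncio.wait_for(generate(tokens, state), timeout=timeout_s)
    except asyncio.TimeoutError:
        state["timed_out"] = True
    return state


if __name__ == "__main__":
    print(asyncio.run(request_with_timeout(tokens=1000, timeout_s=0.05)))
```

If the work loop never reaches a point where cancellation can be observed, the request can time out on the client side while the server keeps computing, which would match the symptom of CPU usage staying at 100%.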

Mar 14, 2024