Are the slow responses and timeouts on an NVIDIA A2000 GPU caused by the GPU or by other factors?

The user sees slow responses and request timeouts when running the TabbyML/StarCoder-3B model on an NVIDIA A2000 GPU, while another GPU (an NVIDIA RTX 3060) does not show the same problem. They also noticed that Docker CPU usage stays at 100% when a request times out. Is this caused by the GPU or by other factors?

yuhui pang

Asked on Mar 13, 2024

  • The slow responses and timeouts on the NVIDIA A2000 could be related to the GPU's age and its limited VRAM.
  • Docker CPU usage staying at 100% after a request times out may indicate that resources are not being released, or that the service is stuck in an infinite loop.
  • It is recommended to verify the request cancellation logic and to check the service code for resource leaks or inefficiencies.
  • Meng Zhang suggested revisiting the issue in 2-3 weeks because of bandwidth constraints.
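
To make the cancellation point concrete, here is a minimal Python sketch of the pattern worth checking for. It is not Tabby's actual code: `generate_tokens`, `handle_request`, and the `max_tokens` and `timeout_s` parameters are hypothetical names invented for illustration. The sketch assumes the service runs a blocking generation loop in a worker thread. Under that assumption, timing out the request only stops the caller from waiting; the worker keeps running (and keeps a CPU core busy) unless it is explicitly told to stop.

```python
import asyncio
import threading
import time


def generate_tokens(stop: threading.Event, max_tokens: int = 1_000_000) -> int:
    """Hypothetical stand-in for a blocking token-generation loop.

    The loop checks the stop flag on every step, so it can exit as soon
    as the request is abandoned instead of running to max_tokens.
    """
    produced = 0
    while produced < max_tokens and not stop.is_set():
        time.sleep(0.001)  # placeholder for one decoding step
        produced += 1
    return produced


async def handle_request(timeout_s: float, max_tokens: int = 1_000_000) -> str:
    """Hypothetical request handler that enforces a timeout on generation."""
    stop = threading.Event()
    loop = asyncio.get_running_loop()
    # Run the blocking loop in a worker thread so the event loop stays responsive.
    future = loop.run_in_executor(None, generate_tokens, stop, max_tokens)
    try:
        # shield() keeps wait_for from cancelling the underlying future,
        # so it can still be awaited after the timeout fires.
        produced = await asyncio.wait_for(asyncio.shield(future), timeout=timeout_s)
        return f"completed with {produced} tokens"
    except asyncio.TimeoutError:
        # The timeout alone does not stop the worker thread.
        # Without this flag the loop would keep running after the client gave up.
        stop.set()
        produced = await future
        return f"timed out after {produced} tokens, worker stopped"


if __name__ == "__main__":
    print(asyncio.run(handle_request(timeout_s=0.05)))
```

If the service has no equivalent of `stop.set()` on its timeout path, that would be consistent with the CPU staying at 100% after a timeout, regardless of which GPU is installed.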
Mar 14, 2024 (edited)