How to Enable ROCm in Docker for TabbyML?
I'm trying to run TabbyML with ROCm support inside a Docker container, but it seems to be running on the CPU regardless of the --device argument I pass. Here's the command I used:
docker run --device /dev/dri/card0 -it -p 8080:8080 -v $HOME/.tabby:/data tabbyml/tabby-rocm:0.1 serve --model TabbyML/DeepseekCoder-1.3B --device rocm
And here's part of the Dockerfile I used to build the image:
ARG UBUNTU_VERSION=22.04
ARG CUDA_VERSION=11.7.1
FROM ghcr.io/cromefire/hipblas-manylinux/2014/5.7:latest as runtime
COPY ./tabby_x86_64-manylinux2014-rocm57 .
ARG TARGETARCH
ENV TABBY_ROOT=/data
ENTRYPOINT ["./tabby_x86_64-manylinux2014-rocm57"]
I've also tried variations of the --device flag and checked the output of rocminfo inside the container, which seems to recognize my GPU. Any insights on how to properly enable ROCm in the container would be appreciated.
John LaRocque
Asked on Dec 21, 2023
I finally managed to run TabbyML with ROCm support in a Docker container. I started with a fresh container from rocm/rocm-terminal (Ubuntu 20.04), installed the necessary dependencies, and built Tabby. It's now running inference on my GPU. Here's a rough outline of the steps I took, though this isn't a tested Dockerfile:
FROM rocm/rocm-terminal
# Possibly required at both build and serve time (unverified).
# 10.3.0 tells ROCm to treat the GPU as gfx1030; adjust for your card.
ENV HSA_OVERRIDE_GFX_VERSION=10.3.0
WORKDIR /home/rocm-user
RUN git clone --recurse-submodules https://github.com/TabbyML/tabby
# -y is needed because docker build cannot answer apt's confirmation prompt
RUN sudo apt-get update && \
    sudo apt-get install -y cargo pkg-config libssl-dev protobuf-compiler hipblas
RUN sudo ln -s /opt/rocm/lib/libamdhip64.so /usr/lib64 && \
    sudo ln -s /opt/rocm/lib/libhipblas.so /usr/lib64 && \
    sudo ln -s /opt/rocm/lib/librocblas.so /usr/lib64
WORKDIR /home/rocm-user/tabby
RUN cargo build --features rocm --release --package tabby
ENTRYPOINT ["target/release/tabby", "serve", "--model", "/data/models/TabbyML/DeepseekCoder-1.3B/", "--device", "rocm"]
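The HSA_OVERRIDE_GFX_VERSION value in the outline above depends on the GPU, so it may help to see which gfx target rocminfo reports before picking one. Here is a minimal sketch, assuming rocminfo prints agent names such as gfx1031; list_gfx_targets is a hypothetical helper name, not part of ROCm or Tabby:

```shell
#!/bin/sh
# list_gfx_targets is a hypothetical helper (not part of ROCm or Tabby).
# It reads rocminfo output on stdin and prints each distinct gfx target once.
list_gfx_targets() {
    grep -o 'gfx[0-9a-f][0-9a-f]*' | sort -u
}
```

Inside the container this would be used as `rocminfo | list_gfx_targets`. If the reported target is not one ROCm supports directly, an override like the 10.3.0 above is one thing to try.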
Additionally, I used the following docker run command:
sudo docker run -it --device=/dev/kfd --device=/dev/dri --security-opt seccomp=unconfined --group-add video -p 8080:8080 -v $HOME/.tabby:/data rocm/rocm-terminal
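The command above passes both /dev/kfd and the whole /dev/dri directory to the container. A quick way to confirm they actually arrived is to check for them from inside the container. This is a minimal sketch; check_rocm_devices is a hypothetical helper, and the optional root-prefix argument exists only so the function can be exercised against a scratch directory:

```shell
#!/bin/sh
# check_rocm_devices is a hypothetical helper (not part of ROCm or Tabby).
# It reports whether the device nodes passed via --device are visible.
# Optional argument: a root prefix; empty means the real filesystem root.
check_rocm_devices() {
    root="${1:-}"
    status=0
    for node in /dev/kfd /dev/dri; do
        if [ -e "${root}${node}" ]; then
            echo "found: ${node}"
        else
            echo "missing: ${node}"
            status=1
        fi
    done
    # Non-zero exit status if any node is absent.
    return "$status"
}
```

Calling `check_rocm_devices` with no argument inside the container prints one line per node and returns non-zero if either is missing.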
It seems that starting with the rocm/rocm-terminal image and setting up the environment correctly was key to getting ROCm to work properly in the container.
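The docker run command above starts the base rocm/rocm-terminal image. If the Dockerfile outline is built into its own image, the same flags should carry over. Here is a rough sketch, untested like the outline itself. It assumes a hypothetical local tag tabby-rocm-local, that the Dockerfile sits in the current directory, and that the model has already been downloaded under $HOME/.tabby/models:

```shell
#!/bin/sh
# IMAGE_TAG is a hypothetical local tag; any name works.
IMAGE_TAG="${IMAGE_TAG:-tabby-rocm-local}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_and_serve() {
    # Build from the directory that holds the Dockerfile.
    run docker build -t "$IMAGE_TAG" . || return 1
    # Same device, security, group, port, and volume flags as the
    # docker run command above, pointed at the locally built image.
    run docker run -it \
        --device=/dev/kfd --device=/dev/dri \
        --security-opt seccomp=unconfined --group-add video \
        -p 8080:8080 -v "$HOME/.tabby:/data" \
        "$IMAGE_TAG"
}
```

Running `DRY_RUN=1 build_and_serve` prints the two docker commands without executing them, which is handy for checking the flags first. Prefix the docker calls with sudo if your user is not in the docker group.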