How to Enable ROCm in Docker for TabbyML?

I'm trying to run TabbyML with ROCm support inside a Docker container, but it seems to be running on the CPU regardless of the --device argument I pass. Here's the command I used:

docker run --device /dev/dri/card0 -it -p 8080:8080 -v $HOME/.tabby:/data tabbyml/tabby-rocm:0.1 serve --model TabbyML/DeepseekCoder-1.3B --device rocm

And here's part of the Dockerfile I used to build the image:

ARG UBUNTU_VERSION=22.04
ARG CUDA_VERSION=11.7.1
FROM ghcr.io/cromefire/hipblas-manylinux/2014/5.7:latest as runtime
COPY ./tabby_x86_64-manylinux2014-rocm57 .
ARG TARGETARCH
ENV TABBY_ROOT=/data
ENTRYPOINT ["./tabby_x86_64-manylinux2014-rocm57"]

I've also tried variations of the --device flag and checked the output of rocminfo inside the container, which seems to recognize my GPU. Any insights on how to properly enable ROCm in the container would be appreciated.

John LaRocque

Asked on Dec 21, 2023

I finally managed to run TabbyML with ROCm support in a Docker container. I started with a fresh container from rocm/rocm-terminal (Ubuntu 20.04), installed the necessary dependencies, and built Tabby. It's now running inference on my GPU. Here's a rough outline of the steps I took, though this isn't a tested Dockerfile:

FROM rocm/rocm-terminal

# var required for both build and serve ???
ENV HSA_OVERRIDE_GFX_VERSION=10.3.0
WORKDIR /home/rocm-user

RUN git clone --recurse-submodules https://github.com/TabbyML/tabby

RUN sudo apt update && \
    sudo apt install -y cargo pkg-config libssl-dev protobuf-compiler hipblas

RUN sudo ln -s /opt/rocm/lib/libamdhip64.so /usr/lib64 && \
    sudo ln -s /opt/rocm/lib/libhipblas.so /usr/lib64 && \
    sudo ln -s /opt/rocm/lib/librocblas.so /usr/lib64

WORKDIR /home/rocm-user/tabby

RUN cargo build --features rocm --release --package tabby

ENTRYPOINT ["target/release/tabby", "serve", "--model", "/data/models/TabbyML/DeepseekCoder-1.3B/", "--device", "rocm"]

I started the container with the following docker run command, which exposes /dev/kfd and /dev/dri to the container and adds the video group:

sudo docker run -it --device=/dev/kfd --device=/dev/dri --security-opt seccomp=unconfined --group-add video -p 8080:8080 -v $HOME/.tabby:/data rocm/rocm-terminal
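If the outline above were saved as a Dockerfile and built into an image, the same flags could probably be reused with that image instead of rocm/rocm-terminal. The sketch below only prints such a command so it can be reviewed before running. The helper name tabby_rocm_run_cmd and the image tag tabby-rocm-local are made up for illustration and do not come from Tabby or ROCm.

```shell
# Assumed build step (the tag is a hypothetical placeholder):
#   docker build -t tabby-rocm-local .

# tabby_rocm_run_cmd prints (does not execute) a docker run command line
# that reuses the device and security flags from the command above.
# $1: image tag (hypothetical default), $2: host data directory.
tabby_rocm_run_cmd() {
    image="${1:-tabby-rocm-local}"
    data_dir="${2:-$HOME/.tabby}"
    echo "docker run -it" \
        "--device=/dev/kfd --device=/dev/dri" \
        "--security-opt seccomp=unconfined --group-add video" \
        "-p 8080:8080 -v ${data_dir}:/data ${image}"
}
```

Calling tabby_rocm_run_cmd with no arguments prints the command for the placeholder tag and $HOME/.tabby. Because the ENTRYPOINT in the outline already includes serve, --model and --device rocm, no extra arguments should be needed after the image name.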

It seems that starting with the rocm/rocm-terminal image and setting up the environment correctly was key to getting ROCm to work properly in the container.
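One visible difference between the two commands is that the original passed only /dev/dri/card0, while the working one passes both /dev/kfd and the whole /dev/dri directory. Whether that alone explains the CPU fallback is not confirmed here, but a quick check inside the container can at least rule out missing device nodes. A minimal sketch follows; check_rocm_devices is a hypothetical helper, not part of ROCm or Tabby, and its optional root-prefix argument exists only to make it easy to try against a scratch directory.

```shell
# check_rocm_devices reports whether the device nodes ROCm uses are visible.
# Optional $1: a root prefix (defaults to empty, meaning the real /dev).
# Returns 0 if both nodes exist, 1 otherwise.
check_rocm_devices() {
    root="${1:-}"
    missing=0
    # /dev/kfd is the ROCm compute interface; /dev/dri holds the render nodes.
    for node in /dev/kfd /dev/dri; do
        if [ -e "${root}${node}" ]; then
            echo "found: ${node}"
        else
            echo "missing: ${node}"
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_rocm_devices in a shell inside the container lists each node as found or missing. If /dev/kfd is reported missing, adding --device=/dev/kfd to the docker run command is the first thing to try.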

Dec 28, 2023 (edited)