Commit ef628a3

Merge pull request #103 from VectorInstitute/feature/add-flashinfer
Update base image to cuda 12.4 and install FlashInfer for better performance
2 parents 4c6db55 + 8bee2e8 commit ef628a3

1 file changed: +4 -2 lines changed
Dockerfile

Lines changed: 4 additions & 2 deletions
@@ -1,4 +1,4 @@
-FROM nvidia/cuda:12.3.1-devel-ubuntu20.04
+FROM nvidia/cuda:12.4.1-devel-ubuntu20.04
 
 # Non-interactive apt-get commands
 ARG DEBIAN_FRONTEND=noninteractive
@@ -41,8 +41,10 @@ COPY . /vec-inf
 
 # Install project dependencies with build requirements
 RUN PIP_INDEX_URL="https://download.pytorch.org/whl/cu121" uv pip install --system -e .[dev]
-# Install Flash Attention
+# Install FlashAttention
 RUN python3.10 -m pip install flash-attn --no-build-isolation
+# Install FlashInfer
+RUN python3.10 -m pip install flashinfer-python -i https://flashinfer.ai/whl/cu124/torch2.6/
 
 # Final configuration
 RUN mkdir -p /vec-inf/nccl && \
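
The diff installs project dependencies from a PyTorch index tagged cu121, while the new FlashInfer wheel comes from an index tagged cu124/torch2.6. Whether those line up depends on the torch build that actually gets resolved in the image. The sketch below is one possible sanity check, not part of this commit: it parses the CUDA and torch tags out of the index URL and compares them with version strings supplied by the caller. The helper names (`parse_wheel_index`, `cuda_tag`, `matches_index`) are hypothetical, and the assumption that the URL always ends in `/whl/cu<NNN>/torch<X.Y>/` is taken only from the single URL shown in the diff.

```python
import re


def parse_wheel_index(url: str) -> tuple[str, str]:
    """Extract (cuda_tag, torch_major_minor) from a wheel index URL.

    Assumes the layout seen in the diff: .../whl/cu124/torch2.6/
    """
    match = re.search(r"/whl/(cu\d+)/torch(\d+\.\d+)/?$", url)
    if match is None:
        raise ValueError(f"unrecognized wheel index URL: {url}")
    return match.group(1), match.group(2)


def cuda_tag(cuda_version: str) -> str:
    """Turn a CUDA version such as '12.4' or '12.4.1' into a tag such as 'cu124'."""
    major, minor = cuda_version.split(".")[:2]
    return f"cu{major}{minor}"


def matches_index(url: str, torch_version: str, cuda_version: str) -> bool:
    """Return True if the torch and CUDA versions agree with the index tags."""
    index_cuda, index_torch = parse_wheel_index(url)
    # Drop any local build suffix ('+cu124') and keep only major.minor.
    torch_major_minor = ".".join(torch_version.split("+")[0].split(".")[:2])
    return index_cuda == cuda_tag(cuda_version) and index_torch == torch_major_minor


FLASHINFER_INDEX = "https://flashinfer.ai/whl/cu124/torch2.6/"
```

Inside the built image, the two version strings could be taken from `torch.__version__` and `torch.version.cuda` and passed to `matches_index(FLASHINFER_INDEX, ...)`; a `False` result would suggest the FlashInfer wheel was built against a different torch or CUDA version than the one installed.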
