Commit a68cbdb

Go to 12.1 instead of 12.2 so gpu system doesn't have to upgrade driver from 530.30.02 cuda12.1

pseudotensor committed Jan 27, 2024
1 parent b77061c commit a68cbdb
Showing 3 changed files with 7 additions and 13 deletions.
4 changes: 2 additions & 2 deletions Dockerfile

```diff
@@ -1,13 +1,13 @@
 # devel needed for bitsandbytes requirement of libcudart.so, otherwise runtime sufficient
-FROM nvidia/cuda:12.2.2-cudnn8-devel-ubuntu20.04
+FROM nvidia/cuda:12.1.1-cudnn8-devel-ubuntu20.04

 ENV DEBIAN_FRONTEND=noninteractive

 ENV PATH="/h2ogpt_conda/bin:${PATH}"
 ARG PATH="/h2ogpt_conda/bin:${PATH}"

 ENV HOME=/workspace
-ENV CUDA_HOME=/usr/local/cuda-12.2
+ENV CUDA_HOME=/usr/local/cuda-12.1
 ENV VLLM_CACHE=/workspace/.vllm_cache
 ENV TIKTOKEN_CACHE_DIR=/workspace/tiktoken_cache
```
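The base-image pin above matters because each CUDA toolkit release has a minimum Linux driver version, and 530.30.02 (the driver named in the commit message) satisfies 12.1 but not 12.2. A minimal sketch of that check, assuming the threshold values from NVIDIA's release notes and a helper name of my own:

```python
# Sketch: does an installed NVIDIA driver satisfy a CUDA toolkit's minimum
# Linux driver version? Thresholds assumed from NVIDIA's CUDA release notes.
MIN_DRIVER = {
    "12.1": (530, 30, 2),   # CUDA 12.1 -> driver >= 530.30.02
    "12.2": (535, 54, 3),   # CUDA 12.2 -> driver >= 535.54.03
}

def driver_ok(cuda_version: str, driver_version: str) -> bool:
    """True if driver_version meets the toolkit's minimum driver."""
    have = tuple(int(part) for part in driver_version.split("."))
    return have >= MIN_DRIVER[cuda_version]

# Driver 530.30.02, as in the commit message, runs 12.1 but not 12.2:
print(driver_ok("12.1", "530.30.02"))  # True
print(driver_ok("12.2", "530.30.02"))  # False
```

This is why staying on the 12.1 base image avoids a host driver upgrade.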
4 changes: 2 additions & 2 deletions docker_build_script_ubuntu.sh

```diff
@@ -5,8 +5,8 @@ set -ex
 export DEBIAN_FRONTEND=noninteractive
 export PATH=/h2ogpt_conda/bin:$PATH
 export HOME=/workspace
-export CUDA_HOME=/usr/local/cuda-12.2
-export PIP_EXTRA_INDEX_URL=https://download.pytorch.org/whl/cu122
+export CUDA_HOME=/usr/local/cuda-12.1
+export PIP_EXTRA_INDEX_URL=https://download.pytorch.org/whl/cu121

 # Install linux dependencies
 apt-get update && apt-get install -y \
```
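The build script keeps `CUDA_HOME` and `PIP_EXTRA_INDEX_URL` in lockstep: the `cu121` wheel-index suffix is just the toolkit's major.minor version with the dot dropped. A small illustrative helper (the function is mine, not part of the build script):

```python
# Sketch: derive the PyTorch extra-index URL for a given CUDA toolkit version.
# The "cu121"-style suffix is major.minor with the dot removed.
def torch_index_url(cuda_version: str) -> str:
    major, minor = cuda_version.split(".")[:2]
    return f"https://download.pytorch.org/whl/cu{major}{minor}"

print(torch_index_url("12.1"))  # https://download.pytorch.org/whl/cu121
print(torch_index_url("12.2"))  # https://download.pytorch.org/whl/cu122
```

Mismatching the two (e.g. a cu122 torch wheel against a 12.1 toolkit) is a common source of build and runtime failures, which is why the commit changes both lines together.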
12 changes: 3 additions & 9 deletions docs/README_InferenceServers.md

````diff
@@ -256,9 +256,9 @@ conda create -n vllm -y
 conda activate vllm
 conda install python=3.10 -y
 ```
-Assuming torch was installed with CUDA 12.3, and you have installed cuda locally in `/usr/local/cuda-12.3`:
+Assuming torch was installed with CUDA 12.1, and you have installed cuda locally in `/usr/local/cuda-12.1`:
 ```bash
-export CUDA_HOME=/usr/local/cuda-12.3
+export CUDA_HOME=/usr/local/cuda-12.1
 export PIP_EXTRA_INDEX_URL="https://download.pytorch.org/whl/cu123"
 pip install mosaicml-turbo --upgrade # see docker_build_script_ubuntu.sh for x86 prebuilt wheel on s3
 pip install git+https://github.com/stanford-futuredata/megablocks.git # see docker_build_script_ubuntu.sh for x86 prebuilt wheel on s3
@@ -288,14 +288,8 @@ export CUDA_VISIBLE_DEVICESs=0,1,2,3
 python -m vllm.entrypoints.openai.api_server --port=5000 --host=0.0.0.0 --model h2oai/h2ogpt-4096-llama2-70b-chat --tokenizer=hf-internal-testing/llama-tokenizer --tensor-parallel-size=4 --seed 1234 --max-num-batched-tokens=8192
 ```
-For Mixtral 8*7B run:
+For Mixtral 8*7B need newer cuda 12 toolkit and vllm build, then run:
 ```bash
-export CUDA_HOME=/usr/local/cuda-12.3
-export PIP_EXTRA_INDEX_URL="https://download.pytorch.org/whl/cu123"
-# so builds on cuda 12.3 given 12.1 is default build
-pip install git+https://github.com/vllm-project/vllm.git
-pip install mosaicml-turbo
-pip install git+https://github.com/stanford-futuredata/megablocks.git
 export CUDA_VISIBLE_DEVICES=0,1
 python -m vllm.entrypoints.openai.api_server --port=5002 --host=0.0.0.0 --model mistralai/Mixtral-8x7B-Instruct-v0.1 --seed 1234 --max-num-batched-tokens=65536 --tensor-parallel-size=2
 ```
````
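The `vllm.entrypoints.openai.api_server` commands in the README hunk expose an OpenAI-compatible HTTP API on the given port. A minimal client sketch, assuming the llama2 server from the hunk is running on localhost:5000 (the payload-building helper is mine; the actual request line is left commented since it needs a live server):

```python
# Sketch: query the vLLM OpenAI-compatible server started in the README.
import json
from urllib import request

def completion_payload(model: str, prompt: str, max_tokens: int = 64) -> dict:
    """Build an OpenAI-style /v1/completions request body."""
    return {"model": model, "prompt": prompt, "max_tokens": max_tokens}

payload = completion_payload("h2oai/h2ogpt-4096-llama2-70b-chat", "Hello")
req = request.Request(
    "http://localhost:5000/v1/completions",  # port from the README command
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# with request.urlopen(req) as resp:       # uncomment with the server running
#     print(json.load(resp)["choices"][0]["text"])
```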