Can't launch OpenAI API server on newly installed vLLM in Docker - fastchat not found #537

Closed
TheBloke opened this issue Jul 20, 2023 · 7 comments

TheBloke commented Jul 20, 2023

Hi

I have a Docker container that I created for vLLM. I built it a few days ago and it worked fine. Today I rebuilt it to pick up the latest code changes, and now it fails to launch the OpenAI API server. SSHing into the container and running the launch command directly shows the following error:

vllm@36b7089a5957:~/vllm (main ✔) ᐅ python -m vllm.entrypoints.openai.api_server --model facebook/opt-125m
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/vllm/vllm/vllm/entrypoints/openai/api_server.py", line 17, in <module>
    from fastchat.model.model_adapter import get_conversation_template
ModuleNotFoundError: No module named 'fastchat.model.model_adapter'
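
The traceback says the submodule `fastchat.model.model_adapter` is missing, which is not the same as the whole `fastchat` package being absent. A minimal sketch for checking this from inside the container, using only the standard library, might look like the following. The helper names `module_available` and `installed_version` are hypothetical; `fschat` is the distribution name shown in the `pip freeze` output.

```python
import importlib.util
from importlib import metadata


def module_available(name: str) -> bool:
    """Return True if a (possibly dotted) module name can be located."""
    try:
        # find_spec imports parent packages, so a missing parent raises
        # ModuleNotFoundError (a subclass of ImportError) instead of returning None.
        return importlib.util.find_spec(name) is not None
    except ImportError:
        return False


def installed_version(dist: str):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("fschat version:", installed_version("fschat"))
    print("fastchat importable:", module_available("fastchat"))
    print("model_adapter importable:", module_available("fastchat.model.model_adapter"))
```

If `fastchat` is importable but `fastchat.model.model_adapter` is not, that would suggest the installed `fschat` release does not ship that module, rather than the package being missing outright.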

However, I can launch the plain (non-OpenAI) API server fine:

vllm@36b7089a5957:~/vllm (main ✔) ᐅ python -m vllm.entrypoints.api_server
Downloading (…)lve/main/config.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 651/651 [00:00<00:00, 1.57MB/s]
INFO 07-20 21:26:11 llm_engine.py:67] Initializing an LLM engine with config: model='facebook/opt-125m', tokenizer='facebook/opt-125m', tokenizer_mode=auto, trust_remote_code=False, dtype=torch.float16, use_dummy_weights=False, download_dir=None, use_np_weights=False, tensor_parallel_size=1, seed=0)
Downloading (…)okenizer_config.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 685/685 [00:00<00:00, 1.83MB/s]
...
Output of `pip freeze`
accelerate==0.21.0
aiofiles==23.1.0
aiohttp==3.8.5
aiosignal==1.3.1
altair==5.0.1
anyio==3.7.1
appdirs==1.4.4
async-timeout==4.0.2
attrs==23.1.0
certifi==2023.5.7
charset-normalizer==3.2.0
click==8.1.6
cmake==3.27.0
contourpy==1.1.0
cycler==0.11.0
docker-pycreds==0.4.0
exceptiongroup==1.1.2
fastapi==0.100.0
ffmpy==0.3.1
filelock==3.12.2
fonttools==4.41.0
frozenlist==1.4.0
fschat==0.2.3
fsspec==2023.6.0
gitdb==4.0.10
GitPython==3.1.32
gradio==3.23.0
grpcio==1.51.3
h11==0.14.0
httpcore==0.17.3
httpx==0.24.1
huggingface-hub==0.16.4
idna==3.4
Jinja2==3.1.2
jsonschema==4.18.4
jsonschema-specifications==2023.7.1
kiwisolver==1.4.4
linkify-it-py==2.0.2
lit==16.0.6
markdown-it-py==2.2.0
markdown2==2.4.9
MarkupSafe==2.1.3
matplotlib==3.7.2
mdit-py-plugins==0.3.3
mdurl==0.1.2
mpmath==1.3.0
msgpack==1.0.5
multidict==6.0.4
mypy-extensions==1.0.0
networkx==3.1
ninja==1.11.1
numpy==1.25.1
nvidia-cublas-cu11==11.10.3.66
nvidia-cuda-cupti-cu11==11.7.101
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cudnn-cu11==8.5.0.96
nvidia-cufft-cu11==10.9.0.58
nvidia-curand-cu11==10.2.10.91
nvidia-cusolver-cu11==11.4.0.1
nvidia-cusparse-cu11==11.7.4.91
nvidia-nccl-cu11==2.14.3
nvidia-nvtx-cu11==11.7.91
orjson==3.9.2
packaging==23.1
pandas==2.0.3
pathtools==0.1.2
Pillow==10.0.0
prompt-toolkit==3.0.39
protobuf==4.23.4
psutil==5.9.5
pydantic==1.10.11
pydub==0.25.1
Pygments==2.15.1
pyparsing==3.0.9
pyre-extensions==0.0.29
python-dateutil==2.8.2
python-multipart==0.0.6
pytz==2023.3
PyYAML==6.0.1
ray==2.5.1
referencing==0.30.0
regex==2023.6.3
requests==2.31.0
rich==13.4.2
rpds-py==0.9.2
safetensors==0.3.1
semantic-version==2.10.0
sentencepiece==0.1.99
sentry-sdk==1.28.1
setproctitle==1.3.2
shortuuid==1.0.11
six==1.16.0
smmap==5.0.0
sniffio==1.3.0
starlette==0.27.0
svgwrite==1.4.3
sympy==1.12
tokenizers==0.13.3
toolz==0.12.0
torch==2.0.1
tqdm==4.65.0
transformers==4.31.0
triton==2.0.0
typing-inspect==0.9.0
typing_extensions==4.7.1
tzdata==2023.3
uc-micro-py==1.0.2
urllib3==2.0.4
uvicorn==0.23.1
-e git+https://github.com/vllm-project/vllm.git@6fc2a38b110f9ba6037b31ee016f20df32426877#egg=vllm
wandb==0.15.5
wavedrom==2.0.3.post3
wcwidth==0.2.6
websockets==11.0.3
xformers==0.0.20
yarl==1.9.2
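
The freeze output above pins `fschat==0.2.3`, while the build log in this post starts by downloading metadata for `fschat-0.2.18`. A rough sketch for pulling a pin out of `pip freeze` text and comparing it with another version is shown below. The helpers `parse_freeze` and `version_tuple` are hypothetical, and the comparison assumes plain dotted-integer versions, so it would not handle something like `2.0.3.post3`. Whether the older `fschat` is what causes the import error is an assumption here, not something this snippet confirms.

```python
def parse_freeze(text: str) -> dict[str, str]:
    """Map lowercase package name -> pinned version from `pip freeze` text."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and editable/VCS installs ("-e git+https://...").
        if not line or line.startswith(("#", "-e ")) or "==" not in line:
            continue
        name, _, version = line.partition("==")
        pins[name.strip().lower()] = version.strip()
    return pins


def version_tuple(version: str) -> tuple[int, ...]:
    """Naive comparison key; assumes dotted integers such as '0.2.18'."""
    return tuple(int(part) for part in version.split("."))


# A few lines copied from the freeze output in this post.
sample = """fastapi==0.100.0
fschat==0.2.3
-e git+https://github.com/vllm-project/vllm.git@6fc2a38b110f9ba6037b31ee016f20df32426877#egg=vllm
gradio==3.23.0
"""

pins = parse_freeze(sample)
first_tried = "0.2.18"  # version whose metadata the build log fetches first
is_older = version_tuple(pins["fschat"]) < version_tuple(first_tried)
print(pins["fschat"], is_older)  # prints: 0.2.3 True
```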
My Dockerfile
ARG CUDA_VERSION="11.8.0"
ARG CUDNN_VERSION="8"
ARG UBUNTU_VERSION="22.04"

# Base NVIDIA CUDA Ubuntu image
FROM nvidia/cuda:$CUDA_VERSION-cudnn$CUDNN_VERSION-devel-ubuntu$UBUNTU_VERSION AS base

EXPOSE 22/tcp
EXPOSE 8000/tcp

USER root
# Install Python plus openssh, which is our minimum set of required packages.
# Install useful command line utility software
ARG APTPKGS="zsh sudo wget tmux nvtop vim neovim curl rsync less"
RUN apt-get update -y && \
    apt-get install -y python3 python3-pip python3-venv && \
    apt-get install -y --no-install-recommends openssh-server openssh-client git git-lfs && \
    python3 -m pip install --upgrade pip && \
    apt-get install -y --no-install-recommends $APTPKGS && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

ENV PATH="/usr/local/cuda/bin:${PATH}"

ARG USERNAME=vllm
ENV USERNAME=$USERNAME
ARG VOLUME=/workspace
ENV VOLUME=$VOLUME

# Create user, change shell to ZSH, make a volume which they own
RUN useradd -m -u 1000 $USERNAME && \
    chsh -s /usr/bin/zsh $USERNAME && \
    mkdir -p "$VOLUME" && \
    chown $USERNAME:$USERNAME "$VOLUME" && \
    usermod -aG sudo $USERNAME && \
    echo "$USERNAME ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/90-docker-users

USER $USERNAME
ENV HOME=/home/$USERNAME
ENV PATH=$HOME/.local/bin:$PATH
WORKDIR $HOME

ENV TORCH_CUDA_ARCH_LIST="8.0;8.6+PTX;8.9;9.0"

RUN git clone https://github.com/vllm-project/vllm.git && \
    cd vllm && \
    pip3 install -e . && \
    pip3 cache purge

And finally here is the log from the creation of the Docker container:

docker build log

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 2.67kB done
#1 DONE 0.0s

#2 [internal] load .dockerignore
#2 transferring context: 2B done
#2 DONE 0.0s

#3 [internal] load metadata for docker.io/nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04
#3 ...

#4 [auth] nvidia/cuda:pull token for registry-1.docker.io
#4 DONE 0.0s

#3 [internal] load metadata for docker.io/nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04
#3 DONE 0.7s

#5 [ 1/17] FROM docker.io/nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04@sha256:b856c89aa26c1dc1b56c834a66b44c527a298325173c87291486fc100ceedb6e
#5 DONE 0.0s

#6 [ 2/17] RUN apt-get update -y && apt-get install -y python3 python3-pip python3-venv && apt-get install -y --no-install-recommends openssh-server openssh-client git git-lfs && python3 -m pip install --upgrade pip && apt-get install -y --no-install-recommends zsh sudo wget tmux nvtop vim neovim curl rsync less && apt-get clean && rm -rf /var/lib/apt/lists/*
#6 CACHED

#7 [ 3/17] RUN useradd -m -u 1000 vllm && chsh -s /usr/bin/zsh vllm && mkdir -p "/workspace" && chown vllm:vllm "/workspace" && usermod -aG sudo vllm && echo "vllm ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/90-docker-users
#7 CACHED

#8 [ 4/17] WORKDIR /home/vllm
#8 CACHED

#9 [internal] load build context
#9 transferring context: 444B 0.0s done
#9 DONE 0.0s

#10 [ 5/17] RUN git clone https://github.com/vllm-project/vllm.git && cd vllm && pip3 install -e . && pip3 cache purge
#10 0.321 Cloning into 'vllm'...
#10 2.377 Defaulting to user installation because normal site-packages is not writeable
#10 2.419 Obtaining file:///home/vllm/vllm
#10 2.423 Installing build dependencies: started
#10 90.29 Installing build dependencies: still running...
#10 90.75 Installing build dependencies: finished with status 'done'
#10 90.75 Checking if build backend supports build_editable: started
#10 90.97 Checking if build backend supports build_editable: finished with status 'done'
#10 90.97 Getting requirements to build editable: started
#10 93.48 Getting requirements to build editable: finished with status 'done'
#10 93.49 Preparing editable metadata (pyproject.toml): started
#10 96.08 Preparing editable metadata (pyproject.toml): finished with status 'done'
#10 96.47 Collecting ninja (from vllm==0.1.2)
#10 96.47 Using cached ninja-1.11.1-py2.py3-none-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (145 kB)
#10 96.80 Collecting psutil (from vllm==0.1.2)
#10 96.99 Downloading psutil-5.9.5-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (282 kB)
#10 97.20 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 282.1/282.1 kB 1.4 MB/s eta 0:00:00
#10 97.50 Collecting ray>=2.5.1 (from vllm==0.1.2)
#10 97.50 Obtaining dependency information for ray>=2.5.1 from https://files.pythonhosted.org/packages/9c/42/ef94d5cbd492d05999ee6f77fa7de6a16c18b634241085203919029fef8d/ray-2.5.1-cp310-cp310-manylinux2014_x86_64.whl.metadata
#10 97.53 Downloading ray-2.5.1-cp310-cp310-manylinux2014_x86_64.whl.metadata (12 kB)
#10 97.77 Collecting sentencepiece (from vllm==0.1.2)
#10 97.80 Downloading sentencepiece-0.1.99-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
#10 97.87 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 21.5 MB/s eta 0:00:00
#10 98.44 Collecting numpy (from vllm==0.1.2)
#10 98.44 Obtaining dependency information for numpy from https://files.pythonhosted.org/packages/d0/55/559e6f455a066e12058330377259a106b7fefa41c15dbdb1b71070cec429/numpy-1.25.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
#10 98.47 Downloading numpy-1.25.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.6 kB)
#10 98.59 Collecting torch>=2.0.0 (from vllm==0.1.2)
#10 100.9 Using cached torch-2.0.1-cp310-cp310-manylinux1_x86_64.whl (619.9 MB)
#10 103.1 Collecting transformers>=4.31.0 (from vllm==0.1.2)
#10 103.1 Obtaining dependency information for transformers>=4.31.0 from https://files.pythonhosted.org/packages/21/02/ae8e595f45b6c8edee07913892b3b41f5f5f273962ad98851dc6a564bbb9/transformers-4.31.0-py3-none-any.whl.metadata
#10 103.1 Downloading transformers-4.31.0-py3-none-any.whl.metadata (116 kB)
#10 103.1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 116.9/116.9 kB 20.7 MB/s eta 0:00:00
#10 103.2 Collecting xformers>=0.0.19 (from vllm==0.1.2)
#10 103.2 Obtaining dependency information for xformers>=0.0.19 from https://files.pythonhosted.org/packages/4b/b0/dfbb3b0ceafdb73cd1b2bbe33f65dc1c5c47dcb0d4b03ba6f95da6557306/xformers-0.0.20-cp310-cp310-manylinux2014_x86_64.whl.metadata
#10 103.2 Downloading xformers-0.0.20-cp310-cp310-manylinux2014_x86_64.whl.metadata (1.1 kB)
#10 103.4 Collecting fastapi (from vllm==0.1.2)
#10 103.4 Obtaining dependency information for fastapi from https://files.pythonhosted.org/packages/49/f5/048206823aae9b3a4a61ba6b7a1dd1de36bd4c0a0283f2efb1f1f2289c8a/fastapi-0.100.0-py3-none-any.whl.metadata
#10 103.4 Downloading fastapi-0.100.0-py3-none-any.whl.metadata (23 kB)
#10 103.5 Collecting uvicorn (from vllm==0.1.2)
#10 103.5 Obtaining dependency information for uvicorn from https://files.pythonhosted.org/packages/5d/07/b9eac057f7efa56900640a233c1ed63db83568322c6bcbabe98f741d5289/uvicorn-0.23.1-py3-none-any.whl.metadata
#10 103.5 Downloading uvicorn-0.23.1-py3-none-any.whl.metadata (6.2 kB)
#10 103.9 Collecting pydantic<2 (from vllm==0.1.2)
#10 103.9 Obtaining dependency information for pydantic<2 from https://files.pythonhosted.org/packages/b6/8e/7dd215f91528487535e7aa048e4092c20ecd0168df958e58809e2235cece/pydantic-1.10.11-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
#10 103.9 Downloading pydantic-1.10.11-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (148 kB)
#10 103.9 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 149.0/149.0 kB 30.1 MB/s eta 0:00:00
#10 104.0 Collecting fschat (from vllm==0.1.2)
#10 104.0 Obtaining dependency information for fschat from https://files.pythonhosted.org/packages/d2/29/ab7ae254ab4b73f29fc1a0b9dd5f95cbcaacfb28f836719033458337d9cf/fschat-0.2.18-py3-none-any.whl.metadata
#10 104.0 Downloading fschat-0.2.18-py3-none-any.whl.metadata (15 kB)
#10 104.1 Collecting typing-extensions>=4.2.0 (from pydantic<2->vllm==0.1.2)
#10 104.1 Obtaining dependency information for typing-extensions>=4.2.0 from https://files.pythonhosted.org/packages/ec/6b/63cc3df74987c36fe26157ee12e09e8f9db4de771e0f3404263117e75b95/typing_extensions-4.7.1-py3-none-any.whl.metadata
#10 104.1 Using cached typing_extensions-4.7.1-py3-none-any.whl.metadata (3.1 kB)
#10 104.3 Collecting attrs (from ray>=2.5.1->vllm==0.1.2)
#10 104.3 Downloading attrs-23.1.0-py3-none-any.whl (61 kB)
#10 104.4 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.2/61.2 kB 11.2 MB/s eta 0:00:00
#10 104.4 Collecting click>=7.0 (from ray>=2.5.1->vllm==0.1.2)
#10 104.4 Obtaining dependency information for click>=7.0 from https://files.pythonhosted.org/packages/1a/70/e63223f8116931d365993d4a6b7ef653a4d920b41d03de7c59499962821f/click-8.1.6-py3-none-any.whl.metadata
#10 104.5 Downloading click-8.1.6-py3-none-any.whl.metadata (3.0 kB)
#10 104.6 Collecting filelock (from ray>=2.5.1->vllm==0.1.2)
#10 104.6 Obtaining dependency information for filelock from https://files.pythonhosted.org/packages/00/45/ec3407adf6f6b5bf867a4462b2b0af27597a26bd3cd6e2534cb6ab029938/filelock-3.12.2-py3-none-any.whl.metadata
#10 104.6 Using cached filelock-3.12.2-py3-none-any.whl.metadata (2.7 kB)
#10 104.7 Collecting jsonschema (from ray>=2.5.1->vllm==0.1.2)
#10 104.7 Obtaining dependency information for jsonschema from https://files.pythonhosted.org/packages/a1/ba/28ce987450c6afa8336373761193ddaadc1ba2004fbf23a6407db036f558/jsonschema-4.18.4-py3-none-any.whl.metadata
#10 104.7 Downloading jsonschema-4.18.4-py3-none-any.whl.metadata (7.8 kB)
#10 104.9 Collecting msgpack<2.0.0,>=1.0.0 (from ray>=2.5.1->vllm==0.1.2)
#10 104.9 Downloading msgpack-1.0.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (316 kB)
#10 104.9 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 316.8/316.8 kB 38.5 MB/s eta 0:00:00
#10 105.0 Collecting packaging (from ray>=2.5.1->vllm==0.1.2)
#10 105.0 Using cached packaging-23.1-py3-none-any.whl (48 kB)
#10 105.5 Collecting protobuf!=3.19.5,>=3.15.3 (from ray>=2.5.1->vllm==0.1.2)
#10 105.5 Obtaining dependency information for protobuf!=3.19.5,>=3.15.3 from https://files.pythonhosted.org/packages/01/cb/445b3e465abdb8042a41957dc8f60c54620dc7540dbcf9b458a921531ca2/protobuf-4.23.4-cp37-abi3-manylinux2014_x86_64.whl.metadata
#10 105.5 Downloading protobuf-4.23.4-cp37-abi3-manylinux2014_x86_64.whl.metadata (540 bytes)
#10 105.6 Collecting pyyaml (from ray>=2.5.1->vllm==0.1.2)
#10 105.6 Obtaining dependency information for pyyaml from https://files.pythonhosted.org/packages/29/61/bf33c6c85c55bc45a29eee3195848ff2d518d84735eb0e2d8cb42e0d285e/PyYAML-6.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
#10 105.7 Downloading PyYAML-6.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.1 kB)
#10 105.7 Collecting aiosignal (from ray>=2.5.1->vllm==0.1.2)
#10 105.7 Downloading aiosignal-1.3.1-py3-none-any.whl (7.6 kB)
#10 105.9 Collecting frozenlist (from ray>=2.5.1->vllm==0.1.2)
#10 105.9 Obtaining dependency information for frozenlist from https://files.pythonhosted.org/packages/1e/28/74b8b6451c89c070d34e753d8b65a1e4ce508a6808b18529f36e8c0e2184/frozenlist-1.4.0-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
#10 105.9 Downloading frozenlist-1.4.0-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.2 kB)
#10 106.1 Collecting requests (from ray>=2.5.1->vllm==0.1.2)
#10 106.1 Obtaining dependency information for requests from https://files.pythonhosted.org/packages/70/8e/0e2d847013cb52cd35b38c009bb167a1a26b2ce6cd6965bf26b47bc0bf44/requests-2.31.0-py3-none-any.whl.metadata
#10 106.1 Downloading requests-2.31.0-py3-none-any.whl.metadata (4.6 kB)
#10 107.4 Collecting grpcio<=1.51.3,>=1.42.0 (from ray>=2.5.1->vllm==0.1.2)
#10 107.5 Downloading grpcio-1.51.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.8 MB)
#10 107.6 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.8/4.8 MB 51.3 MB/s eta 0:00:00
#10 107.7 Collecting sympy (from torch>=2.0.0->vllm==0.1.2)
#10 107.7 Using cached sympy-1.12-py3-none-any.whl (5.7 MB)
#10 107.8 Collecting networkx (from torch>=2.0.0->vllm==0.1.2)
#10 107.8 Using cached networkx-3.1-py3-none-any.whl (2.1 MB)
#10 107.9 Collecting jinja2 (from torch>=2.0.0->vllm==0.1.2)
#10 107.9 Using cached Jinja2-3.1.2-py3-none-any.whl (133 kB)
#10 108.0 Collecting nvidia-cuda-nvrtc-cu11==11.7.99 (from torch>=2.0.0->vllm==0.1.2)
#10 108.0 Using cached nvidia_cuda_nvrtc_cu11-11.7.99-2-py3-none-manylinux1_x86_64.whl (21.0 MB)
#10 108.3 Collecting nvidia-cuda-runtime-cu11==11.7.99 (from torch>=2.0.0->vllm==0.1.2)
#10 108.3 Using cached nvidia_cuda_runtime_cu11-11.7.99-py3-none-manylinux1_x86_64.whl (849 kB)
#10 108.4 Collecting nvidia-cuda-cupti-cu11==11.7.101 (from torch>=2.0.0->vllm==0.1.2)
#10 108.4 Using cached nvidia_cuda_cupti_cu11-11.7.101-py3-none-manylinux1_x86_64.whl (11.8 MB)
#10 108.5 Collecting nvidia-cudnn-cu11==8.5.0.96 (from torch>=2.0.0->vllm==0.1.2)
#10 110.6 Using cached nvidia_cudnn_cu11-8.5.0.96-2-py3-none-manylinux1_x86_64.whl (557.1 MB)
#10 112.3 Collecting nvidia-cublas-cu11==11.10.3.66 (from torch>=2.0.0->vllm==0.1.2)
#10 113.5 Using cached nvidia_cublas_cu11-11.10.3.66-py3-none-manylinux1_x86_64.whl (317.1 MB)
#10 114.7 Collecting nvidia-cufft-cu11==10.9.0.58 (from torch>=2.0.0->vllm==0.1.2)
#10 115.3 Using cached nvidia_cufft_cu11-10.9.0.58-py3-none-manylinux1_x86_64.whl (168.4 MB)
#10 115.9 Collecting nvidia-curand-cu11==10.2.10.91 (from torch>=2.0.0->vllm==0.1.2)
#10 116.1 Using cached nvidia_curand_cu11-10.2.10.91-py3-none-manylinux1_x86_64.whl (54.6 MB)
#10 116.3 Collecting nvidia-cusolver-cu11==11.4.0.1 (from torch>=2.0.0->vllm==0.1.2)
#10 116.7 Using cached nvidia_cusolver_cu11-11.4.0.1-2-py3-none-manylinux1_x86_64.whl (102.6 MB)
#10 117.0 Collecting nvidia-cusparse-cu11==11.7.4.91 (from torch>=2.0.0->vllm==0.1.2)
#10 117.7 Using cached nvidia_cusparse_cu11-11.7.4.91-py3-none-manylinux1_x86_64.whl (173.2 MB)
#10 118.3 Collecting nvidia-nccl-cu11==2.14.3 (from torch>=2.0.0->vllm==0.1.2)
#10 118.8 Using cached nvidia_nccl_cu11-2.14.3-py3-none-manylinux1_x86_64.whl (177.1 MB)
#10 119.2 Collecting nvidia-nvtx-cu11==11.7.91 (from torch>=2.0.0->vllm==0.1.2)
#10 119.2 Using cached nvidia_nvtx_cu11-11.7.91-py3-none-manylinux1_x86_64.whl (98 kB)
#10 119.2 Collecting triton==2.0.0 (from torch>=2.0.0->vllm==0.1.2)
#10 119.4 Using cached triton-2.0.0-1-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (63.3 MB)
#10 119.5 Requirement already satisfied: setuptools in /usr/lib/python3/dist-packages (from nvidia-cublas-cu11==11.10.3.66->torch>=2.0.0->vllm==0.1.2) (59.6.0)
#10 119.5 Requirement already satisfied: wheel in /usr/lib/python3/dist-packages (from nvidia-cublas-cu11==11.10.3.66->torch>=2.0.0->vllm==0.1.2) (0.37.1)
#10 119.8 Collecting cmake (from triton==2.0.0->torch>=2.0.0->vllm==0.1.2)
#10 119.8 Obtaining dependency information for cmake from https://files.pythonhosted.org/packages/14/b8/06f8fdc4687af3d3d8d95461d97737df2f144acd28eff65a3c47c29d0152/cmake-3.27.0-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata
#10 119.8 Using cached cmake-3.27.0-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (6.7 kB)
#10 119.8 Collecting lit (from triton==2.0.0->torch>=2.0.0->vllm==0.1.2)
#10 119.8 Using cached lit-16.0.6-py3-none-any.whl
#10 120.2 Collecting huggingface-hub<1.0,>=0.14.1 (from transformers>=4.31.0->vllm==0.1.2)
#10 120.2 Obtaining dependency information for huggingface-hub<1.0,>=0.14.1 from https://files.pythonhosted.org/packages/7f/c4/adcbe9a696c135578cabcbdd7331332daad4d49b7c43688bc2d36b3a47d2/huggingface_hub-0.16.4-py3-none-any.whl.metadata
#10 120.3 Downloading huggingface_hub-0.16.4-py3-none-any.whl.metadata (12 kB)
#10 121.0 Collecting regex!=2019.12.17 (from transformers>=4.31.0->vllm==0.1.2)
#10 121.0 Obtaining dependency information for regex!=2019.12.17 from https://files.pythonhosted.org/packages/a4/06/85618f80ae552ac309ead9702c6826edda27884e26e07fdc8fa93f283546/regex-2023.6.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
#10 121.0 Downloading regex-2023.6.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (40 kB)
#10 121.0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 40.9/40.9 kB 7.3 MB/s eta 0:00:00
#10 121.3 Collecting tokenizers!=0.11.3,<0.14,>=0.11.1 (from transformers>=4.31.0->vllm==0.1.2)
#10 121.3 Downloading tokenizers-0.13.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)
#10 121.5 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.8/7.8 MB 42.7 MB/s eta 0:00:00
#10 121.6 Collecting safetensors>=0.3.1 (from transformers>=4.31.0->vllm==0.1.2)
#10 121.7 Downloading safetensors-0.3.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
#10 121.7 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 68.6 MB/s eta 0:00:00
#10 121.8 Collecting tqdm>=4.27 (from transformers>=4.31.0->vllm==0.1.2)
#10 121.9 Downloading tqdm-4.65.0-py3-none-any.whl (77 kB)
#10 121.9 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 77.1/77.1 kB 12.5 MB/s eta 0:00:00
#10 122.0 Collecting pyre-extensions==0.0.29 (from xformers>=0.0.19->vllm==0.1.2)
#10 122.0 Downloading pyre_extensions-0.0.29-py3-none-any.whl (12 kB)
#10 122.1 Collecting typing-inspect (from pyre-extensions==0.0.29->xformers>=0.0.19->vllm==0.1.2)
#10 122.1 Obtaining dependency information for typing-inspect from https://files.pythonhosted.org/packages/65/f3/107a22063bf27bdccf2024833d3445f4eea42b2e598abfbd46f6a63b6cb0/typing_inspect-0.9.0-py3-none-any.whl.metadata
#10 122.1 Downloading typing_inspect-0.9.0-py3-none-any.whl.metadata (1.5 kB)
#10 122.3 Collecting starlette<0.28.0,>=0.27.0 (from fastapi->vllm==0.1.2)
#10 122.3 Obtaining dependency information for starlette<0.28.0,>=0.27.0 from https://files.pythonhosted.org/packages/58/f8/e2cca22387965584a409795913b774235752be4176d276714e15e1a58884/starlette-0.27.0-py3-none-any.whl.metadata
#10 122.4 Downloading starlette-0.27.0-py3-none-any.whl.metadata (5.8 kB)
#10 122.5 Collecting accelerate (from fschat->vllm==0.1.2)
#10 122.5 Obtaining dependency information for accelerate from https://files.pythonhosted.org/packages/70/f9/c381bcdd0c3829d723aa14eec8e75c6c377b4ca61ec68b8093d9f35fc7a7/accelerate-0.21.0-py3-none-any.whl.metadata
#10 122.6 Downloading accelerate-0.21.0-py3-none-any.whl.metadata (17 kB)
#10 122.8 Collecting gradio==3.35.2 (from fschat->vllm==0.1.2)
#10 122.8 Obtaining dependency information for gradio==3.35.2 from https://files.pythonhosted.org/packages/50/70/ed0ba0fb5c3b1cb2e481717ad190056a4c9a0ef2f296b871e10375b2ab83/gradio-3.35.2-py3-none-any.whl.metadata
#10 122.8 Downloading gradio-3.35.2-py3-none-any.whl.metadata (15 kB)
#10 122.9 Collecting httpx (from fschat->vllm==0.1.2)
#10 122.9 Obtaining dependency information for httpx from https://files.pythonhosted.org/packages/ec/91/e41f64f03d2a13aee7e8c819d82ee3aa7cdc484d18c0ae859742597d5aa0/httpx-0.24.1-py3-none-any.whl.metadata
#10 122.9 Downloading httpx-0.24.1-py3-none-any.whl.metadata (7.4 kB)
#10 123.0 Collecting markdown2[all] (from fschat->vllm==0.1.2)
#10 123.0 Obtaining dependency information for markdown2[all] from https://files.pythonhosted.org/packages/8f/b5/93495ced07fb66c8b8a0fbc5edf07bf9fefefc1135d6e2d66e0ce5689b7d/markdown2-2.4.9-py2.py3-none-any.whl.metadata
#10 123.0 Downloading markdown2-2.4.9-py2.py3-none-any.whl.metadata (2.0 kB)
#10 123.2 Collecting nh3 (from fschat->vllm==0.1.2)
#10 123.2 Obtaining dependency information for nh3 from https://files.pythonhosted.org/packages/b7/cd/7f64121ec731255265867e0d7d782962f2bd1f15fce83f523c8f6b69463b/nh3-0.2.14-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
#10 123.2 Downloading nh3-0.2.14-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (1.6 kB)
#10 123.2 Collecting peft (from fschat->vllm==0.1.2)
#10 123.2 Obtaining dependency information for peft from https://files.pythonhosted.org/packages/88/a0/6e1c23293a922a9c9e9bd8d56a60cd78ecf531fdabe45ac975e142bfbe86/peft-0.4.0-py3-none-any.whl.metadata
#10 123.3 Downloading peft-0.4.0-py3-none-any.whl.metadata (21 kB)
#10 123.4 Collecting prompt-toolkit>=3.0.0 (from fschat->vllm==0.1.2)
#10 123.4 Obtaining dependency information for prompt-toolkit>=3.0.0 from https://files.pythonhosted.org/packages/a9/b4/ba77c84edf499877317225d7b7bc047a81f7c2eed9628eeb6bab0ac2e6c9/prompt_toolkit-3.0.39-py3-none-any.whl.metadata
#10 123.4 Downloading prompt_toolkit-3.0.39-py3-none-any.whl.metadata (6.4 kB)
#10 123.6 Collecting rich>=10.0.0 (from fschat->vllm==0.1.2)
#10 123.6 Obtaining dependency information for rich>=10.0.0 from https://files.pythonhosted.org/packages/fc/1e/482e5eec0b89b593e81d78f819a9412849814e22225842b598908e7ac560/rich-13.4.2-py3-none-any.whl.metadata
#10 123.6 Downloading rich-13.4.2-py3-none-any.whl.metadata (18 kB)
#10 123.7 Collecting shortuuid (from fschat->vllm==0.1.2)
#10 123.7 Downloading shortuuid-1.0.11-py3-none-any.whl (10 kB)
#10 123.8 Collecting tiktoken (from fschat->vllm==0.1.2)
#10 123.8 Downloading tiktoken-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.7 MB)
#10 123.9 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 56.6 MB/s eta 0:00:00
#10 123.9 INFO: pip is looking at multiple versions of fschat to determine which version is compatible with other requirements. This could take a while.
#10 123.9 Collecting fschat (from vllm==0.1.2)
#10 123.9 Obtaining dependency information for fschat from https://files.pythonhosted.org/packages/8e/4e/58a0929f57afaf92f2a04758982b93d484a31b77392d03adfaec2c9dab06/fschat-0.2.17-py3-none-any.whl.metadata
#10 123.9 Downloading fschat-0.2.17-py3-none-any.whl.metadata (15 kB)
#10 124.0 Obtaining dependency information for fschat from https://files.pythonhosted.org/packages/49/03/c9eb3b7ea94a7e0cc200cbc99b8c8b2856100d5895a2dff790cdcb721f89/fschat-0.2.16-py3-none-any.whl.metadata
#10 124.0 Downloading fschat-0.2.16-py3-none-any.whl.metadata (16 kB)
#10 124.0 Obtaining dependency information for fschat from https://files.pythonhosted.org/packages/bd/31/8f916a482674de9ce02bd5433073a7c2c441ef3dd387441e7675bce23662/fschat-0.2.15-py3-none-any.whl.metadata
#10 124.1 Downloading fschat-0.2.15-py3-none-any.whl.metadata (17 kB)
#10 124.1 Collecting gradio==3.23 (from fschat->vllm==0.1.2)
#10 124.2 Downloading gradio-3.23.0-py3-none-any.whl (15.8 MB)
#10 124.4 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 15.8/15.8 MB 55.7 MB/s eta 0:00:00
#10 124.5 Collecting fschat (from vllm==0.1.2)
#10 124.5 Obtaining dependency information for fschat from https://files.pythonhosted.org/packages/0d/ff/1da77fc8f7c08d00822944cbb8bc6563b98f4c2d1bdd784947dc72bf6bd9/fschat-0.2.14-py3-none-any.whl.metadata
#10 124.5 Downloading fschat-0.2.14-py3-none-any.whl.metadata (19 kB)
#10 124.6 Obtaining dependency information for fschat from https://files.pythonhosted.org/packages/07/ae/3f1e88b46daad7230b93341b12f851db2c4ce590a4282d3fee5b674bfb5e/fschat-0.2.13-py3-none-any.whl.metadata
#10 124.6 Downloading fschat-0.2.13-py3-none-any.whl.metadata (18 kB)
#10 124.6 Obtaining dependency information for fschat from https://files.pythonhosted.org/packages/af/a5/dfe1bcc540995911c36f67b23eecee5fd5054dab006bac42d5a64e79b215/fschat-0.2.12-py3-none-any.whl.metadata
#10 124.7 Downloading fschat-0.2.12-py3-none-any.whl.metadata (18 kB)
#10 124.7 Obtaining dependency information for fschat from https://files.pythonhosted.org/packages/87/41/84972fca6c05407eddd079e3861f0e53b1181136e01934373f6292fbe1f0/fschat-0.2.11-py3-none-any.whl.metadata
#10 124.8 Downloading fschat-0.2.11-py3-none-any.whl.metadata (18 kB)
#10 124.8 INFO: pip is still looking at multiple versions of fschat to determine which version is compatible with other requirements. This could take a while.
#10 124.8 Obtaining dependency information for fschat from https://files.pythonhosted.org/packages/e1/29/e299813dd5b0035637d06ce1f79cdd754f71325aea3affe35db8e20f9f67/fschat-0.2.10-py3-none-any.whl.metadata
#10 124.8 Downloading fschat-0.2.10-py3-none-any.whl.metadata (16 kB)
#10 124.9 Obtaining dependency information for fschat from https://files.pythonhosted.org/packages/1d/f5/2ef73a157f15bcbd579a2e31ccec4440cc29fce5e34d1361e1337b34cdfd/fschat-0.2.9-py3-none-any.whl.metadata
#10 124.9 Downloading fschat-0.2.9-py3-none-any.whl.metadata (16 kB)
#10 124.9 Obtaining dependency information for fschat from https://files.pythonhosted.org/packages/f7/31/d16a1f71efb0d3681c22579f07a898ce02f2c018689231ffdd7550768887/fschat-0.2.8-py3-none-any.whl.metadata
#10 125.0 Downloading fschat-0.2.8-py3-none-any.whl.metadata (16 kB)
#10 125.0 Downloading fschat-0.2.7-py3-none-any.whl (109 kB)
#10 125.0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 109.4/109.4 kB 15.5 MB/s eta 0:00:00
#10 125.1 Downloading fschat-0.2.6-py3-none-any.whl (108 kB)
#10 125.1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 108.5/108.5 kB 18.1 MB/s eta 0:00:00
#10 125.2 INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. See https://pip.pypa.io/warnings/backtracking for guidance. If you want to abort this run, press Ctrl + C.
#10 125.2 Downloading fschat-0.2.5-py3-none-any.whl (164 kB)
#10 125.2 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 164.6/164.6 kB 23.2 MB/s eta 0:00:00
#10 125.3 Downloading fschat-0.2.4-py3-none-any.whl (161 kB)
#10 125.3 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 161.4/161.4 kB 27.2 MB/s eta 0:00:00
#10 125.4 Downloading fschat-0.2.3-py3-none-any.whl (79 kB)
#10 125.4 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 80.0/80.0 kB 12.5 MB/s eta 0:00:00
#10 125.6 Collecting wandb (from fschat->vllm==0.1.2)
#10 125.6 Obtaining dependency information for wandb from https://files.pythonhosted.org/packages/87/26/66e944b17aa06a4c9df9850a6a5c56378cb4f9b3acf3452ace7dfd895c13/wandb-0.15.5-py3-none-any.whl.metadata
#10 125.7 Downloading wandb-0.15.5-py3-none-any.whl.metadata (8.2 kB)
#10 125.8 Collecting aiofiles (from gradio==3.23->fschat->vllm==0.1.2)
#10 125.8 Downloading aiofiles-23.1.0-py3-none-any.whl (14 kB)
#10 126.5 Collecting aiohttp (from gradio==3.23->fschat->vllm==0.1.2)
#10 126.5 Obtaining dependency information for aiohttp from https://files.pythonhosted.org/packages/3e/f6/fcda07dd1e72260989f0b22dde999ecfe80daa744f23ca167083683399bc/aiohttp-3.8.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
#10 126.5 Downloading aiohttp-3.8.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (7.7 kB)
#10 126.6 Collecting altair>=4.2.0 (from gradio==3.23->fschat->vllm==0.1.2)
#10 126.6 Obtaining dependency information for altair>=4.2.0 from https://files.pythonhosted.org/packages/b2/20/5c3b89d6f8d9938325a9330793438389e0dc94c34d921f6da35ec62095f3/altair-5.0.1-py3-none-any.whl.metadata
#10 126.6 Downloading altair-5.0.1-py3-none-any.whl.metadata (8.5 kB)
#10 126.7 Collecting ffmpy (from gradio==3.23->fschat->vllm==0.1.2)
#10 126.7 Downloading ffmpy-0.3.1.tar.gz (5.5 kB)
#10 126.7 Preparing metadata (setup.py): started
#10 127.0 Preparing metadata (setup.py): finished with status 'done'
#10 127.1 Collecting fsspec (from gradio==3.23->fschat->vllm==0.1.2)
#10 127.1 Obtaining dependency information for fsspec from https://files.pythonhosted.org/packages/e3/bd/4c0a4619494188a9db5d77e2100ab7d544a42e76b2447869d8e124e981d8/fsspec-2023.6.0-py3-none-any.whl.metadata
#10 127.1 Downloading fsspec-2023.6.0-py3-none-any.whl.metadata (6.7 kB)
#10 127.3 Collecting markdown-it-py[linkify]>=2.0.0 (from gradio==3.23->fschat->vllm==0.1.2)
#10 127.3 Obtaining dependency information for markdown-it-py[linkify]>=2.0.0 from https://files.pythonhosted.org/packages/42/d7/1ec15b46af6af88f19b8e5ffea08fa375d433c998b8a7639e76935c14f1f/markdown_it_py-3.0.0-py3-none-any.whl.metadata
#10 127.3 Downloading markdown_it_py-3.0.0-py3-none-any.whl.metadata (6.9 kB)
#10 127.5 Collecting markupsafe (from gradio==3.23->fschat->vllm==0.1.2)
#10 127.5 Obtaining dependency information for markupsafe from https://files.pythonhosted.org/packages/12/b3/d9ed2c0971e1435b8a62354b18d3060b66c8cb1d368399ec0b9baa7c0ee5/MarkupSafe-2.1.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
#10 127.5 Using cached MarkupSafe-2.1.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.0 kB)
#10 127.9 Collecting matplotlib (from gradio==3.23->fschat->vllm==0.1.2)
#10 127.9 Obtaining dependency information for matplotlib from https://files.pythonhosted.org/packages/c2/da/a5622266952ab05dc3995d77689cba600e49ea9d6c51d469c077695cb719/matplotlib-3.7.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
#10 127.9 Downloading matplotlib-3.7.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.6 kB)
#10 128.0 Collecting mdit-py-plugins<=0.3.3 (from gradio==3.23->fschat->vllm==0.1.2)
#10 128.0 Downloading mdit_py_plugins-0.3.3-py3-none-any.whl (50 kB)
#10 128.0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50.5/50.5 kB 8.1 MB/s eta 0:00:00
#10 128.6 Collecting orjson (from gradio==3.23->fschat->vllm==0.1.2)
#10 128.6 Obtaining dependency information for orjson from https://files.pythonhosted.org/packages/a3/13/959dbe9e6cc77a0e50f617b79d49e21d0ac80a16838d4f2d2a172f76f363/orjson-3.9.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
#10 128.6 Downloading orjson-3.9.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (49 kB)
#10 128.6 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 49.2/49.2 kB 8.9 MB/s eta 0:00:00
#10 129.0 Collecting pandas (from gradio==3.23->fschat->vllm==0.1.2)
#10 129.0 Obtaining dependency information for pandas from https://files.pythonhosted.org/packages/e3/59/35a2892bf09ded9c1bf3804461efe772836a5261ef5dfb4e264ce813ff99/pandas-2.0.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
#10 129.0 Downloading pandas-2.0.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (18 kB)
#10 129.6 Collecting pillow (from gradio==3.23->fschat->vllm==0.1.2)
#10 129.6 Obtaining dependency information for pillow from https://files.pythonhosted.org/packages/3d/36/e78f09d510354977e10102dd811e928666021d9c451e05df962d56477772/Pillow-10.0.0-cp310-cp310-manylinux_2_28_x86_64.whl.metadata
#10 129.6 Downloading Pillow-10.0.0-cp310-cp310-manylinux_2_28_x86_64.whl.metadata (9.5 kB)
#10 129.7 Collecting pydub (from gradio==3.23->fschat->vllm==0.1.2)
#10 129.7 Downloading pydub-0.25.1-py2.py3-none-any.whl (32 kB)
#10 129.8 Collecting python-multipart (from gradio==3.23->fschat->vllm==0.1.2)
#10 129.8 Downloading python_multipart-0.0.6-py3-none-any.whl (45 kB)
#10 129.8 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.7/45.7 kB 6.0 MB/s eta 0:00:00
#10 129.9 Collecting semantic-version (from gradio==3.23->fschat->vllm==0.1.2)
#10 129.9 Downloading semantic_version-2.10.0-py2.py3-none-any.whl (15 kB)
#10 130.2 Collecting websockets>=10.0 (from gradio==3.23->fschat->vllm==0.1.2)
#10 130.2 Downloading websockets-11.0.3-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (129 kB)
#10 130.2 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 129.9/129.9 kB 22.8 MB/s eta 0:00:00
#10 130.3 Collecting h11>=0.8 (from uvicorn->vllm==0.1.2)
#10 130.3 Downloading h11-0.14.0-py3-none-any.whl (58 kB)
#10 130.4 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 58.3/58.3 kB 11.2 MB/s eta 0:00:00
#10 130.6 Collecting wcwidth (from prompt-toolkit>=3.0.0->fschat->vllm==0.1.2)
#10 130.6 Downloading wcwidth-0.2.6-py2.py3-none-any.whl (29 kB)
#10 130.7 Collecting pygments<3.0.0,>=2.13.0 (from rich>=10.0.0->fschat->vllm==0.1.2)
#10 130.8 Downloading Pygments-2.15.1-py3-none-any.whl (1.1 MB)
#10 130.8 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 69.4 MB/s eta 0:00:00
#10 131.0 Collecting anyio<5,>=3.4.0 (from starlette<0.28.0,>=0.27.0->fastapi->vllm==0.1.2)
#10 131.0 Obtaining dependency information for anyio<5,>=3.4.0 from https://files.pythonhosted.org/packages/19/24/44299477fe7dcc9cb58d0a57d5a7588d6af2ff403fdd2d47a246c91a3246/anyio-3.7.1-py3-none-any.whl.metadata
#10 131.0 Downloading anyio-3.7.1-py3-none-any.whl.metadata (4.7 kB)
#10 131.3 Collecting certifi (from httpx->fschat->vllm==0.1.2)
#10 131.3 Downloading certifi-2023.5.7-py3-none-any.whl (156 kB)
#10 131.4 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 157.0/157.0 kB 28.2 MB/s eta 0:00:00
#10 131.4 Collecting httpcore<0.18.0,>=0.15.0 (from httpx->fschat->vllm==0.1.2)
#10 131.4 Obtaining dependency information for httpcore<0.18.0,>=0.15.0 from https://files.pythonhosted.org/packages/94/2c/2bde7ff8dd2064395555220cbf7cba79991172bf5315a07eb3ac7688d9f1/httpcore-0.17.3-py3-none-any.whl.metadata
#10 131.4 Downloading httpcore-0.17.3-py3-none-any.whl.metadata (18 kB)
#10 131.5 Collecting idna (from httpx->fschat->vllm==0.1.2)
#10 131.5 Downloading idna-3.4-py3-none-any.whl (61 kB)
#10 131.5 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.5/61.5 kB 11.0 MB/s eta 0:00:00
#10 131.6 Collecting sniffio (from httpx->fschat->vllm==0.1.2)
#10 131.6 Downloading sniffio-1.3.0-py3-none-any.whl (10 kB)
#10 131.8 Collecting jsonschema-specifications>=2023.03.6 (from jsonschema->ray>=2.5.1->vllm==0.1.2)
#10 131.8 Obtaining dependency information for jsonschema-specifications>=2023.03.6 from https://files.pythonhosted.org/packages/1c/24/83349ac2189cc2435e84da3f69ba3c97314d3c0622628e55171c6798ed80/jsonschema_specifications-2023.7.1-py3-none-any.whl.metadata
#10 131.8 Downloading jsonschema_specifications-2023.7.1-py3-none-any.whl.metadata (2.8 kB)
#10 131.9 Collecting referencing>=0.28.4 (from jsonschema->ray>=2.5.1->vllm==0.1.2)
#10 131.9 Obtaining dependency information for referencing>=0.28.4 from https://files.pythonhosted.org/packages/ea/c3/f75f0ce2cdacca3d68a70b1756635092a1add1002e34afb4895b9fb62598/referencing-0.30.0-py3-none-any.whl.metadata
#10 132.0 Downloading referencing-0.30.0-py3-none-any.whl.metadata (2.7 kB)
#10 132.3 Collecting rpds-py>=0.7.1 (from jsonschema->ray>=2.5.1->vllm==0.1.2)
#10 132.3 Obtaining dependency information for rpds-py>=0.7.1 from https://files.pythonhosted.org/packages/e2/26/69fd9b7e0ec9c2d710eae3eac5db157f5384b7717f2342596948c14cb6a3/rpds_py-0.9.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
#10 132.3 Downloading rpds_py-0.9.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.7 kB)
#10 132.4 Collecting wavedrom (from markdown2[all]->fschat->vllm==0.1.2)
#10 132.5 Downloading wavedrom-2.0.3.post3.tar.gz (137 kB)
#10 132.5 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 137.7/137.7 kB 17.7 MB/s eta 0:00:00
#10 132.5 Preparing metadata (setup.py): started
#10 136.7 Preparing metadata (setup.py): finished with status 'done'
#10 137.0 Collecting charset-normalizer<4,>=2 (from requests->ray>=2.5.1->vllm==0.1.2)
#10 137.0 Obtaining dependency information for charset-normalizer<4,>=2 from https://files.pythonhosted.org/packages/a4/65/057bf29660aae6ade0816457f8db4e749e5c0bfa2366eb5f67db9912fa4c/charset_normalizer-3.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
#10 137.1 Downloading charset_normalizer-3.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (31 kB)
#10 137.2 Collecting urllib3<3,>=1.21.1 (from requests->ray>=2.5.1->vllm==0.1.2)
#10 137.2 Obtaining dependency information for urllib3<3,>=1.21.1 from https://files.pythonhosted.org/packages/9b/81/62fd61001fa4b9d0df6e31d47ff49cfa9de4af03adecf339c7bc30656b37/urllib3-2.0.4-py3-none-any.whl.metadata
#10 137.2 Downloading urllib3-2.0.4-py3-none-any.whl.metadata (6.6 kB)
#10 137.3 Collecting mpmath>=0.19 (from sympy->torch>=2.0.0->vllm==0.1.2)
#10 137.3 Using cached mpmath-1.3.0-py3-none-any.whl (536 kB)
#10 137.6 Collecting GitPython!=3.1.29,>=1.0.0 (from wandb->fschat->vllm==0.1.2)
#10 137.6 Obtaining dependency information for GitPython!=3.1.29,>=1.0.0 from https://files.pythonhosted.org/packages/67/50/742c2fb60989b76ccf7302c7b1d9e26505d7054c24f08cc7ec187faaaea7/GitPython-3.1.32-py3-none-any.whl.metadata
#10 137.6 Downloading GitPython-3.1.32-py3-none-any.whl.metadata (10.0 kB)
#10 137.8 Collecting sentry-sdk>=1.0.0 (from wandb->fschat->vllm==0.1.2)
#10 137.8 Obtaining dependency information for sentry-sdk>=1.0.0 from https://files.pythonhosted.org/packages/8b/ef/cee575cda78f419a76ac9be4830f136c16bc2d90f00720f03b70bf7d8a6d/sentry_sdk-1.28.1-py2.py3-none-any.whl.metadata
#10 137.8 Downloading sentry_sdk-1.28.1-py2.py3-none-any.whl.metadata (8.8 kB)
#10 137.9 Collecting docker-pycreds>=0.4.0 (from wandb->fschat->vllm==0.1.2)
#10 137.9 Downloading docker_pycreds-0.4.0-py2.py3-none-any.whl (9.0 kB)
#10 138.0 Collecting pathtools (from wandb->fschat->vllm==0.1.2)
#10 138.0 Downloading pathtools-0.1.2.tar.gz (11 kB)
#10 138.0 Preparing metadata (setup.py): started
#10 138.3 Preparing metadata (setup.py): finished with status 'done'
#10 138.4 Collecting setproctitle (from wandb->fschat->vllm==0.1.2)
#10 138.5 Downloading setproctitle-1.3.2-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (30 kB)
#10 138.5 Collecting appdirs>=1.4.3 (from wandb->fschat->vllm==0.1.2)
#10 138.6 Downloading appdirs-1.4.4-py2.py3-none-any.whl (9.6 kB)
#10 138.8 Collecting toolz (from altair>=4.2.0->gradio==3.23->fschat->vllm==0.1.2)
#10 138.8 Downloading toolz-0.12.0-py3-none-any.whl (55 kB)
#10 138.8 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 55.8/55.8 kB 9.4 MB/s eta 0:00:00
#10 138.9 Collecting exceptiongroup (from anyio<5,>=3.4.0->starlette<0.28.0,>=0.27.0->fastapi->vllm==0.1.2)
#10 138.9 Obtaining dependency information for exceptiongroup from https://files.pythonhosted.org/packages/fe/17/f43b7c9ccf399d72038042ee72785c305f6c6fdc6231942f8ab99d995742/exceptiongroup-1.1.2-py3-none-any.whl.metadata
#10 139.0 Downloading exceptiongroup-1.1.2-py3-none-any.whl.metadata (6.1 kB)
#10 139.1 Collecting six>=1.4.0 (from docker-pycreds>=0.4.0->wandb->fschat->vllm==0.1.2)
#10 139.1 Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
#10 139.2 Collecting gitdb<5,>=4.0.1 (from GitPython!=3.1.29,>=1.0.0->wandb->fschat->vllm==0.1.2)
#10 139.2 Downloading gitdb-4.0.10-py3-none-any.whl (62 kB)
#10 139.2 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.7/62.7 kB 9.3 MB/s eta 0:00:00
#10 139.4 Collecting mdurl~=0.1 (from markdown-it-py[linkify]>=2.0.0->gradio==3.23->fschat->vllm==0.1.2)
#10 139.5 Downloading mdurl-0.1.2-py3-none-any.whl (10.0 kB)
#10 139.5 Collecting linkify-it-py<3,>=1 (from markdown-it-py[linkify]>=2.0.0->gradio==3.23->fschat->vllm==0.1.2)
#10 139.6 Downloading linkify_it_py-2.0.2-py3-none-any.whl (19 kB)
#10 139.6 INFO: pip is looking at multiple versions of mdit-py-plugins to determine which version is compatible with other requirements. This could take a while.
#10 139.6 Collecting mdit-py-plugins<=0.3.3 (from gradio==3.23->fschat->vllm==0.1.2)
#10 139.6 Downloading mdit_py_plugins-0.3.2-py3-none-any.whl (50 kB)
#10 139.7 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50.4/50.4 kB 10.4 MB/s eta 0:00:00
#10 139.7 Downloading mdit_py_plugins-0.3.1-py3-none-any.whl (46 kB)
#10 139.7 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 46.5/46.5 kB 11.1 MB/s eta 0:00:00
#10 139.8 Downloading mdit_py_plugins-0.3.0-py3-none-any.whl (43 kB)
#10 139.8 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 43.7/43.7 kB 7.4 MB/s eta 0:00:00
#10 139.8 Downloading mdit_py_plugins-0.2.8-py3-none-any.whl (41 kB)
#10 139.8 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 41.0/41.0 kB 9.4 MB/s eta 0:00:00
#10 139.9 Downloading mdit_py_plugins-0.2.7-py3-none-any.whl (41 kB)
#10 139.9 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 41.0/41.0 kB 9.0 MB/s eta 0:00:00
#10 139.9 Downloading mdit_py_plugins-0.2.6-py3-none-any.whl (39 kB)
#10 140.0 Downloading mdit_py_plugins-0.2.5-py3-none-any.whl (39 kB)
#10 140.0 INFO: pip is still looking at multiple versions of mdit-py-plugins to determine which version is compatible with other requirements. This could take a while.
#10 140.0 Downloading mdit_py_plugins-0.2.4-py3-none-any.whl (39 kB)
#10 140.1 Downloading mdit_py_plugins-0.2.3-py3-none-any.whl (39 kB)
#10 140.1 Downloading mdit_py_plugins-0.2.2-py3-none-any.whl (39 kB)
#10 140.2 Downloading mdit_py_plugins-0.2.1-py3-none-any.whl (38 kB)
#10 140.3 Downloading mdit_py_plugins-0.2.0-py3-none-any.whl (38 kB)
#10 140.3 INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. See https://pip.pypa.io/warnings/backtracking for guidance. If you want to abort this run, press Ctrl + C.
#10 140.3 Downloading mdit_py_plugins-0.1.0-py3-none-any.whl (37 kB)
#10 140.4 Collecting markdown-it-py[linkify]>=2.0.0 (from gradio==3.23->fschat->vllm==0.1.2)
#10 140.4 Downloading markdown_it_py-2.2.0-py3-none-any.whl (84 kB)
#10 140.4 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 84.5/84.5 kB 11.5 MB/s eta 0:00:00
#10 140.7 Collecting python-dateutil>=2.8.2 (from pandas->gradio==3.23->fschat->vllm==0.1.2)
#10 140.8 Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
#10 140.8 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 247.7/247.7 kB 36.1 MB/s eta 0:00:00
#10 140.9 Collecting pytz>=2020.1 (from pandas->gradio==3.23->fschat->vllm==0.1.2)
#10 140.9 Downloading pytz-2023.3-py2.py3-none-any.whl (502 kB)
#10 140.9 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 502.3/502.3 kB 49.5 MB/s eta 0:00:00
#10 141.0 Collecting tzdata>=2022.1 (from pandas->gradio==3.23->fschat->vllm==0.1.2)
#10 141.0 Downloading tzdata-2023.3-py2.py3-none-any.whl (341 kB)
#10 141.1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 341.8/341.8 kB 40.2 MB/s eta 0:00:00
#10 142.0 Collecting multidict<7.0,>=4.5 (from aiohttp->gradio==3.23->fschat->vllm==0.1.2)
#10 142.0 Downloading multidict-6.0.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (114 kB)
#10 142.0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 114.5/114.5 kB 19.8 MB/s eta 0:00:00
#10 142.1 Collecting async-timeout<5.0,>=4.0.0a3 (from aiohttp->gradio==3.23->fschat->vllm==0.1.2)
#10 142.1 Downloading async_timeout-4.0.2-py3-none-any.whl (5.8 kB)
#10 142.5 Collecting yarl<2.0,>=1.0 (from aiohttp->gradio==3.23->fschat->vllm==0.1.2)
#10 142.5 Downloading yarl-1.9.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (268 kB)
#10 142.5 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 268.8/268.8 kB 37.4 MB/s eta 0:00:00
#10 142.9 Collecting contourpy>=1.0.1 (from matplotlib->gradio==3.23->fschat->vllm==0.1.2)
#10 142.9 Obtaining dependency information for contourpy>=1.0.1 from https://files.pythonhosted.org/packages/aa/55/02c6d24804592b862b38a85c9b3283edc245081390a520ccd11697b6b24f/contourpy-1.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
#10 142.9 Downloading contourpy-1.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.7 kB)
#10 143.0 Collecting cycler>=0.10 (from matplotlib->gradio==3.23->fschat->vllm==0.1.2)
#10 143.1 Downloading cycler-0.11.0-py3-none-any.whl (6.4 kB)
#10 143.2 Collecting fonttools>=4.22.0 (from matplotlib->gradio==3.23->fschat->vllm==0.1.2)
#10 143.2 Obtaining dependency information for fonttools>=4.22.0 from https://files.pythonhosted.org/packages/e5/3d/000faec66c11733a0bf9f9a3a7b69290329cc8b3799228fe33eb0707dc7b/fonttools-4.41.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
#10 143.3 Downloading fonttools-4.41.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (149 kB)
#10 143.3 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 149.4/149.4 kB 23.8 MB/s eta 0:00:00
#10 143.4 Collecting kiwisolver>=1.0.1 (from matplotlib->gradio==3.23->fschat->vllm==0.1.2)
#10 143.5 Downloading kiwisolver-1.4.4-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.6 MB)
#10 143.5 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 51.0 MB/s eta 0:00:00
#10 143.7 Collecting pyparsing<3.1,>=2.3.1 (from matplotlib->gradio==3.23->fschat->vllm==0.1.2)
#10 143.7 Downloading pyparsing-3.0.9-py3-none-any.whl (98 kB)
#10 143.7 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 98.3/98.3 kB 14.4 MB/s eta 0:00:00
#10 144.0 Collecting mypy-extensions>=0.3.0 (from typing-inspect->pyre-extensions==0.0.29->xformers>=0.0.19->vllm==0.1.2)
#10 144.0 Downloading mypy_extensions-1.0.0-py3-none-any.whl (4.7 kB)
#10 144.1 Collecting svgwrite (from wavedrom->markdown2[all]->fschat->vllm==0.1.2)
#10 144.1 Downloading svgwrite-1.4.3-py3-none-any.whl (67 kB)
#10 144.1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 67.1/67.1 kB 11.6 MB/s eta 0:00:00
#10 144.4 Collecting smmap<6,>=3.0.1 (from gitdb<5,>=4.0.1->GitPython!=3.1.29,>=1.0.0->wandb->fschat->vllm==0.1.2)
#10 144.4 Downloading smmap-5.0.0-py3-none-any.whl (24 kB)
#10 144.6 Collecting uc-micro-py (from linkify-it-py<3,>=1->markdown-it-py[linkify]>=2.0.0->gradio==3.23->fschat->vllm==0.1.2)
#10 144.6 Downloading uc_micro_py-1.0.2-py3-none-any.whl (6.2 kB)
#10 145.0 Downloading pydantic-1.10.11-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.1 MB)
#10 145.1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.1/3.1 MB 48.8 MB/s eta 0:00:00
#10 145.1 Downloading ray-2.5.1-cp310-cp310-manylinux2014_x86_64.whl (56.2 MB)
#10 146.1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.2/56.2 MB 28.4 MB/s eta 0:00:00
#10 146.1 Downloading numpy-1.25.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.6 MB)
#10 146.4 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 17.6/17.6 MB 55.7 MB/s eta 0:00:00
#10 146.4 Downloading transformers-4.31.0-py3-none-any.whl (7.4 MB)
#10 146.6 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.4/7.4 MB 63.6 MB/s eta 0:00:00
#10 146.6 Downloading xformers-0.0.20-cp310-cp310-manylinux2014_x86_64.whl (109.1 MB)
#10 148.4 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 109.1/109.1 MB 21.3 MB/s eta 0:00:00
#10 148.4 Downloading fastapi-0.100.0-py3-none-any.whl (65 kB)
#10 148.4 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 65.7/65.7 kB 10.3 MB/s eta 0:00:00
#10 148.4 Downloading uvicorn-0.23.1-py3-none-any.whl (59 kB)
#10 148.5 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 59.5/59.5 kB 9.2 MB/s eta 0:00:00
#10 148.5 Downloading click-8.1.6-py3-none-any.whl (97 kB)
#10 148.5 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 97.9/97.9 kB 17.4 MB/s eta 0:00:00
#10 148.5 Downloading huggingface_hub-0.16.4-py3-none-any.whl (268 kB)
#10 148.5 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 268.8/268.8 kB 35.8 MB/s eta 0:00:00
#10 148.6 Downloading prompt_toolkit-3.0.39-py3-none-any.whl (385 kB)
#10 148.6 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 385.2/385.2 kB 46.1 MB/s eta 0:00:00
#10 148.8 Downloading protobuf-4.23.4-cp37-abi3-manylinux2014_x86_64.whl (304 kB)
#10 148.8 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 304.5/304.5 kB 27.2 MB/s eta 0:00:00
#10 148.8 Downloading PyYAML-6.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (705 kB)
#10 148.8 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 705.5/705.5 kB 43.3 MB/s eta 0:00:00
#10 148.9 Downloading regex-2023.6.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (770 kB)
#10 148.9 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 770.4/770.4 kB 43.2 MB/s eta 0:00:00
#10 148.9 Downloading rich-13.4.2-py3-none-any.whl (239 kB)
#10 148.9 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 239.4/239.4 kB 24.7 MB/s eta 0:00:00
#10 149.0 Downloading starlette-0.27.0-py3-none-any.whl (66 kB)
#10 149.0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 67.0/67.0 kB 8.5 MB/s eta 0:00:00
#10 149.0 Using cached typing_extensions-4.7.1-py3-none-any.whl (33 kB)
#10 149.1 Downloading accelerate-0.21.0-py3-none-any.whl (244 kB)
#10 149.1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 244.2/244.2 kB 24.4 MB/s eta 0:00:00
#10 149.1 Downloading frozenlist-1.4.0-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (225 kB)
#10 149.1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 225.7/225.7 kB 24.4 MB/s eta 0:00:00
#10 149.1 Using cached filelock-3.12.2-py3-none-any.whl (10 kB)
#10 149.2 Downloading httpx-0.24.1-py3-none-any.whl (75 kB)
#10 149.2 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 75.4/75.4 kB 9.6 MB/s eta 0:00:00
#10 149.2 Downloading jsonschema-4.18.4-py3-none-any.whl (80 kB)
#10 149.2 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 81.0/81.0 kB 11.0 MB/s eta 0:00:00
#10 149.2 Downloading requests-2.31.0-py3-none-any.whl (62 kB)
#10 149.3 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.6/62.6 kB 8.8 MB/s eta 0:00:00
#10 149.3 Downloading wandb-0.15.5-py3-none-any.whl (2.1 MB)
#10 149.3 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 53.6 MB/s eta 0:00:00
#10 149.4 Downloading altair-5.0.1-py3-none-any.whl (471 kB)
#10 149.4 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 471.5/471.5 kB 37.3 MB/s eta 0:00:00
#10 149.4 Downloading anyio-3.7.1-py3-none-any.whl (80 kB)
#10 149.4 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 80.9/80.9 kB 11.6 MB/s eta 0:00:00
#10 149.5 Downloading charset_normalizer-3.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (201 kB)
#10 149.5 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 201.8/201.8 kB 22.0 MB/s eta 0:00:00
#10 149.5 Downloading GitPython-3.1.32-py3-none-any.whl (188 kB)
#10 149.5 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 188.5/188.5 kB 20.3 MB/s eta 0:00:00
#10 149.6 Downloading httpcore-0.17.3-py3-none-any.whl (74 kB)
#10 149.6 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 74.5/74.5 kB 9.5 MB/s eta 0:00:00
#10 149.6 Downloading jsonschema_specifications-2023.7.1-py3-none-any.whl (17 kB)
#10 149.6 Using cached MarkupSafe-2.1.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25 kB)
#10 149.7 Downloading pandas-2.0.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.3 MB)
#10 149.9 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.3/12.3 MB 65.4 MB/s eta 0:00:00
#10 149.9 Downloading referencing-0.30.0-py3-none-any.whl (25 kB)
#10 149.9 Downloading rpds_py-0.9.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)
#10 149.9 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 66.4 MB/s eta 0:00:00
#10 150.0 Downloading sentry_sdk-1.28.1-py2.py3-none-any.whl (214 kB)
#10 150.0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 214.7/214.7 kB 27.9 MB/s eta 0:00:00
#10 150.0 Downloading urllib3-2.0.4-py3-none-any.whl (123 kB)
#10 150.0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 123.9/123.9 kB 19.4 MB/s eta 0:00:00
#10 150.1 Downloading aiohttp-3.8.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.0 MB)
#10 150.1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.0/1.0 MB 64.9 MB/s eta 0:00:00
#10 150.2 Using cached cmake-3.27.0-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (26.0 MB)
#10 150.3 Downloading fsspec-2023.6.0-py3-none-any.whl (163 kB)
#10 150.3 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 163.8/163.8 kB 33.2 MB/s eta 0:00:00
#10 150.3 Downloading markdown2-2.4.9-py2.py3-none-any.whl (39 kB)
#10 150.3 Downloading matplotlib-3.7.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.6 MB)
#10 150.5 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.6/11.6 MB 61.7 MB/s eta 0:00:00
#10 150.5 Downloading Pillow-10.0.0-cp310-cp310-manylinux_2_28_x86_64.whl (3.4 MB)
#10 150.6 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.4/3.4 MB 55.2 MB/s eta 0:00:00
#10 150.6 Downloading orjson-3.9.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (138 kB)
#10 150.6 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 138.7/138.7 kB 20.0 MB/s eta 0:00:00
#10 150.7 Downloading typing_inspect-0.9.0-py3-none-any.whl (8.8 kB)
#10 150.7 Downloading contourpy-1.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (300 kB)
#10 150.7 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 300.7/300.7 kB 34.1 MB/s eta 0:00:00
#10 150.8 Downloading fonttools-4.41.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.3 MB)
#10 150.8 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.3/4.3 MB 72.7 MB/s eta 0:00:00
#10 150.9 Downloading exceptiongroup-1.1.2-py3-none-any.whl (14 kB)
#10 151.4 Building wheels for collected packages: vllm, ffmpy, pathtools, wavedrom
#10 151.4 Building editable for vllm (pyproject.toml): started
#10 217.8 Building editable for vllm (pyproject.toml): still running...
#10 299.4 Building editable for vllm (pyproject.toml): still running...
#10 360.2 Building editable for vllm (pyproject.toml): still running...
#10 422.2 Building editable for vllm (pyproject.toml): still running...
#10 484.0 Building editable for vllm (pyproject.toml): still running...
#10 485.2 Building editable for vllm (pyproject.toml): finished with status 'done'
#10 485.2 Created wheel for vllm: filename=vllm-0.1.2-0.editable-cp310-cp310-linux_x86_64.whl size=8447 sha256=bb9cb10f442e5c705e8e675ae3a42a75c0f7e21c5bfc235739098dc4750bcd76
#10 485.2 Stored in directory: /tmp/pip-ephem-wheel-cache-21f6p5js/wheels/61/90/8e/df32f4c5b947476bbfa504d343595e2cd0c99019c48e878a86
#10 485.2 Building wheel for ffmpy (setup.py): started
#10 485.5 Building wheel for ffmpy (setup.py): finished with status 'done'
#10 485.5 Created wheel for ffmpy: filename=ffmpy-0.3.1-py3-none-any.whl size=5596 sha256=7778ed64cc7cb26d23f1279c00efcf1a383239f9d1e3658f1701c4d1452b53eb
#10 485.5 Stored in directory: /home/vllm/.cache/pip/wheels/01/a6/d1/1c0828c304a4283b2c1639a09ad86f83d7c487ef34c6b4a1bf
#10 485.5 Building wheel for pathtools (setup.py): started
#10 485.8 Building wheel for pathtools (setup.py): finished with status 'done'
#10 485.8 Created wheel for pathtools: filename=pathtools-0.1.2-py3-none-any.whl size=8806 sha256=89a04102b6eb416abc0ddbe3063db4db5c75082c9705673643bd960fe2ca3c68
#10 485.8 Stored in directory: /home/vllm/.cache/pip/wheels/e7/f3/22/152153d6eb222ee7a56ff8617d80ee5207207a8c00a7aab794
#10 485.8 Building wheel for wavedrom (setup.py): started
#10 486.2 Building wheel for wavedrom (setup.py): finished with status 'done'
#10 486.2 Created wheel for wavedrom: filename=wavedrom-2.0.3.post3-py2.py3-none-any.whl size=29952 sha256=25e7b144b1a9c01ee035a6b62130e94a0fcd212628968c5b1d5b2e0f8509ab93
#10 486.2 Stored in directory: /home/vllm/.cache/pip/wheels/9c/52/8c/38b454b42f712f325e26f633287484c7dc1ad469e1580c5954
#10 486.2 Successfully built vllm ffmpy pathtools wavedrom
#10 487.5 Installing collected packages: wcwidth, tokenizers, sentencepiece, safetensors, pytz, pydub, pathtools, ninja, msgpack, mpmath, lit, ffmpy, cmake, appdirs, websockets, urllib3, uc-micro-py, tzdata, typing-extensions, tqdm, toolz, sympy, svgwrite, sniffio, smmap, six, shortuuid, setproctitle, semantic-version, rpds-py, regex, pyyaml, python-multipart, pyparsing, pygments, psutil, protobuf, prompt-toolkit, pillow, packaging, orjson, nvidia-nvtx-cu11, nvidia-nccl-cu11, nvidia-cusparse-cu11, nvidia-curand-cu11, nvidia-cufft-cu11, nvidia-cuda-runtime-cu11, nvidia-cuda-nvrtc-cu11, nvidia-cuda-cupti-cu11, nvidia-cublas-cu11, numpy, networkx, mypy-extensions, multidict, mdurl, markupsafe, markdown2, kiwisolver, idna, h11, grpcio, fsspec, frozenlist, fonttools, filelock, exceptiongroup, cycler, click, charset-normalizer, certifi, attrs, async-timeout, aiofiles, yarl, wavedrom, uvicorn, typing-inspect, sentry-sdk, requests, referencing, python-dateutil, pydantic, nvidia-cusolver-cu11, nvidia-cudnn-cu11, markdown-it-py, linkify-it-py, jinja2, gitdb, docker-pycreds, contourpy, anyio, aiosignal, starlette, rich, pyre-extensions, pandas, mdit-py-plugins, matplotlib, jsonschema-specifications, huggingface-hub, httpcore, GitPython, aiohttp, wandb, transformers, jsonschema, httpx, fastapi, ray, altair, gradio, triton, torch, accelerate, xformers, fschat, vllm
#10 542.4 Successfully installed GitPython-3.1.32 accelerate-0.21.0 aiofiles-23.1.0 aiohttp-3.8.5 aiosignal-1.3.1 altair-5.0.1 anyio-3.7.1 appdirs-1.4.4 async-timeout-4.0.2 attrs-23.1.0 certifi-2023.5.7 charset-normalizer-3.2.0 click-8.1.6 cmake-3.27.0 contourpy-1.1.0 cycler-0.11.0 docker-pycreds-0.4.0 exceptiongroup-1.1.2 fastapi-0.100.0 ffmpy-0.3.1 filelock-3.12.2 fonttools-4.41.0 frozenlist-1.4.0 fschat-0.2.3 fsspec-2023.6.0 gitdb-4.0.10 gradio-3.23.0 grpcio-1.51.3 h11-0.14.0 httpcore-0.17.3 httpx-0.24.1 huggingface-hub-0.16.4 idna-3.4 jinja2-3.1.2 jsonschema-4.18.4 jsonschema-specifications-2023.7.1 kiwisolver-1.4.4 linkify-it-py-2.0.2 lit-16.0.6 markdown-it-py-2.2.0 markdown2-2.4.9 markupsafe-2.1.3 matplotlib-3.7.2 mdit-py-plugins-0.3.3 mdurl-0.1.2 mpmath-1.3.0 msgpack-1.0.5 multidict-6.0.4 mypy-extensions-1.0.0 networkx-3.1 ninja-1.11.1 numpy-1.25.1 nvidia-cublas-cu11-11.10.3.66 nvidia-cuda-cupti-cu11-11.7.101 nvidia-cuda-nvrtc-cu11-11.7.99 nvidia-cuda-runtime-cu11-11.7.99 nvidia-cudnn-cu11-8.5.0.96 nvidia-cufft-cu11-10.9.0.58 nvidia-curand-cu11-10.2.10.91 nvidia-cusolver-cu11-11.4.0.1 nvidia-cusparse-cu11-11.7.4.91 nvidia-nccl-cu11-2.14.3 nvidia-nvtx-cu11-11.7.91 orjson-3.9.2 packaging-23.1 pandas-2.0.3 pathtools-0.1.2 pillow-10.0.0 prompt-toolkit-3.0.39 protobuf-4.23.4 psutil-5.9.5 pydantic-1.10.11 pydub-0.25.1 pygments-2.15.1 pyparsing-3.0.9 pyre-extensions-0.0.29 python-dateutil-2.8.2 python-multipart-0.0.6 pytz-2023.3 pyyaml-6.0.1 ray-2.5.1 referencing-0.30.0 regex-2023.6.3 requests-2.31.0 rich-13.4.2 rpds-py-0.9.2 safetensors-0.3.1 semantic-version-2.10.0 sentencepiece-0.1.99 sentry-sdk-1.28.1 setproctitle-1.3.2 shortuuid-1.0.11 six-1.16.0 smmap-5.0.0 sniffio-1.3.0 starlette-0.27.0 svgwrite-1.4.3 sympy-1.12 tokenizers-0.13.3 toolz-0.12.0 torch-2.0.1 tqdm-4.65.0 transformers-4.31.0 triton-2.0.0 typing-extensions-4.7.1 typing-inspect-0.9.0 tzdata-2023.3 uc-micro-py-1.0.2 urllib3-2.0.4 uvicorn-0.23.1 vllm-0.1.2 wandb-0.15.5 wavedrom-2.0.3.post3 
wcwidth-0.2.6 websockets-11.0.3 xformers-0.0.20 yarl-1.9.2
#10 544.6 Files removed: 328
#10 DONE 546.1s

This has really confused me because this was working fine until recently, and I can't see any commits in vLLM that look like they should affect this. Unless it's a change in FastChat that's broken it?

Thanks in advance for any help.

@TheBloke
Author

TheBloke commented Jul 20, 2023

OK yes it must be recent commits in FastChat that broke it.

I can fix the error by manually rolling FastChat back to an earlier commit, with:

pip3 install git+https://github.com/lm-sys/FastChat@a2faf8fd44de8c58e0ce78a390a181765f0fe305

So my Dockerfile is now:

RUN git clone https://github.com/vllm-project/vllm.git && \
    cd vllm && \
    pip3 install -e . && \
    pip3 install git+https://github.com/lm-sys/FastChat@a2faf8fd44de8c58e0ce78a390a181765f0fe305 && \
    # Older FastChat commit downgrades Transformers, so need to re-upgrade this for Llama 2 support
    pip3 install git+https://github.com/huggingface/transformers accelerate==0.21.0 && \
    pip3 cache purge

Now it seems to work fine.
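
A quick way to confirm the rollback took effect is to check, inside the container, which FastChat distribution is installed and whether the module named in the traceback actually imports. The sketch below is only illustrative: `check_module` is a hypothetical helper, and it assumes the distribution is published as `fschat` (the name shown in the pip log above).

```python
import importlib
from importlib import metadata


def check_module(module_name: str, dist_name: str) -> dict:
    """Report the installed distribution version and whether the module imports.

    `check_module` is a hypothetical helper written for this example.
    """
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # distribution is not installed at all

    try:
        importlib.import_module(module_name)
        importable, error = True, None
    except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
        importable, error = False, str(exc)

    return {"version": version, "importable": importable, "error": error}


if __name__ == "__main__":
    # Module path taken from the traceback; "fschat" is the name pip reports.
    print(check_module("fastchat.model.model_adapter", "fschat"))
```

If `importable` comes back `False` while a version is still reported, the installed FastChat release probably predates the `model_adapter` module, which would match the original error.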

@gesanqiu
Contributor

gesanqiu commented Jul 21, 2023

This looks like a requirements conflict. LLaMA requires transformers >= 4.30.0 and LLaMA-2 requires >= 4.31.0, while FastChat requires >= 4.28.0, < 4.29.0. So when you install or update vLLM, FastChat gets uninstalled because of the conflicting requirement.
I opened an issue on FastChat about this, but FastChat has not acted on it yet; they first want to clarify the differences between the transformers versions.
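
The conflict described above can be illustrated with a small range check. This is a simplified sketch, not how pip's resolver works: `parse` and `in_range` are hypothetical helpers that only handle plain numeric versions, and the bounds are the ones quoted in this comment.

```python
def parse(version: str) -> tuple:
    """Turn '4.31.0' into (4, 31, 0). Plain numeric versions only."""
    return tuple(int(part) for part in version.split("."))


def in_range(version: str, lower: str, upper: str) -> bool:
    """Return True if lower <= version < upper."""
    return parse(lower) <= parse(version) < parse(upper)


# Bounds quoted in the comment above: FastChat wants >= 4.28.0, < 4.29.0.
FASTCHAT_LOWER, FASTCHAT_UPPER = "4.28.0", "4.29.0"

if __name__ == "__main__":
    # 4.30.0 and 4.31.0 are the minimums quoted for LLaMA and LLaMA-2.
    for candidate in ("4.28.1", "4.30.0", "4.31.0"):
        ok = in_range(candidate, FASTCHAT_LOWER, FASTCHAT_UPPER)
        print(f"transformers {candidate}: satisfies FastChat pin = {ok}")
```

No single transformers version satisfies both the FastChat pin and the LLaMA-2 minimum, so one of the two has to give. The build log above ends with transformers-4.31.0 installed alongside fschat-0.2.3.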

@trannhatquy

@TheBloke I did exactly the same as you, and when I use the API for inference it raises an error: "openai.error.APIConnectionError: Error communicating with OpenAI: HTTPConnectionPool(host='localhost', port=8888): Max retries exceeded with url: /v1/completions (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fe7d8208df0>: Failed to establish a new connection: [Errno 111] Connection refused'))"
Do you have any solutions? Thank you so much.
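
"Connection refused" generally means nothing is listening on that host and port, so a first step is to check that from the client side before involving the OpenAI client at all. The sketch below uses only the standard library; `port_is_open` is a hypothetical helper, and the host and port are the ones shown in the error message.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Host and port taken from the error message above.
    print("localhost:8888 reachable:", port_is_open("localhost", 8888))
```

If this prints `False`, the server is either not running, listening on a different port than the client is configured for, or (in Docker) running in a container whose port is not published to the host.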

@merrymercy
Contributor

The transformers version in FastChat has been updated in lm-sys/FastChat#2016.
A PyPI package will be available soon.

@TheBloke
Author

Thank you!

@trannhatquy

@TheBloke running the OPT-125M model takes 21GB on an RTX 6000 GPU. I don't think that makes sense. Do you see the same behavior?

@zhuohan123
Member

@TheBloke running the OPT-125M model takes 21GB on an RTX 6000 GPU. I don't think that makes sense. Do you see the same behavior?

Please refer to this discussion for the memory issue: #241
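
The linked discussion is not reproduced here, but one likely explanation is that vLLM reserves a fixed fraction of GPU memory up front for its KV cache, controlled by the `gpu_memory_utilization` engine argument, regardless of model size. The arithmetic below is a rough sketch under two assumptions that are not stated in this thread: a 24 GB card and a utilization fraction of 0.9. `expected_reserved_gb` is a hypothetical helper.

```python
def expected_reserved_gb(total_gpu_gb: float, gpu_memory_utilization: float) -> float:
    """Estimate how much GPU memory is reserved for a given utilization fraction."""
    if not 0.0 < gpu_memory_utilization <= 1.0:
        raise ValueError("gpu_memory_utilization must be in (0, 1]")
    return total_gpu_gb * gpu_memory_utilization


if __name__ == "__main__":
    # Assumed values: 24 GB of GPU memory and a 0.9 utilization fraction.
    reserved = expected_reserved_gb(24.0, 0.9)
    print(f"approx. reserved: {reserved:.1f} GB")
```

Under those assumptions the reserved amount is close to the 21GB reported, which would mean the usage reflects preallocation rather than the size of the model itself.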

rickyyx pushed a commit to rickyyx/vllm that referenced this issue Oct 7, 2024