
[LLM] OSS LLM Serving #50643

Merged
merged 61 commits into from
Feb 19, 2025
Changes from 1 commit
b42f419
WIP
GeneDer Feb 16, 2025
f7df29e
WIP
GeneDer Feb 16, 2025
b2e0b5c
WIP
GeneDer Feb 16, 2025
a6065e0
WIP
GeneDer Feb 16, 2025
b5740de
WIP
GeneDer Feb 16, 2025
03c8404
WIP
GeneDer Feb 16, 2025
85c8975
WIP
GeneDer Feb 16, 2025
ceb2534
WIP: code complete, pending tests
GeneDer Feb 16, 2025
46951b4
lint
GeneDer Feb 16, 2025
3a09df9
lint
GeneDer Feb 16, 2025
df87d16
add asyncache dependency
GeneDer Feb 16, 2025
030a0bc
try again
GeneDer Feb 16, 2025
314c44e
move error handling into configs
GeneDer Feb 16, 2025
6de69a6
fix typo and LLMModelRouterDeployment
GeneDer Feb 16, 2025
31a70c9
refactor build_vllm_deployment
GeneDer Feb 16, 2025
130d094
lint
GeneDer Feb 16, 2025
f63a808
fix imports in test
GeneDer Feb 16, 2025
f6c9541
drop generation config and use hf prompt template by default
GeneDer Feb 16, 2025
a0d3dd7
fix test imports
GeneDer Feb 16, 2025
b8db1f0
test build_openai_app
GeneDer Feb 16, 2025
b07b2c4
add test for using serve run
GeneDer Feb 17, 2025
59b3490
add test for build_vllm_deployment
GeneDer Feb 17, 2025
0ae54ad
add test for vllm engine
GeneDer Feb 17, 2025
cf84767
add test for telemetry
GeneDer Feb 17, 2025
7782178
add test for image retriver
GeneDer Feb 17, 2025
481186f
add streaming_error_handler test
GeneDer Feb 17, 2025
428e223
add tests for lora
GeneDer Feb 17, 2025
d749f89
add integration tests
GeneDer Feb 17, 2025
94ab191
drop asyncache requirement
GeneDer Feb 17, 2025
dc2ebff
ensure all tests are ran
GeneDer Feb 17, 2025
48693a1
test assert False
GeneDer Feb 17, 2025
0276326
try to build with all conftest files
GeneDer Feb 17, 2025
6fc94a6
include yamls to the test
GeneDer Feb 17, 2025
69779da
maybe should be this
GeneDer Feb 17, 2025
57c4642
Revert "maybe should be this"
GeneDer Feb 17, 2025
b496f45
fix some tests and split out longer running ones
GeneDer Feb 17, 2025
24ede65
add back depencencies changes
GeneDer Feb 17, 2025
a86e16a
fix buld file
GeneDer Feb 17, 2025
afd00d9
address comments
GeneDer Feb 17, 2025
76914ee
fix import
GeneDer Feb 17, 2025
26bbf49
fix more imports
GeneDer Feb 17, 2025
31deae2
more fixes and update test depencencies
GeneDer Feb 17, 2025
bcd3618
more fixes and split out the integration test into a large gpu instance
GeneDer Feb 17, 2025
b955e0e
go back to cpu
GeneDer Feb 17, 2025
a13f987
test upgrade test requirements
GeneDer Feb 18, 2025
792f507
Merge branch 'master' into oss-rayllm
GeneDer Feb 18, 2025
1963d4a
update requirements_compiled.txt
GeneDer Feb 18, 2025
2d238ed
update compiled requirements
GeneDer Feb 18, 2025
5eb5c2c
update non-test compiled requirements
GeneDer Feb 18, 2025
3d6f887
Merge branch 'master' into oss-rayllm
GeneDer Feb 18, 2025
bc60c2e
fix all download_model_ckpt
GeneDer Feb 18, 2025
e097712
fix hf_prompt_format test
GeneDer Feb 18, 2025
23bab42
test using GPU
GeneDer Feb 18, 2025
b49adfe
add botocore as requirement
GeneDer Feb 18, 2025
8e4df55
add boto3
GeneDer Feb 18, 2025
b6200bb
add async_timeout
GeneDer Feb 18, 2025
9825678
Merge branch 'master' into oss-rayllm
GeneDer Feb 18, 2025
c5fc505
add TODOs on removing dependencies
GeneDer Feb 18, 2025
144dd5f
drop dependencies changes
GeneDer Feb 18, 2025
7846937
Merge branch 'master' into oss-rayllm
GeneDer Feb 18, 2025
e2ac46c
drop python/__init__.py
GeneDer Feb 19, 2025
add back depencencies changes
Signed-off-by: Gene Su <e870252314@gmail.com>
GeneDer committed Feb 17, 2025
commit 24ede65d04b024b3d02098a116c65501509739d0
@@ -1,7 +1,10 @@
import json
import subprocess
import time
from typing import Any, Dict, List, Optional, Tuple, Union

from asyncache import cached
from cachetools import TLRUCache
from fastapi import HTTPException
from filelock import FileLock

@@ -190,6 +193,14 @@ def _get_object_from_cloud(object_uri: str) -> Union[str, object]:
return body_str


@cached(
cache=TLRUCache(
maxsize=4096,
getsizeof=lambda x: 1,
ttu=_validate_model_ttu,
timer=time.monotonic,
)
)
async def get_object_from_cloud(object_uri: str) -> Union[str, object]:
"""Calls _get_object_from_cloud with caching.

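The hunk above wraps `get_object_from_cloud` in asyncache's `@cached` with a cachetools `TLRUCache`, where a `ttu` callback (`_validate_model_ttu` in the diff) decides per entry when a cached cloud object expires. A stdlib-only sketch of those semantics, with a hypothetical fixed 60-second `ttu` standing in for the real callback:

```python
import asyncio
import time


def async_tlru(ttu, timer=time.monotonic, maxsize=4096):
    """Minimal sketch of TLRU caching for an async function.

    Mirrors the asyncache/cachetools combination in the diff: ttu(key, value,
    now) returns the absolute time at which the cached entry expires.
    """

    def decorator(fn):
        cache = {}  # key -> (value, expires_at)

        async def wrapper(*args):
            now = timer()
            hit = cache.get(args)
            if hit is not None and hit[1] > now:
                return hit[0]  # fresh entry: skip the underlying call
            value = await fn(*args)
            if len(cache) >= maxsize:
                cache.pop(next(iter(cache)))  # evict the oldest insertion
            cache[args] = (value, ttu(args, value, now))
            return value

        return wrapper

    return decorator


calls = 0


# Hypothetical stand-in for the decorated get_object_from_cloud: here every
# entry simply lives for 60 seconds after insertion.
@async_tlru(ttu=lambda key, value, now: now + 60.0)
async def get_object_from_cloud(object_uri):
    global calls
    calls += 1
    return f"body of {object_uri}"


async def demo():
    first = await get_object_from_cloud("s3://bucket/model/config.json")
    second = await get_object_from_cloud("s3://bucket/model/config.json")
    return first, second


first, second = asyncio.run(demo())
```

Repeated awaits for the same URI within the TTU window return the cached body without re-fetching, which is the point of caching cloud object lookups in the serving path.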
2 changes: 2 additions & 0 deletions python/requirements/llm/llm-requirements.txt
@@ -1,2 +1,4 @@
# Keep this in sync with the definition in setup.py for ray[llm]
vllm>=0.7.2
asyncache==0.3.1
jsonref==1.1.0
11 changes: 11 additions & 0 deletions python/requirements_compiled_rayllm_py311.txt
@@ -139,6 +139,12 @@ astor==0.8.1 \
# via
# -c python/requirements_compiled_rayllm_test_py311.txt
# depyf
asyncache==0.3.1 \
--hash=sha256:9a1e60a75668e794657489bdea6540ee7e3259c483517b934670db7600bf5035 \
--hash=sha256:ef20a1024d265090dd1e0785c961cf98b9c32cc7d9478973dcf25ac1b80011f5
# via
# -c python/requirements_compiled_rayllm_test_py311.txt
# -r python/requirements/llm/llm-requirements.txt
attrs==25.1.0 \
--hash=sha256:1c97078a80c814273a76b2a298a932eb681c87415c11dee0a6921de7f1b02c3e \
--hash=sha256:c75a69e28a550a7e93789579c22aa26b0f5b83b75dc4e08fe092980051e1090a
@@ -247,6 +253,7 @@ cachetools==5.3.2 \
--hash=sha256:861f35a13a451f94e301ce2bec7cac63e881232ccce7ed67fab9b5df4d3beaa1
# via
# -c python/requirements_compiled_rayllm_test_py311.txt
# asyncache
# google-auth
certifi==2023.11.17 \
--hash=sha256:9b469f3a900bf28dc19b8cfbf8019bf47f7fdd1a65a1d4ffb98fc14166beb4d1 \
@@ -1058,6 +1065,10 @@ jiter==0.8.2 \
# via
# -c python/requirements_compiled_rayllm_test_py311.txt
# openai
jsonref==1.1.0 \
--hash=sha256:32fe8e1d85af0fdefbebce950af85590b22b60f9e95443176adbde4e1ecea552 \
--hash=sha256:590dc7773df6c21cbf948b5dac07a72a251db28b0238ceecce0a2abfa8ec30a9
# via -r python/requirements/llm/llm-requirements.txt
jsonschema==4.23.0 \
--hash=sha256:d71497fef26351a33265337fa77ffeb82423f3ea21283cd9467bb03999266bc4 \
--hash=sha256:fbadb6f8b144a8f8cf9f0b89ba94501d143e50411a1278633f56a7acf7fd5566
7 changes: 6 additions & 1 deletion python/setup.py
@@ -361,7 +361,12 @@ def get_packages(self):
#
# Keep this in sync with python/requirements/llm/llm-requirements.txt
#
setup_spec.extras["llm"] = list(set(["vllm>=0.7.2"] + setup_spec.extras["data"]))
setup_spec.extras["llm"] = list(
set(
["vllm>=0.7.2", "asyncache==0.3.1", "jsonref==1.1.0"]
+ setup_spec.extras["data"]
)
)

# These are the main dependencies for users of ray. This list
# should be carefully curated. If you change it, please reflect
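The `setup.py` hunk merges the new pins into the existing `data` extra and deduplicates through `set()`. A quick sketch of that merge, using a hypothetical stand-in for `setup_spec.extras["data"]` (the real list lives in `python/setup.py`):

```python
# Hypothetical contents of setup_spec.extras["data"]; note it may already
# carry an overlapping pin such as vllm.
data_extra = ["pandas", "pyarrow>=9.0.0", "vllm>=0.7.2"]

# Mirrors the diff: set() drops duplicate requirement strings (the repeated
# vllm pin appears once), at the cost of a nondeterministic ordering.
llm_extra = list(
    set(["vllm>=0.7.2", "asyncache==0.3.1", "jsonref==1.1.0"] + data_extra)
)
```

Note that `set()` only deduplicates exact string matches; two different pins for the same package (e.g. `vllm>=0.7.2` vs. `vllm>=0.7.3`) would both survive, which is why the comment in the diff asks to keep this list in sync with `llm-requirements.txt`.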