[Frontend] Set server's maximum number of generated tokens using generation_config.json #12242

Merged
merged 34 commits on Jan 26, 2025
Changes from 1 commit
5c85448
Adding max_new_tokens support to generation_config.json
mhendrey Jan 20, 2025
4ad6b45
Changed default_max_tokens to server_max_tokens
mhendrey Jan 20, 2025
95f9c97
Renamed default_max_tokens to server_max_tokens
mhendrey Jan 20, 2025
4786e56
Removed the float("inf") bug
mhendrey Jan 20, 2025
4980a73
Renamed default_max_tokens to server_max_tokens
mhendrey Jan 20, 2025
39d7d76
Rearranged lines to make the changes with existing as small as possible
mhendrey Jan 20, 2025
b6a24c4
Limit generated tokens by server's max_tokens setting when available
mhendrey Jan 20, 2025
aa7cff1
Changed syntax to pass format.sh tests
mhendrey Jan 20, 2025
2f6e43b
[Bugfix] Fix num_heads value for simple connector when tp enabled (#1…
ShangmingCai Jan 20, 2025
6baa0ea
[torch.compile] fix sym_tensor_indices (#12191)
youkaichao Jan 20, 2025
35b5948
Move linting to `pre-commit` (#11975)
hmellor Jan 20, 2025
0c2f332
[DOC] Fix typo in docstring and assert message (#12194)
terrytangyuan Jan 20, 2025
46249e5
[DOC] Add missing docstring in LLMEngine.add_request() (#12195)
terrytangyuan Jan 20, 2025
0b2e3de
[Bugfix] Fix incorrect types in LayerwiseProfileResults (#12196)
terrytangyuan Jan 20, 2025
090eca3
[Model] Add Qwen2 PRM model support (#12202)
Isotr0py Jan 20, 2025
5d36c1f
[Core] Interface for accessing model from `VllmRunner` (#10353)
DarkLight1337 Jan 20, 2025
df331a7
[misc] add placeholder format.sh (#12206)
youkaichao Jan 20, 2025
881964d
[CI/Build] Remove dummy CI steps (#12208)
DarkLight1337 Jan 20, 2025
5cc6a09
[CI/Build] Make pre-commit faster (#12212)
DarkLight1337 Jan 20, 2025
9f3d5a6
[Model] Upgrade Aria to transformers 4.48 (#12203)
DarkLight1337 Jan 20, 2025
957ca23
[misc] print a message to suggest how to bypass commit hooks (#12217)
youkaichao Jan 20, 2025
399d224
[core][bugfix] configure env var during import vllm (#12209)
youkaichao Jan 20, 2025
df06503
[V1] Remove `_get_cache_block_size` (#12214)
heheda12345 Jan 20, 2025
b89529b
[Misc] Pass `attention` to impl backend (#12218)
wangxiyuan Jan 20, 2025
a5d57f1
[Bugfix] Fix `HfExampleModels.find_hf_info` (#12223)
DarkLight1337 Jan 20, 2025
b1af379
[CI] Pass local python version explicitly to pre-commit mypy.sh (#12224)
heheda12345 Jan 20, 2025
0e3a719
Added tests to check max_tokens is properly set
mhendrey Jan 23, 2025
6867b37
Merge branch 'server_max_tokens'
mhendrey Jan 23, 2025
99243cf
Mucked up the rebasing. Fixing that now.
mhendrey Jan 23, 2025
1a15431
Reverting the serving_chat & serving_completion back and putting all …
mhendrey Jan 23, 2025
c10eb1f
Didn't quite revert back. Deleting empty line from both
mhendrey Jan 23, 2025
a3fc62b
Changed to using one-liner and edited engine arg for generation-config
mhendrey Jan 24, 2025
98949f6
Merge branch 'vllm-project:main' into main
mhendrey Jan 24, 2025
c71f429
Converted to a one-liner for taking minimum value & added to generati…
mhendrey Jan 24, 2025
[CI/Build] Make pre-commit faster (#12212)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Matthew Hendrey <matthew.hendrey@gmail.com>
DarkLight1337 authored and mhendrey committed Jan 23, 2025
commit 5cc6a09ffcf30548f5ec1ba0a1779c0e1087f0da
2 changes: 2 additions & 0 deletions .github/workflows/pre-commit.yml
@@ -15,3 +15,5 @@ jobs:
         python-version: "3.12"
     - run: echo "::add-matcher::.github/workflows/matchers/actionlint.json"
     - uses: pre-commit/action@2c7b3805fd2a0fd8c1884dcaebf91fc102a13ecd # v3.0.1
+      with:
+        extra_args: --hook-stage manual
16 changes: 15 additions & 1 deletion .pre-commit-config.yaml
@@ -1,3 +1,6 @@
+default_stages:
+  - pre-commit # Run locally
+  - manual # Run in CI
 repos:
 - repo: https://github.com/google/yapf
   rev: v0.32.0
@@ -33,30 +36,41 @@ repos:
     files: docs/.*
 - repo: local
   hooks:
+  - id: mypy-local
+    name: Run mypy for local Python installation
+    entry: tools/mypy.sh
+    language: python
+    types: [python]
+    additional_dependencies: &mypy_deps [mypy==1.11.1, types-setuptools, types-PyYAML, types-requests]
+    stages: [pre-commit] # Don't run in CI
   - id: mypy-3.9 # TODO: Use https://github.com/pre-commit/mirrors-mypy when mypy setup is less awkward
     name: Run mypy for Python 3.9
     entry: tools/mypy.sh 1 "3.9"
     language: python
     types: [python]
-    additional_dependencies: &mypy_deps [mypy==1.11.1, types-setuptools, types-PyYAML, types-requests]
+    additional_dependencies: *mypy_deps
+    stages: [manual] # Only run in CI
   - id: mypy-3.10 # TODO: Use https://github.com/pre-commit/mirrors-mypy when mypy setup is less awkward
     name: Run mypy for Python 3.10
     entry: tools/mypy.sh 1 "3.10"
     language: python
     types: [python]
     additional_dependencies: *mypy_deps
+    stages: [manual] # Only run in CI
   - id: mypy-3.11 # TODO: Use https://github.com/pre-commit/mirrors-mypy when mypy setup is less awkward
     name: Run mypy for Python 3.11
     entry: tools/mypy.sh 1 "3.11"
     language: python
     types: [python]
     additional_dependencies: *mypy_deps
+    stages: [manual] # Only run in CI
   - id: mypy-3.12 # TODO: Use https://github.com/pre-commit/mirrors-mypy when mypy setup is less awkward
     name: Run mypy for Python 3.12
     entry: tools/mypy.sh 1 "3.12"
     language: python
     types: [python]
     additional_dependencies: *mypy_deps
+    stages: [manual] # Only run in CI
   - id: shellcheck
     name: Lint shell scripts
     entry: tools/shellcheck.sh
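
To illustrate how the `default_stages` and per-hook `stages` settings in the `.pre-commit-config.yaml` change interact, here is a minimal Python sketch. It is a simplified model, not pre-commit's actual implementation. The `Hook` class and `hooks_for_stage` function are hypothetical names introduced for illustration, and only hook ids visible in the diff are used. The working assumption is that a hook without its own `stages` falls back to `default_stages`.

```python
from dataclasses import dataclass
from typing import List, Optional

# Mirrors the `default_stages` block added by this commit.
DEFAULT_STAGES = ["pre-commit", "manual"]


@dataclass
class Hook:
    """Hypothetical stand-in for one hook entry in .pre-commit-config.yaml."""
    id: str
    stages: Optional[List[str]] = None  # None: fall back to default_stages


def hooks_for_stage(hooks, stage, default_stages=DEFAULT_STAGES):
    """Return the ids of hooks selected for `stage` in this simplified model."""
    selected = []
    for hook in hooks:
        # Assumption: a hook with no explicit stages inherits default_stages.
        effective = hook.stages if hook.stages is not None else default_stages
        if stage in effective:
            selected.append(hook.id)
    return selected


# Hook ids and stages as they appear in the diff.
HOOKS = [
    Hook("mypy-local", ["pre-commit"]),  # Don't run in CI
    Hook("mypy-3.9", ["manual"]),        # Only run in CI
    Hook("mypy-3.10", ["manual"]),
    Hook("mypy-3.11", ["manual"]),
    Hook("mypy-3.12", ["manual"]),
    Hook("shellcheck"),                  # no explicit stages in the diff
]

if __name__ == "__main__":
    print("local commit:", hooks_for_stage(HOOKS, "pre-commit"))
    print("CI (manual): ", hooks_for_stage(HOOKS, "manual"))
```

Under this model, a local commit runs only `mypy-local` among the mypy hooks, while CI, which passes `--hook-stage manual` through `extra_args`, runs the four version-specific mypy hooks instead.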