This repository was archived by the owner on Oct 11, 2024. It is now read-only.
forked from vllm-project/vllm
Add lm-eval correctness test #210
Merged
32 commits
952a0db Add test framework for server (dbarbuzzi)
6178aea Update docstring (dbarbuzzi)
c13b5c2 Add missing '__init__.py' (dbarbuzzi)
df48eef In-line updated `ServerRunner` implementation (dbarbuzzi)
09f7161 Restore logging of server command args (dbarbuzzi)
2b32a92 Add lm-eval correctness test (dbarbuzzi)
74d0293 Add "--max-model-len" arg (dbarbuzzi)
4f6a5cf Adjust relative tolerance value to 0.05 (dbarbuzzi)
7392992 Change '--max-model-len' to 2048 (dbarbuzzi)
3ebcc81 Fix comment length, remove outdated comment (dbarbuzzi)
a790a1f Update comment (dbarbuzzi)
431f051 Skip if `lm_eval` is not available (dbarbuzzi)
44d781f Merge branch 'main' into add-lm-eval-correctness-test (dbarbuzzi)
6856f24 Skip test in remote push jobs (dbarbuzzi)
dc33cee Fix check in lm-eval smoke test (dbarbuzzi)
9bf3a71 Update lm-eval smoke job to use prebuilt wheel (dbarbuzzi)
c914b36 Fix typing in test (dbarbuzzi)
da1adf2 Add lm-eval-full job on release runs (dbarbuzzi)
473f8ee Skip full test in nightly (dbarbuzzi)
f316375 Fix style (dbarbuzzi)
c61d6b2 Update eval task configs (dbarbuzzi)
44df6ad Add support for configurable `rtol` (dbarbuzzi)
7a1ecdf Mark 'chat-marlin' model as xfail (dbarbuzzi)
49d115b Use correct label for TEST-LM-EVAL-FULL (dbarbuzzi)
d6571d4 Only run full lm-eval on a weekly cadence (dbarbuzzi)
3b25154 Update naming (dbarbuzzi)
5308642 Add manual release workflow (dbarbuzzi)
4471031 Remove xfail logic (dbarbuzzi)
e972635 Fix release workflow category (dbarbuzzi)
73adc9f Disable marlin models (dbarbuzzi)
638d924 Separate nightly/weekly workflows (dbarbuzzi)
9828633 Additional fix for lm-eval smoke check (dbarbuzzi)
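The commit messages mention a relative tolerance of 0.05, a configurable `rtol`, and skipping the test when `lm_eval` is not installed. The test code itself is not visible on this page, so the following is only a rough sketch of what such a check might look like. All function names, the metric names, and the exact comparison formula are assumptions made for illustration, not the PR's actual implementation.

```python
import importlib.util

# Default taken from the commit "Adjust relative tolerance value to 0.05";
# how the real test wires this value in is an assumption.
DEFAULT_RTOL = 0.05


def lm_eval_available() -> bool:
    """Return True if the `lm_eval` package can be found (hypothetical helper)."""
    return importlib.util.find_spec("lm_eval") is not None


def within_rtol(measured: float, expected: float, rtol: float = DEFAULT_RTOL) -> bool:
    """Check |measured - expected| <= rtol * |expected| (assumed formula)."""
    return abs(measured - expected) <= rtol * abs(expected)


def find_mismatches(measured: dict, expected: dict, rtol: float = DEFAULT_RTOL) -> list:
    """Return names of expected metrics that are missing or out of tolerance."""
    return [
        name
        for name, exp in expected.items()
        if name not in measured or not within_rtol(measured[name], exp, rtol)
    ]
```

A test built on this sketch would skip early when `lm_eval_available()` is false, run the evaluation against the served model, and fail if `find_mismatches` returns a non-empty list. Making `rtol` a parameter lets noisier tasks or models use a looser bound without changing the comparison logic.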
@@ -0,0 +1,5 @@
+neuralmagic/benchmarks/configs/benchmark_serving.json
+neuralmagic/benchmarks/configs/benchmark_throughput.json
+neuralmagic/benchmarks/configs/benchmark_throughput_decode.json
+neuralmagic/benchmarks/configs/benchmark_throughput_prefill.json
+neuralmagic/benchmarks/configs/benchmark_remote_push.json
This file was deleted.