Support for token-in vLLM endpoint #626
Merged
+404
−51
84 commits
0406268
skeleton of token in multi-turn conversation
mikasenghaas b584f0e
fix url parsing
mikasenghaas bf88a32
allow configuring in eval entrypoint
mikasenghaas 4a2c8a8
raise for status
mikasenghaas 046e433
set sampling args and print warning
mikasenghaas c0e680f
correctly process tools
mikasenghaas 16d68c5
fix eval cli test
mikasenghaas 1d004fe
use setter
mikasenghaas 6389aff
remove prints
mikasenghaas eb4d702
create a oai client copy with diff base url and use post from there t…
mikasenghaas 866384a
also use oai post routes for tokenize
mikasenghaas 94e394b
use incremental tokenization trick again
mikasenghaas b404171
suffix trick
mikasenghaas 56efbfb
only call env_response once
mikasenghaas c9c28d0
only check suffix if and eom token present
mikasenghaas 1c32891
parallelize tokenization calls
mikasenghaas f16797c
avoid redundant concat
mikasenghaas d8ccee2
fix typo
mikasenghaas 96b9a98
move use token prompts into state
mikasenghaas 1276711
add use token prompts to metadata
mikasenghaas 1e94571
removed docstring
mikasenghaas 6ecbae5
merge logic into get_model_response to avoid redundant code
mikasenghaas 0b3fca6
abstract tokenize_vllm and do not use token prompts on initial turn
mikasenghaas 1e0bf29
allow local tokenization to save http overhead
mikasenghaas 2eb4518
rename to utp
mikasenghaas 3d12cf4
better best case complexity for find_lst_index
mikasenghaas 73bf1d9
configure exact tokenization
mikasenghaas b427069
fix caches + one run local tokenizer in process pool
mikasenghaas 5631365
add debug logs
mikasenghaas d71199b
generate client
mikasenghaas fbd9396
reverse conditional
mikasenghaas 79f397d
use thread pool
mikasenghaas 4e847b8
fix caching bug
mikasenghaas 22daf23
cleanup
mikasenghaas 7b25189
more workers in threadpool
mikasenghaas 41c7a29
do exact tokenization
mikasenghaas e0da0fa
bring back setter
mikasenghaas bf19bd3
match signature
mikasenghaas f6ed5e2
setup state from class attr
mikasenghaas 8c6c4e2
also set tokenize method
mikasenghaas ee4e858
avoid mutation
mikasenghaas 1a5d193
larger default thread pool
mikasenghaas cb33caf
make exact tokenization configurable
mikasenghaas 9df126d
fix passing exact tokenization
mikasenghaas e4f4636
use client directly on /v1/chat/completions/tokens route
mikasenghaas dd504cc
add timing [revert later]
mikasenghaas 08e8f07
Revert "add timing [revert later]"
mikasenghaas c942c44
add exact tokenization in rollout
mikasenghaas d931786
make tokens prompt arg none by default
mikasenghaas 67bd8cc
compute + cache suffix ids in non-exact tokenization and make it the …
mikasenghaas 7edddd9
fix tests
mikasenghaas 13be6c9
fix ty
mikasenghaas 329a7ec
fix caching edge case with truncated turns
mikasenghaas 237fb8b
correctly support completions tokenization req
mikasenghaas ca1dc2d
shorter warning
mikasenghaas 71bb1a3
fix ty
mikasenghaas bc113be
only support setting token prompt args via class attrs/ setters
mikasenghaas fd3b726
do not tokenize with tools in non exact mode
mikasenghaas 3eb3643
fix overlap
mikasenghaas 00e39e7
fix adding suffix
mikasenghaas 5d1c7a7
remove tokenize_method
mikasenghaas 9b4d2d6
move tokenize method
mikasenghaas d5317e5
deprecate exact tokenization
mikasenghaas 2864e8e
move building prompt_ids into get_model_respsonse
mikasenghaas b1125d1
using generic extra env kwargs setter
mikasenghaas d7a0a7a
fix tests
mikasenghaas 405ce8c
revert env group changes
mikasenghaas ace0ec1
revert moving line
mikasenghaas d416a6e
remove unused logger
mikasenghaas 8a1c2a5
revert changing state
mikasenghaas d4b44b6
abstract overlong prompt handling into decorator
mikasenghaas 3926b10
rename to get_model_response_with_messages
mikasenghaas 1b1e0c1
fix tokens_client caching
mikasenghaas c596bb9
fix typo
mikasenghaas 129fb43
more typo
mikasenghaas 451d1a2
more accurate comment
mikasenghaas 3759f7f
allow setter
mikasenghaas e36460b
fix url parsing edge case
mikasenghaas 67d98b1
support generic setters
mikasenghaas 3233ed4
move method
mikasenghaas ad15055
more readable find_last_index
mikasenghaas f4c44e5
fix
mikasenghaas 4185685
remove the is not recommended for general use
mikasenghaas 75fa695
fix find last index again
mikasenghaas