[VLM] Support caching in merged multi-modal processor #11396

Merged
merged 82 commits into from
Dec 27, 2024
faa9b84
Refactor multi-modal processor to support caching
DarkLight1337 Dec 19, 2024
9711a15
Clean up
DarkLight1337 Dec 19, 2024
29e3fcd
Fix cached result being mutated
DarkLight1337 Dec 19, 2024
ab64e85
Rename
DarkLight1337 Dec 19, 2024
81215a2
Fix docs
DarkLight1337 Dec 19, 2024
cf52b3b
Fix a typo
DarkLight1337 Dec 19, 2024
a4a8eb9
Fix unhandled sampling rate in initialization
DarkLight1337 Dec 19, 2024
c48f7c5
format
DarkLight1337 Dec 19, 2024
b84ff42
Change the delimiter
DarkLight1337 Dec 19, 2024
c3f1bde
Fix extra dimension
DarkLight1337 Dec 19, 2024
32e5197
Update
DarkLight1337 Dec 19, 2024
7264d4e
Use the inner processor to enable fine-grained caching
DarkLight1337 Dec 20, 2024
02ea829
Make the cache optional
DarkLight1337 Dec 20, 2024
b981a9d
Fix invalid kwargs being passed to tokenizer
DarkLight1337 Dec 20, 2024
5dde7d0
Fix Phi3V prompt replacement
DarkLight1337 Dec 20, 2024
7339ab8
Refine
DarkLight1337 Dec 20, 2024
509411d
Enable fine-grained caching for audio models
DarkLight1337 Dec 20, 2024
c0454f5
Add fallback
DarkLight1337 Dec 20, 2024
d50ef03
Fix typo
DarkLight1337 Dec 20, 2024
81f7d61
Fix video processor for Qwen2-VL
DarkLight1337 Dec 20, 2024
13eede3
Merge branch 'main' into mm-processor-cache
DarkLight1337 Dec 20, 2024
affbc5c
Fix a bunch of type errors
DarkLight1337 Dec 20, 2024
b4ddfb1
Fix qwen2-vl
DarkLight1337 Dec 20, 2024
4b3db32
Fix
DarkLight1337 Dec 20, 2024
dafbc7f
Simplify Pixtral-HF
DarkLight1337 Dec 21, 2024
38aaff8
Cleanup
DarkLight1337 Dec 21, 2024
5fcb5d6
Fix Pixtral-HF
DarkLight1337 Dec 21, 2024
f86e148
Enable caching outside the processing loop
DarkLight1337 Dec 21, 2024
337f0d2
Make debugging easier
DarkLight1337 Dec 21, 2024
c01d38a
Update
DarkLight1337 Dec 21, 2024
84f02fb
Fix ultravox
DarkLight1337 Dec 21, 2024
9f417c2
Revert some unnecessary changes
DarkLight1337 Dec 21, 2024
00b765b
Merge branch 'main' into mm-fields
DarkLight1337 Dec 22, 2024
2ed431e
Add test and fix some issues
DarkLight1337 Dec 23, 2024
baaf551
Update
DarkLight1337 Dec 23, 2024
f5dbcb8
Fix
DarkLight1337 Dec 23, 2024
afd3f4f
Rework
DarkLight1337 Dec 23, 2024
6172450
Rename the test
DarkLight1337 Dec 23, 2024
416943d
Update count
DarkLight1337 Dec 23, 2024
86f2786
Rename
DarkLight1337 Dec 23, 2024
f5b6214
Some fixes
DarkLight1337 Dec 23, 2024
8a68e87
Cleanup
DarkLight1337 Dec 23, 2024
ab7e84b
Skip unspecified fields
DarkLight1337 Dec 23, 2024
9f2cdaa
Fix equality checking
DarkLight1337 Dec 23, 2024
d11e833
Consolidate common code
DarkLight1337 Dec 23, 2024
5fee280
Improve error message
DarkLight1337 Dec 23, 2024
6182fd6
Cleanup
DarkLight1337 Dec 23, 2024
e1214cf
Fix Pixtral-HF
DarkLight1337 Dec 23, 2024
c717bce
Fix missing mm_count key
DarkLight1337 Dec 23, 2024
023890e
Fix qwen2-vl
DarkLight1337 Dec 23, 2024
b5e5b8a
Fix Qwen2-VL
DarkLight1337 Dec 23, 2024
cf24a1b
Fix Qwen2-VL and Qwen2-Audio
DarkLight1337 Dec 23, 2024
73271e9
Debug Phi3V
DarkLight1337 Dec 23, 2024
e30deec
Consolidate common code
DarkLight1337 Dec 23, 2024
ea6f8b5
Try to fix Phi3V and Ultravox
DarkLight1337 Dec 23, 2024
10ae755
Remove benchmark
DarkLight1337 Dec 23, 2024
85c5e2c
Fix token mismatch in Phi3V and Ultravox
DarkLight1337 Dec 23, 2024
4873ff8
Update max image tokens
DarkLight1337 Dec 23, 2024
4dbb5a3
Strictly check the number of placeholder tokens
DarkLight1337 Dec 23, 2024
6dbae81
Fix doc failure
DarkLight1337 Dec 23, 2024
fb51c9b
Test and fix Mantis processor
DarkLight1337 Dec 24, 2024
91cbd63
Fix embedding inputs
DarkLight1337 Dec 24, 2024
6bee6ba
Update entrypoints tests
DarkLight1337 Dec 24, 2024
cfa2ce8
Merge branch 'main' into mm-fields
DarkLight1337 Dec 24, 2024
fa54292
Clean up
DarkLight1337 Dec 24, 2024
cbf79be
Avoid extra placeholder in phi3v
DarkLight1337 Dec 24, 2024
9cd38b1
Fix OOM
DarkLight1337 Dec 24, 2024
14dcdd5
Fix mantis processor
DarkLight1337 Dec 24, 2024
b8bd2d4
Merge branch 'main' into mm-fields
DarkLight1337 Dec 24, 2024
5045d93
Remove redundant code
DarkLight1337 Dec 24, 2024
4cac998
Still need Mantis repo for testing
DarkLight1337 Dec 24, 2024
e8afd10
Merge branch 'main' into mm-fields
DarkLight1337 Dec 25, 2024
93bba0a
Fix incorrect max image tokens (Updated in #11258)
DarkLight1337 Dec 25, 2024
ea9f888
Also cache by model ID
DarkLight1337 Dec 25, 2024
58747f6
Format
DarkLight1337 Dec 25, 2024
323657a
Update link
DarkLight1337 Dec 25, 2024
695c79e
Merge branch 'main' into mm-fields
DarkLight1337 Dec 26, 2024
c67efda
Address some comments
DarkLight1337 Dec 26, 2024
d4abec7
Move `MultiModalDataItems` to `inputs` module to address more comments
DarkLight1337 Dec 26, 2024
9f4a8be
Add documentation
DarkLight1337 Dec 26, 2024
1d5b56d
Fix circular import
DarkLight1337 Dec 26, 2024
e4c7a14
Update docs
DarkLight1337 Dec 26, 2024
Fix qwen2-vl
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
DarkLight1337 committed Dec 20, 2024
commit b4ddfb15f1f33f52c552f95d29d45c0a464ecfa3
4 changes: 2 additions & 2 deletions vllm/multimodal/processing.py
@@ -708,8 +708,8 @@ def _cached_call_fine(
 )

 for k, v in processed_modal_item.items():
-    # Remove the extra batch dimension
-    processed_modal_items[k].append(v[0])
+    # Remove the extra batch dimension (if it exists)
+    processed_modal_items[k].append(v.squeeze(0))

 for k, vs in processed_modal_items.items():
     # Try to merge elements into a single tensor
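
The change from `v[0]` to `v.squeeze(0)` matters when a processed value has no leading batch dimension. The following is a minimal sketch of that difference, not code from the PR: the tensor names and shapes are hypothetical and chosen only for illustration.

```python
import torch

# Hypothetical per-item outputs; the shapes are illustrative assumptions.
batched = torch.zeros(1, 3, 4)   # has a leading batch dimension of size 1
unbatched = torch.zeros(5, 4)    # has no batch dimension

# Indexing with [0] always drops the first dimension, whatever it represents.
indexed_batched = batched[0]       # shape (3, 4): batch dimension removed
indexed_unbatched = unbatched[0]   # shape (4,): only the first row survives

# squeeze(0) drops the first dimension only when its size is exactly 1.
squeezed_batched = batched.squeeze(0)      # shape (3, 4)
squeezed_unbatched = unbatched.squeeze(0)  # shape (5, 4): left unchanged

print(tuple(indexed_unbatched.shape), tuple(squeezed_unbatched.shape))
```

Note that `squeeze(0)` also removes a first dimension of size 1 that is not a batch dimension (for example, a single row of data), so it is a heuristic rather than a strict check.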