
Adds multimodal support and MMMU pro #675


Merged: 43 commits merged into main on May 19, 2025

Conversation

@NathanHB (Member) commented on Apr 15, 2025:

Aims to add multimodal support for transformers models by creating the VLMTransformersModel.

  • Adds the MMMU task

  • Modifies the prompt manager to support multimodal input (images for now)

  • Adds a lighteval accelerate vlm CLI entry for creating the right config and using the VLMTransformersModel

  • Failing tests are caused by a shortcut; this is expected

To test / use:

```
uv run lighteval accelerate "model_name=HuggingFaceTB/SmolVLM-Instruct" "lighteval|mmmu_pro|0|0" --use-chat-template
```
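
For context, here is a minimal sketch of the kind of multimodal chat-template input the prompt manager has to produce for a model like SmolVLM. It uses the standard transformers AutoProcessor API rather than this PR's internals, and the image path and question text are placeholders:

```python
# Hedged sketch (not lighteval code): how a VLM processor such as SmolVLM's
# consumes chat messages that mix image and text content parts.
from PIL import Image
from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained("HuggingFaceTB/SmolVLM-Instruct")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},  # placeholder slot for the attached image
            {"type": "text", "text": "What is shown in this image?"},  # placeholder question
        ],
    }
]

# Render the chat template to a text prompt, then batch text and pixels together.
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[Image.open("example.png")], return_tensors="pt")
```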

@HuggingFaceDocBuilderDev (Collaborator) commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@NathanHB (Member, Author) commented:
Work needed will mainly be here; the first step is to have the greedy_until function working.

Comment on lines 498 to 502
# TODO: What is the best option to pass images to the requests?
# Dirty hack for now: copy the doc-level `specific` payload (which carries
# the images) onto every request built from this document.
for reqs in requests.values():
    for req in reqs:
        req.specific = formatted_doc.specific
Reviewer (Member):

What is the best option to pass images to the requests?

@NathanHB (Member, Author) replied:

For now it uses specifics, but we could also add an images field to the request that defaults to None, as sketched below.
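
For illustration, a hypothetical sketch of that alternative; the Request class below is a simplified stand-in for lighteval's actual request types, not the real definition:

```python
# Hypothetical sketch: an explicit `images` field on requests instead of
# routing images through `specific`. All names here are illustrative.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Request:  # simplified stand-in for lighteval's request classes
    task_name: str
    context: str
    images: Optional[list] = None  # text-only tasks leave this as None

# Text-only requests stay unchanged; multimodal tasks attach their images.
text_req = Request(task_name="gsm8k", context="Q: 2+2=? A:")
mm_req = Request(
    task_name="mmmu_pro",
    context="<image> Which option matches the diagram?",
    images=[...],  # e.g. a list of PIL.Image objects loaded by the task
)
```

Defaulting to None keeps every existing text-only task untouched, which is the main appeal over reusing specifics.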

@NathanHB added the feature/enhancement (New feature/request) label on May 5, 2025
@qubvel (Member) left a review:

Hey @NathanHB, thanks for reviewing the PR. As we discussed internally, the only thing left is the tests, and the rest can be merged in the current state.

I also plan to conduct more experiments to evaluate VLM models after my vacation, and maybe we will already have some user feedback to improve the evaluation.

I also left some comments, feel free to resolve them (if needed) while adding tests 🤗 Thanks!

@NathanHB (Member, Author) commented:

Hey @qubvel, taking care of adding tests and removing unneeded stuff. Thank you so much for the help in adding this feature :)

@NathanHB linked an issue on May 15, 2025 that may be closed by this pull request
@NathanHB changed the title from "Adds multimodal support" to "Adds multimodal support and MMMU pro" on May 19, 2025
@NathanHB merged commit 1607dc1 into main on May 19, 2025 (5 checks passed)
hynky1999 pushed a commit that referenced this pull request May 22, 2025
```
uv run lighteval accelerate "model_name=HuggingFaceTB/SmolVLM-Instruct" "lighteval|mmmu_pro|0|0" --use-chat-template --vision-model
```

---------

Co-authored-by: qubvel <qubvel@gmail.com>
Labels: feature/enhancement (New feature/request)
Linked issue: [FT] Add multimodal for transformers models
3 participants