@GuilhermeViveiros
Collaborator

This PR introduces several updates.

There has been significant work within deep-spin that, unfortunately, was not merged periodically into main.
I am unable to provide a detailed description of all the changes, but I can highlight the main points.
Previous iterations on this branch focused on evaluating the multimodal capabilities of TowerVision. The key updates are:

  1. Improved post-processing functions for various benchmarks, including ALM-Bench, Commute, Kaleidoscope, etc.
  2. Extended ALM-Bench support to more languages.
  3. Added support for the likelihood function across different multimodal models.
  4. Introduced judges in Aya-Vision.

Additionally, we introduced a new benchmark called Blink.

manzar96 and others added 30 commits December 20, 2024 17:02
…/lmms-eval into multiling_multimodal_tasks_add
manzar96 and others added 27 commits April 27, 2025 14:25
…/lmms-eval into multiling_multimodal_tasks_add
@GuilhermeViveiros GuilhermeViveiros self-assigned this Oct 7, 2025
@GuilhermeViveiros GuilhermeViveiros added the enhancement New feature or request label Oct 7, 2025
5 participants