WIP: Merge main into v2 #2116

KennethEnevoldsen · 2025-02-20T19:31:18Z

There were some changes to TaskMetadata's descriptive_stat_path; I am not entirely sure the merge is correct here.
Why the "InstructionReranking"? (I probably missed the reason somewhere.)
@sam-hey they couldn't get the iso library to work. Could it be python-iso639? (It seems the tests passed because of a try/except block.)
@Samoed I tried to calculate desc. stats in v2 but got this error, which I believe you might have fixed (my merge could have undone it):

from mteb.tasks.Reranking.eng import BuiltBenchReranking

task = BuiltBenchReranking()
task.load_data() # fails (but only on v2) as "default" is not available?

For now I just calculated the desc. stats on main and moved them over.

~~Note: we are likely missing some task imports in v2 (it is impossible to know with the star imports) - any good way to check?~~ I used the now outdated category to check.
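One possible way to check for missing task imports despite the star imports is to list the task classes defined in the source files and compare them with the names the package actually exposes. The sketch below is only an illustration: the helper names `task_classes_in_source` and `missing_task_imports` are hypothetical, and it assumes task classes can be recognised by a direct base class whose name starts with `AbsTask`.

```python
import ast
from pathlib import Path


def task_classes_in_source(source: str, base_prefix: str = "AbsTask") -> set[str]:
    """Return names of classes whose direct bases start with `base_prefix`.

    Assumption: tasks subclass something named AbsTask* directly. Classes
    that inherit from another task class are not detected by this sketch.
    """
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            # Keep only the last dotted component, e.g. "x.AbsTaskFoo" -> "AbsTaskFoo".
            base_names = [ast.unparse(base).split(".")[-1] for base in node.bases]
            if any(name.startswith(base_prefix) for name in base_names):
                found.add(node.name)
    return found


def missing_task_imports(tasks_dir: Path, exported_names: set[str]) -> set[str]:
    """Return task classes defined under `tasks_dir` but absent from `exported_names`."""
    defined = set()
    for path in tasks_dir.rglob("*.py"):
        defined |= task_classes_in_source(path.read_text(encoding="utf-8"))
    return defined - exported_names
```

It could be called with the tasks directory and the set of names visible on the tasks package (for example via `dir()`); both of those locations are assumptions here rather than something confirmed in this PR.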

It might be nice to start doing these merges more often?

Code Quality

Code Formatted: Format the code using make lint to maintain consistent style.

Documentation

Updated Documentation: Add or update documentation to reflect the changes introduced in this PR.

Testing

New Tests Added: Write tests to cover new functionality. Validate with make test-with-coverage.
Tests Passed: Run tests locally using make test or make test-with-coverage to ensure no existing functionality is broken.

Samoed · 2025-02-20T19:42:36Z

Since there are no more duplicate class names, you can run scripts/generate_imports.py

Samoed · 2025-02-21T15:29:41Z

Why the "InstructionReranking"? (I probably missed the reason somewhere.)

Instruction reranking was added in #1359 after unifying retrieval and reranking.

There were some changes to TaskMetadata's descriptive_stat_path; I am not entirely sure the merge is correct here.

Yes, that's correct.

I tried to calculate desc. stats in v2 but got this error, which I believe you might have fixed (my merge could have undone it)

I think you've fixed it for now.

Samoed

Great!

sam-hey · 2025-02-21T16:03:44Z

Just had a look at the original PR. Back then it was iso639; I think python-iso639 is also fine to use. Not sure when the dependency was removed.

Samoed · 2025-02-21T16:11:30Z

To load BuiltBenchReranking correctly, you need to add it to OLD_FORMAT_RERANKING_TASKS in

mteb/mteb/abstasks/AbsTaskReranking.py

Lines 13 to 30 in 056c09a

    
    OLD_FORMAT_RERANKING_TASKS = [
        "MindSmallReranking",
        "SciDocsRR",
        "StackOverflowDupQuestions",
        "WebLINXCandidatesReranking",
        "AlloprofReranking",
        "SyntecReranking",
        "VoyageMMarcoReranking",
        "ESCIReranking",
        "MIRACLReranking",
        "WikipediaRerankingMultilingual",
        "RuBQReranking",
        "T2Reranking",
        "MMarcoReranking",
        "CMedQAv1-reranking",
        "CMedQAv2-reranking",
        "NamaaMrTydiReranking",
    ]

I'm removing this in #2097

KennethEnevoldsen · 2025-02-21T16:48:03Z

The failures seem to be HF failures. I will try rerunning to see if that resolves them.

Samoed · 2025-02-23T16:45:12Z

I think we should modify test_dataset_availability to test datasets independently. Currently, if one test fails, all datasets are rechecked due to pytest.mark.flaky, which results in around 4,000 requests after retries. This could lead to Hugging Face blocking our requests.
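To make the request count concrete, here is a rough sketch comparing the two retry strategies. It does not use the real test or any Hugging Face call: `probe` is a hypothetical stand-in for a single availability request, and both function names are made up for illustration. In pytest terms, the second strategy would presumably correspond to one parametrized test case per dataset (e.g. via `pytest.mark.parametrize`), so that a rerun only repeats the failing case.

```python
from typing import Callable, Iterable


def batch_retry_requests(
    names: Iterable[str], probe: Callable[[str], bool], max_attempts: int = 3
) -> int:
    """Recheck every dataset whenever any single one fails; return total requests."""
    names = list(names)
    requests = 0
    for _ in range(max_attempts):
        all_ok = True
        for name in names:
            requests += 1
            if not probe(name):
                all_ok = False
        if all_ok:
            break
    return requests


def per_dataset_retry_requests(
    names: Iterable[str], probe: Callable[[str], bool], max_attempts: int = 3
) -> tuple[int, list[str]]:
    """Retry only the dataset that failed; return (total requests, failed names)."""
    requests = 0
    failed = []
    for name in names:
        for _ in range(max_attempts):
            requests += 1
            if probe(name):
                break
        else:
            # No attempt succeeded for this dataset.
            failed.append(name)
    return requests, failed
```

With ten datasets, one of which is permanently unavailable, and three attempts, the batch strategy issues 30 requests while the per-dataset strategy issues 12 and reports the single failing name.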


KennethEnevoldsen added 2 commits February 20, 2025 20:25

Merge main into v2

942a5a9

format

c2d1af1

KennethEnevoldsen added 4 commits February 20, 2025 20:48

tests collect

e2a0f87

most tests now pass

50422bd

add missing desc. stats

08eee44

format and remove junk file

ac322e7

KennethEnevoldsen marked this pull request as ready for review February 21, 2025 15:26

KennethEnevoldsen requested review from isaac-chung and Samoed February 21, 2025 15:26

Samoed approved these changes Feb 21, 2025


add BuiltBenchReranking to old format reranking

2646e07

Samoed mentioned this pull request Feb 21, 2025

[v2] reupload reranking datasets in old format #2097

Open

20 tasks

Samoed added 2 commits February 23, 2025 18:26

update imports

22c01ba

add missing descriptive stats

8d1f158

KennethEnevoldsen added 2 commits February 23, 2025 19:58

Added missing desc stats

52a92f2

Merge branch 'merged-from-main' of https://github.com/embeddings-benc…

29cc478

…hmark/mteb into merged-from-main
