Changed MMLU Pro for Non-COT Version #3108

siyagoel · 2024-10-28T21:41:42Z

Added mmlu_pro.py (the scenario), lite_run_spec.py, and test_mmlu_pro_scenario.py.

This is the implementation for MMLU Pro without Chain of Thought.

yifanmai

Looks good besides a few minor things. Thanks!

yifanmai · 2024-10-29T02:32:08Z

src/helm/benchmark/scenarios/mmlu_pro.py

+        }
+        for hf_split, split in splits.items():
+            data = dataset[hf_split].filter(lambda x: x["category"] == self.subject)
+            print(f"Filtered instances in {hf_split}: {len(data)}")


Remove the print.
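One way to keep the count visible without a stray `print` is to log it. The following is a minimal sketch, assuming rows are plain dicts with a `category` key. It uses the standard `logging` module as a stand-in for HELM's `hlog`, and `filter_by_subject` is a hypothetical helper name, not something from this pull request.

```python
import logging
from typing import Dict, List

logger = logging.getLogger(__name__)


def filter_by_subject(rows: List[Dict[str, str]], subject: str) -> List[Dict[str, str]]:
    """Keep only the rows whose "category" equals the requested subject."""
    kept = [row for row in rows if row["category"] == subject]
    # Log at debug level instead of printing, so normal runs stay quiet.
    logger.debug("Filtered instances for %s: %d", subject, len(kept))
    return kept


# Hypothetical sample rows, for illustration only.
rows = [
    {"category": "math", "question": "q1"},
    {"category": "health", "question": "q2"},
    {"category": "math", "question": "q3"},
]
math_rows = filter_by_subject(rows, "math")
```

The count is still available when debug logging is enabled, but nothing is written to stdout by default.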

yifanmai · 2024-10-29T02:32:39Z

src/helm/benchmark/scenarios/test_mmlu_pro_scenario.py

+        # Test for the "abstract_algebra" subject
+        scenario = MMLUProScenario(subject="math")
+        instances = scenario.get_instances(tmpdir)
+        # assert len(instances) == 116


Uncomment or delete this line.

yifanmai · 2024-10-29T02:33:32Z

src/helm/benchmark/scenarios/test_mmlu_scenario.py

+        assert instances[1].input == Input(text="Find all c in Z_3 such that Z_3[x]/(x^2 + c) is a field.")
+        assert instances[1].references == [


Revert changes in this file.

yifanmai · 2024-10-29T02:33:48Z

src/helm/benchmark/scenarios/test_mmlu_pro_scenario.py

+        # Test for the "anatomy" subject
+        scenario = MMLUProScenario(subject="health")
+        instances = scenario.get_instances(tmpdir)
+        # assert len(instances) == 154


Uncomment or delete this line.

yifanmai · 2024-10-29T02:34:27Z

src/helm/benchmark/scenarios/mmlu_pro.py

+        hlog(f"Processing data for {split} split")
+        for row in data:
+            question = row["question"]
+            answers = row["options"][:10]  # Limit to 10 answers if necessary


Do we actually need [:10]?

siyagoel · 2024-10-31

No, I will remove this.
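To illustrate why the slice is likely redundant, here is a small sketch. It assumes each row carries at most ten options, which is an assumption about the dataset and is not verified in this thread. The option lists below are made up for illustration.

```python
# Hypothetical option lists, for illustration only.
ten_options = [chr(ord("A") + i) for i in range(10)]  # "A" through "J"
four_options = ["A", "B", "C", "D"]

# Slicing past the end of a list is safe and returns a copy of the whole list,
# so [:10] changes nothing when there are ten or fewer options.
same_ten = ten_options[:10]
same_four = four_options[:10]

# With more than ten entries the slice silently discards the extras,
# which could hide a correct answer rather than raise an error.
eleven_options = ten_options + ["K"]
truncated = eleven_options[:10]
```

If the assumption holds, the slice is a no-op. If it does not, the slice would drop answers silently, so removing it is the safer choice either way.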

yifanmai · 2024-10-31T22:28:00Z

Merging this pull request broke main: https://github.com/stanford-crfm/helm/actions/runs/11620257291

Could you open a new pull request to fix the tests and address the open comments in this pull request?

siyagoel added 5 commits October 27, 2024 21:05

New changes to MMLU Pro

7ba9c19

commit to mmlupronew

1db01a2

Implementing MMLU Pro final changes.

ec9eab5

Changing images_utils.py for mmlu pro

74b37b7

Changes for formatting for MMLU Pro Files

6273765

yifanmai approved these changes Oct 29, 2024

siyagoel merged commit b92b93f into main Oct 31, 2024
8 of 12 checks passed

siyagoel deleted the siyagoel/mmluprofinal branch October 31, 2024 21:35

yifanmai mentioned this pull request Nov 18, 2024

Added scenario for MMLU Pro #3077

Closed
