
siyagoel
Contributor

Added mmlu_pro.py (the scenario), lite_run_spec.py, and test_mmlu_pro_scenario.py.

This is the implementation for MMLU Pro without Chain of Thought.

Collaborator

@yifanmai yifanmai left a comment

Looks good besides a few minor things. Thanks!

}
for hf_split, split in splits.items():
    data = dataset[hf_split].filter(lambda x: x["category"] == self.subject)
    print(f"Filtered instances in {hf_split}: {len(data)}")
Collaborator

Remove the print.

# Test for the "abstract_algebra" subject
scenario = MMLUProScenario(subject="math")
instances = scenario.get_instances(tmpdir)
# assert len(instances) == 116
Collaborator

Uncomment or delete this line.

Comment on lines +14 to +15
assert instances[1].input == Input(text="Find all c in Z_3 such that Z_3[x]/(x^2 + c) is a field.")
assert instances[1].references == [
Collaborator

Revert changes in this file.

# Test for the "anatomy" subject
scenario = MMLUProScenario(subject="health")
instances = scenario.get_instances(tmpdir)
# assert len(instances) == 154
Collaborator

Uncomment or delete this line.

hlog(f"Processing data for {split} split")
for row in data:
    question = row["question"]
    answers = row["options"][:10]  # Limit to 10 answers if necessary
Collaborator

Do we actually need [:10]?

Contributor Author

No, I will remove this.
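
For context on why the slice can go, here is a minimal sketch, assuming no row ever carries more than ten options. The row below is hypothetical and made up for illustration; only the field names `question` and `options` come from the code under review.

```python
# Hypothetical row shaped like the ones iterated over in the scenario code.
# The values are invented; only the field names come from the reviewed snippet.
row = {
    "question": "Which of these is a prime number?",
    "options": ["4", "6", "7", "8", "9", "10", "12", "14", "15", "16"],
}

# Assumption: a row never has more than ten options.
assert len(row["options"]) <= 10

sliced = row["options"][:10]
unsliced = row["options"]

# Slicing past the end of a list never raises, and with ten or fewer
# items it returns an equal copy, so [:10] changes nothing here.
assert sliced == unsliced
```

If that assumption did not hold for some row, the slice would silently drop options, which is one more reason to prefer an explicit check over a quiet truncation.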

@siyagoel siyagoel merged commit b92b93f into main Oct 31, 2024
8 of 12 checks passed
@siyagoel siyagoel deleted the siyagoel/mmluprofinal branch October 31, 2024 21:35
@yifanmai
Collaborator

Merging this pull request broke main: https://github.com/stanford-crfm/helm/actions/runs/11620257291

Could you open a new pull request to fix the tests and also address the open comments in this pull request?
