fixed mmlu generative response extraction #2503

Merged: 8 commits, Jan 20, 2025

RawthiL
Contributor

@RawthiL RawthiL commented Nov 18, 2024

Related to #2279

The answer filtering in the generative modality is too strict, resulting in zero accuracy on many models. This PR adds a filter to recover cases like:

  • Correct answer is (B) -> (B)
  • (B) -> (B)
  • (B)\n -> (B)
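
A minimal sketch of the kind of extraction the cases above call for, assuming answers are a single letter A-D in parentheses. The pattern, the `extract_choice` helper, and the `[invalid]` fallback are illustrative assumptions, not necessarily the filter this PR adds:

```python
import re

# Assumption: choices are one capital letter A-D wrapped in parentheses.
ANSWER_PATTERN = re.compile(r"\(([A-D])\)")


def extract_choice(response: str, fallback: str = "[invalid]") -> str:
    """Return the first parenthesized choice in `response`, e.g. "(B)".

    Surrounding text and trailing whitespace are ignored. `fallback` is a
    hypothetical placeholder returned when no choice is found.
    """
    match = ANSWER_PATTERN.search(response)
    if match is None:
        return fallback
    return f"({match.group(1)})"


if __name__ == "__main__":
    for raw in ["Correct answer is (B)", "(B)", "(B)\n", "I am not sure"]:
        print(repr(raw), "->", extract_choice(raw))
```

Taking the first match keeps the sketch simple; a response that lists several options before settling on one would need a different rule.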

Also, the sub-task metric was changed from acc to exact_match, to match the aggregated metric.

An example of this PR in execution can be seen in this comment #2279 (comment)

@baberabb
Contributor

baberabb commented Dec 4, 2024

Hi! Looks good to me! Just need you to increment the version. Also, you can add some args to exact_match if you think they are appropriate.

@RawthiL
Contributor Author

RawthiL commented Dec 4, 2024

Done, also added the arguments to ignore punctuation and case, which can be useful.
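
To illustrate what ignoring punctuation and case does to the comparison, here is a small stand-alone sketch. It is not the harness's metric code; the function name and signature are hypothetical and only mirror the two options mentioned above:

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def exact_match(prediction: str, reference: str,
                ignore_case: bool = False,
                ignore_punctuation: bool = False) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""

    def normalize(text: str) -> str:
        if ignore_case:
            text = text.lower()
        if ignore_punctuation:
            text = text.translate(_PUNCT_TABLE)
        return text

    return 1.0 if normalize(prediction) == normalize(reference) else 0.0


if __name__ == "__main__":
    # Strict comparison fails on case and the trailing period.
    print(exact_match("(b).", "(B)"))
    # With both options enabled, both sides reduce to "b".
    print(exact_match("(b).", "(B)", ignore_case=True, ignore_punctuation=True))
```

With both options on, the parentheses themselves are stripped, so "(B)" and "B" count as the same answer.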

@RawthiL RawthiL force-pushed the mmlu_generative_fix branch from 578dbdb to eeaf4ce Compare December 4, 2024 18:40
@RawthiL
Contributor Author

RawthiL commented Dec 4, 2024

I rebased onto main, but I see some linting errors in files that are not part of my commit. There is also a failed test, but I'm not sure whether the problem is on my end.

@baberabb baberabb merged commit 12b6eeb into EleutherAI:main Jan 20, 2025
8 checks passed