Experimental results on ARC-C subset for challenging reasoning? #47

tongyx361 · 2024-07-18T06:13:37Z

“Notably, except for ORPO, almost all approaches lead to consistent drops in one or more settings.”
But the ARC results show that commonsense reasoning improves.
This discrepancy between reasoning tasks is counter-intuitive.
I wonder whether it is caused by a difference in task difficulty.
So I am curious about the results on the ARC-C subset.
It would be great if you could curate the related data and report the numbers.
