Open
Description
Prerequisite
- I have searched Issues and Discussions but cannot get the expected help.
- The bug has not been fixed in the latest version.
Type
I'm evaluating with the officially supported tasks/models/datasets.
Environment
None
Reproduces the problem - code/configuration sample
None
Reproduces the problem - command or script
None
Reproduces the problem - error message
None
Other information
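I recently noticed that the OpenCompass (司南) official website added a daily benchmark module, which is very helpful. However, while reading the AI interpretations in it, I found some hallucinations.
For example, in the October 30 entry for the paper "Automating Benchmark Design", the AI interpretation states: "Cost black hole: building a new benchmark requires hundreds of expert-months of human effort (e.g., MMLU cost more than 2 million USD)". The original paper does not mention the construction cost of MMLU. Where does this figure come from, or is it a hallucination by the large language model?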