Description
Prerequisite
Type
I'm evaluating with the officially supported tasks/models/datasets.
Environment
{'CUDA available': False,
'GCC': 'n/a',
'MMEngine': '0.10.7',
'MSVC': 'Microsoft (R) C/C++ Optimizing Compiler Version 19.50.35720 for x64',
'MUSA available': False,
'OpenCV': '4.11.0',
'PyTorch': '2.9.1+cpu',
'PyTorch compiling details': 'PyTorch built with:\n'
' - C++ Version: 201703\n'
' - MSVC 194234444\n'
' - Intel(R) oneAPI Math Kernel Library Version '
'2025.3-Product Build 20251007 for Intel(R) 64 '
'architecture applications\n'
' - Intel(R) MKL-DNN v3.7.1 (Git Hash '
'8d263e693366ef8db40acc569cc7d8edf644556d)\n'
' - OpenMP 2019\n'
' - LAPACK is enabled (usually provided by '
'MKL)\n'
' - CPU capability usage: AVX2\n'
' - Build settings: BLAS_INFO=mkl, '
'BUILD_TYPE=Release, '
'COMMIT_SHA=5811a8d7da873dd699ff6687092c225caffcf1bb, '
'CXX_COMPILER=C:/actions-runner/_work/pytorch/pytorch/pytorch/.ci/pytorch/windows/tmp_bin/sccache-cl.exe, '
'CXX_FLAGS=/DWIN32 /D_WINDOWS /EHsc '
'/Zc:__cplusplus /bigobj /FS /utf-8 '
'-DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO '
'-DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER '
'-DLIBKINETO_NOXPUPTI=ON -DUSE_XNNPACK '
'-DSYMBOLICATE_MOBILE_DEBUG_HANDLE /wd4624 '
'/wd4068 /wd4067 /wd4267 /wd4661 /wd4717 /wd4244 '
'/wd4804 /wd4273, LAPACK_INFO=mkl, '
'PERF_WITH_AVX=1, PERF_WITH_AVX2=1, '
'TORCH_VERSION=2.9.1, USE_CUDA=0, USE_CUDNN=OFF, '
'USE_CUSPARSELT=OFF, USE_GFLAGS=OFF, '
'USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, '
'USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, '
'USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF, '
'USE_ROCM_KERNEL_ASSERT=OFF, USE_XCCL=OFF, '
'USE_XPU=OFF, \n',
'Python': '3.12.3 (tags/v3.12.3:f6650f9, Apr 9 2024, 14:05:25) [MSC v.1938 '
'64 bit (AMD64)]',
'lmdeploy': "not installed:No module named 'lmdeploy'",
'numpy_random_seed': 2147483648,
'opencompass': '0.5.1+unknown',
'sys.platform': 'win32',
'transformers': '4.57.3'}
Reproduces the problem - code/configuration sample
`from opencompass.models import Qwen
api_meta_template = dict(round=[
    dict(role='HUMAN', api_role='HUMAN'),
    dict(role='BOT', api_role='BOT', generate=True),
], )
models = [
    dict(
        type=Qwen,
        abbr="qwen3-max",
        path='qwen3-max',
        key='sk-xxxxxxxxxxxxxxx',  # The key will be obtained from $OPENAI_API_KEY, but you can write down your key here as well
        meta_template=api_meta_template,
        query_per_second=1,
        max_out_len=1024,
        max_seq_len=2048,
        batch_size=8),
]`
`from mmengine.config import read_base
with read_base():
    from opencompass.configs.datasets.demo.demo_gsm8k_chat_gen import \
        gsm8k_datasets
    from opencompass.configs.datasets.demo.demo_math_chat_gen import \
        math_datasets
    from opencompass.configs.models.qwen2_5.qwen_test_opencompass_custom import models as qwen_model
datasets = gsm8k_datasets
models = qwen_model`
Reproduces the problem - command or script
python run.py examples/eval_api_demo.py -w outputs/qwen_max_test --debug
Reproduces the problem - error message
12/12 09:56:48 - OpenCompass - INFO - Task [qwen3-max/demo_gsm8k]
C:\Project\opencompass-plus-main\.venv\Lib\site-packages\jieba\_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
signal.SIGALRM is not available on this platform
signal.SIGALRM is not available on this platform
12/12 09:57:06 - OpenCompass - INFO - Try to load the data from C:\Users\maxweli\.cache/opencompass/./data/gsm8k/
Map:   0%|          | 0/7473 [00:00<?, ? examples/s]
Map:  33%|███▎      | 2496/7473 [00:00<00:00, 24178.01 examples/s]
Map:  73%|███████▎  | 5472/7473 [00:00<00:00, 25674.55 examples/s]
Map: 100%|██████████| 7473/7473 [00:00<00:00, 25481.09 examples/s]
Map:   0%|          | 0/1319 [00:00<?, ? examples/s]
Map: 100%|██████████| 1319/1319 [00:00<00:00, 27596.31 examples/s]
12/12 09:57:06 - OpenCompass - INFO - Start inferencing [qwen3-max/demo_gsm8k]
[2025-12-12 09:57:06,588] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting build dataloader
[2025-12-12 09:57:06,588] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
  0%|          | 0/1 [00:00<?, ?it/s]
100%|██████████| 1/1 [00:19<00:00, 19.64s/it]
100%|██████████| 1/1 [00:19<00:00, 19.64s/it]
12/12 09:57:26 - OpenCompass - INFO - time elapsed: 37.89s
C:\Project\opencompass-plus-main\.venv\Lib\site-packages\jieba\_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
signal.SIGALRM is not available on this platform
signal.SIGALRM is not available on this platform
12/12 09:57:52 - OpenCompass - INFO - Try to load the data from C:\Users\maxweli\.cache/opencompass/./data/gsm8k/
Map:   0%|          | 0/7473 [00:00<?, ? examples/s]
Map:  35%|███▌      | 2652/7473 [00:00<00:00, 25592.68 examples/s]
Map:  72%|███████▏  | 5360/7473 [00:00<00:00, 24882.88 examples/s]
Map: 100%|██████████| 7473/7473 [00:00<00:00, 24828.75 examples/s]
Map:   0%|          | 0/1319 [00:00<?, ? examples/s]
Map: 100%|██████████| 1319/1319 [00:00<00:00, 23369.41 examples/s]
Parameter 'function'=<function OpenICLEvalTask._load_and_preprocess_test_data.<locals>.postprocess at 0x000001B5179956C0> of the transform datasets.arrow_dataset.Dataset._map_single couldn't be hashed properly, a random hash was used instead. Make sure your transforms and parameters are serializable with pickle or dill for the dataset fingerprinting and caching to work. If you reuse this transform, the caching mechanism will consider it to be different from the previous calls and recompute everything. This warning is only showed once. Subsequent hashing failures won't be showed.
Map:   0%|          | 0/8 [00:00<?, ? examples/s]
Map: 100%|██████████| 8/8 [00:00<00:00, 511.68 examples/s]
text None
Traceback (most recent call last):
  File "C:\Project\opencompass-plus-main\opencompass\tasks\openicl_eval.py", line 561, in <module>
    inferencer.run()
  File "C:\Project\opencompass-plus-main\opencompass\tasks\openicl_eval.py", line 93, in run
    self._score()
  File "C:\Project\opencompass-plus-main\opencompass\tasks\openicl_eval.py", line 116, in _score
    pred_strs = self._process_predictions(pred_strs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Project\opencompass-plus-main\opencompass\tasks\openicl_eval.py", line 244, in _process_predictions
    pred_strs = [proc(s, **kwargs) for s in pred_strs]
                 ^^^^^^^^^^^^^^^^^
  File "C:\Project\opencompass-plus-main\opencompass\datasets\gsm8k.py", line 46, in gsm8k_postprocess
    text = text.split('Question:')[0]
           ^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'split'
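Other information
Because the error says that text is NoneType, I added print(text) before the failing line, and the value is indeed None (the "text None" line in the log above). This only happens in the Qwen3-max evaluation task; with other models such as qwen2.5 and qwen-max, text is never None.
I would also like to ask: judging from the log messages, the evaluation seems to run twice, and this happens for every model. Is that normal? The only difference is that with Qwen3-max the first run finishes without an error and the second run fails, while with the other models both runs finish without errors.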
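As a possible local workaround for the None prediction described above, a postprocessor can be wrapped so that a missing prediction is replaced by an empty string before `split` is called. This is only a minimal sketch under stated assumptions: `none_safe` and `toy_postprocess` are hypothetical names that are not part of OpenCompass, and `toy_postprocess` only imitates the single line shown in the traceback, not the real `gsm8k_postprocess`. It does not explain why the API returned no text for qwen3-max.

```python
from typing import Callable, Optional


def none_safe(postprocess: Callable[..., str],
              default: str = '') -> Callable[..., str]:
    """Wrap a text postprocessor so that a None prediction becomes `default`.

    Hypothetical helper for illustration; not an OpenCompass API.
    """

    def wrapper(text: Optional[str], **kwargs) -> str:
        if text is None:
            # Assumed policy: treat a missing model response as an empty answer,
            # which will simply be scored as wrong instead of crashing the task.
            text = default
        return postprocess(text, **kwargs)

    return wrapper


def toy_postprocess(text: str) -> str:
    # Stand-in that mirrors only the failing line from the traceback.
    return text.split('Question:')[0].strip()


safe = none_safe(toy_postprocess)

print(repr(safe(None)))                 # no AttributeError any more
print(repr(safe('42 Question: next')))  # normal predictions are unchanged
```

With this guard the evaluation would count the empty prediction as incorrect rather than abort, so it hides the symptom only; the None value itself still needs to be traced back to the inference step.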