Hi.
I have a problem with GPT-OSS 20B, and we need help from the OpenVINO GenAI team.
openarc add --model-name gpt-oss-20b-int4-ov --model-path T:\models\desmondsow\gpt-oss-20b-int4-ov --engine ovgenai --model-type llm --device GPU.0
✅ Dolphin-X1-8B-int4_asym-awq-ov is working
❌ GPT-OSS 20B not working
❌ Qwen3-14B-int4-ov not working
I wrote about Qwen3 14B on Discord: it does not work on Windows, but it works on Linux.
https://discord.com/channels/1341627368581628004/1361023199109845282/1458763931588886540
OS: Windows 11
Driver: 8331
Intel Arc A770 (16gb)
Models:
- https://huggingface.co/desmondsow/gpt-oss-20b-int4-ov
- https://huggingface.co/OpenVINO/Qwen3-14B-int4-ov/
(bench Dolphin-X1-8B-int4_asym-awq-ov)
I'm using the latest dev build because gpt-oss is not supported on 2025.3.
openvino 2026.0.0.dev20260109
openvino-genai 2026.0.0.0.dev20260109
openvino-tokenizers 2026.0.0.0.dev20260109
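For anyone comparing environments, the three version strings above can be collected in one step. This is a minimal sketch using only the standard library; `report_versions` is a hypothetical helper name and is not part of OpenVINO or OpenArc.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages whose versions matter for this report.
PACKAGES = ("openvino", "openvino-genai", "openvino-tokenizers")


def report_versions(packages=PACKAGES):
    """Return {package name: installed version, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Package is absent from the active environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in report_versions().items():
        print(f"{name} {ver or 'not installed'}")
```

Run it inside the same virtual environment that starts the OpenArc server, so the reported versions match what the server actually loads.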
ERROR
[GPU] clFinish, error code: -5 CL_OUT_OF_RESOURCES
(openarc) C:\llm\openarc\201>openarc serve start --host 127.0.0.1
Configuration saved to: C:\llm\openarc\201\openarc_config.json
Starting OpenArc server on 127.0.0.1:8000
2026-01-12 23:14:15,372 - INFO - Launching 127.0.0.1:8000
2026-01-12 23:14:15,373 - INFO - --------------------------------
2026-01-12 23:14:15,373 - INFO - OpenArc endpoints:
2026-01-12 23:14:15,373 - INFO - - POST /openarc/load Load a model
2026-01-12 23:14:15,373 - INFO - - POST /openarc/unload Unload a model
2026-01-12 23:14:15,373 - INFO - - GET /openarc/status Get model status
2026-01-12 23:14:15,374 - INFO - --------------------------------
2026-01-12 23:14:15,374 - INFO - OpenAI compatible endpoints:
2026-01-12 23:14:15,374 - INFO - - GET /v1/models
2026-01-12 23:14:15,374 - INFO - - POST /v1/chat/completions
2026-01-12 23:14:15,374 - INFO - - POST /v1/audio/transcriptions: Whisper only
2026-01-12 23:14:15,374 - INFO - - POST /v1/audio/speech: Kokoro only
2026-01-12 23:14:15,375 - INFO - - POST /v1/embeddings
2026-01-12 23:14:15,375 - INFO - - POST /v1/rerank
C:\llm\openarc\201\.venv\Lib\site-packages\torch\onnx\_internal\registration.py:162: OnnxExporterWarning: Symbolic function 'aten::scaled_dot_product_attention' already registered for opset 14. Replacing the existing function with new function. This is unexpected. Please report it on https://github.com/pytorch/pytorch/issues.
warnings.warn(
2026-01-12 23:14:26,280 - INFO - Started server process [10640]
2026-01-12 23:14:26,280 - INFO - Waiting for application startup.
2026-01-12 23:14:26,282 - INFO - Application startup complete.
2026-01-12 23:14:26,283 - INFO - Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
2026-01-12 23:14:27,636 - INFO - Request received: POST /openarc/unload from 127.0.0.1
2026-01-12 23:14:27,639 - INFO - Request completed: POST /openarc/unload status=500 duration=0.004s
2026-01-12 23:14:38,689 - INFO - Request received: POST /openarc/load from 127.0.0.1
2026-01-12 23:14:38,699 - INFO - gpt-oss-20b-int4-ov loading...
2026-01-12 23:14:38,699 - INFO - ModelType.LLM on GPU.0 with {}
2026-01-12 23:14:58,995 - INFO - gpt-oss-20b-int4-ov loaded successfully
2026-01-12 23:14:58,997 - INFO - [gpt-oss-20b-int4-ov LLM Worker] Started, waiting for packets...
2026-01-12 23:14:58,998 - INFO - Request completed: POST /openarc/load status=200 duration=20.310s
2026-01-12 23:15:26,869 - INFO - Request received: GET /v1/models from 127.0.0.1
2026-01-12 23:15:26,871 - INFO - Request completed: GET /v1/models status=200 duration=0.002s
2026-01-12 23:15:33,396 - INFO - Request received: POST /openarc/bench from 127.0.0.1
2026-01-12 23:15:36,104 - ERROR - LLM inference failed!
Traceback (most recent call last):
File "C:\llm\openarc\201\src\server\worker_registry.py", line 87, in infer_llm
async for item in llm_instance.generate_type(packet.gen_config):
File "C:\llm\openarc\201\src\engine\ov_genai\llm.py", line 100, in generate_text
result = await asyncio.to_thread(self.model.generate, prompt_token_ids, generation_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\uuk\AppData\Local\Programs\Python\Python311\Lib\asyncio\threads.py", line 25, in to_thread
return await loop.run_in_executor(None, func_call)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\uuk\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Exception from src\inference\src\cpp\infer_request.cpp:224:
Exception from src\plugins\intel_gpu\src\runtime\ocl\ocl_stream.cpp:376:
[GPU] clFinish, error code: -5 CL_OUT_OF_RESOURCES
2026-01-12 23:15:36,111 - ERROR - [gpt-oss-20b-int4-ov LLM Worker] Inference failed, triggering model unload...
Unhandled exception caught in c10/util/AbortHandler.h
00007FF921F3369400007FF921F2ACE0 torch_python.dll!torch::autograd::THPCppFunction_requires_grad [<unknown file> @ <unknown line number>]
00007FF98D7BEE1200007FF98D7BEDF0 ucrtbase.dll!terminate [<unknown file> @ <unknown line number>]
00007FF96A571AB100007FF96A571150 VCRUNTIME140_1.dll!_NLG_Return2 [<unknown file> @ <unknown line number>]
00007FF96A57232F00007FF96A571150 VCRUNTIME140_1.dll!_NLG_Return2 [<unknown file> @ <unknown line number>]
00007FF96A57238900007FF96A571150 VCRUNTIME140_1.dll!_NLG_Return2 [<unknown file> @ <unknown line number>]
00007FF96A57418900007FF96A5740E0 VCRUNTIME140_1.dll!_CxxFrameHandler4 [<unknown file> @ <unknown line number>]
00007FF8F748B69000007FF8F62510A0 openvino_intel_gpu_plugin.dll!create_plugin_engine [<unknown file> @ <unknown line number>]
00007FF98FF9547F00007FF98FF95350 ntdll.dll!_chkstk [<unknown file> @ <unknown line number>]
00007FF98FF0E88600007FF98FF0DDF0 ntdll.dll!RtlFindCharInUnicodeString [<unknown file> @ <unknown line number>]
00007FF98FF4495500007FF98FF447C0 ntdll.dll!RtlRaiseException [<unknown file> @ <unknown line number>]
00007FF98D24016C00007FF98D240100 KERNELBASE.dll!RaiseException [<unknown file> @ <unknown line number>]
00007FF9893E6BA700007FF9893E6B10 VCRUNTIME140.dll!CxxThrowException [<unknown file> @ <unknown line number>]
00007FF922F5092700007FF922F508E0 openvino.dll!ov::Exception::create [<unknown file> @ <unknown line number>]
00007FF8F74CBD0B00007FF8F62510A0 openvino_intel_gpu_plugin.dll!create_plugin_engine [<unknown file> @ <unknown line number>]
00007FF96A571080 <unknown symbol address> VCRUNTIME140_1.dll!<unknown symbol> [<unknown file> @ <unknown line number>]
00007FF96A57271500007FF96A571150 VCRUNTIME140_1.dll!_NLG_Return2 [<unknown file> @ <unknown line number>]
00007FF98FF94CD600007FF98FF94830 ntdll.dll!RtlCaptureContext2 [<unknown file> @ <unknown line number>]
00007FF8F69D131000007FF8F62510A0 openvino_intel_gpu_plugin.dll!create_plugin_engine [<unknown file> @ <unknown line number>]
00007FF8F647276100007FF8F62510A0 openvino_intel_gpu_plugin.dll!create_plugin_engine [<unknown file> @ <unknown line number>]
00007FF8F626F86900007FF8F62510A0 openvino_intel_gpu_plugin.dll!create_plugin_engine [<unknown file> @ <unknown line number>]
00007FF8F6240599 <unknown symbol address> openvino_intel_gpu_plugin.dll!<unknown symbol> [<unknown file> @ <unknown line number>]
00007FF8F6241F7E <unknown symbol address> openvino_intel_gpu_plugin.dll!<unknown symbol> [<unknown file> @ <unknown line number>]
00007FF8F625EA0300007FF8F62510A0 openvino_intel_gpu_plugin.dll!create_plugin_engine [<unknown file> @ <unknown line number>]
00007FF8F625EF1400007FF8F62510A0 openvino_intel_gpu_plugin.dll!create_plugin_engine [<unknown file> @ <unknown line number>]
00007FF9232E934800007FF9232E9190 openvino.dll!ov::ISyncInferRequest::~ISyncInferRequest [<unknown file> @ <unknown line number>]
00007FF8F638586400007FF8F62510A0 openvino_intel_gpu_plugin.dll!create_plugin_engine [<unknown file> @ <unknown line number>]
00007FF92331C91E00007FF92331C7E0 openvino.dll!ov::IAsyncInferRequest::~IAsyncInferRequest [<unknown file> @ <unknown line number>]
00007FF8F625173400007FF8F62510A0 openvino_intel_gpu_plugin.dll!create_plugin_engine [<unknown file> @ <unknown line number>]
00007FF9232929A400007FF923292960 openvino.dll!ov::VariableState::~VariableState [<unknown file> @ <unknown line number>]
00007FF9202FB1AF00007FF9202E2720 openvino_genai.dll!ov::genai::StreamerBase::write [<unknown file> @ <unknown line number>]
00007FF9202D4E5400007FF9202D4870 openvino_genai.dll!ov::genai::ReasoningParser::`default constructor closure' [<unknown file> @ <unknown line number>]
00007FF9203E0E9A00007FF9203E07D0 openvino_genai.dll!ov::genai::LLMPipeline::LLMPipeline [<unknown file> @ <unknown line number>]
00007FF9203E171B00007FF9203E0F00 openvino_genai.dll!ov::genai::LLMPipeline::~LLMPipeline [<unknown file> @ <unknown line number>]
00007FF9203E0F2500007FF9203E0F00 openvino_genai.dll!ov::genai::LLMPipeline::~LLMPipeline [<unknown file> @ <unknown line number>]
00007FF9269906A8 <unknown symbol address> py_openvino_genai.cp311-win_amd64.pyd!<unknown symbol> [<unknown file> @ <unknown line number>]
00007FF926990636 <unknown symbol address> py_openvino_genai.cp311-win_amd64.pyd!<unknown symbol> [<unknown file> @ <unknown line number>]
00007FF926AC354C <unknown symbol address> _pyopenvino.cp311-win_amd64.pyd!<unknown symbol> [<unknown file> @ <unknown line number>]
00007FF926AD328B00007FF926AD2A10 _pyopenvino.cp311-win_amd64.pyd!PyInit__pyopenvino [<unknown file> @ <unknown line number>]
00007FF92BEFFDC000007FF92BEFFBA0 python311.dll!PyObject_GenericSetAttrWithDict [<unknown file> @ <unknown line number>]
00007FF92BE98D2C00007FF92BE98CA0 python311.dll!PyObject_SetAttr [<unknown file> @ <unknown line number>]
00007FF92BEC19F000007FF92BEBCC20 python311.dll!PyEval_EvalFrameDefault [<unknown file> @ <unknown line number>]
00007FF92BEF104E00007FF92BEF0D9C python311.dll!Py_BuildValue_SizeT [<unknown file> @ <unknown line number>]
00007FF92BFCF87500007FF92BFCF5F8 python311.dll!PyImport_GetMagicNumber [<unknown file> @ <unknown line number>]
00007FF92BEC097700007FF92BEBCC20 python311.dll!PyEval_EvalFrameDefault [<unknown file> @ <unknown line number>]
00007FF92BEF104E00007FF92BEF0D9C python311.dll!Py_BuildValue_SizeT [<unknown file> @ <unknown line number>]
00007FF92BFCF87500007FF92BFCF5F8 python311.dll!PyImport_GetMagicNumber [<unknown file> @ <unknown line number>]
00007FF96DF958BF00007FF96DF91000 _asyncio.pyd!PyInit__asyncio [<unknown file> @ <unknown line number>]
00007FF96DF9573300007FF96DF91000 _asyncio.pyd!PyInit__asyncio [<unknown file> @ <unknown line number>]
00007FF92BEB83DE00007FF92BEB8060 python311.dll!PyObject_MakeTpCall [<unknown file> @ <unknown line number>]
00007FF92C0BBC2400007FF92C0BBBA4 python311.dll!PyContext_NewHamtForTests [<unknown file> @ <unknown line number>]
00007FF92C0BBED500007FF92C0BBBA4 python311.dll!PyContext_NewHamtForTests [<unknown file> @ <unknown line number>]
00007FF92BEFD2EC00007FF92BEFD208 python311.dll!PyArg_UnpackTuple [<unknown file> @ <unknown line number>]
00007FF92BF6507300007FF92BF65018 python311.dll!PyObject_Call [<unknown file> @ <unknown line number>]
00007FF92BF64D4000007FF92BF648EC python311.dll!PyObject_CallObject [<unknown file> @ <unknown line number>]
00007FF92BEC1F7F00007FF92BEBCC20 python311.dll!PyEval_EvalFrameDefault [<unknown file> @ <unknown line number>]
00007FF92BEBAE6400007FF92BEBACC0 python311.dll!PyFunction_Vectorcall [<unknown file> @ <unknown line number>]
00007FF92BF64C6A00007FF92BF648EC python311.dll!PyObject_CallObject [<unknown file> @ <unknown line number>]
00007FF92BEC1F7F00007FF92BEBCC20 python311.dll!PyEval_EvalFrameDefault [<unknown file> @ <unknown line number>]
00007FF92BEBAE6400007FF92BEBACC0 python311.dll!PyFunction_Vectorcall [<unknown file> @ <unknown line number>]
00007FF92BF64C6A00007FF92BF648EC python311.dll!PyObject_CallObject [<unknown file> @ <unknown line number>]
00007FF92BEC1F7F00007FF92BEBCC20 python311.dll!PyEval_EvalFrameDefault [<unknown file> @ <unknown line number>]
00007FF92BEBAE6400007FF92BEBACC0 python311.dll!PyFunction_Vectorcall [<unknown file> @ <unknown line number>]
00007FF92BF05C2500007FF92BF03DC4 python311.dll!PyIter_Send [<unknown file> @ <unknown line number>]
00007FF92BF64C6A00007FF92BF648EC python311.dll!PyObject_CallObject [<unknown file> @ <unknown line number>]
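The RuntimeError message in the traceback above carries the failing OpenCL call, the numeric status code, and its symbolic name. A minimal sketch for extracting them when filtering long logs follows; `parse_cl_error` is a hypothetical helper, and the regular expression assumes the exact `[GPU] <call>, error code: <n> <NAME>` layout seen in this log, which may differ in other OpenVINO builds.

```python
import re

# Matches lines such as "[GPU] clFinish, error code: -5 CL_OUT_OF_RESOURCES".
# The layout is taken from the log in this report and is an assumption elsewhere.
CL_ERROR_RE = re.compile(
    r"\[GPU\]\s+(?P<call>\w+), error code: (?P<code>-?\d+)\s+(?P<name>CL_\w+)"
)


def parse_cl_error(line):
    """Return (call, code, name) for a GPU plugin error line, or None."""
    match = CL_ERROR_RE.search(line)
    if match is None:
        return None
    return match["call"], int(match["code"]), match["name"]


if __name__ == "__main__":
    sample = "[GPU] clFinish, error code: -5 CL_OUT_OF_RESOURCES"
    print(parse_cl_error(sample))  # ('clFinish', -5, 'CL_OUT_OF_RESOURCES')
```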
uv pip list
(openarc) C:\llm\openarc\201>uv pip list
Package Version Editable project location
-------------------------- ---------------------- -------------------------
about-time 4.2.1
addict 2.4.0
aiohappyeyeballs 2.6.1
aiohttp 3.12.14
aiosignal 1.4.0
alive-progress 3.2.0
annotated-types 0.7.0
anyio 4.9.0
asttokens 3.0.0
attrs 25.3.0
audioread 3.0.1
autograd 1.8.0
babel 2.17.0
blis 1.3.0
brotli 1.1.0
catalogue 2.0.10
certifi 2025.7.14
cffi 2.0.0
charset-normalizer 3.4.2
click 8.2.1
cloudpathlib 0.22.0
cma 4.2.0
colorama 0.4.6
comm 0.2.3
confection 0.1.5
contourpy 1.3.2
cryptography 46.0.3
csvw 3.6.0
curated-tokenizers 0.0.9
curated-transformers 0.1.1
cycler 0.12.1
cymem 2.0.11
datasets 4.0.0
ddgs 9.6.1
debugpy 1.8.17
decorator 5.2.1
deprecated 1.2.18
dill 0.3.8
distro 1.9.0
dlinfo 2.0.0
docopt 0.6.2
espeakng-loader 0.2.4
executing 2.2.1
fastapi 0.116.1
filelock 3.18.0
fonttools 4.58.5
frozenlist 1.7.0
fsspec 2025.3.0
grapheme 0.6.0
griffe 1.14.0
h11 0.16.0
h2 4.3.0
hpack 4.1.0
httpcore 1.0.9
httpx 0.28.1
httpx-sse 0.4.3
huggingface-hub 0.33.4
hyperframe 6.1.0
idna 3.10
iniconfig 2.3.0
inquirerpy 0.3.4
ipykernel 7.0.1
ipython 9.6.0
ipython-pygments-lexers 1.1.1
ipywidgets 8.1.7
isodate 0.7.2
jedi 0.19.2
jinja2 3.1.6
jiter 0.11.0
joblib 1.5.1
jsonschema 4.24.0
jsonschema-specifications 2025.4.1
jupyter-client 8.6.3
jupyter-core 5.9.1
jupyterlab-widgets 3.0.15
kiwisolver 1.4.8
kokoro 0.9.4
langcodes 3.5.0
language-data 1.3.0
language-tags 1.2.0
lazy-loader 0.4
librosa 0.11.0
llvmlite 0.45.0
loguru 0.7.3
lxml 6.0.2
marisa-trie 1.3.1
markdown-it-py 3.0.0
markupsafe 3.0.2
matplotlib 3.10.3
matplotlib-inline 0.1.7
mcp 1.20.0
mdurl 0.1.2
misaki 0.9.4
mpmath 1.3.0
msgpack 1.1.1
multidict 6.6.3
multiprocess 0.70.16
murmurhash 1.0.13
natsort 8.4.0
nest-asyncio 1.6.0
networkx 3.4.2
ninja 1.11.1.4
nncf 2.17.0
num2words 0.5.14
numba 0.62.0
numpy 2.2.6
onnx 1.18.0
openai 2.2.0
openai-agents 0.4.2
openarc 2.0 C:\llm\openarc\201
openvino 2026.0.0.dev20260109
openvino-genai 2026.0.0.0.dev20260109
openvino-telemetry 2025.2.0
openvino-tokenizers 2026.0.0.0.dev20260109
optimum 1.27.0
optimum-intel 1.25.2
packaging 25.0
pandas 2.2.3
parso 0.8.5
pfzy 0.3.4
phonemizer-fork 3.3.2
pillow 11.3.0
pip 25.2
platformdirs 4.4.0
pluggy 1.6.0
pooch 1.8.2
preshed 3.0.10
primp 0.15.0
prompt-toolkit 3.0.52
propcache 0.3.2
protobuf 6.31.1
psutil 7.0.0
pure-eval 0.2.3
pyarrow 20.0.0
pycparser 2.23
pydantic 2.11.7
pydantic-core 2.33.2
pydantic-settings 2.11.0
pydot 3.0.4
pygments 2.19.2
pyjwt 2.10.1
pymoo 0.6.1.5
pynput 1.8.1
pyparsing 3.2.3
pytest 8.4.2
python-dateutil 2.9.0.post0
python-dotenv 1.2.1
python-multipart 0.0.20
pytz 2025.2
pywin32 311
pyyaml 6.0.2
pyzmq 27.1.0
rdflib 7.2.1
referencing 0.36.2
regex 2024.11.6
requests 2.32.4
rfc3986 1.5.0
rich 14.0.0
rich-click 1.8.9
rpds-py 0.26.0
safetensors 0.5.3
scikit-learn 1.7.0
scipy 1.16.0
segments 2.3.0
setuptools 80.9.0
shellingham 1.5.4
six 1.17.0
smart-open 7.3.1
smolagents 1.22.0
sniffio 1.3.1
socksio 1.0.0
sounddevice 0.5.2
soundfile 0.13.1
soxr 1.0.0
spacy 3.8.7
spacy-curated-transformers 0.3.1
spacy-legacy 3.0.12
spacy-loggers 1.0.5
srsly 2.5.1
sse-starlette 3.0.3
stack-data 0.6.3
starlette 0.47.1
sympy 1.14.0
tabulate 0.9.0
termcolor 3.1.0
thinc 8.3.6
threadpoolctl 3.6.0
tokenizers 0.21.2
torch 2.8.0+cpu
torchvision 0.23.0+cpu
tornado 6.5.2
tqdm 4.67.1
traitlets 5.14.3
transformers 4.52.4
typer 0.19.2
types-requests 2.32.4.20250913
typing-extensions 4.14.1
typing-inspection 0.4.1
tzdata 2025.2
uritemplate 4.2.0
urllib3 2.5.0
uvicorn 0.35.0
wasabi 1.1.3
wcwidth 0.2.14
weasel 0.4.1
widgetsnbextension 4.0.14
win32-setctime 1.2.0
wrapt 1.17.2
xxhash 3.5.0
yarl 1.20.1