
Ministral 3B GGUF fails to load #145

@Zethalion

Description


MistralAI released their new models and I'm just testing the waters. Basically, I don't understand why the model's architecture ends up unidentified. The message I receive is the following:

```
C:\Apps\KoboldCCP>koboldcpp_rocm.exe
PyInstaller\loader\pyimod02_importers.py:384: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.


Welcome to KoboldCpp - Version 1.98.1.yr0-ROCm
For command line arguments, please refer to --help


Unable to detect VRAM, please set layers manually.
Auto Selected Vulkan Backend (flag=-1)

Loading Chat Completions Adapter: C:\Users\asees\AppData\Local\Temp_MEI253162\kcpp_adapters\AutoGuess.json
Chat Completions Adapter Loaded
Auto Recommended GPU Layers: 29
System: Windows 10.0.26200 AMD64 AMD64 Family 25 Model 33 Stepping 0, AuthenticAMD
Detected Available GPU Memory: 12016 MB
Detected Available RAM: 40757 MB
Initializing dynamic library: koboldcpp_vulkan.dll

Namespace(model=[], model_param='C:/Apps/KoboldCCP/Ministral-3-3B-Reasoning-2512-BF16.gguf', port=5001, port_param=5001, host='', launch=True, config=None, threads=11, usecuda=None, usevulkan=[0], useclblast=None, usecpu=False, contextsize=8192, gpulayers=29, tensor_split=None, checkforupdates=False, version=False, analyze='', maingpu=-1, blasbatchsize=512, blasthreads=11, lora=None, loramult=1.0, noshift=False, nofastforward=False, useswa=False, ropeconfig=[0.0, 10000.0], overridenativecontext=0, usemmap=False, usemlock=False, noavx2=False, failsafe=False, debugmode=0, onready='', benchmark=None, prompt='', cli=False, promptlimit=100, multiuser=1, multiplayer=False, websearch=False, remotetunnel=False, highpriority=False, foreground=False, preloadstory=None, savedatafile=None, quiet=False, ssl=None, nocertify=False, mmproj=None, mmprojcpu=False, visionmaxres=1024, draftmodel=None, draftamount=8, draftgpulayers=999, draftgpusplit=None, password=None, ignoremissing=False, chatcompletionsadapter='AutoGuess', flashattention=False, quantkv=0, forceversion=0, smartcontext=False, unpack='', exportconfig='', exporttemplate='', nomodel=False, moeexperts=-1, moecpu=0, defaultgenamt=640, nobostoken=False, enableguidance=False, maxrequestsize=32, overridekv=None, overridetensors=None, showgui=False, skiplauncher=False, singleinstance=False, hordemodelname='', hordeworkername='', hordekey='', hordemaxctx=0, hordegenlen=0, sdmodel='', sdthreads=11, sdclamped=0, sdclampedsoft=0, sdt5xxl='', sdclipl='', sdclipg='', sdphotomaker='', sdflashattention=False, sdconvdirect='off', sdvae='', sdvaeauto=False, sdquant=0, sdlora='', sdloramult=1.0, sdtiledvae=768, whispermodel='', ttsmodel='', ttswavtokenizer='', ttsgpu=False, ttsmaxlen=4096, ttsthreads=0, embeddingsmodel='', embeddingsmaxctx=0, embeddingsgpu=False, admin=False, adminpassword='', admindir='', hordeconfig=None, sdconfig=None, noblas=False, nommap=False, sdnotile=False)

Loading Text Model: C:\Apps\KoboldCCP\Ministral-3-3B-Reasoning-2512-BF16.gguf

The reported GGUF Arch is: mistral3
Arch Category: 0


Identified as GGUF model.
Attempting to Load...

Using automatic RoPE scaling for GGUF. If the model has custom RoPE settings, they'll be used directly instead!
System Info: AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | AMX_INT8 = 0 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | RISCV_VECT = 0 | WASM_SIMD = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 |
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 6700 XT (AMD proprietary driver) | uma: 0 | fp16: 1 | bf16: 0 | warp size: 32 | shared memory: 32768 | int dot: 1 | matrix cores: none
llama_model_load_from_file_impl: using device Vulkan0 (AMD Radeon RX 6700 XT) - 12016 MiB free
llama_model_loader: loaded meta data with 53 key-value pairs and 236 tensors from C:\Apps\KoboldCCP\Ministral-3-3B-Reasoning-2512-BF16.gguf (version GGUF V3 (latest))
print_info: file format = GGUF V3 (latest)
print_info: file size = 6.39 GiB (16.00 BPW)
llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'mistral3'
llama_model_load_from_file_impl: failed to load model
Traceback (most recent call last):
File "koboldcpp.py", line 8135, in
File "koboldcpp.py", line 7141, in main
File "koboldcpp.py", line 7591, in kcpp_main_process
File "koboldcpp.py", line 1445, in load_model
OSError: exception: access violation reading 0x0000000000000004
[PYI-33064:ERROR] Failed to execute script 'koboldcpp' due to unhandled exception
```

What went wrong, and how do I fix it?
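
In case it helps, here is a minimal sketch I used to double-check the architecture string stored in the file's metadata. It parses only the GGUF header and key-value pairs, assuming the usual little-endian GGUF v2/v3 layout (the file path is passed as a command-line argument):

```python
# Peek at GGUF metadata and print general.architecture.
# Minimal sketch: assumes a little-endian GGUF v2/v3 file.
import struct, sys

def read_str(f):
    (n,) = struct.unpack("<Q", f.read(8))          # uint64 length, then UTF-8 bytes
    return f.read(n).decode("utf-8", errors="replace")

def read_value(f, vtype):
    scalar = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
              6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}
    if vtype in scalar:                            # fixed-size scalar types
        fmt = scalar[vtype]
        return struct.unpack(fmt, f.read(struct.calcsize(fmt)))[0]
    if vtype == 8:                                 # string
        return read_str(f)
    if vtype == 9:                                 # array: element type, count, elements
        (etype,) = struct.unpack("<I", f.read(4))
        (count,) = struct.unpack("<Q", f.read(8))
        return [read_value(f, etype) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {vtype}")

with open(sys.argv[1], "rb") as f:
    assert f.read(4) == b"GGUF", "not a GGUF file"
    version, tensor_count, kv_count = struct.unpack("<IQQ", f.read(20))
    print(f"GGUF v{version}, {tensor_count} tensors, {kv_count} metadata keys")
    for _ in range(kv_count):
        key = read_str(f)
        (vtype,) = struct.unpack("<I", f.read(4))
        value = read_value(f, vtype)
        if key == "general.architecture":
            print("architecture:", value)
            break
```

Run against Ministral-3-3B-Reasoning-2512-BF16.gguf it prints `architecture: mistral3`, which matches the arch reported in the log above right before the "unknown model architecture" error.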
