
Failed to load model nllb-200-3.3B #5188

Open
@PrzemekSkw

Description


Hello,
I installed LocalAI via Docker on unRAID v7.0.1.

LocalAI version:
v2.27.0 (6d7ac09e96fc85fb45b4ce098b658b5040ed201b)

Environment, CPU architecture, OS, and Version:
Linux unRAID 6.6.78-Unraid #2 SMP PREEMPT_DYNAMIC Thu Feb 20 13:33:15 PST 2025 x86_64 12th Gen Intel(R) Core(TM) i5-12600K GenuineIntel GNU/Linux, NVIDIA RTX3060, intel-core i5 12600k, 64GB RAM

Describe the bug

When I try it out:

curl --max-time 60 -X POST http://localhost:8180/v1/completions -H "Content-Type: application/json" -d '{"model": "nllb-200-3.3B", "prompt": "Translate to Polish: Hello, how are you?", "max_tokens": 50}'
curl: (28) Operation timed out after 60001 milliseconds with 0 bytes received
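
Note that the 60-second client timeout expires before the server ever answers: the log below shows the POST to /v1/completions coming back with HTTP 500 only after 1m34s, so curl error 28 hides the real error. Re-issuing the request with a longer timeout surfaces it; a quick sketch using Python's requests library (my illustration — the library choice and the 300 s value are assumptions, the endpoint and payload are from this report):

import requests

resp = requests.post(
    "http://localhost:8180/v1/completions",
    json={"model": "nllb-200-3.3B",
          "prompt": "Translate to Polish: Hello, how are you?",
          "max_tokens": 50},
    timeout=300,  # outlast the ~1m34s model-load attempt seen in the logs
)
print(resp.status_code, resp.text)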

To Reproduce
Add the environment variable REBUILD=true to the container, then send the request above.

Expected behavior
The text is translated from English to Polish.

Logs

10:06AM INF Success ip=192.168.0.215 latency="13.083µs" method=GET status=200 url=/static/assets/fontawesome/webfonts/fa-solid-900.woff2
10:06AM INF Success ip=192.168.0.215 latency="8.08µs" method=GET status=200 url=/static/assets/fontawesome/webfonts/fa-brands-400.woff2
10:06AM INF Success ip=127.0.0.1 latency="23.122µs" method=GET status=200 url=/readyz
10:07AM INF Success ip=127.0.0.1 latency="27.854µs" method=GET status=200 url=/readyz
10:08AM INF Success ip=127.0.0.1 latency="40.644µs" method=GET status=200 url=/readyz
10:09AM INF Success ip=127.0.0.1 latency="91.746µs" method=GET status=200 url=/readyz
10:10AM INF Success ip=127.0.0.1 latency="29.971µs" method=GET status=200 url=/readyz
10:11AM INF Success ip=127.0.0.1 latency="12.196µs" method=GET status=200 url=/readyz
10:12AM INF Success ip=127.0.0.1 latency="30.91µs" method=GET status=200 url=/readyz
10:13AM INF Success ip=127.0.0.1 latency="32.756µs" method=GET status=200 url=/readyz
10:14AM INF Success ip=127.0.0.1 latency="30.82µs" method=GET status=200 url=/readyz
10:15AM INF Success ip=127.0.0.1 latency="33.233µs" method=GET status=200 url=/readyz
10:16AM INF Success ip=127.0.0.1 latency="43.469µs" method=GET status=200 url=/readyz
10:17AM INF Success ip=127.0.0.1 latency="32.85µs" method=GET status=200 url=/readyz
10:18AM INF Success ip=127.0.0.1 latency="32.05µs" method=GET status=200 url=/readyz
10:19AM INF Success ip=127.0.0.1 latency="29.845µs" method=GET status=200 url=/readyz
10:19AM INF env file found, loading environment variables from file envFile=.env
10:19AM DBG Setting logging to debug
10:19AM INF Starting LocalAI using 8 threads, with models path: /build/models
10:19AM INF LocalAI version: v2.27.0 (6d7ac09e96fc85fb45b4ce098b658b5040ed201b)
10:19AM DBG CPU capabilities: [3dnowprefetch abm acpi adx aes aperfmperf apic arat arch_capabilities arch_lbr arch_perfmon art avx avx2 avx_vnni bmi1 bmi2 bts clflush clflushopt clwb cmov constant_tsc cpuid cpuid_fault cx16 cx8 de ds_cpl dtes64 dtherm dts epb ept ept_ad erms est f16c flexpriority flush_l1d fma fpu fsgsbase fsrm fxsr gfni hfi ht hwp hwp_act_window hwp_epp hwp_notify hwp_pkg_req ibpb ibrs ibrs_enhanced ibt ida intel_pt invpcid lahf_lm lm mca mce md_clear mmx monitor movbe movdir64b movdiri msr mtrr nonstop_tsc nopl nx ospke pae pat pbe pclmulqdq pconfig pdcm pdpe1gb pebs pge pku pln pni popcnt pse pse36 pts rdpid rdrand rdseed rdtscp rep_good sdbg sep serialize sha_ni smap smep smx split_lock_detect ss ssbd sse sse2 sse4_1 sse4_2 ssse3 stibp syscall tm tm2 tme tpr_shadow tsc tsc_adjust tsc_deadline_timer tsc_known_freq umip vaes vme vmx vnmi vpclmulqdq vpid waitpkg x2apic xgetbv1 xsave xsavec xsaveopt xsaves xtopology xtpr]
10:19AM DBG GPU count: 2
10:19AM DBG GPU: card #0 @0000:00:02.0 -> driver: 'i915' class: 'Display controller' vendor: 'Intel Corporation' product: 'AlderLake-S GT1'
10:19AM DBG GPU: card #1 @0000:01:00.0 -> driver: 'nvidia' class: 'Display controller' vendor: 'NVIDIA Corporation' product: 'GA106 [GeForce RTX 3060 Lite Hash Rate]'
10:19AM INF Preloading models from /build/models
10:19AM DBG Model: nllb-200-3.3B (config: {PredictionOptions:{BasicModelRequest:{Model:facebook/nllb-200-3.3B} Language: Translate:false N:0 TopP:0xc000c71ff0 TopK:0xc000c71ff8 Temperature:0xc000cce000 Maxtokens:0xc000cce030 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 RepeatLastN:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc000cce028 TypicalP:0xc000cce020 Seed:0xc000cce048 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:nllb-200-3.3B F16:0xc000c71fd8 Threads:0xc000c71fe0 Debug:0xc000cce040 Roles:map[] Embeddings:0xc000cce041 Backend:transformers TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions: UseTokenizerTemplate:false JoinChatMessagesByCharacter:<nil> Multimodal: JinjaTemplate:false ReplyPrefix:} KnownUsecaseStrings:[FLAG_ANY] KnownUsecases:<nil> PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:false GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:false NoMixedFreeString:false NoGrammar:false Prefix: ExpectStringsAfterJSON:false PropOrder: SchemaType: GrammarTriggers:[]} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[] ArgumentRegex:[] ArgumentRegexKey: ArgumentRegexValue: ReplaceFunctionResults:[] ReplaceLLMResult:[] CaptureLLMResult:[] FunctionNameKey: FunctionArgumentsKey:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc000cce018 MirostatTAU:0xc000cce010 Mirostat:0xc000cce008 NGPULayers:0xc000cce038 MMap:0xc000cce040 MMlock:0xc000cce041 LowVRAM:0xc000cce041 Grammar: StopWords:[] Cutstrings:[] ExtractRegex:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc000cce050 NUMA:false LoraAdapter: LoraBase: LoraAdapters:[] LoraScales:[] LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: LoadFormat: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 DisableLogStatus:false DType: LimitMMPerPrompt:{LimitImagePerPrompt:0 LimitVideoPerPrompt:0 LimitAudioPerPrompt:0} MMProj: FlashAttention:false NoKVOffloading:false CacheTypeK: CacheTypeV: RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 CFGScale:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: AudioPath:} CUDA:false DownloadFiles:[] Description: Usage: Options:[]})
10:19AM DBG Extracting backend assets files to /tmp/localai/backend_data
10:19AM DBG processing api keys runtime update
10:19AM DBG processing external_backends.json
10:19AM DBG external backends loaded from external_backends.json
10:19AM INF core/startup process completed!
10:19AM DBG No configuration file found at /tmp/localai/upload/uploadedFiles.json
10:19AM DBG No configuration file found at /tmp/localai/config/assistants.json
10:19AM DBG No configuration file found at /tmp/localai/config/assistantsFile.json
10:19AM INF LocalAI API is listening! Please connect to the endpoint for API documentation. endpoint=http://0.0.0.0:8080
10:20AM WRN SetDefaultModelNameToFirstAvailable used with no matching models installed
10:20AM DBG context local model name not found, setting to default defaultModelName=gpt-4o
10:20AM DBG Parameter Config: &{PredictionOptions:{BasicModelRequest:{Model:facebook/nllb-200-3.3B} Language: Translate:false N:0 TopP:0xc000c71ff0 TopK:0xc000c71ff8 Temperature:0xc000cce000 Maxtokens:0xc0006449f0 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 RepeatLastN:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc000cce028 TypicalP:0xc000cce020 Seed:0xc000cce048 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:nllb-200-3.3B F16:0xc000c71fd8 Threads:0xc000c71fe0 Debug:0xc000644a98 Roles:map[] Embeddings:0xc000cce041 Backend:transformers TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions: UseTokenizerTemplate:false JoinChatMessagesByCharacter:<nil> Multimodal: JinjaTemplate:false ReplyPrefix:} KnownUsecaseStrings:[FLAG_ANY] KnownUsecases:<nil> PromptStrings:[Translate to Polish: Hello, how are you?] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:false GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:false NoMixedFreeString:false NoGrammar:false Prefix: ExpectStringsAfterJSON:false PropOrder: SchemaType: GrammarTriggers:[]} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[] ArgumentRegex:[] ArgumentRegexKey: ArgumentRegexValue: ReplaceFunctionResults:[] ReplaceLLMResult:[] CaptureLLMResult:[] FunctionNameKey: FunctionArgumentsKey:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc000cce018 MirostatTAU:0xc000cce010 Mirostat:0xc000cce008 NGPULayers:0xc000cce038 MMap:0xc000cce040 MMlock:0xc000cce041 LowVRAM:0xc000cce041 Grammar: StopWords:[] Cutstrings:[] ExtractRegex:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc000cce050 NUMA:false LoraAdapter: LoraBase: LoraAdapters:[] LoraScales:[] LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: LoadFormat: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 DisableLogStatus:false DType: LimitMMPerPrompt:{LimitImagePerPrompt:0 LimitVideoPerPrompt:0 LimitAudioPerPrompt:0} MMProj: FlashAttention:false NoKVOffloading:false CacheTypeK: CacheTypeV: RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 CFGScale:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: AudioPath:} CUDA:false DownloadFiles:[] Description: Usage: Options:[]}
10:20AM DBG Template found, input modified to: Translate to Polish: Hello, how are you?
10:20AM INF BackendLoader starting backend=transformers modelID=nllb-200-3.3B o.model=facebook/nllb-200-3.3B
10:20AM DBG Loading model in memory from file: /build/models/facebook/nllb-200-3.3B
10:20AM DBG Loading Model nllb-200-3.3B with gRPC (file: /build/models/facebook/nllb-200-3.3B) (backend: transformers): {backendString:transformers model:facebook/nllb-200-3.3B modelID:nllb-200-3.3B assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc000499b08 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama2:/build/backend/python/exllama2/run.sh faster-whisper:/build/backend/python/faster-whisper/run.sh kokoro:/build/backend/python/kokoro/run.sh rerankers:/build/backend/python/rerankers/run.sh transformers:/build/backend/python/transformers/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
10:20AM DBG Loading external backend: /build/backend/python/transformers/run.sh
10:20AM DBG external backend is file: &{name:run.sh size:253 mode:448 modTime:{wall:0 ext:63879013981 loc:0x592d6da0} sys:{Dev:229 Ino:92785 Nlink:1 Mode:33216 Uid:0 Gid:0 X__pad0:0 Rdev:0 Size:253 Blksize:4096 Blocks:8 Atim:{Sec:1743417181 Nsec:0} Mtim:{Sec:1743417181 Nsec:0} Ctim:{Sec:1744783011 Nsec:918585119} X__unused:[0 0 0]}}
10:20AM DBG Loading GRPC Process: /build/backend/python/transformers/run.sh
10:20AM DBG GRPC Service for nllb-200-3.3B will be running at: '127.0.0.1:36447'
10:20AM DBG GRPC Service state dir: /tmp/go-processmanager3991976067
10:20AM DBG GRPC Service Started
10:20AM DBG Wait for the service to start up
10:20AM DBG Options: ContextSize:1024 Seed:1277843637 NBatch:512 F16Memory:true MMap:true NGPULayers:99999999 Threads:8
10:20AM DBG GRPC(nllb-200-3.3B-127.0.0.1:36447): stdout Initializing libbackend for transformers
10:20AM DBG GRPC(nllb-200-3.3B-127.0.0.1:36447): stdout virtualenv activated
10:20AM DBG GRPC(nllb-200-3.3B-127.0.0.1:36447): stdout activated virtualenv has been ensured
10:20AM DBG GRPC(nllb-200-3.3B-127.0.0.1:36447): stderr /build/backend/python/transformers/venv/lib/python3.10/site-packages/google/protobuf/runtime_version.py:98: UserWarning: Protobuf gencode version 5.29.0 is exactly one major version older than the runtime version 6.30.2 at backend.proto. Please update the gencode to avoid compatibility violations in the next runtime release.
10:20AM DBG GRPC(nllb-200-3.3B-127.0.0.1:36447): stderr   warnings.warn(
10:20AM DBG GRPC(nllb-200-3.3B-127.0.0.1:36447): stderr /build/backend/python/transformers/venv/lib/python3.10/site-packages/transformers/utils/hub.py:105: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
10:20AM DBG GRPC(nllb-200-3.3B-127.0.0.1:36447): stderr   warnings.warn(
10:20AM DBG GRPC(nllb-200-3.3B-127.0.0.1:36447): stderr Server started. Listening on: 127.0.0.1:36447
10:20AM DBG GRPC Service Ready
10:20AM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:0xc00041ce58} sizeCache:0 unknownFields:[] Model:facebook/nllb-200-3.3B ContextSize:1024 Seed:1277843637 NBatch:512 F16Memory:true MLock:false MMap:true VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:99999999 MainGPU: TensorSplit: Threads:8 LibrarySearchPath: RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/build/models/facebook/nllb-200-3.3B Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 ControlNet: Tokenizer: LoraBase: LoraAdapter: LoraScale:0 NoMulMatQ:false DraftModel: AudioPath: Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 LoadFormat: DisableLogStatus:false DType: LimitImagePerPrompt:0 LimitVideoPerPrompt:0 LimitAudioPerPrompt:0 MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 Type: FlashAttention:false NoKVOffload:false ModelPath:/build/models LoraAdapters:[] LoraScales:[] Options:[] CacheTypeKey: CacheTypeValue: GrammarTriggers:[]}
10:20AM DBG GRPC(nllb-200-3.3B-127.0.0.1:36447): stderr Automodel
Fetching 3 files: 100%|██████████| 3/3 [00:00<00:00, 12748.64it/s]
10:20AM INF Success ip=127.0.0.1 latency="45.81µs" method=GET status=200 url=/readyz
10:21AM INF Success ip=127.0.0.1 latency="27.841µs" method=GET status=200 url=/readyz
Loading checkpoint shards: 100%|██████████| 3/3 [01:26<00:00, 28.73s/it]
10:22AM DBG GRPC(nllb-200-3.3B-127.0.0.1:36447): stderr Error: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
10:22AM ERR Server error error="failed to load model with internal loader: could not load model (no success): Unexpected err=NotImplementedError('Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.'), type(err)=<class 'NotImplementedError'>" ip=172.17.0.1 latency=1m34.309480122s method=POST status=500 url=/v1/completions
10:22AM INF Success ip=127.0.0.1 latency="18.603µs" method=GET status=200 url=/readyz
10:23AM INF Success ip=127.0.0.1 latency="21.783µs" method=GET status=200 url=/readyz
10:24AM INF Success ip=127.0.0.1 latency="15.497µs" method=GET status=200 url=/readyz
@@@@@
Skipping rebuild
@@@@@
If you are experiencing issues with the pre-compiled builds, try setting REBUILD=true
If you are still experiencing issues with the build, try setting CMAKE_ARGS and disable the instructions set as needed:
CMAKE_ARGS="-DGGML_F16C=OFF -DGGML_AVX512=OFF -DGGML_AVX2=OFF -DGGML_FMA=OFF"
see the documentation at: https://localai.io/basics/build/index.html
Note: See also https://github.com/go-skynet/LocalAI/issues/288
@@@@@
CPU info:
model name      : 12th Gen Intel(R) Core(TM) i5-12600K
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb intel_pt sha_ni xsaveopt xsavec xgetbv1 xsaves split_lock_detect avx_vnni dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req hfi vnmi umip pku ospke waitpkg gfni vaes vpclmulqdq tme rdpid movdiri movdir64b fsrm md_clear serialize pconfig arch_lbr ibt flush_l1d arch_capabilities
CPU:    AVX    found OK
CPU:    AVX2   found OK
CPU: no AVX512 found
@@@@@

  Model name: nllb-200-3.3B       
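
The actual failure is the NotImplementedError at 10:22AM: the model's parameters are still on PyTorch's meta device (shapes only, no storage) when the transformers backend tries to move them with .to(), which is exactly the situation the error text describes. A minimal standalone reproduction of that failure mode (my illustration, not LocalAI's backend code):

import torch

layer = torch.nn.Linear(4, 4, device="meta")  # parameters have shapes but no data
try:
    layer.to("cpu")  # raises NotImplementedError: Cannot copy out of meta tensor; no data!
except NotImplementedError as e:
    print(e)
layer = layer.to_empty(device="cpu")  # allocates real (uninitialized) storage, as the error message suggests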

Additional context
Docker template: [screenshot]

folder structure:

root@unRAID:~# ls -l /mnt/user/NAS/LocalAI/models/
total 8
drwxr-xr-x 1 root   root    85 Apr 16 07:35 models--facebook--nllb-200-3.3B/
drwxrwxrwx 1 nobody users 4096 Apr 16 06:47 nllb-200-3.3B/
-rw-rw-rw- 1 root   root   109 Apr 16 07:25 nllb-200-3.3B.yaml
root@unRAID:~# cat /mnt/user/NAS/LocalAI/models/nllb-200-3.3B.yaml 
name: nllb-200-3.3B
backend: transformers
parameters:
  model: facebook/nllb-200-3.3B
f16: true
device: cuda
root@unRAID:~# ls -l /mnt/user/NAS/LocalAI/models/nllb-200-3.3B/
total 17187284
-rw-rw-rw- 1 nobody users       7640 Apr 15 21:26 README.md
-rw-rw-rw- 1 nobody users        808 Apr 15 21:26 config.json
-rw-rw-rw- 1 nobody users        189 Apr 15 21:26 generation_config.json
-rw-rw-rw- 1 nobody users       1224 Apr 15 21:26 gitattributes
-rw-rw-rw- 1 nobody users 6933375959 Apr 15 21:30 pytorch_model-00001-of-00003.bin
-rw-rw-rw- 1 nobody users 8545249147 Apr 15 21:39 pytorch_model-00002-of-00003.bin
-rw-rw-rw- 1 nobody users 2098840299 Apr 15 21:36 pytorch_model-00003-of-00003.bin
-rw-rw-rw- 1 nobody users      90035 Apr 15 21:34 pytorch_model.bin.index.json
-rw-rw-rw- 1 nobody users    4852054 Apr 15 21:34 sentencepiece.bpe.model
-rw-rw-rw- 1 nobody users       3548 Apr 15 21:34 special_tokens_map.json
-rw-rw-rw- 1 nobody users   17331176 Apr 15 21:34 tokenizer.json
-rw-rw-rw- 1 nobody users        564 Apr 15 21:34 tokenizer_config.json
root@unRAID:~# 
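
For context on the YAML above: it asks the transformers backend to load facebook/nllb-200-3.3B in f16 on CUDA. The meta-tensor error typically appears when a lazily initialized model is moved with .to() after loading; letting the loader place the weights on the device directly avoids that move. An illustration of the two patterns in plain transformers (my sketch of the general pattern, not necessarily what the LocalAI backend does; device_map="auto" requires the accelerate package):

import torch
from transformers import AutoModelForSeq2SeqLM

# Pattern that can hit "Cannot copy out of meta tensor" when weights were
# initialized lazily (low_cpu_mem_usage) and are then moved with .to():
model = AutoModelForSeq2SeqLM.from_pretrained(
    "facebook/nllb-200-3.3B", torch_dtype=torch.float16).to("cuda")

# Pattern that places the weights on the GPU while loading instead:
model = AutoModelForSeq2SeqLM.from_pretrained(
    "facebook/nllb-200-3.3B", torch_dtype=torch.float16, device_map="auto")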

WebUI: [screenshot]

Regards.
