
Conversation

@richiejp (Collaborator) commented Dec 3, 2025

Redo #7414

Related: #7399

TODO:

  • Test Z-Image
  • Test Flux 2 (uses Mistral 24b, so I can't be bothered)
  • Fix CI failures

Previously Z-Image ran out of VRAM, but I have 16GB and it should be using 4-bit quants.

netlify bot commented Dec 3, 2025

Deploy Preview for localai ready!

🔨 Latest commit: e444e4f
🔍 Latest deploy log: https://app.netlify.com/projects/localai/deploys/6931a235a00b97000759b60e
😎 Deploy Preview: https://deploy-preview-7419--localai.netlify.app

@richiejp (Collaborator, Author) commented Dec 3, 2025

It seems that both z-image and flux1 randomly produce a black image. Sometimes they work, and setting "offload_params_to_cpu" or "keep_clip_on_cpu" seems to make success more likely.

(attached screenshot)

Also, the second time I run generation it segfaults, SYCL fails to allocate memory, or there is an assert failure.
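
For reference, this is roughly the model config I'm testing with. It's a sketch reconstructed from memory and from the debug dump further down in this thread; only the option, model, and file names come from that dump, the exact layout is illustrative.

```yaml
# Approximate z-image-test.yaml used for these tests; layout is illustrative.
name: z-image-test
backend: stablediffusion-ggml
parameters:
  model: z_image_turbo-Q4_K.gguf
options:
  - diffusion_model
  - llm_path:Qwen3-4B.Q4_K_M.gguf
  - vae_path:ae.safetensors
  - offload_params_to_cpu:true  # setting this seems to help
  - keep_clip_on_cpu:true       # as does this
```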

@richiejp (Collaborator, Author) commented Dec 3, 2025

The stablediffusion CLI seems to work fine every time. It initializes some structures itself instead of using the library functions (e.g. sd_image_gen_params), so there are a lot of places where we could be going wrong. On the other hand, we don't use most of those settings and it does work sometimes, so I'm leaning away from blaming our usage of stablediffusion, given the observation below.

I noticed something very strange when sending two requests in series (which segfaults it):

localai-1  | 11:12AM DBG GRPC(z-image-test-127.0.0.1:34425): stderr [INFO ] stable-diffusion.cpp:3390 - generate_image completed in 15.85s
localai-1  | 11:12AM DBG GRPC(z-image-test-127.0.0.1:34425): stderr Writing PNG
localai-1  | 11:12AM DBG GRPC(z-image-test-127.0.0.1:34425): stderr DST: /tmp/generated/content/images/b642520670272.png
localai-1  | 11:12AM DBG GRPC(z-image-test-127.0.0.1:34425): stderr Width: 256
localai-1  | 11:12AM DBG GRPC(z-image-test-127.0.0.1:34425): stderr Height: 256
localai-1  | 11:12AM DBG GRPC(z-image-test-127.0.0.1:34425): stderr Channel: 3
localai-1  | 11:12AM DBG GRPC(z-image-test-127.0.0.1:34425): stderr Data: 0x55557a3c7010
localai-1  | 11:12AM DBG GRPC(z-image-test-127.0.0.1:34425): stderr Saved resulting image to '/tmp/generated/content/images/b642520670272.png'
localai-1  | 11:12AM DBG Response: {"created":1764760358,"id":"741a1de8-0f7f-48c0-a36d-30005a87f0e1","data":[{"embedding":null,"index":0,"url":"http://dell:8081/generated-images/b642520670272.png"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
localai-1  | 11:12AM INF HTTP request method=POST path=/v1/images/generations status=200
localai-1  | 11:12AM INF HTTP request method=GET path=/generated-images/b642520670272.png status=200
localai-1  | 11:13AM INF HTTP request method=GET path=/readyz status=200
localai-1  | 11:14AM INF HTTP request method=GET path=/readyz status=200
localai-1  | 11:15AM INF HTTP request method=GET path=/readyz status=200
localai-1  | 11:16AM INF HTTP request method=GET path=/readyz status=200
localai-1  | 11:17AM DBG context local model name not found, setting to the first model first model name=z-image-test
localai-1  | 11:17AM DBG guessDefaultsFromFile: NGPULayers set NGPULayers=99999999
localai-1  | 11:17AM DBG guessDefaultsFromFile: template already set name=z-image-test
localai-1  | 11:17AM DBG Parameter Config: &{modelConfigFile:/models/z-image-test.yaml PredictionOptions:{BasicModelRequest:{Model:z_image_turbo-Q4_K.gguf} Language: Translate:false N:0 TopP:0xc001b7af38 TopK:0xc001b7af40 Temperature:0xc001b7af48 Maxtokens:0xc001b7af78 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 RepeatLastN:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc001b7af70 TypicalP:0xc001b7af68 Seed:0xc001b7af88 Logprobs:{Enabled:false} TopLogprobs:<nil> LogitBias:map[] NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 ClipSkip:0 Tokenizer:} Name:z-image-test F16:0xc001b7af30 Threads:0xc001b7af28 Debug:0xc001a3a430 Roles:map[] Embeddings:0xc001b7af81 Backend:stablediffusion-ggml TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions: UseTokenizerTemplate:true JoinChatMessagesByCharacter:<nil> Multimodal: ReplyPrefix:} KnownUsecaseStrings:[FLAG_CHAT FLAG_IMAGE] KnownUsecases:0xc0019015a8 Pipeline:{TTS: LLM: Transcription: VAD:} PromptStrings:[a pink crab] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:false GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:false NoMixedFreeString:false NoGrammar:true Prefix: ExpectStringsAfterJSON:false PropOrder: SchemaType: GrammarTriggers:[]} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[] ArgumentRegex:[] ArgumentRegexKey: ArgumentRegexValue: ReplaceFunctionResults:[] ReplaceLLMResult:[] CaptureLLMResult:[] FunctionNameKey: FunctionArgumentsKey:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc001b7af60 MirostatTAU:0xc001b7af58 Mirostat:0xc001b7af50 NGPULayers:0xc001bfb6e0 MMap:0xc001b7af80 MMlock:0xc001b7af81 LowVRAM:0xc001b7af81 Reranking:0xc001b7af81 Grammar: StopWords:[] Cutstrings:[] ExtractRegex:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc001bbf1f0 NUMA:false LoraAdapter: LoraBase: LoraAdapters:[] LoraScales:[] LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: LoadFormat: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 DisableLogStatus:false DType: LimitMMPerPrompt:{LimitImagePerPrompt:0 LimitVideoPerPrompt:0 LimitAudioPerPrompt:0} MMProj: FlashAttention:<nil> NoKVOffloading:false CacheTypeK: CacheTypeV: RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 CFGScale:1} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:25 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: AudioPath:} CUDA:false DownloadFiles:[] Description: Usage: Options:[diffusion_model llm_path:Qwen3-4B.Q4_K_M.gguf vae_path:ae.safetensors offload_params_to_cpu:true use_jinja:true] Overrides:[] MCP:{Servers: Stdio:} Agent:{MaxAttempts:0 MaxIterations:0 EnableReasoning:false EnablePlanning:false EnableMCPPrompts:false EnablePlanReEvaluator:false}}
localai-1  | 11:17AM DBG Model already loaded in memory: z-image-test
localai-1  | 11:17AM DBG Checking model availability (z-image-test)
localai-1  | 11:17AM DBG Model 'z-image-test' already loaded
localai-1  | 11:17AM DBG GRPC(z-image-test-127.0.0.1:34425): stderr gen_image is done: /tmp/generated/content/images/b642520670272.pngGenerating image

Before the model generation for the second request is done, it says "gen_image is done" and gives the image name from the previous generation. That image doesn't actually appear to get saved to disk.

EDIT: this just appears to be due to a missing newline and some changes to the directory structure, so it's looking more like our usage of SD is wrong after all.
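
Just to illustrate the missing-newline point, here is a minimal sketch of the effect in Go (the backend itself is C++, and none of the names or format strings below are the real ones):

```go
package main

import (
	"fmt"
	"os"
)

func main() {
	// Illustration only: without a trailing newline, the next log message is
	// appended directly onto the image path, producing the fused
	// "…pngGenerating image" line seen in the log above.
	path := "/tmp/generated/content/images/b642520670272.png"
	fmt.Fprintf(os.Stderr, "gen_image is done: %s", path) // no "\n" here
	fmt.Fprintln(os.Stderr, "Generating image")
}
```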

Commits added:

  • … and flux2 (Signed-off-by: Richard Palethorpe <io@richiejp.com>)
  • … PNG (Signed-off-by: Richard Palethorpe <io@richiejp.com>)
@richiejp (Collaborator, Author) commented Dec 3, 2025

So there are a few defaults we set differently, and we are passing strings allocated by Go, which would explain the segfaults on the second try: the pointers stored in the ctx object could have been reclaimed by then.
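
To make the string-lifetime point concrete, here is a minimal cgo sketch of the failure mode and the fix, assuming the C side keeps using the pointers it is given. All struct, field, and function names are made up; this is not the backend's actual code.

```go
// Sketch of the string-lifetime issue: if C stores pointers into memory that
// the Go side can reclaim, the second request dereferences dangling pointers.
// Copying into C-owned memory and freeing it only on teardown avoids that.
package sdbridge

/*
#include <stdlib.h>

typedef struct {
    const char *model_path; // the C context keeps using this pointer later
} gen_params;
*/
import "C"
import "unsafe"

// paramStrings keeps the C allocations alive for as long as the C context
// created from them may dereference the pointers.
type paramStrings struct {
	modelPath *C.char
}

func newGenParams(modelPath string) (*C.gen_params, *paramStrings) {
	// C.CString copies the Go string into malloc'd C memory, so the pointer
	// stays valid until we free it explicitly.
	s := &paramStrings{modelPath: C.CString(modelPath)}
	p := (*C.gen_params)(C.malloc(C.size_t(unsafe.Sizeof(C.gen_params{}))))
	p.model_path = s.modelPath
	return p, s
}

// freeGenParams must only run once the C context built from p is destroyed,
// otherwise we recreate exactly the dangling-pointer segfault described above.
func freeGenParams(p *C.gen_params, s *paramStrings) {
	C.free(unsafe.Pointer(s.modelPath))
	C.free(unsafe.Pointer(p))
}
```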

@richiejp (Collaborator, Author) commented Dec 4, 2025

It appears I have stopped the segfaults by also setting free_params_immediately to false (it defaults to true).

Still getting black images randomly.

Signed-off-by: Richard Palethorpe <io@richiejp.com>
@richiejp marked this pull request as ready for review on December 4, 2025, 15:01
@richiejp (Collaborator, Author) commented Dec 4, 2025

I've exhausted every possibility I can think of for why I'm randomly getting a black image. I can't see any parameter that is different, or how the Go/C++ memory interface could be causing it. I've never seen it with the upstream CLI, but I'm wondering whether this could be an Intel-related issue; it would be good to have someone test it on another system @mudler

@mudler (Owner) commented Dec 4, 2025

Interesting... I also don't see anything from a quick scan. Let's merge it so we can have better testing overall.

@mudler merged commit c2e4a1f into mudler:master on Dec 4, 2025 (39 checks passed).
@richiejp (Collaborator, Author) commented Dec 5, 2025

The black image is probably related to leejet/stable-diffusion.cpp#1031; it's strange, though, that I only get it in LocalAI and not in the upstream CLI.

@richiejp (Collaborator, Author) commented Dec 8, 2025

I've tested this with NVIDIA and did not experience the issue. All I can think is that it is an upstream issue that is highly sensitive to environmental inputs and will require some more in-depth debugging to figure out.
