@loci-dev loci-dev commented Jan 9, 2026

Mirrored from leejet/stable-diffusion.cpp#1184

CLI changes:

  • Add the --main-backend-device [device_name] argument to set the default backend
  • Remove the --clip-on-cpu, --vae-on-cpu and --control-net-cpu arguments
  • Replace them, respectively, with the new --clip_backend_device [device_name], --vae-backend-device [device_name] and --control-net-backend-device [device_name] arguments
  • Add the --diffusion_backend_device argument (selects the device used for the diffusion/flow models) and the --tae-backend-device argument

C API changes (stable-diffusion.h):

  • Change the content of the sd_ctx_params_t struct.

For example, to run the text encoders on the CPU, you now need to pass --clip_backend_device CPU instead of --clip-on-cpu.
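For anyone with scripts that still pass the removed flags, the renaming above can be sketched as a small argument rewriter. This is only an illustration: `migrate_legacy_flags` is a hypothetical helper that does not exist in stable-diffusion.cpp, the flag spellings are copied from this description, and `CPU` as the device name is taken from the example above (actual device names may differ per build).

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper, not part of stable-diffusion.cpp: rewrites the removed
// boolean flags into the replacement flags named in this PR description.
std::vector<std::string> migrate_legacy_flags(const std::vector<std::string>& args) {
    // Old flag -> new flag, spelled exactly as in the PR description.
    static const std::unordered_map<std::string, std::string> replacements = {
        {"--clip-on-cpu", "--clip_backend_device"},
        {"--vae-on-cpu", "--vae-backend-device"},
        {"--control-net-cpu", "--control-net-backend-device"},
    };

    std::vector<std::string> out;
    out.reserve(args.size() + replacements.size());
    for (const auto& arg : args) {
        auto it = replacements.find(arg);
        if (it == replacements.end()) {
            out.push_back(arg);       // unrelated argument, keep unchanged
        } else {
            out.push_back(it->second);
            out.push_back("CPU");     // assumed device name, from the PR's example
        }
    }
    return out;
}
```

Called on `{"--clip-on-cpu", "--vae-on-cpu"}`, the sketch yields `{"--clip_backend_device", "CPU", "--vae-backend-device", "CPU"}`. All other arguments pass through untouched.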

TODOs:

  • Add a way to list available backend device names from the CLI and/or API
  • Different devices for different text encoders? (for models like SDXL / SD3.x / Flux.1)
  • Device selection for PhotoMaker and vision models

@loci-dev loci-dev temporarily deployed to stable-diffusion-cpp-prod January 9, 2026 21:36 — with GitHub Actions Inactive
@loci-agentic-ai

Explore the complete analysis inside the Version Insights

I've successfully generated a summary report for your project. The report shows performance analysis comparing two versions of the stable-diffusion.cpp project for Pull Request #14.

Key Highlights:

  • Major Performance Regressions: STL vector operations (like std::vector::end()) show 200%+ increases in response time
  • Mixed Results: Some functions show improved response times but decreased throughput
  • Critical Functions Affected: The SDContextParams constructor and destructor show a 28-33% increase in response time

The report includes detailed analysis of the top 10 functions with the most significant changes, key findings, and recommendations for addressing the performance issues before merging the pull request.

@loci-dev loci-dev force-pushed the upstream-PR1184-branch_stduhpf-select-backend branch from 094ac2e to 350df04 Compare January 9, 2026 22:38
@loci-dev loci-dev temporarily deployed to stable-diffusion-cpp-prod January 9, 2026 22:38 — with GitHub Actions Inactive
@loci-agentic-ai

Explore the complete analysis inside the Version Insights

I've successfully generated a summary report for your project. The report shows performance analysis for the stable-diffusion.cpp repository (pull request #14) comparing two versions.

Key Highlights:

Major Performance Regressions (⚠️):

  • Vector iterator functions showing 200%+ slowdowns
  • Hash table operations degraded by 180%
  • Comparison operations slowed by 141%

Significant Improvements (✅):

  • Tensor vector operations improved by 68%
  • Memory swap operations improved by 43%
  • Container checks improved by 40%

The report recommends prioritizing investigation of the vector iterator and hash table regressions while maintaining the positive improvements in tensor operations. Would you like me to provide more detailed analysis on any specific aspect of this report?

@loci-dev loci-dev force-pushed the master branch 3 times, most recently from 3d97fa6 to fd3def8 Compare January 13, 2026 18:14