iOS 16 Palettization + Varying Model Architecture Support #223

Open
nkpkg23 opened this issue Jul 31, 2023 · 2 comments
nkpkg23 commented Jul 31, 2023

Hi, I'm new to running Stable Diffusion on iOS and have two clarification questions about using this repository:

  1. Can I run palettized Stable Diffusion models on iOS 16? A note in this Hugging Face article (https://huggingface.co/blog/fast-diffusers-coreml) says that "In order to use 6-bit models, you need the development versions of iOS/iPadOS 17". However, this WWDC video (https://developer.apple.com/videos/play/wwdc2023/10047/) has a slide saying that iOS 16 supports compressed models (sparse weights, quantized weights, palettized weights). What kinds of compressed models can I run on iOS 16 without needing the development version of iOS 17?

  2. I am also testing variants of Stable Diffusion with compressed architectures (e.g. nota-ai/bk-sdm-tiny) that remove several residual and attention blocks from the U-Net. When I ran the torch2coreml script, it failed on assert mid_block_type == "UNetMidBlock2DCrossAttn". It appears that I'll need to modify the python_coreml_stable_diffusion/unet.py code to be compatible with the new architecture (see the config-inspection sketch after this list). Are there any other files I'll need to modify for the script to work with such variants of Stable Diffusion?
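
For reference, here is a minimal diffusers sketch (the repo and attribute names are just the ones from my test; adjust as needed) to dump the variant's U-Net block configuration and see where it diverges from the stock SD 1.x layout that the converter asserts on:

# Sketch: inspect which blocks a pruned SD variant's U-Net actually declares.
from diffusers import UNet2DConditionModel

unet = UNet2DConditionModel.from_pretrained("nota-ai/bk-sdm-tiny", subfolder="unet")

print(unet.config.mid_block_type)    # stock SD 1.5 uses "UNetMidBlock2DCrossAttn"
print(unet.config.down_block_types)  # pruned variants list fewer/different block types
print(unet.config.up_block_types)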

@nkpkg23 nkpkg23 changed the title iOS 16 Palettization + Varying SDM Architecture Support iOS 16 Palettization + Varying Model Architecture Support Jul 31, 2023
atiorh (Collaborator) commented Aug 1, 2023

Hi!

  1. The same WWDC video also covers ahead-of-time (eager, at load time) vs. just-in-time (lazy, during execution) decompression, which is the main difference between iOS 16 and iOS 17. You get the storage size savings in both cases, but the runtime peak memory and latency savings only on the latter (see the sketch at the end of this comment).
  2. You should only need to modify the unet.py file to fix such issues with different architectures. If the fix is not overly specific and preserves existing behavior for other known-to-work models, please submit a PR :)
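
As a rough illustration (not necessarily the exact path the conversion script takes), here is a minimal coremltools sketch of post-training 6-bit palettization of an already-converted model, assuming coremltools >= 7.0; the model path and nbits value below are placeholders:

# Sketch: 6-bit k-means palettization of a converted Core ML model (coremltools >= 7.0).
import coremltools as ct
import coremltools.optimize.coreml as cto

mlmodel = ct.models.MLModel("Unet.mlpackage")  # placeholder path

op_config = cto.OpPalettizerConfig(mode="kmeans", nbits=6)
config = cto.OptimizationConfig(global_config=op_config)

# The palettized weights shrink the .mlpackage on disk either way; iOS 16 decompresses
# them eagerly at load time, iOS 17 lazily during execution (lower peak memory/latency).
compressed = cto.palettize_weights(mlmodel, config)
compressed.save("Unet_palettized.mlpackage")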

nkpkg23 (Author) commented Aug 3, 2023

Thank you for the clarifications! I also wanted to check whether I'm using the right approach for the inpainting functionality on iOS 17, since I am currently hitting a memory leak whenever cgImage.plannerRGBShapedArray() is called on the controlNetInputs.

The application's Documents and Data size grows every time the app is used. In the intermediate generation images, I can see that it starts inpainting inside the masked region of the original image, but it crashes every time after a few steps. (However, I don't have any issues just generating images with palettized Stable Diffusion; this only occurs when I add inpainting with ControlNet.) Are there any other steps I am missing for inpainting to work? Or is this the wrong format for the ControlNet inputs? Any tips would be helpful, thanks!

The inpainting details: I am using palettized runwayml/stable-diffusion-v1-5 with ControlNet (the lllyasviel/control_v11p_sd15_inpaint model) on iOS 17. For the ControlNet input, I set the image size to 512x512 and made the pixels I wanted to inpaint in the original image transparent. When instantiating the StableDiffusionPipeline, I included the following configuration and ControlNet parameter:


let configuration = MLModelConfiguration()
configuration.computeUnits = .cpuAndNeuralEngine

do {
    pipeline = try StableDiffusionPipeline(resourcesAt: url!,
                                           controlNet: ["LllyasvielControlV11PSd15Inpaint"],
                                           configuration: configuration,
                                           disableSafety: false,
                                           reduceMemory: true)
} catch let error {
    print(error.localizedDescription)
}

Then, inside a generateImage() function, this is how I added the ControlNet inputs:


var pipelineConfig = StableDiffusionPipeline.Configuration(prompt: prompt)

if let controlNetInput = UIImage(named: "images/sample.png")?.cgImage {
    pipelineConfig.controlNetInputs = [controlNetInput]
} else {
    print("Error loading ControlNet input image")
}

pipelineConfig.schedulerType = StableDiffusionScheduler.dpmSolverMultistepScheduler

images = try pipeline?.generateImages(configuration: pipelineConfig, progressHandler: { ... })
