-
Notifications
You must be signed in to change notification settings - Fork 942
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Image2image - swift #116
Image2image - swift #116
Conversation
Encoder
Thank you @littleowl ! I reviewed #115 and leaving the noise situation as is should be fine, please see my comment there :) I believe the only implication of my review on #115 on this PR is that the model I/O names should be changed from camelCase to underscore and that is it. Thank you again for your amazing contributions! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just a couple of small things.
swift/StableDiffusion/pipeline/DPMSolverMultistepScheduler.swift
Outdated
Show resolved
Hide resolved
swift/StableDiffusion/pipeline/StableDiffusionPipeline+Resources.swift
Outdated
Show resolved
Hide resolved
@atiorh, @alejandro-isaza - Thanks for your reviews! I believe that I resolved all the comments. Please let me know if there is anything more that I can do. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The header comment needs to be fixed. Other than that just a couple more nits.
swift/StableDiffusion/pipeline/StableDiffusionPipeline+SampleInput.swift
Outdated
Show resolved
Hide resolved
swift/StableDiffusion/pipeline/StableDiffusionPipeline+SampleInput.swift
Outdated
Show resolved
Hide resolved
swift/StableDiffusion/pipeline/StableDiffusionPipeline+SampleInput.swift
Outdated
Show resolved
Hide resolved
@alejandro-isaza, thanks again for the review. The additional comments have been resolved. |
} | ||
|
||
/// Image generation configuration | ||
public struct Configuration: Hashable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, one more thing. Let's rename this file StableDiffusionPipeline.Configuration.swift
@littleowl In addition to the file name change that @alejandro-isaza asked for, if you could just add a quick note in the Example CLI Usage section to indicate that an image argument is now supported, that would be great! |
I just tested the CLI with different images and it seems that we need to guard against or fix a few things: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@littleowl I just realized that we left scattered comments after the initial review, apologies! Here are the two remaining items before we can merge:
swift/StableDiffusion/pipeline/StableDiffusionPipeline+SampleInput.swift
renamed toswift/StableDiffusion/pipeline/StableDiffusionPipeline.Configuration.swift
- specific error message when input image file format and/or resolution do not match expectations
I know you all have this awesome PR under control, but just wanted to say that I successfully converted the v2 base model VAEEncoder using the main (w/ the python image2image PR), and then converted the rest of v2 base using this PR, and am currently generating image2image using my own swift code. So from an end-user perspective: well done! |
@pj4533 well done! I'm so jealous! how can I try?? |
also fix 512 hard coded
Again, sorry for the delay @atiorh. |
Thanks for your hard work @littleowl, this is going to benefit many developers! I just tested your changes and they are working like a charm. Merging now. |
@littleowl congratulations on merging! I'm sooo excited to use it but don't know how. Can you please add instructions in the readme or make a YouTube video? |
Thanks for your exceptional work @littleowl and thank you for sharing the example @ynagatomo! Can you please explain what the |
an experiment on strength; |
Adds image2image functionality.
This is only the swift portion. A separate library that generates the model for the VAE Encoder exists here: #115
The original PR with both Swift and Python libs can be found here: #73
Sadly, I could not complete the request for the python simplification. I believe it is better not to do so. Consequently, sorry for the delay splitting up the PR. There are not many changes from the previous PR apart from removing the python and rebasing the project.
In Swift, an Encoder class is created. Various changes to the scheduler, pipeline, and CLI to support input image and strength. CGImage creation from MLShapedArray is moved into its own file along with the new function to create a MLShapedArray from a CGImage. Image loading and preparation is currently handled / optimized with vImage.
This should work with both Schedulers that we have so far. (Thanks @pcuenca) for the fix to work with DPMSolverMultistepScheduler.
Do not erase the below when submitting your pull request:
#########