Add random source that matches PyTorch #124

Merged 3 commits on Feb 15, 2023

@liuliu
Contributor

liuliu commented Feb 8, 2023

This adds a random source that matches PyTorch on CPU. In particular, it matches the result of torch.randn([], dtype=torch.float).

PyTorch's RNG is a bit convoluted and not claimed to be version-stable (will open a separate issue in PyTorch repo on this). However, the current implementation on CPU is fairly straightforward^*.

  1. If the tensor has fewer than 16 elements, each value is a Gaussian sample drawn from MT19937 in double precision and passed through the Box-Muller transformation.

  2. If it has 16 or more elements, it first does uniform sampling in whatever the resulting data type is (in this case, torch.float), and then applies the Box-Muller transformation over one 16-element segment at a time, pairing the first floating-point value in the segment with the one 8 positions after it, and so on.

  3. If the element count is not a multiple of 16, it steps back 16 elements from the end and redoes step 2 on that final segment.
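Steps 2 and 3 above can be sketched in Swift as follows. This is a minimal illustration, not the code in this PR: `boxMullerFill16`, `normalFill`, and the `nextUniform` closure are hypothetical names, the pairing of element j with element j + 8 is one reading of the description, and a toy linear congruential generator stands in for MT19937, so the values will not match PyTorch.

```swift
import Foundation

/// Box-Muller over one 16-element segment starting at `start`.
/// Assumption: element j is paired with element j + 8, per the description above.
func boxMullerFill16(_ data: inout [Float], at start: Int) {
    for j in 0..<8 {
        let u1 = 1 - data[start + j]      // map [0, 1) to (0, 1] so log stays finite
        let u2 = data[start + j + 8]
        let radius = (-2 * log(u1)).squareRoot()
        let theta = 2 * Float.pi * u2
        data[start + j] = radius * cos(theta)
        data[start + j + 8] = radius * sin(theta)
    }
}

/// Steps 2 and 3 for `count >= 16`. `nextUniform` is a hypothetical stand-in
/// for the MT19937-backed uniform sampler over [0, 1).
func normalFill(count: Int, nextUniform: () -> Float) -> [Float] {
    precondition(count >= 16, "fewer than 16 elements take the scalar path (step 1)")
    // Fill the whole buffer with uniform samples first.
    var data = (0..<count).map { _ in nextUniform() }
    // Step 2: transform every complete 16-element segment in place.
    for start in stride(from: 0, through: count - 16, by: 16) {
        boxMullerFill16(&data, at: start)
    }
    // Step 3: for a ragged tail, resample the last 16 elements and redo step 2 on them.
    if count % 16 != 0 {
        let start = count - 16
        for i in start..<count { data[i] = nextUniform() }
        boxMullerFill16(&data, at: start)
    }
    return data
}

// Toy LCG used only to drive the sketch; it is NOT MT19937.
var state: UInt32 = 42
let sample = normalFill(count: 20) {
    state = state &* 1664525 &+ 1013904223
    return Float(state >> 8) / 16_777_216  // top 24 bits mapped to [0, 1)
}
```

Note that in step 3 the last 16 elements overlap the previous segment when the count is not a multiple of 16, so those overlapping values are overwritten rather than reused.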

#########

  • I agree to the terms outlined in CONTRIBUTING.md

@atiorh atiorh requested a review from msiracusa February 9, 2023 07:51
@atiorh
Collaborator

atiorh commented Feb 9, 2023

This is a great feature for the Swift package @liuliu !

To provide a bit more context for others who might be reading this:

  • There are several factors that can cause the generated images to differ between Swift+CoreML and PyTorch+diffusers, as outlined in FAQ Q8. Different RNG implementations are a major source of difference.
  • Our current RNG implementation aims to match that of NumPy in order to keep generations consistent across Python+CoreML vs Swift+CoreML.
  • With @liuliu 's latest PR, we can optionally match either PyTorch or NumPy RNG behavior. Matching the former will remove a major source of difference between Swift+CoreML and other image generation pipelines based on PyTorch in the ecosystem.

@atiorh
Collaborator

atiorh commented Feb 9, 2023

To expose this to the user as an optional RNG configuration, let's use the Configuration abstraction being introduced by #116 here. We are planning on merging #116 very soon and I will ping this PR when we do.

Collaborator

@atiorh atiorh left a comment

We just merged #116. Could you please use the Configuration abstraction from here and let the CLI switch between RNG behaviors? Thank you!

@liuliu
Contributor Author

liuliu commented Feb 13, 2023

The PR is rebased and updated!

@liuliu liuliu requested review from atiorh and removed request for msiracusa February 13, 2023 22:18
@H1p3ri0n

It's so nice to see Liuliu, the original author of the DrawThings app for Stable Diffusion in Swift, finally at this repo. Any chance we can ask for your help getting additional samplers such as Euler A implemented? That would be awesome.

@liuliu
Contributor Author

liuliu commented Feb 13, 2023

Thanks for the kind words! It is hard to keep up with the landscape of samplers for sure, since the newest of the day changed again: AUTOMATIC1111/stable-diffusion-webui#7710

Anyway, Euler A is a pretty simple sampler (compared to DPM++ SDE Karras). Here is the gist of the code (it would take more effort to migrate my code over):

  x = sigmas[0] * xT
  for i in 0..<endStep {
    let sigma = sigmas[i]
    let timestep = parameters.timestep(from: sigma)
    let sqrtAlphaCumprod = 1.0 / (sigma * sigma + 1).squareRoot()
    let input = sqrtAlphaCumprod * x
    xIn[0..<batchSize, 0..<startHeight, 0..<startWidth, 0..<4] = input
    xIn[batchSize..<(batchSize * 2), 0..<startHeight, 0..<startWidth, 0..<4] = input
    let etOut = unet(timestep: timestep, inputs: xIn, t, c)
    let etUncond =
      etOut[0..<batchSize, 0..<startHeight, 0..<startWidth, 0..<4]
    let etCond =
      etOut[batchSize..<(batchSize * 2), 0..<startHeight, 0..<startWidth, 0..<4]
    et = etUncond + cfgScale * (etCond - etUncond)
    let sigmaUp = min(
      sigmas[i + 1],
      1.0
        * ((sigmas[i + 1] * sigmas[i + 1]) * (sigma * sigma - sigmas[i + 1] * sigmas[i + 1])
        / (sigma * sigma)).squareRoot())
    let sigmaDown = (sigmas[i + 1] * sigmas[i + 1] - sigmaUp * sigmaUp).squareRoot()
    let dt = sigmaDown - sigma  // Notice this is already a negative.
    if predictV {
      // denoised = Float(1.0 / (sigma * sigma + 1)) * x - (sigma * sqrtAlphaCumprod) * et
      // d = (x - denoised) / sigma // (x - Float(1.0 / (sigma * sigma + 1)) * x + (sigma * sqrtAlphaCumprod) * et) / sigma = (sigma / (sigma * sigma + 1)) * x + sqrtAlphaCumprod * et
      let d = Functional.add(
        left: x, right: et, leftScalar: sigma / (sigma * sigma + 1),
        rightScalar: sqrtAlphaCumprod)
      x = Functional.add(left: x, right: d, leftScalar: 1, rightScalar: dt)
    } else {
      // denoised = x - sigma * et
      // d = (x - denoised) / sigma // (x - x + sigma * et) / sigma = et
      x = Functional.add(left: x, right: et, leftScalar: 1, rightScalar: dt)
    }
    noise.randn(std: 1, mean: 0)
    if sigmaUp > 0 {
      x = Functional.add(left: x, right: noise, leftScalar: 1, rightScalar: sigmaUp)
    }
  }
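The `sigmaUp` / `sigmaDown` split in the loop above can be isolated into a small function. The following is a sketch under stated assumptions: `ancestralStep` and its `eta` parameter are hypothetical names (the snippet hard-codes the factor as 1.0), and plain `Double` scalars are used instead of the tensor types in the original code.

```swift
/// Splits the move from `sigmaFrom` to `sigmaTo` into a deterministic part
/// (`sigmaDown`) and a fresh-noise part (`sigmaUp`), using the same formula
/// as the loop above. `eta` is a hypothetical knob; the snippet uses 1.0.
func ancestralStep(sigmaFrom: Double, sigmaTo: Double, eta: Double = 1.0)
    -> (sigmaUp: Double, sigmaDown: Double)
{
    let fromSq = sigmaFrom * sigmaFrom
    let toSq = sigmaTo * sigmaTo
    // Noise level to re-inject after the step, capped at the target sigma.
    let sigmaUp = min(sigmaTo, eta * (toSq * (fromSq - toSq) / fromSq).squareRoot())
    // Noise level the deterministic Euler step should land on.
    let sigmaDown = (toSq - sigmaUp * sigmaUp).squareRoot()
    return (sigmaUp, sigmaDown)
}

// Example values chosen only for illustration.
let step = ancestralStep(sigmaFrom: 2.0, sigmaTo: 1.0)
let dt = step.sigmaDown - 2.0  // negative, because sigmaDown is below sigmaFrom
```

By construction, `sigmaUp` squared plus `sigmaDown` squared equals `sigmaTo` squared, so stepping down to `sigmaDown` and then adding noise scaled by `sigmaUp` lands on the target noise level. When `sigmaTo` is 0 (the final step), both values are 0, which is why the loop guards the noise addition with `sigmaUp > 0`.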

Collaborator

@atiorh atiorh left a comment

Thanks for addressing the feedback @liuliu, we are almost there! I just tested these changes and the CLI argument does not seem to be able to modify the default RNG (.numpyRNG). Do you mind fixing that? After that, this looks ready to go :)
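The kind of bug described here, a CLI value that is parsed but never overrides the default, can be illustrated with a small sketch. Everything below is hypothetical and is not the code in this PR: `RNGChoice`, its raw values, `PipelineConfiguration`, and `makeConfiguration` are illustrative names, and only the `.numpyRNG` default comes from this thread.

```swift
/// Hypothetical RNG selector; only `.numpyRNG` is named in this thread.
enum RNGChoice: String, CaseIterable {
    case numpyRNG = "numpy"
    case torchRNG = "torch"
}

/// Hypothetical stand-in for the Configuration abstraction from #116.
struct PipelineConfiguration {
    var rng: RNGChoice = .numpyRNG
}

/// Builds a configuration from an optional CLI value.
/// An absent flag keeps the default; a present flag must override it.
/// Returns nil when the value is not a known RNG name.
func makeConfiguration(rngArgument: String?) -> PipelineConfiguration? {
    var config = PipelineConfiguration()
    if let raw = rngArgument {
        guard let choice = RNGChoice(rawValue: raw) else { return nil }
        config.rng = choice  // the assignment that is easy to leave out
    }
    return config
}
```

A quick check for this class of bug is to pass each supported value and confirm that the resulting configuration reports the matching RNG rather than the default.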

@liuliu
Contributor Author

liuliu commented Feb 15, 2023

Good catch! Missed the update!

@atiorh atiorh merged commit ddefb61 into apple:main Feb 15, 2023