Memory-efficient attention and gradio mask fixed #117

Merged: 2 commits, Sep 5, 2022

neonsecret

No description provided.

(cherry picked from commit ddde264)
@neonsecret changed the title from "Memory-efficient attention (single file changed)" to "Memory-efficient attention and gradio mask fixed" on Sep 4, 2022
@MrLavender

Nice work. Applying the attention.py change to the original SD lets me do 512x512 on 8GB; previously I could only do 448x448.

But (on the original SD anyway) the size of sim along dim 0 is 16, so splitting at 8 (sim[8:] and sim[:8]) instead of 4 is more memory efficient (it makes the difference between working and failing with out-of-memory). A more general way to do this would be:

half = int(sim.size(dim=0) / 2)
sim[:half] = sim[:half].softmax(dim=-1)
sim[half:] = sim[half:].softmax(dim=-1)

or, for maximum memory efficiency (with about 1% performance difference for me):

for i in range(sim.size(dim=0)):
    sim[i] = sim[i].softmax(dim=-1)
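Both variants are safe because softmax(dim=-1) normalises every row on its own, so running it slice by slice along dim 0 gives the same values while only needing temporary storage for one slice. A minimal sketch of that property, using plain Python lists as a stand-in for the `sim` tensor so it runs without torch (the helper names are hypothetical and not part of attention.py):

```python
import math

def softmax_row(row):
    """Numerically stable softmax over one row (the last dimension)."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_all(sim):
    """Softmax over the last dimension of a [batch][rows][cols] nested list."""
    return [[softmax_row(r) for r in mat] for mat in sim]

def softmax_in_chunks(sim, chunk):
    """Same result, but overwrite `sim` one slice of `chunk` entries at a time."""
    for start in range(0, len(sim), chunk):
        sim[start:start + chunk] = softmax_all(sim[start:start + chunk])
    return sim

# Toy data: 4 entries along dim 0, each a 2x3 matrix.
sim = [[[float(b + i * j) for j in range(3)] for i in range(2)] for b in range(4)]
expected = softmax_all(sim)

half = len(sim) // 2
result = softmax_in_chunks([[row[:] for row in mat] for mat in sim], half)
print(result == expected)
```

The chunk size only changes how much is processed at once, not the numbers that come out, which is why the choice between halves and single entries is purely a memory versus speed trade-off.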

@Doggettx

Doggettx commented Sep 4, 2022

I've found a way to split up the einsum too, can go to insane resolutions on my card now... Might be a better way to do this, my knowledge of torch and python is very limited (meaning almost 0 ;)

Also not quite sure if all the deletes are really needed, no idea when the garbage collector triggers for unused tensors, but guess can't hurt to force it.

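def forward(self, x, context=None, mask=None):
    # Replacement for CrossAttention.forward in attention.py; uses names already
    # available in that file (torch, einsum, rearrange, default).
    # Note: `mask` is accepted but not applied here.
    h = self.heads

    q = self.to_q(x)
    context = default(context, x)
    k = self.to_k(context)
    v = self.to_v(context)
    del context, x

    q, k, v = map(lambda t: rearrange(t, 'b n (h d) -> (b h) n d', h=h), (q, k, v))

    # Output buffer in q's dtype, so half-precision runs don't allocate float32.
    r1 = torch.zeros(q.shape[0], q.shape[1], v.shape[2], device=q.device, dtype=q.dtype)
    # Handle 4 of the (b h) entries per iteration, so the score tensors are
    # at most 4 x i x j instead of (b h) x i x j.
    for i in range(0, q.shape[0], 4):
        end = i + 4
        s1 = einsum('b i d, b j d -> b i j', q[i:end], k[i:end])
        s1 *= self.scale

        s2 = s1.softmax(dim=-1)
        del s1  # drop the raw scores so their memory can be reused

        r1[i:end] = einsum('b i j, b j d -> b i d', s2, v[i:end])
        del s2

    r2 = rearrange(r1, '(b h) n d -> b n (h d)', h=h)
    del r1

    return self.to_out(r2)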

@neonsecret
Author


it won't work, you are only multiplying parts and the whole tensor, the tensor for einsum shouldn't be split

@Doggettx

Doggettx commented Sep 4, 2022


Seems to work fine, gives same results, I have no idea how einsum works though, but as far as I can see there are no side effects
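The identical results are expected: in 'b i d, b j d -> b i j' the index b appears on both inputs and on the output, so the einsum is a batched matrix product and every b entry is computed independently of the others. A small sketch of that argument with plain Python lists instead of torch tensors (function names and toy data are made up for illustration):

```python
import math

def attention_one(q, k, v, scale):
    """Attention for one (batch*head) entry: softmax(q @ k^T * scale) @ v."""
    out = []
    for qi in q:
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        out.append([sum(w * vj[d] for w, vj in zip(weights, v))
                    for d in range(len(v[0]))])
    return out

def attention_full(q, k, v, scale):
    """All entries in one pass (stands in for the unsliced einsum)."""
    return [attention_one(qb, kb, vb, scale) for qb, kb, vb in zip(q, k, v)]

def attention_sliced(q, k, v, scale, step):
    """`step` entries per loop iteration, mirroring the loop in forward()."""
    out = [None] * len(q)
    for i in range(0, len(q), step):
        end = i + step
        out[i:end] = attention_full(q[i:end], k[i:end], v[i:end], scale)
    return out

# Toy shapes: 4 entries, 3 queries, 5 keys/values, feature size 2.
q = [[[0.1 * (b + i + d) for d in range(2)] for i in range(3)] for b in range(4)]
k = [[[0.2 * (b - j + d) for d in range(2)] for j in range(5)] for b in range(4)]
v = [[[float(b * j + d) for d in range(2)] for j in range(5)] for b in range(4)]
scale = 2 ** -0.5  # assumed to be dim_head ** -0.5, with dim_head = 2 here

full = attention_full(q, k, v, scale)
print(all(attention_sliced(q, k, v, scale, s) == full for s in (1, 2, 4)))
```

Only the first dimension can be cut this way without extra bookkeeping; splitting along the key dimension j would also split the softmax normalisation and would need additional handling.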

@neonsecret
Author

and memory?

@Doggettx

Doggettx commented Sep 4, 2022

I went from being able to do 1920x640 to 1920x832, it's about 1/4th for the einsum now, I don't have any other optimizations though, only this one (from the compvis version)
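The "about 1/4th" figure matches a back-of-the-envelope estimate of the score tensor size. The sketch below assumes a 512x512 image, an 8x downsampling autoencoder (so 64x64 = 4096 latent positions in the largest self-attention layers), float32 values, and a first dimension of 16 as reported earlier in this thread; these numbers are assumptions for illustration, not measurements:

```python
def tokens_for(width, height, downsample=8):
    """Number of latent positions, assuming an 8x downsampling autoencoder."""
    return (width // downsample) * (height // downsample)

def score_tensor_bytes(batch_heads, tokens, bytes_per_element=4):
    """Bytes held by one score tensor of shape (batch_heads, tokens, tokens)."""
    return batch_heads * tokens * tokens * bytes_per_element

MIB = 1024 ** 2
n = tokens_for(512, 512)                   # 64 * 64 = 4096
full = score_tensor_bytes(16, n) / MIB     # all 16 entries at once
sliced = score_tensor_bytes(4, n) / MIB    # 4 entries per loop iteration
print(f"full: {full:.0f} MiB, slice of 4: {sliced:.0f} MiB")
# prints "full: 1024 MiB, slice of 4: 256 MiB"
```

The real peak is somewhat higher, since the raw scores and their softmax briefly coexist, but the ratio between the full and the sliced version stays the same.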

@neonsecret
Author

hmm very weird

@neonsecret
Author

fucking hell it works

@Doggettx

Doggettx commented Sep 4, 2022

It actually works with steps of 2 as well, I can go to 1920x1024 then, it breaks at steps of 1, no idea how this stuff works hehe

@Doggettx

Doggettx commented Sep 4, 2022

Does seem to make it slower though

@Doggettx

Doggettx commented Sep 4, 2022

For comparison, I tested the same prompt/seed/settings etc. at different step sizes:

step size 8 - 7.0 it/s
step size 4 - 6.2 it/s
step size 2 - 4.7 it/s

The drop from 8 to 4 isn't too bad, but I'm not sure going to 2 is worth it, unless you want to render at really high resolutions.

@neonsecret
Author

4 doesn't seem to make any difference for me
I'm going to add both options

@victorbessa96

victorbessa96 commented Sep 4, 2022

It would be great to have an option to choose between faster renders and really high resolution, so perhaps an option to switch between 8 and 2?
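One way such a switch could look is a command line option that sets the loop step. This is only a sketch: the flag name `--attention-step` and the helper `slice_ranges` are hypothetical and do not exist in this repo or in any fork mentioned here.

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description="Sketch of a slice-size option")
    parser.add_argument(
        "--attention-step",  # hypothetical flag name
        type=int,
        default=8,
        choices=[1, 2, 4, 8],
        help="entries processed per attention loop iteration; "
             "smaller values use less VRAM but run slower",
    )
    return parser

def slice_ranges(total, step):
    """Yield (start, end) pairs covering range(total) in blocks of `step`."""
    for start in range(0, total, step):
        yield start, min(start + step, total)

# Parse an explicit argument list so the sketch runs without a real CLI call.
args = build_parser().parse_args(["--attention-step", "2"])
ranges = list(slice_ranges(16, args.attention_step))
print(ranges[:3])  # [(0, 2), (2, 4), (4, 6)]
```

The parsed value would then replace the hard-coded step in the attention loop; the default of 8 is just a placeholder for the faster setting.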

@JohnAlcatraz

JohnAlcatraz commented Sep 4, 2022

I've found a way to split up the einsum too, can go to insane resolutions on my card now... Might be a better way to do this, my knowledge of torch and python is very limited (meaning almost 0 ;)

@Doggettx Wow, your code works amazingly well!

I can not see any significant slowdown, it works great even using a step amount of 1 in the for loop. I did also check that the output from the same seed is fully identical.

This is the speed I'm getting when generating a 512x512 image on an RTX 2070 Super:

  • Default SD: 5.0 it/s | 0.39 Megapixels Max Res
  • Your modified def forward with loop steps of 8: 4.94 it/s | Didn't test Max Res
  • Your modified def forward with loop steps of 4: 4.87 it/s | 0.79 Megapixels Max Res
  • Your modified def forward with loop steps of 2: 4.78 it/s | 1.14 Megapixels Max Res
  • Your modified def forward with loop steps of 1: 4.46 it/s | 1.5 Megapixels Max Res

The resolution I can do with a for-loop steps amount of 1 is incredible. It's fully worth the very small reduction in speed. But ideally, the amount of loop steps would be made a command line option that can be set.

So this is the code I'm using for a loop step amount of 1:

    def forward(self, x, context=None, mask=None):
        h = self.heads

        q = self.to_q(x)
        context = default(context, x)
        k = self.to_k(context)
        v = self.to_v(context)
        del context, x

        q, k, v = map(lambda t: rearrange(t, 'b n (h d) -> (b h) n d', h=h), (q, k, v))

        r1 = torch.zeros(q.shape[0], q.shape[1], v.shape[2], device=q.device)
        for i in range(0, q.shape[0], 1):
            end = i + 1
            s1 = einsum('b i d, b j d -> b i j', q[i:end], k[i:end])
            s1 *= self.scale

            s2 = s1.softmax(dim=-1)
            del s1

            r1[i:end] = einsum('b i j, b j d -> b i d', s2, v[i:end])
            del s2

        r2 = rearrange(r1, '(b h) n d -> b n (h d)', h=h)
        del r1

        return self.to_out(r2)

With this code, I can do 1216x1216 on 8 GB VRAM. That is 4.4 times as many pixels compared to the maximum I can do with default SD. It's amazing!

To be clear, I did my testing above with default SD at half precision, not with the "optimized" version from this repo, so I was comparing default SD at half precision vs only the changed attention.py. With the other optimizations from this repo, I could surely go even higher than 1216x1216 on 8 GB VRAM now. But the other optimizations from this repo hurt speed a lot more, so I think they are not really worth doing any more now.

@TheEnhas

TheEnhas commented Sep 4, 2022

How does this translate into doing batches of images though? One thing I tend to do is 20 512x512 50-step generations with turbo mode; how is VRAM use with half precision + the "loop step 1" code above on base SD compared to that? Because if it's much better or even comparable, then yeah, the old optimizations shouldn't really be used anymore, except maybe as an option to save even more on VRAM-limited (i.e. 4GB or less) GPUs, or for really big images.

@JohnAlcatraz

JohnAlcatraz commented Sep 4, 2022

I noticed that the "step 1" version does not actually work for me either - I didn't pay attention to what exactly the log showed. I thought it ran through to 100% and succeeded, but what it actually does is run through to 100% and then crash with an out-of-memory error at high resolutions. Lower resolutions work fine with the "step 1" code without crashes, but then I can also use the "step 2" version at a slightly higher speed.

There's probably some other code somewhere that needs to be optimized more for the "step 1" version to make sense and not crash at 100%.

So what I said above regarding "step 1" clearly being the best is not true. It's "step 2" that's the best because that actually works. The table I showed above is still accurate, just ignore the "loop steps of 1" row.

The maximum I can do now with 8 GB VRAM, using the "step 2" code, is 1.14 Megapixels, as mentioned in my previous comment. A factor of 2.91 improvement over default SD.

So this code:

    def forward(self, x, context=None, mask=None):
        h = self.heads

        q = self.to_q(x)
        context = default(context, x)
        k = self.to_k(context)
        v = self.to_v(context)
        del context, x

        q, k, v = map(lambda t: rearrange(t, 'b n (h d) -> (b h) n d', h=h), (q, k, v))

        r1 = torch.zeros(q.shape[0], q.shape[1], v.shape[2], device=q.device)
        for i in range(0, q.shape[0], 2):
            end = i + 2
            s1 = einsum('b i d, b j d -> b i j', q[i:end], k[i:end])
            s1 *= self.scale

            s2 = s1.softmax(dim=-1)
            del s1

            r1[i:end] = einsum('b i j, b j d -> b i d', s2, v[i:end])
            del s2

        r2 = rearrange(r1, '(b h) n d -> b n (h d)', h=h)
        del r1

        return self.to_out(r2)
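The megapixel figures quoted in this thread can be recomputed from the resolutions. A short sketch (the helper name is made up); note that the table values are rounded, so the ratio recomputed from them comes out at about 2.92 rather than exactly the 2.91 quoted:

```python
def megapixels(width, height):
    """Image size in megapixels (millions of pixels)."""
    return width * height / 1_000_000

print(round(megapixels(1216, 1216), 2))  # 1.48
print(round(megapixels(1920, 1088), 2))  # 2.09
print(round(1.14 / 0.39, 2))             # 2.92, step 2 vs default SD
```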

@JohnAlcatraz

JohnAlcatraz commented Sep 4, 2022

How does this translate into doing batches of images though? One thing I tend to do is 20 512x512 50-step generations with turbo mode; how is VRAM use with half precision + the "loop step 1" code above on base SD compared to that? Because if it's much better or even comparable, then yeah, the old optimizations shouldn't really be used anymore, except maybe as an option to save even more on VRAM-limited (i.e. 4GB or less) GPUs, or for really big images.

Not sure what exactly you're asking about? I can of course still set --n_iter 20 and then it generates 20 images, the amount of images that are generated does not affect VRAM usage. What does affect VRAM usage is --n_samples, but I think there is no reason to ever have that higher than 1.

@Doggettx

Doggettx commented Sep 4, 2022

@JohnAlcatraz So weird to me that you see almost no difference, your steps 1 is actually faster than on my 3090, just wondering what OS are you using? and which version of torch?
I'm running it in windows 11 with torch 1.12.1+cu116. Wonder if that can make a difference, I'm just running the default SD as well with some custom modifications but those have nothing to do with the rendering part.

@JohnAlcatraz

JohnAlcatraz commented Sep 4, 2022

@Doggettx I'm on Windows 10, 21H1. If I knew which version of torch I'm using I'd tell you, but I have no idea how to check that; I'm a C++ programmer with no clue about Python ;) I don't usually do anything with torch, I only installed it for Stable Diffusion. So probably a very new version.

Maybe you are not running at half precision? That is a difference how I run it compared to fully default SD. Just adding that model.half(). Most forks by now do that by default.
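For reference, the installed torch version can be printed with `python -c "import torch; print(torch.__version__)"` from the environment Stable Diffusion runs in. The sketch below does the same lookup through the standard library, so it also works when the package is missing (the helper name is hypothetical):

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

# Prints the version string, or None when torch is not installed here.
print("torch:", installed_version("torch"))
```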

@Doggettx

Doggettx commented Sep 4, 2022

I checked that to be sure: my model was still running at full precision. I set it to half, but that doesn't really affect speed; it just allows me to render at even higher res now (1920x1536 with only this change).

Think I'll just make it configurable in my version, for higher resolutions the speed difference seems to get less, but at low resolutions it's more than twice as slow and not really needed.

My workflow is usually first rendering with one dimension at 512 (so 512x768 or something) with a normal upscaler and then img2img the upscaled version at native res. Keeps coherence high while still allowing to render native at high resolutions. But then it's nicer if you can pump out those low res images fast to find a good one ;)

@MrLavender

The optimization work done here in the last few hours really is awesome. Thank you all!

I know nothing about Machine Learning and never heard of an einsum before today but looking at the pytorch docs I see this interesting note;

This function does not optimize the given expression, so a different formula for the same computation may run faster or consume less memory. Projects like opt_einsum (https://optimized-einsum.readthedocs.io/en/stable/) can optimize the formula for you.

https://pytorch.org/docs/stable/generated/torch.einsum.html

So maybe there are further improvements to be had in this forward() function (in speed if not memory)?

@willlllllio

This is crazy, with step=2 I can do 1088x1024 on a 6GB card with no noticeable extra slowdown, though I do need the cuda max_split arg for that res.

@7flash

7flash commented Sep 5, 2022

The only noticeable optimization in this PR is in these lines, the halving of attention, but what does it actually mean?

        sim[4:] = sim[4:].softmax(dim=-1)
        sim[:4] = sim[:4].softmax(dim=-1)

Seems like it applies softmax separately to each half of the array? Does that make it faster?


@CaptnSeraph

The second step works for me, helped me push my 8gb 1070 to 896x896

@willlllllio where do you specify the max_split? I assume you mean PYTORCH_CUDA_ALLOC_CONF, but which file should that go into, or do I need to set it as an environment variable each time?

Also, what would be the ideal max size to set for a card with 8192 MB?
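PYTORCH_CUDA_ALLOC_CONF is an environment variable, so it can be set either in the shell before launching or from Python before torch first touches CUDA (setting it before `import torch` is the safe choice). A minimal sketch, assuming the `max_split_size_mb` option is the one meant here; the value 128 is only an illustration, not a recommendation from this thread, and the helper name is made up:

```python
import os

def configure_allocator(max_split_size_mb, env=os.environ):
    """Set PYTORCH_CUDA_ALLOC_CONF; call this before torch initialises CUDA."""
    env["PYTORCH_CUDA_ALLOC_CONF"] = f"max_split_size_mb:{max_split_size_mb}"
    return env["PYTORCH_CUDA_ALLOC_CONF"]

# Demonstrated on a scratch dict so the real environment is left untouched;
# in a launch script, call configure_allocator(128) with the default `env`.
setting = configure_allocator(128, env={})
print(setting)  # max_split_size_mb:128
```

A good value depends on the card and the resolution, so it is probably best found by experiment.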

@JohnAlcatraz

JohnAlcatraz commented Sep 5, 2022

It seems like the original PR version was merged, which gives far smaller VRAM savings than the new optimization code that @Doggettx later figured out in this thread.

@basujindal
Owner

basujindal commented Sep 5, 2022

It seems like the original PR version was merged, which gives far smaller VRAM savings than the new optimization code that @Doggettx later figured out in this thread.

Is there a pull request for the optimization discussed here?

@JohnAlcatraz

JohnAlcatraz commented Sep 5, 2022

No, no one made a new PR for it yet.

You can see the exact changes, implemented in the best way, in this branch by @Doggettx: https://github.com/Doggettx/stable-diffusion/commits/main

I don't know if he intends to open a PR himself with them?

@ryudrigo

ryudrigo commented Sep 5, 2022

I just opened a PR, but it was just about my comment -- there might be other optimizations I didn't look at

@camenduru

1 step 1216x1216 on 8 GB VRAM with 1070 O8G 🎉 Thank You, Everyone.

@JohnAlcatraz

JohnAlcatraz commented Sep 5, 2022

1 step 1216x1216 on 8 GB VRAM with 1070 O8G 🎉 Thank You, Everyone.

If you mean you are using the code shown here with 1 step, you likely see it crash at 100%. But with the newest version of the optimization from @Doggettx, you will likely be able to successfully go that high or even higher.

@JohnAlcatraz

JohnAlcatraz commented Sep 5, 2022

1920x1088 with 1070 O8G 1034.58s/it https://i.imgur.com/CbIfbHp.png 🎉🎉🎉

1920x1088 on 8 GB VRAM is certainly impressive!

@ryudrigo

ryudrigo commented Sep 5, 2022

There, polished it a little bit more. Now 1024px in turbo mode takes 8117 MB and 90 seconds (total) for me.

@jimovonz

jimovonz commented Sep 5, 2022

Anyone else finding that with increased resolution, the images are losing coherence, with multiple random occurrences of the subject elements?

@JohnAlcatraz

Anyone else finding that with increased resolution, the images are losing coherence, with multiple random occurrences of the subject elements?

That is a known issue with stable diffusion, yes. The model was trained at 512x512 so that's the only resolution it can do very well.

@jimovonz

jimovonz commented Sep 5, 2022

Anyone else finding that with increased resolution, the images are losing coherence, with multiple random occurrences of the subject elements?

That is a known issue with stable diffusion, yes. The model was trained at 512x512 so that's the only resolution it can do very well.

Unfortunately this seems to make most of these higher resolution images useless - unless of course you are specifically after something more abstract....

@JohnAlcatraz

Unfortunately this seems to make most of these higher resolution images useless - unless of course you are specifically after something more abstract....

These optimizations are not just about being able to generate larger resolutions, but also about being able to generate the same resolution on a lower amount of VRAM, making Stable Diffusion more accessible to people with low VRAM GPUs.

@ryudrigo

ryudrigo commented Sep 5, 2022

Indeed! I should've talked about the normal setting. Least memory usage I can get with PR #122 for 512x512 is just under 3GB VRAM

@CaptnSeraph

Unfortunately this seems to make most of these higher resolution images useless - unless of course you are specifically after something more abstract....

As the img2img uses the txt2img sequence (I think) you can use lower res within txt2img to get a good seed and a good "thumbnail" and then refine larger with img2img before running through goBig and gfpgan for serious high quality and sizes (I've got photorealism at DSLR resolutions)

@jimovonz

jimovonz commented Sep 6, 2022 via email

@GordonFreeeman

GordonFreeeman commented Sep 7, 2022

Holy crap, this is actually working! I'm only a casual when it comes to python, or coding in general, but after fiddling with the above tweaks/fixes, I can generate incredibly high resolutions on my measly 6GB 1660 Ti (laptop card). Plus I have to run at full precision, because fp16 is broken exclusively on 1660 series cards.

512x512: 1.55 it/s
1024x576: 2.21 s/it
1024x1024: 6.36 s/it
1280x768: 6.51 s/it
1408x768: 7.59 s/it
1920x576: 7.74 s/it
1536x960 was working (13.80 s/it), but crashed during image export, when VRAM usage went from 5.4 GB to >6 GB

I ran all tests with 50 ddim_steps. Had to restart twice because for some reason, VRAM wasn't cleared up completely sometimes, and at higher res, it's going a little crazy with the iteration time. But it's still pretty mindblowing as a proof of concept.
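The figures above mix it/s and s/it, since the progress bar reports seconds per iteration once an iteration takes longer than a second. A small sketch that normalises both units and estimates the sampling time for the 50 steps used here (the function names are made up, and model loading and image export are not included):

```python
def seconds_per_iteration(value, unit):
    """Normalise a progress-bar rate to seconds per iteration."""
    if unit == "it/s":
        return 1.0 / value
    if unit == "s/it":
        return float(value)
    raise ValueError(f"unknown unit: {unit}")

def sampling_seconds(value, unit, steps=50):
    """Approximate time spent in the sampling loop for `steps` iterations."""
    return seconds_per_iteration(value, unit) * steps

print(round(sampling_seconds(1.55, "it/s")))  # 32  (512x512)
print(round(sampling_seconds(6.36, "s/it")))  # 318 (1024x1024)
```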

zaidorx added a commit to zaidorx/stable-diffusion-webui-1 that referenced this pull request Sep 12, 2022
Doggettx referenced this pull request in Birch-san/stable-diffusion Sep 14, 2022
…aboration incorporating a lot of people's contributions -- including for example @Doggettx and the original code from @neonsecret on which the Doggetx optimizations were based (see invoke-ai/InvokeAI#431, https://github.com/sd-webui/stable-diffusion-webui/pull/771#issuecomment-1239716055). Takes exactly the same amount of time to run 8 steps as original CompVis code does (10.4 secs, ~1.25s/it).
Cabbagec added a commit to Cabbagec/stable-diffusion that referenced this pull request Sep 19, 2022
kylewlacy pushed a commit to kylewlacy/stable-diffusion that referenced this pull request Sep 23, 2022
* start refactoring -not yet functional

* first phase of refactor done - not sure weighted prompts working

* Second phase of refactoring. Everything mostly working.
* The refactoring has moved all the hard-core inference work into
ldm.dream.generator.*, where there are submodules for txt2img and
img2img. inpaint will go in there as well.
* Some additional refactoring will be done soon, but relatively
minor work.

* fix -save_orig flag to actually work

* add @neonsecret attention.py memory optimization

* remove unneeded imports

* move token logging into conditioning.py

* add placeholder version of inpaint; porting in progress

* fix crash in img2img

* inpainting working; not tested on variations

* fix crashes in img2img

* ported attention.py memory optimization basujindal#117 from basujindal branch

* added @torch_no_grad() decorators to img2img, txt2img, inpaint closures

* Final commit prior to PR against development
* fixup crash when generating intermediate images in web UI
* rename ldm.simplet2i to ldm.generate
* add backward-compatibility simplet2i shell with deprecation warning

* add back in mps exception, addresses @Vargol comment in CompVis#354

* replaced Conditioning class with exported functions

* fix wrong type of with_variations attribute during intialization

* changed "image_iterator()" to "get_make_image()"

* raise NotImplementedError for calling get_make_image() in parent class

* Update ldm/generate.py

better error message

Co-authored-by: Kevin Gibbons <bakkot@gmail.com>

* minor stylistic fixes and assertion checks from code review

* moved get_noise() method into img2img class

* break get_noise() into two methods, one for txt2img and the other for img2img

* inpainting works on non-square images now

* make get_noise() an abstract method in base class

* much improved inpainting

Co-authored-by: Kevin Gibbons <bakkot@gmail.com>
kylewlacy pushed a commit to kylewlacy/stable-diffusion that referenced this pull request Sep 23, 2022
commit 1c649e4
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Mon Sep 12 13:29:16 2022 -0400

    fix torchvision dependency version CompVis#511

commit 4d197f6
Merge: a3e07fb 190ba78
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Mon Sep 12 07:29:19 2022 -0400

    Merge branch 'development' of github.com:lstein/stable-diffusion into development

commit a3e07fb
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Mon Sep 12 07:28:58 2022 -0400

    fix grid crash

commit 9fa1f31
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Mon Sep 12 07:07:05 2022 -0400

    fix opencv and realesrgan dependencies in mac install

commit 190ba78
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Mon Sep 12 01:50:58 2022 -0400

    Update requirements-mac.txt

    Fixed dangling dash on last line.

commit 25d9ccc
Author: Any-Winter-4079 <50542132+Any-Winter-4079@users.noreply.github.com>
Date:   Mon Sep 12 03:17:29 2022 +0200

    Update model.py

commit 9cdf3ac
Author: Any-Winter-4079 <50542132+Any-Winter-4079@users.noreply.github.com>
Date:   Mon Sep 12 02:52:36 2022 +0200

    Update attention.py

    Performance improvements to generate larger images in M1 CompVis#431

    Update attention.py

    Added dtype=r1.dtype to softmax

commit 49a96b9
Author: Mihai <299015+mh-dm@users.noreply.github.com>
Date:   Sat Sep 10 16:58:07 2022 +0300

    ~7% speedup (1.57 to 1.69it/s) from switch to += in ldm.modules.attention. (CompVis#482)

    Tested on 8GB eGPU nvidia setup so YMMV.
    512x512 output, max VRAM stays same.

commit aba94b8
Author: Niek van der Maas <mail@niekvandermaas.nl>
Date:   Fri Sep 9 15:01:37 2022 +0200

    Fix macOS `pyenv` instructions, add code block highlight (CompVis#441)

    Fix: `anaconda3-latest` does not work, specify the correct virtualenv, add missing init.

commit aac5102
Author: Henry van Megen <h.vanmegen@gmail.com>
Date:   Thu Sep 8 05:16:35 2022 +0200

    Disabled debug output (CompVis#436)

    Co-authored-by: Henry van Megen <hvanmegen@gmail.com>

commit 0ab5a36
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 11 17:19:46 2022 -0400

    fix missing lines in outputs

commit 5e43372
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 11 16:20:14 2022 -0400

    upped max_steps in v1-finetune.yaml and fixed TI docs to address CompVis#493

commit 7708f4f
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 11 16:03:37 2022 -0400

    slight efficiency gain by using += in attention.py

commit b86a1de
Author: blessedcoolant <54517381+blessedcoolant@users.noreply.github.com>
Date:   Mon Sep 12 07:47:12 2022 +1200

    Remove print statement styling (CompVis#504)

    Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>

commit 4951e66
Author: chromaticist <mhostick@gmail.com>
Date:   Sun Sep 11 12:44:26 2022 -0700

    Adding support for .bin files from huggingface concepts (CompVis#498)

    * Adding support for .bin files from huggingface concepts

    * Updating documentation to include huggingface .bin info

commit 79b445b
Merge: a323070 f7662c1
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 11 15:39:38 2022 -0400

    Merge branch 'development' of github.com:lstein/stable-diffusion into development

commit a323070
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 11 15:28:57 2022 -0400

    update requirements for new location of gfpgan

commit f7662c1
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 11 15:00:24 2022 -0400

    update requirements for changed location of gfpgan

commit 93c242c
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 11 14:47:58 2022 -0400

    make gfpgan_model_exists flag available to web interface

commit c7c6cd7
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 11 14:43:07 2022 -0400

    Update UPSCALE.md

    New instructions needed to accommodate fact that the ESRGAN and GFPGAN packages are now installed by environment.yaml.

commit 77ca83e
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 11 14:31:56 2022 -0400

    Update CLI.md

    Final documentation tweak.

commit 0ea145d
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 11 14:29:26 2022 -0400

    Update CLI.md

    More doc fixes.

commit 162285a
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 11 14:28:45 2022 -0400

    Update CLI.md

    Minor documentation fix

commit 37c921d
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 11 14:26:41 2022 -0400

    documentation enhancements

commit 4f72cb4
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 11 13:05:38 2022 -0400

    moved the notebook files into their own directory

commit 878ef2e
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 11 12:58:06 2022 -0400

    documentation tweaks

commit 4923118
Merge: 16f6a67 defafc0
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 11 12:51:25 2022 -0400

    Merge branch 'development' of github.com:lstein/stable-diffusion into development

commit defafc0
Author: Dominic Letz <dominic@diode.io>
Date:   Sun Sep 11 18:51:01 2022 +0200

    Enable upscaling on m1 (CompVis#474)

commit 16f6a67
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 11 12:47:26 2022 -0400

    install GFPGAN inside SD repository in order to fix 'dark cast' issue basujindal#169

commit 0881d42
Author: blessedcoolant <54517381+blessedcoolant@users.noreply.github.com>
Date:   Mon Sep 12 03:52:43 2022 +1200

    Docs Update (CompVis#466)

    Authored-by: @blessedcoolant
    Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>

commit 9a29d44
Author: Gérald LONLAS <gerald@lonlas.com>
Date:   Sun Sep 11 23:23:18 2022 +0800

    Revert "Add 3x Upscale option on the Web UI (CompVis#442)" (CompVis#488)

    This reverts commit f8a5408.

commit d301836
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 11 10:52:19 2022 -0400

    can select prior output for init_img using -1, -2, etc

commit 70aa674
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 11 10:34:06 2022 -0400

    merge PR CompVis#495 - keep using float16 in ldm.modules.attention

commit 8748370
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 11 10:22:32 2022 -0400

    negative -S indexing recovers correct previous seed; closes issue CompVis#476

commit 839e30e
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 11 10:02:44 2022 -0400

    improve CUDA VRAM monitoring

    extra check that device==cuda before getting VRAM stats

commit bfb2781
Author: tildebyte <337875+tildebyte@users.noreply.github.com>
Date:   Sat Sep 10 10:15:56 2022 -0400

    fix(readme): add note about updating env via conda (CompVis#475)

commit 5c43988
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sat Sep 10 10:02:43 2022 -0400

    reduce VRAM memory usage by half during model loading

    * This moves the call to half() before model.to(device) to avoid GPU
    copy of full model. Improves speed and reduces memory usage dramatically

    * This fix contributed by @mh-dm (Mihai)

commit 9912270
Merge: 817c4a2 ecc6b75
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sat Sep 10 09:54:34 2022 -0400

    Merge branch 'development' of github.com:lstein/stable-diffusion into development

commit 817c4a2
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sat Sep 10 09:53:27 2022 -0400

    remove -F option from normalized prompt; closes CompVis#483

commit ecc6b75
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sat Sep 10 09:53:27 2022 -0400

    remove -F option from normalized prompt

commit 723d074
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Fri Sep 9 18:49:51 2022 -0400

    Allow ctrl c when using --from_file (CompVis#472)

    * added ansi escapes to highlight key parts of CLI session

    * adjust exception handling so that ^C will abort when reading prompts from a file

commit 75f633c
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Fri Sep 9 12:03:45 2022 -0400

    re-add new logo

commit 10db192
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Fri Sep 9 09:26:10 2022 -0400

    changes to dogettx optimizations to run on m1
    * Author @Any-Winter-4079
    * Author @dogettx
    Thanks to many individuals who contributed time and hardware to
    benchmarking and debugging these changes.

commit c85ae00
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Thu Sep 8 23:57:45 2022 -0400

    fix bug which caused seed to get "stuck" on previous image even when UI specified -1

commit 1b5aae3
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Thu Sep 8 22:36:47 2022 -0400

    add icon to dream web server

commit 6abf739
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Thu Sep 8 22:25:09 2022 -0400

    add favicon to web server

commit db825b8
Merge: 33874ba afee7f9
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Thu Sep 8 22:17:37 2022 -0400

    Merge branch 'deNULL-development' into development

commit 33874ba
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Thu Sep 8 22:16:29 2022 -0400

    Squashed commit of the following:

    commit afee7f9
    Merge: 6531446 171f8db
    Author: Lincoln Stein <lincoln.stein@gmail.com>
    Date:   Thu Sep 8 22:14:32 2022 -0400

        Merge branch 'development' of github.com:deNULL/stable-diffusion into deNULL-development

    commit 171f8db
    Author: Denis Olshin <me@denull.ru>
    Date:   Thu Sep 8 03:15:20 2022 +0300

        saving full prompt to metadata when using web ui

    commit d7e67b6
    Author: Denis Olshin <me@denull.ru>
    Date:   Thu Sep 8 01:51:47 2022 +0300

        better logic for clicking to make variations

commit afee7f9
Merge: 6531446 171f8db
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Thu Sep 8 22:14:32 2022 -0400

    Merge branch 'development' of github.com:deNULL/stable-diffusion into deNULL-development

commit 6531446
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Thu Sep 8 20:41:37 2022 -0400

    work around unexplained crash when timesteps=1000 (CompVis#440)

    * work around unexplained crash when timesteps=1000

    * this fix seems to work

commit c33a84c
Author: blessedcoolant <54517381+blessedcoolant@users.noreply.github.com>
Date:   Fri Sep 9 12:39:51 2022 +1200

    Add New Logo (CompVis#454)

    * Add instructions on how to install alongside pyenv (CompVis#393)

    Like probably many others, I have a lot of different virtualenvs, one for each project. Most of them are handled by `pyenv`.
    After installing according to these instructions I had issues with `pyenv` and `miniconda` fighting over the $PATH of my system.
    But then I stumbled upon this nice solution on SO: https://stackoverflow.com/a/73139031 , upon which I have based my suggested changes.

    It runs perfectly on my M1 setup, with the anaconda setup as a virtual environment handled by pyenv.

    Feel free to incorporate these instructions as you see fit.

    Thanks a million for all your hard work.

    * Disabled debug output (CompVis#436)

    Co-authored-by: Henry van Megen <hvanmegen@gmail.com>

    * Add New Logo

    Co-authored-by: Håvard Gulldahl <havard@lurtgjort.no>
    Co-authored-by: Henry van Megen <h.vanmegen@gmail.com>
    Co-authored-by: Henry van Megen <hvanmegen@gmail.com>
    Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>

commit f8a5408
Author: Gérald LONLAS <gerald@lonlas.com>
Date:   Fri Sep 9 01:45:54 2022 +0800

    Add 3x Upscale option on the Web UI (CompVis#442)

commit 244239e
Author: James Reynolds <magnusviri@users.noreply.github.com>
Date:   Thu Sep 8 05:36:33 2022 -0600

    macOS CI workflow, dream.py exits with an error, but the workflow com… (CompVis#396)

    * macOS CI workflow, dream.py exits with an error, but the workflow completes.

    * Files for testing

    Co-authored-by: James Reynolds <magnsuviri@me.com>
    Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>

commit 711d49e
Author: James Reynolds <magnusviri@users.noreply.github.com>
Date:   Thu Sep 8 05:35:08 2022 -0600

    Cache model workflow (CompVis#394)

    * Add workflow that caches the model, step 1 for CI

    * Change name of workflow job

    Co-authored-by: James Reynolds <magnsuviri@me.com>
    Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>

commit 7996a30
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Thu Sep 8 07:34:03 2022 -0400

    add auto-creation of mask for inpainting (CompVis#438)

    * now use a single init image for both image and mask

    * turn on debugging for now to write out mask and image

    * add back -M option as a fallback

commit a69ca31
Author: elliotsayes <elliotsayes@gmail.com>
Date:   Thu Sep 8 15:30:06 2022 +1200

    .gitignore WebUI temp files (CompVis#430)

    * Add instructions on how to install alongside pyenv (CompVis#393)

    Like probably many others, I have a lot of different virtualenvs, one for each project. Most of them are handled by `pyenv`.
    After installing according to these instructions I had issues with `pyenv` and `miniconda` fighting over the $PATH of my system.
    But then I stumbled upon this nice solution on SO: https://stackoverflow.com/a/73139031 , upon which I have based my suggested changes.

    It runs perfectly on my M1 setup, with the anaconda setup as a virtual environment handled by pyenv.

    Feel free to incorporate these instructions as you see fit.

    Thanks a million for all your hard work.

    * .gitignore WebUI temp files

    Co-authored-by: Håvard Gulldahl <havard@lurtgjort.no>

commit 5c6b612
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Wed Sep 7 22:50:55 2022 -0400

    fix bug that caused same seed to be redisplayed repeatedly

commit 56f155c
Author: Johan Roxendal <johan@roxendal.com>
Date:   Thu Sep 8 04:50:06 2022 +0200

    added support for parsing run log and displaying images in the frontend init state (CompVis#410)

    Co-authored-by: Johan Roxendal <johan.roxendal@litteraturbanken.se>
    Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>

commit 4168774
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Wed Sep 7 20:24:35 2022 -0400

    added missing initialization of latent_noise to None

commit 171f8db
Author: Denis Olshin <me@denull.ru>
Date:   Thu Sep 8 03:15:20 2022 +0300

    saving full prompt to metadata when using web ui

commit d7e67b6
Author: Denis Olshin <me@denull.ru>
Date:   Thu Sep 8 01:51:47 2022 +0300

    better logic for clicking to make variations

commit d1d044a
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Wed Sep 7 17:56:59 2022 -0400

    actual image seed now written into web log rather than -1 (CompVis#428)

commit edada04
Author: Arturo Mendivil <60411196+artmen1516@users.noreply.github.com>
Date:   Wed Sep 7 10:42:26 2022 -0700

    Improve notebook and add requirements file (CompVis#422)

commit 29ab3c2
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Wed Sep 7 13:28:11 2022 -0400

    disable neonpixel optimizations on M1 hardware (CompVis#414)

    * disable neonpixel optimizations on M1 hardware

    * fix typo that was causing random noise images on m1

commit 7670ecc
Author: cody <cnmizell@gmail.com>
Date:   Wed Sep 7 12:24:41 2022 -0500

    add more keyboard support on the web server (CompVis#391)

    add ability to submit prompts with the "enter" key
    add ability to cancel generations with the "escape" key

commit dd2aeda
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Wed Sep 7 13:23:53 2022 -0400

    report VRAM usage stats during initial model loading (CompVis#419)

commit f628477
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Tue Sep 6 17:12:39 2022 -0400

    Squashed commit of the following:

    commit 7d1344282d942a33dcecda4d5144fc154ec82915
    Merge: caf4ea3 ebeb556
    Author: Lincoln Stein <lincoln.stein@gmail.com>
    Date:   Mon Sep 5 10:07:27 2022 -0400

        Merge branch 'development' of github.com:WebDev9000/stable-diffusion into WebDev9000-development

    commit ebeb556
    Author: Web Dev 9000 <rirath@gmail.com>
    Date:   Sun Sep 4 18:05:15 2022 -0700

        Fixed unintentionally removed lines

    commit ff2c4b9
    Author: Web Dev 9000 <rirath@gmail.com>
    Date:   Sun Sep 4 17:50:13 2022 -0700

        Add ability to recreate variations via image click

    commit c012929
    Author: Web Dev 9000 <rirath@gmail.com>
    Date:   Sun Sep 4 14:35:33 2022 -0700

        Add files via upload

    commit 02a6018
    Author: Web Dev 9000 <rirath@gmail.com>
    Date:   Sun Sep 4 14:35:07 2022 -0700

        Add files via upload

commit eef7889
Author: Olivier Louvignes <olivier@mg-crea.com>
Date:   Tue Sep 6 12:41:08 2022 +0200

    feat(txt2img): allow from_file to work with len(lines) < batch_size (CompVis#349)

commit 720e5cd
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Mon Sep 5 20:40:10 2022 -0400

    Refactoring simplet2i (CompVis#387)

    * start refactoring -not yet functional

    * first phase of refactor done - not sure weighted prompts working

    * Second phase of refactoring. Everything mostly working.
    * The refactoring has moved all the hard-core inference work into
    ldm.dream.generator.*, where there are submodules for txt2img and
    img2img. inpaint will go in there as well.
    * Some additional refactoring will be done soon, but relatively
    minor work.

    * fix -save_orig flag to actually work

    * add @neonsecret attention.py memory optimization

    * remove unneeded imports

    * move token logging into conditioning.py

    * add placeholder version of inpaint; porting in progress

    * fix crash in img2img

    * inpainting working; not tested on variations

    * fix crashes in img2img

    * ported attention.py memory optimization basujindal#117 from basujindal branch

    * added @torch_no_grad() decorators to img2img, txt2img, inpaint closures

    * Final commit prior to PR against development
    * fixup crash when generating intermediate images in web UI
    * rename ldm.simplet2i to ldm.generate
    * add backward-compatibility simplet2i shell with deprecation warning

    * add back in mps exception, addresses @Vargol comment in CompVis#354

    * replaced Conditioning class with exported functions

    * fix wrong type of with_variations attribute during initialization

    * changed "image_iterator()" to "get_make_image()"

    * raise NotImplementedError for calling get_make_image() in parent class

    * Update ldm/generate.py

    better error message

    Co-authored-by: Kevin Gibbons <bakkot@gmail.com>

    * minor stylistic fixes and assertion checks from code review

    * moved get_noise() method into img2img class

    * break get_noise() into two methods, one for txt2img and the other for img2img

    * inpainting works on non-square images now

    * make get_noise() an abstract method in base class

    * much improved inpainting

    Co-authored-by: Kevin Gibbons <bakkot@gmail.com>

commit 1ad2a8e
Author: thealanle <35761977+thealanle@users.noreply.github.com>
Date:   Mon Sep 5 17:35:04 2022 -0700

    Fix --outdir function for web (CompVis#373)

    * Fix --outdir function for web

    * Removed unnecessary hardcoded path

commit 52d8bb2
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Mon Sep 5 10:31:59 2022 -0400

    Squashed commit of the following:

    commit 0cd48e932f1326e000c46f4140f98697eb9bdc79
    Author: Lincoln Stein <lincoln.stein@gmail.com>
    Date:   Mon Sep 5 10:27:43 2022 -0400

        resolve conflicts with development

    commit d7bc8c12e05535a363ac7c745a3f3abc2773bfcf
    Author: Scott McMillin <scott@scottmcmillin.com>
    Date:   Sun Sep 4 18:52:09 2022 -0500

        Add title attribute back to img tag

    commit 5397c89184ebfb8260bc2d8c3f23e73e103d24e6
    Author: Scott McMillin <scott@scottmcmillin.com>
    Date:   Sun Sep 4 13:49:46 2022 -0500

        Remove temp code

    commit 1da080b50972696db2930681a09cb1c14e524758
    Author: Scott McMillin <scott@scottmcmillin.com>
    Date:   Sun Sep 4 13:33:56 2022 -0500

        Cleaned up HTML; small style changes; image click opens image; add seed to figcaption beneath image

commit caf4ea3
Author: Adam Rice <adam@askadam.io>
Date:   Mon Sep 5 10:05:39 2022 -0400

    Add a 'Remove Image' button to clear the file upload field (CompVis#382)

    * added "remove image" button

    * styled a new "remove image" button

    * Update index.js

commit 95c088b
Author: Kevin Gibbons <bakkot@gmail.com>
Date:   Sun Sep 4 19:04:14 2022 -0700

    Revert "Add CORS headers to dream server to ease integration with third-party web interfaces" (CompVis#371)

    This reverts commit 91e826e.

commit a20113d
Author: Kevin Gibbons <bakkot@gmail.com>
Date:   Sun Sep 4 18:59:12 2022 -0700

    put no_grad decorator on make_image closures (CompVis#375)

commit 0f93dad
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 4 21:39:15 2022 -0400

    fix several dangling references to --gfpgan option, which no longer exists

commit f4004f6
Author: tildebyte <337875+tildebyte@users.noreply.github.com>
Date:   Sun Sep 4 19:43:04 2022 -0400

    TOIL(requirements): Split requirements to per-platform (CompVis#355)

    * toil(reqs): split requirements to per-platform

    Signed-off-by: Ben Alkov <ben.alkov@gmail.com>

    * toil(reqs): fix for Win and Lin...

    ...allow pip to resolve latest torch, numpy

    Signed-off-by: Ben Alkov <ben.alkov@gmail.com>

    * toil(install): update reqs in Win install notebook

    Signed-off-by: Ben Alkov <ben.alkov@gmail.com>

    Signed-off-by: Ben Alkov <ben.alkov@gmail.com>

commit 4406fd1
Merge: 5116c81 fd7a72e
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 4 08:23:53 2022 -0400

    Merge branch 'SebastianAigner-main' into development
    Add support for full CORS headers for dream server.

commit fd7a72e
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 4 08:23:11 2022 -0400

    remove debugging message

commit 3a2be62
Merge: 91e826e 5116c81
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sun Sep 4 08:15:51 2022 -0400

    Merge branch 'development' into main

commit 5116c81
Author: Justin Wong <1584142+wongjustin99@users.noreply.github.com>
Date:   Sun Sep 4 07:17:58 2022 -0400

    fix save_original flag saving to the same filename (CompVis#360)

    * Update README.md with new Anaconda install steps (CompVis#347)

    pip3 version did not work for me and this is the recommended way to install Anaconda now it seems

    * fix save_original flag saving to the same filename

    Before this, the `--save_orig` flag was not working. The upscaled/GFPGAN would overwrite the original output image.

    Co-authored-by: greentext2 <112735219+greentext2@users.noreply.github.com>

commit 91e826e
Author: Sebastian Aigner <SebastianAigner@users.noreply.github.com>
Date:   Sun Sep 4 10:22:54 2022 +0200

    Add CORS headers to dream server to ease integration with third-party web interfaces

commit 6266d9e
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sat Sep 3 15:45:20 2022 -0400

    remove stray debugging message

commit 138956e
Author: greentext2 <112735219+greentext2@users.noreply.github.com>
Date:   Sat Sep 3 13:38:57 2022 -0500

    Update README.md with new Anaconda install steps (CompVis#347)

    pip3 version did not work for me and this is the recommended way to install Anaconda now it seems

commit 60be735
Author: Cora Johnson-Roberson <cora.johnson.roberson@gmail.com>
Date:   Sat Sep 3 14:28:34 2022 -0400

    Switch to regular pytorch channel and restore Python 3.10 for Macs. (CompVis#301)

    * Switch to regular pytorch channel and restore Python 3.10 for Macs.

    Although pytorch-nightly should in theory be faster, it is currently
    causing increased memory usage and slower iterations:

    invoke-ai/InvokeAI#283 (comment)

    This changes the environment-mac.yaml file back to the regular pytorch
    channel and moves the `transformers` dep into pip for now (since it
    cannot be satisfied until tokenizers>=0.11 is built for Python 3.10).

    * Specify versions for Pip packages as well.

commit d0d95d3
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sat Sep 3 14:10:31 2022 -0400

    make initimg appear in web log

commit b90a215
Merge: 1eee811 6270e31
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sat Sep 3 13:47:15 2022 -0400

    Merge branch 'prixt-seamless' into development

commit 6270e31
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sat Sep 3 13:46:29 2022 -0400

    add credit to prixt for seamless circular tiling

commit a01b7bd
Merge: 1eee811 9d88abe
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sat Sep 3 13:43:04 2022 -0400

    add web interface for seamless option

commit 1eee811
Merge: 64eca42 fb857f0
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sat Sep 3 12:33:39 2022 -0400

    Merge branch 'development' of github.com:lstein/stable-diffusion into development

commit 64eca42
Merge: 9130ad7 21a1f68
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sat Sep 3 12:33:05 2022 -0400

    Merge branch 'main' into development
    * brings in small documentation fixes that were
    added directly to main during release tweaking.

commit fb857f0
Author: Lincoln Stein <lincoln.stein@gmail.com>
Date:   Sat Sep 3 12:07:07 2022 -0400

    fix typo in docs

commit 9d88abe
Author: prixt <paraxite@naver.com>
Date:   Sat Sep 3 22:42:16 2022 +0900

    fixed typo

commit a61e49b
Author: prixt <paraxite@naver.com>
Date:   Sat Sep 3 22:39:35 2022 +0900

    * Removed unnecessary code
    * Added description about --seamless

commit 02bee4f
Author: prixt <paraxite@naver.com>
Date:   Sat Sep 3 16:08:03 2022 +0900

    added --seamless tag logging to normalize_prompt

commit d922b53
Author: prixt <paraxite@naver.com>
Date:   Sat Sep 3 15:13:31 2022 +0900

    added seamless tiling mode and commands
techeng322 pushed a commit to techeng322/stable-diffusion-automatic that referenced this pull request Nov 12, 2023

tzayuan commented Apr 1, 2024

Hi @MrLavender,

I would like to ask: SD loads a pretrained model, so why does the pretrained model still work correctly after the implementation of the attention mechanism has been changed? Are there any techniques or points to pay attention to in this process? Thanks.
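
One way to look at the question above: the optimizations discussed in this thread reorder when the attention scores are computed, not what is computed, so the same pretrained weights apply. The sketch below is a plain-Python illustration, not the code from this PR; `softmax`, `attention`, and `attention_chunked` are hypothetical names, and it assumes the only change is that query rows are processed a few at a time. Because softmax is applied independently to each row, processing rows in slices should give the same result as processing them all at once.

```python
import math


def softmax(row):
    # Numerically stable softmax over a single row of scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    # Reference version: score every query row against all keys,
    # softmax each row, then take the weighted sum of the values.
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for qi in q:
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        weights = softmax(scores)
        out.append(
            [sum(w * vj[d] for w, vj in zip(weights, v)) for d in range(len(v[0]))]
        )
    return out


def attention_chunked(q, k, v, chunk=2):
    # Same math, but only `chunk` query rows are handled per step, so a
    # tensor implementation would hold at most chunk x len(k) scores at once.
    out = []
    for start in range(0, len(q), chunk):
        out.extend(attention(q[start:start + chunk], k, v))
    return out


# Tiny made-up inputs: 5 queries, 3 keys/values, feature size 2.
q = [[0.1, 0.2], [0.3, -0.4], [0.5, 0.6], [-0.7, 0.8], [0.9, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

full = attention(q, k, v)
sliced = attention_chunked(q, k, v, chunk=2)

# Largest element-wise difference between the two results.
max_diff = max(abs(a - b) for fr, sr in zip(full, sliced) for a, b in zip(fr, sr))
print(max_diff)
```

The learned parameters (the `to_q`, `to_k`, `to_v` projections shown earlier in the thread) are untouched by this kind of slicing; only the peak size of the intermediate score matrix changes.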
