start removing groupnorms because of https://arxiv.org/abs/2312.02696
1 parent dfc5d53 · commit c166739
Showing 3 changed files with 51 additions and 41 deletions.
@@ -1 +1 @@
-__version__ = '1.26.3'
+__version__ = '2.0.0'
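Only the version bump is shown in the diff above. As a rough, hypothetical sketch of what dropping a group norm from a typical UNet conv block could look like (class names below are illustrative, not taken from this commit's changed files):

```python
# Hypothetical before/after sketch of removing a group norm from a conv block.
# Class names are illustrative and not taken from this commit's diff.
import torch
from torch import nn

class BlockWithGroupNorm(nn.Module):
    def __init__(self, dim, dim_out, groups = 8):
        super().__init__()
        self.proj = nn.Conv2d(dim, dim_out, 3, padding = 1)
        self.norm = nn.GroupNorm(groups, dim_out)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.norm(self.proj(x)))

class BlockWithoutGroupNorm(nn.Module):
    # the same block with the normalization simply removed
    def __init__(self, dim, dim_out):
        super().__init__()
        self.proj = nn.Conv2d(dim, dim_out, 3, padding = 1)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.proj(x))

# quick shape check
x = torch.randn(1, 32, 16, 16)
assert BlockWithoutGroupNorm(32, 64)(x).shape == (1, 64, 16, 16)
```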
Does this work better even without the magnitude-preserving layers? I got the impression they are supposed to work together.
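For context, the magnitude-preserving idea in arXiv 2312.02696 amounts to keeping activation magnitudes near unit scale rather than normalizing batch or group statistics. A minimal PixelNorm-style sketch in that spirit, assuming channel-first tensors (an illustrative assumption, not code from this repository):

```python
# Minimal PixelNorm-style sketch of a magnitude-preserving normalization,
# in the spirit of arXiv:2312.02696. Illustrative only, not from this repo.
import torch
from torch import nn
import torch.nn.functional as F

class PixelNorm(nn.Module):
    def __init__(self, dim = 1, eps = 1e-4):
        super().__init__()
        self.dim = dim   # channel dimension for (batch, channels, height, width)
        self.eps = eps

    def forward(self, x):
        # normalize each pixel's feature vector to unit length, then rescale by
        # sqrt(channels) so the expected per-channel magnitude stays near one
        return F.normalize(x, dim = self.dim, eps = self.eps) * (x.shape[self.dim] ** 0.5)
```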
@danbochman it isn't just Karras' paper.
A Brain researcher told me years ago to be cautious about using group norms, which I ignored at the time. A recent issue in another repo tipped me over the edge.
@lucidrains thanks for the reply and reference, I will give it a go and let you know how it works.
I also got rid of group norms in the past and replaced them with the adaptive group norm from k-diffusion, and it really helped.
But this is a much more parameter-friendly alternative.
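A hedged sketch of what such a k-diffusion-style adaptive group norm looks like: a GroupNorm without learned affine parameters, with scale and shift predicted from a conditioning embedding. Names and shapes here are assumptions for illustration, not the exact k-diffusion code.

```python
# Sketch of an adaptive group norm in the style of k-diffusion's AdaGN: the
# GroupNorm has no learned affine parameters, and scale/shift come from a
# conditioning embedding (e.g. the timestep embedding). Illustrative only.
import torch
from torch import nn

class AdaptiveGroupNorm(nn.Module):
    def __init__(self, dim, cond_dim, groups = 8):
        super().__init__()
        self.norm = nn.GroupNorm(groups, dim, affine = False)
        self.to_scale_shift = nn.Linear(cond_dim, dim * 2)

    def forward(self, x, cond):
        # x: (batch, dim, height, width), cond: (batch, cond_dim)
        scale, shift = self.to_scale_shift(cond).chunk(2, dim = -1)
        x = self.norm(x)
        return x * (1 + scale[:, :, None, None]) + shift[:, :, None, None]
```

The extra Linear per block is what makes this less parameter-friendly than simply dropping the norm.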
@danbochman cool, let me know what you see on your end!
@lucidrains It is too early to say anything about final quality, but in terms of early convergence it converged earlier (about 4× faster) and looks quite good on simple examples.
Virtual Try-On task (example image: Ground Truth | Input Person | Input Garment | Model Output)
nice! leave it to Karras to point out that something everyone commonly uses is defective...
Following up on this to report that with long training times, the colors start to get consistently shifted.
It might be related to some other training dynamic, but unfortunately for now I am reverting back to (adaptive) group norms.
thanks for the data point