Mannequins/Blogpost Trained with 16K VQGAN #319

Closed
afiaka87 opened this issue Jun 25, 2021 · 6 comments

Comments

@afiaka87
Contributor

afiaka87 commented Jun 25, 2021

https://www.dropbox.com/s/x9gbxiartrqaewf/afiaka_544I_96TSL_12D_16H_64DH_sparse.zip

@robvanvolt @rom1504 Sorry, I don't have time to make a PR to dalle-models yet, but I will ASAP.

If you're here from the Discord, sorry I didn't leave this better packaged. The zip is 2 GiB, but the main dalle.pt is just 445 MB. It's not a huge model, but you don't really need one for this dataset. If you want to fine-tune from it, I've included the optimizer states from the DeepSpeed training session as well. I used DeepSpeed's WarmupDecayScheduler.
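For anyone resuming from the included optimizer states, the shape of the learning-rate schedule matters. DeepSpeed's documented warmup-then-decay scheduler is named `WarmupDecayLR`, which is presumably what is meant here. The sketch below is a plain-Python illustration of a linear-warmup, linear-decay curve, not DeepSpeed's own code; the function name and every number in the usage note are placeholders, since the actual training settings are not given in this thread.

```python
def warmup_decay_lr(step, total_steps, warmup_steps, max_lr, min_lr=0.0):
    """Illustrative schedule: linear warmup to max_lr, then linear decay to min_lr.

    All arguments are hypothetical; the real run's values are not stated here.
    """
    if step < warmup_steps:
        # Warmup phase: fraction of the way from min_lr to max_lr.
        frac = step / max(1, warmup_steps)
    else:
        # Decay phase: fraction of the post-warmup steps still remaining.
        remaining = max(0, total_steps - step)
        frac = remaining / max(1, total_steps - warmup_steps)
    return min_lr + (max_lr - min_lr) * frac
```

Calling `warmup_decay_lr(step, 100, 10, 1.0)` for increasing `step` rises linearly over the first 10 steps and then falls linearly until step 100, after which it stays at `min_lr`.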

Anyway, I finally hit the loss floor with it, which I believe is around 2.2. If anyone thinks they can get lower than that, good luck.

Sorry there are no generations posted. I'm sure they look great, but I'm only posting this because there are newer, better VQGANs out now, and I must of course switch to those.

@johnpaulbin

Awesome stuff! I'll try testing and playing with this model on my end, and possibly add it to the Colab if you're comfortable with that.

@johnpaulbin

@afiaka87 Works great!
a female mannequin dressed in a bright blue jacket and bright red skirt
(image: samples)

@afiaka87
Contributor Author

afiaka87 commented Jun 26, 2021

> @afiaka87 Works great!
> a female mannequin dressed in a bright blue jacket and bright red skirt

Feel free to link it! It can also (try to) do lots of other things, such as:

an emoji of a baby penguin wearing a blue hat, blue gloves, red shirt, and green pants
a photo of the city streets of france
a professional high quality emoji of a pikachu tiger chimera. a pikachu imitating a tiger. a pikachu made of tiger. a professional emoji.
a black and white photograph of an eagle sitting in a forest during spring
a pencil sketch of a cougar sitting in a field at twilight
a female mannequin dressed in a black pullover sweater and blue wrap skirt
a living room with two red armchairs and a painting of a galleon. the painting is mounted behind an indoor palm.
a photo of the a museum in new zealand
a black and white photograph of a capybara sitting on a mountain at dusk
a female mannequin dressed in an olive button-down shirt and gold palazzo pants
a female mannequin dressed in a beige leather jacket and orange palazzo pants
an illustration of a baby hedgehog in a cape staring at its reflection in a mirror
2 panel image of the exact same teapot. on the top, a photo of the teapot. on the bottom, the teapot with a brain. [source teapot photo 5]
a professional high quality illustration of a turtle octopus chimera. a turtle imitating an octopus. a turtle made of octopus.
a female mannequin dressed in a gray leather jacket and yellow sweatpants
a painting of an eagle sitting in a field in the morning in surrealist style
a professional high quality emoji of a chicken phoenix chimera. a chicken imitating a phoenix. a chicken made of phoenix. a professional emoji.
a purse in the form of a butterfly wing. a purse imitating a butterfly wing.
an illustration of a shrimp with headphones igniting a firework
a professional high quality illustration of a cat rabbit chimera. a cat imitating a rabbit. a cat made of rabbit.
a professional high quality illustration of a toad octopus chimera. a toad imitating an octopus. a toad made of octopus.
a male mannequin dressed in an olive and black checkered button-down shirt and black pleated trousers
an illustration of a baby tapir in a cape writing a letter
a small blue book standing to the right of a large red book
a photo of westwood park, san francisco, from the water in the afternoon
an emoji of a baby hedgehog wearing a yellow hat, red gloves, blue shirt, and green pants

Here is a newline-separated list of 12,800 captions from the dataset. You could just pipe them into the model automatically in Colab. It probably won't do very well on much else.

sample_12800.txt.gz

Download and grab 32 mannequin captions.

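wget 'https://github.com/lucidrains/DALLE-pytorch/files/6719423/sample_12800.txt.gz'
# Unpack the archive; if tar rejects it as a plain gzip file, run `gunzip sample_12800.txt.gz` instead.
tar xf sample_12800.txt.gz
# Shuffle, keep the captions that mention mannequins, and save the first 32.
shuf sample_12800.txt | grep 'mannequin' | head -n 32 > 32_mannequin_captions.txt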
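The same selection can be done in Python, for example inside a Colab cell. This is a minimal sketch that assumes the download is a plain gzip of text with one caption per line (if it is really a tar archive, `tarfile` would be needed instead). The names `pick_captions` and `join_prompts` are illustrative, and joining prompts with `|` for `generate.py --text` is an assumption to check against the repository README.

```python
import gzip
import random


def pick_captions(path, keyword="mannequin", n=32, seed=None):
    """Return up to n randomly ordered captions containing keyword.

    Assumes path is a gzip-compressed text file, one caption per line.
    """
    with gzip.open(path, "rt", encoding="utf-8") as f:
        # Keep only non-empty lines that mention the keyword.
        matches = [line.strip() for line in f if keyword in line]
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    rng.shuffle(matches)
    return matches[:n]


def join_prompts(captions):
    """Join captions into one string, assuming '|' separates prompts."""
    return "|".join(captions)
```

For example, `join_prompts(pick_captions("sample_12800.txt.gz"))` would give a single string of up to 32 mannequin prompts that a notebook could pass to the generation script.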

@afiaka87
Contributor Author

afiaka87 commented Jun 26, 2021

Keep in mind these image masks are all in there.

You can just go to the blog post, right-click the mask they use, save it, and use it with generate.py.

(10 screenshots, "Screen Shot 2021-06-25 at 8 53 08 PM" through "Screen Shot 2021-06-25 at 8 54 56 PM")

@johnpaulbin

Made a Colab specifically for running inference with this model: https://colab.research.google.com/drive/11V2xw1eLPfZvzW8UQyTUhqCEU71w6Pr4?usp=sharing

@afiaka87
Contributor Author

moving to discussions #322
