[WIP] PixArt-Sigma training pipeline #1341
Conversation
Awesome! Will this have a single safetensors file? PixArt could be the future if SD3 never gets released.
Yes, of course: the diffusion transformer is saved in safetensors format if the safetensors option is specified. As for embedding T5 and the SD VAE alongside it, that may become an option, but I'm not sure it's practical, as the original PixArt format loads them from separate HF files, and existing workflows such as ComfyUI load them from different subfolders (T5 and the SDXL VAE) as well.
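For reference, a minimal sketch of the separated-component loading described above, using the diffusers layout (the repo id and subfolder names assume the official PixArt-alpha/PixArt-Sigma-XL-2-1024-MS checkpoint):

```python
# Sketch: load PixArt-Sigma components from their separate subfolders,
# as the original PixArt format and existing workflows do.
import torch
from diffusers import AutoencoderKL, Transformer2DModel
from transformers import T5EncoderModel, T5Tokenizer

repo = "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS"

# The diffusion transformer itself -- the part the trainer saves
# as a standalone .safetensors file.
transformer = Transformer2DModel.from_pretrained(
    repo, subfolder="transformer", torch_dtype=torch.float16
)

# The T5 text encoder and the (SDXL) VAE live in their own subfolders
# and are shared across finetunes, so they are loaded separately.
tokenizer = T5Tokenizer.from_pretrained(repo, subfolder="tokenizer")
text_encoder = T5EncoderModel.from_pretrained(
    repo, subfolder="text_encoder", torch_dtype=torch.float16
)
vae = AutoencoderKL.from_pretrained(repo, subfolder="vae", torch_dtype=torch.float16)
```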
Yes, SD3's future is clouded. Additionally, its parameter count is much bigger than PixArt-Sigma's, with comparable quality and prompt adherence judging from the API examples, meaning PixArt is more accessible and faster, enabling easier community fine-tuning. And regardless, we have PixArt now, and we can experiment with different techniques on Diffusion Transformers that will come in handy when another transformer-based diffusion model is released.
@kabachuha can we make it include the text encoder and VAE as well, like single SDXL checkpoints, and load from that? I plan to make a standalone Gradio app, and maybe auto1111 adds support later.
Very well, I see your point :)
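For illustration, a hedged sketch of what such a single-file bundle could look like: flattening the three components into one safetensors file with key prefixes. The prefixes and helper names here are hypothetical, not an established format:

```python
# Hypothetical single-file bundling, in the spirit of single SDXL
# checkpoints: flatten all three components into one state dict.
from safetensors.torch import load_file, save_file


def save_bundled(transformer, text_encoder, vae, path):
    state = {}
    for prefix, module in (
        ("transformer.", transformer),
        ("text_encoder.", text_encoder),
        ("vae.", vae),
    ):
        for k, v in module.state_dict().items():
            # safetensors requires contiguous tensors
            state[prefix + k] = v.contiguous()
    save_file(state, path)


def load_bundled(path):
    # Split the flat dict back into per-component state dicts.
    state = load_file(path)
    parts = {"transformer": {}, "text_encoder": {}, "vae": {}}
    for k, v in state.items():
        prefix, _, rest = k.partition(".")
        parts[prefix][rest] = v
    return parts
```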
Hi @kabachuha, just an FYI: the Sigma and Alpha models have scripts for HF LoRA training. There are also PRs for text-encoder training in parallel.
Yes, I know. But this is a kohya repo, and its LoRA modules and training scripts have different styles, so they need to be adapted :) If you'd like to help, that would be nice (I can add you to the fork so you can take over the PR, as I have a heavy load this week. DM or comment here).
@kabachuha yes, please add me. I will start next week.
Added :) And see the list.
@kabachuha so the still-open TODOs are the following; it looks like mostly testing?
Do you have any more details on the "Combine T5 and SDXL vae" and "other leftover TODOs" items?
Any news? SD3 is a disappointment, so maybe there should be more focus on training PixArt-Sigma...
"Combine T5 and SDXL vae in checkpoint when saving" Is not a good idea, if the T5/vae (mostly the T5 checkpoint) gets multiplied by this, people will just go use other training scripts that make alot smaller checkpoints, and rightfully so. Storage space reqs would explode with increasing amounts of custom networks trained or downloaded. You should have one T5 and a VAE separate for not only this reason, also to be able to modularly approach and substitute other future Text or Autoencoders and combine them freely with any finetune/network you have. |
Current state
- ShareCaption/CogVLM2/LLaVA*/etc.
- InternLM-XComposer2-4KHD-based multimodal prompt enhancer (see the sketch below)

I'm very excited to work with PixArt and its great size/prompt-adherence ratio, in addition to the awesome LoRA techniques in this repo, so if it goes well it should start working in a couple of days.
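As a rough illustration of the prompt-enhancer item, a sketch that asks InternLM-XComposer2-4KHD to expand a terse user prompt into the detailed, caption-style prompts PixArt responds well to. The `.chat()` call follows the model card's remote-code interface; treat the exact signature (and text-only usage) as an assumption:

```python
# Hedged sketch of a prompt-enhancer step; not the final pipeline code.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "internlm/internlm-xcomposer2-4khd-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, trust_remote_code=True
).cuda().eval()

user_prompt = "a cat on a windowsill"
query = (
    "Rewrite this image prompt as one richly detailed paragraph, "
    f"keeping the subject unchanged: {user_prompt}"
)
with torch.no_grad():
    # chat() comes from the model's remote code; text-only use
    # (no image argument) is assumed here.
    enhanced, _ = model.chat(tokenizer, query=query, history=[], do_sample=False)
print(enhanced)
```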
Addresses #979
PixArt repo: https://github.com/PixArt-alpha/PixArt-sigma
I'm very likely going to edit this post quite often with updates, comments, and pictures.