Flux shift method update #1564

Open. Wants to merge 11 commits into base branch sd3.


@sdbds (Contributor) commented Sep 5, 2024

Mainly the math part is updated.

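from typing import Callable

def get_lin_function(x1: float = 256, y1: float = 0.5, x2: float = 4096, y2: float = 1.15) -> Callable[[float], float]:
    # Straight line through the points (x1, y1) and (x2, y2).
    m = (y2 - y1) / (x2 - x1)  # slope
    b = y1 - m * x1            # intercept
    return lambda x: m * x + b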

The original y1 and y2 are based on a minimum resolution of x1=256 and a maximum resolution of x2=4096 for the multi-resolution calculation.

However, we usually train at 256 to 2048, so we directly use the original multiplier for linear scaling.

For example, if x2=2048, the original function evaluates to about 0.8.

Since x2 is scaled down by a factor of two (4096 to 2048), the final result is scaled up by the same factor, to 1.6.
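
The arithmetic above can be checked with a minimal sketch. It reuses get_lin_function with the defaults shown in this PR; the names lin, mu_2048 and scaled are illustrative only, and the factor of two reflects one reading of the comment (x2 halved from 4096 to 2048), not a confirmed implementation.

```python
from typing import Callable


def get_lin_function(
    x1: float = 256, y1: float = 0.5, x2: float = 4096, y2: float = 1.15
) -> Callable[[float], float]:
    # Straight line through the points (x1, y1) and (x2, y2).
    m = (y2 - y1) / (x2 - x1)
    b = y1 - m * x1
    return lambda x: m * x + b


# Original endpoints: (256, 0.5) and (4096, 1.15).
lin = get_lin_function()

# Value of the original line at x = 2048.
mu_2048 = lin(2048)

# Assumption: the result is doubled because 4096 / 2048 = 2.
scaled = 2 * mu_2048

print(f"{mu_2048:.3f}")  # 0.803
print(f"{scaled:.3f}")   # 1.607
```

The line passes through 0.5 at x=256 and 1.15 at x=4096, so the value at 2048 is roughly 0.8 and doubling it gives roughly 1.6, which matches the numbers quoted above.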

@kohya-ss (Owner) commented Sep 5, 2024

I'm not sure, but I think that x1 and x2 are the hyperparameters from when the model was trained. Therefore, no matter what resolution we train at, I think that x1 and x2 should keep their original values.

If we train it on a large dataset with x1=256 and x2=2048, the model will be able to generate images with a resolution of 256-2048 very well. But I think that would require some really extensive fine tuning.

@kohya-ss added the "help wanted" (Extra attention is needed) label on Sep 5, 2024
@sdbds (Contributor, Author) commented Sep 5, 2024

> I'm not sure, but I think that x1 and x2 are the hyperparameters from when the model was trained. Therefore, no matter what resolution we train at, I think that x1 and x2 should keep their original values.
>
> If we train it on a large dataset with x1=256 and x2=2048, the model will be able to generate images with a resolution of 256-2048 very well. But I think that would require some really extensive fine tuning.

Further experiments are being conducted to see if the parameters can be modified.
Theoretically these should be fixed parameters after the model has been trained, but some comments on Twitter say that using 1.5 to 1.6 works better, suspecting a linear relationship.
The results so far have been both fuzzy and good.

@kohya-ss (Owner) commented Sep 6, 2024

> Further experiments are being conducted to see if the parameters can be modified.
> Theoretically these should be fixed parameters after the model has been trained, but some comments on Twitter say that using 1.5 to 1.6 works better, suspecting a linear relationship.
> The results so far have been both fuzzy and good.

Hmm, I see, that sounds like a good idea. So it might be better to add a new option, even though it would be more complicated... For example, I think, something like flux_shift_dataset_reso, although it might be long.
