model_refactor (#571) #572
Conversation
I like the re-org; I have a few of the same losses in an offline repo. Along with the testing, I can add some of the loss code.
I am getting the following error when trying the dfaker model:
Here is the complete log:
There is another bug in the loss function for dfaker, so this may be related. On the face of it, I don't see an issue with your setup. Next on my list is to implement the mask for dfaker and fix the loss function, so hold tight until the next update (probably tomorrow) and try again.
* Remove old models. Add mask to dfaker
Looking at the I/O pre-processing into the GAN model...
* DFL H128 Mask. Mask type selectable in config.
@Enyakk I just randomly hit the same bug. Should be fixed now.
Creating input size config for models. Will be used downstream in converters. Also renamed image_shape to input_shape for clarity (for future models with potentially different output_shapes).
…into train_refactor
…into train_refactor
Includes auto-calculation of proper padding shapes, input_shapes and output_shapes. Flag included in config now.
…into train_refactor
Hi, I took this branch for a spin. I have been using (a fork of) dfaker's repo for a while and I wanted to check this project out. My fork didn't touch the model architecture at all, so I figured I could use my weights files on the dfaker model of your branch. Let me first say that I love the refactor done in this branch! When I last checked out master a few months ago I quickly gave up. Thanks for all that work. That said, I did run into some issues:
Regarding the model changes (from point 1), here are the changes I reverted to make the model compatible again:
I've only started looking into this codebase today, so I apologize if I missed anything. I don't want to step on anyone's toes here; I just wanted to share some thoughts while I have them. Please let me know your thoughts, thanks!
Thanks for the feedback! To answer some of your points: in an ideal world dfaker would be model-compatible with the original @dfaker model. My main goal was to keep it as 'vanilla' as possible, whilst extending functionality where possible. The main purpose of this refactor is to standardize as much as possible, and to make any resource used in one model available for all existing models and any new models. Unfortunately some of those ideals don't necessarily play nicely with each other, so I will always choose to move towards standardization over maintaining custom compatibility. That said, @kvrooman would be better placed to comment on the reasoning behind the changes to the nn_blocks.

When you say legacy dfaker, do you mean earlier versions in this branch? If so, unfortunately we won't maintain backwards compatibility. Anything and everything in this branch is subject to change until it gets merged to staging (hopefully very soon).

Whilst "while True" loops aren't particularly great practice, they also don't generally eat too many CPU cycles, so it shouldn't be too much of an issue here. There is definitely an issue with the feeders, and the plan is to move A and B into their own processes, as everything is competing for single-threaded CPU time at the moment. I have also noticed that dfaker feeds particularly slowly, and I will investigate why. I've decided to put it on the backburner for now as, whilst it isn't great, it isn't model-breaking, and moving to multiprocessing is likely to involve a fairly hefty rewrite to keep everything thread-safe. It is high on the list once we've got this migrated into master though.

I will look into the possibility of adding a converter for dfaker's alignments files.
Thanks for the answers, that clears up some things I was wondering about. Where I said "legacy dfaker" I was referring to the models defined in the original df repo. I have quite a few weights files that match this model, which I would love to re-use with this project. They have many hours of training in them, and in my experience it works quite well to re-use existing decoder weights as a crude form of transfer learning. Regarding the required resources for the dfaker feeder: if it is similar to the original dfaker code, with its warping and matching of similar landmarks, I can imagine it would be slower than the others. I understand this gets a lower priority compared to getting this merged into master. In the meantime I will dig into this part of the codebase and see if there is some low-hanging fruit, or if I can start making the feeding multiprocess; at the very least I'll get an idea of what needs to be done there. Thanks!
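To make the multiprocess feeder idea more concrete, here is a minimal sketch using Python's multiprocessing module. The queue size, the `load_batch` stub and the side names are placeholders of my own, not this branch's actual trainer API.

```python
from multiprocessing import Process, Queue


def load_batch(side):
    """Placeholder for the real warping/augmentation batch builder."""
    return [side]  # stand-in batch


def feeder(side, batch_queue):
    """Continuously fill the queue with pre-built batches for one side."""
    while True:
        batch_queue.put(load_batch(side))


queues = {side: Queue(maxsize=8) for side in ("a", "b")}
workers = [Process(target=feeder, args=(side, queue), daemon=True)
           for side, queue in queues.items()]
for worker in workers:
    worker.start()

# The trainer then pulls ready batches without blocking on augmentation:
# batch_a, batch_b = queues["a"].get(), queues["b"].get()
```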
Some notes on the items you highlighted: there was a focus on improving performance and better stability for models, especially when some training instability errors crept up. I generally applied some typical practices from resnet models to our code. I realize this may create some backwards-compatibility issues with legacy models. You could use a weight loader to load your old weights into the updated model arch, as the layers are still all the same. We don't do this in our code, but I've done it myself for other models.

- There now is a res_block_follows param to upscale; when it is true the LeakyReLU gets moved into the res_block. However upscale does add a PixelShuffler. This results in a reversed order of these layers compared to the original model, i.e. original upscale: conv2d -> leaky_re_lu -> pixel_shuffler; model in this branch: conv2d -> pixel_shuffler -> leaky_re_lu (c9d6698)
- With the change of LeakyReLU from upscale to res_block, the alpha changed from 0.1 to 0.2. (c9d6698)
- Removal of bias=False in res_block's conv2d layers
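For anyone following along, here is a rough Keras sketch of the layer-ordering difference described above. The PixelShuffler import path, the kernel sizes and the function names are assumptions on my part and may not match this branch.

```python
from keras.layers import Conv2D, LeakyReLU

from lib.model.nn_blocks import PixelShuffler  # assumed location of the custom layer


def upscale_legacy(inp, filters):
    """Original dfaker ordering: conv2d -> leaky_re_lu -> pixel_shuffler."""
    var_x = Conv2D(filters * 4, kernel_size=3, padding="same")(inp)
    var_x = LeakyReLU(alpha=0.1)(var_x)
    return PixelShuffler()(var_x)


def upscale_refactored(inp, filters):
    """Net ordering in this branch: conv2d -> pixel_shuffler -> leaky_re_lu.

    Note: the LeakyReLU (now alpha=0.2) actually lives in the following
    res_block when res_block_follows is True; it is inlined here only to
    show the resulting layer order.
    """
    var_x = Conv2D(filters * 4, kernel_size=3, padding="same")(inp)
    var_x = PixelShuffler()(var_x)
    return LeakyReLU(alpha=0.2)(var_x)
```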
Thanks for the explanations on these changes, it makes a lot of sense now. I am interested in the loading of weights into a different model arch. There already is a
but that will not work because the layer names are not only different, layers with the same name are also used in different places, i.e. a
Would I need to make a mapping between old names and new names of corresponding layers and the
Thanks!
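One possible answer to the mapping question, sketched below: copy weights layer by layer using an explicit old-name to new-name mapping instead of relying on Keras' name matching. The `transfer_weights` helper, the builders, the file name and the layer names are all hypothetical, not something that exists in this repo.

```python
def transfer_weights(old_model, new_model, name_map):
    """Copy weights layer-by-layer using an {old_name: new_name} mapping."""
    for old_name, new_name in name_map.items():
        weights = old_model.get_layer(old_name).get_weights()
        new_model.get_layer(new_name).set_weights(weights)


# Usage might look something like this (builders, file name and layer names
# are all placeholders):
# old_model = build_legacy_dfaker_decoder()
# old_model.load_weights("decoder_A.h5")
# new_model = build_refactored_dfaker_decoder()
# transfer_weights(old_model, new_model, {"conv2d_5": "conv2d_12"})
```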
This is alpha! It is getting close to merging to staging, but bear in mind that:
Some items won't work.
Some items will work.
Some will stay the same.
Some will be changed.
Do not rely on anything in this branch until it has been merged to master. That said, you are welcome to test and report bugs. If reporting bugs please provide a crash report.
This PR significantly refactors the training part of Faceswap.
New models
Support for the following models added:
dfaker (@dfaker)
dfl h128
villain. A very resource-intensive model by @VillainGuy
GAN has been removed, with a view to adding GAN v2.2 down the line.
lowmem removed, but you can access the same functionality by enabling the 'lowmem' option in config.ini for the original model.
Config ini files
Config files for each section will be generated on the first run of that section, or when running the GUI, and will be placed in /<faceswap folder>/config/. These config files contain customizable options for each of the plugins as well as some global options. They are also accessible from the "Edit" menu in the GUI.

Converters have been re-written (see #574).
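As an illustration, the generated files are standard ini files, so they can be inspected with Python's configparser. The file name and the section/option names below are guesses of mine; check the generated files under /<faceswap folder>/config/ for the real keys.

```python
import configparser

config = configparser.ConfigParser()
config.read("config/train.ini")  # assumed file name

# Hypothetical section/option: the 'lowmem' toggle for the original model
lowmem = config.getboolean("model.original", "lowmem", fallback=False)
print("lowmem enabled:", lowmem)
```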
Known Bugs:
[GUI] Preview not working on Windows?
[Postponed - minor issue] Training takes a long time to start?
Todo:
[GUI] Auto switch to tab on open recent
[GUI] Read session type from saved file
[GUI] Analysis tab, allow filtering by loss type
[GUI] Remove graph animation and replace with refresh button
Read loss in GUI from model loss file
Store loss to file
Parallel model saving
Reinstate OHR RC4 fix
TensorBoard support
Add dfaker "landmarks based warping" option to all models
Tweak converter
Update config to delete old items as well as insert new items
Confirm timelapse working
Cleanup preview for masked training
Add masks to all models
Add coverage option to all models
[converters] Histogram currently non-functional. Working on mask / image interactions
Parametrize size and padding across the code
Merge @kvrooman PRs
Add dfaker mask to converters. [Cancelled]. Too similar to facehull to be worth it
Standardise NN Blocks
Fix for backwards compatibility
Converters for new models (Consolidation of converters & refactor #574)
Input shape. Decide which can be configured and which must be static
Load input shapes from state file for saved models
Expand out state file (add current model config to this and load from here)
Save model definition
Merge RC4_fix into original_hires
Backup state file with model
Improve GUI CPU handling for graph
Config options in GUI
Model corruption protection
Detail
A lot of the code has been standardized across all models, so they now all share the same loading/saving/training/preview functions. NN blocks and other training functions have been separated out into their own libraries so that they can be used by multiple models. This should make it easier to develop new models by using/adding different objects from the lib/model data store.
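A hedged sketch of what that standardization enables: model plugins composing shared block functions rather than each re-defining its own layers. The function names below mirror the discussion in this thread but are not guaranteed to match the actual lib/model API.

```python
from keras.layers import Conv2D, Input, LeakyReLU
from keras.models import Model


def conv_block(inp, filters):
    """A shared downscale block that any model plugin could reuse."""
    var_x = Conv2D(filters, kernel_size=5, strides=2, padding="same")(inp)
    return LeakyReLU(alpha=0.1)(var_x)


def build_encoder(input_shape=(64, 64, 3)):
    """A plugin assembles its encoder from the shared blocks."""
    inp = Input(shape=input_shape)
    var_x = inp
    for filters in (128, 256, 512, 1024):
        var_x = conv_block(var_x, filters)
    return Model(inp, var_x)


encoder = build_encoder()
```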
Abandoned
commits:
original model to new structure
IAE model to new structure
OriginalHiRes to new structure
Fix trainer for different resolutions
Initial config implementation
Configparse library added
improved training data loader
dfaker model working
Add logging to training functions
Non blocking input for cli training
Add error handling to threads. Add non-mp queues to queue_handler
Improved Model Building and NNMeta
refactor lib/models
training refactor. DFL H128 model Implementation
Dfaker - use hashes
Move timelapse. Remove perceptual loss arg
Update INSTALL.md. Add logger formatting. Update Dfaker training
DFL h128 partially ported