
Added DeepDanbooru interrogator #1752

Merged
merged 12 commits into AUTOMATIC1111:master on Oct 9, 2022

Conversation

Greendayle (Contributor)

Made a little experiment, mostly to make anime versions of photos and VRChat screenshots. As Waifu Diffusion is trained on Danbooru tags, https://github.com/KichangKim/DeepDanbooru becomes the perfect interrogator to use with img2img.

Due to tensorflow not sharing VRAM properly with pytorch, and the model being pretty big, I made it run in a separate process.
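A minimal sketch of that separate-process idea (not the PR's actual code; the tagger call is a placeholder, since the point here is only the process isolation):

```python
# Run the TensorFlow-based tagger in a fully separate Python process, so
# PyTorch and TensorFlow never share a CUDA context and the tagger's VRAM
# is released when the child exits.
import subprocess
import sys

# In the real feature the child would import tensorflow, load the
# DeepDanbooru model, and tag the image; a placeholder stands in here.
CHILD_CODE = """
import sys
image_path = sys.argv[1]  # path passed in from the parent process
tags = ["1girl", "green_eyes"]  # placeholder for the real model output
print(", ".join(tags))
"""

def interrogate_in_subprocess(image_path: str) -> str:
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, image_path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

When the child exits, the OS reclaims everything it allocated, which sidesteps tensorflow holding on to VRAM that pytorch needs.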

Feel free to include this code or not.

[image]

@dfaker (Collaborator)

dfaker commented Oct 5, 2022

Whereas for what I could grab, CLIP gives me:

"a anime character with green eyes and a bow tie on, wearing a green light up outfit and a black jacket, by Toei Animations"

Is this a transformative distinction? It seems more descriptive than what looks like chained Danbooru keywords.

@Greendayle (Contributor, Author)

CLIP is still old CLIP. The new button creates Danbooru-style tagging, which works better than descriptive text for Waifu Diffusion, since it is trained on Danbooru tags.

@moorehousew (Contributor)

@dfaker This is specifically for models trained on image-text pairs where the text is a tag dump from the website being scraped (i.e. Danbooru). Will work well for any model trained on a site with booru-like tags.

@dfaker (Collaborator)

dfaker commented Oct 5, 2022

Okay, makes sense, so you'd be more likely to switch out CLIP for this one if you had WD loaded, rather than ever use them in parallel?

It would probably be nicer to toggle between the models in an option and switch the behaviour in ui.interrogate(image) based on that, rather than add extra specifics into the img2img tab, particularly as this nicely delays the model loading until use.

@Greendayle (Contributor, Author)

Greendayle commented Oct 5, 2022

@dfaker they both give starter prompts: CLIP is more suited for vanilla SD, danbooru for anime.

It doesn't delay loading, as model load and interrogation are done in a separate process when you click it.

Anyway, I think it's useful and I'm going to be using it. If anyone wants to clean up this code, they are free to do so.

@dfaker (Collaborator)

dfaker commented Oct 5, 2022

It doesn't delay loading, as model load and interrogation are done in a separate process when you click it.

Yeah, that's the delay, a delay until you click it.

Other than being a bit of a hack to get it to play nice, if there's demand for 2D it makes sense. Supporting an extra URL and adding something else to a daunting UI are the only detractors: one easily handled, the other down to demand, I suppose; two thumbs up already, though.

Oh wait: deepdanbooru, tensorflow, tensorflow-io.

@chekaaa (Contributor)

chekaaa commented Oct 5, 2022

I mainly use WD nowadays, so it's a nice addition, for me at least.

IMO a combobox in settings for switching interrogation methods would be nice, in case more are added.

@BassJMagan

100%, this is the only interrogator I'd use; I switched over to Greendayle's branch just for this feature. It's incredibly useful for people who use Waifu Diffusion, which is a huge number of SD users.

@BassJMagan

Whereas for what I could grab, CLIP gives me:

"a anime character with green eyes and a bow tie on, wearing a green light up outfit and a black jacket, by Toei Animations"

Is this a transformative distinction? It seems more descriptive than what looks like chained Danbooru keywords.

These descriptions don't work in Waifu Diffusion (which is a huge anime model with a lot of users); only Danbooru tags work for it, and this does a fantastic job of giving enough tags to nearly recreate any image in that model.

@AUTOMATIC1111 (Owner)

The tensorflow requirement can be quite steep, as you'd likely need to install CUDA as well.

@BassJMagan

The tensorflow requirement can be quite steep, as you'd likely need to install CUDA as well.

It doesn't seem to require that: it errors but still produces the correct result. I uninstalled my CUDA drivers earlier since I wasn't using them, so it seems to just work with whatever it's doing.

@Greendayle (Contributor, Author)

Greendayle commented Oct 6, 2022

@AUTOMATIC1111
Although the deepdanbooru model is quite big, it runs in a reasonable amount of time on CPU (tensorflow automatically falls back to CPU when it can't use the GPU), even including secondary process startup and model load.

I've failed to find a way to unload a tensorflow model, and pytorch/tensorflow don't seem to be able to share VRAM properly. Hence the subprocess approach.

I do understand that adding a second big NN library is a pain, but personally, for me and a lot of my friends, this has been an extremely useful tool to use alongside Waifu Diffusion.

Maybe there is a way to convert a tensorflow network to pytorch, or to recreate its architecture and retrain it in pytorch so it could be used "natively" inside the rest of the framework, but right now it's a useful, working tool for a highly popular diffusion model, and I think it's worth including in this application.

@OrenjiVR

OrenjiVR commented Oct 6, 2022

Mainly using WD or an SD-WD merge, and in direct comparison the DeepDanbooru interrogation gets better results, not only for img2img but also when taking those tags and throwing them into txt2img.
Mileage might vary heavily if I weren't using WD, but having this as an option to switch on/off would be amazing!

@evanjs

evanjs commented Oct 6, 2022

A bit too lazy to diagnose/debug atm, but I had to add the SQLite DLL to my Anaconda DLLs folder to get this working.

  File "C:\tools\Anaconda3\lib\sqlite3\dbapi2.py", line 27, in <module>
    from _sqlite3 import *
ImportError: DLL load failed while importing _sqlite3: The specified module could not be found.

See this SO answer for an example.

As for how well it works, it seems to perform as you'd expect DeepDanbooru to: quite well for general tags, but your mileage may vary for characters, series, copyrights, etc.

Otherwise, this is probably the cleanest setup I've used for DeepDanbooru so far.
The only thought I have is that maybe the model could be downloaded automatically, like the other models in the repo?

Thank you!

@Greendayle (Contributor, Author)

@evanjs
That seems like a bug in Anaconda. sqlite3 is a built-in Python module and should always be available.

Well, if you follow the installation guide in the readme, using normal Python and webui-user.bat (which creates a normal venv), it works.

@evanjs

evanjs commented Oct 6, 2022

Yeah, it seemed odd to me.
If it works with normal setups, then that should be fine.

My setups tend to be complex and/or broken, so I'm not really surprised 😂
Thank you for clarifying.

@BassJMagan

Yeah, I never got the sqlite error.
On another note, when I downloaded the CUDA drivers and cuDNN again, it does utilize them and the interrogation is a bit faster, but it definitely doesn't seem necessary; the CPU-loaded model was fast enough as is, which should be enough for regular builds. Just silence the tensorflow error if it occurs.

@gfeAsdf

gfeAsdf commented Oct 7, 2022

While I understand the apprehension about using both pytorch and tensorflow, I find that the utility provided by the deepbooru interrogation outweighs it, by a lot.

@float3

float3 commented Oct 7, 2022

If you are committed to not merging this, I would ask that you maintain a deepdanbooru branch using GitHub Actions

@Greendayle (Contributor, Author)

Greendayle commented Oct 7, 2022

Found an issue under Linux: processes are started using fork by default, and that would make tensorflow block pytorch from accessing the GPU even after the function finished. Changed the executor subprocess spawning to "spawn", which makes it a completely separate process, just like on Windows.
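A sketch of that fix (illustrative names, not the PR's exact code): requesting a "spawn" context gives the worker a clean interpreter with no inherited CUDA state, matching the Windows behaviour.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor

def _tag_image(image_path):
    # In the real code this would import tensorflow and run DeepDanbooru;
    # a placeholder result is returned here.
    return "1girl, solo"

def run_interrogation(image_path):
    # "fork" (the Linux default) copies the parent's address space, which is
    # how tensorflow's CUDA state could linger and block pytorch; "spawn"
    # starts a fresh interpreter instead.
    ctx = multiprocessing.get_context("spawn")
    with ProcessPoolExecutor(max_workers=1, mp_context=ctx) as pool:
        return pool.submit(_tag_image, image_path).result()
```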

@float3

float3 commented Oct 8, 2022

Since it looks like this won't be merged, after talking to @Greendayle, I've decided to maintain a deepdanbooru branch.

@AUTOMATIC1111 (Owner)

Here's what I decided: add an option to launch.py to enable deepdanbooru, off by default. If it's on, install it and use it in the main UI; if not, don't install anything related to it and leave the UI as it is.

@Greendayle (Contributor, Author)

@AUTOMATIC1111
Like this?

@evanjs

evanjs commented Oct 8, 2022

@Greendayle
If anything, the PR on the showcase can show what you want to add, but I believe we're using the Wiki on this repository now

@Greendayle (Contributor, Author)

@evanjs it's just above your comment

@MrCheeze (Contributor)

MrCheeze commented Oct 9, 2022

Is it possible to make this escape parentheses automatically? Right now they get put in directly, and re-interpreted as emphasis parens instead.

@BassJMagan

Is it possible to make this escape parentheses automatically? Right now they get put in directly, and re-interpreted as emphasis parens instead.

Turn off the emphasis option in the webui and they'll be non-emphasized. Otherwise, WD 1.3 was trained without parentheses or underscores, so you should remove them entirely.
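The automatic escaping being asked for could look roughly like this (hypothetical helper names, not the merged code):

```python
import re

def escape_tag(tag: str) -> str:
    # Prefix ( and ) with a backslash so the webui's prompt parser treats
    # them as literal characters rather than emphasis markers.
    return re.sub(r"([()])", r"\\\1", tag)

def format_tags(tags):
    # Join escaped tags into a comma-separated starter prompt.
    return ", ".join(escape_tag(t) for t in tags)
```

For example, `escape_tag("hatsune_miku_(vocaloid)")` yields `hatsune_miku_\(vocaloid\)`.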

@AUTOMATIC1111 AUTOMATIC1111 merged commit e00b4df into AUTOMATIC1111:master Oct 9, 2022
@AUTOMATIC1111 (Owner)

Very nice.

One problem I see with multiprocessing now is that it launches launch.py, and that does the whole "Installing requirements for Web UI" thing again. I am not familiar with multiprocessing in Python, but is it possible to launch the process with a flag added? If so, we could make launch.py detect the flag and skip installation/checks, just launching webui.py without any extra checking, saving a few seconds.
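One possible shape for that skip-install flag (the environment-variable name here is hypothetical, not something launch.py actually reads):

```python
import os

SKIP_INSTALL_ENV = "WEBUI_SKIP_INSTALL"  # hypothetical marker variable

def should_skip_install(environ=os.environ) -> bool:
    # launch.py could check this before running its install/verify steps;
    # the parent would set it when spawning the interrogator process.
    return environ.get(SKIP_INSTALL_ENV) == "1"
```

Environment variables are inherited by child processes by default, which makes them a convenient way to pass such a marker without changing the command line.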

Also did you consider reordering the tags by probability? That's what I use in my preprocessing script for TI.

@Greendayle (Contributor, Author)

@AUTOMATIC1111

Also did you consider reordering the tags by probability?

I've heard from multiple people that on WD alphabetical sorting works better.

@BassJMagan

@AUTOMATIC1111

Also did you consider reordering the tags by probability?

I've heard from multiple people that on WD alphabetical sorting works better.

I believe the tags in WD 1.3 were in random order; though I'd also find it more useful if the tags were sorted by score rather than alphabetically, since that would make it easier to drop tags when the prompt goes past the 77-token CLIP limit.
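The two orderings under discussion can be sketched like this (the scores are invented for illustration, not real DeepDanbooru output):

```python
scores = {"1girl": 0.99, "smile": 0.97, "green_eyes": 0.95}

def alphabetical(tag_scores):
    # The ordering the PR currently emits.
    return sorted(tag_scores)

def by_probability(tag_scores):
    # Highest-confidence first, so trailing tags can simply be dropped
    # when the prompt exceeds the 77-token CLIP limit.
    return sorted(tag_scores, key=tag_scores.get, reverse=True)
```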

@EliEron

EliEron commented Oct 9, 2022

It is true that WD 1.3 was trained with random tag order, which means there is no logical reason why sorting the tags alphabetically would be beneficial.

Also, given that Haru's official prompting guide makes no mention of alphabetical tag sorting being preferred, I can't help but feel that the supposed benefit is more of a placebo, or a holdover from WD 1.2, which did not randomize the tags as far as I know.

@AIAMIAUTHOR

AIAMIAUTHOR commented Oct 10, 2022

2022-10-10 01:08:37.609118: E tensorflow/stream_executor/cuda/cuda_driver.cc:265] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2022-10-10 01:08:47.810344: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: DESKTOP
2022-10-10 01:08:47.810713: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: DESKTOP
2022-10-10 01:08:47.811322: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING:tensorflow:No training configuration found in the save file, so the model was not compiled. Compile it manually.

Visual Studio 2022 Community Edition
CUDA 11.8
CUDNN 8.6
Multiple GPUs
Tensorflow-gpu 2.6.0
Driver 522.06 dev

Where is this .py?

What's the best fix for this? I've tried variations of:

from tensorflow.python.client import device_lib
device_lib.list_local_devices()

import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # also tried "0,1" and "0,2,3,4"

as well as setting CUDA_VISIBLE_DEVICES=0 (or 1, or 2) in the environment.

@Mikian01

2022-10-10 01:08:37.609118: E tensorflow/stream_executor/cuda/cuda_driver.cc:265] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2022-10-10 01:08:47.810344: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: DESKTOP
2022-10-10 01:08:47.810713: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: DESKTOP
2022-10-10 01:08:47.811322: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING:tensorflow:No training configuration found in the save file, so the model was not compiled. Compile it manually.

i have the same issue

@BassJMagan

Yeah, prior to official integration I was able to use my CUDA-capable device (3090) with CUDA 11.2 (compilers + libraries) and cuDNN installed manually, but now I get an error every time and it only runs on CPU.

@Mikian01

Whats the best fix for this?

@AIAMIAUTHOR did you find a way to fix it?
