Added DeepDanbooru interrogator #1752
Conversation
Whereas, for what I could grab, CLIP gives me: "a anime character with green eyes and a bow tie on, wearing a green light up outfit and a black jacket, by Toei Animations". Is this a transformative distinction? It seems more descriptive than what looks like chained danbooru keywords.
CLIP is still old CLIP. The new button creates danbooru-style tagging, which works better than descriptive text for Waifu Diffusion, which is trained on danbooru tags.
@dfaker This is specifically for models trained on image-text pairs where the text is a tag dump from the website being scraped (i.e. Danbooru). It will work well for any model trained on a site with booru-like tags.
Okay, makes sense, so you'd be more likely to switch out CLIP for this one if you had WD loaded, rather than ever use them in parallel? It would probably be nicer to toggle between the models in an option and switch the behaviour accordingly.
@dfaker They both give starter prompts: CLIP is more suited for vanilla SD, danbooru for anime. It doesn't delay loading, as model loading and interrogation are done in a separate process when you click it. Anyway, I think it's useful and I'm going to be using it. If anyone wants to clean this code, they are free to do so.
Yeah, that's the delay: a delay until you click it. Other than being a bit of a hack to get it to play nicely, if there's demand for 2D it makes sense. Supporting an extra URL and adding something else to an already daunting UI are the only detractors; one is easily handled, the other comes down to demand, I suppose. Two thumbs up already, though. Oh wait: deepdanbooru, tensorflow, tensorflow-io.
I mainly use WD nowadays, so it's a nice addition for me at least. IMO a combobox for switching interrogate methods in settings would be nice in case more are added.
This is 100% the only interrogator I'd use; I switched to Greendayle's branch just for this feature. It's incredibly useful for people who use Waifu Diffusion, which is a huge number of SD users.
These descriptions don't work in Waifu Diffusion (which is a HUGE anime model with a lot of users); only danbooru tags work for that, and this does a fantastic job of giving enough tags to nearly perfectly recreate any image in that model.
The tensorflow requirement can be quite steep, as you'd likely need to install CUDA as well.
It doesn't seem to require that; it errors but still produces the correct result. I uninstalled my CUDA drivers earlier since I wasn't using them, so it seems to just work with whatever it's doing.
@AUTOMATIC1111 I've failed to find a way to unload a tensorflow model, and pytorch/tensorflow don't seem to be able to share VRAM properly. Hence the subprocess approach. I do understand that adding a second big NN library is a pain, but personally, for me and a lot of my friends, this has been an extremely useful tool to use alongside Waifu Diffusion. Maybe there is a way to convert a tensorflow network to pytorch, or to recreate its architecture and retrain it in pytorch to then use it inside the rest of the framework "natively", but right now it's a useful, working tool for a highly popular diffusion model, and I think it's worth including in this application.
Mainly using WD or an SD-WD merge, and in direct comparison, the DeepDanbooru interrogation gets better results, not only for img2img but also when taking those tags and throwing them into txt2img.
A bit too lazy to diagnose/debug atm, but I had to add the SQLite DLL to my Anaconda DLLs folder to get this working. See this SO answer for an example. As for how well it works, it seems to perform as you'd expect DeepDanbooru to: quite well for general tags, but your mileage may vary re characters, series, copyrights, etc. Otherwise, this is probably the cleanest setup I've used for DeepDanbooru so far. Thank you!
@evanjs Well, if you follow the installation guide in the readme using normal Python and webui-user.bat, which creates a normal venv, it works.
Yeah, it seemed odd to me. My setups tend to be complex and/or broken, so I'm not really surprised 😂
Yeah, I never got the SQLite error.
While I understand the apprehension about using both pytorch and tensorflow, I find that the utility provided by the deepbooru interrogation outweighs this by a lot.
If you are committed to not merging this, I would ask that you maintain a deepdanbooru branch using GitHub Actions.
Found an issue under Linux: processes are started using fork by default, and that would make tensorflow block pytorch from accessing the GPU even after the function finishes. Changed the executor subprocess spawning to "spawn", which makes it a completely separate process, just like on Windows.
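The fix above can be sketched roughly like this (hypothetical names, not the PR's actual code): forcing the "spawn" start method gives TensorFlow a fresh interpreter that releases the GPU when the worker exits, instead of a forked copy that inherits the parent's state.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor

def interrogate(image_path: str) -> str:
    # Hypothetical worker: tensorflow would be imported *here*, inside the
    # child process, so it never touches the parent's CUDA context.
    return f"tags for {image_path}"

def run_isolated(image_path: str) -> str:
    # "spawn" starts a clean interpreter (the default on Windows); "fork" on
    # Linux would inherit the parent's state and, per the comment above, can
    # block pytorch's GPU access even after the worker finishes.
    ctx = multiprocessing.get_context("spawn")
    with ProcessPoolExecutor(max_workers=1, mp_context=ctx) as pool:
        return pool.submit(interrogate, image_path).result()

if __name__ == "__main__":
    print(run_isolated("test.png"))
```

`ProcessPoolExecutor` has accepted an `mp_context` argument since Python 3.7, which is how a single executor can be pinned to the spawn start method without changing the global default.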
Since it looks like this won't be merged, after talking to @Greendayle, I've decided to maintain a deepdanbooru branch. |
Here's what I decided: add an option to launch.py to enable deepdanbooru, off by default. If it's on, install it and use it in the main UI; if not, don't install anything related to it and leave the UI as it is.
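The gating described above could look roughly like this (a sketch with placeholder flag and function names, not the repository's actual launch.py):

```python
import argparse

parser = argparse.ArgumentParser()
# Off by default: nothing TensorFlow-related is installed unless asked for.
parser.add_argument("--deepdanbooru", action="store_true",
                    help="enable the DeepDanbooru interrogator")
args, _ = parser.parse_known_args()

def extra_requirements(enable_deepdanbooru):
    # Only pull in the heavy extra dependencies when the feature is opted into;
    # package names here are illustrative.
    extra = []
    if enable_deepdanbooru:
        extra += ["tensorflow", "tensorflow-io", "deepdanbooru"]
    return extra

print(extra_requirements(args.deepdanbooru))
```

With the flag off, the dependency list stays empty and the UI is untouched, which matches the opt-in behaviour the maintainer describes.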
@evanjs It's just above your comment.
Is it possible to make this escape parentheses automatically? Right now they get inserted directly and are re-interpreted as emphasis parens instead.
Turn off the option in AUTOMATIC1111's UI and it'll be non-emphasized. Otherwise, 1.3 was trained without the use of parentheses or underscores, so you should remove them entirely.
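Automatic escaping as requested above could be a one-line regex pass over the tag string before it reaches the prompt box (a sketch of the idea; the code that eventually shipped may differ):

```python
import re

def escape_emphasis(tags: str) -> str:
    # Backslash-escape the characters the webui prompt parser treats as
    # emphasis/attention syntax, so literal booru tags survive intact.
    return re.sub(r"([\\()\[\]{}])", r"\\\1", tags)

print(escape_emphasis("1girl, (touhou), solo"))
```

Because the backslash itself is in the character class, pre-existing backslashes are escaped in the same single pass rather than being double-processed.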
Very nice. One problem I see with multiprocessing now is that it launches launch.py and that does the whole ... Also, did you consider reordering the tags by probability? That's what I use in my preprocessing script for TI.
I've heard from multiple people that on WD alphabetical sorting works better.
I believe the tags in WD 1.3 were sorted in random order; though I'd also find it more useful if the tags were sorted by score rather than alphabetically, as that would make it easier to remove tags when the prompt goes past the 77-token CLIP limit.
It is true that WD 1.3 was trained with random tag order, which means there is no logical reason why sorting the tags alphabetically would be beneficial. Also, given that Haru's official prompting guide makes no mention of alphabetical tag sorting being preferred, I can't help but feel that the supposed benefit of alphabetical sorting is more of a placebo, or a hangover from WD 1.2, which did not randomize the tags as far as I know.
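The two orderings under discussion can be compared on a toy example (the scores here are made up; a real interrogator returns a per-tag confidence for each predicted tag):

```python
# Hypothetical interrogator output: tag -> confidence score.
scores = {"1girl": 0.99, "outdoors": 0.65, "smile": 0.87, "hat": 0.42}
threshold = 0.5

# Keep only tags above the confidence threshold.
kept = [tag for tag, s in scores.items() if s >= threshold]

alphabetical = sorted(kept)
by_score = sorted(kept, key=scores.get, reverse=True)

# Score order makes trimming at the CLIP token limit easy: the least
# confident tags are already at the tail of the prompt.
print(", ".join(alphabetical))
print(", ".join(by_score))
```

Since WD 1.3 shuffled tag order during training, either ordering should be acceptable to the model; the score ordering is purely an ergonomic win for editing the prompt afterwards.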
2022-10-10 01:08:37.609118: E tensorflow/stream_executor/cuda/cuda_driver.cc:265] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected

Visual Studio 2022 Community Edition. Where is this .py? What's the best fix for this?
I have the same issue.
Yeah, prior to official integration I was able to use my CUDA-capable device (3090) on CUDA 11.2 (compilers + library) with cuDNN manually installed, but now I get an error every time and it only runs on the CPU.
@AIAMIAUTHOR Did you find a way to fix it?
Made a little experiment, mostly to make anime versions of photos and VRChat screenshots. As Waifu Diffusion is trained on Danbooru tags, https://github.com/KichangKim/DeepDanbooru becomes the perfect interrogator to use with img2img.
Due to tensorflow not sharing VRAM properly with pytorch, and the model being pretty big, I made it run in a separate process.
Feel free to include this code or not.