Added DeepDanbooru interrogator #1752
Conversation
Whereas, for what I could grab, CLIP gives me: "a anime character with green eyes and a bow tie on, wearing a green light up outfit and a black jacket, by Toei Animations". Is this a transformative distinction? It seems more descriptive than what looks like chained danbooru keywords.
CLIP is still old CLIP. The new button creates danbooru-style tagging, which works better than descriptive text for Waifu Diffusion, which is trained on danbooru tags.
@dfaker This is specifically for models trained on image-text pairs where the text is a tag dump from the website being scraped (i.e. Danbooru). It will work well for any model trained on a site with booru-like tags.
Okay, makes sense, so you'd be more likely to switch out CLIP for this one if you had WD loaded, rather than ever use them in parallel? It would probably be nicer to toggle between the models in an option and switch the behaviour accordingly.
@dfaker They both give starter prompts: CLIP is more suited for vanilla SD, danbooru for anime. It doesn't delay loading, as model loading and interrogation are done in a separate process when you click it. Anyway, I think it's useful and I'm going to be using it. If anyone wants to clean this code, they are free to do so.
Yeah, that's the delay: a delay until you click it. Other than being a bit of a hack to get it to play nicely, if there's demand for 2D it makes sense. Supporting an extra URL and adding something else to an already daunting UI are the only detractors; one is easily handled, the other comes down to demand, I suppose. Two thumbs up already, though. Oh wait: deepdanbooru, tensorflow, tensorflow-io.
I mainly use WD nowadays, so it's a nice addition for me at least. IMO a combobox for switching interrogate methods in settings would be nice in case more are added.
This is 100% the only interrogator I'd use; I switched to Greendayle's branch just for this feature. It's incredibly useful for people who use Waifu Diffusion, which is a huge number of SD users.
These descriptions don't work in Waifu Diffusion (which is a HUGE anime model with a lot of users); only danbooru tags work for that, and this does a fantastic job of giving enough tags to nearly perfectly recreate any image in that model.
The tensorflow requirement can be quite steep, as you'd likely need to install CUDA as well.
It doesn't seem to require that; it errors but still produces the correct result. I uninstalled my CUDA drivers earlier since I wasn't using them, so it seems to just work with whatever it's doing.
@AUTOMATIC1111 I've failed to find a way to unload a tensorflow model, and pytorch/tensorflow don't seem to be able to share VRAM properly. Hence the subprocess approach. I do understand that adding a second big NN library is a pain, but personally, for me and a lot of my friends, this has been an extremely useful tool to use alongside Waifu Diffusion. Maybe there is a way to convert a tensorflow network to pytorch, or to recreate its architecture and retrain it in pytorch to then use it inside the rest of the framework "natively", but right now it's a useful, working tool for a highly popular diffusion model, and I think it's worth including in this application.
Mainly using WD or an SD-WD merge, and in direct comparison, the DeepDanbooru interrogation gets better results, not only for img2img but also when taking those tags and throwing them into txt2img.
A bit too lazy to diagnose/debug atm, but I had to add the SQLite DLL to my Anaconda DLLs folder to get this working. See this SO answer for an example. As for how well it works, it seems to perform as you'd expect DeepDanbooru to: quite well for general tags, but your mileage may vary re characters, series, copyrights, etc. Otherwise, this is probably the cleanest setup I've used for DeepDanbooru so far. Thank you!
@evanjs Well, if you follow the installation guide in the readme using normal Python and webui-user.bat, which creates a normal venv, it works.
Yeah, it seemed odd to me. My setups tend to be complex and/or broken, so I'm not really surprised 😂
Yeah, I never got the SQLite error.
While I understand the apprehension about using both pytorch and tensorflow, I find that the utility provided by the deepbooru interrogation outweighs this by a lot.
If you are committed to not merging this, I would ask that you maintain a deepdanbooru branch using GitHub Actions.
Found an issue under Linux: processes are started using fork by default, and that would make tensorflow block pytorch from accessing the GPU even after the function finishes. Changed the executor subprocess spawning to "spawn", which makes it a completely separate process, just like on Windows.
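The fix above can be sketched roughly like this (hypothetical names, not the PR's actual code): forcing the "spawn" start method gives TensorFlow a fresh interpreter that releases the GPU when the worker exits, instead of a forked copy that inherits the parent's state.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor

def interrogate(image_path: str) -> str:
    # Hypothetical worker: tensorflow would be imported *here*, inside the
    # child process, so it never touches the parent's CUDA context.
    return f"tags for {image_path}"

def run_isolated(image_path: str) -> str:
    # "spawn" starts a clean interpreter (the default on Windows); "fork" on
    # Linux would inherit the parent's state and, per the comment above, can
    # block pytorch's GPU access even after the worker finishes.
    ctx = multiprocessing.get_context("spawn")
    with ProcessPoolExecutor(max_workers=1, mp_context=ctx) as pool:
        return pool.submit(interrogate, image_path).result()

if __name__ == "__main__":
    print(run_isolated("test.png"))
```

`ProcessPoolExecutor` has accepted an `mp_context` argument since Python 3.7, which is how a single executor can be pinned to the spawn start method without changing the global default.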
Since it looks like this won't be merged, after talking to @Greendayle, I've decided to maintain a deepdanbooru branch. |
Here's what I decided: add an option to launch.py to enable deepdanbooru, off by default. If it's on, install it and use it in the main UI; if not, don't install anything related to it and leave the UI as it is.
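The gating described above could look roughly like this (a sketch with placeholder flag and function names, not the repository's actual launch.py):

```python
import argparse

parser = argparse.ArgumentParser()
# Off by default: nothing TensorFlow-related is installed unless asked for.
parser.add_argument("--deepdanbooru", action="store_true",
                    help="enable the DeepDanbooru interrogator")
args, _ = parser.parse_known_args()

def extra_requirements(enable_deepdanbooru):
    # Only pull in the heavy extra dependencies when the feature is opted into;
    # package names here are illustrative.
    extra = []
    if enable_deepdanbooru:
        extra += ["tensorflow", "tensorflow-io", "deepdanbooru"]
    return extra

print(extra_requirements(args.deepdanbooru))
```

With the flag off, the dependency list stays empty and the UI is untouched, which matches the opt-in behaviour the maintainer describes.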
@evanjs It's just above your comment.
Is it possible to make this escape parentheses automatically? Right now they get inserted directly and are re-interpreted as emphasis parens instead.
Turn off the option in AUTOMATIC1111's UI and it'll be non-emphasized. Otherwise, 1.3 was trained without the use of parentheses or underscores, so you should remove them entirely.
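Automatic escaping as requested above could be a one-line regex pass over the tag string before it reaches the prompt box (a sketch of the idea; the code that eventually shipped may differ):

```python
import re

def escape_emphasis(tags: str) -> str:
    # Backslash-escape the characters the webui prompt parser treats as
    # emphasis/attention syntax, so literal booru tags survive intact.
    return re.sub(r"([\\()\[\]{}])", r"\\\1", tags)

print(escape_emphasis("1girl, (touhou), solo"))
```

Because the backslash itself is in the character class, pre-existing backslashes are escaped in the same single pass rather than being double-processed.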
Very nice. One problem I see with multiprocessing now is that it launches launch.py and that does the whole ... Also, did you consider reordering the tags by probability? That's what I use in my preprocessing script for TI.
I've heard from multiple people that on WD alphabetical sorting works better.
I believe the tags in WD 1.3 were sorted in random order; though I'd also find it more useful if the tags were sorted by score rather than alphabetically, as that would make it easier to remove tags when the prompt goes past the 77-token CLIP limit.
It is true that WD 1.3 was trained with random tag order, which means there is no logical reason why sorting the tags alphabetically would be beneficial. Also, given that Haru's official prompting guide makes no mention of alphabetical tag sorting being preferred, I can't help but feel that the supposed benefit of alphabetical sorting is more of a placebo, or a hangover from WD 1.2, which did not randomize the tags as far as I know.
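The two orderings under discussion can be compared on a toy example (the scores here are made up; a real interrogator returns a per-tag confidence for each predicted tag):

```python
# Hypothetical interrogator output: tag -> confidence score.
scores = {"1girl": 0.99, "outdoors": 0.65, "smile": 0.87, "hat": 0.42}
threshold = 0.5

# Keep only tags above the confidence threshold.
kept = [tag for tag, s in scores.items() if s >= threshold]

alphabetical = sorted(kept)
by_score = sorted(kept, key=scores.get, reverse=True)

# Score order makes trimming at the CLIP token limit easy: the least
# confident tags are already at the tail of the prompt.
print(", ".join(alphabetical))
print(", ".join(by_score))
```

Since WD 1.3 shuffled tag order during training, either ordering should be acceptable to the model; the score ordering is purely an ergonomic win for editing the prompt afterwards.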
2022-10-10 01:08:37.609118: E tensorflow/stream_executor/cuda/cuda_driver.cc:265] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected

Visual Studio 2022 Community Edition. Where is this .py? What's the best fix for this?
I have the same issue.
Yeah, prior to official integration I was able to use my CUDA-capable device (3090) on CUDA 11.2 (compilers + library) with cuDNN manually installed, but now I get an error every time and it only runs on the CPU.
@AIAMIAUTHOR Did you find a way to fix it?
Made a little experiment, mostly to make anime versions of photos and VRChat screenshots. As Waifu Diffusion is trained on Danbooru tags, https://github.com/KichangKim/DeepDanbooru becomes the perfect interrogator to use with img2img.
Due to tensorflow not sharing VRAM properly with pytorch, and the model being pretty big, I made it run in a separate process.
Feel free to include this code or not.