-
Here's a quick and dirty implementation for the extension; it still needs a model trained for it.
-
See here: https://huggingface.co/GeroldMeisinger/control-edgedrawing - trained for 40000 steps on images converted with https://github.com/shaojunluo/EDLinePython.
ControlNet somewhat picks up on it, but the results are not good so far:
-
Update: the following issues are obsolete with the OpenCV version, see next posts.

Lenna test (obsolete)

There seems to be something off with the gradientThreshold, see shaojunluo/EDLinePython#4.
Edge drawing default settings:
Edge drawing from paper:
Edge drawing no-denoising:
Canny edge from Automatic1111 (using default settings: low=100, high=200):

Edge drawing algorithm (obsolete)

The edge drawing algorithm works in 3 steps (Gaussian smoothing, anchor point extraction from the gradient map, anchor linking by smart routing) and then returns a list of edge objects and the resulting image.

Parameter permutations (obsolete)

Following is a short Python snippet which writes an edge map for every parameter permutation (EdgeDrawing and EDParam_default come from the EDLinePython repo; the long comments are quotes from the paper):

```python
import os
import cv2
# EdgeDrawing and EDParam_default are provided by
# https://github.com/shaojunluo/EDLinePython

filenames = ["000000001", "000000006", "000000014", "000000024", "000000029"]

# Smoothing: reduce the effect of noise in the image by blurring each pixel
# with its neighboring pixels; the paper uses a standard 5x5 Gaussian kernel
# with sigma = 1.
isSmoothed = False
ksizes = [3, 5, 7, 9, 11]           # GaussianBlur kernel size, only if smoothed=False
sigmas = [0.6, 0.8, 1.0, 1.2, 1.4]  # GaussianBlur sigma, only if smoothed=False

# Gradient threshold: white edge areas correspond to pixels for which
# Gx + Gy >= threshold; black pixels are suppressed as non-edges. Figure 1 of
# the paper [high-pass filtering residue of the Lena image] shows the edge
# areas for a threshold value of 36.
gradientThresholds = [24, 30, 36, 42, 48]

# "Detail ratio": determines how many edge anchor points are selected. The more
# anchor points you have to start the linking process, the more detail in the
# final edge map. If you are only interested in the major (long) edges, you can
# specify a big detail ratio, e.g. a value bigger than 10; if you want more
# details, specify a small detail ratio such as 1, 2, 3, 4, etc. For a detail
# ratio of k, every kth row/column is scanned and anchor points are only marked
# there, so two consecutive anchor points along the same edge are at least
# k pixels apart.
anchorThresholds = [4, 6, 8, 10, 12]
scanIntervals = [1, 2, 3, 4, 5]

def with_param(filename, param1, x, param2=None, y=None):
    params = EDParam_default.copy()
    params[param1] = x
    if param2 is not None:
        params[param2] = y
    ed = EdgeDrawing(params)
    image = cv2.imread(filename + ".webp", cv2.IMREAD_GRAYSCALE)
    edges, edge_map = ed.EdgeDrawing(image, smoothed=isSmoothed)
    name = filename + "/" + filename + "_" + param1 + "=" + str(x)
    if param2 is not None:
        name += "_" + param2 + "=" + str(y)
    cv2.imwrite(name + ".png", edge_map, [cv2.IMWRITE_PNG_BILEVEL, 1])

for filename in filenames:
    if not os.path.exists(filename):
        os.makedirs(filename)
    # baseline edge map with default parameters
    image = cv2.imread(filename + ".webp", cv2.IMREAD_GRAYSCALE)
    ed = EdgeDrawing()
    edges, edge_map = ed.EdgeDrawing(image, smoothed=False)
    cv2.imwrite(filename + "/" + filename + ".png", edge_map, [cv2.IMWRITE_PNG_BILEVEL, 1])
    # parameter sweeps
    if not isSmoothed:
        for x in ksizes:
            for y in sigmas:
                with_param(filename, "ksize", x, "sigma", y)
    for x in gradientThresholds:
        with_param(filename, "gradientThreshold", x)
    for x in anchorThresholds:
        for y in scanIntervals:
            with_param(filename, "anchorThreshold", x, "scanIntervals", y)
```
-
https://huggingface.co/GeroldMeisinger/control-edgedrawing - 40k steps, default settings, smoothed=True (=> noisy), no drops:
-
Here is the C++ implementation of edge drawing by the original author: https://github.com/CihanTopal/ED_Lib. Interestingly, there is a "parameter free" version (which would be nice) and a "color" version (which would also be nice): EdgeDrawing Parameter-Free.
That was easy... the color version is unfortunately not available, but the algorithm is fast. Training starts again. See you tomorrow...
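For reference, a minimal sketch of how the OpenCV variant can be invoked (assuming opencv-contrib-python >= 4.8 for cv2.ximgproc; the filenames are illustrative):

```python
import cv2  # requires opencv-contrib-python for cv2.ximgproc

image = cv2.imread("lenna.png", cv2.IMREAD_GRAYSCALE)  # illustrative input
ed = cv2.ximgproc.createEdgeDrawing()
params = cv2.ximgproc_EdgeDrawing_Params()
params.PFmode = True  # enable the "parameter free" (EDPF) variant
ed.setParams(params)
ed.detectEdges(image)
edge_map = ed.getEdgeImage()  # binary edge map (edges are non-zero)
cv2.imwrite("lenna_edpf.png", edge_map)
```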
-
Update: https://huggingface.co/GeroldMeisinger/control-edgedrawing -> you can find all images, generation details and a comparison with canny in control-edgedrawing-cv480edpf-drop0-fp16-checkpoint-45000.zip

Evaluation prompts:
eagle: "a detailed high-quality professional image of an eagle flying over the mountains"
lenna: "a detailed high-quality professional photo of swedish woman standing in front of a mirror, dark brown hair, white hat with purple feather"
bird: "bird"
lion: "lion"
dog2: "a cute dog"

45000 steps with fp16 so far. Resuming for another 45000 with flipped images. The results look promising. What do you think?
-
Questions about training

I work on the laion2b-en-aesthetics6.5 dataset with 180k images, which uses alt-tags as captions. 1 epoch takes 20h on my RTX 3060 12GB. To increase training quality we should apply certain transformations, but because of the sheer amount of images this has to be done automatically, of course (a sketch of what that could look like follows below).
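A minimal sketch of such an automatic pass (the helper, thresholds and filenames are hypothetical, not the pipeline actually used):

```python
import os
from PIL import Image

def prepare_image(path, out_dir, min_side=512, max_side=1024):
    """Filter and normalize one dataset image; returns False if it was skipped."""
    try:
        img = Image.open(path).convert("RGB")  # also weeds out files Pillow can't read
    except Exception:
        return False
    if min(img.size) < min_side:
        return False                      # too small to be useful for 512px training
    img.thumbnail((max_side, max_side))   # downscale huge images, keep aspect ratio
    name = os.path.splitext(os.path.basename(path))[0] + ".webp"
    img.save(os.path.join(out_dir, name))
    return True
```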
More experiments:
-
UPDATE 90000 steps fp16 (45000 on original, 45000 on left-right flipped, no drops). Resuming with epoch 2 and 50% prompt drops.

So I started using Python for SD now and wrote a small script which generates the evaluation images (sketched below). A few fixes compared to the previous (manual) generations:

If someone knows how to answer my training questions above, or has pointers to info, or knows someone who knows, that would help me a lot!!
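A minimal sketch of such an evaluation script with diffusers (model path, prompts and filenames are illustrative; the actual script differs):

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline, UniPCMultistepScheduler
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "GeroldMeisinger/control-edgedrawing", torch_dtype=torch.float16)  # checkpoint path illustrative
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()

prompts = {"bird": "bird", "dog2": "a cute dog"}  # subset of the evaluation prompts
for name, prompt in prompts.items():
    control = load_image(name + "_edge.png")  # precomputed edge drawing map
    image = pipe(prompt, image=control, num_inference_steps=20,
                 generator=torch.Generator("cpu").manual_seed(0)).images[0]
    image.save(name + "_eval.png")
```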
-
UPDATE 118000 steps fp16 (45000 on original, 45000 on left-right flipped, no drops; epoch 2: 28000 steps with 50% drops). Results became worse: CN didn't pick up on no-prompts and answered by sending demons. Restarting with 50% drop.
-
UPDATE 45000 steps 50% drop https://huggingface.co/GeroldMeisinger/control-edgedrawing -> control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-45000.safetensors => Results are not good; 50% is probably too much for 45k steps. Guess mode still doesn't work and tends to produce humans. Resuming until 90k with right-left flipped images in the hope it will get better with more images.
-
UPDATE Experiment 5.0 - 45000 steps with fastdup-cleaned images and fp32
https://huggingface.co/GeroldMeisinger/control-edgedrawing -> control-edgedrawing-cv480edpf-fastdup-fp16-checkpoint-45000

Okay, so I looked into image dataset sanitizing, and apparently laion2b-en-aesthetics65 contains about 40% duplicates (with the fastdup default threshold=0.9). That's crazy! A small group of greyscale images is duplicated hundreds of times. Why has no one ever pointed this out before?! And someday I will even understand what this means:

In my next experiment I'm going to use the caption with the highest similarity value of all duplicates and train on non-squared images too. The image dataset also contained .svg files, which makes Pillow throw up. It's unbelievable how much crap one has to put up with in all these images. The SVG images all look similar to this: what the duck?!
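For reference, a minimal sketch of how such a duplicate scan can be run with fastdup (directory names are illustrative; calls follow the fastdup 1.x API as I understand it, so treat the exact signatures as an assumption):

```python
import fastdup

# build a similarity graph over the dataset and cluster near-duplicates
fd = fastdup.create(work_dir="fastdup_work", input_dir="laion2b-en-a65/")
fd.run(ccthreshold=0.9)        # 0.9 is the default similarity threshold
fd.vis.duplicates_gallery()    # writes an HTML report of duplicate pairs
```

Keeping one image per near-duplicate cluster (e.g. the one whose caption has the highest similarity value) then removes the duplicates.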
-
UPDATE Experiment 6.0 - 135000 steps (~2.5 epochs) with rectangular, fastdup-cleaned images
https://huggingface.co/GeroldMeisinger/control-edgedrawing -> control-edgedrawing-cv480edpf-rect-fp16-checkpoint-XXX (45000, 90000, 135000)

So I had to leave home for a few days, set the training to infinite epochs, and came home to 135000 steps :) The image dataset now includes rectangular images. Some very strange intermediate evaluation images:
checkpoint-77000 - all scribble art?
checkpoint-97000 - all desaturated?
checkpoint-131000 - all high contrast?

The same checkpoints for other image types (like "bird") look fine but show the same phenomenon at other checkpoints. I don't know what to make of it. Unlucky seeds => generate more images? More positive and negative prompts required to guide SD more? If we just compare the checkpoints I uploaded, we are lucky that all look relatively fine, but none is significantly better than the others. Some things I derive from this (Update: all of this may be due to the small effective batch size, see experiment 6.1):

Loss graph looks like this:

I read my own article again, especially the QA section about ControlNet quality. It appears past-me already predicted many of the errors I made and points to "Increase gradient accumulation steps!". I don't know what "gradient accumulation steps" are, but it abbreviates to "GAS", which brings me to the conclusion that "good control nets need more GAS" (a sketch of the idea follows below). I started a new experiment following lllyasviel's advice: "The batch size should not be reduced under any circumstances" and "But usually, if your logic batch size is already bigger than 256, then further extending the batch size is not very meaningful. In that case, perhaps a better idea is to train more steps. I tried some "common" logic batch size at 64 or 96 or 128 (by gradient accumulation), it seems that many complicated conditions can be solved very well already."
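Gradient accumulation in a minimal, self-contained PyTorch sketch (toy model and data, not the actual training loop): gradients from several micro-batches are summed before one optimizer step, so with a micro-batch of 4 and 8 accumulation steps the effective batch size is 32.

```python
import torch
from torch import nn

model = nn.Linear(16, 1)  # toy stand-in for the network being trained
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()
micro_batches = [(torch.randn(4, 16), torch.randn(4, 1)) for _ in range(32)]

accumulation_steps = 8  # effective batch size = 4 * 8 = 32
optimizer.zero_grad()
for step, (x, y) in enumerate(micro_batches):
    loss = loss_fn(model(x), y) / accumulation_steps  # scale so summed grads average
    loss.backward()                                   # gradients accumulate in param.grad
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()                              # one weight update per 8 micro-batches
        optimizer.zero_grad()
```

This trades wall-clock time for VRAM: the GPU only ever holds a micro-batch of 4, but the optimizer sees gradients averaged over 32 samples.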
-
UPDATE Experiment 6.1 - 6696 steps with effective batch size 32
https://huggingface.co/GeroldMeisinger/control-edgedrawing -> control-edgedrawing-rect-fp16-batch32

lllyasviel: "In my experiments, [higher batch size] is usually better than [training on more steps]"

What an understatement! A higher effective batch size pretty much solved all the problems, and the default settings from HF are crap. No-prompt also works much better now.

Following lllyasviel: "Because that "sudden converge" always happens, lets say "sudden converge" will happen at 3k step and our money can optimize 90k step, then we have two options: (1) train 3k steps, sudden converge, then train 87k steps. (2) 30x gradient accumulation, train 3k steps (90k real computation steps), then sudden converge."

Result: I got a stable "dog2" at 3800 steps, leaving 2900 more steps to fine-tune. lllyasviel: "..in real cases, perhaps you may need to balance the steps before and after the "sudden converge" on your own to find a balance. The training after "sudden converge" is also important." That's very vague, but it matches what I learned so far; see the evaluation images!

Loss graph looks like this: Maybe it's a recursive graph and shows "how much I'm lost" at understanding what the loss graph means...

About the Sudden Convergence Phenomenon

This is the original image: This is what I'm seeing:
bird 2400 (all previous steps look very similar)
bird 2500 (sudden change to something different)
dog2 2400 (all previous steps look very similar)
dog2 2500 (sudden change to something different)
dog2 2600 (following silhouette, different background at silhouette)

You can find all the original evaluation images for intermediate steps at HF. The convergence is even less sudden with smaller batch sizes. My guess is that lllyasviel used much greater effective batch sizes, which makes it look more sudden. And I hypothesize that the phenomenon would look even more gradual if we generated an image for every step at high batch sizes. But it's not that important anyway; we just have to know the approximate step number at which the model follows the conditioning to our satisfaction. However, I propose to call it "sufficiently phenomenal convergence", which puts more emphasis on its gradual nature :D

On evaluation images
-
UPDATE Experiment 6.2 - 50% prompt dropping
https://huggingface.co/GeroldMeisinger/control-edgedrawing -> control-edgedrawing-rect-fp16-batch32-drop50

In each comparison the first image uses 50% prompt dropping, the second no prompt dropping (see previous experiment):
- feathers and wood are more textured, the image looks more natural
- the dog is more dog-like, less abstract
- the lion still looks like a dog, but has more textured hair
- almost no difference, I guess a room is very clearly "a room"
- almost no difference, I guess because SD already tends to generate humans
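For context, "prompt dropping" just means replacing the caption with an empty string for a fraction of the training samples, so the model has to learn to reconstruct the image from the control map alone. A minimal sketch (hypothetical helper, not the actual training code):

```python
import random

def maybe_drop_prompt(caption: str, drop_rate: float = 0.5) -> str:
    # with probability drop_rate, train this sample with an empty prompt;
    # this strengthens conditioning on the edge map when no prompt is given
    return "" if random.random() < drop_rate else caption
```

If I read the diffusers train_controlnet.py script correctly, this corresponds to its proportion_empty_prompts option.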
-
Hi @geroldmeisinger, thanks for the best CN report and tutorials I've found! I am also extensively training some CNs now. Are you on any Discord channels so we can exchange some ideas?
-
Canny is a good edge detector, but it only produces good edges after fine-tuning its parameters, and images with different contrasts require different parameters.
A long time ago I found this paper, which proposes an alternative that outputs great edges on most images without requiring tuning: https://www.semanticscholar.org/paper/Edge-Drawing%3A-A-Heuristic-Approach-to-Robust-Edge-Topal-Akinlar/9c8f2dc3bbd0e7e28f4a72dcdb4d77b52f478740
I found this Python implementation: https://github.com/shaojunluo/EDLinePython
I think it would be great if you could integrate it. It could make the edge module a lot easier to use and potentially to automate.
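To illustrate the tuning problem (the filename and the first threshold pair are just examples; 100/200 are the Automatic1111 defaults mentioned above):

```python
import cv2

img = cv2.imread("example.png", cv2.IMREAD_GRAYSCALE)  # illustrative filename
edges_low  = cv2.Canny(img, 50, 150)   # picks up more edges, but also more noise
edges_high = cv2.Canny(img, 100, 200)  # A1111 defaults; may miss low-contrast edges
```

The same threshold pair that works for a high-contrast photo can miss most edges in a low-contrast one, which is exactly the tuning burden a parameter-free edge drawing variant avoids.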