Commit 3139aaa

Bump version, add changelog
Also updated some parts of the README. Other parts still need updating.
1 parent e02d04c commit 3139aaa

4 files changed, +30 -30 lines changed

CHANGELOG.md

Lines changed: 5 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -1,4 +1,9 @@
11
## Changelog
2+
### 0.4.3 video processing tab
3+
* Added an option to process videos directly from a video file. This leads to better results than batch-processing individual frames of a video. It also allows generating depthmap videos that can be used in further generations as custom depthmap videos.
4+
* UI improvements.
5+
* Extra stereoimage generation modes - enable in extension settings if you want to use them.
6+
* New stereoimage generation parameter - offset exponent. Setting it to 1 may produce more realistic outputs.
27
### 0.4.2
38
* Added UI options for 2 additional rembg models.
49
* Heatmap generation UI option is hidden - if you want to use it, please activate it in the extension settings.

README.md

Lines changed: 20 additions & 26 deletions
@@ -1,9 +1,9 @@
11
# High Resolution Depth Maps for Stable Diffusion WebUI
2-
This script is an addon for [AUTOMATIC1111's Stable Diffusion WebUI](https://github.com/AUTOMATIC1111/stable-diffusion-webui) that creates `depth maps`, and now also `3D stereo image pairs` as side-by-side or anaglyph from a single image. The result can be viewed on 3D or holographic devices like VR headsets or [Looking Glass](https://lookingglassfactory.com/) displays, used in Render- or Game- Engines on a plane with a displacement modifier, and maybe even 3D printed.
2+
This program is an addon for [AUTOMATIC1111's Stable Diffusion WebUI](https://github.com/AUTOMATIC1111/stable-diffusion-webui) that creates depth maps. Using either generated or custom depthmaps, it can also create 3D stereo image pairs (as side-by-side or anaglyph), normalmaps and 3D meshes. The outputs of the script can be viewed directly or used as an asset for a 3D engine. Please see [wiki](https://github.com/thygate/stable-diffusion-webui-depthmap-script/wiki/Viewing-Results) to learn more. The program has integration with Rembg. It supports batch processing and processing of videos and can also be run in standalone mode, without Stable Diffusion WebUI.
33

4-
To generate realistic depth maps `from a single image`, this script uses code and models from the [MiDaS](https://github.com/isl-org/MiDaS) and [ZoeDepth](https://github.com/isl-org/ZoeDepth) repositories by Intel ISL, or LeReS from the [AdelaiDepth](https://github.com/aim-uofa/AdelaiDepth) repository by Advanced Intelligent Machines. Multi-resolution merging as implemented by [BoostingMonocularDepth](https://github.com/compphoto/BoostingMonocularDepth) is used to generate high resolution depth maps.
4+
To generate realistic depth maps from individual images, this script uses code and models from the [MiDaS](https://github.com/isl-org/MiDaS) and [ZoeDepth](https://github.com/isl-org/ZoeDepth) repositories by Intel ISL, or LeReS from the [AdelaiDepth](https://github.com/aim-uofa/AdelaiDepth) repository by Advanced Intelligent Machines. Multi-resolution merging as implemented by [BoostingMonocularDepth](https://github.com/compphoto/BoostingMonocularDepth) is used to generate high resolution depth maps.
55

6-
3D stereo, and red/cyan anaglyph images are generated using code from the [stereo-image-generation](https://github.com/m5823779/stereo-image-generation) repository. Thanks to [@sina-masoud-ansari](https://github.com/sina-masoud-ansari) for the tip! Discussion [here](https://github.com/thygate/stable-diffusion-webui-depthmap-script/discussions/45). Improved techniques for generating stereo images and balancing distortion between eyes by [@semjon00](https://github.com/semjon00), see [here](https://github.com/thygate/stable-diffusion-webui-depthmap-script/pull/51) and [here](https://github.com/thygate/stable-diffusion-webui-depthmap-script/pull/56).
6+
Stereoscopic images are created using a custom-written algorithm.
77

88
3D Photography using Context-aware Layered Depth Inpainting by Virginia Tech Vision and Learning Lab , or [3D-Photo-Inpainting](https://github.com/vt-vl-lab/3d-photo-inpainting) is used to generate a `3D inpainted mesh` and render `videos` from said mesh.
99

@@ -20,32 +20,26 @@ video by [@graemeniedermayer](https://github.com/graemeniedermayer), more exampl
2020
![](https://user-images.githubusercontent.com/54073010/210012661-ef07986c-2320-4700-bc54-fad3899f0186.png)
2121
images generated by [@semjon00](https://github.com/semjon00) from CC0 photos, more examples [here](https://github.com/thygate/stable-diffusion-webui-depthmap-script/pull/56#issuecomment-1367596463).
2222

23-
2423
## Install instructions
25-
The script is now also available to install from the `Available` subtab under the `Extensions` tab in the WebUI.
24+
The script can be installed directly from WebUI. Please navigate to `Extensions` tab, then click `Available`, `Load from` and then install the `Depth Maps` extension. Alternatively, the extension can be installed from URL: `https://github.com/thygate/stable-diffusion-webui-depthmap-script`.
2625

2726
### Updating
2827
In the WebUI, in the `Extensions` tab, in the `Installed` subtab, click `Check for Updates` and then `Apply and restart UI`.
2928

30-
### Automatic installation
31-
In the WebUI, in the `Extensions` tab, in the `Install from URL` subtab, enter this repository
32-
`https://github.com/thygate/stable-diffusion-webui-depthmap-script`
33-
and click install and restart.
34-
35-
>Model `weights` will be downloaded automatically on first use and saved to /models/midas, /models/leres and /models/pix2pix
29+
>Model weights will be downloaded automatically on their first use and saved to /models/midas, /models/leres and /models/pix2pix. Zoedepth models are stored in torch cache folder.
3630
3731

3832
## Usage
39-
Select the "DepthMap vX.X.X" script from the script selection box in either txt2img or img2img, or go to the Depth tab when using existing images.
33+
Select the "DepthMap" script from the script selection box in either txt2img or img2img, or go to the Depth tab when using existing images.
4034
![screenshot](options.png)
4135

42-
The models can `Compute on` GPU and CPU, use CPU if low on VRAM.
36+
The models can `Compute on` GPU and CPU, use CPU if low on VRAM.
4337

44-
There are seven models available from the `Model` dropdown. For the first model, res101, see [AdelaiDepth/LeReS](https://github.com/aim-uofa/AdelaiDepth/tree/main/LeReS) for more info. The others are the midas models: dpt_beit_large_512, dpt_beit_large_384, dpt_large_384, dpt_hybrid_384, midas_v21, and midas_v21_small. See the [MiDaS](https://github.com/isl-org/MiDaS) repository for more info. The newest dpt_beit_large_512 model was trained on a 512x512 dataset but is VERY VRAM hungry.
38+
There are ten models available from the `Model` dropdown. For the first model, res101, see [AdelaiDepth/LeReS](https://github.com/aim-uofa/AdelaiDepth/tree/main/LeReS) for more info. The others are the midas models: dpt_beit_large_512, dpt_beit_large_384, dpt_large_384, dpt_hybrid_384, midas_v21, and midas_v21_small. See the [MiDaS](https://github.com/isl-org/MiDaS) repository for more info. The newest dpt_beit_large_512 model was trained on a 512x512 dataset but is VERY VRAM hungry. The last three models are [ZoeDepth](https://github.com/isl-org/ZoeDepth) models.
4539

4640
Net size can be set with `net width` and `net height`, or will be the same as the input image when `Match input size` is enabled. There is a trade-off between structural consistency and high-frequency details with respect to net size (see [observations](https://github.com/compphoto/BoostingMonocularDepth#observations)).
4741

48-
`Boost` will enable multi-resolution merging as implemented by [BoostingMonocularDepth](https://github.com/compphoto/BoostingMonocularDepth) and will significantly improve the results. Mitigating the observations mentioned above. Net size is ignored when enabled. Best results with res101.
42+
`Boost` will enable multi-resolution merging as implemented by [BoostingMonocularDepth](https://github.com/compphoto/BoostingMonocularDepth) and will significantly improve the results, mitigating the observations mentioned above, at the cost of much longer compute time. Best results with res101.
4943

5044
`Clip and renormalize` allows for clipping the depthmap on the `near` and `far` side, the values in between will be renormalized to fit the available range. Set both values equal to get a b&w mask of a single depth plane at that value. This option works on the 16-bit depthmap and allows for 1000 steps to select the clip values.
5145

@@ -55,8 +49,6 @@ Regardless of global settings, `Save DepthMap` will always save the depthmap in
5549

5650
To see the generated output in the webui `Show DepthMap` should be enabled. When using Batch img2img this option should also be enabled.
5751

58-
To make the depthmap easier to analyze for human eyes, `Show HeatMap` shows an extra image in the WebUI that has a color gradient applied. It is not saved.
59-
6052
When `Combine into one image` is enabled, the depthmap will be combined with the original image, the orientation can be selected with `Combine axis`. When disabled, the depthmap will be saved as a 16 bit single channel PNG as opposed to a three channel (RGB), 8 bit per channel image when the option is enabled.
6153

6254
When either `Generate Stereo` or `Generate anaglyph` is enabled, a stereo image pair will be generated. `Divergence` sets the amount of 3D effect that is desired. `Balance between eyes` determines where the (inevitable) distortion from filling up gaps will end up, -1 Left, +1 Right, and 0 balanced.
@@ -78,17 +70,19 @@ If you often get out of memory errors when computing a depthmap on GPU while usi
7870
## FAQ
7971

8072
* `Can I use this on existing images ?`
81-
- Yes, you can now use the Depth tab to easily process existing images.
82-
- Yes, in img2img, set denoising strength to 0. This will effectively skip stable diffusion and use the input image. You will still have to set the correct size, and need to select `Crop and resize` instead of `Just resize` when the input image resolution does not match the set size perfectly.
83-
* `Can I run this on google colab ?`
73+
- Yes, you can use the Depth tab to easily process existing images.
74+
- Another way of doing this would be to use img2img with denoising strength set to 0. This will effectively skip stable diffusion and use the input image. You will still have to set the correct size, and need to select `Crop and resize` instead of `Just resize` when the input image resolution does not match the set size perfectly.
75+
* `Can I run this on Google Colab ?`
8476
- You can run the MiDaS network on their colab linked here https://pytorch.org/hub/intelisl_midas_v2/
8577
- You can run BoostingMonocularDepth on their colab linked here : https://colab.research.google.com/github/compphoto/BoostingMonocularDepth/blob/main/Boostmonoculardepth.ipynb
86-
87-
## Forks and Related
88-
89-
* Several scripts by [@Extraltodeus](https://github.com/Extraltodeus) using depth maps : https://github.com/Extraltodeus?tab=repositories
90-
91-
### More updates soon .. Feel free to comment and share in the discussions.
78+
- Running this program on Colab is not officially supported, but it may work. Please look for more suitable ways of running this.
79+
* `What other depth-related projects could I check out?`
80+
- Several [scripts](https://github.com/Extraltodeus?tab=repositories) by [@Extraltodeus](https://github.com/Extraltodeus) using depth maps.
81+
- Geo11 and [Depth3D](https://github.com/BlueSkyDefender/Depth3D) for playing existing games in 3D.
82+
* `How can I know what changed in the new version of the script?`
83+
- You can see the git history log or refer to the `CHANGELOG.md` file.
84+
85+
### Feel free to comment and share in the discussions!
9286

9387
## Acknowledgements
9488

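The `Clip and renormalize` option described in the README hunks above can be pictured with a small sketch. This is only an illustration under stated assumptions, not the extension's implementation, which this commit does not show. The helper name `clip_and_renormalize`, the parameters `low` and `high` (fractions of the 16-bit range), and the way `near` and `far` would map onto them are all hypothetical.

```python
MAX_16BIT = 65535  # largest value in a 16-bit depthmap


def clip_and_renormalize(depth, low, high, max_value=MAX_16BIT):
    """Clip depth values to [low, high] and stretch the rest to 0..max_value.

    Hypothetical helper: `low` and `high` are fractions of the full range.
    The names and the threshold convention are assumptions, not the
    extension's actual code.
    """
    lo, hi = sorted((round(low * max_value), round(high * max_value)))
    if lo == hi:
        # Equal thresholds: return a black-and-white threshold mask
        # (one possible reading of "a b&w mask of a single depth plane").
        return [max_value if d >= lo else 0 for d in depth]
    scale = max_value / (hi - lo)
    # Clamp each value into [lo, hi], then rescale to the full range.
    return [round((min(max(d, lo), hi) - lo) * scale) for d in depth]


if __name__ == "__main__":
    samples = [0, 16384, 32768, 49151, 65535]
    print(clip_and_renormalize(samples, 0.25, 0.75))
    print(clip_and_renormalize(samples, 0.5, 0.5))
```

Values below the lower threshold collapse to 0 and values above the upper threshold collapse to the maximum, so the range in between gets the full 16-bit resolution.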
src/common_ui.py

Lines changed: 4 additions & 3 deletions
@@ -222,9 +222,10 @@ def open_folder_action():
222222

223223

224224
def depthmap_mode_video(inp):
225-
gr.HTML(value="Single video mode allows generating videos from videos. Every frame of the video is processed, "
226-
"please adjust generation settings, so that generation is not too slow. For the best results, "
227-
"Use a zoedepth model, since they provide the highest level of temporal coherency.")
225+
gr.HTML(value="Single video mode allows generating videos from videos. Please "
226+
"keep in mind that all the frames of the video need to be processed - therefore it is important to "
227+
"pick settings so that the generation is not too slow. For the best results, "
228+
"use a zoedepth model, since they provide the highest level of coherency between frames.")
228229
inp += gr.File(elem_id='depthmap_vm_input', label="Video or animated file",
229230
file_count="single", interactive=True, type="file")
230231
inp += gr.Dropdown(elem_id="depthmap_vm_smoothening_mode", label="Smoothening", type="value", choices=['none'])
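
The message passed to `gr.HTML` is built from adjacent string literals. Python joins such literals with no separator, so each fragment has to carry its own trailing (or leading) space. A minimal sketch with shortened, hypothetical strings:

```python
# Adjacent string literals are concatenated at compile time, with nothing
# inserted between them.
without_space = ("Please"
                 "keep in mind")

# Ending the first fragment with a space keeps the words apart.
with_space = ("Please "
              "keep in mind")

if __name__ == "__main__":
    print(without_space)
    print(with_space)
```

Keeping the space at the end of each fragment, rather than at the start of the next one, makes a missing space easy to spot when scanning the right-hand edge of the lines.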

src/misc.py

Lines changed: 1 addition & 1 deletion
@@ -24,7 +24,7 @@ def call_git(dir):
2424

2525
REPOSITORY_NAME = "stable-diffusion-webui-depthmap-script"
2626
SCRIPT_NAME = "DepthMap"
27-
SCRIPT_VERSION = "v0.4.2"
27+
SCRIPT_VERSION = "v0.4.3"
2828
SCRIPT_FULL_NAME = f"{SCRIPT_NAME} {SCRIPT_VERSION} ({get_commit_hash()})"
2929

3030

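Since this commit bumps `SCRIPT_VERSION` and adds a matching changelog heading, the two can drift apart in later releases. The following is a sketch of a consistency check, assuming changelog headings keep the `### X.Y.Z ...` form seen in `CHANGELOG.md`. The helper names and the inlined excerpt are hypothetical and not part of the repository.

```python
import re

# Shortened excerpt in the shape of CHANGELOG.md after this commit.
CHANGELOG_EXCERPT = """## Changelog
### 0.4.3 video processing tab
* UI improvements.
### 0.4.2
* Added UI options for 2 additional rembg models.
"""

SCRIPT_VERSION = "v0.4.3"  # value set in src/misc.py by this commit


def latest_changelog_version(changelog_text):
    """Return the first 'X.Y.Z' found in a '### ' heading, or None."""
    match = re.search(r"^###\s+(\d+\.\d+\.\d+)", changelog_text, flags=re.MULTILINE)
    return match.group(1) if match else None


def versions_agree(script_version, changelog_text):
    """Compare the script version (minus a leading 'v') with the changelog."""
    bare = script_version[1:] if script_version.startswith("v") else script_version
    return bare == latest_changelog_version(changelog_text)


if __name__ == "__main__":
    print(versions_agree(SCRIPT_VERSION, CHANGELOG_EXCERPT))
```

A check like this could run before tagging a release; it relies on the newest entry staying at the top of the changelog.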
0 commit comments
