# AI Masks #12295
After the AI is implemented by someone (that's the less complex part of the job), who should do the training? You're aware that for each module in the pipe the training must be done, since the input values aren't identical, depending on the preceding module.
Hi Martin,
Thanks for replying to my feature request. Regarding your question(s), I see this as follows: the AI should interpret the image in the state it is in (the input state of the module at hand) and then try to find the indicated object in that image. Another option could be that the AI only works with the image at the beginning of the pipeline, if that is easier. I'm not an AI expert, but for a trained AI it shouldn't matter in which module this is done: an AI that recognizes a cat will find that animal in many images and states of images, just like a human being can. Of course success is not always guaranteed, but that's expected. If the AI manages to find the requested object, it can create a mask. On second thought (I implied otherwise in my request), a raster mask that can be used in other modules may be the most logical choice, with the option for the user to combine it with a mask drawn by the user (this would be new functionality as well, but would be cool).
As far as training the AI is concerned: pre-trained libraries for object recognition seem to exist (see for instance https://stackabuse.com/object-detection-with-imageai-in-python/), which could be a good starting point (see the sketch after this message). Along with that, the AI could be improved/trained further from the drawn masks users create to combine with the masks created by the AI. It would be cool if the user could optionally export the locally trained model and send it to the darktable team, so that it can be combined with the contributions of other users to further optimize the AI. This optimized model could then be distributed to users again as an add-on (and/or incorporated in a next version of the product). This way the whole community can contribute to improving the AI.
I hope this helps
Best regards
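As an illustration of the kind of pre-trained detection library referenced above, here is a rough sketch using ImageAI's v2.x API as shown in the linked article. The model file name and image paths are assumptions, and note that detection yields bounding boxes, not masks, so a segmentation step would still be needed on top:

```python
# Rough sketch of pre-trained object detection with ImageAI (v2.x API, as
# in the linked article). Model file and image paths are assumptions.
from imageai.Detection import ObjectDetection

detector = ObjectDetection()
detector.setModelTypeAsRetinaNet()
detector.setModelPath("resnet50_coco_best_v2.0.1.h5")  # pre-trained RetinaNet weights
detector.loadModel()

detections = detector.detectObjectsFromImage(
    input_image="photo.jpg",
    output_image_path="photo_annotated.jpg",
)
for d in detections:
    # each detection is a dict with the object class, confidence, and box
    print(d["name"], d["percentage_probability"], d["box_points"])
```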
I have been thinking a lot about such a feature recently, since I very successfully used [rembg](https://github.com/danielgatis/rembg) to save me days of work, with better results than I was ever able to achieve with regular masks (examples at the end).
I could imagine such a tool implemented as a module that is essentially a no-op for the pipeline but provides a raster mask for later modules. It would come early in the pipeline, maybe just before exposure, and bring its own stripped-down, fixed image pipeline to prepare the image for the AI: essentially exposure and a generic base curve, to ensure the image data is in a condition similar to the training data of the AI.
As a first step, for the AI itself, the rembg approach seems very handy and, very importantly, comparatively fast: even for my 30 MP, 48 bpp TIFF images it typically runs in less than 1 s, though I have no exact timing. Most of that may be reading and writing the TIFF files anyway, which would not be required in the darktable case, as rembg would be used as a library and the data is already in memory.
At least it would be a starting point, and improved networks and more controls could be added later.
# Examples:
The task was to take headshots of my son's soccer team in front of a neutral background. The light was not ideal: harsh afternoon summer sun, no chance to overpower it with the flashes I own. But it was the only possible date, just before training, to avoid red faces.
This is the best I was able to achieve in darktable two years ago; the GIMP tools for semi-automatic foreground extraction were worse, but even here you can see the yellow cast on the background. This took me at least an hour per photograph, plus the usual editing, and I did not even have the option of background replacement: I had to accept the white.
![grafik](https://user-images.githubusercontent.com/16524534/183742194-25903cea-f348-4ddf-ae58-cde0dbf8038a.png)
With rembg, this year, it took one hour for all 20+ players. I know it is blown out, as the lighting conditions were even worse this time, but this is about the background removal, which is IMO much better. It was a dark background this time, btw.
![grafik](https://user-images.githubusercontent.com/16524534/183741864-f4efdcbd-554d-4b42-898b-46ceb859d252.png)
It is not perfect, but given that it took less than a second with no parameters at all, and that when I tried again by myself this year I was not able to complete one image in several hours (which eventually led to my choice to try rembg), having such a raster mask in darktable would be incredibly useful.
Btw, it also worked very well for dark hair in front of the dark background, and also for dark skin color, and the background was by no means flat and even. It only failed to detect a hole in one case, where a little gap between arm and body was not recognized properly; in several similar cases it recognized the same hole. With a combined painted mask this would have been very easy to fix, but I only needed the head and shoulders from this image.
Thanks for your message. I agree that creating a special module at the beginning of the pipeline, whose raster mask could be used in subsequent modules, would be a good option and probably makes the most sense. What is important, though, is that the user be allowed to modify the raster mask created by the AI within the AI module, by combining it with a drawn mask (in case the mask is not perfect). Your experience with AI shows, in my opinion, that this is doable, although maybe not easy. It would be a logical next step in improving masks, so let's hope this gets approved for a future release!
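As an aside, the "use rembg as a library on in-memory data" idea described above could look roughly like this. This is only a sketch: `foreground_mask` is a hypothetical helper, and the float-buffer convention is an assumption about what the pipeline would hand over:

```python
# Minimal sketch: calling rembg as a library on an in-memory buffer,
# avoiding the TIFF read/write round trip mentioned above.
import numpy as np
from PIL import Image
from rembg import new_session, remove

session = new_session("u2net")  # load the network once, reuse across images

def foreground_mask(rgb: np.ndarray) -> np.ndarray:
    """rgb: HxWx3 float array in [0, 1]; returns an HxW uint8 alpha mask."""
    img = Image.fromarray((np.clip(rgb, 0.0, 1.0) * 255).astype(np.uint8))
    mask = remove(img, session=session, only_mask=True)  # grayscale PIL image
    return np.asarray(mask)
```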
I was experimenting with non-AI approaches like segmentation algorithms; they can do some sort of content detection and could be used as a preparation step for a drawn mask …
Or as a preparation step for the AI solution … 😉 Seriously, I am really impressed by what this approach (AI, in particular rembg) can achieve. Of course it fails in some cases as well, but the net time saving on cumbersome tasks is incredible. In particular, the results are sometimes excellent in scenarios where it is hard to believe a "classical" algorithm would work at all. An example I had was one of the soccer team's members, who has very dark skin and almost black hair, in front of a very dark background; the AI was able to generate a perfect mask. A classical algorithm based on local features (differences in tone, contrast, color, etc.) might have failed miserably.
There are already parametric masks and mask options to manage complex areas 🤔
I think the beta launch of Photoshop's generative fill has amplified the opportunity for using AI in darktable (even if via user API credentials to a cloud service: https://www.adobe.com/za/products/photoshop/generative-fill.html). While masking can show skill, it should be secondary to the photographic and artistic effort.
I'd like to point to Segment Anything; it's open source and very impressive (you can try it with your own images on their website). Such a solution should not be completely out of reach to implement and would drastically improve my workflow. I imagine a workflow along the lines of the sketch below.
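A minimal sketch of Segment Anything's point-prompt usage (the checkpoint file name is the published ViT-H weights file; the image path and click coordinates are made up for illustration):

```python
# Minimal sketch of the point-prompt workflow with Segment Anything
# (https://github.com/facebookresearch/segment-anything).
import numpy as np
from PIL import Image
from segment_anything import SamPredictor, sam_model_registry

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = np.asarray(Image.open("photo.jpg").convert("RGB"))
predictor.set_image(image)  # heavy embedding step, done once per image

# one positive click on the object of interest
masks, scores, _ = predictor.predict(
    point_coords=np.array([[1200, 800]]),
    point_labels=np.array([1]),  # 1 = foreground point, 0 = background point
    multimask_output=True,
)
best = masks[np.argmax(scores)]  # HxW boolean raster mask
```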
I did give it a try. Yes, impressive! Although the "send it somewhere and get back a result" workflow doesn't seem good to me: we would depend on that service being provided forever. We would prefer to use a git submodule and run locally, I think.
I'm still interested in this :)
You are not alone. Me too :)
I am interested in this feature as well. However, instead of hooking up a particular model, how about opening up an interface for customized scripts? For example, when I need a mask and I intend to run Segment Anything, all that is needed is (1) the original image and (2) a sample coordinate that the user selects. darktable would then call an external program (say, a command line that the user specified) and get back a binary mask. I think that would make the "AI mask" feature more flexible and also make the implementation much easier. A sketch of such a bridge follows.
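A rough sketch of such a bridge, under the assumption that images and masks are exchanged as files; the command template and the helper name are invented for illustration, not a darktable API:

```python
# Sketch of the caller side of the proposed interface: hand a user-defined
# external tool an image path plus a click coordinate, read a mask back.
import os
import subprocess
import tempfile

import numpy as np
from PIL import Image

def external_mask(image_path: str, x: int, y: int, command_template: str) -> np.ndarray:
    """Run a user-specified command and read back a binary mask (HxW bool)."""
    fd, mask_path = tempfile.mkstemp(suffix=".png")
    os.close(fd)
    try:
        # e.g. command_template = "python sam_mask.py {image} {x} {y} {mask}"
        cmd = command_template.format(image=image_path, x=x, y=y, mask=mask_path)
        subprocess.run(cmd, shell=True, check=True)
        mask = np.asarray(Image.open(mask_path).convert("L")) > 127
    finally:
        os.remove(mask_path)
    return mask
```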
It's obvious that AI-generated masks are quite useful, but the request doesn't need more users to support it; it needs someone to bite the bullet and contribute code …
Is there any infrastructure for running models like SAM2 in darktable yet? Or would you prefer this to be a Lua script, with the user having to figure out themselves how to get Python up and running, etc.?
Not that I'm aware of. A Lua script is probably best: that way we can work out the bugs and mechanics without being tied to a release schedule. Once we understand it, we would have a better idea of how to incorporate it into darktable.
If running something in Python is a requirement, then users will probably have to figure that out anyway; I don't see embedding Python in darktable as something that will happen. This will probably require an API change to get the mask imported and made available for use. I can do the API change and help with the script if you need it.
Yes, I expect most of these "AI models" would require Python to run, because that's the easiest way to do it today. However, that means a lot more dependencies (e.g., packages to install). So don't try to solve the user's problem: just define the interface (i.e., how to send the current image and mouse position from darktable, and whether the mask should be read back as XML or JSON) and provide a generic way of calling an external program (sketched below).
Yes, please. I tried to read the code, but it would be a time-saver if you could point to the most relevant functions.
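As a rough illustration of such a generic interface, the external program's side could look like the following. Every field name and path here is invented, not an existing darktable protocol, and `run_model` is a stub standing in for SAM, rembg, or any other model:

```python
#!/usr/bin/env python3
# Hypothetical external side of the proposed interface: read a JSON request
# with the image path and click position, write a PNG mask, answer with its
# path. All field names are made up for illustration.
import json
import sys

import numpy as np
from PIL import Image

def run_model(img: np.ndarray, x: int, y: int) -> np.ndarray:
    # Placeholder: a real script would call SAM, rembg, etc. here. This stub
    # returns a box around the click, just to test the plumbing.
    mask = np.zeros(img.shape[:2], dtype=bool)
    mask[max(0, y - 50):y + 50, max(0, x - 50):x + 50] = True
    return mask

request = json.load(sys.stdin)  # e.g. {"image": "/tmp/dt_in.png", "x": 1200, "y": 800}
img = np.asarray(Image.open(request["image"]).convert("RGB"))
mask = run_model(img, request["x"], request["y"])

out_path = request["image"] + ".mask.png"
Image.fromarray(mask.astype(np.uint8) * 255).save(out_path)
json.dump({"mask": out_path}, sys.stdout)
```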
@righthandabacus are you experienced with writing Lua scripts for darktable? I haven't done that yet, but would be able to review or follow up on a piece of code.
@fabiangebert I know some Lua, but not on darktable so far. I think the pain point right now is implementing the C-Lua interface for the masking functions.
For me, the first point would be a design decision: what would be the best way to integrate any "content-controlled" masks into the pixelpipe? As I see it, it could be an addition to the masks as we have them. Any module supporting masks could …
Also, bear with me, but I still think AI masks are not that important. I see people using them to select an object or a person, but applying an effect through a sharp mask is terrible, so at the very least we need to do better. To me, an interface to an AI mask tool should run on an image and return a set of named masks, which should be populated into the mask manager. I also think the AI mask button should be in the mask manager. From there, users should be able to pick a mask and use it in any module.
I think the mask would allow feathering etc. in the module it's used in (so not a "sharp mask")? But I 100% agree that the mask should exist in the mask manager and not in an originating module. I was actually going to request this alone as a feature for all masks: I find it incredibly irritating to go back in the history stack to a previous action and lose masks. That is, mask definitions should not be part of the history; application of masks should be.
Is your feature request related to a problem? Please describe.
Yes: the problem of creating a good mask for complex areas (i.e., when it is very difficult or labour-intensive to create one).
Describe the solution you'd like
Typing (or choosing from presets) the area you would like to select (e.g. 'sky' or 'trees').
An AI should then create a parametric, a drawn, or a parametric + drawn mask in the module at hand. This mask can then, if necessary, be fine-tuned by the user.
Alternatives
None
Additional context
This would speed up the creation of masks tremendously