Adding tensorflow.js depth-estimation #248
Conversation
Also changed the mention of this in the examples.
Removed console logs; the comments are clear enough without them. Also renamed the examples' <title> tags to match the ml5.js format.
Wanted to add a to-do list of tasks that I'll try to work on for this PR. Please let me know if there are suggestions!
Removed the depth estimation tensor from the result object so we could handle disposing of it internally. Also tested ml5.tf.memory() on the current code and found a memory leak, which ended up being due to some segmentation tensors not being disposed. I replaced the disposal code being used here with the one used in the official tensorflow examples, which fixed the leak.
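For reference, a minimal sketch of that kind of leak check, assuming the tfjs depth-estimation DepthMap API (`estimateDepth()` returning an object with `toTensor()`); the surrounding names are illustrative, not the actual ml5.js internals:

```js
import * as tf from '@tensorflow/tfjs';

async function estimateOnce(estimator, video) {
  const before = tf.memory().numTensors;

  // tf.tidy() can't wrap async work, so intermediate tensors
  // have to be disposed explicitly once we're done with them.
  const depthMap = await estimator.estimateDepth(video);
  const depthTensor = await depthMap.toTensor();
  const values = await depthTensor.data();
  depthTensor.dispose();

  // If this delta keeps growing across calls, something is leaking.
  console.log(tf.memory().numTensors - before, 'tensors left behind');
  return values;
}
```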
Added the dilation algorithm to the library. The level of dilation is controlled by the config option 'dilationFactor', which takes values between 0 and 10, corresponding to the number of pixels to grow the background into the silhouette. Larger dilation factors lower the fps, since they need longer loops to search for the silhouette bounds. Also made the mask available as a p5.Image in the result, under the name 'mask'. This mask is compatible with the p5 mask() function, so it is easy to use it to cut the profile out of the background. Lastly, optimized the helper function that turns an ImageData into a p5.Image by replacing per-pixel set() calls with a single copy of the imageData.data array into the pixels array (see the sketch below).
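A rough sketch of that last optimization, assuming a standard ImageData object and p5's instance-mode pixel API:

```js
// Convert an ImageData into a p5.Image by bulk-copying the RGBA
// buffer instead of calling set() once per pixel.
function imageDataToP5Image(imageData, p5Instance) {
  const img = p5Instance.createImage(imageData.width, imageData.height);
  img.loadPixels();
  // Both sides are flat RGBA arrays of the same length,
  // so a single typed-array copy replaces the per-pixel loop.
  img.pixels.set(imageData.data);
  img.updatePixels();
  return img;
}
```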
The first two examples aim to be starting points for using the model. One is simply a webcam depth estimation without any interface. The other is the same but uses the mask to clear out the background. Also made applying the segmentation mask the default for the model, since it performs much better with it.
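For illustration, cutting a frame out of the background with the mask could look something like this minimal sketch, assuming result.mask is the p5.Image described above:

```js
function gotDepth(result) {
  const frame = video.get();   // copy the current webcam frame
  frame.mask(result.mask);     // keep only pixels where the mask is opaque
  background(220);
  image(frame, 0, 0);
}
```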
Hi @nasif-co, I took a look at your latest updates and reviewed the examples, amazing work! A few quick questions / thoughts.
Is this an issue only if you resize the video during a sketch, or if you just call …?

The new examples are fantastic!
It happens by calling …
Yes, you are right, it confused me a bit at first. Now that I'm looking into it, changing it to be the other way around is a small change that would help keep consistency with transformers.js, looking ahead to integrating it. I'll commit that small change.
Yes! I was looking at including some of those next. I had a sketch I made a few months ago for class using transformers.js which sounds like what you're describing, so I'll port that one to ml5. Do you think we should just do the one? I was thinking of also adding one that builds a 3D mesh using the depth map, but I don't know if that becomes more tutorial territory than example. I was also planning on adding an example that showcases how to "detect" distance, so that different interactions can occur depending on how close or far a subject is: something like a chain of if/else statements, each with a different interaction (see the sketch below). I suspect this would be a common use case of the depth estimation model.
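That chain of if/else statements might look something like this hypothetical sketch, assuming getDepthAt() returns a normalized value where higher means closer (matching the lighter-is-closer convention adopted later in this PR):

```js
function draw() {
  image(video, 0, 0);
  // Sample the depth at the center of the frame.
  const d = result.getDepthAt(width / 2, height / 2);

  if (d > 0.8) {
    text('Very close: trigger the close-up interaction', 10, 20);
  } else if (d > 0.4) {
    text('Mid range: default interaction', 10, 20);
  } else {
    text('Far away: ambient interaction', 10, 20);
  }
}
```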
I agree, I'll add something simple in the back, maybe a background color shifting in hue, or just an image. Thanks for the detailed review! :)
To match transformers.js: lighter pixels are closer to the camera, darker are farther from it.
To help visualize what using the mask together with the depthMap does.
Interestingly, the bug also affects the body segmentation module when using …. On the other hand, it makes some sense, since the …. Going forward, I think the best solution is rendering the video pixels to a separate canvas/p5.Graphics in ml5 and passing that as the …
The bug was due to estimation being done on the source element's intrinsic dimensions rather than the display dimensions set by the user, leading to an unexpected output. We needed to resize the media given by the user before passing it to the models. After some discussion on the Discord, I opted for resizing the input media through tensorflow.js's own methods. I think this might be more performant than resizing the image on a canvas, but I didn't test them side by side to corroborate.
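A minimal sketch of that approach, assuming the user-set display size is passed in as width/height; the function body here is illustrative, not the actual ml5.js implementation:

```js
import * as tf from '@tensorflow/tfjs';

function resizeImageAsTensor(media, width, height) {
  return tf.tidy(() => {
    // Read the media at its intrinsic resolution...
    const pixels = tf.browser.fromPixels(media);
    // ...then resize on the GPU to the size the user set.
    return tf.image.resizeBilinear(pixels, [height, width]);
  });
}
```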
Simplified existing examples and aligned them with the changes in config defaults.
Converted console logs to comments.
Realized the mask and dilation were not being applied to the data array, and therefore not to the getDepthAt() method. Fixed it for consistency.
Made the code a little simpler.
Since we already have a webcam video example, it felt redundant for the depthEstimation-video example to also use the webcam. So I modified it to instead showcase how to run depthEstimation on a video file.
The depth estimation result now includes the exact input frame that was used to generate the returned estimation. This is useful for aligning the image with the estimation, especially if the model is running at a lower fps than the source video (which is most often the case).
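In use, that could look something like this minimal sketch (the result property names here are hypothetical, since the PR doesn't spell them out):

```js
function gotDepth(result) {
  // Draw the exact frame the estimation was computed from, not the
  // live video, so image and depth map stay in sync even when the
  // model runs slower than the video.
  image(result.image, 0, 0);    // hypothetical property name
  image(result.depthMap, 0, 0); // e.g. overlaid with tint() or blend()
}
```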
Shows how to use the depth estimation result together with p5.js 3D geometry tools to build a live mesh of the webcam video.
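A rough sketch of the mesh idea, again assuming a normalized getDepthAt() and a WEBGL canvas; the scale factor is arbitrary:

```js
function draw() {
  background(0);
  const step = 8; // sample every 8th pixel to keep the mesh light
  for (let y = 0; y < height - step; y += step) {
    beginShape(TRIANGLE_STRIP);
    for (let x = 0; x < width; x += step) {
      // Push each vertex toward the camera according to its depth.
      vertex(x - width / 2, y - height / 2, result.getDepthAt(x, y) * 200);
      vertex(x - width / 2, y + step - height / 2, result.getDepthAt(x, y + step) * 200);
    }
    endShape();
  }
}
```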
Force-pushed from 48186f8 to 04981a1.
Replaced the code fixing the size-mismatch bug with the new function designed for that: resizeImageAsTensor.
Force-pushed from 04981a1 to 085439b.
Updated the examples to add p5 2.0 versions. Interestingly, the point cloud example had a performance drop, while the mesh example had a great performance boost. Looking into processing/p5.js#6438, it must be related to the mesh example making use of …
Incredible work, thank you to @alanvww for getting this started and @nasif-co for completing it! This feature likely won't be released until early September, so we have time to do additional testing for bugs, as well as tweak or alter any of the examples if other contributors have comments. But I'd like to merge this today to mark the end of the summer research period! Happy August! 💜
Hello! This PR adds the depth estimation functionality from tensorflow.js to ml5.
I primarily referred to this example for its performance and results.
Testing sketches:
depthEstimation-video
depthEstimation-single-image
grayscale
colormap as default