
Make frame selection from a video more intelligent #3408

Open
danie1ll opened this issue Sep 4, 2024 · 5 comments

Comments


danie1ll commented Sep 4, 2024

From my understanding, calling ns-process-data video currently just randomly samples --num-frames-target images from the video. This is suboptimal for two reasons:

  1. The random strategy is very naive; a key-frame selection algorithm based on some difference metric between images would likely produce a better set of frames.
  2. An option to fix the seed for the random sampling would make processing videos with nerfstudio much more reproducible (see the sketch after this list).
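
For point 2, here is a minimal sketch of what a seeded frame sampler behind a hypothetical `--random-seed` option could look like. This is not nerfstudio's actual implementation; the function name and flag are only illustrative:

```python
import random
from typing import List, Optional


def sample_frame_indices(total_frames: int, num_frames_target: int,
                         seed: Optional[int] = None) -> List[int]:
    """Pick num_frames_target distinct frame indices, reproducibly when a seed is given."""
    rng = random.Random(seed)  # seed=None keeps the current non-deterministic behaviour
    return sorted(rng.sample(range(total_frames), min(num_frames_target, total_frames)))
```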
@LucasArmand

A good metric could be frame blurriness. Here's an example from Stack Overflow that uses the variance of the Laplacian to quantify image blurriness and selects the least blurry frame from every second of video:
https://stackoverflow.com/questions/65949172/how-to-extract-clear-frames-from-video-file
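
For reference, a minimal sketch of that approach with OpenCV, assuming we score frames by the variance of the Laplacian and keep the sharpest frame from each one-second window (function names are illustrative, not part of ns-process-data):

```python
import cv2


def blur_score(frame) -> float:
    """Variance of the Laplacian; higher means sharper."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return cv2.Laplacian(gray, cv2.CV_64F).var()


def sharpest_frame_per_second(video_path: str):
    """Yield the sharpest frame from each one-second window of the video."""
    cap = cv2.VideoCapture(video_path)
    fps = int(round(cap.get(cv2.CAP_PROP_FPS))) or 30  # fall back if FPS metadata is missing
    best_frame, best_score, index = None, -1.0, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        score = blur_score(frame)
        if score > best_score:
            best_frame, best_score = frame, score
        index += 1
        if index % fps == 0:  # end of a one-second window
            yield best_frame
            best_frame, best_score = None, -1.0
    if best_frame is not None:  # trailing partial window
        yield best_frame
    cap.release()
```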


danie1ll commented Sep 9, 2024

@LucasArmand this looks very promising! It could also be combined with accumulated visual change between frames, so that we select not just the "least blurry image" every second but the most important frames: ones that are sufficiently different from each other and are not blurry. A rough sketch of that combination is below.
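A hypothetical sketch of combining the two heuristics: accumulate the mean absolute difference between consecutive frames and only keep a frame once enough visual change has built up and the frame is sharp enough. Thresholds and names here are made up for illustration and would need tuning:

```python
import cv2
import numpy as np


def select_keyframes(video_path: str, blur_threshold: float = 100.0,
                     change_threshold: float = 15.0):
    """Keep a frame only if enough visual change has accumulated and it is sharp."""
    cap = cv2.VideoCapture(video_path)
    keyframes, prev_gray, accumulated = [], None, 0.0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if prev_gray is not None:
            # Mean absolute pixel difference between consecutive frames
            # as a crude proxy for "visual change".
            accumulated += float(np.mean(cv2.absdiff(gray, prev_gray)))
        prev_gray = gray
        sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()
        if not keyframes or (accumulated >= change_threshold
                             and sharpness >= blur_threshold):
            keyframes.append(frame)
            accumulated = 0.0
    cap.release()
    return keyframes
```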

@LucasArmand

@danie1ll I wonder how far we can go in optimizing ns-process-data. When the source is a video, we usually select only a small percentage of the total frames, so there are many possible combinations of frames to choose for training. Some of these combinations would certainly produce higher-quality trained scenes than others. Heuristics like image blurriness and accumulated visual change will probably give us a better combination of frames for training, but is it possible to find the best combination?


msusag commented Sep 9, 2024

Perhaps the scripts at https://github.com/SharkWipf/nerf_dataset_preprocessing_helper could be integrated into ns-process-data video?

@Anthony-Tafoya (Contributor)

Whatever solution is chosen for "smarter" processing, I think there is room for a fixed-seed option. I am working on a PR to implement it.
