Speedup save_episode() by optimizing video encoding (⚡️ Performance)
#2350
base: main
Conversation
It is more efficient at I/O and exploits all CPU cores. Benchmark results: step_time: 0.73s, save_time: 18.58s, psnr: 40.51, ssim: 0.98, dataset_size_mb: 10.82, random_access_time: 12.64s, sequential_access_time: 10.22s.
Images are stored in a temporary folder, and SSDs are now really fast, so it's acceptable to trade image size for CPU. Benchmark results: step_time: 0.62s, save_time: 13.31s, psnr: 40.51, ssim: 0.98, dataset_size_mb: 10.82, random_access_time: 10.53s, sequential_access_time: 7.92s.
It makes the file a little bigger, but encoding is faster, dataloading actually improves, and quality is still good. Benchmark results: step_time: 0.75s, save_time: 11.39s, psnr: 39.83, ssim: 0.98, dataset_size_mb: 11.30, random_access_time: 11.91s, sequential_access_time: 8.82s.
Pull Request Overview
This PR refactors video encoding to use direct ffmpeg subprocess calls instead of PyAV bindings, removes an unused PIL import, updates logging statements to use %s formatting, and changes the default PNG compression level from 1 to 0 for faster image writing.
- Replaces PyAV-based video encoding with subprocess-based ffmpeg commands
- Removes unused PIL import and updates logging to use % formatting
- Changes default PNG compression level from 1 to 0
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/lerobot/datasets/video_utils.py | Refactored encode_video_frames to use subprocess ffmpeg commands instead of PyAV, removed unused PIL import, and updated logging statements to use % formatting |
| src/lerobot/datasets/image_writer.py | Changed default compress_level parameter from 1 to 0 in write_image function |
Comments suppressed due to low confidence (1)
src/lerobot/datasets/image_writer.py:83
- Documentation is outdated. The docstring states the default value is 1, but the parameter default has been changed to 0 on line 71. Update the docstring to reflect 'Defaults to 0.'
image, as used by PIL.Image.save(). Defaults to 1.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Loik Le Devehat <lokiledev@gmail.com>
Note: I tested the
What this does
Combines several optimizations to image writing and video encoding in order to reduce the time it takes to save an episode.
Directly call the ffmpeg command instead of using the PyAV wrapper.
This is the most impactful change: the current implementation loads each image sequentially, which is an I/O bottleneck, and then submits the frames to PyAV. The fix calls the ffmpeg command directly, delegating image loading to ffmpeg, which handles I/O and encoding efficiently. A minimal sketch follows.
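For illustration only, here is a rough sketch of the subprocess approach; the function name, frame filename pattern, and codec flags are assumptions for the example, not the PR's exact code.

```python
# Minimal sketch (not the PR's exact code): hand the whole encode job to ffmpeg
# so it reads the PNG frames itself instead of Python feeding frames to PyAV.
import subprocess
from pathlib import Path


def encode_video_frames_sketch(imgs_dir: Path, video_path: Path, fps: int) -> None:
    cmd = [
        "ffmpeg",
        "-y",                                    # overwrite output if it already exists
        "-framerate", str(fps),                  # input frame rate
        "-i", str(imgs_dir / "frame_%06d.png"),  # ffmpeg loads the frames directly from disk
        "-c:v", "libsvtav1",                     # AV1 via SVT-AV1 (assumes ffmpeg was built with it)
        "-pix_fmt", "yuv420p",
        str(video_path),
    ]
    subprocess.run(cmd, check=True, capture_output=True)
```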
Set image compression level to 0 (instead of 1)
This was already improved by refactor(datasets): add compress_level parameter to write_image() and set it to 1 (#2135).
Since SSDs are very fast now, it's quicker to dump uncompressed images to disk than to spend CPU time compressing them. I measured a shorter episode save time with this change. The tradeoff is using more temporary storage while recording an episode.
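As a rough illustration (the array shape and filename are made up for the example), disabling PNG compression in PIL looks like this:

```python
# Sketch: write a frame as PNG with zlib compression disabled.
import numpy as np
from PIL import Image

frame = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for a camera frame
# compress_level=0 skips zlib compression entirely: larger temporary files,
# but almost no CPU spent in the image writer.
Image.fromarray(frame).save("frame_000000.png", compress_level=0)
```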
Use the fastest encoder preset and enable the maximum level of parallelism in the AV1 encoder params.
Set preset=13 and lp=6. The tradeoff is that this potentially uses more RAM, but it exploits all available CPU cores during encoding.
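Assuming the encoder is SVT-AV1 exposed through ffmpeg's libsvtav1 wrapper (the PR mentions AV1 params; the exact flags below are my reading of preset=13 and lp=6, not copied from the diff), the extra encoder arguments would look roughly like this:

```python
# Illustrative encoder arguments; they would be spliced into the ffmpeg command
# from the earlier sketch.
av1_args = [
    "-c:v", "libsvtav1",
    "-preset", "13",           # 13 is the fastest SVT-AV1 preset
    "-svtav1-params", "lp=6",  # lp controls the encoder's level of parallelism (threads)
]
```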
How it was tested
Wrote an external benchmark that saves dummy actions/observations but uses real source images and measures the time spent in dataset.save_episode(), along with the metrics reported in the comments above (step time, save time, PSNR, SSIM, dataset size, random/sequential access times).
I didn't run the existing video benchmarks with this PR; if someone can double-check, that would be great.
I also wrote this to optimize offline processing, so it uses as many CPU cores as possible during encoding.
This might impact real-time recording.
However, the current design already defers encoding until after the end of the episode, so I consider it acceptable to use all available resources there to speed up encoding.