lokiledev commented Oct 31, 2025

What this does

Combines several optimizations to image writing and video encoding in order to reduce the time it takes to save an episode.

  1. Call the ffmpeg command directly instead of using the PyAV wrapper.
    This is the most impactful change. The current implementation loads each image sequentially, which is an I/O bottleneck, then submits frames to PyAV. The fix calls the ffmpeg command directly, delegating image loading to ffmpeg, which handles I/O and encoding efficiently.
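The idea can be sketched as follows: instead of decoding PNGs in Python and pushing frames through PyAV, hand ffmpeg an image-sequence input pattern and let it do all the reading and encoding in one process. This is a minimal illustration, not the PR's actual implementation; the frame-naming pattern `frame_%06d.png` is an assumption for the sketch.

```python
import subprocess
from pathlib import Path

def build_ffmpeg_cmd(imgs_dir: Path, out_path: Path, fps: int = 30) -> list[str]:
    """Build an ffmpeg command that reads the frames itself.

    No Python-side image loading happens: ffmpeg handles the sequential
    PNG reads, decoding, and encoding internally.
    """
    return [
        "ffmpeg",
        "-y",                                    # overwrite output if it exists
        "-framerate", str(fps),
        "-i", str(imgs_dir / "frame_%06d.png"),  # image-sequence input pattern (assumed naming)
        "-c:v", "libsvtav1",                     # SVT-AV1 encoder
        str(out_path),
    ]

cmd = build_ffmpeg_cmd(Path("/tmp/frames"), Path("/tmp/episode_0.mp4"))
# subprocess.run(cmd, check=True, capture_output=True)  # uncomment to actually encode
```

The key difference from the PyAV path is that the whole pipeline (I/O, decode, encode) stays inside a single ffmpeg process, so Python never touches the pixel data.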

  2. Set the image compression level to 0 (instead of 1).
    This builds on refactor(datasets): add compress_level parameter to write_image() and set it to 1 #2135.
    Since SSDs are really fast now, it's faster to dump uncompressed images to disk than to spend CPU time compressing them; I measured a shorter episode-save time with this change. The tradeoff is using more temporary storage while recording an episode.
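PNG's `compress_level` maps to zlib's compression level, and level 0 emits stored (uncompressed) blocks. A quick zlib comparison, using synthetic data rather than real image bytes, illustrates the CPU-vs-size tradeoff the PR is making:

```python
import time
import zlib

# ~1 MB of synthetic payload standing in for raw image bytes.
payload = bytes(range(256)) * 4096

for level in (0, 1):
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    print(f"level={level}: {len(compressed)} bytes in {elapsed * 1000:.2f} ms")

# Level 0 output is slightly LARGER than the input (stored-block framing
# adds a few bytes per block), but the compress call does essentially no
# work, so disk writes are no longer CPU-bound.
```

On a fast SSD the extra bytes cost less time to write than level-1 compression costs to compute, which is exactly the episode-save speedup measured above.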

  3. Use the fastest encoder preset and enable the maximum level of parallelism in the AV1 params.

  • Use preset=13. The tradeoffs are:
    • Less CPU is used to encode.
    • Decoding is faster, which is great for the dataloader: 11.26s -> 8.42s.
    • The file size is a little bigger: 9.93 MB -> 11.30 MB.
    • Compression quality is a bit lower: PSNR 40.40 -> 39.83.
  • Enable the maximum level of parallelism: lp=6.
    This potentially uses more RAM but exploits all available CPU cores during encoding.
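For reference, these settings would surface in the ffmpeg invocation roughly as below. This is a sketch based on ffmpeg's libsvtav1 encoder options (`-preset` and the `-svtav1-params` pass-through), not a copy of the PR's code; double-check the exact flag names against your ffmpeg build.

```python
def svt_av1_args(preset: int = 13, lp: int = 6) -> list[str]:
    """ffmpeg arguments for SVT-AV1 speed/parallelism settings.

    preset: 0 (slowest, best compression) .. 13 (fastest).
    lp: logical processors the encoder may use.
    Defaults mirror the values chosen in this PR.
    """
    return [
        "-c:v", "libsvtav1",
        "-preset", str(preset),         # fastest preset, per the PR
        "-svtav1-params", f"lp={lp}",   # colon-separated key=value pass-through options
    ]

args = svt_av1_args()
```

These arguments would be spliced into the ffmpeg command between the input and output paths.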

How it was tested

Wrote an external benchmark that saves dummy action/observation data with real source images and measures:

  • duration of dataset.save_episode()
  • image compression quality (psnr & ssim)
  • dataloader time (random access and sequential)
  • dataset size
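Of these metrics, PSNR is simple enough to show inline. The benchmark presumably uses a library implementation; this is just the standard formula, 10 · log10(MAX² / MSE), in pure Python:

```python
import math

def psnr(ref: list[int], test: list[int], max_val: int = 255) -> float:
    """Peak signal-to-noise ratio between two equal-length pixel sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_val**2 / mse)

# Two 4-pixel "images" differing by 1 in every pixel -> MSE = 1:
print(round(psnr([10, 20, 30, 40], [11, 21, 31, 41]), 2))  # → 48.13
```

The ~40 dB figures reported below correspond to a mean squared error of roughly 6 per 8-bit pixel value, i.e. visually near-lossless.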

I didn't run the existing video benchmarks with this PR; if someone could double-check, that would be great.

I also wrote this to optimize offline processing, so it uses as many CPU cores as possible during encoding.
It might impact real-time recording.
However, the current design already delays encoding until after the end of the episode, so I consider it acceptable to use all available resources there to speed up the encoding.

It is more efficient at doing I/O and exploiting all CPU cores.
Benchmark results:
step_time: 0.73s
save_time: 18.58s
psnr: 40.51
ssim: 0.98
dataset_size_mb: 10.82
random_access_time: 12.64s
sequential_access_time: 10.22s
Images are stored in a temporary folder, and SSDs are now really fast, so it's fine to trade image size for CPU time.

Benchmark results:

step_time: 0.62s
save_time: 13.31s
psnr: 40.51
ssim: 0.98
dataset_size_mb: 10.82
random_access_time: 10.53s
sequential_access_time: 7.92s
It makes the file a little bigger, but encoding is a bit faster, dataloading actually improves, and the quality is still good.

Benchmark result:
step_time: 0.75s
save_time: 11.39s
psnr: 39.83
ssim: 0.98
dataset_size_mb: 11.30
random_access_time: 11.91s
sequential_access_time: 8.82s
Copilot AI review requested due to automatic review settings October 31, 2025 13:23

Copilot AI (Contributor) left a comment
Pull Request Overview

This PR refactors video encoding to use direct ffmpeg subprocess calls instead of PyAV bindings, removes an unused PIL import, updates logging statements to use %s formatting, and changes the default PNG compression level from 1 to 0 for faster image writing.

  • Replaces PyAV-based video encoding with subprocess-based ffmpeg commands
  • Removes unused PIL import and updates logging to use % formatting
  • Changes default PNG compression level from 1 to 0

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Reviewed files:

  • src/lerobot/datasets/video_utils.py — Refactored encode_video_frames to use subprocess ffmpeg commands instead of PyAV, removed the unused PIL import, and updated logging statements to use % formatting.
  • src/lerobot/datasets/image_writer.py — Changed the default compress_level parameter from 1 to 0 in the write_image function.
Comments suppressed due to low confidence (1)

src/lerobot/datasets/image_writer.py:83

  • Documentation is outdated. The docstring states the default value is 1, but the parameter default has been changed to 0 on line 71. Update the docstring to read "Defaults to 0."
            image, as used by PIL.Image.save(). Defaults to 1.


lokiledev and others added 2 commits October 31, 2025 14:27
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Loik Le Devehat <lokiledev@gmail.com>
lokiledev (Author) commented:
Note: I tested the fast-decode option but didn't find any speedup in the dataloader (neither random access nor sequential).
It could be removed from the API.
