Skip to content

Commit

Permalink
[docs] VideoProcessor (huggingface#7965)
Browse files Browse the repository at this point in the history
* fix?

* fix?

* fix
  • Loading branch information
stevhliu authored May 21, 2024
1 parent 6529ee6 commit fdb1baa
Show file tree
Hide file tree
Showing 2 changed files with 18 additions and 10 deletions.
8 changes: 7 additions & 1 deletion docs/source/en/api/video_processor.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,4 +12,10 @@ specific language governing permissions and limitations under the License.

# Video Processor

The `VideoProcessor` provides a unified API for video pipelines to prepare inputs for VAE encoding and post-processing outputs once they're decoded. The class inherits [`VaeImageProcessor`] so it includes transformations such as resizing, normalization, and conversion between PIL Image, PyTorch, and NumPy arrays.
The [`VideoProcessor`] provides a unified API for video pipelines to prepare inputs for VAE encoding and post-processing outputs once they're decoded. The class inherits [`VaeImageProcessor`] so it includes transformations such as resizing, normalization, and conversion between PIL Image, PyTorch, and NumPy arrays.

## VideoProcessor

[[autodoc]] video_processor.VideoProcessor.preprocess_video

[[autodoc]] video_processor.VideoProcessor.postprocess_video
20 changes: 11 additions & 9 deletions src/diffusers/video_processor.py
Original file line number Diff line number Diff line change
Expand Up @@ -30,17 +30,19 @@ def preprocess_video(self, video, height: Optional[int] = None, width: Optional[
Preprocesses input video(s).
Args:
video: The input video. It can be one of the following:
video (`List[PIL.Image]`, `List[List[PIL.Image]]`, `torch.Tensor`, `np.array`, `List[torch.Tensor]`, `List[np.array]`):
The input video. It can be one of the following:
* List of the PIL images.
* List of list of PIL images.
* 4D Torch tensors (expected shape for each tensor: (num_frames, num_channels, height, width)).
* 4D NumPy arrays (expected shape for each array: (num_frames, height, width, num_channels)).
* List of 4D Torch tensors (expected shape for each tensor: (num_frames, num_channels, height, width)).
* List of 4D NumPy arrays (expected shape for each array: (num_frames, height, width, num_channels)).
* 5D NumPy arrays: expected shape for each array: (batch_size, num_frames, height, width,
num_channels).
* 5D Torch tensors: expected shape for each array: (batch_size, num_frames, num_channels, height,
width).
* 4D Torch tensors (expected shape for each tensor `(num_frames, num_channels, height, width)`).
* 4D NumPy arrays (expected shape for each array `(num_frames, height, width, num_channels)`).
* List of 4D Torch tensors (expected shape for each tensor `(num_frames, num_channels, height,
width)`).
* List of 4D NumPy arrays (expected shape for each array `(num_frames, height, width, num_channels)`).
* 5D NumPy arrays: expected shape for each array `(batch_size, num_frames, height, width,
num_channels)`.
* 5D Torch tensors: expected shape for each array `(batch_size, num_frames, num_channels, height,
width)`.
height (`int`, *optional*, defaults to `None`):
The height in preprocessed frames of the video. If `None`, will use the `get_default_height_width()` to
get default height.
Expand Down

0 comments on commit fdb1baa

Please sign in to comment.