Orientation metadata for VideoFrame #351

Closed
@sandersdan

The first attempt at orientation metadata was paused due to a lack of agreement about the best way to represent it. This has been partly resolved now that a representation for color space (#47) has been selected.

Background:

  • VideoFrames have rotation and flip properties (together "orientation") that describe the transformation from the raw pixels produced by readTo() to the intended rendering.
  • Most implementations restrict this to the four 90-degree rotations plus a flipY/mirrored flag, or equivalently to the EXIF orientation tag, which encodes a rotation and any flip in a single number. Some implementations don't consider flips at all (flips are rare except as a hardcoded parameter of texture upload).
  • We already decided to account for orientation in the CanvasImageSource representation (createImageBitmap(videoFrame) Semantics #159). Exposing these properties remains necessary for applications to correctly interpret readTo() and to configure new VideoFrame(BufferSource) frames. (In theory the implementation could reorient the readTo() data to avoid the problem, but Chrome's implementation does not currently do so.)
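  • Separating rotation from flip metadata can create confusion because the operations are not commutative: rotating and then flipping gives a different result from flipping and then rotating. EXIF orientation, by comparison, is unambiguous but less widely familiar.
  • There is also ambiguity about whether coded size, visibleRect, and display size describe the frame before or after the orientation is applied.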
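
To illustrate the non-commutativity point, here is a minimal sketch on a 2x2 grid of stand-in pixel values. It calls no WebCodecs API; `Grid`, `rotate90`, and `flipHorizontal` are hypothetical helpers written only for this example, and "rotate" is assumed to mean 90 degrees clockwise.

```typescript
// A tiny row-major pixel grid; the numbers stand in for pixel values.
type Grid = number[][];

// Rotate 90 degrees clockwise: the left column becomes the top row.
function rotate90(g: Grid): Grid {
  const h = g.length;
  const w = h > 0 ? g[0].length : 0;
  const out: Grid = [];
  for (let r = 0; r < w; r++) {
    const row: number[] = [];
    for (let c = 0; c < h; c++) {
      row.push(g[h - 1 - c][r]);
    }
    out.push(row);
  }
  return out;
}

// Mirror horizontally: reverse every row.
function flipHorizontal(g: Grid): Grid {
  return g.map((row) => [...row].reverse());
}

const pixels: Grid = [
  [1, 2],
  [3, 4],
];

// Same two operations, opposite order.
const rotateThenFlip = flipHorizontal(rotate90(pixels)); // [[1, 3], [2, 4]]
const flipThenRotate = rotate90(flipHorizontal(pixels)); // [[4, 2], [3, 1]]
```

The two results differ, so a pair of separate rotation and flip fields is only well defined if the spec also fixes the order in which they apply. A single orientation value avoids that extra rule.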

Open questions:

  • Should we separate rotation from flip metadata? Provide both representations?
    • I lean toward orientation only to minimize ambiguity, but recognize that this could introduce translation errors.
  • EXIF orientations do not have names, just numbers and descriptions, so if we use orientations we would need to create our own names.
    • Internally Chrome names these like kOriginTopLeft, kOriginLeftTop, ..., which seems like a reasonable approach (perhaps we could use, e.g., origin = "left-top"). The naming is ambiguous about row vs. column ordering, but errors there are likely to be obvious.
  • How to define sizing.
    • I think coded size must describe the unrotated data, and it follows that visibleRect should do the same. Uses of CanvasImageSource care only about the display size, so I think that should be the rotated version. (Note: this would require a tweak to our definition of the cropping constructor.)
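
As a rough sketch of how the naming and sizing proposals above could fit together: the string names below are hypothetical (modeled on the origin = "left-top" idea, not on any shipped API), and the ordering follows the EXIF orientation values 1 through 8, where the first word says where row 0 of the stored data appears and the second where column 0 appears. `originFromExif`, `swapsAxes`, and `displaySize` are illustrative helpers, and pixel aspect ratio is ignored.

```typescript
// Hypothetical names for EXIF orientation values 1..8, in order.
const ORIGINS = [
  "top-left",     // 1: no transformation
  "top-right",    // 2
  "bottom-right", // 3
  "bottom-left",  // 4
  "left-top",     // 5
  "right-top",    // 6
  "right-bottom", // 7
  "left-bottom",  // 8
] as const;

type Origin = (typeof ORIGINS)[number];

interface Size {
  width: number;
  height: number;
}

// Translate an EXIF orientation number into a named origin.
function originFromExif(value: number): Origin {
  if (!Number.isInteger(value) || value < 1 || value > 8) {
    throw new RangeError(`EXIF orientation must be 1..8, got ${value}`);
  }
  return ORIGINS[value - 1];
}

// Origins whose first word is a side ("left"/"right") exchange rows and columns.
function swapsAxes(origin: Origin): boolean {
  return origin.startsWith("left-") || origin.startsWith("right-");
}

// Coded size and visibleRect stay unrotated; only the display size is reoriented.
function displaySize(visible: Size, origin: Origin): Size {
  return swapsAxes(origin)
    ? { width: visible.height, height: visible.width }
    : { width: visible.width, height: visible.height };
}

const landscape: Size = { width: 640, height: 480 };
const rotated = displaySize(landscape, originFromExif(6)); // 480 x 640
const upsideDown = displaySize(landscape, originFromExif(3)); // 640 x 480
```

Under this reading, code that reads raw pixels would keep working in coded and visibleRect coordinates, while CanvasImageSource consumers would only ever see the reoriented display size.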

Labels: TPAC2024 (For discussion at TPAC 2024), extension (Interface changes that extend without breaking), p1