Metal implementation is very slow with ACES 2.0 view transform #2116

@cozdas

Description

We hit a performance issue with ACES 2.0 output transforms when using a Metal shader. The Metal implementation is very slow, especially on Apple silicon, and the performance hit increases with input size. A preliminary analysis points to the 362-element const float array that the transform uses as the main culprit. Since the OCIO Metal shader implementation uses a "wrapper" struct to encapsulate the functions and data, this array ends up being a member of the encapsulating struct. It looks like the array is then created many times at run time, thrashing the L1 cache (90% miss rate). See ocio_gamut_cusp_table_0_hues_array, defined on line 88 of the attached shader code, which was generated with the v2.4.1 tagged branch.

The OpenGL and HLSL generators don't use a wrapper and thus don't suffer from the same issue, even on the same hardware. Pulling the offending array out of the struct and marking it "constant float" also fixes the performance problem (see the attached shader, which has the array outside of the struct).
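To make the structural difference concrete, here is a minimal sketch in plain C++ (Metal Shading Language is C++-based, so the layout question is analogous). It is not the generated shader: the struct names, `kHues`, `checkEquivalence`, and the 4-element table are hypothetical stand-ins for the wrapper struct and the 362-element array, and the Metal-only `constant` address-space qualifier is only mentioned in a comment.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, shortened stand-in for the 362-element hue table.
constexpr std::size_t kTableSize = 4;

// Pattern A (what the issue describes): the table is a data member of the
// wrapper struct, so every wrapper instance carries and initializes its own copy.
struct WrapperWithMemberTable
{
    const float hues[kTableSize] = {0.0f, 90.0f, 180.0f, 270.0f};

    float lookup(std::size_t i) const { return hues[i % kTableSize]; }
};

// Pattern B (the workaround): the table lives outside the wrapper, at
// namespace scope. In Metal this would be declared `constant float kHues[] = {...};`.
constexpr float kHues[kTableSize] = {0.0f, 90.0f, 180.0f, 270.0f};

struct WrapperWithSharedTable
{
    float lookup(std::size_t i) const { return kHues[i % kTableSize]; }
};

// Both patterns return the same values; only the storage location differs.
inline void checkEquivalence()
{
    WrapperWithMemberTable a;
    WrapperWithSharedTable b;
    for (std::size_t i = 0; i < kTableSize; ++i)
    {
        assert(a.lookup(i) == b.lookup(i));
    }
}
```

In this sketch, `sizeof(WrapperWithMemberTable)` grows with the table while `WrapperWithSharedTable` stays empty, which mirrors why a per-instance wrapper in the shader can end up re-materializing the array. Whether the Metal compiler actually does so is the hypothesis of this report, not something the C++ analog proves.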

shader generated with v2.4.1.txt
shader where array pulled outside of the struct

Repro steps

  • Check out the OCIO v2.4.1 branch and compile it.
  • Download the test config provided here: https://github.com/AcademySoftwareFoundation/OpenColorIO/wiki/ACES-2.0-optimization
  • Set the OCIO environment variable to point to that config.
  • Run the ociodisplay utility with the "-metal" and "-gpuinfo" flags and pass a relatively large image (4k) as the input image:
    ociodisplay -metal -gpuinfo large_image.exr
  • Right-click in the UI and set:
    • the image color space to "ACES / ACES2065-1"
    • the display to "sRGB - Display"
    • the view to "ACES 2.0 - SDR 100 nits (Rec.709)"

You'll see the generated code in the console, and if you do a Metal capture, the draw call performance report will show that the buffer L1 cache miss rate is very high.
