NV12/YUV->RGB colour accuracy and CUDA #3799

gtebbutt · 2024-06-06T11:07:53Z

I've noticed some odd colour space conversion issues when using the yuv_to_rgb function in the otherwise very helpful NVDEC tutorial - it seems to be subtly but visibly shifting colours and/or clipping the dynamic range, but I'm not certain why. Originally thought there might be issues between BT.601/BT.709/BT.2020 content, but trying other python functions using those matricies didn't seem to help; it could definitely be my error somewhere, but I wasn't able to get correct colour output on anything that'd been through the implicit NV12->YUV444 conversion step.

Since there's been some discussion on moving the colour space conversion to CUDA anyway, I wanted to flag this implementation in case it's helpful. We ended up seeing a significant speed increase using that rather than applying conversions in tensor format, with all colours coming back exactly as expected.

cc dmlc/decord#283 (comment)

The text was updated successfully, but these errors were encountered:

vadimkantorov · 2024-06-17T17:09:41Z

The idea of adding tested/fast colorspace conversions was not supported at the time in torchvision: pytorch/vision#4029

But maybe then torchaudio could host such functions

gtebbutt · 2024-06-17T20:25:33Z

Interesting - colour space conversion is mentioned on the torchaudio to-do list at #3139, so there does seem to be some intention to add it, at least. There's also the existing NV12->YUV444 conversion built into StreamReader, so perhaps priorities have changed a bit now that the decoding is being pulled out into the torio namespace?

vadimkantorov · 2024-06-17T21:47:50Z

Maybe even then some of these could be upstreamed later into torch core to be available across the board... Regarding torio, really hoping some torch-prefixed name can be invented before it's too late and we are stuck with both torch-prefixed-domain-library-zoo and tor-prefixed-domain-library-zoo :)

NilanEkanayake · 2024-07-09T17:59:38Z

Here's a fixed version of the function:

def yuv_to_rgb(frames):
    frames = frames.to(torch.float)
    y = frames[..., 0, :, :]
    u = frames[..., 1, :, :]
    v = frames[..., 2, :, :]

    r = 1.164*(y - 16) + 1.596*(v - 128)
    g = 1.164*(y - 16) - 0.813*(v - 128) - 0.392*(u - 128)
    b = 1.164*(y - 16) + 2.017*(u - 128)

    rgb = torch.stack([r, g, b], -1)
    rgb = rgb.clamp(0, 255).to(torch.uint8)
    return rgb

NilanEkanayake · 2024-07-09T18:01:47Z

It seems that the decoded YUV is range-limited, so when re-converted back to RGB without taking that into account, it gets washed out (and the function above is likely not perfect, as a solid black video without clamping goes from -5 to 182, clamped to 0 to 182).
Surprisingly, found the working conversion code here: https://stackoverflow.com/questions/38530587/full-range-ycbcr-to-rgb-and-back-in-opencv-not-reversible

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

NV12/YUV->RGB colour accuracy and CUDA #3799

NV12/YUV->RGB colour accuracy and CUDA #3799

gtebbutt commented Jun 6, 2024

vadimkantorov commented Jun 17, 2024 •

edited

Loading

gtebbutt commented Jun 17, 2024

vadimkantorov commented Jun 17, 2024

NilanEkanayake commented Jul 9, 2024 •

edited

Loading

NilanEkanayake commented Jul 9, 2024 •

edited

Loading

NV12/YUV->RGB colour accuracy and CUDA #3799

NV12/YUV->RGB colour accuracy and CUDA #3799

Comments

gtebbutt commented Jun 6, 2024

vadimkantorov commented Jun 17, 2024 • edited Loading

gtebbutt commented Jun 17, 2024

vadimkantorov commented Jun 17, 2024

NilanEkanayake commented Jul 9, 2024 • edited Loading

NilanEkanayake commented Jul 9, 2024 • edited Loading

vadimkantorov commented Jun 17, 2024 •

edited

Loading

NilanEkanayake commented Jul 9, 2024 •

edited

Loading

NilanEkanayake commented Jul 9, 2024 •

edited

Loading