@QSXW
Collaborator

@QSXW QSXW commented May 14, 2024

No description provided.

@QSXW QSXW force-pushed the feat/palette branch 4 times, most recently from d9ae5a7 to 0dd9ff3 Compare May 14, 2024 18:11
@nuomi2021
Member

Thank you @QSXW, I will check it this weekend.

@nuomi2021
Member

The Linux build is broken.
16 files in conformance/failed/v1/PAL/ fail.
Could you help make them all pass?
Thank you.

+++++++++ report +++++++++
passed files:
PALETTE_A_Alibaba_2.bit
PALETTE_B_Alibaba_2.bit
PALETTE_C_Alibaba_2.bit
PALETTE_D_Alibaba_2.bit
PALETTE_E_Alibaba_2.bit
mismatch files:
8b444_A_Kwai_2.bit
10b422_J_Sony_5.bit
8b422_J_Sony_5.bit
10b422_L_Sony_5.bit
8b422_L_Sony_5.bit
ACT_A_Kwai_3.bit
10b422_H_Sony_5.bit
8b422_H_Sony_5.bit
8b422_I_Sony_5.bit
10b422_I_Sony_5.bit
10b422_K_Sony_5.bit
8b422_K_Sony_5.bit
10b422_G_Sony_5.bit
8b422_G_Sony_5.bit
8b444_B_Kwai_2.bit
decode_err files:
ACT_B_Kwai_3.bit

total = 21, passed = 5, skipped = 0, failed = 16

@QSXW
Collaborator Author

QSXW commented May 19, 2024

Sure. It's funny that the Alibaba_2.bit samples, which I didn't test, passed while the others failed.

@QSXW
Collaborator Author

QSXW commented May 19, 2024

I've checked that all the pixels decoded by palette prediction are the same. The yellow rectangle on the right is the only difference, because our first frame is decoded twice; the other pixels are identical.

The difference may arise in the intra-prediction or IBC prediction stage. Can you help verify that?
image

I compared the whole frame in YUVViewer and found that the Y component is almost the same, but there are some differences in the U and V components.
image
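
The per-plane comparison described above can also be scripted. The following is a minimal sketch, assuming 8-bit planar frames already loaded into memory; `count_plane_diffs` is a hypothetical helper, not part of FFmpeg or YUVViewer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the samples that differ between two planes of identical size.
 * Assumption: both planes are 8-bit and share the same stride. */
static size_t count_plane_diffs(const uint8_t *a, const uint8_t *b,
                                int width, int height, ptrdiff_t stride)
{
    size_t diffs = 0;

    assert(a && b && width > 0 && height > 0 && stride >= width);
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++)
            diffs += a[y * stride + x] != b[y * stride + x];
    }
    return diffs;
}
```

Calling it once per plane (with the chroma width and height shifted according to the chroma format) would show whether a mismatch is confined to U and V.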

@nuomi2021
Member

Sure, I will check it.

@QSXW QSXW force-pushed the feat/palette branch 3 times, most recently from 3037abd to c600313 Compare May 25, 2024 20:13
@QSXW QSXW self-assigned this May 25, 2024
@nuomi2021
Member

@QSXW, I know why the other clips fail: the deblock, SAO, and ALF code needs to be added.

@QSXW
Collaborator Author

QSXW commented May 28, 2024

> @QSXW, I know why the other clips fail: the deblock, SAO, and ALF code needs to be added.

Yes, that's it. Do we have an interface to get the CU by x0, y0? We need cu_q and cu_p to get the pred_mode.

@nuomi2021
Member

You can use fc->tab.cpm.
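
A lookup of the prediction mode on both sides of an edge might look like the sketch below. The table layout (one entry per minimum coding block, row-major), the struct, and all `SK_`-prefixed names are assumptions made for illustration; only `fc->tab.cpm` itself comes from the discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical prediction-mode values, for illustration only. */
enum PredModeSketch { SK_MODE_INTER, SK_MODE_INTRA, SK_MODE_IBC, SK_MODE_PLT };

/* Assumed layout: one entry per minimum coding block, stored row-major. */
typedef struct PredModeTab {
    const uint8_t *cpm;   /* stand-in for fc->tab.cpm[LUMA] or [CHROMA] */
    int min_cb_log2;      /* log2 of the minimum coding block size */
    int min_cb_width;     /* picture width in minimum coding blocks */
} PredModeTab;

static int pred_mode_at(const PredModeTab *t, int x, int y)
{
    assert(t && t->cpm && x >= 0 && y >= 0);
    return t->cpm[(y >> t->min_cb_log2) * t->min_cb_width + (x >> t->min_cb_log2)];
}

/* For a vertical edge at x0, P is the sample left of the edge, Q the sample on it. */
static int edge_touches_palette(const PredModeTab *t, int x0, int y0)
{
    return pred_mode_at(t, x0 - 1, y0) == SK_MODE_PLT ||
           pred_mode_at(t, x0, y0)     == SK_MODE_PLT;
}
```

With a table like this, the filter code needs no CU pointer for P and Q, only the sample coordinates.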

@QSXW QSXW closed this May 29, 2024
@QSXW QSXW reopened this May 29, 2024
@frankplow
Collaborator

frankplow commented Jun 3, 2024

I think I've filtered out all the flaky fuzz failures now so these appear to be legitimate.

AddressSanitizer report for ID 14. ID 56 is similar.
=================================================================
==18052==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x00016fcfccfa at pc 0x000102090e28 bp 0x00016fcfcb50 sp 0x00016fcfc300
WRITE of size 130 at 0x00016fcfccfa thread T5
[vvc @ 0x105a01c80] frame     1, P(  1,   1) failed with -1094995529
[vist#0:0/vvc @ 0x105602a40] [dec:vvc @ 0x105002ec0] Decoding error: Invalid data found when processing input
    #0 0x102090e24 in __asan_memcpy+0x440 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x50e24)
    #1 0x10099c90c in derive_predictor_palette_entries ctu.c:991
    #2 0x10097f4cc in set_cu_tabs ctu.c:1314
    #3 0x100973650 in hls_coding_unit ctu.c:2164
    #4 0x10096ff70 in hls_coding_tree ctu.c:2420
    #5 0x100974f14 in coding_tree_qt ctu.c:2346
    #6 0x10096fe44 in hls_coding_tree ctu.c:2409
    #7 0x10096f27c in dual_tree_implicit_qt_split ctu.c:2471
    #8 0x10096f1d4 in dual_tree_implicit_qt_split ctu.c:2461
    #9 0x100968828 in hls_coding_tree_unit ctu.c:2630
    #10 0x100965f88 in ff_vvc_coding_tree_unit ctu.c:2758
    #11 0x100b2bcc0 in run_parse thread.c:428
    #12 0x100b2b168 in task_run_stage thread.c:581
    #13 0x100b263b8 in task_run thread.c:608
    #14 0x100fd775c in run_one_task executor.c:86
    #15 0x100fd6ce4 in executor_worker_task executor.c:104
    #16 0x1904daf90 in _pthread_start+0x84 (libsystem_pthread.dylib:arm64e+0x6f90)
    #17 0x1904d5d30 in thread_start+0x4 (libsystem_pthread.dylib:arm64e+0x1d30)

Address 0x00016fcfccfa is located in stack of thread T5 at offset 410 in frame
    #0 0x10099c220 in derive_predictor_palette_entries ctu.c:963

  This frame has 1 object(s):
    [32, 410) 'new_predictor_palette_entries' (line 966) <== Memory access at offset 410 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
Thread T5 created by T0 here:
    #0 0x10208bd6c in wrap_pthread_create+0x54 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x4bd6c)
    #1 0x100fd69b8 in av_executor_alloc executor.c:166
    #2 0x100b24ee0 in ff_vvc_executor_alloc thread.c:629
    #3 0x101149354 in vvc_decode_init dec.c:1103
    #4 0x1006ce1b4 in avcodec_open2 avcodec.c:326
    #5 0x10032f2dc in dec_open ffmpeg_dec.c:1228
    #6 0x10032d358 in dec_init ffmpeg_dec.c:1286
    #7 0x100341a10 in ist_use ffmpeg_demux.c:949
    #8 0x1003422b8 in ist_filter_add ffmpeg_demux.c:992
    #9 0x10037f6d0 in ifilter_bind_ist ffmpeg_filter.c:701
    #10 0x10037ea8c in init_simple_filtergraph ffmpeg_filter.c:1230
    #11 0x1003cc9e0 in ost_add ffmpeg_mux_init.c:1433
    #12 0x1003b97f8 in map_auto_video ffmpeg_mux_init.c:1539
    #13 0x1003af7b4 in create_streams ffmpeg_mux_init.c:1855
    #14 0x1003ad7f8 in of_open ffmpeg_mux_init.c:3265
    #15 0x1003f73b4 in open_files ffmpeg_opt.c:1206
    #16 0x1003f6d18 in ffmpeg_parse_options ffmpeg_opt.c:1253
    #17 0x1004477c4 in main ffmpeg.c:941
    #18 0x1901520dc  (<unknown module>)

SUMMARY: AddressSanitizer: stack-buffer-overflow (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x50e24) in __asan_memcpy+0x440
Shadow bytes around the buggy address:
  0x00016fcfca00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00016fcfca80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00016fcfcb00: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
  0x00016fcfcb80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00016fcfcc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x00016fcfcc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00[02]
  0x00016fcfcd00: f3 f3 f3 f3 f3 f3 f3 f3 00 00 00 00 00 00 00 00
  0x00016fcfcd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00016fcfce00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00016fcfce80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00016fcfcf00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==18052==ABORTING
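
The report above shows a 130-byte write landing past the end of the 378-byte stack array new_predictor_palette_entries. The fix applied in this pull request is not shown in the thread; the following is only a sketch of how a predictor update can be kept within a fixed-size buffer. The capacity of 63 entries, the single-component struct, and every name here are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SK_MAX_PREDICTOR 63   /* assumed capacity, per component */

typedef struct PaletteSketch {
    int      size;
    uint16_t entries[SK_MAX_PREDICTOR];
} PaletteSketch;

/* New predictor = current palette entries, then the previous predictor
 * entries that were not reused, truncated at max_size.
 * Returns 0 on success, -1 if the sizes cannot fit the buffer. */
static int update_predictor(PaletteSketch *pred, const uint16_t *cur, int cur_size,
                            const uint8_t *reused, int max_size)
{
    uint16_t next[SK_MAX_PREDICTOR];
    int n;

    assert(pred && cur && reused);
    if (max_size > SK_MAX_PREDICTOR)
        max_size = SK_MAX_PREDICTOR;
    if (cur_size < 0 || cur_size > max_size ||
        pred->size < 0 || pred->size > SK_MAX_PREDICTOR)
        return -1;               /* reject before any copy happens */

    memcpy(next, cur, cur_size * sizeof(*next));
    n = cur_size;
    for (int i = 0; i < pred->size && n < max_size; i++) {
        if (!reused[i])
            next[n++] = pred->entries[i];
    }
    memcpy(pred->entries, next, n * sizeof(*next));
    pred->size = n;
    return 0;
}
```

The point of the sketch is that sizes parsed from the bitstream are validated before the first memcpy, so a fuzzed stream yields an error code instead of a stack overflow.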
AddressSanitizer report for ID 256. All other bitstreams are similar.
=================================================================
==18083==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x00019050ae90 bp 0x00016defc5b0 sp 0x00016defbd70 T12)
==18083==The signal is caused by a WRITE memory access.
==18083==Hint: address points to the zero page.
    Last message repeated 2 times
[vvc @ 0x107c01c80] frame    41, P(  1,   1) failed with -1094995529
    #0 0x19050ae90 in __bzero+0x20 (libsystem_platform.dylib:arm64e+0x3e90)
    #1 0x102b78f70 in tl_create dec.c:90
    #2 0x102b78880 in frame_context_for_each_tl dec.c:371
    #3 0x102b773c0 in pic_arrays_init dec.c:406
    #4 0x102b760b4 in frame_context_setup dec.c:707
    #5 0x102b74ad4 in frame_setup dec.c:804
    #6 0x102b742bc in decode_slice dec.c:826
    #7 0x102b73f60 in decode_nal_unit dec.c:869
    #8 0x102b730a0 in decode_nal_units dec.c:910
    #9 0x102b723dc in vvc_decode_frame dec.c:1014
    #10 0x102acf82c in decode_simple_internal decode.c:412
    #11 0x102acec68 in decode_simple_receive_frame decode.c:583
    #12 0x102ab8154 in decode_receive_frame_internal decode.c:612
    #13 0x102ab7b18 in avcodec_send_packet decode.c:703
    #14 0x102508278 in packet_decode ffmpeg_dec.c:687
    #15 0x102506184 in decoder_thread ffmpeg_dec.c:897
    #16 0x1025eb424 in task_wrapper ffmpeg_sched.c:2467
    #17 0x1904daf90 in _pthread_start+0x84 (libsystem_pthread.dylib:arm64e+0x6f90)
    #18 0x1904d5d30 in thread_start+0x4 (libsystem_pthread.dylib:arm64e+0x1d30)

==18083==Register values:
 x[0] = 0x0000000000000000   x[1] = 0x0000000000000000   x[2] = 0x0000000000060000   x[3] = 0x0000000000000000
 x[4] = 0x000000702dbff940   x[5] = 0x0000000000000001   x[6] = 0x000000016de7c000   x[7] = 0x0000000000000001
 x[8] = 0x000000700002c000   x[9] = 0x000000700002c000  x[10] = 0x000000700002c020  x[11] = 0x0000000000000000
x[12] = 0x000000700002c000  x[13] = 0x0000000000001800  x[14] = 0x0000000000001800  x[15] = 0x0000000000000006
x[16] = 0x000000019050aed0  x[17] = 0x00000001042b85e8  x[18] = 0x0000000000000000  x[19] = 0x0000000000060000
x[20] = 0x0000000000000000  x[21] = 0x0000000000000000  x[22] = 0x0000000000000000  x[23] = 0x0000000000000000
x[24] = 0x0000000000000000  x[25] = 0x0000000000000000  x[26] = 0x0000000000000000  x[27] = 0x0000000000000000
x[28] = 0x0000000000000000     fp = 0x000000016defc5b0     lr = 0x0000000104264ee4     sp = 0x000000016defbd70
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (libsystem_platform.dylib:arm64e+0x3e90) in __bzero+0x20
Thread T12 created by T0 here:
    #0 0x10425fd6c in wrap_pthread_create+0x54 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x4bd6c)
    #1 0x1025e0218 in task_start ffmpeg_sched.c:416
    #2 0x1025de124 in sch_start ffmpeg_sched.c:1570
    #3 0x10261be78 in transcode ffmpeg.c:831
    #4 0x10261b920 in main ffmpeg.c:959
    #5 0x1901520dc  (<unknown module>)

==18083==ABORTING

…rocess

Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
@nuomi2021
Member

nuomi2021 commented Mar 16, 2025 via email

@QSXW
Collaborator Author

QSXW commented Mar 19, 2025

@nuomi2021 Please use this clip for testing: 8b422_G_Sony_5.bit

and use this branch: https://github.com/QSXW/FFmpeg-VVC/commits/debug/palette/

This branch includes the picture hash SEI, so we can add the -err_detect +crccheck options to check whether each decoded frame matches the decoded picture hash.

@nuomi2021
Member

nuomi2021 commented Mar 22, 2025

diff --git a/libavcodec/vvc/ctu.c b/libavcodec/vvc/ctu.c
index a7a3479731..f4c4e71b1d 100644
--- a/libavcodec/vvc/ctu.c
+++ b/libavcodec/vvc/ctu.c
@@ -1990,14 +1990,15 @@ static int palette_subblock_data(VVCLocalContext *lc, int start_comp, int num_co
     return 0;
 }
 
-static void add_palette_tu(VVCLocalContext *lc)
+static void add_palette_tu(VVCLocalContext *lc, const VVCTreeType tree_type)
 {
     CodingUnit   *cu  = lc->cu;
     const VVCSPS *sps = lc->fc->ps.sps;
 
     TransformUnit *tu = add_tu(lc->fc, cu, cu->x0, cu->y0, cu->cb_width, cu->cb_height);
-    add_tb(tu, lc, tu->x0, tu->y0, tu->width, tu->height, LUMA);
-    if (sps->r->sps_chroma_format_idc) {
+    if (tree_type != DUAL_TREE_CHROMA)
+        add_tb(tu, lc, tu->x0, tu->y0, tu->width, tu->height, LUMA);
+    if (sps->r->sps_chroma_format_idc && tree_type != DUAL_TREE_LUMA) {
         add_tb(tu, lc, tu->x0, tu->y0, tu->width >> sps->hshift[CB], tu->height >> sps->vshift[CB], CB);
         add_tb(tu, lc, tu->x0, tu->y0, tu->width >> sps->hshift[CR], tu->height >> sps->vshift[CR], CR);
     }
@@ -2048,7 +2049,7 @@ static int palette_coding(VVCLocalContext *lc, const VVCTreeType tree_type)
     if (tree_type == SINGLE_TREE)
         set_cb_tab(lc, fc->tab.cpm[CHROMA], MODE_PLT);
 
-    add_palette_tu(lc);
+    add_palette_tu(lc, tree_type);
 
     for (i = 0; i < fc->tab.predictor_palette[start_comp].size && num_predicted_entries < max_num_palette_entries; i++) {
         palette_predictor_run = ff_vvc_palette_predictor_run(lc);

We need to take the tree type into account.
The luma CU at (224, 68) is a DUAL_LUMA CU of size 4x8.
The chroma CU at (224, 68) is a DUAL_CHROMA CU of size 16x8.

Please consider removing add_palette_tu, as hls_transform_tree may be able to handle most of the tasks.
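
The tree-type rule in the patch can be isolated as in the sketch below. The enums and helper names are hypothetical stand-ins for the decoder's own types, and the 4:2:2 shifts in the usage note are an assumption based on the clip name.

```c
#include <assert.h>

/* Hypothetical stand-ins for the decoder's enums, named for this sketch. */
enum TreeTypeSketch { SK_SINGLE_TREE, SK_DUAL_TREE_LUMA, SK_DUAL_TREE_CHROMA };
enum { SK_HAS_LUMA = 1, SK_HAS_CHROMA = 2 };

/* Which components a palette CU carries, given its tree type. */
static int palette_components(enum TreeTypeSketch tree_type, int chroma_format_idc)
{
    int mask = 0;

    assert(chroma_format_idc >= 0 && chroma_format_idc <= 3);
    if (tree_type != SK_DUAL_TREE_CHROMA)
        mask |= SK_HAS_LUMA;
    if (chroma_format_idc && tree_type != SK_DUAL_TREE_LUMA)
        mask |= SK_HAS_CHROMA;
    return mask;
}

/* Chroma block size for a CU whose size is given in luma samples. */
static void chroma_tb_size(int cb_width, int cb_height, int hshift, int vshift,
                           int *tb_width, int *tb_height)
{
    *tb_width  = cb_width  >> hshift;
    *tb_height = cb_height >> vshift;
}
```

Assuming 4:2:2 (hshift 1, vshift 0) and CU sizes expressed in luma samples, the 16x8 DUAL_CHROMA CU gets 8x8 chroma blocks and no luma block, while the 4x8 DUAL_LUMA CU at the same position gets only a luma block.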

@nuomi2021
Member

Using VTM to encode some clips without SAO and ALF will make your life easier :)

@QSXW
Collaborator Author

QSXW commented Mar 22, 2025

It works! This change saved my life. This is the power of cooperation!

@QSXW
Collaborator Author

QSXW commented Mar 22, 2025

> Using VTM to encode some clips without SAO and ALF will make your life easier :)

Absolutely. For simplicity, I have just commented out the code that runs the SAO and ALF filters for now. Fortunately, the SAO and ALF filters need no changes for palette mode.

@nuomi2021
Member

Hi Jianhua,
Thank you for your great efforts on this.
Once the MD5 matches for one file, please refine your code, provide a list of passed/failed files, and @ me for review.
I will work with you to polish the code and make it upstreamable.

@QSXW
Collaborator Author

QSXW commented Apr 2, 2025

Hi @nuomi2021, thank you for your help! The palette changes now fail only on 8b444_A_Kwai_2.bit. I've checked the file, and the differences appear in the 1st picture, the 10th picture, and so on. I compared the frames with VTM: only some CUs differ and the rest are the same. I've updated the commits, so maybe you can give it a try. Just ignore the SEI; the decoded picture hash is helpful for our debugging.

8b444_A_Kwai_2.bit MD5 mismatch. Ref MD5 = a09794d71c019f39ec2e8c24a7ddd662, decoded MD5 = 7c57387f3c7155427aac3a0a1cb1a0c5
10b422_J_Sony_5.bit passed
8b422_J_Sony_5.bit passed
10b422_L_Sony_5.bit passed
8b422_L_Sony_5.bit passed
ACT_A_Kwai_3.bit MD5 mismatch. Ref MD5 = 01e987b04081c75215a5c1f7d64ffd56, decoded MD5 = 2cc3129018b40d568db2a787eb0ed315
10b422_H_Sony_5.bit passed
8b422_H_Sony_5.bit passed
8b422_I_Sony_5.bit passed
10b422_I_Sony_5.bit passed
10b422_K_Sony_5.bit passed
8b422_K_Sony_5.bit passed
10b422_G_Sony_5.bit passed
8b422_G_Sony_5.bit passed
PALETTE_B_Alibaba_2.bit passed
PALETTE_D_Alibaba_2.bit passed
ACT_B_Kwai_3.bit failed
69
8b444_B_Kwai_2.bit passed
PALETTE_E_Alibaba_2.bit passed
PALETTE_C_Alibaba_2.bit passed
PALETTE_A_Alibaba_2.bit passed

+++++++++ report +++++++++
passed files:
    10b422_G_Sony_5.bit
    10b422_H_Sony_5.bit
    10b422_I_Sony_5.bit
    10b422_J_Sony_5.bit
    10b422_K_Sony_5.bit
    10b422_L_Sony_5.bit
    8b422_G_Sony_5.bit
    8b422_H_Sony_5.bit
    8b422_I_Sony_5.bit
    8b422_J_Sony_5.bit
    8b422_K_Sony_5.bit
    8b422_L_Sony_5.bit
    8b444_B_Kwai_2.bit
    PALETTE_A_Alibaba_2.bit
    PALETTE_B_Alibaba_2.bit
    PALETTE_C_Alibaba_2.bit
    PALETTE_D_Alibaba_2.bit
    PALETTE_E_Alibaba_2.bit
mismatch files:
    8b444_A_Kwai_2.bit
    ACT_A_Kwai_3.bit
decode_err files:
    ACT_B_Kwai_3.bit

total = 21, passed = 18, skipped = 0, failed = 3
----------
[vvc @ 0000022F709C2C00] Verifying checksum for frame with decoder_order 1: failed
  Metadata:
    encoder         : Lavf61.9.107
  Stream #0:0: Video: mjpeg, yuv444p(pc, progressive), 1280x720, q=2-31, 200 kb/s, 25 fps, 25 tbn
    Metadata:
      encoder         : Lavc61.33.102 mjpeg
    Side data:
      cpb: bitrate max/min/avg: 0/0/200000 buffer size: 0 vbv_delay: N/A
[vvc @ 0000022F709C2C00] Verifying checksum for frame with decoder_order 10: failed

QSXW and others added 20 commits April 11, 2025 01:40
…ding unit

passed files:
    ACT_A_Kwai_3.bit
    ACT_B_Kwai_3.bit

Signed-off-by: Wu Jianhua <toqsxw@gmail.com>
Signed-off-by: Wu Jianhua <toqsxw@gmail.com>
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
passed files:
    FIELD_A_Panasonic_4.bit
    FIELD_B_Panasonic_2.bit

Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
Signed-off-by: Wu Jianhua <toqsxw@gmail.com>
@QSXW
Collaborator Author

QSXW commented Apr 11, 2025

Hi @nuomi2021. With my latest commit, all the PAL and ACT clips pass the conformance test. The issue was caused by an incorrect ciip flag on PLT blocks, which led to an incorrect boundary strength, so the deblocking filter did not output the right pixels. I haven't cleaned up the code. Can you help polish it?
image

@nuomi2021
Member

Sure, I’ll do this.
Several places can be simplified. It will take some time.

if (max_palette_index > 0 && !run_copy_map[scan_pos - min_sub_pos] && PALETTE_RUN_TYPE(start_comp, xc, yc) == 0) {
current_palette_index = ff_vvc_palette_idx_idc(lc, max_palette_index, *adjust);
if (scan_pos > 0) {
adjusted_ref_palette_index = max_palette_index + 1;
Member

This assignment will be replaced by the following if-else block. Why doesn't this code follow the specification?

Collaborator Author

This actually follows VTM. Following the spec doesn't work. Maybe you can compare the code with cu_palette_info in VTM's CABACReader.cpp.

Member

You can report a bug to VTM.

int width = tu->tbs[start_comp].tb_width;
int height = tu->tbs[start_comp].tb_height;

uint32_t (*scan_order)[2] = (uint32_t (*)[2])lc->fc->tab.traverse_scan_order[palette_transpose_flag ? TRAV_VERT : TRAV_HORIZ][av_log2(width) - 1][av_log2(height) - 1];
Member

@nuomi2021 nuomi2021 Apr 14, 2025

@QSXW could you help generate const tables like
ff_vvc_diag_scan_x and ff_vvc_diag_scan_y?
They should look like this:
ff_vvc_trav_scan_x[2 /* transpose */][6 /* log_width - 1 */][6 /* log_height - 1 */][64 * 64]

A patch to data.c and data.h is enough. I can handle the rest.
You can use xxd like this:

echo Hello World\! > temp
xxd -i temp

Thank you

Collaborator Author

But it needs 576k of memory for sizeof(ff_vvc_trav_scan_x) + sizeof(ff_vvc_trav_scan_y). The diag scan only needs 12.5k. Is it worth adding as a constant table? If we add the trav scan as a constant table, ffmpeg.exe will grow by 576k... Or we could do what crc.c does and only enable the constant table when CONFIG_HARDCODED_TABLES is enabled.

image
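
The 576k figure can be reproduced from the proposed dimensions if each table entry is one byte. That element size is an assumption (the thread does not state it), as are the names in this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Dimensions from the proposed ff_vvc_trav_scan_x declaration. */
enum {
    SK_TRANSPOSE_CASES = 2,        /* transpose off / on */
    SK_LOG2_SIZES      = 6,        /* log2 sizes 1..6, i.e. 2..64 samples */
    SK_MAX_POSITIONS   = 64 * 64   /* scan positions in the largest block */
};

/* Bytes needed by one full table with the given entry size. */
static size_t trav_table_bytes(size_t entry_size)
{
    assert(entry_size > 0);
    return (size_t)SK_TRANSPOSE_CASES * SK_LOG2_SIZES * SK_LOG2_SIZES *
           SK_MAX_POSITIONS * entry_size;
}
```

With one-byte entries, each table takes 2 * 6 * 6 * 4096 = 294912 bytes (288 KiB), so the x and y tables together take 576 KiB.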

Member

Good point!

From the pattern, if there's no transpose, the first two lines for x are sufficient. We can use a table for x:
ff_vvc_trav_scan_x[2 /* transpose */][6 /* log_width - 1 */][6 /* log_height - 1 */][64 * 2]

Not sure if we have the same pattern for y and the transpose case.
It's worth looking into :)

Member

@nuomi2021 nuomi2021 Apr 15, 2025

Yes, ff_vvc_trav_scan_x[0] and ff_vvc_trav_scan_y[1] are the same table,
so you only need one of them.
Also, for ff_vvc_trav_scan_x[1], scan_pos / width is the x.
We can use scan_pos >> log2_width to do the calculation.

Collaborator Author

@QSXW QSXW Apr 15, 2025

We can use ff_vvc_trav_scan_x to initialize lc->fc->tab.traverse_scan_order_x[2][6][6] and lc->fc->tab.traverse_scan_order_y[2][6][6]. That needs only two variables, 144 bytes in total, and avoids the extra index calculation.

Member

For odd lines,

x = pos & (width - 1)
y = pos >> log2_width;

If we can find a solution that covers both even and odd lines, we do not need the table anymore.
I will check it tomorrow.

Collaborator Author

That sounds awesome!

Member

Yeah, we can do this with:

const int mask = width - 1;
const int x = (pos & mask) ^ (-((pos >> log_width) & 1) & mask);
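
To see why the branch-free expression works, it can be checked against a straightforward reference. The sketch below wraps the formula for the non-transposed (horizontal) case; the function and type names are hypothetical, and the block width is assumed to be a power of two.

```c
#include <assert.h>

typedef struct ScanPos { int x, y; } ScanPos;

/* Horizontal traverse scan: rows 0, 2, 4, ... run left to right,
 * rows 1, 3, 5, ... run right to left. */
static ScanPos trav_scan_horiz(int pos, int log2_width)
{
    const int mask = (1 << log2_width) - 1;
    const int row  = pos >> log2_width;
    ScanPos p;

    assert(pos >= 0 && log2_width >= 0);
    /* -(row & 1) is 0 for even row indices and all ones for odd ones,
     * so the XOR mirrors the column only on odd row indices. */
    p.x = (pos & mask) ^ (-(row & 1) & mask);
    p.y = row;
    return p;
}

/* Reference using division, modulo, and an explicit branch. */
static ScanPos trav_scan_horiz_ref(int pos, int width)
{
    ScanPos p;
    const int k = pos % width;

    assert(pos >= 0 && width > 0);
    p.y = pos / width;
    p.x = (p.y & 1) ? width - 1 - k : k;
    return p;
}
```

For a 4-wide block, position 4 is the first sample of the second row and maps to x = 3, y = 1. The transposed scan presumably follows by swapping the roles of x and y and using the block height instead of the width.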

@nuomi2021 nuomi2021 mentioned this pull request Apr 26, 2025
@nuomi2021
Member

Merged as a65d028.
Thank you, Jianhua and all.
