Hi, thank you for sharing such great work! I’m very interested in applying your distributed reward mechanism in my research.
While going through the data processing code, I noticed that only the data_process/musi_search dataset populates the "support_docs" field (from example['sub_support_docs']); the other two datasets seem to leave this field empty.
May I ask if this is because the current implementation only supports that dataset? And if I’d like to use your method on other datasets, should I manually provide the ground-truth reference documents for "support_docs"?
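
To make the question concrete, here is a minimal sketch of what I imagine doing for another dataset. The function name add_support_docs and the "gold_docs" field are my own placeholders, not names from your repository; only "support_docs" and example['sub_support_docs'] come from the existing code. I am also assuming that "support_docs" is expected to be a list of strings, which may not match your format.

```python
from typing import Any, Dict


def add_support_docs(
    example: Dict[str, Any],
    source_field: str = "sub_support_docs",
) -> Dict[str, Any]:
    """Return a copy of `example` with a "support_docs" list attached.

    `source_field` is whichever key holds the ground-truth reference
    documents in the dataset at hand (hypothetical for other datasets).
    If that key is missing or empty, "support_docs" is an empty list.
    """
    docs = example.get(source_field) or []
    out = dict(example)  # shallow copy, so the input record is not mutated
    out["support_docs"] = [str(d) for d in docs]
    return out


# Hypothetical record from another dataset that stores references in "gold_docs".
record = {"question": "Who wrote the novel?", "gold_docs": ["doc A", "doc B"]}
processed = add_support_docs(record, source_field="gold_docs")
```

Would filling the field this way be enough for the reward computation, or does it rely on something else that is specific to that dataset?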