Thank you very much for your work! I have a few questions that I hope you can answer:
1. Is this model suitable for evaluating the performance of editing models on a fixed test set? The scores output by EditReward are usually between -3 and 2; I don't know whether this is normal (the comparison tables in the paper all seem to report percentages).
2. The code shows that EditReward evaluates along two dimensions, so why is the final value taken as `reward[0][0].item()`? Does `reward[0][1]` have any practical meaning?
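To make question 2 concrete, here is a minimal sketch of the indexing I am asking about. The shape and values are my assumptions for illustration, not the actual EditReward output:

```python
# Hypothetical reward output with shape (batch=1, dims=2).
# My understanding: index [0][0] is the score the code returns via
# reward[0][0].item(), and [0][1] is the second dimension whose
# meaning is unclear to me.
reward = [[1.37, -0.52]]  # dummy values, not real model output

returned_score = reward[0][0]   # this is what the code keeps
second_dim = reward[0][1]       # is this value meaningful on its own?
```

In other words: should the second dimension be used (or averaged in) when ranking edits, or is it safe to ignore?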