Length normalizing and temperature #9

markict123 · 2024-04-07T06:25:57Z

Thanks for the nice work! I have two questions.
The first is about length normalization when computing the conditional log probability. According to the paper and common practice, the denominator should be the length of the response.

However, according to the code:

RTL-Coder/train/mle_scoring.py

Line 199 in 3394cce

prod.append(-loss/mask.sum(-1))

the denominator seems to include the padding part. Could you please check it?
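
To make the concern concrete, here is a minimal sketch of the normalization I would expect. It is not the repository's code: the function name `length_normalized_logprob`, the list-based inputs, and the example numbers are all hypothetical, and it assumes a mask that is 1 only on response tokens (0 on prompt and padding).

```python
from typing import Sequence


def length_normalized_logprob(
    token_logprobs: Sequence[float],
    response_mask: Sequence[int],
) -> float:
    """Average log-probability over response tokens only (hypothetical helper).

    token_logprobs: per-token log p(token | prefix) for one sequence.
    response_mask: 1 for response tokens, 0 for prompt and padding tokens.
    """
    if len(token_logprobs) != len(response_mask):
        raise ValueError("token_logprobs and response_mask must have equal length")
    n_response = sum(response_mask)
    if n_response == 0:
        raise ValueError("response_mask selects no tokens")
    # Sum log-probs where the mask is 1, then divide by the response length.
    total = sum(lp * m for lp, m in zip(token_logprobs, response_mask))
    return total / n_response


# Made-up sequence: 2 prompt tokens, 3 response tokens, 2 padding tokens.
logprobs = [-0.5, -0.5, -1.0, -2.0, -3.0, 0.0, 0.0]
response_only = [0, 0, 1, 1, 1, 0, 0]

# Normalized by the 3 response tokens.
score = length_normalized_logprob(logprobs, response_only)

# Same numerator divided by all 7 positions: the score is diluted by
# prompt and padding positions that contribute nothing to the sum.
diluted = sum(lp * m for lp, m in zip(logprobs, response_only)) / len(logprobs)
```

If the denominator counts positions outside the response, sequences with more padding get a score closer to zero, which changes how candidates rank against each other.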

The second question is about the proper way to report experimental results.
The paper says,

Do you mean choosing the best result under each temperature, or choosing the best temperature according to pass@1 or some other metric?
Thank you for your reply.

DevinShang · 2024-05-28T11:38:33Z

Hi,
Thanks a lot for raising this issue!
Regarding your first question, you are right: we should include only the response part of the mask in the denominator.
For your second question, we chose the best result across the different temperature configurations for each metric, i.e., pass@1, pass@5, and pass@10.
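
As a rough illustration of that selection rule, the sketch below takes the maximum over temperatures separately for each metric. The helper name `best_per_metric`, the temperature values, and all scores are placeholders for illustration only, not numbers from the paper.

```python
from typing import Dict


def best_per_metric(results: Dict[float, Dict[str, float]]) -> Dict[str, float]:
    """For each metric, return the highest score over all temperatures."""
    best: Dict[str, float] = {}
    for scores in results.values():
        for metric, value in scores.items():
            if metric not in best or value > best[metric]:
                best[metric] = value
    return best


# Placeholder results keyed by sampling temperature (made-up values).
results = {
    0.2: {"pass@1": 0.40, "pass@5": 0.50, "pass@10": 0.55},
    0.5: {"pass@1": 0.38, "pass@5": 0.54, "pass@10": 0.58},
    0.8: {"pass@1": 0.33, "pass@5": 0.52, "pass@10": 0.61},
}

# Each metric's reported value may come from a different temperature.
reported = best_per_metric(results)
```

With these placeholder numbers, pass@1 is taken from the lowest temperature and pass@10 from the highest, so the reported row does not correspond to a single temperature setting.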
