What learning rate should be used to fine-tune T5-large and T5-3B?

Hi,
In a previous discussion (#16), it was said that a learning rate of 0.001 was used. When I tried both 0.001 and 0.0001, the latter seemed to give a lower loss. Does this mean I should use a learning rate of 0.0001 instead?
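
To make the comparison concrete, here is a minimal sketch of the kind of sweep I mean. It uses a toy quadratic objective instead of T5, and `final_loss` and `sweep` are hypothetical helper names, so it only illustrates the procedure (same step budget for each rate, then compare the final loss) and says nothing about which rate is better for T5 itself.

```python
def final_loss(lr, steps=100):
    """Run plain gradient descent on the toy objective f(w) = (w - 3)^2
    and return the loss after `steps` updates. This stands in for a real
    fine-tuning run; it is an assumption, not the T5 training loop."""
    w = 0.0
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)  # derivative of (w - 3)^2
        w -= lr * grad
    return (w - 3.0) ** 2


def sweep(learning_rates, steps=100):
    """Evaluate every candidate learning rate with the same step budget."""
    return {lr: final_loss(lr, steps) for lr in learning_rates}


# The two rates from the question, compared under identical conditions.
results = sweep([1e-3, 1e-4])
best_lr = min(results, key=results.get)

for lr, loss in results.items():
    print(f"lr={lr:g}  final loss={loss:.4f}")
print(f"lowest final loss at lr={best_lr:g}")
```

In a real run I would presumably compare validation loss rather than training loss, and repeat over a few seeds, since a single run at each rate may not be conclusive.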
Thank you!
Charles