The paper mentions that training ran on 8x64 A100-80G GPUs for 80,000 + 20,000 steps. Could you please tell me how long the training took?