
questions about stage 1 training #6

Open
iyuner opened this issue Jul 27, 2023 · 18 comments

@iyuner commented Jul 27, 2023

Hi,

Thank you for sharing your code. As mentioned in your paper, there are two stages of training: the first trains the denoising diffusion model, and the second focuses on the leapfrog initializer. It seems the repo only provides the code for stage 2 training, which loads a pretrained checkpoint of the denoising diffusion model directly. Could you also provide the code for stage 1 training? Do you use the leapfrog initializer in the first stage? If so, what initial values of the estimated mean, variance, and sample prediction did you use? Thanks!

@Frank-Star-fn

I also have the same requirement and hope to obtain the code for the first stage of training.

@kkk00714 commented Jan 3, 2024

I have reimplemented the stage 1 training. If you are still interested in it, please contact me.

@fangzl123

Hi @kkk00714, have you successfully trained stage 1? I've reimplemented it, but when I train the model the noise estimation loss always gets stuck around 1.0. Thanks for any insights.

@kkk00714 commented Jan 9, 2024

Try changing the fut_traj size to (b, 1, 2) for training, save the model, and then continue training with size (b, T, 2) and batch size 250. The loss will get stuck around 0.12, which is close to the 0.06 of the pretrained model.
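For anyone reading along, a minimal sketch of this two-pass schedule might look like the following. This is not the repo's actual trainer code: the noise-estimation loss callable, the two data loaders, and the learning rate are placeholders, and the loss is assumed to take (past_traj, fut_traj, traj_mask), matching the Loss_NE call mentioned later in this thread.

```python
import torch

def train_stage1_two_pass(denoiser, noise_loss, loader_phase1, loader_phase2,
                          epochs_phase1, epochs_phase2, lr=1e-3):
    """Sketch of the two-pass stage-1 schedule described above (not the authors' code).

    Phase 1: the target is only the first future frame, fut_traj[:, :1, :] -> shape (B, 1, 2).
    Phase 2: resume from that checkpoint on the full horizon (B, T, 2), with a loader
             rebuilt with batch_size=250. The learning rate default is a placeholder;
             use the hyperparameters from the paper.
    """
    opt = torch.optim.Adam(denoiser.parameters(), lr=lr)

    # ---- Phase 1: single future frame ----
    for _ in range(epochs_phase1):
        for past_traj, fut_traj, traj_mask in loader_phase1:
            loss = noise_loss(past_traj, fut_traj[:, :1, :], traj_mask)  # (B, 1, 2) target
            opt.zero_grad()
            loss.backward()
            opt.step()
    torch.save(denoiser.state_dict(), 'stage1_phase1.pth')

    # ---- Phase 2: full horizon with batch size 250 ----
    for _ in range(epochs_phase2):
        for past_traj, fut_traj, traj_mask in loader_phase2:             # batch_size = 250
            loss = noise_loss(past_traj, fut_traj, traj_mask)            # full (B, T, 2) target
            opt.zero_grad()
            loss.backward()
            opt.step()
    torch.save(denoiser.state_dict(), 'stage1_final.pth')
```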

@ShaokangHi

Hi, do you mean that implementing stage 1 (the pretrained model) only requires changing the shape? @woyoudian2gou

@kkk00714

Yes, if you follow the steps I described above, you will get a model that is close to the pretrained one. The time step T and the batch size are both factors that affect training.

@ShaokangHi

@woyoudian2gou Thank you for your reply. So the config (cfg) should be changed to:

past_frames: 29
future_frames: 1
min_past_frames: 29
min_future_frames: 1

and the related parameters in ./trainer/train_led_trajectory_augment_input.py should be changed as well, right?

Looking forward to your reply! Or could you share your related code via Google Drive or another cloud service? Thanks in advance!

@kkk00714

No, you should change the shape of fut_traj, e.g. Loss_NE(past_traj, fut_traj[:, 0, :].unsqueeze(1), traj_mask). Once you have trained this model, change the batch size to 250 and use the original shape of fut_traj to continue training.
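For readers wondering what Loss_NE computes: it is not shown in this thread, but assuming it is the standard DDPM noise-estimation (epsilon-prediction) objective used by the repo's trainer, a self-contained sketch would look roughly like this. In the repo it is presumably a trainer method with the model held internally, so the denoiser and the noise schedule are passed explicitly here, and the denoiser call signature is a guess.

```python
import torch
import torch.nn.functional as F

def noise_estimation_loss_sketch(denoiser, past_traj, fut_traj, traj_mask, alphas_cumprod):
    """A sketch of a standard DDPM epsilon-prediction loss, not the repo's exact Loss_NE.

    past_traj      : (B, T_past, 2) observed trajectory, used as conditioning
    fut_traj       : (B, T_fut, 2) target future trajectory (T_fut = 1 in the first pass)
    traj_mask      : social mask passed through to the denoiser
    alphas_cumprod : (num_steps,) cumulative product of the noise schedule's alphas
    """
    b = fut_traj.size(0)
    num_steps = alphas_cumprod.size(0)
    t = torch.randint(0, num_steps, (b,), device=fut_traj.device)   # random diffusion step per sample
    a_bar = alphas_cumprod[t].view(b, 1, 1)                         # \bar{alpha}_t
    eps = torch.randn_like(fut_traj)                                # ground-truth noise
    x_t = a_bar.sqrt() * fut_traj + (1.0 - a_bar).sqrt() * eps      # forward process q(x_t | x_0)
    eps_hat = denoiser(x_t, t, past_traj, traj_mask)                # hypothetical call signature
    return F.mse_loss(eps_hat, eps)                                 # predict the injected noise
```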

@packer-c

@woyoudian2gou Hello, what does Loss_NE() mean? Where can I find this code? Thanks.

@13629281511

Hello, I am confused about how to reimplement the stage 1 training. Would you please leave me some contact information?

@percybuttons

Hi,

Thank you very much for sharing your unique training method. I find it very interesting! However, I have a few questions for clarification:

1. You mentioned using fut_traj[:, 0, :] during the first training of the diffusion module. Does this mean that only the first frame of fut_traj is used?
2. If so, how many epochs are required for the first training of the diffusion model?
3. For the second training of the diffusion model, when the full fut_traj is used, how many epochs are required?
4. Is the learning rate set the same as mentioned in the paper?

I am looking forward to your response and appreciate your help, @kkk00714.
Best regards

@kkk00714

1. Yes.
2, 3 & 4. Use the same hyperparameters as mentioned in the paper, and change the batch size to 250 in stage 1.

@percybuttons

Thank you very much for your response! It resolved my issue and provided immense help. Your suggestion to adjust the batch size from 10 to 250, processing 250*11 agents at a time, is indeed a sensible configuration for a diffusion model. However, my hardware might not support running such a large volume of data simultaneously. I will attempt to use a slightly smaller batch size. Once again, I appreciate your reply! @kkk00714

@kkk00714

I hope you can successfully replicate the stage 1 training process. The inspiration for changing the size of future_traj from (b, T, 2) to (b, 1, 2) came from my observation that the noise value for the same sample was almost exactly the same at all 30 time steps. I hope this helps with your subsequent adjustments.
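A quick way to check this observation in a reimplementation is to measure how much the predicted noise varies across the T future frames for each sample. A small diagnostic sketch (the denoiser call signature is again a placeholder, not the repo's actual forward pass):

```python
import torch

@torch.no_grad()
def noise_spread_over_horizon(denoiser, x_t, t, past_traj, traj_mask):
    """Mean per-sample standard deviation of the predicted noise across the T future frames.

    A value near zero supports the observation that the predicted noise is almost
    constant over the 30 future time steps.
    """
    eps_hat = denoiser(x_t, t, past_traj, traj_mask)   # expected shape (B, T, 2)
    return eps_hat.std(dim=1).mean().item()            # spread across the time dimension
```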

@percybuttons

Thank you for your kind words and positive outlook! It truly is an intriguing finding, and your ability to implement it effectively showcases your talent. This discovery may not be coincidental at all; it's possible that this approach could be universally applied across diffusion models to yield even better performing ones. Once again, I appreciate your response and insight—it's invaluable for further advancements.

@VanHelen commented Jul 9, 2024

Hello, I would like to know how you implemented the first stage of denoising training. Did you use the LED module in the first stage of training? Thank you very much!

@kkk00714 commented Jul 9, 2024

As described in the paper, the LED module is not used in the first training stage. You just need to use the loss_ne function that comes with the author's code to train on fut_traj after changing its shape as I described earlier.
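To make that concrete: in stage 1 only the denoising network should receive gradients, so the optimizer is built over the denoiser alone and the leapfrog initializer, if it has been constructed at all, is frozen. A sketch under those assumptions (module names and the default learning rate are placeholders, not the repo's actual attributes):

```python
from typing import Optional

import torch
from torch import nn

def build_stage1_optimizer(denoiser: nn.Module,
                           initializer: Optional[nn.Module] = None,
                           lr: float = 1e-3) -> torch.optim.Optimizer:
    """Return an optimizer covering only the denoising network for stage 1.

    The leapfrog (LED) initializer is excluded, and frozen if it exists, since it is
    only trained in stage 2. The learning rate default is a placeholder; use the
    value reported in the paper.
    """
    if initializer is not None:
        for p in initializer.parameters():
            p.requires_grad_(False)          # keep the initializer untouched in stage 1
    return torch.optim.Adam((p for p in denoiser.parameters() if p.requires_grad), lr=lr)
```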
