Issue with training speed / loss #46
Comments
I face a similar issue: it seems difficult to converge or generate reasonable results at early stages. I would suggest just overfitting to one object. In my toy experiment I start from [...] I haven't tried overfitting from scratch yet. I may also try that and share the results here later. One weird thing I observe:
Here are more results on overfitting to one object. If I start from the pretrained checkpoint, it converges slowly, but the loss still goes down. When I train from scratch, the loss doesn't go down: I checked the splatted 3D Gaussians, and the render is a white image with a PSNR of around 15 w.r.t. the training image. @greeneggsandyaml Do you have any update on training? @ashawkey Could you kindly share what your loss curve looks like? For me it seems difficult to converge when I train from scratch.
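For readers comparing numbers, here is a minimal sketch of how PSNR is commonly computed. It assumes images are float arrays scaled to [0, 1]; the function name `psnr` and the example arrays are hypothetical and not taken from the repo. Under that assumption, a PSNR of 15 dB corresponds to a mean squared error of roughly 0.032, which is plausible for a blank white render compared against a real training view.

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_val]."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((pred - target) ** 2)
    if mse == 0.0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return float(10.0 * np.log10(max_val ** 2 / mse))

# Hypothetical example: an all-white prediction against a slightly darker target.
white = np.ones((64, 64, 3))
target = np.full((64, 64, 3), 0.9)
print(psnr(white, target))  # MSE is about 0.01, so this is about 20 dB
```

Checking the PSNR of a plain white image against each training view gives a quick baseline: a model that scores near that baseline has likely not learned anything yet.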
@YuxuanSnow Hi, thanks for the information!
When overfitting on a single object, have you tried adapting the LR scheduler code?
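To illustrate why the scheduler can matter here: if a schedule's length is derived from the total number of training steps, then shrinking the dataset to one object also shrinks the schedule unless the step count is set explicitly. The sketch below is a generic warmup-plus-cosine schedule in plain Python. It is not the repo's scheduler, and `lr_at`, `peak_lr`, and `warmup_frac` are hypothetical names with arbitrary values.

```python
import math

def lr_at(step, total_steps, peak_lr=1e-3, warmup_frac=0.05, min_lr=0.0):
    """Learning rate at `step`: linear warmup, then cosine decay to min_lr."""
    warmup_steps = max(1, int(total_steps * warmup_frac))
    if step < warmup_steps:
        # Linear ramp that reaches peak_lr on the last warmup step.
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to 1.0 past the end.
    progress = min((step - warmup_steps) / max(1, total_steps - warmup_steps), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# With a fixed budget of 1000 steps, the schedule no longer depends on dataset size.
for step in (0, 49, 500, 1000):
    print(step, lr_at(step, total_steps=1000))
```

When overfitting a single object, passing an explicit `total_steps` like this keeps the warmup and decay phases at a sensible length.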
I didn't try adapting the LR scheduler. @ashawkey I have an updated result with LPIPS disabled: the PSNR reaches a higher value (purple curve), which suggests that disabling LPIPS is an effective strategy. The image is still blurry, but I think adding LPIPS back and training further could resolve the problem.
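The strategy described above (train without LPIPS first, then add it back) can be sketched as a step-dependent loss weight. This is a minimal illustration assuming the total loss is a pixel term plus a weighted perceptual term; `lpips_weight`, `total_loss`, `enable_after`, and `lambda_lpips` are hypothetical names, not the repo's actual options.

```python
def lpips_weight(step, enable_after, lambda_lpips=1.0):
    """Return 0 before `enable_after` steps, then the full LPIPS weight."""
    return lambda_lpips if step >= enable_after else 0.0

def total_loss(mse_loss, lpips_loss, weight):
    """Pixel loss plus a weighted perceptual term; weight 0 disables LPIPS."""
    return mse_loss + weight * lpips_loss

# Hypothetical loss values, used only to show the switch at step 1000.
for step in (0, 999, 1000):
    w = lpips_weight(step, enable_after=1000)
    print(step, total_loss(0.02, 0.3, w))
```

A hard switch is the simplest choice. A gradual ramp of the weight is another option if the loss jumps when LPIPS is enabled.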
You really did a wonderful job! I have some questions about this paper and hope to get your help. Can I have your phone number or Instagram to talk?
@jeremy123z Nice to meet you. Could you help me train this model? Where can I download the datasets, and how do I use them?
Just download it from G-Objaverse, unzip it, and set the correct directory path in provider_gobjaverse.py.
Thank you so much. Can we connect on Telegram? https://t.me/David_Crypto001
Hello, I'm looking to replicate the results of this repo. I've loaded the Objaverse data (rendered in a similar manner to G-Objaverse) and I've verified that the images look right (see below). I believe that the cameras are also being loaded correctly, although it is always possible that I made an error there.
I'm finding that the network does not train successfully.
I am asking anyone (the author or anyone else who has successfully trained a model) what the training process should look like. That is, what (approximately) should the loss be at 500, 1000, and 5000 steps? Does the network take a very long time to converge, or is something wrong with my setup?
For context, my renders look like:
And after 1500 steps of training (with a single 80GB GPU), I have losses that look like:
and predicted images that look like this:
These losses/images look very bad to me, but perhaps I need to wait for much longer.
Am I doing something wrong?
Thanks for all your help!