Reproducing the training results on the MegaDepth dataset #253
Hi, have you made any progress on this issue?
Hello,
Thanks for your reply. Yes, I fixed the issue!
Thank you!
Hello, my results are similar to yours. Have you tried changing your TRAIN_IMG_SIZE to 840?
I'm training now. After 11 epochs of training, I got the following validation results: they do not seem to improve anymore. I will try to train for 30 epochs and test the model on the test set (that may take another two days). Has anyone else already reproduced the results with a similar setting? Would setting TRAIN_IMG_SIZE to 840 help?
After 30 epochs of training, I reproduced the test on MegaDepth and got results 3 points lower than the reported accuracy.
No, I use the default settings. Your results are very similar to mine after 11 epochs of training. What device did you use, and how long did you train?
I also used 4 Nvidia RTX 3090 GPUs and trained for approximately 100 hours. I tried using D2-Net to process the dataset, and these are the validation results I saved during training. I am really eager to know whether setting TRAIN_IMG_SIZE to 840 would improve the accuracy after training.
@Mysophobias I didn't process MegaDepth via D2-Net, and your checkpoints seem similar to mine, so I have no idea why your test results are bad. I just used the default settings. As for image size = 840, I think it might help, since it is the officially recommended setting after all. A 3090 is enough to train with 840, so you can try it.
Based on the code comments in
@Mysophobias A 3090 can train it with a physical batch size of 1. I use gradient accumulation of 2 to make the effective batch size 1 × 2 × 4 = 8, as suggested by the author. I have trained it for one epoch, but I don't have GPUs available right now 💔. I hope my experience helps, and it would be nice if you could share the final results.
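For anyone unsure why 1 × 2 × 4 = 8 works: averaging gradients accumulated over small micro-batches is mathematically equivalent to computing one large batch, for losses that average per-sample terms. A minimal NumPy sketch of that equivalence on a toy linear model (illustration only, not LoFTR code):

```python
import numpy as np

def mse_grad(w, X, y):
    """Gradient of mean squared error for a linear model X @ w."""
    return 2 * X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))   # 8 samples = the effective batch
y = rng.normal(size=8)
w = rng.normal(size=3)

# Gradient a physical batch size of 8 would compute in one step.
g_full = mse_grad(w, X, y)

# Accumulate single-sample gradients (bs=1) over 2 steps x 4 GPUs = 8
# micro-batches, then average: same gradient, far less memory per step.
g_acc = sum(mse_grad(w, X[i:i + 1], y[i:i + 1]) for i in range(8)) / 8

assert np.allclose(g_full, g_acc)
```

The same reasoning is why an accumulation factor of 2 across 4 GPUs reproduces the author's suggested batch size of 8 on memory-limited cards.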
May I ask how you trained the model on MegaDepth? I got stuck getting the training images from D2-Net. I noticed the LoFTR authors say the differences are subtle, but I don't know how to create the symbolic link. Do I need to download the MegaDepth SfM dataset? Best,
@xmlyqing00 I think this issue can help. |
Thanks, I just fixed the training on MegaDepth.
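For anyone else stuck on the symbolic-link step mentioned above, it only takes a few lines. The paths below are hypothetical placeholders (not the exact layout any config expects), so substitute your own download location and the path your dataset config looks for:

```python
import os

# Hypothetical placeholder paths -- adjust both to your local layout.
src = "/data/megadepth/Undistorted_SfM"        # where the images actually live
dst = "data/megadepth/train/Undistorted_SfM"   # where the code looks for them

os.makedirs(os.path.dirname(dst), exist_ok=True)
if not os.path.lexists(dst):   # don't fail if the link already exists
    os.symlink(src, dst)
```

`os.path.lexists` is used instead of `os.path.exists` so an existing (possibly dangling) link is detected and not recreated.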
May I ask about your device's memory capacity?
@RunyuZhu That's weird. I use 4 3090 Ti GPUs and 128 GB of memory (8 GB swap) to get those results, with num_workers set to 4. Memory consumption does indeed increase over time, but I never hit this bug, so I'm afraid I cannot help you directly. I suggest you look at the system log to confirm the process is killed due to OOM, and check whether other processes are occupying a large amount of memory.
Thanks for your reply and your helpful suggestions!
Hello, how did you fix the problem at line 47 in LoFTR/src/datasets/megadepth.py? Line 47 in the official code is self.scene_info = np.load(npz_path, allow_pickle=True), which is different from what this issue shows.
May I ask how you solved the problem of LoFTR's D2-Net preprocessed data being unavailable for download? Did the approach in this issue help? I see that line 47 of LoFTR/src/datasets/megadepth.py is not the one given there; instead it is self.scene_info = np.load(npz_path, allow_pickle=True). I'd like to ask how you modified this file. Thanks!
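For context on the line being discussed: the scene-info .npz files contain object arrays (Python tuples and lists), and NumPy refuses to unpickle those by default, which is why allow_pickle=True is required. A self-contained sketch with a hypothetical miniature scene-info file (the real index files carry more fields, e.g. depth paths, intrinsics, and poses):

```python
import numpy as np

# Build a hypothetical miniature scene-info file for illustration.
npz_path = "scene_info_demo.npz"
np.savez(
    npz_path,
    image_paths=np.array(["Undistorted_SfM/0000/images/a.jpg",
                          "Undistorted_SfM/0000/images/b.jpg"]),
    # pair_infos holds tuples -> an object array, which needs pickling.
    pair_infos=np.array([((0, 1), 0.5)], dtype=object),
)

# This mirrors the loading pattern at line 47 of
# LoFTR/src/datasets/megadepth.py:
scene_info = np.load(npz_path, allow_pickle=True)
assert scene_info["pair_infos"].dtype == object
```

Without allow_pickle=True, accessing the object-dtype entry raises a ValueError, which is the symptom people usually hit here.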
Thank you very much for your excellent work.
I recently reproduced the training results on 4 3090 GPUs for 30 epochs following the README, with a batch size of 2 per GPU. I trained and tested on the D2-Net-undistorted MegaDepth dataset, and the results are as follows:
auc@5: 44.1 auc@10: 60.28 auc@20: 72.93
I also saw that a previous issue recommended setting the image size of both val and test to 640, but the results did not improve.
What is the reason for this drop in accuracy?
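For readers comparing numbers: the auc@t values above are the area under the cumulative pose-error curve up to threshold t, normalized by t. A minimal sketch of that metric (my own reimplementation of the common definition used in matching papers, not the repo's exact code):

```python
import numpy as np

def error_auc(errors, thresholds=(5, 10, 20)):
    """AUC of the cumulative error curve up to each threshold,
    normalized so a method with zero error scores 1.0."""
    errors = np.sort(np.asarray(errors, dtype=float))
    recall = np.arange(1, len(errors) + 1) / len(errors)
    # Prepend the origin so the curve starts at (0, 0).
    errors = np.concatenate(([0.0], errors))
    recall = np.concatenate(([0.0], recall))
    aucs = {}
    for t in thresholds:
        idx = np.searchsorted(errors, t)
        e = np.concatenate((errors[:idx], [t]))
        r = np.concatenate((recall[:idx], [recall[idx - 1]]))
        # Trapezoidal integration, normalized by the threshold.
        aucs[f"auc@{t}"] = np.sum((e[1:] - e[:-1]) * (r[1:] + r[:-1]) / 2) / t
    return aucs

# Sanity checks: perfect errors score 1.0, errors beyond threshold score 0.
assert error_auc([0.0] * 10)["auc@5"] == 1.0
assert error_auc([100.0] * 10)["auc@20"] == 0.0
```

Because the curve is normalized per threshold, a few-point gap at auc@5 can come from a small shift in the low-error tail, which is why image size and preprocessing details matter so much here.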