strange loss curve #3
Comments
@BowieHsu, thanks! I will try it and post my results here.
@BowieHsu, by the way, can you share your trained model?
@BowieHsu, after 6 hours of training on 4 GPUs, the loss curve is
@BowieHsu, thanks for your model, I can get meaningful results now! The model is really hard to train.
Haha, that's really good news.
@BowieHsu, hi, I used the converted checkpoints and trained from scratch on ICDAR2015, but I got a bad result. I set the learning rate in the JSON file like this:
@JiasiWang Hi Wang, I also trained the model with the default pretrain.json, which gives good results. What is your batch size? You could also check the loss values using TensorBoard.
@BowieHsu, I did not change the batch size; it is 32. I only changed base_lr to 1e-4. I will check it, thanks.
@JiasiWang Yep, the default learning rate should be 5e-4.
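A quick way to spot this kind of deviation is to compare the settings in a config against the values quoted in this thread. The sketch below is only an illustration: `base_lr` is the key named above, while `batch_size` is a hypothetical key name, and the example JSON text is made up rather than taken from the repository's pretrain.json.

```python
import json

# Reference values quoted in this thread. Only "base_lr" is a key name
# mentioned in the discussion; "batch_size" is a hypothetical key.
THREAD_DEFAULTS = {"base_lr": 5e-4, "batch_size": 32}


def diff_from_defaults(config, defaults=THREAD_DEFAULTS):
    """Return {key: (found, expected)} for every key that deviates."""
    changed = {}
    for key, expected in defaults.items():
        found = config.get(key)  # None if the key is absent
        if found != expected:
            changed[key] = (found, expected)
    return changed


# Illustrative config text, not the repository's actual pretrain.json.
example = json.loads('{"base_lr": 1e-4, "batch_size": 32}')
print(diff_from_defaults(example))
```

With the illustrative config, only `base_lr` is reported, since the batch size matches the quoted default.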
@JiasiWang By the way, the ICDAR2015 SegLink model should be pretrained on the SynthText dataset first, then finetuned on the ICDAR2015 training set if you want to reach 75% Hmean.
@BowieHsu Yeah, I know that the SegLink model needs pretraining on the SynthText dataset. Without pretraining, I only get 58% Hmean.
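For reference, Hmean in ICDAR-style evaluation is the harmonic mean of detection precision and recall. A minimal sketch of the formula follows; the input values are illustrative only and are not results from this thread.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing is detected
    return 2 * precision * recall / (precision + recall)


# Illustrative values only, not results reported in this thread.
print(hmean(0.5, 0.5))
print(hmean(0.9, 0.3))
```

Because the harmonic mean is pulled toward the smaller of the two inputs, a detector with high precision but low recall still scores a low Hmean.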
May I ask how to use your model? I am not familiar with TensorFlow. I tried to load it in TensorFlow 1.4, but I got the following error. I also tried the following solutions:
Error log:
try "model_loader.restore(sess, './data/VGG_ILSVRC_16_layers_ssd/VGG_ILSVRC_16_layers_ssd.ckpt')" @Godricly
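The path in the suggestion above is a checkpoint prefix, not one of the files on disk. In TensorFlow 1.x, a V2-format checkpoint is stored as several files that share a prefix (`.index`, `.data-XXXXX-of-XXXXX`, and usually `.meta`), and `Saver.restore` expects that shared prefix. Whether this was the cause of the error above is not stated in the thread, so the helper below is only a sketch; `checkpoint_prefix` is a hypothetical name.

```python
import re

# Suffixes of the individual files that make up a V2-format checkpoint.
_SUFFIX = re.compile(r"\.(index|meta|data-\d{5}-of-\d{5})$")


def checkpoint_prefix(path):
    """Strip a per-file suffix so the result can be passed to Saver.restore."""
    return _SUFFIX.sub("", path)


base = "./data/VGG_ILSVRC_16_layers_ssd/VGG_ILSVRC_16_layers_ssd.ckpt"
print(checkpoint_prefix(base + ".index"))
print(checkpoint_prefix(base + ".data-00000-of-00001"))
print(checkpoint_prefix(base))  # already a prefix, returned unchanged
```

All three calls print the same prefix, which is the string given in the suggestion above.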
Many thanks! That saved my ass. 👍
@Godricly You're welcome, my friend.
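@BowieHsu How can I use your pretrained model to skip the pretraining step? Does exp/sgd/checkpoint hold the model produced during pretraining? When I put your model there, it complains that the format is wrong.
@tianzhuotao The pretrain JSON file is for training a model on the SynthText dataset. If you do not want to train that model and want to train a model on ICDAR2015 directly instead,
@BowieHsu The finetune JSON file only has a finetune_model entry, and it seems that a checkpoint file has to exist under exp/sgd. I have none because I did not run pretraining, and your model also seems to contain only 3 files. How can I solve this?
You can see that finetune.json contains two lines
@BowieHsu Thank you so much! Bless you. I also fixed a few other problems (GPU-related and so on...) and it finally runs.
@tianzhuotao Keep an eye on the training loss. If you finetune directly from the VGG model, you need to adjust the learning rate. Just tune the hyperparameters step by step, and of course adapt the code to your actual task. Good luck.
@BowieHsu Thanks! I am currently using the default parameters, but training is very slow: 6% done after 7 hours qwq. Roughly how long did your training take? On the cluster I requested a 16-core CPU, 1 GPU, 32 GB of memory, and 10 GB of disk.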
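As a rough back-of-the-envelope check on the numbers above (7 hours for 6%), total training time can be extrapolated linearly. This assumes progress stays proportional to elapsed time, which the thread does not confirm; `estimate_total_hours` is a hypothetical helper.

```python
def estimate_total_hours(elapsed_hours, fraction_done):
    """Linear extrapolation: total time = elapsed time / fraction completed."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


total = estimate_total_hours(7, 0.06)
print(round(total, 1))      # estimated total hours
print(round(total - 7, 1))  # estimated hours remaining
```

At that rate the full run would take close to five days on the single GPU described.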
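Hello, I have also been working on multi-oriented text detection recently. Could we connect on QQ to discuss?
Hello, convert_caffemodel_to_ckpt.py contains "import model_vgg16". How do I install model_vgg16, and where should it go? Also, running run.sh raises a Caffe error. What I found online says it is a Python version problem and that I need to switch to Python 2.7, but your description says you use Python 3. Could you help clear this up?
@13230380356 I just solved the pretrain problem. See the tips I just wrote in #13.
Everything is OK until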
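@BowieHsu @JiasiWang I built the tf files from the 40 GB SynthText dataset. After 90000 iterations of pretraining, the default "finetune_model": "../exp/sgd/checkpoint" in finetune_ic15.json did not work, so I changed it to "finetune_model": "../exp/sgd/checkpoint-90000". I then trained for another 10000 iterations. The result on the IC15 test set was only ... Why did it not reach 75%?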
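Some background on the two paths: in TensorFlow 1.x, a file named `checkpoint` in the save directory is a small text state file that records the most recent checkpoint prefix, while a name like `checkpoint-90000` is a prefix for the actual weight files. `tf.train.latest_checkpoint(directory)` reads that state file. Which of the two forms this repository's loader expects is not confirmed in the thread. The sketch below parses such a state file with the standard library only; the function name and the example file content are illustrative assumptions.

```python
import os
import re


def latest_prefix_from_state(state_text, directory):
    """Return the prefix named by model_checkpoint_path, or None if absent."""
    match = re.search(
        r'^model_checkpoint_path:\s*"([^"]+)"', state_text, re.MULTILINE
    )
    if match is None:
        return None
    name = match.group(1)
    # Relative names are resolved against the directory holding the state file.
    return name if os.path.isabs(name) else os.path.join(directory, name)


# Illustrative state-file content, modelled on the names used in this thread.
state = (
    'model_checkpoint_path: "checkpoint-90000"\n'
    'all_model_checkpoint_paths: "checkpoint-90000"\n'
)
print(latest_prefix_from_state(state, "../exp/sgd"))
```

Resolving the prefix this way avoids hard-coding the iteration number into the finetune config.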
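After changing the batch size to 32, Hmean is still around 61%.
Testing with the pretrained model alone, without finetuning, gives an Hmean of 49%.
I get the same results as you, and for now I do not know how to improve them.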
Thanks for the clean and elegant code!
I tried to run training from scratch (using the vgg_16 model pretrained on ImageNet), but the training process looks weird.
Total Loss
And the corresponding curves for the other losses.
The loss quickly converged to about 10+. I tested the model, but no text boxes are detected. How can I diagnose this?