
Pretrained model loss stuck around a point? How many epochs to train the model? #38

Open
mukeshnarendran7 opened this issue Mar 31, 2022 · 3 comments

Comments

@mukeshnarendran7

For how many epochs was the TransPose-H-A4 pre-trained model fine-tuned on the MPII dataset to reach the benchmarks in the paper?
I am using parameters similar to those in the paper, but on a dataset with 10K images:
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the pre-trained TransPose-H-A4 model from torch.hub
model_tp = torch.hub.load('yangsenius/TransPose:main',
                          'tph_a4_256x192',
                          pretrained=True)

# Replace the final layer so the model predicts 18 keypoint heatmaps for my dataset
model_tp.final_layer = torch.nn.Sequential(torch.nn.Conv2d(96, 18, kernel_size=1))
model = model_tp.to(device)

# Fine-tune the pre-trained backbone with a smaller learning rate than the new head
pretrain_part = [param for name, param in model.named_parameters() if 'final_layer' not in name]
optimizer = torch.optim.Adam([{'params': pretrain_part, 'lr': 1e-5},
                              {'params': model.final_layer.parameters(), 'lr': 1e-4}])
criterion = torch.nn.MSELoss(reduction="mean")
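
The training loop is roughly the following (a minimal sketch; train_loader and its 18-channel target heatmaps are placeholders for my own data pipeline):

# Minimal fine-tuning loop (sketch): train_loader is assumed to yield
# (images, target_heatmaps) batches; for 256x192 inputs the heatmaps are
# typically shaped [B, 18, 64, 48].
num_epochs = 40
for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    for images, target_heatmaps in train_loader:
        images = images.to(device)
        target_heatmaps = target_heatmaps.to(device)

        optimizer.zero_grad()
        pred_heatmaps = model(images)
        loss = criterion(pred_heatmaps, target_heatmaps)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
    print(f"Epoch:{epoch}, loss:{running_loss / len(train_loader)}")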

Any suggestions for improving this situation would be helpful. Thanks.
I am trying to fine-tune the model, but the loss seems to get stuck:
Training model
Epoch:0, loss2.804723664186895, time taken:539.878s
Epoch:1, loss2.263692114269361, time taken:542.564s
Epoch:2, loss1.8802592728752643, time taken:542.661s
Epoch:3, loss1.5531523590907454, time taken:543.041s
Epoch:4, loss1.3379272652091458, time taken:543.445s
Epoch:5, loss1.1180460024625063, time taken:538.449s
Epoch:6, loss0.9673018065514043, time taken:534.550s
Epoch:7, loss0.8572808737517335, time taken:538.618s
Epoch:8, loss0.7790990431094542, time taken:535.940s
Epoch:9, loss0.7243237162474543, time taken:536.291s
Epoch:10, loss0.6794152171351016, time taken:535.745s
Epoch:11, loss0.6420647234190255, time taken:532.800s
Epoch:12, loss0.6094503253116272, time taken:531.308s
Epoch:13, loss0.5824214839958586, time taken:530.418s
Epoch:14, loss0.5580684408778325, time taken:530.618s
Epoch:15, loss0.538073766452726, time taken:531.255s
Epoch:16, loss0.5198041790281422, time taken:531.875s
Epoch:17, loss0.5046796562382951, time taken:529.682s
Epoch:18, loss0.49001771898474544, time taken:529.585s
Epoch:19, loss0.4768067048571538, time taken:530.031s
Epoch:20, loss0.46674167667515576, time taken:534.574s
Epoch:21, loss0.45518148655537516, time taken:532.242s
Epoch:22, loss0.4449854488193523, time taken:532.336s
Epoch:23, loss0.4369037283177022, time taken:533.899s
Epoch:24, loss0.4278696861874778, time taken:532.454s
Epoch:25, loss0.4207416394201573, time taken:538.248s
Epoch:26, loss0.41212902366532944, time taken:541.508s
Epoch:27, loss0.4052599307906348, time taken:540.419s
Epoch:28, loss0.3998840279818978, time taken:541.615s
Epoch:29, loss0.3926734702545218, time taken:541.612s
Epoch:30, loss0.3866453653026838, time taken:541.235s
Epoch:31, loss0.38077057831105776, time taken:540.944s
Epoch:32, loss0.37572325009386986, time taken:540.582s
Epoch:33, loss0.3709150122012943, time taken:540.616s
Epoch:34, loss0.36646912069409154, time taken:540.807s
Epoch:35, loss0.3614582328009419, time taken:541.298s
Epoch:36, loss0.35673171386588365, time taken:537.836s
Epoch:37, loss0.3524343741883058, time taken:538.538s
Epoch:38, loss0.34845523245166987, time taken:539.272s

@gaobo25

gaobo25 commented Apr 20, 2022

I have another problem when I train:

Test: [0/125] Time 0.854 (0.854) Loss 0.0014 (0.0014) Accuracy 0.000 (0.000)
Test: [100/125] Time 0.107 (0.124) Loss 0.0016 (0.0013) Accuracy 0.008 (0.008)

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.000

May I ask what the reason for this is?

@adnantariq18

I'm getting -1.000 for all of them. Can anyone tell me the reason?

@electroram

Because you need to provide the person detection boxes first. This model actually starts Transformer-based keypoint detection only after the person bounding boxes have been detected.
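
In other words, it is a top-down pipeline: each detected person box is cropped and resized to 256x192 before being fed to the model. A minimal sketch of that preprocessing (reusing model and device from the snippet above; the image path and box coordinates are placeholders):

import torch
import torchvision.transforms as T
from PIL import Image

# Top-down preprocessing sketch: crop the detected person box first,
# then run TransPose on the 256x192 crop.
image = Image.open("person.jpg").convert("RGB")
x1, y1, x2, y2 = 100, 50, 300, 450          # person bounding box from a detector
crop = image.crop((x1, y1, x2, y2))

preprocess = T.Compose([
    T.Resize((256, 192)),                    # model input size (height, width)
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
inp = preprocess(crop).unsqueeze(0).to(device)

model.eval()
with torch.no_grad():
    heatmaps = model(inp)                    # one heatmap per keypoint
print(heatmaps.shape)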
