
having trouble with convergence #21

Open
penguinshin opened this issue Dec 20, 2017 · 5 comments

@penguinshin

Hi, first off, thank you for the wonderful code. I am trying to replicate the toy blob example in PyTorch. I find that it converges to the accuracies you report only unreliably: sometimes it does not converge at all, and other times it reaches the 97% source / 97% target accuracy. Also, source-only training yields 50% accuracy on the target domain. I was wondering if there were any snags you encountered that hindered convergence?

Thanks

Austin

@DRJ2016

DRJ2016 commented Jan 19, 2018

I have the same question.

@Engineero

Does lowering your learning rate help?

@pumpikano
Owner

Actually, the blobs example is fairly unreliable in general; I can occasionally get poor results across repeated runs. Honestly, I didn't do any tuning of hyperparams; it was just a small, fast experiment to validate the implementation when I was writing it. If you find hyperparams that work better, please share them and I can update the example.

@penguinshin
Author

For me, the biggest discrepancy with the blobs example was that source-only training resulted in trivial (50%) accuracy. Did you get this as well? As for hyperparameters, widening the feature extractor (i.e. going from 8 to 50 hidden units) was what allowed it to converge at all for me; a sketch of the change is below.
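
Roughly the change I mean, as a PyTorch sketch (assuming 2-D blob inputs; the names and depth are illustrative, not the exact code I'm running):

```python
import torch.nn as nn

# Sketch of the change: a small MLP feature extractor for the 2-D blobs,
# with the hidden width exposed as a parameter. hidden_dim=8 would not
# converge for me; hidden_dim=50 did. The repo's example may differ.
class FeatureExtractor(nn.Module):
    def __init__(self, in_dim=2, hidden_dim=50):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)
```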

@pumpikano
Owner

I take it that the 50% was the source accuracy, not the target accuracy? In that case, there is certainly something wrong, but 50% accuracy on the target domain is not unusual if you only train on the source.

One thing that might help is annealing the gradient reversal parameter. I do this in the MNIST example, following the schedule presented in the paper, but for the blobs example I keep it fixed at -1 throughout training, which is almost certainly not optimal. A rough sketch of the schedule is below.
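
Since you are working in PyTorch, here is roughly what that looks like there (a translation of the idea, not code from this repo; the names are illustrative). The paper anneals lambda from 0 to 1 as lambda_p = 2 / (1 + exp(-10 p)) - 1, where p is training progress:

```python
import math
from torch.autograd import Function

class GradReverse(Function):
    """Identity on the forward pass; scales the gradient by -lambd on backward."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # No gradient flows to lambd itself, hence the trailing None.
        return grad_output.neg() * ctx.lambd, None

def grl_lambda(step, total_steps, gamma=10.0):
    """Annealing schedule from the paper: lambda ramps smoothly from 0 to 1."""
    p = step / total_steps
    return 2.0 / (1.0 + math.exp(-gamma * p)) - 1.0

# In the training loop, something like:
#   lambd = grl_lambda(step, total_steps)
#   domain_logits = domain_classifier(GradReverse.apply(features, lambd))
```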
