
Cannot replicate the results mentioned in the repo (English-Vietnamese) #61

Open
mjlaali opened this issue Aug 11, 2017 · 13 comments

mjlaali commented Aug 11, 2017

I cannot replicate the results mentioned in the repo. Here are my settings:
Python 2.7
Tensorflow 1.2.1
A Docker image based on nvidia/cuda:8.0-cudnn5-devel-ubuntu14.04

The command I ran was:
python2 -m nmt.nmt \
    --src=vi --tgt=en \
    --vocab_prefix=/data/nmt/iwslt15/vocab \
    --train_prefix=/data/nmt/iwslt15/train \
    --dev_prefix=/data/nmt/iwslt15/tst2012 \
    --test_prefix=/data/nmt/iwslt15/tst2013 \
    --out_dir=/data/nmt/models/nmt_attention_model \
    --hparams_path=nmt/standard_hparams/iwslt15.json \
    --num_gpus=2

I got a BLEU score of 24.83; however, the website reports 26.1.

lmthang (Contributor) commented Aug 13, 2017

You have --src=vi --tgt=en, so it's Vietnamese-English (for English-Vietnamese, try --src=en --tgt=vi), and that's about the number we got :) I'll update the tutorial with Vietnamese-English results as well.
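
For reference, this is the command from the first post with only the direction flags flipped for English-Vietnamese. One caveat: --out_dir should probably point to a fresh directory so the two directions don't reuse each other's checkpoints; the value is kept here only to mirror the original report:

    python2 -m nmt.nmt \
        --src=en --tgt=vi \
        --vocab_prefix=/data/nmt/iwslt15/vocab \
        --train_prefix=/data/nmt/iwslt15/train \
        --dev_prefix=/data/nmt/iwslt15/tst2012 \
        --test_prefix=/data/nmt/iwslt15/tst2013 \
        --out_dir=/data/nmt/models/nmt_attention_model \
        --hparams_path=nmt/standard_hparams/iwslt15.json \
        --num_gpus=2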

mjlaali (Author) commented Aug 17, 2017

Thanks for the clarification. I tried --src=en --tgt=vi and got a BLEU of 25.38 on the test set. Could you clarify whether the difference between my result and the reported one (0.72 BLEU) is normal?

bastings (Contributor) commented:

@mjlaali Have you tried running it again? It could just be variance (I got 26.4).

bastings (Contributor) commented:

@lmthang Regarding replicating the IWSLT results: I manage to reproduce them using Python 2.7.12, but with Python 3.5.2 I end up with a BLEU of 0.7, so something seems really broken there. I'm not sure yet what it is.

oahziur (Contributor) commented Aug 22, 2017

@bastings Yes, there is an encoding problem in Python 3.

I think this pull request fixed the problem.

bastings (Contributor) commented:

Hi @oahziur, in that PR you mention this is fixed in the latest version, but I tested it with the latest code yesterday, and it seems it is not fixed. Are you sure get_translation(...) in nmt_utils.py behaves correctly?

oahziur (Contributor) commented Aug 22, 2017

@bastings The printing-to-stdout part was fixed, but get_translation(...) still has a problem in Python 3. I think the change in the PR should fix get_translation(...) once it is merged into master.
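
To illustrate the kind of Python 3 issue under discussion: TensorFlow yields decoded tokens as byte strings, and code that treats them as str works in Python 2 (where bytes and str are the same type) but silently misbehaves in Python 3. A minimal sketch, assuming the general shape of get_translation(...) in nmt/utils/nmt_utils.py (the real function also handles BPE/SPM subword options, omitted here):

    import numpy as np

    def get_translation(nmt_outputs, sent_id, tgt_eos):
        """Select one decoded sentence from a batch and turn it into text."""
        # The eos marker must be encoded to bytes before comparing it with
        # the byte tokens, and the join must be done on bytes. If it stays
        # a str, the eos is never found in Python 3, padding leaks into the
        # output, and BLEU collapses (e.g. the 0.7 reported above).
        if tgt_eos:
            tgt_eos = tgt_eos.encode("utf-8")
        output = nmt_outputs[sent_id, :].tolist()
        if tgt_eos and tgt_eos in output:
            output = output[:output.index(tgt_eos)]  # cut at end-of-sentence
        return b" ".join(output).decode("utf-8")

    # Toy batch of byte tokens, as TensorFlow would return them:
    batch = np.array([[b"xin", b"chao", b"</s>", b"</s>"]])
    print(get_translation(batch, 0, "</s>"))  # -> "xin chao"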

bastings (Contributor) commented:

@oahziur Yes, you are right; I just validated it, and with those changes I get 25.9 on the IWSLT15 en->vi test set.

oahziur (Contributor) commented Aug 23, 2017

@bastings Are you loading the same model (the one that got 26.4 in Python 2) but getting 25.9 in Python 3?

mjlaali (Author) commented Aug 23, 2017

@bastings You are right: in a second run using two GPUs I got a BLEU of 26.3.

For the sake of clarity, the run with a BLEU of 25.38 trained the model on a CPU.

bastings (Contributor) commented Aug 23, 2017

It was a separate run, @oahziur, so nothing to worry about.
I checked, and I can do inference/evaluation in Python 3 without problems with a model trained in Python 2 (and get the same BLEU, after setting beam_width manually; maybe this setting should carry over from the model settings).

However, when setting the same random seed, I do get different results when training with Python 2 versus Python 3. Is this expected?

oahziur (Contributor) commented Aug 24, 2017

@bastings Thanks! I was worried there were still some encoding issues in Python 3, but I think that is fixed, since you can get consistent results. Some randomness during training is expected.

You should be able to reset beam_width so you can compare results across different beam widths. Do you mean beam_width was not set correctly after you switched to Python 3, and you had to reset it manually?

bastings (Contributor) commented:

Yes, it seemed that beam_width defaulted to 1 for inference even though it was set to 10 during training.
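
Until that setting carries over automatically, a workaround sketch: pass beam_width explicitly when running inference against the trained model. The input/output file paths below are hypothetical placeholders; the flag names are the ones used by nmt.nmt:

    python2 -m nmt.nmt \
        --out_dir=/data/nmt/models/nmt_attention_model \
        --inference_input_file=/tmp/infer_input.en \
        --inference_output_file=/tmp/translations.vi \
        --beam_width=10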
