version check in misc_utils.py likely needs to be changed #168
Hmmm, somehow the pound sign commenting out the dev20171024 line caused that text to render in bold face and a large font? |
@David-Levinthal tf-1.4 branch doesn't have the "-dev20171024" suffix anymore. |
which is why I suggested changing misc_utils.py in the nmt distribution
https://github.com/tensorflow/nmt/blob/master/nmt/utils/misc_utils.py
:-)
|
@David-Levinthal sorry, I meant you should check out tf-1.4 branch instead of master if you want to use tensorflow r1.4. If you want to use the master branch, you have to install tf-nightly. |
I thought that as NMT is now part of the tutorial suite you might want it
to work out of the box with the master / top-of-tree / r1.4 TF release. The
default nmt git clone at the moment requires the trivial edit to get it to
run at all.
It seems to work just fine, btw
:-)
for those of us who are not at google (anymore)
d
|
@David-Levinthal Thanks for the suggestion. The reason the dev suffix is enforced in the master branch is that it depends on a correctness bug fix for beam search and some structural changes to the LSTM cell in tensorflow that are not included in the r1.4 release. I will see if we can relax the version restriction in the future. |
thank you..
just trying to make it easier to use these tests for HW evaluation and the
SW support the vendors provide
d
|
@David-Levinthal any reason you want to test against master branch? Can you test with tf-1.4 branch? For example, you can clone the single tf-1.4 branch directly for testing.
|
I did..
same thing..
that was why I locally changed my version of misc_utils.py
:-)
BTW..Should bugs against the models distribution of ptb that is referenced
in the tutorials really be sent to stack overflow?
I am fine to do that..is there a standard policy on how issues should be
reported?
Re: [tensorflow/models] ptb fails on R1.4 built with cuda9, cudnn7 (#2769)
was closed and I was told to report it in stackoverflow
d
…On Sun, Nov 12, 2017 at 6:50 PM, Rui Zhao ***@***.***> wrote:
@David-Levinthal <https://github.com/david-levinthal> any reason you want
to test against master branch? Can you test with tf-1.4 branch?
For example, you can clone the single tf-1.4 branch directly for testing.
git clone --single-branch --branch tf-1.4 https://github.com/tensorflow/nmt/
|
@David-Levinthal hmm, on the tf-1.4 branch, the version string is "1.4.0" (https://github.com/tensorflow/nmt/blob/tf-1.4/nmt/utils/misc_utils.py#L32). Is there anything I can change on the tf-1.4 branch to make it work for you? |
sorry..my mistake..mea culpa
I thought you meant TF R1.4
:-)
thank you.
yea..that should work..I will switch to that version.
on another note. I have a few questions about interpreting the output of
nmt. Is there a preferred method for me to send them to you and would you
have time to answer them?
d
|
@David-Levinthal Feel free to post any questions related to the nmt codebase here. I will try to answer them. |
do you mean in this issue thread?
|
I am trying to understand the performance indicated by the training output.
Consider the output for the 4-layer default model, German to English, run
on a P4, TF r1.4 with cuda9/cudnn7.
The batch size is 128 and the average sentence length is just a bit below
30 (my value).
global step 29100 lr 1 step-time 2.08s wps 3.41K ppl 20.67 bleu 19.36
If one step is one mini-batch of 128 sentences in 2.08 s, one would
expect the words-per-second rate to be ~1850 instead of 3410.
Is the wps value the sum of input + output words?
Added note:
as the sentence length will differ between the two languages, it might be
good to print input and output wps values separately.
The number of FP operations per input word for a seq2seq LSTM is ~
8*num_layers*hidden_size*hidden_size for the forward pass (I have tested
this with nvprof; it is extremely accurate for large hidden sizes).
Training increases the count by ~3X.
So having input and output wps would give you the FLOP rates, as the input
seq has one bidirectional layer and the input and output wps rates are
going to be a little different (1.07 for German and English).
:-)
d
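A quick back-of-the-envelope sketch of the arithmetic above. The 128 / 30 / 2.08 figures come from the log line quoted in the comment, and the 1.07 source/target length ratio is the approximate de-en figure mentioned there; the assumption being tested is that the wps counter sums source and target words.

```python
# Rough sanity check of the wps number from the log line above:
#   global step 29100 ... step-time 2.08s wps 3.41K
# Assumes wps counts source + target words, with a source/target
# sentence-length ratio of ~1.07 (approximate de-en value).
batch_size = 128
avg_src_len = 30.0      # approximate average source sentence length
step_time = 2.08        # seconds per step, from the log

src_only_wps = batch_size * avg_src_len / step_time
print(round(src_only_wps))   # ~1850: source words alone, as estimated above

avg_tgt_len = avg_src_len / 1.07
total_wps = batch_size * (avg_src_len + avg_tgt_len) / step_time
print(round(total_wps))      # ~3570: much closer to the reported 3410
```

Counting both sides roughly doubles the rate, which is consistent with the ~1850 vs. 3410 discrepancy in the question.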
|
@David-Levinthal Yes, the wps count is (input words + output words). I believe it is just the value of this tensor (https://github.com/tensorflow/nmt/blob/master/nmt/model.py#L101). We just use the value as one of our performance measurements for the same language pair, but I think it makes sense to report input words, output words, and the total word count separately. |
Thank you, that would be greatly appreciated.
Should you have some time you would be willing to lose forever you might
look at
https://github.com/David-Levinthal/machine-learning
and the rather long pdf file there.
I would love to get it corrected by somebody who actually understands this
stuff as I am sure I have several appalling errors in it
:-)
target audience is mainly students, hw architects, hw performance type
(like me)
BTW, the expression for FP operations in an LSTM was missing a factor of
2 and should be
8*2*hidden_size*hidden_size.
This seems to be confirmed with glample and the older version of PTB in the
tf/models directory referenced in the tutorials.
:-)
d
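The corrected estimate above can be sketched as a small helper. The 4-layer / 1024-unit values below are illustrative only, not taken from a specific config, and the 3x training multiplier is the rough estimate quoted earlier in the thread.

```python
# Sketch of the corrected FLOP estimate: ~8 * 2 * hidden_size^2 FLOPs
# per input word per LSTM layer for the forward pass (8 matrix-vector
# MAC terms across the four gates' input and recurrent matmuls, times
# 2 to count multiplies and adds separately). Training is estimated
# at roughly 3x the forward-pass cost, per the thread.
def lstm_flops_per_word(num_layers, hidden_size, training=False):
    forward = 8 * 2 * hidden_size * hidden_size * num_layers
    return 3 * forward if training else forward

# Illustrative: a 4-layer stack with 1024 hidden units.
print(lstm_flops_per_word(4, 1024))                 # 67,108,864 forward
print(lstm_flops_per_word(4, 1024, training=True))  # ~3x that for training
```

Dividing a measured FLOP rate by this per-word figure (or multiplying by wps) gives the kind of hardware-utilization number the comment is after.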
…On Tue, Nov 14, 2017 at 7:55 PM, Rui Zhao ***@***.***> wrote:
@David-Levinthal <https://github.com/david-levinthal>
(sorry, I meant to create a new issue if it is unrelated this issue's
topic)
|
Could I ask you about BLEU scores?
There seem to be a plethora of different methods to evaluate accuracy, and
it is far from obvious how to compare them
and thus tell the relative accuracy of different translators.
for example tensor2tensor prints out something like:
INFO:tensorflow:Validation (step 12000): loss = 3.19503,
metrics-translate_ende_wmt32k/accuracy = 0.437646, global_step = 10833,
metrics-translate_ende_wmt32k/accuracy_per_sequence = 0.0,
metrics-translate_ende_wmt32k/accuracy_top5 = 0.628038,
metrics-translate_ende_wmt32k/rouge_L_fscore = 0.42923,
metrics-translate_ende_wmt32k/approx_bleu_score = 0.146297,
metrics-translate_ende_wmt32k/rouge_2_fscore = 0.207756,
metrics-translate_ende_wmt32k/neg_log_perplexity = -3.62409
INFO:tensorflow:global_step/sec: 1.87979
INFO:tensorflow:loss = 2.49598, step = 12001 (53.190 sec)
while NMT prints a bleu score between 20 and 30 for en-de depending on
config and duration of training etc
d
|
@David-Levinthal I think the bleu score is a percentage in NMT. You can do nmt_bleu_score / 100 to compare with the T2T bleu score. |
So for
metrics-translate_ende_wmt32k/approx_bleu_score = 0.146297,
I should use 14.6 as the comparable score for where the training was at that time?
d
…On Thu, Nov 16, 2017 at 8:41 PM, Rui Zhao ***@***.***> wrote:
@David-Levinthal <https://github.com/david-levinthal> I think the bleu
score is a percentage in NMT. You can do nmt_bleu_score / 100 to compare
with T2T bleu score.
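The unit conversion is trivial but easy to get wrong; a one-line sketch, with the caveat that T2T's approx_bleu_score is only an approximation of corpus BLEU, so the comparison is rough at best:

```python
# NMT reports BLEU as a percentage (e.g. 19.36), while T2T's
# approx_bleu_score is a fraction (e.g. 0.146297). Put both on one
# scale before comparing; note approx_bleu only approximates true
# corpus BLEU, so treat the comparison as indicative, not exact.
nmt_bleu = 19.36            # from the NMT log line earlier in the thread
t2t_approx_bleu = 0.146297  # from the T2T validation log quoted above
print(nmt_bleu / 100)       # 0.1936, on the same scale as 0.146297
```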
|
Very minor issue.
I downloaded nmt today and found it would not run on a TF build from an r1.4 git checkout. I then built today's (11/9) top of tree and got the same issue, due to misc_utils.py. I fixed it as follows:
def check_tensorflow_version():
    min_tf_version = "1.4.0"
    # min_tf_version = "1.4.0-dev20171024"
    if tf.__version__ < min_tf_version:
        raise EnvironmentError("Tensorflow version must >= %s" % min_tf_version)
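As an aside, comparing version strings lexicographically is fragile ("1.10.0" sorts before "1.4.0" as a string). A hypothetical, more robust check, not the project's actual code, might parse the numeric components instead:

```python
# Hypothetical sketch of a version check that compares numeric
# components rather than raw strings; a plain string comparison
# would wrongly treat "1.10.0" as older than "1.4.0".
def version_tuple(version):
    core = version.split("-")[0]  # drop suffixes like "-dev20171024"
    return tuple(int(part) for part in core.split("."))

def check_min_version(actual, minimum="1.4.0"):
    if version_tuple(actual) < version_tuple(minimum):
        raise EnvironmentError(
            "Tensorflow version must be >= %s" % minimum)

check_min_version("1.10.0")             # passes; fails as a string compare
check_min_version("1.4.0-dev20171024")  # also passes
```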