This repository has been archived by the owner on Dec 11, 2023. It is now read-only.

Why was bpe applied to both tokenized and cleaned texts #29

Open
anglil opened this issue Jul 24, 2017 · 0 comments

Comments

anglil commented Jul 24, 2017

In the WMT'16 preprocessing script, why is BPE applied to both the tokenized and the cleaned texts? Is BPE supposed to be applied only to the cleaned (pruned) texts? Thanks!

@anglil anglil changed the title Why bpe applied to both tokenized and cleaned texts Why was bpe applied to both tokenized and cleaned texts Jul 24, 2017