@@ -37,7 +37,7 @@ Attention Is All You Need
* learn and apply [Sentences were encoded using byte-pair encoding](https://github.com/SeonbeomKim/Python-Bype_Pair_Encoding)
* -num_merges: 35000
* -final_voca_threshold: 50
- * generated bpe files and voca
+ * generate bpe applied files and voca

## Code
* transformer.py
@@ -59,9 +59,9 @@ Attention Is All You Need
```
python make_dataset.py
-mode train
- -source_input_path path/bpe_wmt17.en [bpe text data]
+ -source_input_path path/bpe_wmt17.en [bpe applied document data]
-source_out_path path/source_idx_wmt17_en.csv [bpe idx data]
- -target_input_path path/bpe_wmt17.de [bpe text data]
+ -target_input_path path/bpe_wmt17.de [bpe applied document data]
-target_out_path path/source_idx_wmt17_de.csv [bpe idx data]
-bucket_out_path ./bpe_dataset/train_set_wmt17 [bucket trainset]
-voca_path voca_path/voca_file_name [bpe voca]
@@ -70,7 +70,7 @@ Attention Is All You Need
```
python make_dataset.py
-mode infer
- -source_input_path path/bpe_newstest2014.en [bpe text data]
+ -source_input_path path/bpe_newstest2014.en [bpe applied document data]
-source_out_path path/source_idx_newstest2014_en.csv [bpe idx data]
-target_input_path path/dev.tar/newstest2014.tc.de [original raw data]
-bucket_out_path ./bpe_dataset/valid_set_newstest2014 [bucket validset]
@@ -80,7 +80,7 @@ Attention Is All You Need
```
python make_dataset.py
-mode infer
- -source_input_path path/bpe_newstest2015.en [bpe text data]
+ -source_input_path path/bpe_newstest2015.en [bpe applied document data]
-source_out_path path/source_idx_newstest2015_en.csv [bpe idx data]
-target_input_path path/dev.tar/newstest2015.tc.de [original raw data]
-bucket_out_path ./bpe_dataset/valid_set_newstest2015 [bucket testset]
@@ -90,7 +90,7 @@ Attention Is All You Need
```
python make_dataset.py
-mode infer
- -source_input_path path/bpe_newstest2016.en [bpe text data]
+ -source_input_path path/bpe_newstest2016.en [bpe applied document data]
-source_out_path path/source_idx_newstest2016_en.csv [bpe idx data]
-target_input_path path/dev.tar/newstest2016.tc.de [original raw data]
-bucket_out_path ./bpe_dataset/valid_set_newstest2016 [bucket testset]
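The commands above turn BPE-applied text into index data and then into buckets. The sketch below illustrates one way such a pipeline could work; it is not taken from `make_dataset.py`. The function names, the `<pad>` and `<unk>` tokens, and the length-based bucketing rule are all assumptions made for illustration.

```python
# Hypothetical sketch: none of these names come from make_dataset.py.

def build_vocab(tokens, specials=("<pad>", "<unk>")):
    """Assign an integer id to each token; special tokens get the first ids."""
    vocab = {}
    for tok in list(specials) + list(tokens):
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


def encode_line(line, vocab, unk="<unk>"):
    """Map a whitespace-separated, BPE-applied line to a list of ids."""
    return [vocab.get(tok, vocab[unk]) for tok in line.split()]


def bucket_by_length(encoded, boundaries, pad_id=0):
    """Pad each sequence up to the smallest boundary that fits it.

    Sequences longer than the largest boundary are dropped (an assumption).
    """
    buckets = {b: [] for b in sorted(boundaries)}
    for ids in encoded:
        for b in sorted(boundaries):
            if len(ids) <= b:
                buckets[b].append(ids + [pad_id] * (b - len(ids)))
                break
    return buckets


vocab = build_vocab(["low", "er</w>", "new"])
source_idx = [encode_line("low er</w>", vocab),
              encode_line("low er</w> new xyz low", vocab)]
buckets = bucket_by_length(source_idx, boundaries=[3, 6])
```

Grouping sentences of similar length keeps padding small within a batch, which is the usual motivation for bucketing; whether the repository uses the same boundaries or padding rule is not stated in the README.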