Commit 82720d0

adarob authored and tfds-copybara committed
Add WMT translate datasets for 2017-19, with trivial extensibility to additional years.
PiperOrigin-RevId: 237342321
1 parent e790177 commit 82720d0

76 files changed (+1540 -398 lines)

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+běžím
+zmizel
+Plav
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+I am running (cc)
+
+I am swimming (cc)
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+ich renne
+es verschwand
+ich schwimme
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+I am running
+
+I am swimming
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+# This is a fake perl script to mimic the one written to filter CzEng 1.6 to
+# create CzEng 1.7. Our code just parses it to find the blocks that need to be
+# filtered out.
+
+use strict;
+
+my %bad = map { ($_, 1) } qw{
+2 3 5
+9 10 16
+};
+
+
+print STDERR "Done.\n";
+
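
The header comment of the fake script says the dataset code only parses it to find the blocks to filter out. A minimal sketch of what such parsing could look like is below. The function name `parse_bad_blocks` and the regular expression are illustrative assumptions, not the parser added in this commit, and the sketch assumes the block ids are the whitespace-separated tokens inside the `qw{ ... }` literal.

```python
import re
from typing import Set

# Hypothetical helper: the name and the regex are assumptions made for this
# sketch, not the implementation from the commit.
def parse_bad_blocks(perl_source: str) -> Set[str]:
    """Return the tokens listed inside the first `qw{ ... }` literal."""
    # Non-greedy match up to the first closing brace after `qw{`.
    match = re.search(r"qw\{(.*?)\}", perl_source, flags=re.DOTALL)
    if match is None:
        return set()
    # The ids are separated by spaces and newlines, so split() is enough.
    return set(match.group(1).split())

# The executable part of the fake filter script shown in the diff above.
FAKE_SCRIPT = """\
use strict;

my %bad = map { ($_, 1) } qw{
2 3 5
9 10 16
};

print STDERR "Done.\\n";
"""

bad_blocks = parse_bad_blocks(FAKE_SCRIPT)
```

The ids are kept as strings here because the script does not say whether they are numeric. A caller that needs integers could convert them after parsing.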
