This repository has been archived by the owner on Jul 7, 2023. It is now read-only.

Commit

Merge of PR #1895
PiperOrigin-RevId: 392071163
syzymon authored and copybara-github committed Aug 20, 2021
1 parent 874389b commit 7ae6d28
Showing 1 changed file with 6 additions and 4 deletions.
10 changes: 6 additions & 4 deletions tensor2tensor/data_generators/enwik8.py
@@ -131,10 +131,12 @@ def generate_encoded_samples(self, data_dir, tmp_dir, dataset_split):

 @registry.register_problem
 class Enwik8L2k(Enwik8L65k):
-  """Enwiki8, with examples up to 2048 characters long. Reads the input
-  byte-wise and chunks it into fragments of maximum length of 2048. Does not
-  shift byte indices (we do not assume cls or pad are used),
-  unlike the base class!"""
+  """Enwiki8, with examples up to 2048 characters long.
+  Reads the input byte-wise and chunks it into fragments of maximum
+  length of 2048. Does not shift byte indices (we do not assume cls or
+  pad are used), unlike the base class!
+  """
 
   READ_MODE = "rb"
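
For readers skimming this diff, the behaviour the docstring describes can be sketched roughly as below. This is an illustrative sketch, not the code in enwik8.py: `chunk_bytes` and `MAX_LEN` are hypothetical names, and it assumes that "does not shift byte indices" means each raw byte value (0 to 255) is used directly as a token id, with no ids reserved for pad or cls.

```python
from typing import Iterator, List

MAX_LEN = 2048  # maximum fragment length named in the docstring


def chunk_bytes(data: bytes, max_len: int = MAX_LEN) -> Iterator[List[int]]:
  """Yield fragments of raw byte values, each at most max_len long.

  Hypothetical helper for illustration only; not part of tensor2tensor.
  """
  for start in range(0, len(data), max_len):
    # Iterating over a bytes slice yields ints in 0..255. No offset is
    # added, so no ids are set aside for special tokens.
    yield list(data[start:start + max_len])


# Input as it would look when a file is opened in binary mode ("rb").
sample = b"ab" * 2049  # 4098 bytes
chunks = list(chunk_bytes(sample))
```

With a 4098-byte input, the sketch produces two full fragments of 2048 ids and a final fragment holding the remaining 2 ids.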
