Closed
Description
System Info
https://moon-ci-docs.huggingface.co/docs/transformers/pr_1/ja/preprocessing#pad
Part of the code in the documentation was mistakenly translated into Japanese.
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Expected behavior
batch_sentences = [
"でもセカンドブレックファーストはどうなるの?",
"セカンドブレックファーストについては知らないと思う、ピップ。",
"イレブンジーズはどうなの?",
]
encoded_input = tokenizer(batch_sentences, padding=True)
print(encoded_input)
should be
batch_sentences = [
"でもセカンドブレックファーストはどうなるの?",  # should be the original English text
"セカンドブレックファーストについては知らないと思う、ピップ。",  # should be the original English text
"イレブンジーズはどうなの?",  # should be the original English text
]
encoded_input = tokenizer(batch_sentences, padding=True)
print(encoded_input)
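
A possible way to find other affected snippets is to scan a translated page for Japanese characters inside string literals in code blocks. The sketch below assumes the page source is Markdown with triple-backtick fences. `find_translated_literals` is a hypothetical helper written for illustration, not part of transformers or its documentation tooling, and the sample page is made up.

```python
import re

# Hypothetical helper for illustration; not part of transformers or its doc tooling.
TICKS = "`" * 3  # fence marker, built this way to avoid a literal fence in this example

# Hiragana, Katakana, and common CJK ideographs.
JAPANESE = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff]")
# Body of a fenced code block (assumes Markdown sources with triple-backtick fences).
FENCE = re.compile(re.escape(TICKS) + r"[^\n]*\n(.*?)" + re.escape(TICKS), re.DOTALL)
# Simple double-quoted string literals that stay on one line.
STRING = re.compile(r'"([^"\n]*)"')


def find_translated_literals(markdown_text):
    """Return string literals in fenced code blocks that contain Japanese characters."""
    hits = []
    for block in FENCE.findall(markdown_text):
        for literal in STRING.findall(block):
            if JAPANESE.search(literal):
                hits.append(literal)
    return hits


# Made-up sample page: one translated literal and one untranslated literal.
page = (
    f"{TICKS}py\n"
    'batch_sentences = ["イレブンジーズはどうなの?", "And what about elevensies?"]\n'
    f"{TICKS}\n"
)
print(find_translated_literals(page))
```

Running a check like this over the translated docs would only flag candidates. A string literal that is legitimately Japanese would still need a manual look.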