🌐 [i18n-KO] Translated tokenizer_summary.md to Korean
#25023
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
sim-so left a comment
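Nayeon, I love that you always translate in plain, easy-to-understand language!
On top of that, this document was a useful tour of tokenizers 😊
While reviewing, I left a few suggestions related to the glossary.
Please take a look!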
Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-Authored-By: Juntae <79131091+sronger@users.noreply.github.com> Co-Authored-By: Injin Paek <71638597+eenzeenee@users.noreply.github.com>
8038515 to e9bfdc0
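It turns out I wrote my review but never submitted it... :(
Haha, luckily I'm revising the translation right now, so I saw it!
Maybe because it has been a while since I last translated, there are a lot of glossary-related fixes... Thank you for the thorough review!!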
stevhliu left a comment
Great job! Looks good overall, except for a few formatting things 👍
<a id='byte-pair-encoding'></a>
### 바이트 페어 인코딩(Byte-Pair Encoding, BPE)[[bytepair-encoding-bpe]]
Suggested change:
- ### 바이트 페어 인코딩(Byte-Pair Encoding, BPE)[[bytepair-encoding-bpe]]
+ ### 바이트 페어 인코딩 (Byte-Pair Encoding, BPE)[[bytepair-encoding-bpe]]
이전에 언급했듯이 어휘 크기(즉 기본 어휘 크기 + 병합 횟수)는 선택해야하는 하이퍼파라미터입니다.
예를 들어 [GPT](model_doc/gpt)의 기본 어휘 크기는 478, 40,000번의 병합 이후에 훈련을 종료하기 때문에 어휘 크기가 40,478입니다.
#### 바이트 수준 (Byte-level) BPE[[bytelevel-bpe]]
Suggested change:
- #### 바이트 수준 (Byte-level) BPE[[bytelevel-bpe]]
+ #### 바이트 수준 (Byte-level BPE) [[bytelevel-bpe]]
Thank you for the translations, @HanNayeoniee! Since 바이트 수준 refers to "Byte-level", either of the following would help:
- repeating BPE for clarity, or
- removing the space before the parenthesis
Suggested change:
- #### 바이트 수준 (Byte-level) BPE[[bytelevel-bpe]]
+ #### 바이트 수준 BPE (Byte-level BPE)[[bytelevel-bpe]]
Please resolve the remaining suggestions as well to merge the PR 😄
<a id='sentencepiece'></a>
### 센텐스피스(SentencePiece)[[sentencepiece]]
Suggested change:
- ### 센텐스피스(SentencePiece)[[sentencepiece]]
+ ### 센텐스피스 (SentencePiece)[[sentencepiece]]
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
What does this PR do?
Translated the tokenizer_summary.md file of the documentation to Korean.
Thank you in advance for your review.
Part of #20179
Before reviewing
[[lowercased-header]])
Who can review? (Initial)
@sronger, @TaeYupNoh, @kj021, @HanNayeoniee, @eenzeenee, @sim-so
Before submitting
Pull Request section?
to it if that's the case.
documentation guidelines, and
here are tips on formatting docstrings.
Who can review? (Final)