Fix the usage of nltk bug #1515
@@ -1641,6 +1641,8 @@ def _tokenize(self, text, is_sentencepiece=True):
text = convert_to_unicode(text)
text = " ".join(text.split()) # remove duplicate whitespace
nltk = try_import('nltk')
Why is try_import being called repeatedly inside a critical function? Imports like this must be done during initialization.
Moved the nltk import into the __init__ function.
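For readers following the thread, here is a minimal sketch of the pattern the reviewer asked for: resolve the optional dependency once at construction time and reuse it in the hot path. It is not the code from this PR. `SketchTokenizer`, `backend_name`, and the `importlib`-based `try_import` below are hypothetical stand-ins for the project's own tokenizer class and helper, and the tokenizing body is a placeholder.

```python
import importlib


def try_import(module_name):
    """Hypothetical stand-in for the project's try_import helper."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name} is required here; try `pip install {module_name}`"
        ) from exc


class SketchTokenizer:
    """Illustrative only; the real tokenizer class has a different interface."""

    def __init__(self, backend_name="nltk"):
        # Resolve the optional dependency once, so a missing package
        # fails fast at construction instead of on the first call.
        self._backend = try_import(backend_name)

    def _tokenize(self, text):
        text = " ".join(text.split())  # remove duplicate whitespace
        # A real implementation would call into self._backend here;
        # no import happens on this path.
        return text.split(" ") if text else []
```

With this shape, `_tokenize` can be called many times without touching the import machinery, and an absent dependency surfaces as an `ImportError` as soon as the tokenizer is built.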
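nltk is very slow and tends to stall when downloading models. Has this been evaluated? @joey12300
With a proxy enabled the download takes only a few seconds, but without one it takes five or six minutes, and since there is no progress bar it looks as if it has hung. I'll pull this command out and document it separately in the README.
OK, let's try to get this merged today.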
LGTM
PR types
Bug fixes
PR changes
APIs
Description