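When I load the default tokenizer to convert text, some characters come out as [UNK]:

    from transformers import BertTokenizer  # assuming the Hugging Face transformers package

    TOKENIZER = BertTokenizer.from_pretrained("bert-base-uncased", do_lower_case=True)
    tokenizer = TOKENIZER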
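One likely cause, assuming the affected characters are simply missing from the pretrained vocabulary: WordPiece splits each word greedily into the longest pieces it can find in the vocabulary, and if any part of a word cannot be matched, the whole word is replaced by [UNK]. The sketch below reproduces that behaviour in plain Python so the failing words can be listed. `TOY_VOCAB`, `wordpiece` and `find_unknown_words` are hypothetical names made up for this illustration, the vocabulary is a tiny stand-in rather than the real bert-base-uncased one, and splitting on whitespace is a simplification of BERT's pre-tokenization.

```python
# Toy vocabulary (hypothetical). The real bert-base-uncased vocabulary is far
# larger, but the lookup rule is the same: no matching piece means [UNK].
TOY_VOCAB = {"[UNK]", "token", "##izer", "##s", "the", "load"}


def wordpiece(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of a single word, WordPiece style."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is found in the vocab.
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # continuation pieces carry a ## prefix
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            # One unmatched position turns the whole word into the unknown token.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


def find_unknown_words(text, vocab):
    """Return the whitespace-separated words that would become [UNK]."""
    return [w for w in text.lower().split() if wordpiece(w, vocab) == ["[UNK]"]]


if __name__ == "__main__":
    print(wordpiece("tokenizers", TOY_VOCAB))
    print(find_unknown_words("Load the 加载 tokenizer", TOY_VOCAB))
```

With the real tokenizer, the same check can be done by comparing the output of `tokenizer.tokenize(text)` against `tokenizer.unk_token`. If the unknown pieces turn out to be characters the English vocabulary does not cover, a checkpoint trained on the target language is usually a better fit than bert-base-uncased.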