
Commit

Add a charset parameter to loadUserDict
When loadUserDict reads a dictionary file, a wrong character encoding can sometimes prevent the dictionary contents from being read correctly.
ender503 committed Aug 12, 2014
1 parent 619b5da commit 9618759
Showing 1 changed file with 7 additions and 2 deletions.
9 changes: 7 additions & 2 deletions src/main/java/com/huaban/analysis/jieba/WordDictionary.java
@@ -117,8 +117,13 @@ private String addWord(String word) {
return r.toString();
}


public void loadUserDict(File userDict) {
loadUserDict(userDict, Charset.forName("UTF-8"));
}


public void loadUserDict(File userDict, Charset charset) {
InputStream is;
try {
is = new FileInputStream(userDict);
@@ -128,7 +133,7 @@ public void loadUserDict(File userDict) {
return;
}
try {
- BufferedReader br = new BufferedReader(new InputStreamReader(is));
+ BufferedReader br = new BufferedReader(new InputStreamReader(is, charset));
long s = System.currentTimeMillis();
int count = 0;
while (br.ready()) {
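
To illustrate why the explicit charset matters, here is a minimal, self-contained sketch of the same overload pattern. It is not the project's code: the class `CharsetDictDemo`, the method `readLines`, and the sample entry are hypothetical names made up for this example, and UTF-16 merely stands in for whatever non-UTF-8 encoding a user dictionary might use.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; it is not part of jieba-analysis.
public class CharsetDictDemo {

    // Convenience overload: defaults to UTF-8, mirroring the one-argument loadUserDict.
    public static List<String> readLines(File file) throws IOException {
        return readLines(file, StandardCharsets.UTF_8);
    }

    // Decodes the file with the caller-supplied charset.
    public static List<String> readLines(File file, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), charset))) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        File dict = File.createTempFile("user", ".dict");
        dict.deleteOnExit();
        // A made-up dictionary entry, written in UTF-16 (escapes avoid source-encoding issues).
        String entry = "\u7D50\u5DF4 3 n";
        Files.write(dict.toPath(), (entry + "\n").getBytes(StandardCharsets.UTF_16));

        // Decoding with the matching charset recovers the entry.
        System.out.println(readLines(dict, StandardCharsets.UTF_16).contains(entry));
        // Decoding with the UTF-8 default yields garbled text instead.
        System.out.println(readLines(dict).contains(entry));
    }
}
```

With the two-argument method from the diff, a caller would presumably pass the dictionary's real encoding in the same way, for example `loadUserDict(file, Charset.forName("Big5"))`, assuming the running JVM supports that charset.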
