test segment speed, then update wiki.
piaolingxue committed Aug 6, 2013
1 parent 0ef1220 commit ca5ec8f
Showing 2 changed files with 37 additions and 3 deletions.
31 changes: 31 additions & 0 deletions README.org
@@ -45,6 +45,37 @@


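* Performance evaluation
- Test machine configuration
#+BEGIN_SRC screen
Processor 2 Intel(R) Pentium(R) CPU G620 @ 2.60GHz
Memory: 8GB

Many applications (eclipse, emacs, chrome...) were running on the machine
during the segmentation test, which may have affected the measured speed.
#+END_SRC
- [[src/test/resources/test.txt][Test text]]
- Test results (single thread; the test text is segmented line by line, and the whole pass is repeated 10,000 times or more)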
#+BEGIN_SRC screen
10,000 loop iterations (time in ms)
Run 1:
time escape:12373, rate:2486.986533kb/s, words:917319.94/s
Run 2:
time escape:12284, rate:2505.005241kb/s, words:923966.10/s
Run 3:
time escape:12336, rate:2494.445880kb/s, words:920071.30/s

20,000 loop iterations (time in ms)
Run 1:
time escape:22237, rate:2767.593144kb/s, words:1020821.12/s
Run 2:
time escape:22435, rate:2743.167762kb/s, words:1011811.87/s
Run 3:
time escape:22102, rate:2784.497726kb/s, words:1027056.34/s
Summary: dictionary load time is about 1.8 s; segmentation throughput is a little over 2 MB per second, close to 1 million words per second.

#+END_SRC
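As a sanity check on the figures above, the printed rates can be turned back into the size of one pass over the test text. The sketch below assumes the numbers were produced by the formulas in `testSegmentSpeed` (bytes / 1024 / seconds for `rate`, and `sentence.length()` summed for `words`, so "words" here are really characters). The class and method names are hypothetical and exist only for this illustration.

```java
public class ReportedFigures {
    /** Total bytes processed, recovered from a printed rate (KB/s) and elapsed time (ms). */
    public static double totalBytes(double rateKbPerSec, long elapsedMillis) {
        return rateKbPerSec * 1024.0 * elapsedMillis / 1000.0;
    }

    /** Total units (the test's "words") recovered from a printed per-second figure. */
    public static double totalUnits(double unitsPerSec, long elapsedMillis) {
        return unitsPerSec * elapsedMillis / 1000.0;
    }

    /** Size of one pass over the test text, rounded to the nearest whole unit. */
    public static long perPass(double total, int rounds) {
        return Math.round(total / rounds);
    }

    public static void main(String[] args) {
        // Figures copied from the first 10,000-iteration run.
        long bytes = perPass(totalBytes(2486.986533, 12373), 10000);
        long units = perPass(totalUnits(917319.94, 12373), 10000);
        System.out.println("bytes per pass: " + bytes);
        System.out.println("units per pass: " + units);
        // Convert the printed KB/s rate to MB/s for comparison with the summary line.
        System.out.printf("rate in MB/s: %.2f%n", 2486.986533 / 1024.0);
    }
}
```

Under those assumptions, both the 10,000 and 20,000 iteration runs work out to about 3,151 bytes and 1,135 characters per pass, which suggests the two sets of results are consistent with each other and that the summary's "over 2 MB per second" corresponds to roughly 2.4 to 2.7 MB/s.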



* License
The license of jieba (Python version) is MIT; the license of jieba (Java version) is Apache License 2.0.
@@ -128,14 +128,17 @@ public void testBugSentence() {
     @Test
     public void testSegmentSpeed() {
         long length = 0L;
+        long wordCount = 0L;
         long start = System.currentTimeMillis();
-        for (int i = 0; i < 100; ++i)
+        for (int i = 0; i < 20000; ++i)
             for (String sentence : sentences) {
                 segmenter.process(sentence, SegMode.INDEX);
                 length += sentence.getBytes().length;
+                wordCount += sentence.length();
             }
         long elapsed = (System.currentTimeMillis() - start);
-        System.out.println(String.format("time escape:%d, rate:%fkb/s", elapsed, (length * 1.0)
-                / 1024.0f / (elapsed * 1.0 / 1000.0f)));
+        System.out.println(String.format("time escape:%d, rate:%fkb/s, words:%.2f/s", elapsed,
+                (length * 1.0) / 1024.0f / (elapsed * 1.0 / 1000.0f), wordCount * 1000.0f
+                / (elapsed * 1.0)));
     }
 }
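
The measurement in `testSegmentSpeed` could also be factored into a small helper. The following is a minimal sketch, not the project's code: `ThroughputSketch`, `measure`, `kbPerSecond`, and `unitsPerSecond` are hypothetical names, and a `Consumer<String>` stands in for the call to `segmenter.process(sentence, SegMode.INDEX)`. It differs from the test in three assumed ways: it fixes the charset to UTF-8 instead of relying on the default used by `getBytes()`, it times with `System.nanoTime()`, and it does all arithmetic in `double` rather than mixing in `float` literals such as `1000.0f`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class ThroughputSketch {
    /** Kilobytes per second, given total bytes and elapsed milliseconds. */
    public static double kbPerSecond(long bytes, long elapsedMillis) {
        return (bytes / 1024.0) / (elapsedMillis / 1000.0);
    }

    /** Units (for example characters) per second, given a count and elapsed milliseconds. */
    public static double unitsPerSecond(long units, long elapsedMillis) {
        return units * 1000.0 / elapsedMillis;
    }

    /**
     * Runs the given work over every sentence for the given number of rounds.
     * Returns {total UTF-8 bytes, total chars, elapsed milliseconds}.
     */
    public static long[] measure(List<String> sentences, int rounds, Consumer<String> work) {
        long bytes = 0L;
        long chars = 0L;
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            for (String sentence : sentences) {
                work.accept(sentence); // stand-in for the segmenter call
                bytes += sentence.getBytes(StandardCharsets.UTF_8).length;
                chars += sentence.length(); // UTF-16 code units, not segmented words
            }
        }
        // Clamp to at least 1 ms so the rate formulas never divide by zero.
        long elapsedMillis = Math.max(1L, (System.nanoTime() - start) / 1_000_000L);
        return new long[] {bytes, chars, elapsedMillis};
    }

    public static void main(String[] args) {
        List<String> sentences = Arrays.asList("hello world", "segment me");
        // The no-op consumer is a placeholder; a real run would call the segmenter here.
        long[] r = measure(sentences, 1000, s -> { });
        System.out.printf("time elapsed:%d, rate:%fkb/s, chars:%.2f/s%n",
                r[2], kbPerSecond(r[0], r[2]), unitsPerSecond(r[1], r[2]));
    }
}
```

Like the original test, this sketch counts bytes and characters inside the timed loop, so the reported rate includes that bookkeeping as well as the segmentation itself.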
