test: consolidate utf8 text fixtures in tests
We previously used a text that appears to be an excerpt of https://zh.wikipedia.org/wiki/%E5%8D%97%E8%B6%8A%E5%9B%BD and can have copyright/license complications. It may also include some geopolitical nuances. The text has been repeated throughout the code base without much reuse.

This patch consolidates the fixtures by adding a common helper string as `fixtures.utf8TestText`, which is identical to a copy in test/fixtures/utf8_test_text.txt. It also updates the text to a copy of 蘭亭集序. It was chosen because:

1. It's a well-known Chinese classical piece written in 353 CE and therefore in the public domain. The string is copied from https://zh.wikisource.org/zh-hant/%E8%98%AD%E4%BA%AD%E9%9B%86%E5%BA%8F which contains a disclaimer of copyright for this reason.
2. The text is of suitable length for general UTF8 string read/write tests (including punctuation, 389 code points and 1167 bytes).
3. It is also commonly used as reference text for Chinese text layout tests.
4. It's a timeless and harmless preface for a collection of poems, written by an uncontroversial figure who passed away >1600 years ago, and it contains no geopolitical nuances.

Background and an English translation of this text can be found at https://en.wikipedia.org/wiki/Lantingji_Xu

PR-URL: #50732
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
1 parent eecab88, commit 94462d4
Showing 8 changed files with 34 additions and 55 deletions.
@@ -0,0 +1 @@
+永和九年,嵗在癸丑,暮春之初,會於會稽山隂之蘭亭,脩稧事也。羣賢畢至,少長咸集。此地有崇山峻領,茂林脩竹;又有清流激湍,暎帶左右。引以為流觴曲水,列坐其次。雖無絲竹管弦之盛,一觴一詠,亦足以暢敘幽情。是日也,天朗氣清,恵風和暢;仰觀宇宙之大,俯察品類之盛;所以遊目騁懐,足以極視聽之娛,信可樂也。夫人之相與,俯仰一世,或取諸懐抱,悟言一室之內,或因寄所託,放浪形骸之外。雖趣舎萬殊,靜躁不同,當其欣扵所遇,暫得扵己,怏然自足,不知老之將至。及其所之既惓,情隨事遷,感慨係之矣。向之所欣,俛仰之閒以為陳跡,猶不能不以之興懐;況脩短隨化,終期扵盡。古人云:「死生亦大矣。」豈不痛哉!每攬昔人興感之由,若合一契,未嘗不臨文嗟悼,不能喻之扵懐。固知一死生為虛誕,齊彭殤為妄作。後之視今,亦由今之視昔,悲夫!故列敘時人,錄其所述,雖世殊事異,所以興懐,其致一也。後之攬者,亦將有感扵斯文。
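The commit message describes the fixture by its size in code points and UTF-8 bytes and by its use in string read/write tests. As a rough illustration of what those two measurements mean, here is a minimal sketch in plain Node.js. The helper names `measureUtf8` and `roundTrips` are hypothetical and not part of this commit, and the sample is only the first four characters of the fixture rather than the full text.

```javascript
'use strict';

// Hypothetical helper (not from the commit): reports the two sizes the
// commit message quotes for the fixture, code points and UTF-8 bytes.
function measureUtf8(text) {
  return {
    // Array.from iterates a string by code point, not by UTF-16 code unit.
    codePoints: Array.from(text).length,
    // Buffer.byteLength gives the encoded size without allocating a Buffer.
    bytes: Buffer.byteLength(text, 'utf8'),
  };
}

// Hypothetical helper: the kind of encode/decode round trip a UTF-8
// read/write test might assert on.
function roundTrips(text) {
  return Buffer.from(text, 'utf8').toString('utf8') === text;
}

// Assumption: a short sample stands in for the full fixture text.
const sample = '永和九年';
console.log(measureUtf8(sample)); // { codePoints: 4, bytes: 12 }
console.log(roundTrips(sample)); // true
```

Each character in the sample encodes to three bytes in UTF-8, which matches the ratio the commit message reports for the full fixture (1167 bytes for 389 code points).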