Common Lisp implementation of Unicode normalization functions

sabracrolleton/uax-15

uax-15

Updated for Unicode 15

This package provides a Common Lisp function for Unicode normalization. It supports the four normalization forms NFC, NFD, NFKC and NFKD as defined in Unicode Standard Annex #15, found at http://www.unicode.org/reports/tr15/tr15-22.html.

This is a fork of a subset of work done by Takeru Ohta in 2010. Future work is intended to provide support for https://tools.ietf.org/html/rfc8264 and https://tools.ietf.org/html/rfc7564.

Implementation Notes

This has been successfully tested on SBCL, CCL, ECL, CLISP, ABCL, Allegro and CMUCL against the Unicode test file found at http://www.unicode.org/Public/UNIDATA/NormalizationTest.txt.
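
As an illustration of what testing against that file involves, here is a minimal sketch that parses one of its data lines. It is not the test harness used by this package. Each data line holds five semicolon-separated columns (source, NFC, NFD, NFKC, NFKD), each a space-separated list of hexadecimal code points, followed by a `#` comment. The names `split-on`, `hex-field-to-string`, `parse-test-line` and `line-conforms-p` are hypothetical helpers introduced for this example, and the check covers only the source column, not the full set of invariants listed in the file header.

```lisp
(defun split-on (char string)
  "Split STRING on CHAR, returning a list of substrings."
  (loop with start = 0
        for pos = (position char string :start start)
        collect (subseq string start pos)
        while pos
        do (setf start (1+ pos))))

(defun hex-field-to-string (field)
  "Turn a field such as \"0044 0307\" into the string of those code points."
  (map 'string
       (lambda (hex) (code-char (parse-integer hex :radix 16)))
       (remove "" (split-on #\Space field) :test #'string=)))

(defun parse-test-line (line)
  "Return the five columns of a data line as strings, or NIL for
comment lines, blank lines and @Part headers."
  (let ((data (string-trim " " (subseq line 0 (position #\# line)))))
    (unless (or (zerop (length data))
                (char= (char data 0) #\@))
      (mapcar #'hex-field-to-string
              (subseq (split-on #\; data) 0 5)))))

(defun line-conforms-p (line normalize-fn)
  "Partial check: normalizing the source column must give the other four.
NORMALIZE-FN is called as (funcall normalize-fn string form)."
  (let ((columns (parse-test-line line)))
    (or (null columns)
        (destructuring-bind (c1 c2 c3 c4 c5) columns
          (and (string= c2 (funcall normalize-fn c1 :nfc))
               (string= c3 (funcall normalize-fn c1 :nfd))
               (string= c4 (funcall normalize-fn c1 :nfkc))
               (string= c5 (funcall normalize-fn c1 :nfkd)))))))
```

With the library loaded, a line could then be checked with `(line-conforms-p line #'uax-15:normalize)`, assuming `normalize` is exported from a package named `uax-15`.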

Usage

It has one major exported function, which takes a string and a keyword naming the normalization method:

  • (normalize str unicode-normalization-method)

The currently supported normalization methods are :nfc, :nfkc, :nfd and :nfkd.

Normalization examples, with reference to the relevant xkcd (https://www.xkcd.com/936/):

    (normalize "正しい馬バッテリーステープル" :nfkc)
    "正しい馬バッテリーステープル"

    (normalize "الحصان الصحيح البطارية التيلة" :nfkc)
    "الحصان الصحيح البطارية التيلة"

    (normalize "stáplacha ceart ceallraí capall" :nfkc)
    "stáplacha ceart ceallraí capall"
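
The strings above come back unchanged. The following sketch uses inputs where the four forms give different results, and compares code points because the visual difference is hard to see in a terminal. It assumes the library is already loaded (for example through Quicklisp under the system name "uax-15") and that `normalize` is exported from a package named `uax-15`. The helper `code-points` is a hypothetical name introduced for this example.

```lisp
;; Assumption: the library is loaded, e.g. via (ql:quickload "uax-15"),
;; and NORMALIZE is exported from a package named UAX-15.

(defun code-points (string)
  "Return the code points of STRING as a list of integers."
  (map 'list #'char-code string))

;; "e" followed by U+0301 COMBINING ACUTE ACCENT
(defparameter *decomposed* (coerce (list #\e (code-char #x301)) 'string))

;; U+00E9 LATIN SMALL LETTER E WITH ACUTE (precomposed)
(defparameter *composed* (string (code-char #xE9)))

;; U+FB01 LATIN SMALL LIGATURE FI, which has only a compatibility decomposition
(defparameter *ligature* (string (code-char #xFB01)))

;; Canonical composition and decomposition
(code-points (uax-15:normalize *decomposed* :nfc))  ; => (233)
(code-points (uax-15:normalize *composed* :nfd))    ; => (101 769)

;; Canonical forms leave the ligature alone; compatibility forms expand it
(code-points (uax-15:normalize *ligature* :nfc))    ; => (64257)
(code-points (uax-15:normalize *ligature* :nfkc))   ; => (102 105)
```

The :nfkc and :nfkd forms are lossy in this sense: once the ligature has been expanded to "fi", the original character cannot be recovered from the result.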

To Do list

More relevant xkcd https://xkcd.com/1726/, https://xkcd.com/1953/, https://www.xkcd.com/1209/, https://xkcd.com/1137/

Data Files

Other References