@colinodell
Contributor

These changes were created by:

  1. Checking out the latest master of this repository
  2. Downloading and extracting https://www.unicode.org/Public/zipped/14.0.0/UCD.zip
  3. Running the included ./ucgendat.php script

All tests seem to pass locally, but I'd appreciate a second set of eyes to ensure everything looks good.
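
For anyone repeating the first two steps for a later Unicode version, here is a rough Python sketch of the download-and-extract part. The helper names (`ucd_zip_url`, `extract_ucd`, `download_ucd`) are made up for illustration and are not part of this repository; only the URL pattern comes from step 2 above. Running the generator itself (step 3) is left to the PHP script and is not reproduced here.

```python
import io
import urllib.request
import zipfile
from pathlib import Path

# URL pattern taken from step 2; only the version segment is assumed to vary.
UCD_ZIP_URL = "https://www.unicode.org/Public/zipped/{version}/UCD.zip"


def ucd_zip_url(version: str) -> str:
    """Return the download URL for a given UCD version, e.g. "14.0.0"."""
    return UCD_ZIP_URL.format(version=version)


def extract_ucd(zip_bytes: bytes, dest: Path) -> list:
    """Unpack an in-memory UCD.zip into dest and return the sorted member names."""
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        archive.extractall(dest)
        return sorted(archive.namelist())


def download_ucd(version: str, dest: Path) -> list:
    """Fetch UCD.zip for the given version (needs network access) and extract it."""
    with urllib.request.urlopen(ucd_zip_url(version)) as response:
        return extract_ucd(response.read(), dest)
```

Calling `download_ucd("14.0.0", Path("ucd-data"))` would fetch the archive named in step 2 into a hypothetical `ucd-data` directory; the extraction is kept separate from the download so it can be exercised without network access.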

Member

@nikic nikic left a comment

Going to merge this into 8.1.

@nikic nikic closed this in fe36b81 Sep 20, 2021
Ayesh added a commit to Ayesh/php-src that referenced this pull request Jun 26, 2024
Updates UCD to Unicode 15.1 (released September 2023). The upcoming
Unicode 16 version is expected around September 2024.

Previously: 0fdffc1, php#7502

UCD 15.1 `DerivedNormalizationProps` contains multiple properties on
the same line, which breaks the parser. This also updates the
`ucgendat.php` script to allow two or three fields in each line, and to
look for the `Cased` and `Case_Ignorable` properties in either of the
property fields to mimic the previous behavior.
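
The parsing change described in this commit message can be pictured with a small Python sketch. This is not the code from `ucgendat.php` (which is PHP); the function names and the sample lines are assumptions written in the general `range ; property # comment` style of UCD data files, purely to show what accepting two or three fields and matching `Cased` / `Case_Ignorable` in either property field might look like.

```python
WANTED = ("Cased", "Case_Ignorable")

# Hand-written sample in the UCD line style (illustrative, not copied from a UCD file).
SAMPLE = """\
# comment-only lines and blank lines are skipped

0041..005A    ; Cased # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
0027          ; Case_Ignorable # Po       APOSTROPHE
094D          ; InCB; Linker # Mn       DEVANAGARI SIGN VIRAMA
"""


def parse_property_line(line):
    """Return (first, last, prop) for a wanted property, or None for other lines."""
    data = line.split("#", 1)[0].strip()  # drop the trailing comment
    if not data:
        return None
    fields = [field.strip() for field in data.split(";")]
    if len(fields) not in (2, 3):  # accept two or three fields, reject anything else
        raise ValueError("unexpected field count: %r" % line)
    code_range, *props = fields
    matched = [prop for prop in props if prop in WANTED]
    if not matched:
        return None  # a property this sketch does not care about
    start, _, end = code_range.partition("..")
    first = int(start, 16)
    last = int(end, 16) if end else first  # a single code point is a one-element range
    return first, last, matched[0]


def collect_ranges(text):
    """Group the code point ranges found in text by wanted property name."""
    ranges = {prop: [] for prop in WANTED}
    for line in text.splitlines():
        parsed = parse_property_line(line)
        if parsed is not None:
            first, last, prop = parsed
            ranges[prop].append((first, last))
    return ranges
```

With the sample above, the three-field line is accepted without error but contributes nothing, because neither of its property fields is one of the two wanted names; a stricter parser that insisted on exactly two fields would have failed on it instead.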
alexdowad pushed a commit that referenced this pull request Jun 29, 2024
Updates UCD to Unicode 15.1 (released September 2023). The upcoming
Unicode 16 version is expected around September 2024.

Previously: 0fdffc1, #7502

UCD 15.1 `DerivedNormalizationProps` contains multiple properties on
the same line, which breaks the parser. This also updates the
`ucgendat.php` script to allow two or three fields in each line, and to
look for the `Cased` and `Case_Ignorable` properties in either of the
property fields to mimic the previous behavior.
Ayesh added a commit to Ayesh/php-src that referenced this pull request Sep 15, 2024
Updates UCD to Unicode 16.0 (released September 2024).

Previously: 0fdffc1, php#7502, php#14680

Unicode 16 adds several new character sets and case folding rules.
However, the existing ucgendat script can still parse them.

This also adds a couple of test cases to make sure the new rules for
East Asian Wide characters and case folding work correctly. These
tests fail on Unicode 15.1 and older because those versions do not
contain those rules.
alexdowad pushed a commit that referenced this pull request Sep 17, 2024
Updates UCD to Unicode 16.0 (released September 2024).

Previously: 0fdffc1, #7502, #14680

Unicode 16 adds several new character sets and case folding rules.
However, the existing ucgendat script can still parse them.

This also adds a couple of test cases to make sure the new rules for
East Asian Wide characters and case folding work correctly. These
tests fail on Unicode 15.1 and older because those versions do not
contain those rules.
@mvorisek mvorisek mentioned this pull request Sep 11, 2025
7 tasks