CLDR-16239 reduce lateral inher., harden some values to ensure same when resolved #2734

Closed
wants to merge 1 commit into from

btangmu
Member

@btangmu btangmu commented Feb 21, 2023

- New -fW for CLDRModify: hardens values based on the output of CompareResolved
- Minor revision of CompareResolved so that lines start with ≠DIFF and the locale
- Resulting XML

CLDR-16239

  • This PR completes the ticket.

ALLOW_MANY_COMMITS=true

@btangmu btangmu self-assigned this Feb 21, 2023
@btangmu
Member Author

btangmu commented Feb 21, 2023

this time I started with the data immediately preceding #2722

that is, I started with the data from cc8805a which is the parent of 36bf3ce

@btangmu
Member Author

btangmu commented Feb 21, 2023

the goal here is not actually to change any code or data, but to check whether there are any problematic discrepancies between this data and the data that resulted from #2722

as we discussed, there's more than one way to "harden" the xml while ensuring the fully-resolved values remain the same; this alternative way is for comparison

the way I ran CLDRModify and CompareResolved is very complicated; the actual steps follow

rm ../Generated/cldr/cldrModify/*.xml
java -DCLDR_DIR=$(pwd) -jar tools/cldr-code/target/cldr-code.jar modify -fV -scommon/main > ../tmp_fv.txt
mkdir -p ../vetdata-2023-02-21-modV/vxml/common/main
mkdir -p ../vetdata-2023-02-21-modV/vxml/common/dtd
cp common/main/*.xml ../vetdata-2023-02-21-modV/vxml/common/main
cp common/dtd/ldml.dtd ../vetdata-2023-02-21-modV/vxml/common/dtd
mv ../Generated/cldr/cldrModify/*.xml ../vetdata-2023-02-21-modV/vxml/common/main
java -DCLDR_DIR=$(pwd) -jar tools/cldr-code/target/cldr-code.jar CompareResolved -v -c ../vetdata-2023-02-21-modV/vxml/common/main > ../ResolvedDifferencesRaw.txt
egrep ≠DIFF ../ResolvedDifferencesRaw.txt > ../ResolvedDifferences.txt
egrep -c ≠DIFF ../ResolvedDifferences.txt
[expect about 1,951]

mkdir -p ../tmp-2023-02-21-copy-main
cp common/main/*.xml ../tmp-2023-02-21-copy-main
cp ../vetdata-2023-02-21-modV/vxml/common/main/*.xml common/main

java -DCLDR_DIR=$(pwd) -jar tools/cldr-code/target/cldr-code.jar modify -fW -scommon/main > ../tmp_fw.txt
mkdir -p ../vetdata-2023-02-21-modW/vxml/common/main
mkdir -p ../vetdata-2023-02-21-modW/vxml/common/dtd
cp common/main/*.xml ../vetdata-2023-02-21-modW/vxml/common/main
cp common/dtd/ldml.dtd ../vetdata-2023-02-21-modW/vxml/common/dtd
mv ../Generated/cldr/cldrModify/*.xml ../vetdata-2023-02-21-modW/vxml/common/main

cp ../tmp-2023-02-21-copy-main/*.xml common/main
java -DCLDR_DIR=$(pwd) -jar tools/cldr-code/target/cldr-code.jar CompareResolved -v -c ../vetdata-2023-02-21-modW/vxml/common/main > ../ResolvedDifferencesRaw2.txt
egrep -c ≠DIFF ../ResolvedDifferencesRaw2.txt
[expect about 2]

egrep ≠DIFF ../ResolvedDifferencesRaw2.txt >> ../ResolvedDifferences.txt
java -DCLDR_DIR=$(pwd) -jar tools/cldr-code/target/cldr-code.jar modify -fW -scommon/main > ../tmp_fw2.txt
mkdir -p ../vetdata-2023-02-21-modW2/vxml/common/main
mkdir -p ../vetdata-2023-02-21-modW2/vxml/common/dtd
cp common/main/*.xml ../vetdata-2023-02-21-modW2/vxml/common/main
cp common/dtd/ldml.dtd ../vetdata-2023-02-21-modW2/vxml/common/dtd
mv ../Generated/cldr/cldrModify/*.xml ../vetdata-2023-02-21-modW2/vxml/common/main
cp ../tmp-2023-02-21-copy-main/*.xml common/main
java -DCLDR_DIR=$(pwd) -jar tools/cldr-code/target/cldr-code.jar CompareResolved -v -c ../vetdata-2023-02-21-modW2/vxml/common/main > ../ResolvedDifferencesRaw3.txt
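
The steps above repeat the same mkdir/cp/mv sequence three times with different directory names, and count ≠DIFF lines after each CompareResolved run. A minimal sketch of how that repetition could be factored out, assuming a POSIX shell; the function names `snapshot_vxml` and `count_diffs` are hypothetical and are not part of the CLDR tools:

```shell
#!/bin/sh
# Hypothetical helpers for the manual steps above; these names are
# illustrative only and do not exist in the CLDR repository.

# snapshot_vxml SRC_ROOT GENERATED_DIR DEST
# Builds DEST/vxml/common/{main,dtd} from SRC_ROOT/common, then moves the
# files written to GENERATED_DIR over the copies in DEST/vxml/common/main.
snapshot_vxml() {
    src_root=$1
    generated=$2
    dest=$3
    mkdir -p "$dest/vxml/common/main" "$dest/vxml/common/dtd" || return 1
    cp "$src_root"/common/main/*.xml "$dest/vxml/common/main" || return 1
    cp "$src_root/common/dtd/ldml.dtd" "$dest/vxml/common/dtd" || return 1
    mv "$generated"/*.xml "$dest/vxml/common/main" || return 1
}

# count_diffs FILE
# Prints the number of lines in FILE that contain the ≠DIFF marker.
# grep -c exits nonzero when the count is 0, so its status is ignored.
count_diffs() {
    grep -c -F -e '≠DIFF' "$1" || true
}

# Example, using the paths from the steps above (run from the CLDR root):
#   snapshot_vxml . ../Generated/cldr/cldrModify ../vetdata-2023-02-21-modV
#   count_diffs ../ResolvedDifferencesRaw.txt
```

The helpers only wrap the file handling; the CLDRModify and CompareResolved invocations would still be run exactly as listed.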

@btangmu
Member Author

btangmu commented Feb 21, 2023

first impression is that these changes indicate a smaller amount of "hardening" in my method than in Mark's method

in some cases however it's the other way around; some sub-locales get a few "hardened" units such as "length-nautical-mile"

@btangmu
Member Author

btangmu commented Feb 21, 2023

Test error: "Error: (TestBasic.java:757) Error: Default content file not empty: nn_NO"

Member

@srl295 srl295 left a comment

in all the many cases i spot checked, this replaces real values with inheritance marker. Is that the opposite of what this says it does?

@macchiati
Member

"in all the many cases i spot checked, this replaces real values with inheritance marker. Is that the opposite of what this says it does?"

I think this is working from the original source, before the hardening that my PR applied. So it would be 'restoring' the values.

I scanned it over, and I think there are some valuable learnings. What I think we should do is go forward with what is in main, but for v44 look at reprocessing the source to add some inheritance markers. Namely,

  • In an L1 locale, if no other path inherits from this path, and the value is a hard value that is identical to root or code fallback, then we could replace the value with the inheritance marker. However, we may decide not to do this if we don't want changes in root to percolate upwards; root is quite different from, say, de.xml, in that it is usually just a stand-in if the language fails to have anything.

Let's file a ticket to consider it.
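
The L1-locale rule proposed above can be sketched as a small decision function. This is only an illustration under stated assumptions: the function name `soften_l1_value` and its three arguments are hypothetical, the number of dependent paths is assumed to be computed elsewhere, and real CLDR tooling would operate on locale files rather than plain strings. The marker value ↑↑↑ is the inheritance marker as it appears in CLDR XML.

```shell
#!/bin/sh
# Sketch of the proposed L1-locale rule. All names here are hypothetical.

INHERITANCE_MARKER='↑↑↑'   # inheritance marker as written in CLDR XML

# soften_l1_value VALUE FALLBACK_VALUE DEPENDENT_COUNT
#   VALUE            value stored in the L1 locale for some path
#   FALLBACK_VALUE   what root or code fallback supplies for the same path
#   DEPENDENT_COUNT  how many other paths inherit from this path
# Prints the value that would be stored after applying the rule.
soften_l1_value() {
    value=$1
    fallback=$2
    dependents=$3
    # Replace only a hard value, only when nothing else inherits from the
    # path, and only when the value is identical to the fallback.
    if [ "$value" != "$INHERITANCE_MARKER" ] \
        && [ "$dependents" -eq 0 ] \
        && [ "$value" = "$fallback" ]; then
        printf '%s\n' "$INHERITANCE_MARKER"
    else
        printf '%s\n' "$value"
    fi
}

# Example:
#   soften_l1_value 'km' 'km' 0          # identical to fallback, no dependents
#   soften_l1_value 'Kilometer' 'km' 0   # differs from fallback, kept as-is
```

The caveat in the comment above still applies: once a value is replaced by the marker, later changes in root would flow into the locale, which may be a reason not to apply the rule at all.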

@btangmu
Member Author

btangmu commented Feb 22, 2023

"I think this is working from the original source, before the hardening that my PR applied. So it would be 'restoring' the values." -- yes, that's my understanding as well -- in other words, these are cases where Mark's method hardened the value and my method didn't -- again, though, that's not a very important difference as long as the resulting resolved values are the same

@btangmu
Member Author

btangmu commented Feb 22, 2023

"valuable learnings ... for v44 ... file a ticket" -- good!

@btangmu
Member Author

btangmu commented Feb 22, 2023

New v44 ticket is https://unicode-org.atlassian.net/browse/CLDR-16414

@btangmu btangmu closed this Feb 22, 2023
@btangmu btangmu deleted the t16239_xx branch February 22, 2023 17:04