Commit
CLDR-11888 Update French speakers (#3985)
This change makes multiple updates to the French-speaking populations listed in CLDR. Thankfully, much of this data was compiled in [a comprehensive meta-study](https://observatoire.francophonie.org//wp-content/uploads/2022/03/odsef-lfdm-2022.pdf) by the International Francophone Organization (OIF), with many collaborators such as the University of Laval. Some estimates come from other sources that I found while vetting the OIF estimates.

## Jira Tickets

This change fixes 4 CLDR tickets about French population estimates:

* Syria [CLDR-11885](https://unicode-org.atlassian.net/browse/CLDR-11885) -- French is no longer an official language of Syria. Unfortunately I could not find a valid population estimate, so I left the number as-is.
* Haiti [CLDR-11886](https://unicode-org.atlassian.net/browse/CLDR-11886)
* DRCongo [CLDR-11887](https://unicode-org.atlassian.net/browse/CLDR-11887)
* Djibouti [CLDR-11888](https://unicode-org.atlassian.net/browse/CLDR-11888) -- Note that after this change CLDR will recognize that French IS the biggest language of Djibouti, not Afar. It's surprising but backed up by usage data.

## Sources

I couldn't accept all data points from that study (as Mark Davis recommended, I sought corroborating sources), so I left out Burundi, Cameroon, Mauritius, Germany, Portugal, Belgium, and Andorra because of large, uncertain discrepancies. Furthermore, I used the Canadian census directly for Canada, and the Swiss census website for Switzerland (updating all of them because it wasn't that much and the estimates were very different).

I didn't cite every primary source, since the OIF sometimes added up numbers from multiple sources and I didn't have time to thoroughly open every cited census. The Eurostat website contains many surveys that were compiled by the French organization -- [the website for one of them is here](https://ec.europa.eu/eurostat/web/microdata/adult-education-survey).
## Effects

* Likely subtags (based on the relative population)
  * French is now the largest language in 2 countries
    * Democratic Republic of Congo (replacing Swahili)
    * Djibouti (replacing Afar)
  * French is no longer the largest language in 3 countries
    * Central African Republic (replaced by Sango)
    * Senegal (replaced by Wolof)
    * Chad (replaced by Arabic)
* French is no longer considered official for Syria

Many countries lost a lot of French users: Mali, Madagascar, Hungary, Niger, Comoros, Chad, and Central African Republic. Since the prior figures did not have citations and other sources tended to agree with the new ones, I made the change.

Some countries gained a lot: D.R. Congo, Djibouti, Haiti, Lebanon, Switzerland, Morocco, Wallis & Futuna, French Polynesia, Rwanda.

For countries that had large changes, I double-checked with other sources that the new figure made sense, especially since it will change likely subtags. There could be literacy gaps favoring French -- if you want me to press on that data I can investigate.

# Steps

- [x] This PR completes 4 tickets.
  - [x] [CLDR-11885](https://unicode-org.atlassian.net/browse/CLDR-11885)
  - [x] [CLDR-11886](https://unicode-org.atlassian.net/browse/CLDR-11886)
  - [x] [CLDR-11887](https://unicode-org.atlassian.net/browse/CLDR-11887)
  - [x] [CLDR-11888](https://unicode-org.atlassian.net/browse/CLDR-11888)
- [x] mvn package -DskipTests=true
- [x] java -jar tools/cldr-code/target/cldr-code.jar ConvertLanguageData
- [x] java -jar tools/cldr-code/target/cldr-code.jar GenerateLikelySubtags
- [x] java -jar tools/cldr-code/target/cldr-code.jar GenerateTestData
- [x] mvn package

Thanks for reviewing this large change -- it was fun to read so many sources, read in French and German, and learn more about countries around the world.
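For reviewers wondering why a population update can flip the likely subtags for a country, here is a minimal sketch of the idea that the largest language is picked by relative speaker population. It is not the logic of `GenerateLikelySubtags`, which may weigh other factors. The function name `largest_language`, the alphabetical tie-break, and the numbers are all assumptions made for illustration; the figures are placeholders in arbitrary units, not CLDR data.

```python
from typing import Dict


def largest_language(speakers: Dict[str, int]) -> str:
    """Return the language code with the most speakers in one territory.

    Ties are broken alphabetically by language code so the result is
    deterministic; that tie-break is an assumption of this sketch.
    """
    if not speakers:
        raise ValueError("no language data for this territory")
    # Sort key: highest population first, then language code ascending.
    return min(speakers, key=lambda lang: (-speakers[lang], lang))


# Placeholder figures only (hypothetical, not CLDR data): they exist
# just to show how raising one estimate changes the winner.
before = {"aa": 3, "fr": 2}
after = {"aa": 3, "fr": 4}

print(largest_language(before))  # aa
print(largest_language(after))   # fr
```

With only the `fr` estimate raised, the winner changes from `aa` to `fr`, which is the kind of flip described above for Djibouti and D.R. Congo.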