CLDR-11888 Update French speakers (#3985)
This change makes multiple updates to the French-speaking population listed in CLDR. Thankfully, much of this data was compiled in [a comprehensive meta-study](https://observatoire.francophonie.org//wp-content/uploads/2022/03/odsef-lfdm-2022.pdf) by the International Francophone Organization (OIF), with many collaborators such as the University of Laval. Some estimates come from other sources that I consulted while vetting the OIF estimates.

## Jira Tickets

There are 4 CLDR tickets about French population estimates fixed by this change:
*  Syria [CLDR-11885](https://unicode-org.atlassian.net/browse/CLDR-11885) -- French is no longer an official language of Syria. Unfortunately, I could not find a valid population estimate, so I left the number as-is.
*  Haiti [CLDR-11886](https://unicode-org.atlassian.net/browse/CLDR-11886)
*  DRCongo [CLDR-11887](https://unicode-org.atlassian.net/browse/CLDR-11887)
*  Djibouti [CLDR-11888](https://unicode-org.atlassian.net/browse/CLDR-11888) -- Note that after this change CLDR will recognize French, not Afar, as the biggest language of Djibouti. It's surprising but backed up by usage data.

## Sources

I couldn't accept all data points from that study (as Mark Davis recommended, I sought corroborating sources) -- leaving out Burundi, Cameroon, Mauritius, Germany, Portugal, Belgium, and Andorra because of large, uncertain discrepancies.

Furthermore, I used the Canadian census directly for Canada, and the Swiss census website for Switzerland (updating all of them because there weren't many and the estimates were very different).

I didn't cite every primary source, since the OIF sometimes added up numbers from multiple sources and I didn't have time to thoroughly check every cited census. The Eurostat website contains many surveys that were compiled by the French organization -- [the website for one of them is here](https://ec.europa.eu/eurostat/web/microdata/adult-education-survey).

## Effects
* Likely subtags (based on the relative population)
  * French is now the largest language in 2 countries
    * Democratic Republic of Congo (replacing Swahili)
    * Djibouti (replacing Afar)
  * French is no longer the largest language in 3 countries
    * Central African Republic (replaced by Sango)
    * Senegal (replaced by Wolof)
    * Chad (replaced by Arabic)
* French is no longer considered official for Syria
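
The likely-subtag effects listed above can be cross-checked against the diff in this commit. The following is a minimal sketch, not part of the CLDR tooling: it parses the before and after `likelySubtag` entries copied from the diff and reports every mapping that changed. The helper names `load_mappings` and `changed_mappings` are hypothetical, and the `<likelySubtags>` wrapper is added here only so that each snippet parses as a standalone XML document.

```python
import xml.etree.ElementTree as ET

# Entries copied from this commit's diff of common/supplemental/likelySubtags.xml.
# The <likelySubtags> wrapper is an assumption added so the snippet parses on its own.
OLD = """<likelySubtags>
<likelySubtag from="und_CD" to="sw_Latn_CD"/>
<likelySubtag from="und_CF" to="fr_Latn_CF"/>
<likelySubtag from="und_DJ" to="aa_Latn_DJ"/>
<likelySubtag from="und_SN" to="fr_Latn_SN"/>
<likelySubtag from="und_TD" to="fr_Latn_TD"/>
<likelySubtag from="und_Latn_SY" to="fr_Latn_SY"/>
</likelySubtags>"""

NEW = """<likelySubtags>
<likelySubtag from="und_CD" to="fr_Latn_CD"/>
<likelySubtag from="und_CF" to="sg_Latn_CF"/>
<likelySubtag from="und_DJ" to="fr_Latn_DJ"/>
<likelySubtag from="und_SN" to="wo_Latn_SN"/>
<likelySubtag from="und_TD" to="ar_Arab_TD"/>
<likelySubtag from="und_Latn_SY" to="ku_Latn_SY"/>
<likelySubtag from="und_Latn_TD" to="fr_Latn_TD"/>
</likelySubtags>"""


def load_mappings(xml_text):
    """Return a dict mapping each 'from' attribute to its 'to' attribute."""
    root = ET.fromstring(xml_text)
    return {el.get("from"): el.get("to") for el in root.iter("likelySubtag")}


def changed_mappings(old, new):
    """Return {key: (before, after)} for keys whose target differs or is new/removed."""
    changes = {}
    for key in sorted(old.keys() | new.keys()):
        before, after = old.get(key), new.get(key)
        if before != after:
            changes[key] = (before, after)
    return changes


if __name__ == "__main__":
    diff = changed_mappings(load_mappings(OLD), load_mappings(NEW))
    for key, (before, after) in diff.items():
        print(f"{key}: {before} -> {after}")
```

On the entries above, this reports seven changed keys: the five `und_<region>` defaults named in the list, plus `und_Latn_SY` and the newly added `und_Latn_TD`.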

Many countries lost a lot of French users: Mali, Madagascar, Hungary, Niger, Comoros, Chad, and Central African Republic. Since the prior figures did not have citations and other sources tended to agree, I made the change.

Some countries gained a lot: D.R. Congo, Djibouti, Haiti, Lebanon, Switzerland, Morocco, Wallis & Futuna, French Polynesia, Rwanda.

For countries with large changes, I double-checked against other sources that the new figures made sense, especially since they change likely subtags. There could be literacy gaps favoring French -- if you want me to press on that data I can investigate.

## Steps

- [x] This PR completes 4 tickets.
  - [x] [CLDR-11885](https://unicode-org.atlassian.net/browse/CLDR-11885) 
  - [x] [CLDR-11886](https://unicode-org.atlassian.net/browse/CLDR-11886) 
  - [x] [CLDR-11887](https://unicode-org.atlassian.net/browse/CLDR-11887) 
  - [x] [CLDR-11888](https://unicode-org.atlassian.net/browse/CLDR-11888) 
- [x] mvn package -DskipTests=true
- [x] java -jar tools/cldr-code/target/cldr-code.jar ConvertLanguageData
- [x] java -jar tools/cldr-code/target/cldr-code.jar GenerateLikelySubtags
- [x] java -jar tools/cldr-code/target/cldr-code.jar GenerateTestData
- [x] mvn package

Thanks for reviewing this large change -- it was fun to read so many sources, read in French and German, and learn more about countries around the world.
conradarcturus authored Nov 5, 2024
1 parent 48afc24 commit 78ff1ae
Showing 4 changed files with 177 additions and 152 deletions.
13 changes: 7 additions & 6 deletions common/supplemental/likelySubtags.xml
@@ -880,8 +880,8 @@ not be patched by hand, as any changes made in that fashion may be lost.
<likelySubtag from="und_BV" to="no_Latn_BV"/> <!--?‧?‧Bouvet Island ➡ Norwegian‧Latin‧Bouvet Island-->
<likelySubtag from="und_BY" to="be_Cyrl_BY"/> <!--?‧?‧Belarus ➡ Belarusian‧Cyrillic‧Belarus-->
<likelySubtag from="und_CC" to="ms_Arab_CC"/> <!--?‧?‧Cocos (Keeling) Islands ➡ Malay‧Arabic‧Cocos (Keeling) Islands-->
-<likelySubtag from="und_CD" to="sw_Latn_CD"/> <!--?‧?‧Congo - Kinshasa ➡ Swahili‧Latin‧Congo - Kinshasa-->
-<likelySubtag from="und_CF" to="fr_Latn_CF"/> <!--?‧?‧Central African Republic ➡ French‧Latin‧Central African Republic-->
+<likelySubtag from="und_CD" to="fr_Latn_CD"/> <!--?‧?‧Congo - Kinshasa ➡ French‧Latin‧Congo - Kinshasa-->
+<likelySubtag from="und_CF" to="sg_Latn_CF"/> <!--?‧?‧Central African Republic ➡ Sango‧Latin‧Central African Republic-->
<likelySubtag from="und_CG" to="fr_Latn_CG"/> <!--?‧?‧Congo - Brazzaville ➡ French‧Latin‧Congo - Brazzaville-->
<likelySubtag from="und_CH" to="de_Latn_CH"/> <!--?‧?‧Switzerland ➡ German‧Latin‧Switzerland-->
<likelySubtag from="und_CI" to="fr_Latn_CI"/> <!--?‧?‧Côte d’Ivoire ➡ French‧Latin‧Côte d’Ivoire-->
@@ -896,7 +896,7 @@ not be patched by hand, as any changes made in that fashion may be lost.
<likelySubtag from="und_CY" to="el_Grek_CY"/> <!--?‧?‧Cyprus ➡ Greek‧Greek‧Cyprus-->
<likelySubtag from="und_CZ" to="cs_Latn_CZ"/> <!--?‧?‧Czechia ➡ Czech‧Latin‧Czechia-->
<likelySubtag from="und_DE" to="de_Latn_DE"/> <!--?‧?‧Germany ➡ German‧Latin‧Germany-->
-<likelySubtag from="und_DJ" to="aa_Latn_DJ"/> <!--?‧?‧Djibouti ➡ Afar‧Latin‧Djibouti-->
+<likelySubtag from="und_DJ" to="fr_Latn_DJ"/> <!--?‧?‧Djibouti ➡ French‧Latin‧Djibouti-->
<likelySubtag from="und_DK" to="da_Latn_DK"/> <!--?‧?‧Denmark ➡ Danish‧Latin‧Denmark-->
<likelySubtag from="und_DO" to="es_Latn_DO"/> <!--?‧?‧Dominican Republic ➡ Spanish‧Latin‧Dominican Republic-->
<likelySubtag from="und_DZ" to="ar_Arab_DZ"/> <!--?‧?‧Algeria ➡ Arabic‧Arabic‧Algeria-->
@@ -1009,14 +1009,14 @@ not be patched by hand, as any changes made in that fashion may be lost.
<likelySubtag from="und_SK" to="sk_Latn_SK"/> <!--?‧?‧Slovakia ➡ Slovak‧Latin‧Slovakia-->
<likelySubtag from="und_SL" to="kri_Latn_SL"/> <!--?‧?‧Sierra Leone ➡ Krio‧Latin‧Sierra Leone-->
<likelySubtag from="und_SM" to="it_Latn_SM"/> <!--?‧?‧San Marino ➡ Italian‧Latin‧San Marino-->
-<likelySubtag from="und_SN" to="fr_Latn_SN"/> <!--?‧?‧Senegal ➡ French‧Latin‧Senegal-->
+<likelySubtag from="und_SN" to="wo_Latn_SN"/> <!--?‧?‧Senegal ➡ Wolof‧Latin‧Senegal-->
<likelySubtag from="und_SO" to="so_Latn_SO"/> <!--?‧?‧Somalia ➡ Somali‧Latin‧Somalia-->
<likelySubtag from="und_SR" to="nl_Latn_SR"/> <!--?‧?‧Suriname ➡ Dutch‧Latin‧Suriname-->
<likelySubtag from="und_SS" to="ar_Arab_SS"/> <!--?‧?‧South Sudan ➡ Arabic‧Arabic‧South Sudan-->
<likelySubtag from="und_ST" to="pt_Latn_ST"/> <!--?‧?‧São Tomé & Príncipe ➡ Portuguese‧Latin‧São Tomé & Príncipe-->
<likelySubtag from="und_SV" to="es_Latn_SV"/> <!--?‧?‧El Salvador ➡ Spanish‧Latin‧El Salvador-->
<likelySubtag from="und_SY" to="ar_Arab_SY"/> <!--?‧?‧Syria ➡ Arabic‧Arabic‧Syria-->
-<likelySubtag from="und_TD" to="fr_Latn_TD"/> <!--?‧?‧Chad ➡ French‧Latin‧Chad-->
+<likelySubtag from="und_TD" to="ar_Arab_TD"/> <!--?‧?‧Chad ➡ Arabic‧Arabic‧Chad-->
<likelySubtag from="und_TF" to="fr_Latn_TF"/> <!--?‧?‧French Southern Territories ➡ French‧Latin‧French Southern Territories-->
<likelySubtag from="und_TG" to="fr_Latn_TG"/> <!--?‧?‧Togo ➡ French‧Latin‧Togo-->
<likelySubtag from="und_TH" to="th_Thai_TH"/> <!--?‧?‧Thailand ➡ Thai‧Thai‧Thailand-->
@@ -1211,7 +1211,8 @@ not be patched by hand, as any changes made in that fashion may be lost.
<likelySubtag from="und_Latn_RU" to="krl_Latn_RU"/> <!--?‧Latin‧Russia ➡ Karelian‧Latin‧Russia-->
<likelySubtag from="und_Latn_SD" to="en_Latn_SD"/> <!--?‧Latin‧Sudan ➡ English‧Latin‧Sudan-->
<likelySubtag from="und_Latn_SS" to="en_Latn_SS"/> <!--?‧Latin‧South Sudan ➡ English‧Latin‧South Sudan-->
-<likelySubtag from="und_Latn_SY" to="fr_Latn_SY"/> <!--?‧Latin‧Syria ➡ French‧Latin‧Syria-->
+<likelySubtag from="und_Latn_SY" to="ku_Latn_SY"/> <!--?‧Latin‧Syria ➡ Kurdish‧Latin‧Syria-->
+<likelySubtag from="und_Latn_TD" to="fr_Latn_TD"/> <!--?‧Latin‧Chad ➡ French‧Latin‧Chad-->
<likelySubtag from="und_Latn_TH" to="en_Latn_TH"/> <!--?‧Latin‧Thailand ➡ English‧Latin‧Thailand-->
<likelySubtag from="und_Latn_TN" to="fr_Latn_TN"/> <!--?‧Latin‧Tunisia ➡ French‧Latin‧Tunisia-->
<likelySubtag from="und_Latn_TW" to="trv_Latn_TW"/> <!--?‧Latin‧Taiwan ➡ Taroko‧Latin‧Taiwan-->
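
The last hunk adds a new `und_Latn_TD` entry, presumably because the default for Chad moved from French/Latin to Arabic/Arabic. The sketch below illustrates why such an entry matters. It is a deliberately simplified lookup for tags whose language is `und`, not the full likely-subtags algorithm from UTS #35, and `maximize_und` is a hypothetical helper. The table holds only post-change entries copied from the diff above.

```python
# Post-change entries copied from this commit's diff.
LIKELY = {
    "und_DJ": "fr_Latn_DJ",
    "und_SY": "ar_Arab_SY",
    "und_TD": "ar_Arab_TD",
    "und_Latn_SY": "ku_Latn_SY",
    "und_Latn_TD": "fr_Latn_TD",
}


def maximize_und(region, script=None, table=LIKELY):
    """Simplified, hypothetical lookup for an 'und' tag with a region and optional script.

    Tries the most specific key (und_Script_REGION) first, then falls back to
    und_REGION while keeping the script the caller supplied.
    """
    if script is not None:
        hit = table.get(f"und_{script}_{region}")
        if hit is not None:
            return hit
    hit = table.get(f"und_{region}")
    if hit is None:
        return None
    if script is None:
        return hit
    # Fallback: keep the matched language and region, but honor the explicit script.
    lang, _matched_script, matched_region = hit.split("_")
    return f"{lang}_{script}_{matched_region}"


if __name__ == "__main__":
    print(maximize_und("TD"))          # region-only default for Chad
    print(maximize_und("TD", "Latn"))  # Latin script in Chad
    # Same query with the new und_Latn_TD entry removed from the table.
    without_entry = {k: v for k, v in LIKELY.items() if k != "und_Latn_TD"}
    print(maximize_und("TD", "Latn", without_entry))
```

Under this simplified lookup, dropping `und_Latn_TD` makes a Latin-script tag for Chad fall back to the Arabic default and resolve to `ar_Latn_TD`, whereas the added entry maps it to `fr_Latn_TD`.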