@MervideHeer
Collaborator

Dear Lexibank contributors,
I have updated the UraLex 2.0 basic vocabulary dataset README and Documentation with citation information on published papers. It would be good if the Zenodo record of UraLex 2.0 were updated as well. Could these changes be merged?

I can also see that UraLex has a tag CLDF-validation failing. Can I fix it somehow?

On behalf of the BEDLAN team
Mervi de Heer

Updated the publication information of the articles using the UraLex 2.0 data in README.
Updated the uralex_documentation.md file with information on published papers for the citations:

De Heer, Mervi; Rogier Blokland; Michael Dunn & Outi Vesakoski. 2024. “Loanwords in basic vocabulary as an indicator of borrowing profiles”. Journal of Language Contact 16 (1). 54–103. https://doi.org/10.1163/19552629-bja10057. 

Syrjänen, Kaj, Luke Maurits, Unni-Päivä Leino, Terhi Honkola, Jadranka Rota & Outi Vesakoski. 2021. “Crouching TIGER, hidden structure: Exploring the nature of linguistic data using TIGER values”. Journal of Language Evolution 6(2). 99–118. https://doi.org/10.1093/jole/lzab004.

@xrotwang
Collaborator

xrotwang commented Feb 5, 2026

I'll try to look into this.

@xrotwang
Collaborator

xrotwang commented Feb 7, 2026

Is that the UraLex 1.0 plus https://github.com/bedlan/uralex-ns ?

@xrotwang
Collaborator

xrotwang commented Feb 7, 2026

ah, I see, there are changes here https://github.com/bedlan/uralex as well.

@MervideHeer
Collaborator Author

Is that the UraLex 1.0 plus https://github.com/bedlan/uralex-ns ?

Hi, the Northern Samoyedic expansion is intended to become a separate sub-release in the UraLex project because it has been compiled on somewhat different principles than the UraLex 1.0 and 2.0 versions. Therefore we are not planning to directly merge the NS expansion with 2.0 for a new release. This is to highlight the new interesting innovations of the NS part.

@MervideHeer
Collaborator Author

ah, I see, there are changes here https://github.com/bedlan/uralex as well.

I made them (hopefully at the right place). We have several branches of UraLex and I'm trying to update the citation information for UraLex 2.0 that also appears on Zenodo "When you use this dataset, please also cite the following papers, introducing it: ..." before finalizing version 3.0.

https://doi.org/10.5281/zenodo.4777568

xrotwang merged commit a4b7240 into lexibank:master Feb 10, 2026
1 check failed
@xrotwang
Collaborator

@MervideHeer one question: Are we just updating the citation information here, or is there actual new data supposed to be added? As far as I can see, the last changes to the data are from 2021. I suppose there is more data now, in particular regarding loanwords - correct?

@xrotwang
Collaborator

@MervideHeer
Collaborator Author

I've updated the info at https://zenodo.org/records/4777568 and at https://github.com/lexibank/uralex/releases/tag/v2.0

Thank you for the update and for taking the time to look at the issue! I see that the CLDF-validation tag has changed to green again. However, my goal with this pull request was to update the citations for both papers that use the 2.0 dataset. I can see that the first paper, "De Heer et al.", is updated, but the second is not. It should be:

Syrjänen, Kaj, Luke Maurits, Unni-Päivä Leino, Terhi Honkola, Jadranka Rota & Outi Vesakoski. 2021. “Crouching TIGER, hidden structure: Exploring the nature of linguistic data using TIGER values”. Journal of Language Evolution 6(2). 99–118. https://doi.org/10.1093/jole/lzab004.

@MervideHeer
Collaborator Author

@MervideHeer one question: Are we just updating the citation information here, or is there actual new data supposed to be added? As far as I can see, the last changes to the data are from 2021. I suppose there is more data now, in particular regarding loanwords - correct?

At this moment, just the citations are in need of an update. For the upcoming version 3.0, we have prepared big changes. Not only is the loanword information updated, but reflexes, cognate assessments, and new languages are coming too. So no new data is being added to 2.0 right now.

@xrotwang
Collaborator

Ah, I see. Second paper should be updated now, too.

@MervideHeer
Collaborator Author

Ah, I see. Second paper should be updated now, too.

I can see them both there now!


2 participants