Portuguese/Spanish street name normalization #493

Open
missinglink opened this issue Nov 19, 2021 · 1 comment

Comments

@missinglink
Member

Following on from #477 we could probably tackle some Portuguese/Spanish street prefix/suffix contractions.

As mentioned in pelias/parser#155 (comment), the pt/countrywide source of OpenAddresses contains contractions such as this one (R GODINHO DE FARIA):

grep -i 'R Godinho De Faria' pt_addresses.csv | head -n1
pt.ine.add.PTCONT.3542119,R GODINHO DE FARIA,926,4465-151,SÃO MAMEDE DE INFESTA,-8.611103975015157,41.19918649220984

@orangejulius
Member

Yeah, if we can do this reliably, one-character abbreviations would be a great candidate for expansion and normalization at import time.

I think r->Rua is fairly unambiguous in Portugal, and would be a great place to start.
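As a rough illustration of what an import-time r->Rua expansion could look like, here is a minimal Python sketch. It is not the Pelias implementation: the function name `expand_pt_street`, the `country_code` parameter, and the regex are all hypothetical, and it assumes the abbreviation only ever appears as a standalone leading token.

```python
import re

# Assumption: a standalone leading "R" (optionally written "R.") means "Rua".
# The lookahead requires a following word, so a bare "R" is left untouched.
_LEADING_R = re.compile(r"^\s*R\.?\s+(?=\S)", re.IGNORECASE)


def expand_pt_street(street: str, country_code: str) -> str:
    """Expand a leading 'R' to 'Rua', for Portuguese records only (hypothetical helper)."""
    if country_code.upper() != "PT":
        # Outside Portugal the abbreviation may mean something else, so do nothing.
        return street
    return _LEADING_R.sub("Rua ", street, count=1)


if __name__ == "__main__":
    print(expand_pt_street("R GODINHO DE FARIA", "PT"))
```

The sketch leaves the casing of the rest of the name alone, on the assumption that a later normalization step handles it, and it never touches names that already start with a full word such as "RUA".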

Hopefully there aren't too many tricky ones. Carrer (Catalan) and Calle (Spanish) are both abbreviated as c, but sorting them out might be tough unless each is strictly confined to certain regions.
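If the split really were strictly regional, the ambiguous c could be resolved with a region lookup, roughly as in the Python sketch below. Everything here is an assumption for illustration: the helper `expand_es_street`, the `region` argument, the contents of `CARRER_REGIONS`, and the sample street names are made up, and whether the regional split holds in the real data is exactly the open question.

```python
import re
from typing import Optional

# Assumption: records from these regions use "Carrer"; all others in Spain use "Calle".
# This set is illustrative only, not a verified or complete list.
CARRER_REGIONS = {"catalunya"}

# Standalone leading "C" (optionally "C.") followed by at least one more word.
_LEADING_C = re.compile(r"^\s*C\.?\s+(?=\S)", re.IGNORECASE)


def expand_es_street(street: str, region: Optional[str]) -> str:
    """Expand a leading 'C' to 'Calle' or 'Carrer' based on region (hypothetical helper)."""
    if region is None:
        # Without a region the abbreviation is ambiguous, so leave it unexpanded.
        return street
    word = "Carrer" if region.strip().lower() in CARRER_REGIONS else "Calle"
    return _LEADING_C.sub(word + " ", street, count=1)


if __name__ == "__main__":
    print(expand_es_street("C MAYOR", "Madrid"))
    print(expand_es_street("C MAJOR", "Catalunya"))
```

Leaving the name unexpanded when the region is unknown keeps the step conservative: a wrong expansion is harder to undo downstream than a missing one.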
