Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change performer country value to be ISO code #1922

Merged
merged 16 commits into from
Oct 28, 2022

Conversation

ghost
Copy link

@ghost ghost commented Oct 28, 2021

Countries are currently stored as text fields, which are parsed on the frontend to show flags. Any typos, or unrecognized country name variations break the flags. It's also similarly hard to translate the names, and filter by countries.

By switching to ISO codes instead we can easily show the correct flag and localized name. I've also added a country selector so users no longer need to type it in.

The migration and scraper resolving is based on the existing iso detection logic on the frontend. It has a 100% match rate on the performers in my db that have a country name set.

Fixes #1567

@ghost ghost added the improvement Something needed tweaking. label Oct 28, 2021
@7dJx1qP
Copy link
Contributor

7dJx1qP commented Oct 28, 2021

Could the migration also include mapping nationality to ISO code? ie, German, French, Italian.

In my dataset I have a lot of performers with multiple comma separated nationalities stored in the country field. This data mainly came from scraping IAFD. I wouldn't mind simply having to choose one country, but I'm not sure if other users would feel the same way about the new constraint.

And if I'm understanding the migration query correctly, it looks like the migration clears any value for which it can't find a match for? If that's the case, I don't think it's a good idea to clear user data like that. I think it's nice to encourage the ISO code, but allow the flexibility of leaving it an open text field in the UI for existing data that can't be migrated. Or at least there should definitely be a warning included in the next update about country data being modified in case people are currently using it in a now non-standard way.

@7dJx1qP
Copy link
Contributor

7dJx1qP commented Oct 28, 2021

I tested the migration on my data and found a bug. If the performer's country value is "Indonesia" then it's setting the country value to the performer ID value, because "Indonesia" maps to "ID"

Copy link
Contributor

@SmallCoccinelle SmallCoccinelle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

pkg/database/migrations/29_iso_country_names.up.sql Outdated Show resolved Hide resolved
@SmallCoccinelle
Copy link
Contributor

I don't think this should be the PR, but there's an argument for defining something like

type country string

and then implement sql.Scanner / sql.Valuer for it in the future.

@ghost
Copy link
Author

ghost commented Oct 28, 2021

I tested the migration on my data and found a bug. If the performer's country value is "Indonesia" then it's setting the country value to the performer ID value, because "Indonesia" maps to "ID"

Duh, that's a dumb bug. I've fixed the quoting so it shouldn't happen again, thanks.

Could the migration also include mapping nationality to ISO code? ie, German, French, Italian.

Not entirely sure what you mean by that? A separate field for nationality distinct from country?

In my dataset I have a lot of performers with multiple comma separated nationalities stored in the country field. This data mainly came from scraping IAFD. I wouldn't mind simply having to choose one country, but I'm not sure if other users would feel the same way about the new constraint.

And if I'm understanding the migration query correctly, it looks like the migration clears any value for which it can't find a match for? If that's the case, I don't think it's a good idea to clear user data like that. I think it's nice to encourage the ISO code, but allow the flexibility of leaving it an open text field in the UI for existing data that can't be migrated. Or at least there should definitely be a warning included in the next update about country data being modified in case people are currently using it in a now non-standard way.

Yeah, it's a one way trip currently. One suggestion is to add a free text Location field, that we can shove non-parsable country data into. It could also be used for things like state, birthplace, etc.

@7dJx1qP
Copy link
Contributor

7dJx1qP commented Oct 28, 2021

Not entirely sure what you mean by that? A separate field for nationality distinct from country?

Not a separate field, just make the migration more robust so it can handle mapping the adjectival forms of country names to its ISO code. Currently if a performer's country value is "German" it just gets wiped in the migration instead of converting to "DE"

This table might help https://en.wikipedia.org/wiki/List_of_adjectival_and_demonymic_forms_for_countries_and_nations Not sure if there's a better mapping between "adjectivals" and the ISO code

@SmallCoccinelle
Copy link
Contributor

https://mledoze.github.io/countries/ contains the demonyms as well, if you want to start this.

The normal out-of-value setting for 2-rune iso countries is ZZ, so that might also be a possiblity.

Generally, if it's a user-controlled field, we have to be cautious. People can have put all kinds of random stuff in there. It's probably better to introduce a non-user-controlled field and just use that instead. Then kill off the user controlled field once people have migrated.

@AdultSun
Copy link
Contributor

One suggestion is to add a free text Location field, that we can shove non-parsable country data into. It could also be used for things like state, birthplace, etc.

It's probably better to introduce a non-user-controlled field and just use that instead. Then kill off the user controlled field once people have migrated.

There was talk on Discord about a birthplace/hometown field before, which is already provided by most performer databases. Rather than eliminating the user-controlled field, I like the idea of transitioning it into something like that. The ISO code could be added alongside it for the purposes of generating the flags.

If I remember right, the old proposal was to have three separate fields in a row for essentially city state country. The first two would need to be free text boxes, but country could be an enforced dropdown. Having all three separate like that might allow for more fine-tuned filtering, but just having free text birthplace ISO code would provide more flexibility to the user.

@SmallCoccinelle
Copy link
Contributor

Another idea. Implement country as a GraphQL scalar: https://gqlgen.com/reference/scalars/

I have a hunch this will wrap up a lot of code into a single neat local package, where most of the Go code is going to rely on it, including the sql database.

@WithoutPants
Copy link
Collaborator

We definitely should avoid erasing user data.

My suggestion is to rename the existing country column to something else and repurpose it - or just rename it to countryLegacy or something. The new country column will be set according to the mappings already present in the migration. If the value is correctly mapped (exactly), then the legacy field can be wiped.

@ghost
Copy link

ghost commented Aug 19, 2022

I would suggest an auto-fill input area with a dropdown option.
By doing that, you avoid spelling mistakes while also providing a way to enter free text. Example :
https://ej2.syncfusion.com/react/documentation/combo-box/how-to/autofill/ <- click preview to test

@WithoutPants
Copy link
Collaborator

Rebased and changed the migration so that it tries to convert countries to ISO codes, and falls back to the original value otherwise. Changed the implementation in the UI so that the user can enter a custom country, in addition to selecting country codes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Something needed tweaking.
Development

Successfully merging this pull request may close these issues.

4 participants