This repository has been archived by the owner on Jan 28, 2020. It is now read-only.

Store Twitter IDs #271

Closed
ianroberts opened this issue Mar 16, 2015 · 9 comments · Fixed by #864

Comments

@ianroberts

Twitter usernames are strange beasts. When you look up https://twitter.com/username in a web browser, Twitter accepts any case - https://twitter.com/david_cameron and https://twitter.com/David_Cameron go to the same page - but when you use Twitter's own REST APIs to look up users by username, they require an exact case-sensitive match.

Would it be possible to check Twitter usernames when they are submitted, and store them in the correct case to exactly match the Twitter profile? At the moment I am doing this at my end but it would be nice if this were not necessary. -- edit: my mistake, this turned out to be a bug in the code I'm using

Beyond this, it would be even nicer if you could fetch (and store) the numeric Twitter user ID corresponding to the username, as that is the stable identifier which will remain correct even if the candidate chooses to change their username at a later date (e.g. #196 - sitting MPs who remove "MP" from their username when parliament is dissolved).

@andylolz
Collaborator

> but when you are using Twitter's own REST APIs to perform lookups by username they require an exact case-sensitive match.

Hmm… I just tested:
https://api.twitter.com/1.1/users/show.json?screen_name=AnDyLoLz (me)
…and it worked (in spite of the muddled case).

{"id":68828618,"id_str":"68828618","name":"Andy Lulham","screen_name":"andylolz","location":"London, UK","profile_location":null,"description […]

> it would be even nicer if you could fetch (and store) the numeric Twitter user ID corresponding to the username

I do agree… However, short of that, it would be super helpful for a third party to do this, e.g. @davorg’s twittelection could store IDs and send handle updates back to YourNextMP. I’m currently running a script every few days that spots and fixes changed handles (as you point out, it’s usually MPs removing “MP”). Is this something your site would be able to do, @ianroberts?

@mhl
Contributor

mhl commented Mar 16, 2015

At the moment we're storing the Twitter username as a contact detail:

    "contact_details": [
      {
        "type": "twitter",
        "value": "david_cameron"
      }

... but it sounds as if also adding the Twitter numeric ID as an identifier might be useful without breaking any API clients that are relying on the contact detail, e.g. we could add:

    "identifiers": [
      {
        "identifier": "123456789",
        "scheme": "twitter"
      },

And then update the Twitter contact detail from that periodically. (It would be a good idea to add that Twitter numeric ID to the CSV file as well.)
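
A minimal sketch of that proposal, assuming a person record is a plain dict shaped like the fragments above. The helper name `add_twitter_identifier` is hypothetical and is not taken from the project's code; the point is only that the new identifier sits alongside the existing contact detail rather than replacing it.

```python
import copy


def add_twitter_identifier(person, user_id):
    """Return a copy of `person` with a Twitter identifier added or updated.

    The existing "contact_details" entry is left untouched, so API clients
    that read the screen name from there keep working.
    """
    updated = copy.deepcopy(person)
    identifiers = updated.setdefault("identifiers", [])
    for entry in identifiers:
        if entry.get("scheme") == "twitter":
            # Already has a Twitter identifier: overwrite its value.
            entry["identifier"] = str(user_id)
            break
    else:
        identifiers.append({"identifier": str(user_id), "scheme": "twitter"})
    return updated


person = {
    "contact_details": [{"type": "twitter", "value": "david_cameron"}],
}
# 123456789 is the placeholder ID used in the comment above.
updated = add_twitter_identifier(person, 123456789)
```

The ID is stored as a string, matching the `"identifier": "123456789"` example above.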

@ianroberts
Author

> Hmm… I just tested:
> https://api.twitter.com/1.1/users/show.json?screen_name=AnDyLoLz (me)
> …and it worked (in spite of the muddled case).

Interesting. The API I use is users/lookup (the bulk lookup API to fetch up to 100 users in one call) and that one definitely requires an exact match.

@andylolz
Collaborator

> The API I use is users/lookup (the bulk lookup API to fetch up to 100 users in one call) and that one definitely requires an exact match.

Ahh, apologies – gotcha.

> it sounds as if also adding the Twitter numeric ID as an identifier might be useful

Great! I caught 5 or so Twitter handle changes yesterday. Having a stored ID (rather than relying on Google cache) would make this much easier.

@dracos
Member

dracos commented Mar 17, 2015

Just to note that it doesn't look like users/lookup requires an exact match:

    curl --get 'https://api.twitter.com/1.1/users/lookup.json' --data 'screen_name=TWITTER' \
      --header 'Authorization: OAuth oauth_consumer_key="…", oauth_nonce="...", oauth_signature="...", oauth_signature_method="HMAC-SHA1", oauth_timestamp="...", oauth_token="...", oauth_version="1.0"' \
      --verbose
    [{"id":783214,"id_str":"783214","name":"Twitter","screen_name":"twitter",…]

@ianroberts
Author

My apologies, on further investigation the case-sensitivity turned out to be an artefact of the code I'm using to access the API rather than the API itself. But the point about storing the numeric user ID as a stable identifier in case of changes of handle may still be worth considering.
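
Client code can introduce this kind of apparent case-sensitivity when it matches the names it asked for against the names the API returned. A minimal sketch of one way to avoid that, assuming the response is a list of dicts with `screen_name` and `id_str` keys as in the output above; the function name and the `nobody_here` handle are made up for illustration.

```python
def match_users_by_screen_name(requested, returned_users):
    """Map each requested screen name to the returned user, ignoring case.

    Names with no matching user in the response map to None.
    """
    by_folded_name = {
        user["screen_name"].lower(): user for user in returned_users
    }
    return {name: by_folded_name.get(name.lower()) for name in requested}


# Trimmed-down versions of the two responses quoted earlier in the thread.
returned = [
    {"id_str": "783214", "screen_name": "twitter"},
    {"id_str": "68828618", "screen_name": "andylolz"},
]
matches = match_users_by_screen_name(
    ["TWITTER", "AnDyLoLz", "nobody_here"], returned
)
```

Comparing the requested name to the returned `screen_name` with plain `==` would treat `TWITTER` and `twitter` as different accounts, even though the API returned the right user.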

mhl added the "3 - Now" label and removed the "1 - Contender" label Mar 30, 2015
lizconlan self-assigned this Apr 7, 2015
@lizconlan

  • store Twitter ID as well as username
  • check the Twitter ID when the username is edited
  • add management command that can be run periodically to check the usernames against known IDs
  • add management command to hoover up the usernames without IDs(?)

https://dev.twitter.com/rest/reference/get/users/lookup is probably going to be the most useful for updating en masse: even if we're rate limited to 60 calls per 15-minute window, we can grab details for up to 100 accounts at a time.

Might be worth using https://dev.twitter.com/rest/reference/get/users/show for individual lookups: there's no clear benefit to using users/lookup for those, other than familiarity, and a separate endpoint might help avoid rate-limiting problems if an individual lookup runs in the same quarter-hour window as a scheduled check (or I might have forgotten how Twitter's rate limiting works).
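
A rough sketch of the batching arithmetic, using the two figures quoted above (100 accounts per call, 60 calls per 15-minute window). The helper names are hypothetical, and the limits are taken from this comment rather than checked against Twitter's current documentation.

```python
import math

MAX_USERS_PER_LOOKUP = 100  # accounts per users/lookup call, per the comment above
CALLS_PER_WINDOW = 60       # calls per 15-minute window, per the comment above


def chunked(items, size):
    """Yield successive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def windows_needed(total_accounts):
    """Number of rate-limit windows needed to look up every account once."""
    calls = math.ceil(total_accounts / MAX_USERS_PER_LOOKUP)
    return math.ceil(calls / CALLS_PER_WINDOW)


# 250 made-up IDs, just to show how the batches fall out.
user_ids = [str(n) for n in range(1, 251)]
batches = list(chunked(user_ids, MAX_USERS_PER_LOOKUP))
```

Under these assumptions a single window covers up to 6,000 accounts, so a periodic check would only need to spread across windows for a larger database.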

lizconlan removed their assignment Apr 22, 2015
@ianroberts
Author

It's the other way round really - the numeric ID is the stable identifier, the username can change, so the periodic check needs to start from the ID and update the username if it has changed (or at least flag it for human approval). Apologies if that is in fact what you meant and I misunderstood.

mhl added the "1 - Contender" label and removed the "3 - Now" label May 11, 2015
TomSteinberg changed the title from "Normalise case in Twitter usernames" to "Store Twitter IDs" Jul 6, 2015
mhl added a commit that referenced this issue Apr 24, 2016
Candidates frequently change their Twitter screen names (e.g. one common
reason is because once parliament is dissolved for an election, sitting
MPs who are standing for re-election aren't allowed to have 'MP' in
their screen name under Parliament's rules).  Their Twitter user ID will
remain the same, however, so we can cope with these name changes by
making sure that the Twitter user ID is stored as well as the screen
name.

This script will go through every Person in the database and:

 - If they have a user ID set already, make sure the screen name is
   correct based on that.

 - If they have a screen name but no user ID set, set the user ID from
   the screen name.

 - Output details of any people whose user ID or screen name can no
   longer be found. (This will be the case if someone has deleted that
   Twitter account, or (rarely) if their screen name has been added
   immediately after the queries to the Twitter API were made.)

Partial fix for #271
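
The three cases in this commit message can be sketched as a per-person decision function. This is an illustration only, not the script from the commit: the function name, the action labels, and the two lookup dicts (standing in for results already fetched from the Twitter API) are all assumptions.

```python
def plan_twitter_update(user_id, screen_name, id_to_name, name_to_id):
    """Decide what to do for one person, returning an (action, value) pair.

    id_to_name: {user ID: current screen name}, as fetched by ID.
    name_to_id: {lower-cased screen name: user ID}, as fetched by name.
    """
    if user_id:
        # The ID is the stable key, so the screen name follows from it.
        current_name = id_to_name.get(user_id)
        if current_name is None:
            return ("report_missing", user_id)
        if current_name != screen_name:
            return ("set_screen_name", current_name)
        return ("no_change", None)
    if screen_name:
        # No ID stored yet: discover it from the screen name.
        found_id = name_to_id.get(screen_name.lower())
        if found_id is None:
            return ("report_missing", screen_name)
        return ("set_user_id", found_id)
    return ("no_change", None)


# IDs and names below reuse the API responses quoted earlier in the thread;
# "old_handle", "999" and "gone" are made-up values.
id_to_name = {"783214": "twitter"}
name_to_id = {"andylolz": "68828618"}

renamed = plan_twitter_update("783214", "old_handle", id_to_name, name_to_id)
new_id = plan_twitter_update(None, "AnDyLoLz", id_to_name, name_to_id)
deleted = plan_twitter_update("999", "gone", id_to_name, name_to_id)
```

Returning an action instead of applying it directly leaves room for a human-approval step, as suggested earlier in the thread.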
mhl added a commit that referenced this issue Apr 24, 2016
So long as the site managers have set a valid application-only bearer
token as TWITTER_APP_ONLY_BEARER_TOKEN in conf/general.yml, this commit
means that:

 (a) The validation of Twitter usernames will check that the account
     actually exists

 (b) When creating or editing a candidate, their user ID will be
     discovered from the Twitter API and set as an Identifier (screen
     names can change, but the user ID should be stable)

If the TWITTER_APP_ONLY_BEARER_TOKEN is unset, it should behave as
before.

This is the second part of the fix for #271
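
A minimal sketch of the token-gated behaviour described in this commit message. The function, the exception class, and the injected `fetch_user_id` callable are hypothetical; only the `TWITTER_APP_ONLY_BEARER_TOKEN` setting name comes from the commit message, and whatever validation existed before is outside the sketch.

```python
class UnknownTwitterAccount(ValueError):
    """Raised when the API reports no account for a screen name."""


def resolve_twitter_user_id(screen_name, bearer_token, fetch_user_id):
    """Return the user ID for `screen_name`, or None if lookups are disabled.

    bearer_token: the value of TWITTER_APP_ONLY_BEARER_TOKEN, possibly unset.
    fetch_user_id: callable (screen_name, token) -> ID or None; an assumed
        wrapper around the real API call.
    """
    if not bearer_token:
        # Token unset: skip the API check so behaviour is unchanged.
        return None
    user_id = fetch_user_id(screen_name, bearer_token)
    if user_id is None:
        raise UnknownTwitterAccount(
            f"No Twitter account found for {screen_name!r}"
        )
    return str(user_id)


# Stand-in for a real API call, for illustration only.
KNOWN_ACCOUNTS = {"andylolz": "68828618", "twitter": "783214"}


def fake_fetch_user_id(screen_name, bearer_token):
    return KNOWN_ACCOUNTS.get(screen_name.lower())


without_token = resolve_twitter_user_id("andylolz", None, fake_fetch_user_id)
with_token = resolve_twitter_user_id("andylolz", "example-token", fake_fetch_user_id)
```

Passing the lookup in as a callable keeps the sketch free of network access and makes the "token unset" path easy to exercise.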
mhl self-assigned this Apr 24, 2016
mhl added the "3 - Now" label and removed the "1 - Contender" label Apr 24, 2016
mhl added a commit that referenced this issue Apr 26, 2016
mhl added a commit that referenced this issue Apr 26, 2016
mhl closed this as completed in #864 Apr 26, 2016
mhl removed the "3 - Now" label Apr 26, 2016
@jf1

jf1 commented May 1, 2016

Has this been run on all candidates or just current ones? I notice it hasn't flagged up this previous candidate's no-longer-valid Twitter handle (was chris11green, now ChrisGreenMP). I realise it's not of great value for this particular candidate, but it could be useful for others, especially if c.dc data might get used in sites like EveryPolitician. Once candidates have been elected, it could be especially useful to list all (successful) candidates with a no-longer-valid Twitter handle so users can amend them.

andylolz pushed a commit to andylolz/yournextrepresentative that referenced this issue May 10, 2017
andylolz pushed a commit to andylolz/yournextrepresentative that referenced this issue May 10, 2017
mhl pushed a commit that referenced this issue May 31, 2017
Cope with the version in a LoggedAction not being found for its person