WIP sekrit project #10610

carols10cents · 2025-02-18T00:05:34Z

No description provided.

That has a `provider` column that will (for now) always be set to 0, which corresponds to `AccountProvider::Github`. The table's primary key is (provider, account_id), which corresponds to (0, gh_id). This constraint will mean a particular GitHub/GitLab/etc account, identified from the provider by an ID, may only be associated with one crates.io user record, but a crates.io user record could (eventually) have *both* a GitHub *and* a GitLab account associated with it (or two GitHub accounts, even!) This is the first step of many to eventually allow for crates.io accounts linked to other OAuth providers in addition/instead of GitHub. No code aside from one test is reading from the linked accounts table at this time. No backfill has been done yet. No handling of creating/associating multiple OAuth accounts with one crates.io account has been done yet.

Right now, this will always have the same value as gh_login on the users table and login on the linked accounts table. All of these values will get updated when we get a new gh_login value. Eventually, we're going to have to decouple the concept of crates.io "username" from the logins of the various account(s), which you may or may not want to update when you update your GitHub or other login, and which may or may not conflict with other users' crates.io usernames. But for now, set up for that future and defer the harder decisions until later by making this field always get the same value as gh_login. This adds a `username` field to the JSON views returned from the API but does not use that field anywhere yet. Question: should team owner JSON also have a field named `username` as done in this commit? it's a little weird for a team to have a username because it's not a user. but it's consistent. something more generic like `name` might also be confusing. something more specific like `crates_io_identifier` is more verbose but less confusable. shrug

blm768

I had a few thoughts on the finer details of the database migration, but the approach seems sound overall.

blm768 · 2025-11-14T13:01:09Z

crates/crates_io_database/src/models/user.rs

+pub struct LinkedAccount {
+    pub user_id: i32,
+    pub provider: AccountProvider,
+    pub account_id: i32, // corresponds to user.gh_id


This could very well be a problem to deal with later, but I could imagine having a provider that represents its IDs some other way, e.g. strings or UUIDs. We probably don't want to commit to such a design yet, but perhaps we'll eventually need some kind of "table-per-provider" approach.

That's a really good point. Hmm. I'm trying to think through whether I'd rather start with an oauth_github table and then add, say, oauth_gitlab, etc tables, or if i'd rather start with what's here and then need to migrate to a table-per-provider later....

Unfortunately, it'll probably be hard to know what we need until we need it. The good news is that we shouldn't be boxing ourselves into a hole that's impossible to escape regardless of which approach we choose.

blm768 · 2025-11-14T13:05:05Z

script/backfill-linked-accounts.sql

@@ -0,0 +1,7 @@
+INSERT INTO linked_accounts (user_id, provider, account_id, access_token, login, avatar)


We'll likely want a benchmark on this to see if it takes long enough that it needs to be broken into smaller batches, but my instinct says this approach is probably fine.

Yeah, I'm starting to think I want a more general sort of harness for running SQL in a loop of batches of N until it returns 0 results, because a few of these migrations are going to need to do that sort of thing...

blm768 · 2025-11-14T13:22:23Z

migrations/2025-02-19-013433_add_username_to_user/up.sql

@@ -0,0 +1,6 @@
+ALTER TABLE users
+-- The column needs to be nullable for this migration to be fast; can be changed to non-nullable


Technically, we probably could get away with making this non-null with a default of '', but Postgres can handle changes in nullability without a table rewrite, so I don't know that it would really matter in practice.

blm768 · 2025-11-14T13:26:31Z

migrations/2025-02-19-013433_add_username_to_user/up.sql

+-- after backfill of all records.
+ADD COLUMN username VARCHAR;
+
+CREATE INDEX lower_username ON users (lower(username));


This might need a benchmark, but I suspect we'd save some work by creating this after the column has been populated.

carols10cents · 2025-11-18T01:39:19Z

Thank you for the review @blm768 !

blm768 · 2025-11-20T06:44:00Z

Glad to help! Most of my experience with large database migrations is with MySQL, not PostgreSQL, so I very well could have missed some things, but as far as I can tell, most of what I learned with MySQL seems to port over well enough.

carols10cents force-pushed the more-logins branch 2 times, most recently from 7d91bcc to e072a0c Compare February 18, 2025 02:32

carols10cents force-pushed the more-logins branch 6 times, most recently from a31b6fa to 6c9d8ff Compare March 7, 2025 21:29

carols10cents added 5 commits March 25, 2025 20:34

(untested) SQL to backfill linked_accounts (deploy 2)

7ba4532

(untested) SQL to backfill username (deploy 4)

49124ba

Transition views to use username instead of login (deploy 5)

19125bb

carols10cents force-pushed the more-logins branch from 6c9d8ff to 1e9f6a3 Compare March 26, 2025 00:34

Send linked accounts in user JSON (deploy 6)

83fac29

carols10cents force-pushed the more-logins branch from 1e9f6a3 to 83fac29 Compare March 26, 2025 01:13

carols10cents mentioned this pull request Sep 4, 2025

Refactoring to disentangle account creation from only one identity provider #10611

Open

5 tasks

blm768 reviewed Nov 14, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

WIP sekrit project #10610

WIP sekrit project #10610

Uh oh!

carols10cents commented Feb 18, 2025

Uh oh!

blm768 left a comment

Uh oh!

blm768 Nov 14, 2025

Uh oh!

carols10cents Nov 18, 2025

Uh oh!

blm768 Nov 20, 2025

Uh oh!

blm768 Nov 14, 2025

Uh oh!

carols10cents Nov 18, 2025

Uh oh!

blm768 Nov 14, 2025

Uh oh!

blm768 Nov 14, 2025

Uh oh!

carols10cents commented Nov 18, 2025

Uh oh!

blm768 commented Nov 20, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		@@ -0,0 +1,7 @@
		INSERT INTO linked_accounts (user_id, provider, account_id, access_token, login, avatar)

		@@ -0,0 +1,6 @@
		ALTER TABLE users
		-- The column needs to be nullable for this migration to be fast; can be changed to non-nullable

WIP sekrit project #10610

Are you sure you want to change the base?

WIP sekrit project #10610

Uh oh!

Conversation

carols10cents commented Feb 18, 2025

Uh oh!

blm768 left a comment

Choose a reason for hiding this comment

Uh oh!

blm768 Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

carols10cents Nov 18, 2025

Choose a reason for hiding this comment

Uh oh!

blm768 Nov 20, 2025

Choose a reason for hiding this comment

Uh oh!

blm768 Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

carols10cents Nov 18, 2025

Choose a reason for hiding this comment

Uh oh!

blm768 Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

blm768 Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

carols10cents commented Nov 18, 2025

Uh oh!

blm768 commented Nov 20, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants