Skip to content

Genericize OAuth profiles#13477

Draft
quinnjr wants to merge 6 commits intorust-lang:mainfrom
quinnjr:feature/genericize-profiles
Draft

Genericize OAuth profiles#13477
quinnjr wants to merge 6 commits intorust-lang:mainfrom
quinnjr:feature/genericize-profiles

Conversation

@quinnjr
Copy link
Copy Markdown

@quinnjr quinnjr commented Apr 22, 2026

No description provided.

@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented Apr 22, 2026

☔ The latest upstream changes made this pull request unmergeable. Please resolve the merge conflicts.

@quinnjr
Copy link
Copy Markdown
Author

quinnjr commented Apr 22, 2026

This is a draft PR for work toward resolving two RFCs:

This is primarily here for asynchronous review of what I'm proposing as the resolution of the RFCs. The work is still in progress.

@quinnjr
Copy link
Copy Markdown
Author

quinnjr commented Apr 22, 2026

@carols10cents

-- was created. Batched by id range, idempotent via ON CONFLICT DO NOTHING.

SET LOCAL lock_timeout = '10s';
SET LOCAL statement_timeout = '120s';
Copy link
Copy Markdown
Member

@carols10cents carols10cents Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate on why this commit is needed? After we merged #12792, we ran the SQL in https://github.com/rust-lang/crates.io/blob/150d165e4d5ecda24b60f146589db4def7fd34bf/migrations/data_oauth_github.sql outside of the migration framework that runs on deploy, so the backfill has already been handled.

View changes since the review

Comment thread src/oauth/preflight.rs Outdated
//! `users.gh_*` to `oauth_github`. That requires every user with
//! `gh_id > 0` to have a matching `oauth_github` row. The backfill migration
//! copies the data, and this module verifies the invariant on startup --
//! if any user is missing a row the app refuses to start.
Copy link
Copy Markdown
Member

@carols10cents carols10cents Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is not the kind of thing we'd want to fail to start the app for

View changes since the review

Comment thread src/bin/server.rs
let github = RealGitHubClient::new(client);
let github = Box::new(github);
let github: std::sync::Arc<dyn crates_io_github::GitHubClient> =
std::sync::Arc::new(RealGitHubClient::new(client.clone()));
Copy link
Copy Markdown
Member

@carols10cents carols10cents Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this commit should be pulled out into a separate PR from anything having to do with usernames-- we want to introduce the crates.io username first, separately from adding any other oauth provider.

View changes since the review

-- it back would require distinguishing backfilled rows from rows inserted
-- by login, which we cannot do. The `oauth_github` table itself is dropped
-- by login, which we can't do. The `oauth_github` table itself is dropped
-- by the earlier migration 2026-01-20-162913_oauth_github_table/down.sql.
Copy link
Copy Markdown
Member

@carols10cents carols10cents Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah any comment corrections should be made in a separate commit, possibly a separate PR

View changes since the review

//!
//! Then run just these tests:
//!
//! cargo test --test integration docker_integration
Copy link
Copy Markdown
Member

@carols10cents carols10cents Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't use docker; could you elaborate on why separate tests are needed?

View changes since the review

Comment thread Dockerfile.integration
# Image for running the crates.io server in integration tests.
# Debug build for fast compilation and full debug output.

ARG RUST_VERSION=1.94.1
Copy link
Copy Markdown
Member

@carols10cents carols10cents Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again I don't use docker, but this looks significantly different than the existing docker file at https://github.com/rust-lang/crates.io/blob/main/backend.Dockerfile, could you explain why this one is different?

View changes since the review

quinnjr added 6 commits May 1, 2026 12:01
Identity handling is hardwired to GitHub via app.github and
app.github_oauth. This adds a provider abstraction so adding a new
provider is just a new impl.

Two new types:

- OAuthProvider trait: async methods for the OAuth dance
  (authorize_url, exchange_code, fetch_user_info) and a stable
  name() used for routing oauth_<name> table lookups.

- UserInfo: provider-agnostic user profile returned by
  fetch_user_info. account_id is String because providers disagree
  on ID shape (GitHub: i64, Bitbucket: UUID, GitLab: numeric but
  can exceed i64 on some instances).

GitHubProvider wraps the existing BasicClient and GitHubClient so
no identity plumbing below it changes. Team-membership code still
calls GitHubClient directly.

App.github changes from Box to Arc so providers and App can share
the same client instance. build_github_oauth_client() is extracted
to deduplicate the OAuth endpoint URLs between App.github_oauth
and a future GitHubProvider construction site.

The provider isn't wired up to a registry or routed through yet --
that lands in the next commit.
Adds a ProviderRegistry attached to App.oauth_providers, mapping a
provider name to its OAuthProvider impl. The session controller (in
the next commit) resolves providers by name, with unknown names
returning 404.

GitHubProvider is registered at server startup as the only entry
for now. The registry's name() lookup matches OAuthProvider::name(),
which is also the table suffix (oauth_<name>) used by callers that
need to read or write provider-specific identity data.

Test setup constructs an empty registry; tests that exercise the
session controller will add provider mocks in subsequent commits.
session::begin and session::authorize now dispatch through
app.oauth_providers instead of hardcoded app.github_oauth and
app.github. The ?provider= query param selects a registered
provider (defaults to "github" so the frontend keeps working).
Provider name is carried in server-side session state, not as a
callback query param, so it can't be spoofed.

User lookups on the identity path now read from oauth_github
instead of users.gh_*. Writes stay dual because the gh_* columns
are NOT NULL; stopping writes is a follow-up after a schema change
makes them nullable.

Config rename: gh_token_encryption -> oauth_token_encryption.
The old env var (GITHUB_TOKEN_ENCRYPTION_KEY) is accepted as a
deprecated alias with a WARN log.

Team membership still goes through GitHubClient directly -- stays
GitHub-bound until Tier 4.

Adds regression test confirming sync_admins still matches users
via oauth_github.account_id after the identity read cutover.
Add Dockerfile.integration and a docker-compose `integration` service
that builds the server in debug mode and runs it against the
cargo_registry_test database on port 9888. RUST_LOG=debug is on by
default so container logs are useful for debugging.

src/tests/docker_integration.rs has tests that hit the live
container: session begin/authorize flows, provider selection (default
and explicit github, unknown provider 404), CSRF rejection, and
seeded-user identity checks via /api/v1/users/:login and /api/v1/me
with a forged session cookie. Tests skip automatically when the
container isn't up.

Also tightens a few doc comments and contractions caught during
review.
The genericize-profiles work needs a way to identify which OAuth
provider a user or credential row is associated with. A native
Postgres ENUM gives readable values in raw SQL ('github' vs. an
opaque integer) and lets the database enforce the closed set of
supported providers.

Starts with a single variant 'github'; future providers add
variants via ALTER TYPE ... ADD VALUE.
Each user needs a way to declare which OAuth provider they consider
their primary identity — the one whose username/avatar appear in
public profile views and whose credentials are used for API token
encryption. With only GitHub today every user defaults to 'github',
but the column is the prerequisite for multi-provider support.

Migration: NOT NULL DEFAULT 'github', which PG 11+ applies as a
metadata-only operation (no table rewrite). Rust side adds the
OAuthProviderId enum with FromSql/ToSql for the native PG enum,
FromStr for string-based lookups, and a round-trip integration test
that exercises both the DB default path and explicit ToSql.
@quinnjr quinnjr force-pushed the feature/genericize-profiles branch from 9c6eadc to 30855bb Compare May 1, 2026 16:03
@quinnjr
Copy link
Copy Markdown
Author

quinnjr commented May 1, 2026

Re: the Docker tests — these run against the real running server (HTTP layer, real session cookies, real DB round-trips for the OAuth flow), which the existing mock-based tests can't cover. They skip automatically when the container isn't up, so cargo test stays unchanged for anyone not using Docker.

Re: Dockerfile.integration vs backend.Dockerfile — the existing one is the production image (release build, slim runtime). This one is a debug build with RUST_LOG=debug so container logs are useful when a test fails. Happy to consolidate them if you'd prefer; I kept them separate to avoid slowing the prod image build.

@carols10cents
Copy link
Copy Markdown
Member

Re: Dockerfile.integration vs backend.Dockerfile — the existing one is the production image (release build, slim runtime). This one is a debug build with RUST_LOG=debug so container logs are useful when a test fails. Happy to consolidate them if you'd prefer; I kept them separate to avoid slowing the prod image build.

The differences I'm referring to specifically are things like the renovate comments to keep versions in the Dockerfile up to date. Why isn't the new Dockerfile here following that pattern established in this repo?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants