Add gssapi authentication method#513

Open
hb1915 wants to merge 1 commit into starburstdata:master from hb1915:add-gssapi-auth

hb1915 commented May 7, 2026

Add gssapi authentication method

Summary

This PR adds a new method: gssapi to dbt-trino's profile schema, backed by trino.auth.GSSAPIAuthentication (introduced upstream in trinodb/trino-python-client#454).

The existing method: kerberos is untouched; gssapi is offered as a parallel option, mirroring the two-class separation that trino-python-client deliberately adopted.

Why a new method instead of patching kerberos

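trino-python-client ships two distinct authentication classes:

  • KerberosAuthentication, the legacy class, built on the older requests-kerberos library.
  • GSSAPIAuthentication, the new class, built on the requests-gssapi and gssapi libraries.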

The two classes have different defaults (e.g. mutual_authentication defaults to REQUIRED in the legacy class and DISABLED in the new one) and different semantics around credential acquisition (legacy is keytab-centric; the new one defers to gssapi.Credentials, which uses the default credential cache when no principal is given). Squashing them into one dbt method would hide that distinction; mirroring the upstream split keeps both paths available and makes the underlying library choice explicit.

Practical benefit for users

The current kerberos method requires a keytab: connections.py unconditionally sets KRB5_CLIENT_KTNAME from the keytab field, so leaving keytab unset (None) raises a TypeError. That forces operators to provision a keytab even when they already have a TGT in their credential cache.
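
The failure mode described above can be reproduced without any Kerberos libraries, because os.environ only accepts string values. The sketch below shows the TypeError and one possible guard. Note that set_keytab_env and unconditional_assignment_fails are hypothetical helpers written for illustration; they are not code from dbt-trino.

```python
import os
from typing import Optional


def set_keytab_env(keytab: Optional[str]) -> bool:
    """Hypothetical guard: export KRB5_CLIENT_KTNAME only when a keytab is given.

    Returns True if the variable was set, False if it was skipped.
    """
    if keytab is None:
        return False
    os.environ["KRB5_CLIENT_KTNAME"] = keytab
    return True


def unconditional_assignment_fails() -> bool:
    """Show that os.environ rejects None, as an unconditional assignment would."""
    try:
        os.environ["KRB5_CLIENT_KTNAME"] = None  # type: ignore[assignment]
    except TypeError:
        return True
    return False
```

A guard of this shape would let a profile without a keytab fall through to the credential cache, although whether the legacy method should behave that way is a separate design question.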

The new gssapi method has no such requirement. With principal unset (the default), GSSAPIAuthentication calls gssapi.Credentials() with no name argument, which falls back to whatever's in KRB5CCNAME. So:

kinit user@REALM.EXAMPLE.COM
dbt run

just works — no keytab, no extra profile fields beyond method: gssapi, host, port, and user.

API choices

mutual_authentication is exposed as a case-insensitive string taking one of "REQUIRED", "OPTIONAL", "DISABLED", translated internally to trino-python-client's integer constants (MUTUAL_REQUIRED=1, MUTUAL_OPTIONAL=2, MUTUAL_DISABLED=3). Defaults to "DISABLED" to match the upstream class default. An invalid value raises DbtRuntimeError with a clear message at connection time.
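
A minimal sketch of the translation described above, for readers who want to see its shape. The integer values are the ones listed in this description; resolve_mutual_authentication is a hypothetical name, and the local DbtRuntimeError is a stand-in so the snippet runs without dbt installed. The actual implementation in this PR may differ.

```python
from typing import Optional

# Integer values as listed in the PR description.
MUTUAL_REQUIRED = 1
MUTUAL_OPTIONAL = 2
MUTUAL_DISABLED = 3

_MUTUAL_AUTH_BY_NAME = {
    "REQUIRED": MUTUAL_REQUIRED,
    "OPTIONAL": MUTUAL_OPTIONAL,
    "DISABLED": MUTUAL_DISABLED,
}


class DbtRuntimeError(Exception):
    """Stand-in for dbt's DbtRuntimeError so the sketch is self-contained."""


def resolve_mutual_authentication(value: Optional[str] = None) -> int:
    """Hypothetical helper: map a profile string to its integer constant."""
    if value is None:
        # An unset field falls back to DISABLED, matching the stated default.
        return MUTUAL_DISABLED
    key = str(value).strip().upper()  # case-insensitive lookup
    if key not in _MUTUAL_AUTH_BY_NAME:
        allowed = ", ".join(_MUTUAL_AUTH_BY_NAME)
        raise DbtRuntimeError(
            f"Invalid mutual_authentication {value!r}; expected one of: {allowed}"
        )
    return _MUTUAL_AUTH_BY_NAME[key]
```

Calling str() on the input means a YAML value parsed as a boolean (for example an unquoted true) is rejected with the same error rather than being silently coerced.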

Rationale: the existing kerberos method types this as Optional[bool] (line 190 of connections.py), which can only express REQUIRED/DISABLED and silently misbehaves when given False (which becomes int 0, not a value the underlying requests_kerberos.HTTPKerberosAuth recognises). The string-enum API is the dbt-idiomatic way to expose a small set of named choices and avoids that footgun.
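
The boolean footgun can be illustrated in a few lines. This sketch assumes, as stated above, that the boolean reaches the underlying library as its integer value; the set of valid integers comes from the constants listed in this description, and the function names are hypothetical.

```python
# REQUIRED=1, OPTIONAL=2, DISABLED=3, per the PR description.
VALID_MUTUAL_AUTH_VALUES = {1, 2, 3}


def bool_as_mutual_auth(flag: bool) -> int:
    """Hypothetical: how a bool-typed setting looks once treated as an integer."""
    return int(flag)


def is_recognised(value: int) -> bool:
    """Check whether an integer is one of the known mutual-auth constants."""
    return value in VALID_MUTUAL_AUTH_VALUES


# True happens to coincide with REQUIRED (1); False becomes 0, which is not a
# known constant, and no boolean maps to OPTIONAL (2) or DISABLED (3).
reachable = {bool_as_mutual_auth(flag) for flag in (True, False)}
```

Under that assumption, only one of the three named modes is reachable from a boolean, which is the gap the string-enum API closes.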

All other parameters mirror the existing kerberos method's surface: principal, krb5_config, service_name, force_preemptive, hostname_override, sanitize_mutual_error_response, delegate, cert.

Example profile

my_target:
  type: trino
  method: gssapi
  host: trino.example.com
  port: 443
  user: alice@EXAMPLE.COM
  database: hive
  schema: analytics
  # All optional below this point:
  principal: alice@EXAMPLE.COM      # if omitted, uses default credential cache
  krb5_config: /etc/krb5.conf
  service_name: trino               # paired with hostname_override per upstream contract
  hostname_override: trino-internal.example.com
  mutual_authentication: OPTIONAL   # REQUIRED | OPTIONAL | DISABLED (default: DISABLED)
  force_preemptive: true
  delegate: true
  cert: /etc/ssl/certs/ca-bundle.crt

Tests

Three unit tests added in tests/unit/test_adapter.py:

  1. test_gssapi_authentication — full profile happy-path, mirrors the existing kerberos test.
  2. test_gssapi_authentication_default_mutual_authentication — default value resolves to MUTUAL_DISABLED.
  3. test_gssapi_authentication_invalid_mutual_authentication — bad string raises DbtRuntimeError.

I haven't added a docker-compose-based integration test because the existing kerberos integration test fixtures are scoped to the legacy library; happy to add gssapi-side fixtures in a follow-up if you'd like.

Out of scope (deliberately)

Two pre-existing bugs in TrinoKerberosCredentials were noted during this work but are not changed here, to keep this PR focused:

  1. mutual_authentication: Optional[bool] = False — wrong type; can't express OPTIONAL. Same string-enum fix would apply.
  2. os.environ["KRB5_CLIENT_KTNAME"] = self.keytab crashes when keytab is None despite the field being typed Optional[str].

Happy to follow up with a separate PR for these if you'd prefer them addressed; or if you'd rather have them bundled here, let me know and I'll amend.

Checklist

  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests
  • README.md updated and added information about my change (no Kerberos-specific section currently exists; the canonical docs live at docs.getdbt.com/reference/warehouse-setups/trino-setup — happy to file a companion PR to dbt-labs/docs.getdbt.com once this lands)
  • I have run changie new to create a changelog entry

@findinpath findinpath requested a review from damian3031 May 7, 2026 06:45
findinpath (Collaborator) commented

Please address the conflicts on setup.py

This adds a new `method: gssapi` to dbt-trino's profile schema, backed by
trino-python-client's `GSSAPIAuthentication` class (added upstream in
trinodb/trino-python-client#454). The existing `method: kerberos` continues
to work unchanged; `gssapi` is offered as a parallel option.

Why a new method instead of patching kerberos:
- trino-python-client deliberately ships two separate classes,
  `KerberosAuthentication` (uses the older `requests-kerberos`) and
  `GSSAPIAuthentication` (uses the modern `requests-gssapi` + `gssapi`
  libraries). Mirroring that separation in dbt-trino keeps both paths
  available and makes the underlying library choice explicit.
- The two classes have different defaults (e.g. `mutual_authentication`
  defaults to REQUIRED in the legacy class and DISABLED in the new one) and
  slightly different semantics around credential cache vs keytab.

Practical benefit for users:
- The legacy kerberos method requires a keytab (it always sets
  KRB5_CLIENT_KTNAME and crashes if the field is None). The gssapi method
  uses gssapi.Credentials, which falls back to the default credential cache
  (KRB5CCNAME) when no principal is given. So 'kinit' followed by 'dbt run'
  works natively without configuring a keytab.

API choices:
- mutual_authentication is exposed as a case-insensitive string
  ("REQUIRED" | "OPTIONAL" | "DISABLED"), translated to
  trino-python-client's integer constants in trino_auth(). Defaults to
  "DISABLED" to match the upstream class default. An invalid value raises
  DbtRuntimeError with a clear message at connection time.
- All other parameters mirror the existing kerberos method's surface
  (principal, krb5_config, service_name, force_preemptive,
  hostname_override, sanitize_mutual_error_response, delegate, cert).

Tests:
- Adds three unit tests in tests/unit/test_adapter.py:
  - test_gssapi_authentication: full profile happy-path mirroring the
    existing kerberos test.
  - test_gssapi_authentication_default_mutual_authentication: default
    resolves to MUTUAL_DISABLED.
  - test_gssapi_authentication_invalid_mutual_authentication: bad string
    raises DbtRuntimeError.

Other:
- Updates dbt/include/trino/sample_profiles.yml to list gssapi alongside
  the other supported methods in the comment hint.
- Adds a changie Features entry under .changes/unreleased/.

hb1915 force-pushed the add-gssapi-auth branch from 2f21b94 to e9456b3 May 7, 2026 06:51

hb1915 commented May 7, 2026

> Please address the conflicts on setup.py

Should be done
