Improved API Tokens #32

mdtro · 2022-10-27T17:27:36Z

Improved API tokens to provide customers an easier means of detecting accidentally leaked secrets by integrating with Github's Secret Scanning Service and other static analysis tools.

Rendered RFC

text/0032-improved-api-tokens.md

misha-sentry · 2022-10-27T20:20:27Z

a few questions:

"Verify the legacy token is valid/invalid" - are you just checking if its expired or not?
For legacy tokens, you are proposing to store it in a hashed format, and will you also have it prepended with the token type identifier?
What backwards compatibility issues need to be considered for making changes to the frontend (ie. not displaying the token after the one time after it is created)?

mdtro · 2022-10-27T22:33:44Z

@misha-sentry

"Verify the legacy token is valid/invalid" - are you just checking if its expired or not?

This would be our standard verification. For example, does the token exist and is it not expired?

For legacy tokens, you are proposing to store it in a hashed format, and will you also have it prepended with the token type identifier?

Only the hashed format. The plaintext token value that would contain the prefix would not exist in the database.

What backwards compatibility issues need to be considered for making changes to the frontend (ie. not displaying the token after the one time after it is created)?

We deploy frontend and backend separately. Deploying both at the same time is high risk since rollouts happen gradually.
Instead, we'll have to figure out a path where they can each be deployed separately without breaking each other. I'll need to partner with the engineering teams to figure that one out.

ghost

Please feel free to ack and resolve any of my ideas that have more cost than benefit.
Similarly if they're founded on my own ignorance or misapprehension :)

text/0032-improved-api-tokens.md

ghost · 2022-10-28T18:17:02Z

text/0032-improved-api-tokens.md

+
+We need a predictable token format in order to integrate properly with secret scanning services. Our current format is a 64 character alphanumeric string. This is insufficient and would likely produce a high amount of false positives in tooling like [TruffleHog](https://github.com/trufflesecurity/trufflehog), [detect-secrets](https://github.com/Yelp/detect-secrets), [Github's Secret Scanning](https://docs.github.com/en/code-security/secret-scanning/about-secret-scanning), etc.
+
+The suggested pattern is `snty[a-zA-Z]_[a-zA-Z0-9]{64}`. The character _after_ `snty` will be used to identify the token type.


Ergonomics would benefit greatly from some internal hyphenation.
Since your bitfield is twice the size of uuid, a doubled uuid scheme would make sense, to me:

`snX-fe611dff-98e0-423a-aa9a-d1f629b0f869-05b934cf-61d8-4c1e-90ae-1d8ff290cca7ty'

The static suffix provides less (but nonzero) benefit in this case.

I wouldn't expect most would be typing this key in manually, but rather a copy/paste?

I don't really hold a strong opinion here on the hyphens. Arguably, if most are copying and pasting the value the impact won't be felt, so it might just be a win-win here?

text/0032-improved-api-tokens.md

ghost

I believe this RFC actually contains two ideas that could be implemented separately:

keys at rest should be hashed -- a backend-only change
a new format for displaying and receiving secrets -- with care this can also be backend-only: API responses will only use the new form, while requests can optionally use the old during a (relatively lengthy) transition period.

I would likely want to separate these projects so that they can proceed each at their own pace.

mitsuhiko · 2022-11-02T10:39:56Z

@markstory this is largely unrelated to this RFC but I wonder if there would be some benefit to encoding the silo into the token. That way we could route based on the information in the token from the central silo. Since we are now talking about re-issuing tokens anyways we might take that opportunity.

markstory · 2022-11-02T14:39:46Z

I wonder if there would be some benefit to encoding the silo into the token. That way we could route based on the information in the token from the central silo.

The ApiToken models we currently have don't include any organization links which makes encoding a region into them hard as the tokens would work for any organization the user has access to.

text/0032-improved-api-tokens.md

ghost

Approved: I'm not qualified to approve the RFC itself, but this PR is for a draft-status rfc.

oioki · 2023-09-25T13:45:57Z

text/0032-improved-api-tokens.md

+For example:
+
+- `sntyu_a1b2c3...` - identifies a **user** API token
+- `sntya_a1b2c3...` - identifies an **API Application** token


We already have sntrys_ prefix for org auth tokens. I think we can use the same prefix and derive the type of token from the payload (aka "facts").

The sentrys_ prefix is for structured tokens. We probably should have gone with o or something to indicate an org auth token.

I think it'll be a bit before we we can derive the token type from the facts within the payload. For example, there isn't a field in the current org auth token that indicates the token type.

I think it's important to keep these separate prefixes especially when it comes to building out our service that integrates with GitHub's secret scanning program. With them, we do not need to decode a token to determine it's type, avoiding the extra processing. We can look at the first characters, determine the token type, perform a quick hashing operation, and then check for matches.

oioki · 2023-09-25T13:57:39Z

text/0032-improved-api-tokens.md

+
+**In the first release:**
+
+1. The frontend is updated to no longer display the token value for existing tokens.


We will also need to deprecate /api/0/api-tokens/ API endpoint with some notice period.

Good point, or at least update it to no longer return the full token value when just listing them (post token creation). I'll update the RFC.

Take two of #59455. - Add an option to toggle on/off automatically populating the `token_last_characters` for the `ApiToken` model. This option will ensure tokens created from here on have the field populated. It will also allow me to thoroughly test the backfill migration needed for existing tokens that will be coming in a future PR by disabling the option in tests, creating a bunch of API tokens, and then running the backfill migration test. Tracking Issue: #58918 RFC: getsentry/rfcs#32

Backfills the `token_last_characters` column on existing `ApiToken` entries. As of #62972, new `ApiToken` entries are being created with this column already populated with the correct values. This backfill will only impact old tokens created prior. Tracking Issue: #58918 Related RFC: getsentry/rfcs#32

Take two of #59455. - Add an option to toggle on/off automatically populating the `token_last_characters` for the `ApiToken` model. This option will ensure tokens created from here on have the field populated. It will also allow me to thoroughly test the backfill migration needed for existing tokens that will be coming in a future PR by disabling the option in tests, creating a bunch of API tokens, and then running the backfill migration test. Tracking Issue: #58918 RFC: getsentry/rfcs#32

Backfills the `token_last_characters` column on existing `ApiToken` entries. As of #62972, new `ApiToken` entries are being created with this column already populated with the correct values. This backfill will only impact old tokens created prior. Tracking Issue: #58918 Related RFC: getsentry/rfcs#32

Adds backend support for naming API tokens. - see issue #9600 and getsentry/customer-feedback#22 - #58918 - [RFC #32](getsentry/rfcs#32)

In support of our improved API tokens initiative (getsentry/rfcs#32), this PR adds two null-able columns to the `ApiToken` model that will hold the hash values. Hashed values for token values will be implemented over several PRs to maintain backwards compatibility and to prevent broken state between versions.

In support of getsentry/rfcs#32. Add a nullable `token_type` column to the `ApiToken` model. This will be used to help us identify the different kinds of API tokens we have in the application via a prefix. With this, we'll be able to integrate with GitHub and others' secret scanning program to prevent token leaks. Legacy (e.g. tokens that already exist) will have a null value here, so we'll know they are not one of our new tokens with the prefix format once all tokens are stored solely as hashed values.

In support of getsentry/rfcs#32. API Tokens will be prefixed with seven extra characters (ex. `sntryu_`). Eventually these plaintext `token` columns will be dropped, but to maintain backwards compatibility and a smooth transition to fully hashed tokens we'll still want to store them here.

Supports getsentry/rfcs#32. - Newly created _user auth tokens_ will be prefixed with `sntryu_`. - Introduce a custom model manager for `ApiToken` to handle the unique creation logic where we need to hash the token values and store them. - Use the `token_type` (currently optional) parameter/field on `ApiToken` when creating user auth tokens to let the model do the heavy lifting on generating and hashing the values. This will keep the logic out of views and simplify calls to just `new_token = ApiToken.objects.create(user=user, token_type=AuthTokenType.USER)`. - I've [changed some behavior](https://github.com/getsentry/sentry/pull/68148/files#diff-e68bf726258b71dbfe6c6a3dcb9a959faba4e9585762067078a38bed5bad4812R36-R37) where we only return the `refreshToken` on applicable token types. I introduce a "read once" pattern in this PR for the token secrets to prevent leaking of them in logs, exceptions, etc. It works like this when creating a new `ApiToken`: 1. The model manager sets temporary fields `__plaintext_token` and `__plaintext_refresh_token` that store the respective plaintext values for temporary reading. 2. When reading the value through the `plaintext_token` property on `ApiToken` (notice the single prepended underscore) the string value is returned and `__plaintext_token` is immediately set to `None`. 3. If you attempt to read the `_plaintext_token` property again, an exception will be raised, `PlaintextSecretAlreadyRead`.

@markstory

Supports getsentry/rfcs#32 We've been hashing tokens as they are used to authenticate (#65941), but it's started to level out. This is a backfill migration to fill in all of the hashed values for the remaining tokens. Huge thank you to @markstory @wedamija and @GabeVillalobos for helping with the migration test! 🙏

…laintext tokens

markstory reviewed Oct 27, 2022

View reviewed changes

ghost reviewed Oct 28, 2022

View reviewed changes

ghost reviewed Jan 10, 2023

View reviewed changes

text/0032-improved-api-tokens.md Outdated Show resolved Hide resolved

ghost approved these changes Jan 11, 2023

View reviewed changes

mdtro mentioned this pull request Mar 10, 2023

Security implications of using authToken inside app.json expo/sentry-expo#321

Closed

oioki reviewed Sep 25, 2023

View reviewed changes

This was referenced Oct 26, 2023

feat: add name and last four characters to ApiToken model getsentry/sentry#58876

Closed

feat: add name and token_last_characters to ApiToken getsentry/sentry#58918

Closed

mdtro mentioned this pull request Nov 8, 2023

feat: apitoken last characters option getsentry/sentry#59455

Closed

mdtro mentioned this pull request Jan 10, 2024

feat: apitoken last characters option getsentry/sentry#62972

Merged

mdtro mentioned this pull request Jan 17, 2024

migration: backfill apitoken last characters getsentry/sentry#63342

Merged

mdtro force-pushed the mdtro/api-tokens branch from cbc9656 to 548da68 Compare January 30, 2024 16:44

mdtro marked this pull request as draft January 31, 2024 22:15

mdtro marked this pull request as ready for review January 31, 2024 22:15

mdtro mentioned this pull request Feb 2, 2024

feat: add support for name on api token endpoints getsentry/sentry#64493

Merged

mdtro added a commit to getsentry/sentry that referenced this pull request Feb 2, 2024

feat: add support for name on api token endpoints (#64493)

1f687b7

Adds backend support for naming API tokens. - see issue #9600 and getsentry/customer-feedback#22 - #58918 - [RFC #32](getsentry/rfcs#32)

mdtro mentioned this pull request Feb 16, 2024

migration: add columns for hashed values on ApiToken getsentry/sentry#65300

Merged

mdtro mentioned this pull request Feb 23, 2024

feat: apitoken types to support prefixes getsentry/sentry#65684

Merged

mdtro mentioned this pull request Feb 23, 2024

feat: increase max_length for api tokens getsentry/sentry#65743

Merged

mdtro mentioned this pull request Apr 3, 2024

feat: improved user auth tokens getsentry/sentry#68148

Merged

mdtro mentioned this pull request May 31, 2024

migration: backfill apitoken hashed values getsentry/sentry#71728

Merged

mdtro and others added 16 commits August 14, 2024 09:15

initial draft of improved api tokens

b834bdd

adding PR number

b03b2fa

add unresolved question on frontend + backend deployment order

1c89b16

improve wording to clarify option 1's rollout plan after PR feedback

cc8c77b

fixed typo

8f89e2b

include feedback from review and add suffix to the token

78e7f39

clearer instructions for option #1

d872bcd

fix ToC links

aac3449

more option 1 implementation plan details

15cf0bb

option1: note we will need to update API endpoints that try to list p…

172677c

…laintext tokens

add addtional details based on feedback

b56ce68

remove mention of suffix

61bc3c2

add PR reference for name frontend implementation

ebcf821

update status

90372a3

include additional PRs documenting the progress

e4789b3

pr for name and token last chars

eeacef6

mdtro force-pushed the mdtro/api-tokens branch from afbe4c0 to eeacef6 Compare August 14, 2024 14:15


		We need a predictable token format in order to integrate properly with secret scanning services. Our current format is a 64 character alphanumeric string. This is insufficient and would likely produce a high amount of false positives in tooling like [TruffleHog](https://github.com/trufflesecurity/trufflehog), [detect-secrets](https://github.com/Yelp/detect-secrets), [Github's Secret Scanning](https://docs.github.com/en/code-security/secret-scanning/about-secret-scanning), etc.

		The suggested pattern is `snty[a-zA-Z]_[a-zA-Z0-9]{64}`. The character _after_ `snty` will be used to identify the token type.


		In the first release:

		1. The frontend is updated to no longer display the token value for existing tokens.

Uh oh!

Improved API Tokens #32

Are you sure you want to change the base?

Improved API Tokens #32

Uh oh!

Conversation

mdtro commented Oct 27, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

misha-sentry commented Oct 27, 2022

Uh oh!

mdtro commented Oct 27, 2022

Uh oh!

ghost left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

ghost Oct 28, 2022

Choose a reason for hiding this comment

Uh oh!

mdtro Nov 16, 2022

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ghost left a comment • edited by ghost Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mitsuhiko commented Nov 2, 2022

Uh oh!

markstory commented Nov 2, 2022

Uh oh!

Uh oh!

ghost left a comment

Choose a reason for hiding this comment

Uh oh!

oioki Sep 25, 2023

Choose a reason for hiding this comment

Uh oh!

mdtro Sep 25, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

oioki Sep 25, 2023

Choose a reason for hiding this comment

Uh oh!

mdtro Jan 31, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

mdtro commented Oct 27, 2022 •

edited

Loading

ghost left a comment •

edited by ghost

Loading

mdtro Sep 25, 2023 •

edited

Loading