Commit a018ea4

feat: Update RFC#91 with new token format (#105)
1 parent c7f009e commit a018ea4

1 file changed

+41
-37
lines changed

text/0091-ci-upload-tokens.md

@@ -49,53 +49,54 @@ data. For the purpose of this document they are called **structural tokens**.
4949

5050
## Token Format
5151

52-
The proposed token format is to leverage JWT as serialization format. The goals of the
53-
token align generally with both [Macaroons](http://macaroons.io/) and
54-
[Biscuit](https://www.biscuitsec.org) but unfortunately the former standard has never seen
55-
much attention, and the latter is pretty new, not particularly proven and very complex.
56-
Either system however permits adding additional restrictions to the token which make them
57-
a very interesting choice for the use in our pipeline.
58-
59-
One of the benefits of having the tokens carry additional data is that the token alone has enough
60-
information available to route to a Sentry installation. This means that `sentry-cli` or
61-
any other tool _just_ needs the token to even determine the host that the token should be
62-
sent against. This benefit also applies to JWT or PASETO tokens which can be considered
63-
for this as well. The RFC here thus proposes to encode this data into a regular **JWT**
64-
token.
52+
We use a custom token format based on base64 encoding.
53+
54+
```
55+
PREFIX_FACTS_SECRET
56+
```
57+
58+
Concretely, a token would look like this:
59+
60+
```
61+
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
62+
```
63+
64+
* `PREFIX`: `sntrys_` - this is static and helps to identify this is a Sentry token.
65+
* `FACTS`: A base64 encoded JSON string of the facts.
66+
* `SECRET`: A random secret part for the token. We may use `b64encode(secrets.token_bytes(32)).decode("ascii").rstrip("=")`, but this is an implementation detail.
6567

6668
A serialized token is given the custom prefix `sntrys_` (sentry structure) to make
6769
it possible to detect it by security scrapers. Anyone handling such a token is
6870
required to check for the `sntrys_` prefix and disregard it before parsing it. This
6971
can also be used by the client side to detect a structural token if the client is
7072
interested in extracting data from the token.
7173

74+
The purpose of the secret is that the resulting token is not guessable. It should be a randomly generated string that is different for each token.
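The prefix check described above can be sketched as follows. This is a minimal illustration, not part of the RFC: the regular expression and the helper names `looks_like_structural_token` and `find_tokens` are assumptions, and the pattern assumes both `FACTS` and `SECRET` use the standard base64 alphabet (which never contains `_`, so `_` is safe as a separator).

```python
import re

# Assumed pattern: static prefix, base64 FACTS, "_", base64 SECRET.
# Optional "=" padding is tolerated on both parts.
STRUCTURAL_TOKEN_RE = re.compile(
    r"sntrys_[A-Za-z0-9+/]+={0,2}_[A-Za-z0-9+/]+={0,2}"
)


def looks_like_structural_token(value: str) -> bool:
    """Return True if the whole string has the PREFIX_FACTS_SECRET shape."""
    return STRUCTURAL_TOKEN_RE.fullmatch(value) is not None


def find_tokens(text: str) -> list[str]:
    """Return every substring of `text` that looks like a structural token."""
    return STRUCTURAL_TOKEN_RE.findall(text)
```

A security scraper could run something like `find_tokens` over logs or committed files, while a client would use `looks_like_structural_token` to decide whether to attempt parsing at all.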
75+
7276
## Token Facts
7377

7478
We want to encode certain information into the tokens. The following attributes are defined:
7579

76-
* `iss`: The value `sentry.io` indicates that this is a Sentry Org Auth Token.
77-
* `nonce`: A randomly generated UUID to ensure the token content cannot be guessed.
78-
* `sentry_url`: references the root domain to be used. A token will always have a
80+
* `iat`: Timestamp when the token was issued.
81+
* `url`: references the root domain to be used. A token will always have a
7982
url in it and clients are not supposed to provide a fallback. This value can be found in `settings.SENTRY_OPTIONS["system.url-prefix"]`. Some APIs are only available on this URL, not on the region URL (see below). Example: `https://sentry.io/`.
80-
* `sentry_region_url`: The domain that the organization's API endpoints are available on. This value can be found in `organization.links.regionUrl`. e.x. `http://us.sentry.io`.
81-
* `sentry_org`: a token is uniquely bound to an org, so the slug of that org is also always
83+
* `region_url`: The domain that the organization's API endpoints are available on. This value can be found in `organization.links.regionUrl`. Example: `http://us.sentry.io`.
84+
* `org`: a token is uniquely bound to an org, so the slug of that org is also always
8285
contained. Note that the slug is used rather than an org ID as the clients typically
8386
need these slugs to create API requests.
8487

8588
These facts are encoded as a JSON object:
8689

8790
```json
8891
{
89-
"iss": "sentry.io",
9092
"iat": 1684154626,
91-
"nonce": "abcd-efgh-ijkl-mnop",
92-
"sentry_region_url": "https://eu.sentry.io/",
93-
"sentry_url": "https://sentry.io/",
94-
"sentry_org": "myorg"
93+
"region_url": "https://eu.sentry.io/",
94+
"url": "https://sentry.io/",
95+
"org": "myorg"
9596
}
9697
```
9798

98-
Encoded the token then is be `sntrys_{encoded_jwt}`.
99+
Encoded, the token is then `sntrys_{encoded_facts}_{secret}`.
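To make the encoding step concrete, here is a minimal sketch of how such a token could be assembled. The function name `generate_token` and its parameters are hypothetical, and keeping the base64 padding on the `FACTS` part is an assumption; only the secret generation follows the expression quoted in the Token Format section.

```python
import json
import secrets
import time
from base64 import b64encode

PREFIX = "sntrys_"


def generate_token(org_slug: str, url: str, region_url: str) -> str:
    """Build a PREFIX_FACTS_SECRET token (illustrative sketch only)."""
    facts = {
        "iat": time.time(),        # issue timestamp
        "url": url,                # root domain of the installation
        "region_url": region_url,  # domain of the org's API endpoints
        "org": org_slug,           # org slug, not the org ID
    }
    # json.dumps escapes non-ASCII characters by default, so ASCII is safe here.
    facts_encoded = b64encode(json.dumps(facts).encode("ascii")).decode("ascii")
    # Random, per-token secret; padding is stripped as in the RFC text.
    secret = b64encode(secrets.token_bytes(32)).decode("ascii").rstrip("=")
    return f"{PREFIX}{facts_encoded}_{secret}"
```

Because standard base64 output never contains `_`, the resulting token always contains exactly two underscores: one after the prefix and one before the secret.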
99100

100101
## Token Storage
101102

@@ -115,26 +116,23 @@ unaware of the structure behind structural tokens nothing changes.
115116
Clients are strongly encouraged to parse out the containing structure of the token and
116117
to use this information to route requests. For the keys the following rules apply:
117118

118-
* `sentry_url` & `sentry_region_url`: references the target API URL that should be used. A token
119+
* `url` & `region_url`: references the target API URL that should be used. A token
119120
will always have a site in it and clients are not supposed to provide an
120121
automatic fallback.
121122
* `org`: a token is uniquely bound to an org, so the slug of that org is also always
122123
contained. Note that the slug is used rather than an org ID as the clients typically
123124
need these slugs to create API requests.
124125

125-
An example of this with a JWT token:
126+
An example of parsing the token content with Python:
126127

127-
```python
128-
>>> import jwt
129-
>>> tok = "sntrys_eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJzZW50cnkuaW8iLCJpYXQiOjE2ODQxNTQ2MjYsInNlbnRyeV9zaXRlIjoiaHR0cHM6Ly9teW9yZy5zZW50cnkuaW8vIiwic2VudHJ5X29yZyI6Im15b3JnIiwic2VudHJ5X3Byb2plY3RzIjpbIm15cHJvamVjdCJdfQ.ROnK3f72jGbH2CLkmswMIxXP1qZHDish9lN6kfCR0DU"
130-
>>> jwt.decode(tok[7:], options={"verify_signature": False})
131-
{
132-
'iss': 'sentry.io',
133-
'iat': 1684154626,
134-
'sentry_url': 'https://sentry.io/',
135-
'sentry_region_url': 'https://eu.sentry.io/',
136-
'sentry_org': 'myorg'
137-
}
128+
```py
129+
def parse_token(token: str):
130+
if not token.startswith("sntrys_") or token.count('_') != 2:
131+
return None
132+
133+
payload_hashed = token[7:token.rindex('_')]
134+
payload_str = b64decode((payload_hashed).encode('ascii')).decode("ascii")
135+
return json.loads(payload_str)
138136
```
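The snippet above omits its imports and has lost its indentation in this view. A self-contained variant might look like the sketch below. Re-adding base64 padding and returning `None` on malformed input are assumptions added here for robustness, not behavior stated by the RFC.

```python
import json
from base64 import b64decode
from typing import Optional

PREFIX = "sntrys_"


def parse_token(token: str) -> Optional[dict]:
    """Return the facts encoded in a structural token, or None if it is not one."""
    if not token.startswith(PREFIX) or token.count("_") != 2:
        return None

    # FACTS sits between the prefix and the last underscore.
    facts_encoded = token[len(PREFIX):token.rindex("_")]
    # Assumption: tolerate tokens whose base64 padding was stripped.
    facts_encoded += "=" * (-len(facts_encoded) % 4)

    try:
        return json.loads(b64decode(facts_encoded).decode("ascii"))
    except ValueError:
        # binascii.Error, UnicodeDecodeError and json.JSONDecodeError
        # are all subclasses of ValueError.
        return None
```

Applied to the example token from the Token Format section, this returns a dictionary whose `org` is `sentry` and whose `region_url` is `http://localhost:8000`.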
139137

140138
## Token Issuance
@@ -271,6 +269,12 @@ globally unique IDs. However this today does not work for a handful of reasons:
271269
for frontend + backend deployment scenarios being able to use one token to manage releases
272270
across projects might be desirable.
273271

272+
## Why not JWT?
273+
274+
We initially set out to try to use JWT as a format. However, since we are not interested in signing the tokens (which is a fundamental concept of JWT), this led to problems. Skipping signing means we have to use `algorithm='none'`, which is not very well supported. When using this algorithm, the resulting tokens always end in `.`, as the final part would be based on the key, which is missing. Having a trailing `.` after each token is an unnecessary error source (users may not copy it, ...). We _could_ try to handle this when decoding, but this would still make this technically invalid JWT.
275+
276+
Since we do not need signing/verification of the token client side, we decided against using JWT as a format.
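The trailing `.` can be reproduced without any JWT library. The sketch below builds an unsecured JWT by hand (header with `alg` set to `none`, empty signature part); the helper names `b64url` and `unsecured_jwt` are made up for this illustration.

```python
import json
from base64 import urlsafe_b64encode


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return urlsafe_b64encode(data).decode("ascii").rstrip("=")


def unsecured_jwt(claims: dict) -> str:
    """Serialize claims as an unsecured JWT (alg 'none', empty signature)."""
    header = {"alg": "none", "typ": "JWT"}
    parts = [
        b64url(json.dumps(header, separators=(",", ":")).encode("utf-8")),
        b64url(json.dumps(claims, separators=(",", ":")).encode("utf-8")),
        "",  # the signature segment is empty, but its separator remains
    ]
    return ".".join(parts)
```

Any token produced this way ends in `.`, which is exactly the character a user is likely to drop when copying the token by hand.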
277+
274278
## Why not PASETO?
275279

276280
PASETO as an alternative to JWT can be an option. This should probably be decided based on what
