Commit e62a5bc

Author: Metabase Docs bot
[auto] adding content to git-hook-token-scanner->master
1 parent ed5ce45 commit e62a5bc

26 files changed: +328 -762 lines changed

_docs/master/README.md

Lines changed: 0 additions & 2 deletions
@@ -205,7 +205,6 @@ Metabase's reference documentation.
 - [Interactive embedding quick start](./embedding/interactive-embedding-quick-start-guide)
 - [Static embedding](./embedding/static-embedding)
 - [Parameters for static embeds](./embedding/static-embedding-parameters)
-- [Securing embedded Metabase](./embedding/securing-embeds)
 
 ### Configuration
 
@@ -301,5 +300,4 @@ Data jargon explained.
 ### [Metabase Experts](/partners/)
 
 If you’d like more technical resources to set up your data stack with Metabase, connect with a [Metabase Expert](/partners/).
-
 <!-- bump 2 -->

_docs/master/dashboards/subscriptions.md

Lines changed: 1 addition & 1 deletion
@@ -87,7 +87,7 @@ If you check this box, Metabase will drop any visualization settings applied to
 
 Here you can specify which questions Metabase should attach results for.
 
-The attached files will include up to 1 048 575 rows by default. If you're self-hosting Metabase, you can adjust this row limit by setting the environment variable [MB_ATTACHMENT_ROW_LIMIT](../configuring-metabase/environment-variables#mb_attachment_row_limit). To change this row limit on your Metabase Cloud instance, you can [contact us](/help-premium) and request a different row limit.
+The attached files will include up to 2000 rows by default. If you're self-hosting Metabase, you can adjust this row limit by setting the environment variable [MB_UNAGGREGATED_QUERY_ROW_LIMIT](../configuring-metabase/environment-variables#mb_unaggregated_query_row_limit). To change this row limit on your Metabase Cloud instance, you can [contact us](/help-premium) and request a different row limit.
 
 ## Slack subscription options
 

_docs/master/databases/ssh-tunnel.md

Lines changed: 0 additions & 2 deletions
@@ -67,8 +67,6 @@ If you have problems connecting, verify the SSH host port and password by connec
 
 NOTE: the SSH server needs to have "AllowTcpForwarding" configuration set to "yes" for the tunneling to work.
 
-NOTE: the best practice with SSH Tunnel Hosts is to deny all traffic except from approved clients. Metabase lists our Metabase [Cloud IPs to whitelist here](/docs/latest/cloud/ip-addresses-to-whitelist).
-
 ## Disadvantages of indirect connections
 
 While using an SSH tunnel makes it possible to use a data warehouse that is otherwise inaccessible, it's almost always preferable to use a direct connection when possible.
Lines changed: 180 additions & 0 deletions
@@ -0,0 +1,180 @@
---
version: master
has_magic_breadcrumbs: true
show_category_breadcrumb: true
show_title_breadcrumb: true
category: 'Developers Guide'
title: 'Security Token Scanner'
source_url: 'https://github.com/metabase/metabase/blob/master/docs/developers-guide/security-token-scanner.md'
layout: new-docs
---

# Security Token Scanner

The security token scanner is a tool that automatically detects potentially leaked API keys, secrets, and other sensitive tokens in the Metabase codebase. It runs as a Git pre-commit hook via `lint-staged` to prevent accidental token leaks from being committed.

## What it scans for

The scanner looks for patterns that match common token formats:

- **Airgap Tokens**: JWE tokens starting with `airgap_` (400+ characters)
- **Hash/Dev Tokens**: 64-character hex strings or `mb_dev_` prefixed tokens
- **OpenAI API Keys**: Keys starting with `sk-` (43-51 characters total)
- **JWT Tokens**: Standard JWT format with header.payload.signature
- **JWE Tokens**: Encrypted JWT tokens (400+ characters)
- **GitHub Tokens**: Personal access tokens starting with `gh[pousr]_`
- **Slack Bot Tokens**: Bot tokens starting with `xoxb-`
- **AWS Access Keys**: Access key IDs starting with `AKIA`

## Running the scanner

The scanner runs automatically via `lint-staged` on staged files during git commits. You can also run it directly from mage.

### Basic usage

```bash
# Scan specific files
./bin/mage token-scan file1.txt file2.txt

# Scan all files in the project
./bin/mage token-scan -a

# Run with verbose output
./bin/mage token-scan -v file1.txt file2.txt

# Scan without showing line details
./bin/mage token-scan --no-lines file1.txt file2.txt
```

### Example output

```
Scanning 143 files
Using thread pool size: 16
/Users/dev/metabase/src/metabase/api/auth.clj
Line# 42 [OpenAI API Key]: const apiKey = "sk-1234567890abcdef1234567890abcdef123456789012";

Scan completed in: 89ms
Files scanned: 143
Files with matches: 1
Total matches: 1
```

## Whitelisting legitimate tokens

Sometimes you need to include token-like strings in source code for testing or examples. The scanner uses a whitelist file to avoid flagging known safe tokens.

The whitelist is located at `mage/resources/token_scanner_whitelist.txt` and contains strings that should not be flagged as secrets:

```
# Common test/example tokens that appear in documentation
sk-1234567890abcdef1234567890abcdef123456789012
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c

# Hash values from tests and examples
430bb02a37bb2471176e54ca323d0940c4e0ee210c3ab04262cb6576fe4ded6d
sha256:9ff56186de4dd0b9bb2a37c977c3a4c9358647cde60a16f11f4c05bded1fe77a

# Slack bot tokens from examples
xoxb-781236542736-2364535789652-GkwFDQoHqzXDVsC6GzqYUypD
```

To whitelist a token, add the exact string to this file on its own line. Each entry is treated as a literal substring: if the line containing a flagged token also contains the entry, the match is ignored.

**Important**: The whitelist uses simple substring matching, not regex patterns. Add the exact token string that should be ignored.
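
The substring check described above can be sketched in shell. This is an illustration, not the scanner's code: `is_whitelisted` is a hypothetical helper, and skipping blank lines and `#` comment lines is an assumption based on the sample file.

```bash
# is_whitelisted LINE WHITELIST_FILE
# Succeeds if LINE contains any whitelist entry as a literal substring.
is_whitelisted() {
  while IFS= read -r entry || [ -n "$entry" ]; do
    # Assumption: blank lines and "#" comment lines are not entries.
    case $entry in
      '' | '#'*) continue ;;
    esac
    # Quoting "$entry" makes the pattern literal, so regex or glob
    # characters inside an entry have no special meaning.
    case $1 in
      *"$entry"*) return 0 ;;
    esac
  done < "$2"
  return 1
}

# Demo with a throwaway whitelist file
whitelist=$(mktemp)
printf '%s\n' '# example entry' 'sk-1234567890abcdef1234567890abcdef123456789012' > "$whitelist"
if is_whitelisted 'const apiKey = "sk-1234567890abcdef1234567890abcdef123456789012";' "$whitelist"; then
  echo "ignored: line contains a whitelisted string"
fi
rm -f "$whitelist"
```

Because the comparison is literal, an entry has to reproduce the token exactly. A shortened or pattern-like entry will not match.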

## Adding new token patterns

To add a new token pattern, edit `mage/mage/token_scan.clj` and add an entry to the `token-patterns` map:

```clojure
(def ^:private token-patterns
  {"Existing Pattern"    #"existing-regex"
   "Your New Token Type" #"13{2}7"})
```

### Pattern guidelines

- **Be specific**: Patterns should match the actual token format, not environment variable assignments
- **Include length constraints**: Use `{min,max}` quantifiers to avoid false positives
- **Add comments**: Explain the token format and expected length
- **Test thoroughly**: Run the scanner on the codebase to check for false positives
  - Run it on everything with: `./bin/mage token-scan -a`

Example of a good pattern:
```clojure
"Stripe API Key" #"sk_live_[A-Za-z0-9]{24}" ;; Stripe live keys: sk_live_ + 24 chars
```
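
Before adding a pattern to `token-patterns`, it can help to try it against a few sample strings. The sketch below does that with `grep -E`. The `matches` helper and the sample strings are made up for illustration, and POSIX extended regex syntax is not identical to Clojure's, so this only suits simple patterns like the one above.

```bash
# Candidate pattern, written in grep -E (ERE) syntax.
pattern='sk_live_[A-Za-z0-9]{24}'

# matches REGEX TEXT -> exit status 0 if TEXT contains a match
matches() {
  printf '%s\n' "$2" | grep -Eq -- "$1"
}

# A 24-character body should match...
matches "$pattern" 'sk_live_abcdefghijklmnopqrstuvwx' && echo "ok: full-length key matches"
# ...while a shorter body should not.
matches "$pattern" 'sk_live_tooshort' || echo "ok: short string is ignored"
```

A check like this complements, rather than replaces, a full run with `./bin/mage token-scan -a`.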

## Modifying file filtering

The scanner excludes certain files to avoid false positives from generated content. To modify the filtering, edit the `exclude-path-str?` function in `mage/mage/token_scan.clj`:

```clojure
(defn- exclude-path-str?
  "Check if a file should be excluded from scanning"
  [path-str]
  (or
   ;; Existing exclusions
   (str/includes? path-str "/.git/")
   (str/includes? path-str "/node_modules/")

   ;; Add new exclusions
   (str/includes? path-str "/my-generated-dir/")
   (str/ends-with? path-str ".generated.js")))
```

### Common exclusions

The scanner currently excludes:
- **Build directories**: `target/`, `node_modules/`, `.git/`
- **Generated files**: `*.bundle.js`, `*.min.js`, `*.map`
- **Binary files**: `*.jar`, `*.class`, `*.so`, `*.dll`
- **Media files**: `*.png`, `*.jpg`, `*.svg`
- **Test data**: `/stories-data/`, `/test-data/`, `/fixtures/`
- **Checksum files**: `SHA256.sum`, `*.sha256`, `*.md5`
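
For a quick check of whether a given path would be skipped under rules like these, the same idea can be sketched with shell globs. `would_exclude` is a hypothetical helper that mirrors only a few of the exclusions listed above. The authoritative logic is `exclude-path-str?` in `mage/mage/token_scan.clj`.

```bash
# would_exclude PATH -> exit status 0 if PATH matches one of the sample rules.
# Only a subset of the documented exclusions is mirrored here.
would_exclude() {
  case $1 in
    */.git/* | */node_modules/* | */target/*) return 0 ;;  # build directories
    *.bundle.js | *.min.js | *.map) return 0 ;;            # generated files
    */fixtures/* | */test-data/*) return 0 ;;              # test data
  esac
  return 1
}

would_exclude "/repo/node_modules/pkg/index.js" && echo "excluded"
would_exclude "/repo/src/metabase/core.clj" || echo "scanned"
```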

## Git Hook Integration

The scanner runs automatically as a Git pre-commit hook. If it finds tokens or unused ignore comments, the commit will be blocked with:

- **Token detected**: Review the file to ensure it's not a real secret

The scanner only scans files that are staged for commit, making it fast and focused on new changes.
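
To reproduce the hook's scope by hand, you could pass the staged files to the scanner yourself. This is a sketch that assumes you run it from the repository root. `scan_staged` is a hypothetical helper, not part of the hook, and the hook's own `lint-staged` configuration may select files differently.

```bash
# scan_staged: pass the currently staged files to the scanner.
# --diff-filter=ACMR keeps added, copied, modified, and renamed files,
# so deleted paths are not handed to the scanner.
scan_staged() {
  if [ -z "$(git diff --cached --name-only --diff-filter=ACMR)" ]; then
    echo "No staged files to scan"
    return 0
  fi
  # -z with xargs -0 keeps paths that contain spaces intact.
  git diff --cached --name-only --diff-filter=ACMR -z |
    xargs -0 ./bin/mage token-scan
}

# Usage, from the repository root:
#   scan_staged
```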

## Troubleshooting

### False positives

If the scanner flags legitimate code:

1. **Add to whitelist** if it's a test token or example (edit `token_scanner_whitelist.txt`)
2. **Refine the pattern** if it's too broad (edit `token-patterns`)
3. **Exclude the file type** if it's generated content (edit `exclude-path-str?`)

### Performance issues

The scanner uses parallel processing and should complete in under 5 seconds for most commits. If it's slow:

1. Check if too many files are being scanned (`-v` flag shows file list)
2. Consider excluding large generated directories
3. Check for patterns with broad wildcards (like `.*`), which can be slow

### Bypassing the hook

If you need to bypass the scanner for a specific commit (not recommended):

```bash
git commit --no-verify -m "commit message"
```

Use this sparingly and only when absolutely necessary.

### Getting help

For issues with the scanner:

1. Check the git hook output for detailed error messages
2. Run the scanner locally to debug: `./bin/mage token-scan -v file1.txt file2.txt`
3. Ask in the #security or #dev channels for help with patterns or exclusions
Binary file not shown.
Binary file not shown.

_docs/master/embedding/interactive-embedding.md

Lines changed: 2 additions & 2 deletions
@@ -82,7 +82,7 @@ To embed a specific Metabase dashboard, you'll want to use the dashboard's Entit
 src="http://metabase.yourcompany.com/dashboard/entity/[Entity ID]"
 ```
 
-To get a dashboard's Entity ID, visit the dashboard and click on the **info** button. In the **Overview** tab, copy the **Entity ID**. Then in your iframe's `src` attribute to:
+To get a dashboard's Entity ID, visit the dashboard and click on the **info** button. In the **Overview** tab, copy the **Entity ID**. Then in your iframe's `src` attribute to:
 
 ```
 src=http://metabase.yourcompany.com/dashboard/entity/Dc_7X8N7zf4iDK9Ps1M3b
@@ -152,7 +152,7 @@ Learn more about [SameSite cookies](https://developer.mozilla.org/en-US/docs/Web
 
 ## Securing interactive embeds
 
-Metabase uses HTTP cookies to authenticate people and keep them signed into your embedded Metabase, even when someone closes their browser session. If you enjoy diagrammed auth flows, check out [Interactive embedding with SSO](./securing-embeds).
+Metabase uses HTTP cookies to authenticate people and keep them signed into your embedded Metabase, even when someone closes their browser session. If you enjoy diagrammed auth flows, check out [Interactive embedding with SSO](/learn/metabase-basics/embedding/securing-embeds).
 
 To limit the amount of time that a person stays logged in, set [`MAX_SESSION_AGE`](../configuring-metabase/environment-variables#max_session_age) to a number in minutes. The default value is 20,160 (two weeks).
 

_docs/master/embedding/introduction.md

Lines changed: 0 additions & 1 deletion
@@ -71,4 +71,3 @@ If you'd like to share your data with the good people of the internet, admins ca
 - [Publishing data visualizations to the web](/learn/metabase-basics/embedding/charts-and-dashboards).
 - [Multi-tenant self-service analytics](/learn/metabase-basics/embedding/multi-tenant-self-service-analytics).
 - [Customizing Metabase's appearance](../configuring-metabase/appearance).
-- [Securing embedded Metabase](./securing-embeds)

_docs/master/embedding/sdk/api/IconName.html

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

_docs/master/embedding/sdk/api/snippets/IconName.md

Lines changed: 0 additions & 2 deletions
@@ -100,7 +100,6 @@ type IconName =
 | "folder_filled"
 | "gauge"
 | "gear"
-| "gear_settings_filled"
 | "gem"
 | "globe"
 | "grabber"
@@ -187,7 +186,6 @@ type IconName =
 | "refresh_downstream"
 | "rocket"
 | "ruler"
-| "schema"
 | "search"
 | "section"
 | "segment"
