Commit d6a3320

document integration credential usage
1 parent c5d47b1 commit d6a3320

1 file changed

+381
-0
lines changed

INTEGRATIONS_CREDENTIALS.md

Lines changed: 381 additions & 0 deletions
@@ -0,0 +1,381 @@
# Deepnote Integrations & Credentials System

## Overview

The integrations system enables Deepnote notebooks to connect to external data sources (PostgreSQL, BigQuery, etc.) by securely managing credentials and exposing them to SQL blocks. The system handles:

1. **Credential Storage**: Secure storage using VSCode's SecretStorage API
2. **Integration Detection**: Automatic discovery of integrations used in notebooks
3. **UI Management**: Webview-based configuration interface
4. **Kernel Integration**: Injection of credentials into Jupyter kernel environment
5. **Toolkit Exposure**: Making credentials available to `deepnote-toolkit` for SQL execution

## Architecture

### Core Components

#### 1. **Integration Storage** (`integrationStorage.ts`)

Manages persistent storage of integration configurations using VSCode's encrypted SecretStorage API.

**Key Features:**
- Uses VSCode's `SecretStorage` API for secure credential storage
- Storage is scoped to the user's machine (shared across all Deepnote projects)
- In-memory caching for performance
- Event-driven updates via `onDidChangeIntegrations` event
- Index-based storage for efficient retrieval

**Storage Format:**
- Each integration config is stored as JSON under key: `deepnote-integrations.{integrationId}`
- An index is maintained at key: `deepnote-integrations.index` containing all integration IDs

**Key Methods:**
- `getAll()`: Retrieve all stored integration configurations
- `get(integrationId)`: Get a specific integration by ID
- `save(config)`: Save or update an integration configuration
- `delete(integrationId)`: Remove an integration configuration
- `exists(integrationId)`: Check if an integration is configured
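
The key layout described above can be sketched with a few pure helpers. This is a minimal illustration, not the extension's actual code: the helper names (`storageKey`, `parseIndex`, `addToIndex`, `removeFromIndex`) are hypothetical, and the sketch assumes the index is serialized as a JSON array of integration IDs, which the document does not state.

```typescript
// Hypothetical helpers mirroring the documented key layout.
const KEY_PREFIX = 'deepnote-integrations';
const INDEX_KEY = `${KEY_PREFIX}.index`;

// Key under which a single integration config would be stored as JSON.
function storageKey(integrationId: string): string {
    return `${KEY_PREFIX}.${integrationId}`;
}

// Assumption: the index value is a JSON array of integration IDs.
// Missing or malformed values are treated as an empty index.
function parseIndex(raw: string | undefined): string[] {
    if (!raw) {
        return [];
    }
    try {
        const parsed: unknown = JSON.parse(raw);
        return Array.isArray(parsed) ? parsed.filter((x): x is string => typeof x === 'string') : [];
    } catch {
        return [];
    }
}

// Returns the new serialized index after saving `id` (no duplicates).
function addToIndex(raw: string | undefined, id: string): string {
    const ids = parseIndex(raw);
    if (ids.indexOf(id) === -1) {
        ids.push(id);
    }
    return JSON.stringify(ids);
}

// Returns the new serialized index after deleting `id`.
function removeFromIndex(raw: string | undefined, id: string): string {
    return JSON.stringify(parseIndex(raw).filter((existing) => existing !== id));
}
```

Under this layout, `getAll()` would read the index once and then fetch each `storageKey(id)`. One thing to watch for in such a scheme: an integration whose ID is literally `index` would collide with the index key.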

**Integration Config Types:**

```typescript
// PostgreSQL
interface PostgresIntegrationConfig {
    id: string;
    name: string;
    type: 'postgres';
    host: string;
    port: number;
    database: string;
    username: string;
    password: string;
    ssl?: boolean;
}

// BigQuery
interface BigQueryIntegrationConfig {
    id: string;
    name: string;
    type: 'bigquery';
    projectId: string;
    credentials: string; // JSON string of service account credentials
}
```

#### 2. **Integration Detector** (`integrationDetector.ts`)

Scans Deepnote projects to discover which integrations are used in SQL blocks.

**Detection Process:**
1. Retrieves the Deepnote project from `IDeepnoteNotebookManager`
2. Scans all notebooks in the project
3. Examines each code block for `metadata.sql_integration_id`
4. Checks if each integration is configured (has credentials)
5. Returns a map of integration IDs to their status

**Integration Status:**
- `Connected`: Integration has valid credentials stored
- `Disconnected`: Integration is used but not configured
- `Error`: Integration configuration is invalid

**Special Cases:**
- Excludes `deepnote-dataframe-sql` (internal DuckDB integration)
- Only processes code blocks with SQL integration metadata
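
The detection steps above can be sketched as a pure function. This is a simplified illustration under stated assumptions: the `Project`, `Notebook`, and `Block` shapes and the `'code'` block type are hypothetical stand-ins for whatever `IDeepnoteNotebookManager` returns, and the `Error` status is omitted because the document does not say how invalid configurations are detected.

```typescript
type IntegrationStatus = 'Connected' | 'Disconnected' | 'Error';

// Hypothetical, simplified project shape (not the real Deepnote types).
interface Block {
    type: string;
    metadata?: { sql_integration_id?: string };
}
interface Notebook {
    blocks: Block[];
}
interface Project {
    notebooks: Notebook[];
}

// The internal DuckDB integration is excluded from detection.
const INTERNAL_DATAFRAME_SQL_ID = 'deepnote-dataframe-sql';

function detectIntegrations(
    project: Project,
    configuredIds: ReadonlySet<string>
): Map<string, IntegrationStatus> {
    const result = new Map<string, IntegrationStatus>();
    for (const notebook of project.notebooks) {
        for (const block of notebook.blocks) {
            const id = block.metadata?.sql_integration_id;
            // Only code blocks that carry SQL integration metadata are considered.
            if (block.type !== 'code' || !id || id === INTERNAL_DATAFRAME_SQL_ID) {
                continue;
            }
            result.set(id, configuredIds.has(id) ? 'Connected' : 'Disconnected');
        }
    }
    return result;
}
```

Using a `Map` keyed by integration ID means an integration referenced by several SQL blocks appears only once in the result.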

#### 3. **Integration Manager** (`integrationManager.ts`)

Orchestrates the integration management UI and commands.

**Responsibilities:**
- Registers the `deepnote.manageIntegrations` command
- Updates VSCode context keys for UI visibility:
  - `deepnote.hasIntegrations`: True if any integrations are detected
  - `deepnote.hasUnconfiguredIntegrations`: True if any integrations lack credentials
- Handles notebook selection changes
- Opens the integration webview with detected integrations
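
As a rough sketch of how the two context keys could be derived from detection results: the function name `computeContextKeys` is hypothetical, and the sketch assumes that "lack credentials" corresponds to the `Disconnected` status.

```typescript
type IntegrationStatus = 'Connected' | 'Disconnected' | 'Error';

// Hypothetical helper: derives the two documented context key values
// from a map of integration IDs to their status.
function computeContextKeys(statuses: ReadonlyMap<string, IntegrationStatus>): Record<string, boolean> {
    let hasUnconfigured = false;
    statuses.forEach((status) => {
        // Assumption: an integration "lacks credentials" when it is Disconnected.
        if (status === 'Disconnected') {
            hasUnconfigured = true;
        }
    });
    return {
        'deepnote.hasIntegrations': statuses.size > 0,
        'deepnote.hasUnconfiguredIntegrations': hasUnconfigured
    };
}
```

In an extension, each resulting pair would typically be applied with VS Code's built-in `setContext` command, for example `vscode.commands.executeCommand('setContext', key, value)`.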

**Command Flow:**
1. User triggers command (from command palette or SQL cell status bar)
2. Manager detects integrations in the active notebook
3. Manager opens webview with integration list
4. Optionally pre-selects a specific integration for configuration

#### 4. **Integration Webview** (`integrationWebview.ts`)

Provides the webview-based UI for managing integration credentials.

**Features:**
- Persistent webview panel (survives defocus)
- Real-time integration status updates
- Configuration forms for each integration type
- Delete/reset functionality

**Message Protocol:**

Extension → Webview:
```typescript
// Update integration list
{ type: 'update', integrations: IntegrationWithStatus[] }

// Show configuration form
{ type: 'showForm', integrationId: string, config: IntegrationConfig | null }

// Status messages
{ type: 'success' | 'error', message: string }
```

Webview → Extension:
```typescript
// Save configuration
{ type: 'save', integrationId: string, config: IntegrationConfig }

// Delete configuration
{ type: 'delete', integrationId: string }

// Request configuration form
{ type: 'configure', integrationId: string }
```
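
One way to make this protocol type-safe is a pair of discriminated unions keyed on `type`. The sketch below is illustrative only: the union names, the placeholder `IntegrationConfig` and `IntegrationWithStatus` shapes, and the `handleWebviewMessage` function are assumptions, and an in-memory `Map` stands in for the real storage.

```typescript
// Placeholder shapes for illustration; the real types live in the extension.
interface IntegrationConfig {
    id: string;
    name: string;
    type: string;
}
interface IntegrationWithStatus {
    id: string;
    status: 'Connected' | 'Disconnected' | 'Error';
}

type ExtensionToWebviewMessage =
    | { type: 'update'; integrations: IntegrationWithStatus[] }
    | { type: 'showForm'; integrationId: string; config: IntegrationConfig | null }
    | { type: 'success' | 'error'; message: string };

type WebviewToExtensionMessage =
    | { type: 'save'; integrationId: string; config: IntegrationConfig }
    | { type: 'delete'; integrationId: string }
    | { type: 'configure'; integrationId: string };

// Hypothetical handler: applies a webview message to an in-memory store
// and returns the reply the extension would post back to the webview.
function handleWebviewMessage(
    store: Map<string, IntegrationConfig>,
    msg: WebviewToExtensionMessage
): ExtensionToWebviewMessage {
    switch (msg.type) {
        case 'save':
            store.set(msg.integrationId, msg.config);
            return { type: 'success', message: `Saved ${msg.integrationId}` };
        case 'delete':
            store.delete(msg.integrationId);
            return { type: 'success', message: `Deleted ${msg.integrationId}` };
        case 'configure':
            return {
                type: 'showForm',
                integrationId: msg.integrationId,
                config: store.get(msg.integrationId) ?? null
            };
    }
}
```

Because the `switch` covers every member of `WebviewToExtensionMessage`, the compiler flags any message type added later that the handler does not cover.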

### UI Components (React)

#### 5. **Integration Panel** (`IntegrationPanel.tsx`)

Main React component that manages the webview UI state.

**State Management:**
- `integrations`: List of detected integrations with status
- `selectedIntegrationId`: Currently selected integration for configuration
- `selectedConfig`: Existing configuration being edited
- `message`: Success/error messages
- `confirmDelete`: Confirmation state for deletion

**User Flows:**

**Configure Integration:**
1. User clicks "Configure" button
2. Panel shows configuration form overlay
3. User enters credentials
4. Panel sends save message to extension
5. Extension stores credentials and updates status
6. Panel shows success message and refreshes list

**Delete Integration:**
1. User clicks "Reset" button
2. Panel shows confirmation prompt (5 seconds)
3. User clicks again to confirm
4. Panel sends delete message to extension
5. Extension removes credentials
6. Panel updates status to "Disconnected"

#### 6. **Configuration Forms** (`PostgresForm.tsx`, `BigQueryForm.tsx`)

Type-specific forms for entering integration credentials.

**PostgreSQL Form Fields:**
- Name (display name)
- Host
- Port (default: 5432)
- Database
- Username
- Password
- SSL (checkbox)

**BigQuery Form Fields:**
- Name (display name)
- Project ID
- Service Account Credentials (JSON textarea)

**Validation:**
- All fields are required, except the SSL checkbox, which may be left unchecked
- BigQuery credentials must be valid JSON
- Port must be a valid number
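
The validation rules above could look roughly like the following. This is a sketch, not the forms' actual logic: the value shapes, function names, and error strings are hypothetical, and the 1 to 65535 port range is an added assumption (the document only says the port must be a valid number).

```typescript
// Hypothetical form value shapes; text inputs arrive as strings.
interface PostgresFormValues {
    name: string;
    host: string;
    port: string;
    database: string;
    username: string;
    password: string;
    ssl: boolean;
}
interface BigQueryFormValues {
    name: string;
    projectId: string;
    credentials: string;
}

// Produces one "is required" error per empty field.
function requireNonEmpty(fields: Array<[string, string]>): string[] {
    return fields.filter(([, value]) => value.trim() === '').map(([label]) => `${label} is required`);
}

function validatePostgresForm(values: PostgresFormValues): string[] {
    const errors = requireNonEmpty([
        ['Name', values.name],
        ['Host', values.host],
        ['Port', values.port],
        ['Database', values.database],
        ['Username', values.username],
        ['Password', values.password]
    ]);
    const port = Number(values.port);
    // Assumption: a valid port is an integer in the TCP range 1..65535.
    if (values.port.trim() !== '' && !(Number.isInteger(port) && port >= 1 && port <= 65535)) {
        errors.push('Port must be a valid number');
    }
    return errors;
}

function validateBigQueryForm(values: BigQueryFormValues): string[] {
    const errors = requireNonEmpty([
        ['Name', values.name],
        ['Project ID', values.projectId],
        ['Service Account Credentials', values.credentials]
    ]);
    if (values.credentials.trim() !== '') {
        try {
            JSON.parse(values.credentials);
        } catch {
            errors.push('Credentials must be valid JSON');
        }
    }
    return errors;
}
```

Returning a list of messages rather than a boolean lets the form show every problem at once instead of stopping at the first one.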

### Kernel Integration

#### 7. **SQL Integration Environment Variables Provider** (`sqlIntegrationEnvironmentVariablesProvider.ts`)

Provides environment variables containing integration credentials for the Jupyter kernel.

**Process:**
1. Scans the notebook for SQL cells with `sql_integration_id` metadata
2. Retrieves credentials for each detected integration
3. Converts credentials to the format expected by `deepnote-toolkit`
4. Returns environment variables to be injected into the kernel process

**Environment Variable Format:**

Variable name: `SQL_{INTEGRATION_ID}` (uppercased, special chars replaced with `_`)

Example: Integration ID `my-postgres-db` → Environment variable `SQL_MY_POSTGRES_DB`
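
The naming rule can be sketched in one function. The name `toEnvVarName` is hypothetical, and the sketch assumes "special chars" means anything other than `A-Z`, `0-9`, and `_` after uppercasing.

```typescript
// Hypothetical helper: converts an integration ID to its environment variable name.
// Assumption: every character outside A-Z, 0-9 and _ (after uppercasing) becomes "_".
function toEnvVarName(integrationId: string): string {
    return `SQL_${integrationId.toUpperCase().replace(/[^A-Z0-9_]/g, '_')}`;
}
```

For example, `toEnvVarName('my-postgres-db')` returns `SQL_MY_POSTGRES_DB`. Note that under this rule distinct IDs such as `my-db` and `my_db` map to the same variable name, so IDs need to stay unique after normalization.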

**Credential JSON Format:**

PostgreSQL:
```json
{
  "url": "postgresql://username:password@host:port/database",
  "params": { "sslmode": "require" },
  "param_style": "format"
}
```

BigQuery:
```json
{
  "url": "bigquery://?user_supplied_client=true",
  "params": {
    "project_id": "my-project",
    "credentials": { /* service account JSON */ }
  },
  "param_style": "format"
}
```
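
A conversion from stored configs to these JSON payloads might look like the sketch below. It is not the provider's actual code: the function name `toCredentialJson` is hypothetical, percent-encoding of the username, password, and database is an added assumption, and so is emitting `sslmode: "require"` only when `ssl` is true.

```typescript
interface PostgresIntegrationConfig {
    id: string;
    name: string;
    type: 'postgres';
    host: string;
    port: number;
    database: string;
    username: string;
    password: string;
    ssl?: boolean;
}
interface BigQueryIntegrationConfig {
    id: string;
    name: string;
    type: 'bigquery';
    projectId: string;
    credentials: string; // JSON string of service account credentials
}
type IntegrationConfig = PostgresIntegrationConfig | BigQueryIntegrationConfig;

// Hypothetical converter producing the documented credential JSON.
function toCredentialJson(config: IntegrationConfig): string {
    switch (config.type) {
        case 'postgres': {
            // Assumption: percent-encode so characters like '@' or '/' cannot break the URL.
            const user = encodeURIComponent(config.username);
            const pass = encodeURIComponent(config.password);
            const db = encodeURIComponent(config.database);
            const url = `postgresql://${user}:${pass}@${config.host}:${config.port}/${db}`;
            // Assumption: sslmode is only set when the SSL option is enabled.
            const params = config.ssl ? { sslmode: 'require' } : {};
            return JSON.stringify({ url, params, param_style: 'format' });
        }
        case 'bigquery':
            return JSON.stringify({
                url: 'bigquery://?user_supplied_client=true',
                params: {
                    project_id: config.projectId,
                    // The stored string is parsed so the payload embeds an object, not a string.
                    credentials: JSON.parse(config.credentials)
                },
                param_style: 'format'
            });
    }
}
```

Switching on the `type` discriminant keeps each branch fully typed, which matters when a new integration type is added to the union.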

**Integration Points:**
- Registered as an environment variable provider in the kernel environment service
- Called when starting a Jupyter kernel for a Deepnote notebook
- Environment variables are passed to the kernel process at startup

#### 8. **SQL Integration Startup Code Provider** (`sqlIntegrationStartupCodeProvider.ts`)

Injects Python code into the kernel at startup to set environment variables.

**Why This Is Needed:**
Jupyter doesn't automatically pass all environment variables from the server process to the kernel process. This provider ensures credentials are available in the kernel's `os.environ`.

**Generated Code:**
```python
try:
    import os
    # [SQL Integration] Setting N SQL integration env vars...
    os.environ['SQL_MY_POSTGRES_DB'] = '{"url":"postgresql://...","params":{},"param_style":"format"}'
    os.environ['SQL_MY_BIGQUERY'] = '{"url":"bigquery://...","params":{...},"param_style":"format"}'
    # [SQL Integration] Successfully set N SQL integration env vars
except Exception as e:
    import traceback
    print(f"[SQL Integration] ERROR: Failed to set SQL integration env vars: {e}")
    traceback.print_exc()
```
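
Generating this startup code safely requires escaping each value for a Python string literal, since credential JSON can contain backslashes and quotes. The following is a minimal sketch of that step; the function names `toPythonStringLiteral` and `generateStartupCode` are hypothetical, and the real provider may escape values differently.

```typescript
// Escapes a value for use inside a single-quoted Python string literal.
function toPythonStringLiteral(value: string): string {
    const escaped = value
        .replace(/\\/g, '\\\\') // backslashes first, so later escapes are not doubled
        .replace(/'/g, "\\'")
        .replace(/\n/g, '\\n')
        .replace(/\r/g, '\\r');
    return `'${escaped}'`;
}

// Hypothetical generator: builds Python source that sets each env var in os.environ.
function generateStartupCode(envVars: ReadonlyMap<string, string>): string {
    const lines: string[] = ['try:', '    import os'];
    envVars.forEach((value, name) => {
        lines.push(`    os.environ[${toPythonStringLiteral(name)}] = ${toPythonStringLiteral(value)}`);
    });
    lines.push(
        'except Exception as e:',
        '    import traceback',
        '    print(f"[SQL Integration] ERROR: Failed to set SQL integration env vars: {e}")',
        '    traceback.print_exc()'
    );
    return lines.join('\n');
}
```

Escaping backslashes before quotes is the important ordering detail: a JSON value containing `\"` must reach Python as `\\"` in source so that the kernel sees the original JSON text unchanged.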
259+
260+
**Execution:**
261+
- Registered with `IStartupCodeProviders` for `JupyterNotebookView`
262+
- Runs automatically when a Python kernel starts for a Deepnote notebook
263+
- Priority: `StartupCodePriority.Base` (runs early)
264+
- Only runs for Python kernels on Deepnote notebooks
265+
266+
### Toolkit Integration
267+
268+
#### 9. **How Credentials Are Exposed to deepnote-toolkit**
269+
270+
The `deepnote-toolkit` Python package reads credentials from environment variables to execute SQL blocks.
271+
272+
**Flow:**
273+
1. Extension detects SQL blocks in notebook
274+
2. Extension retrieves credentials from secure storage
275+
3. Extension converts credentials to JSON format
276+
4. Extension injects credentials as environment variables (two methods):
277+
- **Server Process**: Via `SqlIntegrationEnvironmentVariablesProvider` when starting Jupyter server
278+
- **Kernel Process**: Via `SqlIntegrationStartupCodeProvider` when starting Python kernel
279+
5. `deepnote-toolkit` reads environment variables when executing SQL blocks
280+
6. Toolkit creates database connections using the credentials
281+
7. Toolkit executes SQL queries and returns results
282+
283+
**Environment Variable Lookup:**
284+
When a SQL block with `sql_integration_id: "my-postgres-db"` is executed:
285+
1. Toolkit looks for environment variable `SQL_MY_POSTGRES_DB`
286+
2. Toolkit parses the JSON value
287+
3. Toolkit creates a SQLAlchemy connection using the `url` and `params`
288+
4. Toolkit executes the SQL query
289+
5. Toolkit returns results as a pandas DataFrame

## Data Flow

### Configuration Flow
```
User → IntegrationPanel (UI)
  → vscodeApi.postMessage({ type: 'save', config })
  → IntegrationWebviewProvider.onMessage()
  → IntegrationStorage.save(config)
  → EncryptedStorage.store() [VSCode SecretStorage API]
  → IntegrationStorage fires onDidChangeIntegrations event
  → SqlIntegrationEnvironmentVariablesProvider fires onDidChangeEnvironmentVariables event
```

### Execution Flow
```
User executes SQL cell
  → Kernel startup triggered
  → SqlIntegrationEnvironmentVariablesProvider.getEnvironmentVariables()
    → Scans notebook for SQL cells
    → Retrieves credentials from IntegrationStorage
    → Converts to JSON format
    → Returns environment variables
  → Environment variables passed to Jupyter server process
  → SqlIntegrationStartupCodeProvider.getCode()
    → Generates Python code to set os.environ
  → Startup code executed in kernel
  → deepnote-toolkit reads os.environ['SQL_*']
  → Toolkit executes SQL query
  → Results returned to notebook
```

## Security Considerations

1. **Encrypted Storage**: All credentials are stored using VSCode's SecretStorage API, which uses the OS keychain
2. **No Plaintext**: Credentials are never written to disk in plaintext
3. **Scoped Access**: Storage is scoped to the VSCode extension
4. **Environment Isolation**: Each notebook gets only the credentials it needs
5. **Limited Logging**: Credential values are never logged in full; only the first 100 characters are logged for debugging. That prefix can still contain secrets (for example, the username and password embedded in a connection URL), so debug logs should be treated as sensitive

## Adding New Integration Types

To add a new integration type (e.g., MySQL, Snowflake):

1. **Add type to `integrationTypes.ts`**:
   ```typescript
   export enum IntegrationType {
       Postgres = 'postgres',
       BigQuery = 'bigquery',
       MySQL = 'mysql' // New type
   }

   export interface MySQLIntegrationConfig extends BaseIntegrationConfig {
       type: IntegrationType.MySQL;
       host: string;
       port: number;
       database: string;
       username: string;
       password: string;
   }

   export type IntegrationConfig = PostgresIntegrationConfig | BigQueryIntegrationConfig | MySQLIntegrationConfig;
   ```

2. **Add conversion logic in `sqlIntegrationEnvironmentVariablesProvider.ts`**:
   ```typescript
   case IntegrationType.MySQL: {
       // Percent-encode credentials so characters such as '@' or '/' cannot break the URL
       const username = encodeURIComponent(config.username);
       const password = encodeURIComponent(config.password);
       const url = `mysql://${username}:${password}@${config.host}:${config.port}/${config.database}`;
       return JSON.stringify({ url, params: {}, param_style: 'format' });
   }
   ```

3. **Create UI form component** (`MySQLForm.tsx`)

4. **Update `ConfigurationForm.tsx`** to render the new form

5. **Update webview types** (`src/webviews/webview-side/integrations/types.ts`)

6. **Add localization strings** for the new integration type

## Testing

Unit tests are located in:
- `sqlIntegrationEnvironmentVariablesProvider.unit.test.ts`

Tests cover:
- Environment variable generation for each integration type
- Multiple integrations in a single notebook
- Missing credentials handling
- Integration ID to environment variable name conversion
- JSON format validation