| 1 | +# Deepnote Integrations & Credentials System |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +The integrations system enables Deepnote notebooks to connect to external data sources (PostgreSQL, BigQuery, etc.) by securely managing credentials and exposing them to SQL blocks. The system handles: |
| 6 | + |
| 7 | +1. **Credential Storage**: Secure storage using VSCode's SecretStorage API |
| 8 | +2. **Integration Detection**: Automatic discovery of integrations used in notebooks |
| 9 | +3. **UI Management**: Webview-based configuration interface |
| 10 | +4. **Kernel Integration**: Injection of credentials into Jupyter kernel environment |
| 11 | +5. **Toolkit Exposure**: Making credentials available to `deepnote-toolkit` for SQL execution |
| 12 | + |
| 13 | +## Architecture |
| 14 | + |
| 15 | +### Core Components |
| 16 | + |
| 17 | +#### 1. **Integration Storage** (`integrationStorage.ts`) |
| 18 | + |
| 19 | +Manages persistent storage of integration configurations using VSCode's encrypted SecretStorage API. |
| 20 | + |
| 21 | +**Key Features:** |
| 22 | +- Uses VSCode's `SecretStorage` API for secure credential storage |
| 23 | +- Storage is scoped to the user's machine (shared across all Deepnote projects) |
| 24 | +- In-memory caching for performance |
| 25 | +- Event-driven updates via `onDidChangeIntegrations` event |
| 26 | +- Index-based storage for efficient retrieval |
| 27 | + |
| 28 | +**Storage Format:** |
| 29 | +- Each integration config is stored as JSON under key: `deepnote-integrations.{integrationId}` |
| 30 | +- An index is maintained at key: `deepnote-integrations.index` containing all integration IDs |
| 31 | + |
| 32 | +**Key Methods:** |
| 33 | +- `getAll()`: Retrieve all stored integration configurations |
| 34 | +- `get(integrationId)`: Get a specific integration by ID |
| 35 | +- `save(config)`: Save or update an integration configuration |
| 36 | +- `delete(integrationId)`: Remove an integration configuration |
| 37 | +- `exists(integrationId)`: Check if an integration is configured |
| 38 | + |
| 39 | +**Integration Config Types:** |
| 40 | + |
| 41 | +```typescript |
| 42 | +// PostgreSQL |
| 43 | +{ |
| 44 | + id: string; |
| 45 | + name: string; |
| 46 | + type: 'postgres'; |
| 47 | + host: string; |
| 48 | + port: number; |
| 49 | + database: string; |
| 50 | + username: string; |
| 51 | + password: string; |
| 52 | + ssl?: boolean; |
| 53 | +} |
| 54 | + |
| 55 | +// BigQuery |
| 56 | +{ |
| 57 | + id: string; |
| 58 | + name: string; |
| 59 | + type: 'bigquery'; |
| 60 | + projectId: string; |
| 61 | + credentials: string; // JSON string of service account credentials |
| 62 | +} |
| 63 | +``` |
| 64 | + |
| 65 | +#### 2. **Integration Detector** (`integrationDetector.ts`) |
| 66 | + |
| 67 | +Scans Deepnote projects to discover which integrations are used in SQL blocks. |
| 68 | + |
| 69 | +**Detection Process:** |
| 70 | +1. Retrieves the Deepnote project from `IDeepnoteNotebookManager` |
| 71 | +2. Scans all notebooks in the project |
| 72 | +3. Examines each code block for `metadata.sql_integration_id` |
| 73 | +4. Checks if each integration is configured (has credentials) |
| 74 | +5. Returns a map of integration IDs to their status |
| 75 | + |
| 76 | +**Integration Status:** |
| 77 | +- `Connected`: Integration has valid credentials stored |
| 78 | +- `Disconnected`: Integration is used but not configured |
| 79 | +- `Error`: Integration configuration is invalid |
| 80 | + |
| 81 | +**Special Cases:** |
| 82 | +- Excludes `deepnote-dataframe-sql` (internal DuckDB integration) |
| 83 | +- Only processes code blocks with SQL integration metadata |
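A rough sketch of the detection pass described above, under some assumptions: `BlockSketch` and `NotebookSketch` are hypothetical simplified shapes (the real project model comes from `IDeepnoteNotebookManager`), `isConfigured` stands in for the storage lookup, and the `Error` status is omitted because the source does not say how an invalid configuration is recognized.

```typescript
type DetectedStatus = 'Connected' | 'Disconnected';

// Hypothetical, simplified shapes for illustration only.
interface BlockSketch {
    type: string;
    metadata?: { sql_integration_id?: string };
}

interface NotebookSketch {
    blocks: BlockSketch[];
}

// Internal DuckDB integration that never needs user credentials.
const DATAFRAME_SQL_ID = 'deepnote-dataframe-sql';

function detectIntegrations(
    notebooks: NotebookSketch[],
    isConfigured: (integrationId: string) => boolean
): Map<string, DetectedStatus> {
    const result = new Map<string, DetectedStatus>();
    for (const notebook of notebooks) {
        for (const block of notebook.blocks) {
            // Only code blocks that carry SQL integration metadata are considered.
            if (block.type !== 'code' || !block.metadata) {
                continue;
            }
            const id = block.metadata.sql_integration_id;
            if (!id || id === DATAFRAME_SQL_ID || result.has(id)) {
                continue;
            }
            result.set(id, isConfigured(id) ? 'Connected' : 'Disconnected');
        }
    }
    return result;
}
```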
| 84 | + |
| 85 | +#### 3. **Integration Manager** (`integrationManager.ts`) |
| 86 | + |
| 87 | +Orchestrates the integration management UI and commands. |
| 88 | + |
| 89 | +**Responsibilities:** |
| 90 | +- Registers the `deepnote.manageIntegrations` command |
| 91 | +- Updates VSCode context keys for UI visibility: |
| 92 | + - `deepnote.hasIntegrations`: True if any integrations are detected |
| 93 | + - `deepnote.hasUnconfiguredIntegrations`: True if any integrations lack credentials |
| 94 | +- Handles notebook selection changes |
| 95 | +- Opens the integration webview with detected integrations |
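As an illustration of how the two context keys could be derived from detection results, here is a small pure function. The name `computeContextKeys` is hypothetical, and treating only `Disconnected` as "lacking credentials" is an assumption. In an extension, values like these are typically applied with VS Code's built-in `setContext` command.

```typescript
type Status = 'Connected' | 'Disconnected' | 'Error';

interface IntegrationContextKeys {
    'deepnote.hasIntegrations': boolean;
    'deepnote.hasUnconfiguredIntegrations': boolean;
}

// Derives both context key values from the statuses of the detected integrations.
// Assumption: only 'Disconnected' counts as unconfigured.
function computeContextKeys(statuses: Status[]): IntegrationContextKeys {
    return {
        'deepnote.hasIntegrations': statuses.length > 0,
        'deepnote.hasUnconfiguredIntegrations': statuses.some((status) => status === 'Disconnected')
    };
}
```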
| 96 | + |
| 97 | +**Command Flow:** |
| 98 | +1. User triggers command (from command palette or SQL cell status bar) |
| 99 | +2. Manager detects integrations in the active notebook |
| 100 | +3. Manager opens webview with integration list |
| 101 | +4. Optionally pre-selects a specific integration for configuration |
| 102 | + |
| 103 | +#### 4. **Integration Webview** (`integrationWebview.ts`) |
| 104 | + |
| 105 | +Provides the webview-based UI for managing integration credentials. |
| 106 | + |
| 107 | +**Features:** |
| 108 | +- Persistent webview panel (stays alive when it loses focus) |
| 109 | +- Real-time integration status updates |
| 110 | +- Configuration forms for each integration type |
| 111 | +- Delete/reset functionality |
| 112 | + |
| 113 | +**Message Protocol:** |
| 114 | + |
| 115 | +Extension → Webview: |
| 116 | +```typescript |
| 117 | +// Update integration list |
| 118 | +{ type: 'update', integrations: IntegrationWithStatus[] } |
| 119 | + |
| 120 | +// Show configuration form |
| 121 | +{ type: 'showForm', integrationId: string, config: IntegrationConfig | null } |
| 122 | + |
| 123 | +// Status messages |
| 124 | +{ type: 'success' | 'error', message: string } |
| 125 | +``` |
| 126 | + |
| 127 | +Webview → Extension: |
| 128 | +```typescript |
| 129 | +// Save configuration |
| 130 | +{ type: 'save', integrationId: string, config: IntegrationConfig } |
| 131 | + |
| 132 | +// Delete configuration |
| 133 | +{ type: 'delete', integrationId: string } |
| 134 | + |
| 135 | +// Request configuration form |
| 136 | +{ type: 'configure', integrationId: string } |
| 137 | +``` |
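The webview-to-extension messages above map naturally onto a discriminated union, which lets the compiler check that every message type is handled. The following is a sketch only: `ConfigSketch`, `handleWebviewMessage`, and the `Map`-based store are hypothetical, and the reply texts are invented.

```typescript
// Hypothetical trimmed-down config for illustration.
interface ConfigSketch {
    id: string;
    name: string;
    type: string;
}

type WebviewToExtensionMessage =
    | { type: 'save'; integrationId: string; config: ConfigSketch }
    | { type: 'delete'; integrationId: string }
    | { type: 'configure'; integrationId: string };

type ExtensionReply =
    | { type: 'showForm'; integrationId: string; config: ConfigSketch | null }
    | { type: 'success' | 'error'; message: string };

// Handles one webview message against an in-memory store and returns the reply
// that would be posted back to the webview.
function handleWebviewMessage(
    message: WebviewToExtensionMessage,
    store: Map<string, ConfigSketch>
): ExtensionReply {
    switch (message.type) {
        case 'save':
            store.set(message.integrationId, message.config);
            return { type: 'success', message: 'Saved ' + message.integrationId };
        case 'delete':
            return store.delete(message.integrationId)
                ? { type: 'success', message: 'Removed ' + message.integrationId }
                : { type: 'error', message: 'Nothing stored for ' + message.integrationId };
        case 'configure':
            return {
                type: 'showForm',
                integrationId: message.integrationId,
                config: store.get(message.integrationId) || null
            };
        default: {
            // Compile-time exhaustiveness check: fails to type-check if a case is missing.
            const unreachable: never = message;
            throw new Error('Unhandled message: ' + JSON.stringify(unreachable));
        }
    }
}
```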
| 138 | + |
| 139 | +### UI Components (React) |
| 140 | + |
| 141 | +#### 5. **Integration Panel** (`IntegrationPanel.tsx`) |
| 142 | + |
| 143 | +Main React component that manages the webview UI state. |
| 144 | + |
| 145 | +**State Management:** |
| 146 | +- `integrations`: List of detected integrations with status |
| 147 | +- `selectedIntegrationId`: Currently selected integration for configuration |
| 148 | +- `selectedConfig`: Existing configuration being edited |
| 149 | +- `message`: Success/error messages |
| 150 | +- `confirmDelete`: Confirmation state for deletion |
| 151 | + |
| 152 | +**User Flows:** |
| 153 | + |
| 154 | +**Configure Integration:** |
| 155 | +1. User clicks "Configure" button |
| 156 | +2. Panel shows configuration form overlay |
| 157 | +3. User enters credentials |
| 158 | +4. Panel sends save message to extension |
| 159 | +5. Extension stores credentials and updates status |
| 160 | +6. Panel shows success message and refreshes list |
| 161 | + |
| 162 | +**Delete Integration:** |
| 163 | +1. User clicks "Reset" button |
| 164 | +2. Panel shows a confirmation prompt for 5 seconds |
| 165 | +3. User clicks again to confirm |
| 166 | +4. Panel sends delete message to extension |
| 167 | +5. Extension removes credentials |
| 168 | +6. Panel updates status to "Disconnected" |
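The two-click delete flow can be modeled as a small state transition. This sketch assumes the confirmation simply expires after 5 seconds and that a click after expiry re-arms the prompt; `ConfirmDeleteState` and `onResetClick` are hypothetical names, not the component's actual `confirmDelete` implementation.

```typescript
const CONFIRM_WINDOW_MS = 5000;

interface ConfirmDeleteState {
    pendingId: string | null; // integration awaiting confirmation, if any
    requestedAt: number; // timestamp (ms) of the first click
}

const IDLE_STATE: ConfirmDeleteState = { pendingId: null, requestedAt: 0 };

// Returns the next state and whether a delete message should be sent.
function onResetClick(
    state: ConfirmDeleteState,
    integrationId: string,
    now: number
): { state: ConfirmDeleteState; sendDelete: boolean } {
    const confirming =
        state.pendingId === integrationId && now - state.requestedAt <= CONFIRM_WINDOW_MS;
    if (confirming) {
        return { state: IDLE_STATE, sendDelete: true };
    }
    // First click, a different integration, or an expired prompt: (re)arm the confirmation.
    return { state: { pendingId: integrationId, requestedAt: now }, sendDelete: false };
}
```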
| 169 | + |
| 170 | +#### 6. **Configuration Forms** (`PostgresForm.tsx`, `BigQueryForm.tsx`) |
| 171 | + |
| 172 | +Type-specific forms for entering integration credentials. |
| 173 | + |
| 174 | +**PostgreSQL Form Fields:** |
| 175 | +- Name (display name) |
| 176 | +- Host |
| 177 | +- Port (default: 5432) |
| 178 | +- Database |
| 179 | +- Username |
| 180 | +- Password |
| 181 | +- SSL (checkbox) |
| 182 | + |
| 183 | +**BigQuery Form Fields:** |
| 184 | +- Name (display name) |
| 185 | +- Project ID |
| 186 | +- Service Account Credentials (JSON textarea) |
| 187 | + |
| 188 | +**Validation:** |
| 189 | +- All fields are required (the SSL checkbox may be left unchecked) |
| 190 | +- BigQuery credentials must be valid JSON |
| 191 | +- Port must be a valid number |
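The validation rules above could look roughly like this. The form value shapes, function names, and error strings are hypothetical, and restricting the port to integers in the range 1 to 65535 is an assumption about what "valid number" means.

```typescript
interface PostgresFormValues {
    name: string;
    host: string;
    port: string; // raw text from the input field
    database: string;
    username: string;
    password: string;
    ssl: boolean;
}

interface BigQueryFormValues {
    name: string;
    projectId: string;
    credentials: string; // raw JSON text from the textarea
}

function isBlank(value: string): boolean {
    return value.trim().length === 0;
}

function requiredErrors(fields: Array<[string, string]>): string[] {
    const errors: string[] = [];
    for (const [label, value] of fields) {
        if (isBlank(value)) {
            errors.push(label + ' is required');
        }
    }
    return errors;
}

function validatePostgresForm(values: PostgresFormValues): string[] {
    const errors = requiredErrors([
        ['Name', values.name],
        ['Host', values.host],
        ['Port', values.port],
        ['Database', values.database],
        ['Username', values.username],
        ['Password', values.password]
    ]);
    const port = Number(values.port);
    // Assumption: a valid port is an integer between 1 and 65535.
    if (!isBlank(values.port) && !(Number.isInteger(port) && port >= 1 && port <= 65535)) {
        errors.push('Port must be a valid number');
    }
    return errors;
}

function validateBigQueryForm(values: BigQueryFormValues): string[] {
    const errors = requiredErrors([
        ['Name', values.name],
        ['Project ID', values.projectId],
        ['Service Account Credentials', values.credentials]
    ]);
    if (!isBlank(values.credentials)) {
        try {
            JSON.parse(values.credentials);
        } catch (error) {
            errors.push('Service Account Credentials must be valid JSON');
        }
    }
    return errors;
}
```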
| 192 | + |
| 193 | +### Kernel Integration |
| 194 | + |
| 195 | +#### 7. **SQL Integration Environment Variables Provider** (`sqlIntegrationEnvironmentVariablesProvider.ts`) |
| 196 | + |
| 197 | +Provides environment variables containing integration credentials for the Jupyter kernel. |
| 198 | + |
| 199 | +**Process:** |
| 200 | +1. Scans the notebook for SQL cells with `sql_integration_id` metadata |
| 201 | +2. Retrieves credentials for each detected integration |
| 202 | +3. Converts credentials to the format expected by `deepnote-toolkit` |
| 203 | +4. Returns environment variables to be injected into the kernel process |
| 204 | + |
| 205 | +**Environment Variable Format:** |
| 206 | + |
| 207 | +Variable name: `SQL_{INTEGRATION_ID}` (uppercased, with special characters replaced by `_`) |
| 208 | + |
| 209 | +Example: Integration ID `my-postgres-db` → Environment variable `SQL_MY_POSTGRES_DB` |
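A minimal sketch of the name conversion, assuming "special characters" means anything other than ASCII letters and digits (the source does not spell out the exact character class); `toEnvVarName` is a hypothetical name.

```typescript
// Converts an integration ID to its environment variable name.
// Assumption: every character outside A-Z and 0-9 (after uppercasing) becomes '_'.
function toEnvVarName(integrationId: string): string {
    return 'SQL_' + integrationId.toUpperCase().replace(/[^A-Z0-9]/g, '_');
}
```

Under this rule, distinct IDs such as `my-db` and `my_db` map to the same variable name, so IDs need to stay unique after normalization.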
| 210 | + |
| 211 | +**Credential JSON Format:** |
| 212 | + |
| 213 | +PostgreSQL: |
| 214 | +```json |
| 215 | +{ |
| 216 | + "url": "postgresql://username:password@host:port/database", |
| 217 | + "params": { "sslmode": "require" }, |
| 218 | + "param_style": "format" |
| 219 | +} |
| 220 | +``` |
| 221 | + |
| 222 | +BigQuery: |
| 223 | +```json |
| 224 | +{ |
| 225 | + "url": "bigquery://?user_supplied_client=true", |
| 226 | + "params": { |
| 227 | + "project_id": "my-project", |
| 228 | + "credentials": { /* service account JSON */ } |
| 229 | + }, |
| 230 | + "param_style": "format" |
| 231 | +} |
| 232 | +``` |
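To make the conversion concrete, the following sketch builds both JSON payloads from stored configs. The config interfaces and function names are hypothetical. Percent-encoding the username, password, and database name is an assumption added here so that characters like `@` or `/` do not break the URL, and the rule that `sslmode` is set to `require` only when `ssl` is true is inferred from the examples, not confirmed by the source.

```typescript
interface PostgresConfigSketch {
    host: string;
    port: number;
    database: string;
    username: string;
    password: string;
    ssl?: boolean;
}

interface BigQueryConfigSketch {
    projectId: string;
    credentials: string; // JSON string of service account credentials
}

function postgresToToolkitJson(config: PostgresConfigSketch): string {
    // Assumption: URL components are percent-encoded before interpolation.
    const user = encodeURIComponent(config.username);
    const password = encodeURIComponent(config.password);
    const database = encodeURIComponent(config.database);
    const url = `postgresql://${user}:${password}@${config.host}:${config.port}/${database}`;
    const params = config.ssl ? { sslmode: 'require' } : {};
    return JSON.stringify({ url, params, param_style: 'format' });
}

function bigQueryToToolkitJson(config: BigQueryConfigSketch): string {
    return JSON.stringify({
        url: 'bigquery://?user_supplied_client=true',
        params: {
            project_id: config.projectId,
            // The stored string is parsed so the credentials are embedded as an object.
            credentials: JSON.parse(config.credentials)
        },
        param_style: 'format'
    });
}
```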
| 233 | + |
| 234 | +**Integration Points:** |
| 235 | +- Registered as an environment variable provider in the kernel environment service |
| 236 | +- Called when starting a Jupyter kernel for a Deepnote notebook |
| 237 | +- Environment variables are passed to the kernel process at startup |
| 238 | + |
| 239 | +#### 8. **SQL Integration Startup Code Provider** (`sqlIntegrationStartupCodeProvider.ts`) |
| 240 | + |
| 241 | +Injects Python code into the kernel at startup to set environment variables. |
| 242 | + |
| 243 | +**Why This Is Needed:** |
| 244 | +Jupyter doesn't automatically pass all environment variables from the server process to the kernel process. This provider ensures credentials are available in the kernel's `os.environ`. |
| 245 | + |
| 246 | +**Generated Code:** |
| 247 | +```python |
| 248 | +try: |
| 249 | + import os |
| 250 | + # [SQL Integration] Setting N SQL integration env vars... |
| 251 | + os.environ['SQL_MY_POSTGRES_DB'] = '{"url":"postgresql://...","params":{},"param_style":"format"}' |
| 252 | + os.environ['SQL_MY_BIGQUERY'] = '{"url":"bigquery://...","params":{...},"param_style":"format"}' |
| 253 | + # [SQL Integration] Successfully set N SQL integration env vars |
| 254 | +except Exception as e: |
| 255 | + import traceback |
| 256 | + print(f"[SQL Integration] ERROR: Failed to set SQL integration env vars: {e}") |
| 257 | + traceback.print_exc() |
| 258 | +``` |
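Generating that Python snippet safely mostly comes down to escaping the JSON value for a single-quoted Python string literal. The sketch below shows one way to do it; `pythonStringLiteral`, `generateStartupCode`, and the `EnvVar` shape are hypothetical, and the progress comments from the generated code above are omitted for brevity.

```typescript
interface EnvVar {
    name: string;
    value: string;
}

// Escapes a value for use inside a single-quoted Python string literal.
// Backslashes are escaped first so later replacements are not double-escaped.
function pythonStringLiteral(value: string): string {
    const escaped = value
        .replace(/\\/g, '\\\\')
        .replace(/'/g, "\\'")
        .replace(/\n/g, '\\n')
        .replace(/\r/g, '\\r');
    return "'" + escaped + "'";
}

// Builds Python source that copies each variable into os.environ.
// Returns an empty string when there is nothing to set.
function generateStartupCode(envVars: EnvVar[]): string {
    if (envVars.length === 0) {
        return '';
    }
    const lines = ['try:', '    import os'];
    for (const envVar of envVars) {
        lines.push(
            '    os.environ[' + pythonStringLiteral(envVar.name) + '] = ' + pythonStringLiteral(envVar.value)
        );
    }
    lines.push(
        'except Exception as e:',
        '    import traceback',
        '    print(f"[SQL Integration] ERROR: Failed to set SQL integration env vars: {e}")',
        '    traceback.print_exc()'
    );
    return lines.join('\n');
}
```

Without the escaping step, a password containing a single quote or a backslash would either break the generated Python or silently change the credential value.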
| 259 | + |
| 260 | +**Execution:** |
| 261 | +- Registered with `IStartupCodeProviders` for `JupyterNotebookView` |
| 262 | +- Runs automatically when a Python kernel starts for a Deepnote notebook |
| 263 | +- Priority: `StartupCodePriority.Base` (runs early) |
| 264 | +- Only runs for Python kernels on Deepnote notebooks |
| 265 | + |
| 266 | +### Toolkit Integration |
| 267 | + |
| 268 | +#### 9. **How Credentials Are Exposed to deepnote-toolkit** |
| 269 | + |
| 270 | +The `deepnote-toolkit` Python package reads credentials from environment variables to execute SQL blocks. |
| 271 | + |
| 272 | +**Flow:** |
| 273 | +1. Extension detects SQL blocks in notebook |
| 274 | +2. Extension retrieves credentials from secure storage |
| 275 | +3. Extension converts credentials to JSON format |
| 276 | +4. Extension injects credentials as environment variables (two methods): |
| 277 | + - **Server Process**: Via `SqlIntegrationEnvironmentVariablesProvider` when starting Jupyter server |
| 278 | + - **Kernel Process**: Via `SqlIntegrationStartupCodeProvider` when starting Python kernel |
| 279 | +5. `deepnote-toolkit` reads environment variables when executing SQL blocks |
| 280 | +6. Toolkit creates database connections using the credentials |
| 281 | +7. Toolkit executes SQL queries and returns results |
| 282 | + |
| 283 | +**Environment Variable Lookup:** |
| 284 | +When a SQL block with `sql_integration_id: "my-postgres-db"` is executed: |
| 285 | +1. Toolkit looks for environment variable `SQL_MY_POSTGRES_DB` |
| 286 | +2. Toolkit parses the JSON value |
| 287 | +3. Toolkit creates a SQLAlchemy connection using the `url` and `params` |
| 288 | +4. Toolkit executes the SQL query |
| 289 | +5. Toolkit returns results as a pandas DataFrame |
| 290 | + |
| 291 | +## Data Flow |
| 292 | + |
| 293 | +### Configuration Flow |
| 294 | +``` |
| 295 | +User → IntegrationPanel (UI) |
| 296 | + → vscodeApi.postMessage({ type: 'save', config }) |
| 297 | + → IntegrationWebviewProvider.onMessage() |
| 298 | + → IntegrationStorage.save(config) |
| 299 | + → EncryptedStorage.store() [VSCode SecretStorage API] |
| 300 | + → IntegrationStorage fires onDidChangeIntegrations event |
| 301 | + → SqlIntegrationEnvironmentVariablesProvider fires onDidChangeEnvironmentVariables event |
| 302 | +``` |
| 303 | + |
| 304 | +### Execution Flow |
| 305 | +``` |
| 306 | +User executes SQL cell |
| 307 | + → Kernel startup triggered |
| 308 | + → SqlIntegrationEnvironmentVariablesProvider.getEnvironmentVariables() |
| 309 | + → Scans notebook for SQL cells |
| 310 | + → Retrieves credentials from IntegrationStorage |
| 311 | + → Converts to JSON format |
| 312 | + → Returns environment variables |
| 313 | + → Environment variables passed to Jupyter server process |
| 314 | + → SqlIntegrationStartupCodeProvider.getCode() |
| 315 | + → Generates Python code to set os.environ |
| 316 | + → Startup code executed in kernel |
| 317 | + → deepnote-toolkit reads os.environ['SQL_*'] |
| 318 | + → Toolkit executes SQL query |
| 319 | + → Results returned to notebook |
| 320 | +``` |
| 321 | + |
| 322 | +## Security Considerations |
| 323 | + |
| 324 | +1. **Encrypted Storage**: All credentials are stored using VSCode's SecretStorage API, which uses the OS keychain |
| 325 | +2. **No Plaintext**: Credentials are never written to disk in plaintext |
| 326 | +3. **Scoped Access**: Storage is scoped to the VSCode extension |
| 327 | +4. **Environment Isolation**: Each notebook gets only the credentials it needs |
| 328 | +5. **Limited Logging**: Full credential values are not logged; at most the first 100 characters are logged for debugging, which can still include sensitive parts of a connection string |
| 329 | + |
| 330 | +## Adding New Integration Types |
| 331 | + |
| 332 | +To add a new integration type (e.g., MySQL, Snowflake): |
| 333 | + |
| 334 | +1. **Add type to `integrationTypes.ts`**: |
| 335 | + ```typescript |
| 336 | + export enum IntegrationType { |
| 337 | + Postgres = 'postgres', |
| 338 | + BigQuery = 'bigquery', |
| 339 | + MySQL = 'mysql' // New type |
| 340 | + } |
| 341 | + |
| 342 | + export interface MySQLIntegrationConfig extends BaseIntegrationConfig { |
| 343 | + type: IntegrationType.MySQL; |
| 344 | + host: string; |
| 345 | + port: number; |
| 346 | + database: string; |
| 347 | + username: string; |
| 348 | + password: string; |
| 349 | + } |
| 350 | + |
| 351 | + export type IntegrationConfig = PostgresIntegrationConfig | BigQueryIntegrationConfig | MySQLIntegrationConfig; |
| 352 | + ``` |
| 353 | + |
| 354 | +2. **Add conversion logic in `sqlIntegrationEnvironmentVariablesProvider.ts`**: |
| 355 | + ```typescript |
| 356 | + case IntegrationType.MySQL: { |
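| 357 | +       const url = `mysql://${encodeURIComponent(config.username)}:${encodeURIComponent(config.password)}@${config.host}:${config.port}/${config.database}`; // encode so special characters in credentials do not break the URL |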
| 358 | + return JSON.stringify({ url, params: {}, param_style: 'format' }); |
| 359 | + } |
| 360 | + ``` |
| 361 | + |
| 362 | +3. **Create UI form component** (`MySQLForm.tsx`) |
| 363 | + |
| 364 | +4. **Update `ConfigurationForm.tsx`** to render the new form |
| 365 | + |
| 366 | +5. **Update webview types** (`src/webviews/webview-side/integrations/types.ts`) |
| 367 | + |
| 368 | +6. **Add localization strings** for the new integration type |
| 369 | + |
| 370 | +## Testing |
| 371 | + |
| 372 | +Unit tests are located in: |
| 373 | +- `sqlIntegrationEnvironmentVariablesProvider.unit.test.ts` |
| 374 | + |
| 375 | +Tests cover: |
| 376 | +- Environment variable generation for each integration type |
| 377 | +- Multiple integrations in a single notebook |
| 378 | +- Missing credentials handling |
| 379 | +- Integration ID to environment variable name conversion |
| 380 | +- JSON format validation |
| 381 | + |