fix(s3): allow to delete objects individually #2131
Conversation
Walkthrough

Adds configurable S3 deletion strategies to S3BlobStorage (bulk vs. per-object) with paging, bounded concurrency, abort support, and a new S3BlobStorageConfig; propagates the option via BuildConfig, an env flag, index wiring, and GCS endpoint auto-detection, and adds tests for the GCS URL detector.
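A minimal sketch of the shape this describes; the config interface name matches the diffs later in this review, while the auto-detection helper and its hostname check are illustrative assumptions, not the actual util code:

```ts
// Sketch only: S3BlobStorageConfig mirrors the interface discussed in the review below;
// resolveDeleteStrategy is a hypothetical helper showing how env flag + GCS detection could combine.
export interface S3BlobStorageConfig {
  /** Use per-object DeleteObject calls instead of bulk DeleteObjects (needed for GCS's S3-compatible API). */
  useIndividualDeletes?: boolean;
}

export function resolveDeleteStrategy(storageUrl: string, envFlag?: boolean): S3BlobStorageConfig {
  // An explicit env flag wins; otherwise fall back to detecting a GCS endpoint by hostname.
  const host = new URL(storageUrl).hostname.toLowerCase();
  const isGcs = host === 'storage.googleapis.com' || host.endsWith('.storage.googleapis.com');
  return { useIndividualDeletes: envFlag ?? isGcs };
}
```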
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Actionable comments posted: 3
🔭 Outside diff range comments (1)
controlplane/src/core/blobstorage/s3.ts (1)
93-106: Accept 200 or 204 for DeleteObject to be provider-agnostic

Some S3-compatible providers may return 200 instead of 204 for DeleteObject. Strictly requiring 204 can cause false negatives.
Apply this diff:
```diff
-    if (resp.$metadata.httpStatusCode !== 204) {
+    if (![200, 204].includes(resp.$metadata.httpStatusCode ?? 0)) {
       throw new Error(`Failed to delete object from S3: ${resp}`);
     }
```
🧹 Nitpick comments (2)
controlplane/src/core/blobstorage/s3.ts (2)
12-26: Make concurrency and delay configurable via S3BlobStorageConfig

Hard-coded maxConcurrency and batchDelayMs limit tuning across providers (MinIO, Cloudflare R2, GCS). Exposing them in config enables safer rollout and performance tuning.

Proposed interface change and defaults:

```diff
-const maxConcurrency = 10; // Maximum number of concurrent operations (conservative for GCS compatibility)
-const batchDelayMs = 200; // Delay between batches in milliseconds (balance between throughput and rate limiting)
+const defaultMaxConcurrency = 10;
+const defaultBatchDelayMs = 200;
+
 export interface S3BlobStorageConfig {
   /**
    * Use individual delete operations instead of bulk delete.
    * Set to true for GCS compatibility, false for better S3 performance.
    * @default false
    */
   useIndividualDeletes?: boolean;
+  /**
+   * Max number of concurrent individual delete operations.
+   * @default 10
+   */
+  maxConcurrency?: number;
+  /**
+   * Delay between batches of individual deletes in ms.
+   * @default 200
+   */
+  batchDelayMs?: number;
 }
```

And store them in the class:

```diff
 export class S3BlobStorage implements BlobStorage {
   private readonly useIndividualDeletes: boolean;
+  private readonly maxConcurrency: number;
+  private readonly batchDelayMs: number;
```

Constructor:

```diff
-  ) {
-    this.useIndividualDeletes = config.useIndividualDeletes ?? false;
-  }
+  ) {
+    this.useIndividualDeletes = config.useIndividualDeletes ?? false;
+    this.maxConcurrency = config.maxConcurrency ?? defaultMaxConcurrency;
+    this.batchDelayMs = config.batchDelayMs ?? defaultBatchDelayMs;
+  }
```

Use the fields:

```diff
-      await this.sleep(batchDelayMs);
+      await this.sleep(this.batchDelayMs);

-    const deletedCounts = await this.executeWithConcurrency(deleteTasks, maxConcurrency);
+    const deletedCounts = await this.executeWithConcurrency(deleteTasks, this.maxConcurrency);
```
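For context, a hypothetical call site using the proposed options; the constructor argument order, import path, and bucket name are assumptions for illustration, not taken from the diff:

```ts
import { S3Client } from '@aws-sdk/client-s3';
// Hypothetical import path based on the file listing in this review.
import { S3BlobStorage } from './core/blobstorage/s3.js';

const s3Client = new S3Client({ region: 'auto', endpoint: process.env.S3_STORAGE_URL, forcePathStyle: true });
const storage = new S3BlobStorage(s3Client, 'cosmo-blobs', {
  useIndividualDeletes: true, // per-object deletes for GCS-compatible endpoints
  maxConcurrency: 5,          // proposed option: stay below provider rate limits
  batchDelayMs: 100,          // proposed option: pause between delete batches
});
```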
138-164: Consider counting actual deletions when using bulk API

Returning objectsToDelete.length assumes all objects were deleted. With Quiet: false, you can use deleted.Deleted?.length ?? 0 for accuracy. Not required functionally, but it improves observability and correctness.

Example change:

```diff
- return objectsToDelete.length;
+ return (deleted.Deleted?.length ?? 0) + (deleted.Errors?.length ? 0 : 0);
```

Or simply:

```diff
- return objectsToDelete.length;
+ return deleted.Deleted?.length ?? objectsToDelete.length;
```
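A self-contained sketch of the counting variant, assuming Quiet: false; bucket and keys are placeholders:

```ts
import { S3Client, DeleteObjectsCommand } from '@aws-sdk/client-s3';

// Quiet: false makes the service report each deleted key, so the returned count
// reflects what actually happened rather than what was requested.
async function deleteBulk(client: S3Client, bucket: string, keys: string[]): Promise<number> {
  const resp = await client.send(
    new DeleteObjectsCommand({
      Bucket: bucket,
      Delete: { Objects: keys.map((Key) => ({ Key })), Quiet: false },
    }),
  );
  if (resp.Errors?.length) {
    throw new Error(`Failed to delete ${resp.Errors.length} object(s)`);
  }
  return resp.Deleted?.length ?? 0;
}
```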
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- controlplane/src/core/blobstorage/s3.ts (2 hunks)
- controlplane/src/core/build-server.ts (3 hunks)
- controlplane/src/core/env.schema.ts (1 hunks)
- controlplane/src/core/util.ts (1 hunks)
- controlplane/src/index.ts (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
controlplane/src/core/build-server.ts (2)
- controlplane/src/core/blobstorage/s3.ts (1)
  - S3BlobStorage (30-215)
- controlplane/src/core/util.ts (1)
  - isGoogleCloudStorageUrl (405-418)

controlplane/src/core/blobstorage/s3.ts (1)
- controlplane/src/core/blobstorage/index.ts (1)
  - BlobStorage (18-53)
🔇 Additional comments (3)
controlplane/src/index.ts (1)
49-50: Wiring env flag into BuildConfig looks correct

The new S3_USE_INDIVIDUAL_DELETES is properly parsed and passed through to BuildConfig.s3Storage.useIndividualDeletes.

Also applies to: 132-133
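For illustration, parsing such a flag in env.schema.ts might look like the following; this assumes a zod-based schema and a string-to-boolean coercion, and only the variable name itself comes from the PR:

```ts
import { z } from 'zod';

// Assumed shape: a string env var coerced to a boolean, defaulting to false when unset.
export const envVariables = z.object({
  S3_STORAGE_URL: z.string().url(),
  S3_USE_INDIVIDUAL_DELETES: z
    .string()
    .optional()
    .transform((value) => value === 'true'),
});

export type Env = z.infer<typeof envVariables>;
```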
controlplane/src/core/build-server.ts (1)
40-40: Import of GCS detector is appropriate and localized

Pulling isGoogleCloudStorageUrl from util centralizes detection logic and keeps concerns separated.

controlplane/src/core/blobstorage/s3.ts (1)
185-214: Good: paginated listing and strategy selection per page

Listing with ContinuationToken and switching between bulk and individual deletes per page is correct and memory-safe. Abort signals are also properly propagated.
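A condensed sketch of the pattern being praised here, showing only the per-object branch; the function and variable names are illustrative, not the actual implementation:

```ts
import { S3Client, ListObjectsV2Command, DeleteObjectCommand } from '@aws-sdk/client-s3';

// List one page at a time and delete it before fetching the next page, so memory stays bounded
// and the abort signal is honored on every request.
async function removePrefix(client: S3Client, bucket: string, prefix: string, abortSignal?: AbortSignal) {
  let continuationToken: string | undefined;
  let deleted = 0;
  do {
    const page = await client.send(
      new ListObjectsV2Command({ Bucket: bucket, Prefix: prefix, ContinuationToken: continuationToken }),
      { abortSignal },
    );
    for (const obj of page.Contents ?? []) {
      // The real code switches between bulk DeleteObjects and per-object deletes at this point.
      await client.send(new DeleteObjectCommand({ Bucket: bucket, Key: obj.Key }), { abortSignal });
      deleted++;
    }
    continuationToken = page.NextContinuationToken;
  } while (continuationToken);
  return deleted;
}
```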
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
controlplane/test/utils.test.ts (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: build_test
- GitHub Check: build_push_image
- GitHub Check: Analyze (javascript-typescript)
- GitHub Check: Analyze (go)
🔇 Additional comments (2)
controlplane/test/utils.test.ts (2)
2-2: Import addition looks correct

Importing isGoogleCloudStorageUrl from util.js is consistent with the existing ESM import style in this test suite.
33-40: Positive GCS URL cases covered well

Good mix of host variations, including a case-insensitive host check.
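Illustrative cases in the same spirit; this assumes vitest, and the exact URLs and import path in the real test file may differ:

```ts
import { describe, expect, test } from 'vitest';
import { isGoogleCloudStorageUrl } from '../src/core/util.js';

describe('isGoogleCloudStorageUrl', () => {
  test('detects GCS hosts, including mixed case', () => {
    expect(isGoogleCloudStorageUrl('https://storage.googleapis.com/my-bucket')).toBe(true);
    expect(isGoogleCloudStorageUrl('https://STORAGE.GOOGLEAPIS.COM/my-bucket')).toBe(true);
  });

  test('rejects non-GCS endpoints', () => {
    expect(isGoogleCloudStorageUrl('http://minio:9000/my-bucket')).toBe(false);
  });
});
```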
Actionable comments posted: 2
🔭 Outside diff range comments (1)
controlplane/src/core/blobstorage/s3.ts (1)
56-79: AbortSignal is ignored by putObject; align signature with interface and pass it to the AWS SDK

BlobStorage.putObject supports abortSignal; the S3 implementation currently drops it. This can cause unresponsive cancellations and inconsistent behavior vs. other providers.

Apply this diff to accept and propagate abortSignal (the parameter is also added to the destructuring so it is actually in scope):

```diff
   async putObject<Metadata extends Record<string, string>>({
     key,
+    abortSignal,
     body,
     contentType,
     metadata,
   }: {
-    key: string;
-    body: Buffer;
-    contentType: string;
-    version?: string;
-    metadata?: Metadata;
+    key: string;
+    abortSignal?: AbortSignal;
+    body: Buffer;
+    contentType: string;
+    version?: string;
+    metadata?: Metadata;
   }): Promise<void> {
     const command = new PutObjectCommand({
       Bucket: this.bucketName,
       Key: key,
       Body: body,
       ContentType: contentType,
       Metadata: metadata,
     });
-    const resp = await this.s3Client.send(command);
+    const resp = await this.s3Client.send(command, { abortSignal });
     if (resp.$metadata.httpStatusCode !== 200) {
-      throw new Error(`Failed to put object to S3: ${resp}`);
+      throw new Error(`Failed to put object to S3: ${JSON.stringify(resp)}`);
     }
   }
```

Note: switched error interpolation to JSON.stringify for meaningful diagnostics (see the similar suggestion below).
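Usage would then look roughly like this, assuming storage is an S3BlobStorage instance constructed elsewhere; the key and payload are placeholders:

```ts
// Cancel the upload if it has not completed within five seconds.
const controller = new AbortController();
const timer = setTimeout(() => controller.abort(), 5_000);

try {
  await storage.putObject({
    key: 'example/object.json',
    body: Buffer.from(JSON.stringify({ hello: 'world' })),
    contentType: 'application/json',
    abortSignal: controller.signal,
  });
} finally {
  clearTimeout(timer);
}
```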
🧹 Nitpick comments (4)
controlplane/src/core/blobstorage/s3.ts (4)
12-13: Consider making deletion concurrency configurable instead of hardcoding 10

Different backends (AWS S3, GCS S3-compat, MinIO) and deployments have varying rate limits. Making this configurable avoids throttling or underutilization.
If you want a minimal change, extend the config and default here:
```diff
-const maxConcurrency = 10; // Maximum number of concurrent operations
+const DEFAULT_MAX_CONCURRENCY = 10; // Maximum number of concurrent operations (override via config)
```

Outside this hunk:

- Add maxConcurrency?: number to S3BlobStorageConfig.
- Store this.maxConcurrency = config.maxConcurrency ?? DEFAULT_MAX_CONCURRENCY; in the constructor.
- Use this.maxConcurrency at the call site.
40-54: Docstring says “delays between batches” but no delays are implemented

The implementation batches promises with bounded concurrency, but there is no delay. Fix the comment to avoid confusion.
Apply this diff to the docstring:
```diff
 /**
- * Execute promises with limited concurrency and delays between batches
+ * Execute tasks with bounded concurrency in batches (no artificial delay between batches)
  * Retries are handled by AWS SDK internally using exponential backoff. Default 3 retries.
  */
```
156-171: Individual deletes: consider allSettled to aggregate failures instead of failing the whole batch early

With Promise.all, one failure aborts the entire batch, which may mask partial successes. Using allSettled allows reporting exactly which keys failed.
Apply this diff to aggregate errors without losing partial progress:
```diff
-    const deletedCounts = await this.executeWithConcurrency(deleteTasks, maxConcurrency);
-    return deletedCounts.reduce((sum: number, count: number) => sum + count, 0);
+    const settled = await this.executeWithConcurrency(
+      deleteTasks.map((task) => async () => {
+        try {
+          return await task();
+        } catch (e) {
+          return e as Error;
+        }
+      }),
+      maxConcurrency,
+    );
+    const errors = settled.filter((r): r is Error => r instanceof Error);
+    const successes = settled.filter((r) => typeof r === 'number') as number[];
+    if (errors.length > 0) {
+      // Surface actionable details but still return number of successful deletions.
+      throw new Error(`Failed to delete ${errors.length} object(s): ${errors.map((e) => e.message).join('; ')}`);
+    }
+    return successes.reduce((sum, count) => sum + count, 0);
```

If keeping the fail-fast behavior is intentional, ignore this suggestion.
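The same aggregation idea, reduced to a generic helper independent of the S3 types and without the bounded-concurrency batching; names are illustrative:

```ts
// Run all tasks, collect failures instead of stopping at the first one, and only then
// decide whether to throw with an aggregated message.
async function runAllSettled<T>(tasks: Array<() => Promise<T>>): Promise<T[]> {
  const settled = await Promise.allSettled(tasks.map((task) => task()));
  const failures = settled.filter((r): r is PromiseRejectedResult => r.status === 'rejected');
  if (failures.length > 0) {
    throw new Error(`${failures.length} task(s) failed: ${failures.map((f) => String(f.reason)).join('; ')}`);
  }
  return settled.map((r) => (r as PromiseFulfilledResult<T>).value);
}
```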
188-196: Nit: rephrase comment for clarity (GCS vs S3 wording)

The individual delete path is for S3-compatible backends that don't implement DeleteObjects (not “S3 implementation without DeleteObjectsCommand”).
Apply this diff:
```diff
-    // Use individual deletes for S3 implementation without DeleteObjectsCommand
+    // Use per-object deletes for backends that don't support DeleteObjects (e.g., some GCS S3-compatible endpoints)
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these settings in your CodeRabbit configuration.
📒 Files selected for processing (1)
controlplane/src/core/blobstorage/s3.ts (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
controlplane/src/core/blobstorage/s3.ts (1)
- controlplane/src/core/blobstorage/index.ts (1)
  - BlobStorage (18-53)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: build_push_image
- GitHub Check: build_test
- GitHub Check: Analyze (go)
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (2)
controlplane/src/core/blobstorage/s3.ts (2)
14-24: Config option for per-object deletes looks good

The config shape and defaulting behavior are clear and align with the PR objective.
30-38: Constructor config injection and defaulting are correct

Storing useIndividualDeletes with a sensible default is straightforward and maintains backward compatibility.