forked from crtahlin/swarm_connect
Open
Description
Context
This is a future enhancement that builds on #11 (manifest/collection upload support).
Benchmark testing (datafund/provenance-fellowship#22) showed that HTTP overhead (~279ms per request) dominates upload latency for small files. Manifest bundling provides 15-25x throughput improvement.
However, requiring clients to create TAR archives adds complexity. This feature would allow the gateway to handle batching transparently.
Concept
The gateway accepts sequential file uploads but buffers them internally, uploading as a manifest when thresholds are met:
Client                          Gateway                             Bee
|                               |                                   |
|-- POST /data (file1) -------> | [buffer]                          |
|<-- 202 Accepted, queued ------|                                   |
|-- POST /data (file2) -------> | [buffer]                          |
|<-- 202 Accepted, queued ------|                                   |
|-- POST /data (file3) -------> | [buffer reaches threshold]        |
|                               |-- POST /bzz (manifest) ---------> |
|                               |<-- {manifest_hash} ---------------|
|<-- {file1: hash1, ...} -------|                                   |
Possible Trigger Thresholds
| Threshold | Example | Description |
|---|---|---|
| File count | 100 files | Upload when buffer has N files |
| Total size | 1 MB | Upload when buffer reaches N bytes |
| Time-based | 5 seconds | Upload after timeout even if other thresholds not met |
| Manual flush | POST /flush | Client explicitly triggers upload |
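The thresholds in the table above could be combined in a single buffer object. The following is a minimal sketch only, not the gateway's implementation: `BatchBuffer`, its field names, and the default values (taken from the table's example column) are all hypothetical.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class BatchBuffer:
    """Hypothetical in-memory buffer; none of these names exist in the gateway."""
    max_files: int = 100          # file-count threshold
    max_bytes: int = 1_000_000    # total-size threshold (1 MB)
    max_age_s: float = 5.0        # time-based threshold
    clock: Callable[[], float] = time.monotonic  # injectable for testing
    _files: List[Tuple[str, bytes]] = field(default_factory=list)
    _bytes: int = 0
    _first_at: Optional[float] = None

    def add(self, name: str, data: bytes) -> Optional[List[Tuple[str, bytes]]]:
        """Buffer one file; return the drained batch if a threshold was hit."""
        if self._first_at is None:
            self._first_at = self.clock()
        self._files.append((name, data))
        self._bytes += len(data)
        return self.flush() if self.should_flush() else None

    def should_flush(self) -> bool:
        """True when any of the count, size, or age thresholds is met."""
        if not self._files:
            return False
        age = self.clock() - self._first_at
        return (len(self._files) >= self.max_files
                or self._bytes >= self.max_bytes
                or age >= self.max_age_s)

    def flush(self) -> List[Tuple[str, bytes]]:
        """Manual flush: drain whatever is buffered, even below thresholds."""
        batch, self._files = self._files, []
        self._bytes, self._first_at = 0, None
        return batch
```

In this sketch the age threshold is only evaluated when `add` or `should_flush` is called. A real gateway would presumably need a background timer to flush an idle buffer, and would hand the drained batch to the manifest upload from #11.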
Possible API Design
# Enable batching mode
POST /api/v1/data/?stamp_id={id}&batch=true
# With configurable thresholds
POST /api/v1/data/?stamp_id={id}&batch=true&batch_size=100&batch_timeout=5s
# Get status of pending batch
GET /api/v1/batch/status?stamp_id={id}
# Force flush pending files
POST /api/v1/batch/flush?stamp_id={id}
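From the client side, the proposed endpoints might be addressed as in the sketch below. The paths and query parameters are the ones proposed above and are not implemented yet. The `BASE` address and the helper names are assumptions made for illustration.

```python
from typing import Optional
from urllib.parse import urlencode

BASE = "http://localhost:8000"  # assumed gateway address, not from the issue


def batch_upload_url(stamp_id: str,
                     batch_size: Optional[int] = None,
                     batch_timeout: Optional[str] = None) -> str:
    """Build the proposed upload URL with batching enabled."""
    params = {"stamp_id": stamp_id, "batch": "true"}
    if batch_size is not None:
        params["batch_size"] = str(batch_size)
    if batch_timeout is not None:
        params["batch_timeout"] = batch_timeout  # e.g. "5s", as in the proposal
    return f"{BASE}/api/v1/data/?{urlencode(params)}"


def batch_status_url(stamp_id: str) -> str:
    """Build the proposed status URL for the pending batch."""
    return f"{BASE}/api/v1/batch/status?{urlencode({'stamp_id': stamp_id})}"


def batch_flush_url(stamp_id: str) -> str:
    """Build the proposed URL that forces a flush of pending files."""
    return f"{BASE}/api/v1/batch/flush?{urlencode({'stamp_id': stamp_id})}"
```

A client would POST each file body to `batch_upload_url(...)` and, if needed, POST to `batch_flush_url(...)` to force the upload.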
Response Handling Options
- Synchronous: Block until batch uploads, return all hashes (simpler but slower)
- Async with polling: Return 202, client polls for results
- Async with callback: Return 202, gateway calls webhook with results
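The "async with polling" option could be backed by a small result registry, sketched below under stated assumptions. `BatchResults`, the batch ID scheme, and the response shapes are hypothetical; only the 202 status for a pending upload comes from the flow described above.

```python
import uuid
from typing import Dict, Optional, Tuple


class BatchResults:
    """Hypothetical in-memory registry mapping batch IDs to upload results."""

    def __init__(self) -> None:
        # None means the batch has been accepted but not yet uploaded.
        self._results: Dict[str, Optional[Dict[str, str]]] = {}

    def open_batch(self) -> str:
        """Register a pending batch and return its ID for the 202 response."""
        batch_id = uuid.uuid4().hex
        self._results[batch_id] = None
        return batch_id

    def complete(self, batch_id: str, hashes: Dict[str, str]) -> None:
        """Record the per-file hashes once the manifest upload has finished."""
        self._results[batch_id] = dict(hashes)

    def poll(self, batch_id: str) -> Tuple[int, dict]:
        """Return an (HTTP status, body) pair for a client polling request."""
        if batch_id not in self._results:
            return 404, {"error": "unknown batch"}
        hashes = self._results[batch_id]
        if hashes is None:
            return 202, {"status": "pending"}
        return 200, {"status": "done", "files": hashes}
```

The callback variant could reuse the same `complete` step and send the recorded hashes to a webhook instead of waiting for the client to poll.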
Benefits
- Transparent to client: No TAR creation needed client-side
- Optimal batching: Gateway decides best batch size based on traffic patterns
- Reduced complexity: Simple POST per file, gateway handles optimization
Dependencies
- Requires #11 (manifest upload support) to be implemented first
- Manifest upload is the foundation this feature builds upon
Status
TBD - Exact behavior, API design, and thresholds to be determined based on:
- Real-world usage patterns
- EnergonX requirements validation
- Performance testing with manifest uploads (#11, "Add manifest/collection upload support via /bzz with Swarm-Collection header")
This issue tracks a future enhancement. Implementation priority depends on client needs and #11 completion.