Problem Statement
The current read_variable_at_step tool loads entire variables into memory, causing significant issues when working with large scientific datasets:
- Memory Exhaustion: Variables with millions of elements can trigger out-of-memory errors
- AI Token Limits: Large arrays overwhelm AI context windows, making analysis impossible
- Performance Issues: Loading massive multi-dimensional datasets stalls processing
- Scalability Problems: Time-series and spatial data from simulations are often too large to process
This is particularly problematic for scientific workflows where BP5 files commonly contain variables with millions of elements from computational simulations.
Proposed Solution
Implement a new read_variable_chunk tool that enables memory-efficient reading of large BP5 variables through intelligent chunking:
Core Features Needed
- Automatic Chunking: Split large variables along the first dimension while preserving other dimensions
- Dynamic Sizing: Calculate optimal chunk sizes based on variable characteristics (suggested: 100-500 elements per chunk)
- Iteration Support: Provide metadata for progressive reading with clear navigation
- Memory Safety: Use ADIOS2's selection API to read only requested chunks
- AI-Friendly Output: Include rich metadata for intelligent processing decisions
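The chunking behavior described above could be sketched roughly as follows. This is a minimal planning helper, not an implementation of the proposed tool: the function name `plan_chunks`, its parameters, and the metadata keys are all hypothetical. It splits along the first dimension, caps chunks at the suggested upper bound, and produces start/count pairs of the kind that ADIOS2's selection API expects when reading a sub-region of a variable.

```python
import math

def plan_chunks(shape, dtype_size, target_bytes=4 * 1024 * 1024, max_rows=500):
    """Split a variable along its first dimension into chunks, preserving
    trailing dimensions. Hypothetical helper; names are illustrative.

    shape      -- full shape of the BP5 variable, e.g. [100000, 64]
    dtype_size -- bytes per element, e.g. 8 for float64
    target_bytes -- rough memory budget per chunk
    max_rows   -- cap on first-dimension length per chunk (the issue
                  suggests 100-500 elements per chunk)
    """
    # Bytes occupied by one "row" (one slice along the first dimension).
    row_bytes = dtype_size * (math.prod(shape[1:]) if len(shape) > 1 else 1)
    rows_per_chunk = max(1, min(target_bytes // row_bytes, max_rows))

    chunks = []
    for start in range(0, shape[0], rows_per_chunk):
        count = min(rows_per_chunk, shape[0] - start)
        chunks.append({
            # start/count pairs in the form a selection-based read expects
            "start": [start] + [0] * (len(shape) - 1),
            "count": [count] + list(shape[1:]),
            "index": len(chunks),
        })
    # Navigation metadata for progressive reading.
    for c in chunks:
        c["total_chunks"] = len(chunks)
    return chunks
```

Each returned entry carries enough metadata for an AI client to iterate (`index` / `total_chunks`) and to issue a bounded read for exactly one chunk rather than the whole variable.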
Impact
This feature would unlock AI-driven analysis of large scientific datasets that are currently impossible to process due to memory constraints. It's essential for enabling modern AI workflows in computational science and research environments.