
Handle large datasets efficiently #582

Open
dalonsoa opened this issue Oct 9, 2024 · 1 comment
Assignees
Labels
enhancement New feature or request

Comments

@dalonsoa
Collaborator

dalonsoa commented Oct 9, 2024

  • Some models will require data at a much higher temporal resolution than the wider model update tick, for example sub-daily or daily inputs to the Abiotic model.
  • The input data files for this use case can be very large – not something we really want to ingest into the Data object at model startup and hold in RAM.
  • So, where do we store this kind of data, and is there a way to lazily load it as required? This might be something that dask is well suited to, since it handles lazy loading of chunked data – see the sketch below.
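As a rough illustration, here is a minimal sketch of what lazy loading could look like using xarray backed by dask. The file name, variable name and chunk sizes are hypothetical placeholders, not anything that exists in the current codebase:

```python
import xarray as xr

# Passing `chunks` makes xarray back each variable with a dask array, so
# only the metadata is read here; the values stay on disk until needed.
dataset = xr.open_dataset(
    "abiotic_subdaily_inputs.nc",  # hypothetical large input file
    chunks={"time": 24},           # e.g. one day of hourly data per chunk
)

# Selecting a slice is still lazy: no data has been loaded yet.
one_tick = dataset["air_temperature"].isel(time=slice(0, 24))

# Only this call pulls the required chunk(s) into memory.
values = one_tick.compute()
print(values.shape)
```

In principle, the Data object could then hold a lazy reference like this and only pull the chunks needed for each update step, rather than loading the whole file at startup.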
@dalonsoa dalonsoa added the enhancement New feature or request label Oct 9, 2024
@dalonsoa
Collaborator Author

dalonsoa commented Oct 9, 2024

@vgro, we will need an example simulation with at least one BIG file and some indication of where it is used, so we can explore how best to handle it memory-wise.

@alexdewar alexdewar self-assigned this Oct 9, 2024