Add cloud data loading support #8173

Open
@KumoLiu

Description

@KumoLiu

This feature would enable the application to pull, process, and store data directly from cloud-based services, such as AWS, Google Cloud, and Azure. By integrating cloud data support, we aim to provide users with more flexibility and scalability in managing their data, especially when dealing with large datasets or distributed systems.

Implementation Considerations:
API Integrations: We would need to support APIs for major cloud providers (e.g., S3 for AWS, Blob Storage for Azure, and Google Cloud Storage).
Authentication and Security: Secure access management is critical, so we may need to integrate with the providers' identity and access management mechanisms, such as IAM.
Data Formats and Compatibility: Support for multiple data formats (e.g., CSV, JSON, DICOM) to ensure compatibility with various data types stored on the cloud.
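
To make the considerations above more concrete, here is a minimal sketch of one possible design: readers are registered per URI scheme (one per storage backend) and decoders per file suffix (one per data format). All names here (`READERS`, `DECODERS`, `register_reader`, `load`) are hypothetical and not part of any existing API. Only a local-file reader is implemented; a real S3, Azure Blob Storage, or Google Cloud Storage reader would wrap the provider's SDK and handle credentials, and DICOM decoding is left out because it needs a third-party library.

```python
import csv
import io
import json
from pathlib import Path
from typing import Any, Callable, Dict
from urllib.parse import urlparse
from urllib.request import url2pathname

# Hypothetical registries (illustration only):
#   URI scheme  -> function that fetches raw bytes for a URI
#   file suffix -> function that decodes raw bytes into Python objects
READERS: Dict[str, Callable[[str], bytes]] = {}
DECODERS: Dict[str, Callable[[bytes], Any]] = {}


def register_reader(scheme: str, reader: Callable[[str], bytes]) -> None:
    """Register a storage backend for a URI scheme such as 's3' or 'gs'."""
    READERS[scheme] = reader


def read_local(uri: str) -> bytes:
    """Read a plain path or a file:// URI from the local filesystem."""
    parsed = urlparse(uri)
    path = url2pathname(parsed.path) if parsed.scheme == "file" else uri
    return Path(path).read_bytes()


def decode_json(data: bytes) -> Any:
    return json.loads(data.decode("utf-8"))


def decode_csv(data: bytes) -> Any:
    # Each row becomes a dict keyed by the header line.
    return list(csv.DictReader(io.StringIO(data.decode("utf-8"))))


register_reader("file", read_local)
DECODERS[".json"] = decode_json
DECODERS[".csv"] = decode_csv


def load(uri: str) -> Any:
    """Fetch bytes via the reader for the URI scheme, then decode by suffix."""
    parsed = urlparse(uri)
    scheme = parsed.scheme or "file"  # assumption: no scheme means a local path
    if scheme not in READERS:
        raise ValueError(f"no reader registered for scheme '{scheme}'")
    suffix = Path(parsed.path).suffix.lower()
    if suffix not in DECODERS:
        raise ValueError(f"no decoder registered for suffix '{suffix}'")
    return DECODERS[suffix](READERS[scheme](uri))
```

With this layout, adding a provider amounts to one `register_reader("s3", ...)` call, and adding a format amounts to one new entry in `DECODERS`, so the two concerns stay independent.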

ref:
https://github.com/webdataset/webdataset
https://github.com/mosaicml/streaming
https://aws.amazon.com/healthimaging/
