Add Streaming module #1085

Open
@scottgerring

Description

To close the feature gap with Powertools for Python, we should add a streaming module. It would let us process datasets larger than the available memory as a stream, for instance transforming CSVs on the fly.

Within Lambda, processing S3 objects larger than the allocated memory can lead to out-of-memory errors or timeouts. For cost efficiency, your S3 objects may also be encoded and compressed in various formats (gzip, CSV, zip files, etc.), which increases both the amount of non-business logic you have to write and the reliability risk.

The streaming utility makes this process easier by fetching parts of your data as you consume it and transparently applying data transformations to the stream. This allows you to process one, a few, or all rows of a large dataset while using only a few MB of memory.
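
To illustrate the idea of chaining transformations over a stream, here is a minimal sketch in Python that uses only the standard library. It is not the Powertools API: `make_gzipped_csv` and `stream_rows` are hypothetical names, and an in-memory `io.BytesIO` stands in for the body of an S3 object. The point is that decompression and CSV parsing are layered on top of the raw byte stream, so rows are decoded only as the caller asks for them.

```python
import csv
import gzip
import io


def make_gzipped_csv(rows):
    """Build an in-memory gzip-compressed CSV (assumed stand-in for an S3 object body)."""
    text = io.StringIO()
    csv.writer(text).writerows(rows)
    return io.BytesIO(gzip.compress(text.getvalue().encode("utf-8")))


def stream_rows(raw_stream):
    """Yield CSV rows lazily from a gzip-compressed binary stream."""
    # Each layer wraps the previous one and reads from it on demand.
    decompressed = gzip.GzipFile(fileobj=raw_stream)
    text = io.TextIOWrapper(decompressed, encoding="utf-8", newline="")
    yield from csv.DictReader(text)


body = make_gzipped_csv([["id", "name"], ["1", "alpha"], ["2", "beta"]])

# Only the first row is requested, so the rest of the file is never parsed.
first = next(stream_rows(body))
print(first)
```

A module for this project would need the same layering over whatever stream type the AWS SDK returns for an S3 object, with the transformations (gzip, CSV, zip) selectable by the caller.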

Python version: https://awslabs.github.io/aws-lambda-powertools-python/2.4.0/utilities/streaming/
