Description
To reach feature parity with Powertools for Python, we should add a streaming module. It would let us process datasets larger than the available memory as a stream, for instance, transforming CSVs on the fly.
Within Lambda, processing S3 objects larger than the allocated memory can lead to out-of-memory errors or timeouts. For cost efficiency, your S3 objects may be encoded and compressed in various formats (gzip, CSV, zip files, etc.), which increases both the amount of non-business logic you have to write and the reliability risks.
The streaming utility makes this process easier by fetching parts of your data as you consume it and transparently applying data transformations to the data stream. This allows you to process one, a few, or all rows of a large dataset while using only a few MB of memory.
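To illustrate the idea, here is a minimal sketch of lazy, layered transformations using only the Python standard library. It is not the Powertools API: the function name `iter_gzip_csv_rows` is hypothetical, and an in-memory `BytesIO` stands in for a remote S3 object body. It only shows how decompression and CSV parsing can be stacked on a byte stream so that rows are produced as they are consumed.

```python
import csv
import gzip
import io
from typing import BinaryIO, Dict, Iterator


def iter_gzip_csv_rows(raw: BinaryIO) -> Iterator[Dict[str, str]]:
    """Lazily yield CSV rows from a gzip-compressed binary stream.

    Hypothetical helper for illustration; each layer reads from the one
    below it on demand, so the full payload is never decoded up front.
    """
    # Layer 1: decompress bytes as they are read from the underlying stream.
    decompressed = gzip.GzipFile(fileobj=raw, mode="rb")
    # Layer 2: decode bytes to text (UTF-8 is an assumption for this sketch).
    text = io.TextIOWrapper(decompressed, encoding="utf-8", newline="")
    # Layer 3: parse CSV rows one at a time, keyed by the header row.
    yield from csv.DictReader(text)


# Stand-in for a remote object body (assumption: small in-memory sample).
payload = gzip.compress(b"id,name\n1,alice\n2,bob\n")
rows = iter_gzip_csv_rows(io.BytesIO(payload))

# Consume only the first row; the rest is parsed only if requested.
first = next(rows)
print(first["name"])
```

A real implementation would replace the `BytesIO` with a seekable, range-fetching wrapper around the S3 object, so that only the bytes actually read are downloaded.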
Python version: https://awslabs.github.io/aws-lambda-powertools-python/2.4.0/utilities/streaming/