Terraform module and Lambda for saving JSON log records from Kinesis Data Streams to S3.

- Records in the Kinesis stream must be valid JSON. Non-JSON data is saved under the `unknown` prefix.
  - Gzipped JSON and the CloudWatch Logs subscription filter log format are supported.
  - Logs missing any of the required keys listed below are saved under `unknown` as well.
- JSON data must have the following keys (key names can be changed via variables):
  - `log_type`: Log type identifier. Log data is saved under this key: `%log_type%/YYYY-MM/DD/`.
  - `log_id`: Any unique identifier. Used to avoid file overwrites on S3; also useful for finding a specific log record.
  - `time`: Any timestamp supported by `dateutil.parser.parse`. ISO 8601 with milli/microseconds is recommended.
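Because the key names are configurable, a stream whose records use different field names can be handled by overriding the corresponding variables. The following is a minimal sketch, not a tested configuration: the field names `type`, `id`, and `timestamp`, the module label, the bucket, and the path prefix are all hypothetical, and the stream resource is assumed to be defined elsewhere as `aws_kinesis_stream.stream`.

```hcl
# Sketch: records look like {"type": "...", "id": "...", "timestamp": "..."}
# instead of the default log_type / log_id / time keys (hypothetical field names).
module "kinesis_to_s3_custom_keys" {
  source  = "baikonur-oss/lambda-kinesis-to-s3/aws"
  version = "2.1.0"

  lambda_package_url = "https://github.com/baikonur-oss/terraform-aws-lambda-kinesis-to-s3/releases/download/v2.1.0/lambda_package.zip"
  name               = "kinesis_to_s3_custom_keys"
  kinesis_stream_arn = aws_kinesis_stream.stream.arn
  batch_size         = "100"
  log_bucket         = "example-bucket" # placeholder bucket name
  log_path_prefix    = "foo/bar"        # placeholder prefix

  # Override the default key names ("log_type", "log_id", "time")
  log_type_field      = "type"
  log_id_field        = "id"
  log_timestamp_field = "timestamp"

  # Prefix used for records without a log type field (default is "unknown")
  log_type_unknown_prefix = "unclassified"
}
```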
resource "aws_kinesis_stream" "stream" {
  name             = "stream"
  shard_count      = "1"
  retention_period = "24"
}

module "kinesis_to_s3" {
  source  = "baikonur-oss/lambda-kinesis-to-s3/aws"
  version = "2.1.0"

  lambda_package_url = "https://github.com/baikonur-oss/terraform-aws-lambda-kinesis-to-s3/releases/download/v2.1.0/lambda_package.zip"
  name               = "kinesis_to_s3"
  kinesis_stream_arn = aws_kinesis_stream.stream.arn
  batch_size         = "100"
  log_bucket         = "example-bucket"
  log_path_prefix    = "foo/bar"
}

Warning: use the same module and package versions!
Use the `version` parameter to pin to a specific version, or to specify a version constraint, when pulling from the Terraform Module Registry (`source = "baikonur-oss/lambda-kinesis-to-s3/aws"`).
For more information, refer to the Module Versions section of the Terraform Modules documentation.

When pulling from GitHub, make sure to pin the version with `?ref=` in the module source URI.
Pulling from GitHub is especially useful for development, as you can pin to a specific branch, tag, or commit hash.
Example: `source = "github.com/baikonur-oss/terraform-aws-lambda-kinesis-to-s3?ref=v1.0.0"`
For more information on module version pinning, see the Selecting a Revision section of the Terraform Modules documentation.
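To illustrate keeping the module and Lambda package versions aligned when pulling from GitHub, here is a minimal sketch. It assumes the `v2.1.0` tag that the release URL above points to; the module label, bucket, and prefix are placeholders, and the stream resource is assumed to be defined elsewhere as `aws_kinesis_stream.stream`.

```hcl
# Sketch: module pulled from GitHub at tag v2.1.0, with the Lambda package
# taken from the same release so that both versions match.
module "kinesis_to_s3_from_github" {
  # The tag is written out literally; keep it in sync with the URL below by hand.
  source = "github.com/baikonur-oss/terraform-aws-lambda-kinesis-to-s3?ref=v2.1.0"

  # Same version (v2.1.0) as the ?ref= in the source above
  lambda_package_url = "https://github.com/baikonur-oss/terraform-aws-lambda-kinesis-to-s3/releases/download/v2.1.0/lambda_package.zip"

  name               = "kinesis_to_s3_from_github"
  kinesis_stream_arn = aws_kinesis_stream.stream.arn
  batch_size         = "100"
  log_bucket         = "example-bucket" # placeholder bucket name
  log_path_prefix    = "foo/bar"        # placeholder prefix
}
```

Note that the `version` argument is omitted here: it applies to registry sources, while Git-based sources are pinned through `?ref=`.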
| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| batch_size | Maximum number of records passed for a single Lambda invocation | string | n/a | yes |
| handler | Lambda Function handler (entrypoint) | string | "main.handler" | no |
| kinesis_stream_arn | Source Kinesis Data Streams stream ARN | string | n/a | yes |
| lambda_package_url | Lambda package URL (see Usage in README) | string | n/a | yes |
| log_bucket | Target S3 bucket to save data to | string | n/a | yes |
| log_id_field | Key name for unique log ID | string | "log_id" | no |
| log_path_prefix | Log file path prefix | string | n/a | yes |
| log_retention_in_days | Lambda Function log retention in days | string | "30" | no |
| log_timestamp_field | Key name for log timestamp | string | "time" | no |
| log_type_field | Key name for log type | string | "log_type" | no |
| log_type_field_whitelist | Log type whitelist (if empty, all types will be processed) | list(string) | [] | no |
| log_type_unknown_prefix | Log type prefix for logs without log type field | string | "unknown" | no |
| memory | Lambda Function memory in megabytes | string | "256" | no |
| name | Resource name | string | n/a | yes |
| runtime | Lambda Function runtime | string | "python3.7" | no |
| starting_position | Kinesis ShardIterator type (see: https://docs.aws.amazon.com/kinesis/latest/APIReference/API_GetShardIterator.html ) | string | "TRIM_HORIZON" | no |
| tags | Tags for Lambda Function | map(string) | {} | no |
| timeout | Lambda Function timeout in seconds | string | "60" | no |
| timezone | tz database timezone name (e.g. Asia/Tokyo) | string | "UTC" | no |
| tracing_mode | X-Ray tracing mode (see: https://docs.aws.amazon.com/lambda/latest/dg/API_TracingConfig.html ) | string | "PassThrough" | no |
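The optional inputs in the table can be combined with the required ones in a single module block. The sketch below is illustrative only: the log type names, tag values, and sizing numbers are hypothetical, the module label, bucket, and prefix are placeholders, and the stream resource is assumed to be defined elsewhere as `aws_kinesis_stream.stream`.

```hcl
# Sketch: required inputs plus a few optional overrides from the inputs table.
module "kinesis_to_s3_tuned" {
  source  = "baikonur-oss/lambda-kinesis-to-s3/aws"
  version = "2.1.0"

  lambda_package_url = "https://github.com/baikonur-oss/terraform-aws-lambda-kinesis-to-s3/releases/download/v2.1.0/lambda_package.zip"
  name               = "kinesis_to_s3_tuned"
  kinesis_stream_arn = aws_kinesis_stream.stream.arn
  batch_size         = "100"
  log_bucket         = "example-bucket" # placeholder bucket name
  log_path_prefix    = "foo/bar"        # placeholder prefix

  # Process only these log types (hypothetical values);
  # the default empty list processes all types.
  log_type_field_whitelist = ["access", "application"]

  # tz database timezone name (default is "UTC")
  timezone = "Asia/Tokyo"

  # Read only records that arrive after the Lambda trigger is created,
  # instead of the default "TRIM_HORIZON"
  starting_position = "LATEST"

  # Lambda sizing and log retention (illustrative values, passed as strings
  # because the inputs table lists these variables as type string)
  memory                = "512"
  timeout               = "120"
  log_retention_in_days = "14"

  tags = {
    Environment = "dev" # hypothetical tag
  }
}
```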
Make sure the following tools are installed (shown here with Homebrew):
brew install pre-commit terraform terraform-docs
# set up pre-commit hooks by running below command in repository root
pre-commit install