A serverless data pipeline that processes CSV files containing coordinates to generate maps and manages truck test records using AWS services.
This project implements a serverless data pipeline to process CSV files containing coordinates, generate maps, and store truck test records. It leverages AWS services such as S3, DynamoDB, Lambda, API Gateway, and Cognito for secure and scalable data processing and user authentication.
- CSV Upload: When a CSV file is uploaded to the `IncomingCsv` S3 bucket, a Lambda function is triggered.
- Record Creation: The first Lambda function creates a test record in the `RecordsTable` DynamoDB table.
- Map Generation: A second Lambda function generates a map from the CSV coordinates, saves it to the `Maps` S3 bucket, and updates the `RecordsTable` with the map file name.
- Authentication: User authentication is managed via a Cognito User Pool.
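The upload-to-record flow above can be sketched as a small handler. This is a minimal illustration, not the project's `csv_lambda.py`: the function names and the `status` attribute are assumptions, and the DynamoDB write is left as a comment so the sketch runs without AWS access.

```python
from urllib.parse import unquote_plus


def extract_uploads(event):
    """Return (bucket, key) pairs from an S3 event notification."""
    uploads = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        bucket = s3_info["bucket"]["name"]
        # S3 URL-encodes object keys in notifications (spaces arrive as '+').
        key = unquote_plus(s3_info["object"]["key"])
        uploads.append((bucket, key))
    return uploads


def build_test_record(key):
    """Build the item for one uploaded file (hypothetical shape)."""
    # 'filename' mirrors the RecordsTable partition key; 'status' is an
    # assumed extra attribute used only for illustration.
    return {"filename": key, "status": "uploaded"}


def handler(event, context):
    """Lambda entry point: one test record per uploaded CSV."""
    records = [build_test_record(key) for _, key in extract_uploads(event)]
    # A deployed function would write each record to DynamoDB at this point.
    return {"processed": len(records), "records": records}
```

Decoding the key before using it as the record's `filename` keeps the stored name identical to the object name shown in the bucket.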
- Serverless Processing: Uses AWS Lambda for event-driven CSV processing and map generation.
- Secure Authentication: Integrates Cognito User Pool for user registration and authentication.
- Scalable Storage: Stores truck configurations and test records in DynamoDB tables.
- RESTful API: Provides API endpoints via API Gateway to manage truck records and retrieve data.
- Map Visualization: Generates maps from coordinates using Pandas and Folium libraries.
The application is built using the AWS Cloud Development Kit (CDK) and consists of several stacks:
Manages user authentication and authorization.
- Cognito User Pool: Supports self-sign-up, email verification, and user alias (email/username).
- Cognito User Pool Client: Facilitates authentication flows, including user-password and Secure Remote Password (SRP).
- Cognito Identity Pool: Grants authenticated users read-only access to S3 buckets via an IAM role.
- Outputs: `UserPoolId`, `UserPoolClientId`, `IdentityPoolId`
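As a rough sketch of the user-password flow, the snippet below requests an ID token from the User Pool Client with boto3's `initiate_auth`. The function names are hypothetical, and the client ID and region are assumed to come from the stack outputs; `boto3` is imported lazily so the request builder can be inspected without it.

```python
def build_auth_request(client_id, username, password):
    """Assemble the parameters for a USER_PASSWORD_AUTH sign-in."""
    return {
        "ClientId": client_id,
        "AuthFlow": "USER_PASSWORD_AUTH",
        "AuthParameters": {"USERNAME": username, "PASSWORD": password},
    }


def fetch_id_token(client_id, username, password, region):
    """Sign in against Cognito and return the ID token (a JWT)."""
    import boto3  # lazy import: only needed when actually calling AWS

    client = boto3.client("cognito-idp", region_name=region)
    response = client.initiate_auth(
        **build_auth_request(client_id, username, password)
    )
    # Assumes no challenge (e.g. a forced password change) is returned;
    # in that case 'AuthenticationResult' would be absent.
    return response["AuthenticationResult"]["IdToken"]
```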
Handles truck configuration storage.
- DynamoDB Table: `TrucksTable` with `currentVin` as the partition key.
- Lambda Function: `EnterTruckLambda`, triggered by API Gateway to insert truck records.
- IAM Role: Grants the Lambda function write access to `TrucksTable`.
- Outputs: `TrucksTableARN`, `AddTruckLambdaARN`
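A truck-insert handler along these lines might validate the request before writing. This is a sketch under assumptions: it presumes API Gateway passes the request as a proxy event with a JSON `body`, the helper names are made up, and the DynamoDB write is omitted so the code runs locally.

```python
import json


def parse_truck(event):
    """Extract a truck item from an API Gateway proxy event."""
    body = json.loads(event.get("body") or "{}")
    if not isinstance(body, dict):
        raise ValueError("request body must be a JSON object")
    vin = str(body.get("currentVin", "")).strip()
    if not vin:
        # currentVin is the partition key, so an item without it is invalid.
        raise ValueError("currentVin is required")
    return {**body, "currentVin": vin}


def handler(event, context):
    """Lambda entry point: validate the request and report the outcome."""
    try:
        item = parse_truck(event)
    except ValueError as exc:  # also covers malformed JSON (JSONDecodeError)
        return {"statusCode": 400, "body": json.dumps({"error": str(exc)})}
    # A deployed function would store `item` in TrucksTable at this point.
    return {"statusCode": 200, "body": json.dumps({"added": item["currentVin"]})}
```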
Processes CSV files and stores test records.
- DynamoDB Table: `RecordsTable` with `filename` as the partition key.
- Lambda Function: `CsvLambda`, triggered by S3 to process CSV files and insert data into `RecordsTable`.
- S3 Buckets:
  - `incomingcsvs-`: Stores uploaded CSV files and triggers `CsvLambda`.
  - `maps-`: Stores generated maps.
- Lambda Layer: Includes Pandas and Folium for map generation.
- Outputs: `CsvBucketName`, `MapsBucketName`
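The map-generation step could look roughly like the following. It is an illustrative sketch, not the project's `maps_lambda.py`: the column names `lat` and `lon` are assumptions about the CSV layout, and the standard-library `csv` module stands in for Pandas so the parsing part runs without the Lambda layer. Folium is imported only when a map is actually rendered.

```python
import csv
import io


def read_coordinates(csv_text, lat_col="lat", lon_col="lon"):
    """Parse (latitude, longitude) pairs from CSV text (assumed columns)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(float(row[lat_col]), float(row[lon_col])) for row in reader]


def bounding_box(points):
    """Return [[south, west], [north, east]] for a list of points."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return [[min(lats), min(lons)], [max(lats), max(lons)]]


def render_map(points, out_path):
    """Draw the route as a line on a Folium map and save it as HTML."""
    import folium  # lazy import: supplied by the Lambda layer when deployed

    route_map = folium.Map(location=list(points[0]))
    folium.PolyLine(points).add_to(route_map)
    # Zoom so that the whole route is visible.
    route_map.fit_bounds(bounding_box(points))
    route_map.save(out_path)
    return out_path
```

Inside Lambda, `out_path` would typically point under `/tmp`, the writable directory, before the file is copied to the maps bucket.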
Provides RESTful API endpoints.
- API Gateway: `RunlogRestApi` serves as the entry point for API requests.
- Cognito Authorizer: Secures API endpoints using Cognito User Pool.
- API Methods:
  - `POST /addtruck`: Adds truck records to `TrucksTable`.
  - `GET /alltrucks`: Retrieves all truck records.
  - `GET /allrecords`: Retrieves all test records.
- IAM Role: Grants read access to `TrucksTable` and `RecordsTable`.
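Calling these endpoints from a script might be sketched as below. The helper name and the example URL are hypothetical, and the sketch assumes the Cognito authorizer reads the raw ID token from the `Authorization` header; adjust this if the authorizer is configured with a different token source.

```python
import json
import urllib.request


def build_request(base_url, path, id_token, payload=None):
    """Build an authenticated request: POST if a payload is given, else GET."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Authorization": id_token}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=data,
        headers=headers,
        method="POST" if data is not None else "GET",
    )


def call_api(request):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `call_api(build_request(api_url, "/alltrucks", token))` would list the trucks, where `api_url` is the deployed stage URL and `token` comes from a Cognito sign-in.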
Orchestrates stack deployment and manages dependencies using AWS CDK.
- Lambda functions: `csv_lambda.py`, `maps_lambda.py`, `trucksdb_lambda.py`.
- Utility scripts: `createuser.py`, `addtruck.py`, `alltrucks.py`, `allrecords.py`, `getmap.py`.
- Templates for data processing.
- AWS CLI: Installed and configured with appropriate credentials.
- Node.js: Required for AWS CDK (version 14 or higher recommended).
- Python: Version 3.8 or higher.
- AWS CDK: Install via `npm install -g aws-cdk`.
- Clone the repository: `git clone https://github.com/username/repo.git`, then `cd runlog`
- Create and activate a virtual environment:
  - MacOS/Linux: `python3 -m venv .venv`, then `source .venv/bin/activate`
  - Windows: `python -m venv .venv`, then `.venv\Scripts\activate.bat`
- Install dependencies: `pip install -r requirements.txt`
- Synthesize the CloudFormation template: `cdk synth`
- Deploy the stacks: `cdk deploy --all`
  - To skip manual approvals: `cdk deploy --all --require-approval=never`
- Populate the `variables.py` file with required values (e.g., bucket names, API endpoints).
- Create a user in the Cognito User Pool: `python createuser.py`
  - Note: All API calls require a JWT token from an authenticated Cognito user.
- Add a truck configuration: `python addtruck.py`
- List all truck configurations: `python alltrucks.py`
- Upload CSV files to the `incomingcsvs-` S3 bucket to trigger processing.
- Retrieve test records: `python allrecords.py`
- Retrieve generated maps: `python getmap.py`
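Since the utility scripts depend on `variables.py` being filled in, a quick completeness check before running them can save a failed call. The layout below is purely hypothetical: the setting names and the region value are placeholders, not the project's actual variable names.

```python
# Hypothetical shape of the values kept in variables.py; every name and
# value here is a placeholder to be replaced with the deployed stack outputs.
SETTINGS = {
    "REGION": "us-east-1",
    "USER_POOL_CLIENT_ID": "",
    "API_URL": "",
    "CSV_BUCKET": "",
    "MAPS_BUCKET": "",
}


def missing_settings(settings):
    """Return the names of settings that are still empty, sorted by name."""
    return sorted(
        name for name, value in settings.items() if not str(value).strip()
    )


if __name__ == "__main__":
    unfilled = missing_settings(SETTINGS)
    if unfilled:
        print("Fill in before running the scripts:", ", ".join(unfilled))
```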

