I'm a Data Engineer with strong expertise in building scalable, cloud-native data platforms on AWS.
I design reliable data ingestion pipelines, lakehouse architectures, and analytics-ready datasets using modern AWS services.
- Design and build end-to-end data platforms on AWS
- Convert raw data into curated, analytics-ready datasets
- Optimize large-scale ETL / ELT pipelines (cost, performance, reliability)
- Implement data quality, governance, and cataloging
- Support BI, analytics, and ML workloads
- Amazon S3
- AWS Glue
- Amazon Athena
- Amazon Redshift
- AWS Lambda
- Amazon EventBridge
- Amazon EMR
- Apache Spark (PySpark)
- AWS Glue DynamicFrames & Spark DataFrames
- Parquet / ORC / NDJSON
- Schema evolution & partitioning strategies
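The partitioning strategies above usually mean Hive-style key layouts (`year=/month=/day=`) so that Athena and Glue can prune partitions at query time. A minimal pure-Python sketch of that layout; the `silver/orders` prefix and file name are illustrative assumptions:

```python
from datetime import date

def partition_key(prefix: str, event_date: date, file_name: str) -> str:
    """Build a Hive-style partitioned S3 key (year=/month=/day=) so query
    engines like Athena can skip partitions that fall outside a filter."""
    return (
        f"{prefix}/year={event_date.year}"
        f"/month={event_date.month:02d}"
        f"/day={event_date.day:02d}"
        f"/{file_name}"
    )

key = partition_key("silver/orders", date(2024, 3, 7), "part-0000.parquet")
# → "silver/orders/year=2024/month=03/day=07/part-0000.parquet"
```

With PySpark the same layout falls out of `df.write.partitionBy("year", "month", "day")`; zero-padding the month and day keeps lexicographic and chronological ordering aligned.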
- Python
- TypeScript
- SQL
- Git / GitHub
- Terraform / CloudFormation
- Docker
- Linux / Bash
- Lakehouse (Bronze / Silver / Gold)
- Batch & Event-driven pipelines
- Schema-on-read & schema evolution
- Idempotent ETL jobs
- Cost-optimized serverless analytics
- Incremental loads & backfills
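Idempotent jobs and incremental loads combine naturally: track a high-water mark so each run only picks up new records, and upsert by a deterministic key so re-running the same batch leaves the target unchanged. A pure-Python sketch of that pattern; the in-memory `dict` target and the `id`/`updated_at` field names are illustrative assumptions standing in for a real table:

```python
def incremental_upsert(target: dict, source: list[dict],
                       watermark: str) -> tuple[dict, str]:
    """Apply only records newer than the last high-water mark, upserting by
    primary key so re-running the same batch yields the same target state."""
    new_watermark = watermark
    for record in sorted(source, key=lambda r: r["updated_at"]):
        if record["updated_at"] <= watermark:
            continue  # already loaded by a previous run → safe to re-run
        target[record["id"]] = record  # deterministic key → idempotent write
        new_watermark = max(new_watermark, record["updated_at"])
    return target, new_watermark
```

Running the function twice with the same batch and the returned watermark is a no-op on the second pass, which is what makes backfills and retries restart-safe.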
- AWS Glue Data Quality
- Schema validation
- Null / range / freshness checks
- Glue Data Catalog & crawlers
- Partition-aware crawling strategies
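The null / range / freshness checks listed above can each be expressed as a small rule that returns violations per record. A minimal pure-Python sketch; the `order_id`/`amount` fields, the `[0, 100000]` range, and the 24-hour freshness window are illustrative assumptions, not fixed thresholds:

```python
from datetime import datetime, timedelta, timezone

def check_row(row: dict, now: datetime) -> list[str]:
    """Return the data-quality violations for one record:
    a null check, a range check, and a freshness check."""
    failures = []
    if row.get("order_id") is None:
        failures.append("null_check: order_id is null")
    amount = row.get("amount")
    if amount is not None and not (0 <= amount <= 100_000):
        failures.append(f"range_check: amount {amount} outside [0, 100000]")
    ingested = row.get("ingested_at")
    if ingested is None or now - ingested > timedelta(hours=24):
        failures.append("freshness_check: record older than 24h")
    return failures
```

In a Glue job the same rules would typically be declared in DQDL via AWS Glue Data Quality rather than hand-rolled, but the rule shapes are the same.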
- Prefer simple, observable systems
- Optimize for cost & reliability first
- Make pipelines idempotent and restart-safe
- Treat data as a product
- Automate everything that can be automated
I enjoy building clean, scalable data systems that teams can trust.


