josephcastan/README.md

👋 Hi, I'm Joseph

Data Engineer with strong expertise in building scalable, cloud-native data platforms on AWS.
I design reliable data ingestion pipelines, lakehouse architectures, and analytics-ready datasets using modern AWS services.


🧠 What I Do

  • Design and build end-to-end data platforms on AWS
  • Convert raw data into curated, analytics-ready datasets
  • Optimize large-scale ETL / ELT pipelines (cost, performance, reliability)
  • Implement data quality, governance, and cataloging
  • Support BI, analytics, and ML workloads

☁️ AWS-Based Data Engineering 😎

Core AWS Services

  • Amazon S3
  • AWS Glue
  • Amazon Athena
  • Amazon Redshift
  • AWS Lambda
  • Amazon EventBridge
  • Amazon EMR

Data Processing

  • Apache Spark (PySpark)
  • AWS Glue DynamicFrames & Spark DataFrames
  • Parquet / ORC / NDJSON
  • Schema evolution & partitioning strategies

🧰 Programming & Tools

  • Python
  • TypeScript
  • SQL
  • Git / GitHub
  • Terraform / CloudFormation
  • Docker
  • Linux / Bash

🏗️ Data Architecture Patterns

  • Lakehouse (Bronze / Silver / Gold)
  • Batch & Event-driven pipelines
  • Schema-on-read & schema evolution
  • Idempotent ETL jobs
  • Cost-optimized serverless analytics
  • Incremental loads & backfills
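As a rough illustration of the idempotent and incremental patterns listed above, here is a minimal sketch in plain Python. It is not taken from any of my production pipelines: the in-memory `store` dict stands in for a partitioned S3 prefix, and the names `load_partition`, `run_incremental`, `id`, and `event_date` are hypothetical.

```python
from collections import defaultdict


def load_partition(store, partition_key, records):
    """Overwrite a single partition so that a rerun produces the same result."""
    # Deduplicate on the (assumed) record id so retried events collapse to one row.
    deduped = {r["id"]: r for r in records}
    # Replace the whole partition instead of appending; appending would
    # duplicate rows every time the job is restarted.
    store[partition_key] = sorted(deduped.values(), key=lambda r: r["id"])
    return store


def run_incremental(store, events, watermark):
    """Load only partitions at or after the watermark (ISO date string)."""
    by_day = defaultdict(list)
    for event in events:
        # ISO dates (YYYY-MM-DD) compare correctly as plain strings.
        if event["event_date"] >= watermark:
            by_day[event["event_date"]].append(event)
    for day, recs in by_day.items():
        load_partition(store, day, recs)
    return store
```

Running `run_incremental` twice with the same inputs leaves `store` unchanged, which is the property that makes restarts and backfills safe. A real job would typically get the same effect with partition overwrite on S3 rather than a dict.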

📊 Data Quality & Governance

  • AWS Glue Data Quality
  • Schema validation
  • Null / range / freshness checks
  • Glue Data Catalog & crawlers
  • Partition-aware crawling strategies
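To make the null / range / freshness checks above concrete, here is a small sketch in plain Python. It does not use the AWS Glue Data Quality API; the function names, column names, and thresholds are hypothetical and only show the shape of each check.

```python
from datetime import datetime, timedelta, timezone


def null_rate(rows, column):
    """Fraction of rows where the column is missing or None."""
    if not rows:
        return 0.0
    missing = sum(1 for r in rows if r.get(column) is None)
    return missing / len(rows)


def out_of_range(rows, column, low, high):
    """Rows whose non-null value falls outside the inclusive [low, high] range."""
    return [
        r for r in rows
        if r.get(column) is not None and not (low <= r[column] <= high)
    ]


def is_fresh(rows, ts_column, max_age, now=None):
    """True if the newest timestamp is no older than max_age."""
    now = now or datetime.now(timezone.utc)
    timestamps = [r[ts_column] for r in rows if r.get(ts_column) is not None]
    if not timestamps:
        # Treating an empty dataset as stale is an assumption of this sketch.
        return False
    return now - max(timestamps) <= max_age
```

For example, a pipeline step might fail the run when `null_rate(rows, "amount")` exceeds an agreed threshold, or when `is_fresh(rows, "loaded_at", timedelta(hours=24))` returns `False`.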

📌 Engineering Principles

  • Prefer simple, observable systems
  • Optimize for cost & reliability first
  • Make pipelines idempotent and restart-safe
  • Treat data as a product
  • Automate everything that can be automated

⭐️ I enjoy building clean, scalable data systems that teams can trust.

Pinned

  1. pieces (Public)

    Simple, stand-alone, reusable components.

    JavaScript 141 41

  2. tsbw (Public)

    The Simplest Bitcoin Wallet

    JavaScript 44 17

  3. simcoin (Public)

    Simcoin project

    C 2

  4. vuepack (Public)

    Pick and pack Vue components you need.

    JavaScript 1 1

  5. chttp (Public)

    Simple HTTP server in C++

    C++

  6. vuejs/awesome-vue (Public)

    🎉 A curated list of awesome things related to Vue.js

    73.6k 9.5k