GeniDT/README.md

👋 Hi, I'm Eugenia Dede Teye
Data Engineer specializing in scalable cloud-native pipelines and analytics solutions

🔧 Core Expertise:
AWS Data Services | Apache Spark | Python | SQL | Apache Airflow | Delta Lake

💡 What I Do

I build robust data infrastructure that turns raw data into actionable business insights. I focus on reliable, scalable pipelines built on modern cloud technologies, with data quality checked throughout the entire lifecycle.

🛠️ Currently Working With

Cloud Platforms: AWS (EMR, Glue, S3, Redshift, Athena, Lambda, Step Functions)
Big Data Processing: Apache Spark, distributed computing, stream processing
Orchestration: Apache Airflow, AWS MWAA, workflow automation
Data Architecture: Delta Lake, lakehouse design, real-time analytics
DevOps: CI/CD pipelines, Infrastructure as Code, containerization

🎯 Open to Collaboration

• Enterprise data pipeline projects
• Open source contributions to data engineering tools
• Cloud architecture optimization challenges
• Real-time analytics and streaming solutions

📈 Recent Projects

My repositories showcase end-to-end data engineering solutions, including big data pipelines on EMR, lakehouse architectures, and event-driven streaming systems that reliably process thousands of records.

📬 Let's Connect

🌐 Portfolio: [Your Portfolio URL]
💼 LinkedIn: linkedin.com/in/eugenia-dede-teye
📧 Email: t.eugeniadede@gmail.com


Building the data infrastructure that powers tomorrow's insights

Popular repositories

  1. healthcare-data-cleaning (Public)

    Data cleaning project using Python to clean messy healthcare data. This includes handling missing values, formatting dates, correcting inconsistencies, and preparing the dataset for analysis. Ideal…

    Jupyter Notebook

  2. GreenLeaf-Performance-Dashboard (Public)

    The GreenLeaf Performance Dashboard is a data-driven analytics dashboard that offers insights into sales trends, product performance, and profitability. Developed using Power BI, it helps stakehold…

  3. GeniDT (Public)

    Config files for my GitHub profile.

  4. loan-default-prediction (Public)

    Predicts the likelihood that a loan application will result in a default, based on variables in the application.

    Jupyter Notebook

  5. nsp-bolt_ride_real-time_trip_processing (Public)

    Python

  6. ecs-ecommerce-data-pipeline (Public)

    Python