
Hi there, I'm Emsal 👋

🚀 I'm a Senior Data Engineer at Accenture, designing and building scalable, cloud-native data platforms for global enterprise clients.

What I do

🌐 I turn complex, raw data into reliable and actionable insights using:

    ☁️ Cloud Platforms
      • Azure (Data Factory, Functions, Databricks, Data Lake, Blob Storage)
      • AWS (S3, Athena, Lambda, Glue – project-based)

    🐍 Python & PySpark for large-scale data processing
    ⚙️ Apache Spark for distributed batch & streaming workloads
    📊 Power BI & Tableau for analytics and data storytelling

How I work

⏱ Experienced in both:

  • Real-time & near real-time processing
    (Spark Structured Streaming, event-driven pipelines)
  • 🧱 Batch data pipelines
    (ETL/ELT, data modeling, historical backfills, reporting layers)
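As a toy illustration of the kind of historical-backfill logic batch pipelines need, here is a minimal sketch in plain Python. The function name and chunk size are hypothetical, not taken from any of my repos; in practice each chunk would drive one Spark or Data Factory run.

```python
from datetime import date, timedelta

def backfill_windows(start: date, end: date, days_per_chunk: int = 7):
    """Split a historical date range into fixed-size chunks.

    Yields (inclusive_start, inclusive_end) pairs that a batch job
    can process independently, e.g. one run per chunk.
    """
    current = start
    while current <= end:
        chunk_end = min(current + timedelta(days=days_per_chunk - 1), end)
        yield current, chunk_end
        current = chunk_end + timedelta(days=1)

# Example: backfill January 2024 in weekly chunks
windows = list(backfill_windows(date(2024, 1, 1), date(2024, 1, 31)))
```

Chunking a backfill this way keeps each run small enough to retry cheaply when one window fails.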

🧰 My toolkit also includes:

  • 🗄️ Data Lakes & Storage: Parquet, Delta Lake, partitioned data models
  • 🔁 APIs & Integrations: REST APIs, webhooks, external data ingestion
  • ⛓️ Orchestration: Apache Airflow, Azure Data Factory
  • 🐳 Docker for reproducible environments
  • 🚀 CI/CD: Git-based pipelines for automated deployment & quality checks
  • 🔍 Data quality & reliability: monitoring, validation, and observability
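A minimal sketch of the row-level validation idea behind the data-quality bullet. The field names and rules here are hypothetical and stdlib-only; the same split-and-tag pattern scales up to PySpark filters or dbt tests.

```python
def validate_rows(rows, required=("user_id", "event_ts")):
    """Split rows into valid and rejected, tagging each rejection
    with the first rule it violated."""
    valid, rejected = [], []
    for row in rows:
        missing = [f for f in required if row.get(f) in (None, "")]
        if missing:
            rejected.append((row, f"missing: {', '.join(missing)}"))
        elif not str(row["user_id"]).isdigit():
            rejected.append((row, "user_id is not numeric"))
        else:
            valid.append(row)
    return valid, rejected

good, bad = validate_rows([
    {"user_id": "42", "event_ts": "2024-01-01T00:00:00"},
    {"user_id": "", "event_ts": "2024-01-01T00:00:00"},
    {"user_id": "abc", "event_ts": "2024-01-01T00:00:00"},
])
```

Keeping the rejection reason next to the rejected row is what makes the pipeline observable: bad records land in a quarantine table with an explanation instead of silently disappearing.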

📚 Currently Exploring

🔬 Cloud-agnostic data architecture & cost-efficient design
📐 Advanced data modeling & analytics engineering
🧠 Data reliability, observability, and data-as-a-product mindset
🤖 Modern data platforms & AI-ready pipelines

⚡ Fun fact

I enjoy building data pipelines that are boring in production,
because boring means reliable, scalable, and well-designed.

Pinned

  1. data-normalize-with-etl-procesess

    Various data normalization operations implemented as Python scripts; the target data is in CSV format.

    Python

  2. Etl_processing

    I find Apache Airflow very useful for ETL work. Here I transferred data from a source MySQL database to a target PostgreSQL database using the Airflow BashOperator.

    Python
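In the spirit of that repo, here is a minimal, hypothetical Airflow DAG wiring two BashOperator tasks in sequence. The DAG id, task names, and shell commands are illustrative placeholders, not the repo's actual code.

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="mysql_to_postgres_etl",  # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract = BashOperator(
        task_id="extract_from_mysql",
        bash_command="mysqldump mydb > /tmp/dump.sql",  # placeholder command
    )
    load = BashOperator(
        task_id="load_into_postgres",
        bash_command="psql mydb < /tmp/dump.sql",  # placeholder command
    )
    # extract must finish before load starts
    extract >> load
```

The `>>` operator is Airflow's way of declaring the task dependency; the scheduler then runs the two shell commands in order once per day.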

  3. filtering-process

    You can do a lot with Apache Spark. Here I worked with a static file to build a batch ETL system.

    Python

  4. get_users

    After an ETL step that reads static data, an API built with flask_sqlalchemy serves the top five users.

    Python
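The "top five users" part of that idea can be sketched without Flask. This is a stdlib-only illustration with a hypothetical `score` field; in the repo the equivalent would be a flask_sqlalchemy query.

```python
from heapq import nlargest

def top_users(users, n=5):
    """Return the n users with the highest score, best first.
    'score' is a hypothetical field standing in for whatever
    metric the real table ranks users by."""
    return nlargest(n, users, key=lambda u: u["score"])

users = [{"name": f"user{i}", "score": i * 10} for i in range(8)]
top = top_users(users)
```

`heapq.nlargest` avoids sorting the whole list when only a few top rows are needed, which is the same reason a SQL version would use `ORDER BY ... LIMIT 5`.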

  5. random-data-generation

    Generating random data with Apache Kafka.

    Python

  6. Apache-Beam-examples

    I liked Apache Beam for streaming data transformations; this repo collects examples.

    Python