pavithra19/README.md

Hi, I'm Pavithra 👋

I'm a Data Engineer with 7.5 years of experience in software engineering, including 4 years focused on building scalable data pipelines. I enjoy working with large datasets, automating workflows, and transforming raw data into clean, meaningful insights.

🎓 Currently pursuing my Master’s in Global Software Development in Fulda, Hesse, Germany.

👩‍💻 What I work with

  • Azure (ADF, Databricks, Synapse), Microsoft Fabric, AWS (EC2, S3)
  • Snowflake, Delta Lake, Data Lakehouse, ETL/ELT processes
  • Apache Spark, Kafka, Hadoop
  • Python, SQL, PySpark, Scala, Shell scripting
  • dbt, Fivetran, Apache Airflow (DAGs), ADF Triggers
  • REST APIs, JSON, Postman
  • Power BI, Azure DevOps, Git, GitHub Actions, Jira, Excel, Linux

🛠 Recent Projects

  • Apache Spark People Data Processor
    Built a full data pipeline using Apache Spark to process customer data, detect subscription trends, and generate dashboards.

  • ML Model Accuracy Audit
    Designed a pipeline to clean incoming test data and evaluate model accuracy under various conditions using scikit-learn.

LinkedIn: Let’s connect! I'm always happy to chat about data, tech, and new ideas. 🚀

Pinned

  1. sql_data_warehouse_project Public

    A complete, production-ready SQL Data Warehouse project implementing the medallion architecture (Bronze, Silver, Gold) featuring robust T-SQL ETL, star schema data modeling and a unified data catal…

    TSQL 1

  2. apache_kafka_stock_market_data_streaming Public

    This project focuses on real-time stock market data streaming using Apache Kafka. It demonstrates how to use Apache Kafka for capturing, processing and streaming stock market data efficien…

    Jupyter Notebook 1

  3. apache_spark_people_data_processor Public

    This project is a data processing application built with Apache Spark and Scala. It is designed to efficiently process, analyze and transform large datasets related to people data. It leverages S…

    Scala 1

  4. DataEngineeringProject Public

    The project showcases how to build a scalable and efficient data pipeline using Azure Data Factory, Azure Databricks, and Azure Synapse Analytics. It has two logical apps built and also all the av…

    1

  5. MachineLearningProject Public

    This project is a Machine Learning model built to find the accuracy of the trained model on both the tested and untested data. It derives the final confusion matrix of the classification model …

    Python 1

  6. webshopECommerceApp Public

    webshopECommerceApp is a full-stack e-commerce web application designed for seamless online shopping experiences. Built using Java (Spring Boot) for the backend and HTML/Thymeleaf for the frontend,…

    HTML 1 1