# Shreyash Khote
Location: Ahmedabad, Gujarat, IN
Phone: +91 8805128129
Email: shreyashkhote01@gmail.com
I am a Data Engineer with expertise in designing and deploying scalable ETL/ELT architectures and GenAI infrastructure in the IT sector. I specialize in the Azure ecosystem (Data Factory, AI Foundry) and Apache Airflow for orchestrating complex data workflows, with a track record of optimizing data pipelines, reducing query latency, and engineering data ingestion layers for RAG systems and real-time analytics.
Relay Human Cloud | Nov 2024 - Present | Ahmedabad, IN
- Currently working on a recommendation system project, focusing on improving data pipeline reliability, security, and algorithm performance within a scalable data engineering architecture.
- Engineered a secure AWS authentication layer by implementing temporary session token logic within the core recommendation algorithm and data processing modules, replacing static credentials with a dynamic, auto-refreshing IAM integration.
- Resolved a high-priority "low-recommendation" bottleneck by performing full-stack root cause analysis, using systematic log injection and SQL data modeling to identify restrictive business logic constraints within CORE and CORE MAINLINE product tags.
- Partnered with senior engineers to code, review, and document scalable data workflows in Jira, establishing team-wide best practices for pipeline monitoring and error handling.
S4S Technology Pvt. Ltd | Jan 2023 - Jul 2023
- Organized, cleaned, and validated data for in-depth analysis.
- Conducted data analysis to identify key patterns and insights, supporting strategic decisions.
- Developed interactive dashboards to present complex data in a clear and concise manner.
- Data Engineering Fundamentals: ETL/ELT, Data Warehousing, Data Lake, Delta Lake, Data Modeling, RAG, Vector Databases.
- Cloud & Tools: Azure (AI Foundry, Azure Functions, Blob Storage), AWS (S3), Snowflake, Airflow, Docker, Kafka, Git.
- Languages & Frameworks: Python (Pandas, NumPy), SQL, Spark.
- Soft Skills: Teamwork, Collaboration, Communication, Time Management, Planning.
- Real-Time Data Pipeline & Analytics Platform: Built an end-to-end pipeline ingesting data from Reddit APIs, CSV/JSON files, and a mock Kafka stream, storing raw data as local Parquet files in a data-lake layout organized by date and source. Designed a star schema data warehouse with fact and dimension tables, applied daily partitioning, and wrote SQL transformations. Orchestrated batch and streaming ETL workflows using Apache Airflow DAGs with data quality checks and robust logging.
- Twitter Data Pipeline: Architected a daily automated ingestion pipeline using Python to extract, clean, and process raw Twitter API data. Orchestrated the ELT workflow with Apache Airflow to ensure zero-touch daily execution and robust failure handling. Engineered a secure data storage layer in AWS S3 for a centralized, query-ready data repository.
- Olist E-commerce Platform Analysis: Built an analysis platform to understand key metrics of e-commerce sales data, providing actionable insights for decision-makers.
- Sentiment Analysis: Implemented sentiment analysis using Natural Language Processing (NLP) to classify customer reviews as positive, neutral, or negative.
- M.Sc. Data Science | MGM University, Aurangabad | 2021-2023 | CGPA: 8.99/10
- B.Sc. Computer Science | Deogiri College, Aurangabad | 2018-2021 | CGPA: 6.75/10
- Going Above and Beyond Award | Relay Human Cloud - March 2025
- Data Analyst Certification | ExcelR - Reg No: 18897/EXCELR/24012024
- Data Analyst Professional | IBM
- Data, Data, Everywhere & Relational Database and SQL | Coursera
- Data Analyst Essentials | Cisco
Feel free to reach out via email at shreyashkhote01@gmail.com for any inquiries or opportunities.
- LinkedIn: shreyashkhote01
- GitHub: Shreyash-rsk
