vicky-dx/README.md

Hi there, I'm Nikesh Chavhan 👋

GitHub followers LinkedIn Email

🚀 Data Engineer | Big Data Specialist | Cloud ETL Developer | GenAI & RAG Enthusiast

Building real-time data pipelines, intelligent ETL workflows, and scalable AI-powered solutions with 3+ years of hands-on experience

💼 About Me

  • 🔭 Data Engineer specializing in Big Data, Cloud ETL, and Automation
  • 🌱 Expertise in Apache Spark, Kafka, AWS, Airflow, Selenium, GenAI & RAG Systems
  • 💡 Passionate about real-time streaming analytics, web automation, and intelligent data pipelines
  • 🎯 Built production-grade systems for fraud detection, RAG pipelines, and ETL automation
  • 📫 Reach me: vicky0x07@gmail.com
  • ⚡ Fun fact: I automate everything, from data pipelines to web scraping!

🛠️ Tech Stack

Languages & Core

C Python SQL Java Bash JavaScript

Big Data & Streaming

Apache Spark Kafka Flink AWS Kinesis

Cloud & ETL

AWS Apache Airflow DBT Snowflake

Automation & Web Scraping

Selenium Puppeteer BeautifulSoup Scrapy

GenAI & ML

LLMs RAG XGBoost scikit-learn

Infrastructure & DevOps

Docker Kubernetes Terraform GitHub Actions

Monitoring & Visualization

Prometheus Grafana ELK Stack Tableau


🌟 Featured Projects

Production-ready automation tool for downloading complete O'Reilly courses with automatic organization

  • 🎥 Video + Transcript extraction with Selenium automation
  • 🚀 Headless mode with Chrome DevTools Protocol
  • 📁 Smart chapter-based organization
  • ⚡ 10x faster transcript-only mode
  • 🔄 Resume capability for interrupted downloads

Tech: Python, Selenium, FFmpeg, Chrome DevTools Protocol
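
The resume capability listed above could work along these lines. This is a minimal sketch under stated assumptions, not the tool's actual implementation: the `done.json` manifest name, the `load_done` and `download_all` helpers, and the `fetch` callback are all hypothetical, and the real project may track progress differently.

```python
import json
from pathlib import Path


def load_done(manifest: Path) -> set[str]:
    """Return the set of lesson IDs already recorded as finished."""
    if manifest.exists():
        return set(json.loads(manifest.read_text()))
    return set()


def download_all(lessons, fetch, out_dir: Path) -> list[str]:
    """Fetch every lesson not yet in the manifest; return the IDs fetched now."""
    out_dir.mkdir(parents=True, exist_ok=True)
    manifest = out_dir / "done.json"  # hypothetical manifest file name
    done = load_done(manifest)
    fetched = []
    for lesson_id in lessons:
        if lesson_id in done:
            continue  # finished in an earlier run, so skip it
        # `fetch` is a caller-supplied function returning the lesson text.
        (out_dir / f"{lesson_id}.txt").write_text(fetch(lesson_id))
        done.add(lesson_id)
        # Persist after each lesson so an interruption loses at most one item.
        manifest.write_text(json.dumps(sorted(done)))
        fetched.append(lesson_id)
    return fetched
```

Calling `download_all` a second time with the same output directory skips everything recorded in the manifest, so an interrupted run picks up where it stopped.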


🚀 More Projects Coming Soon...

Currently working on exciting data engineering and automation projects. Stay tuned!


📊 GitHub Stats

Nikesh's GitHub stats

Top Langs


🔥 Recent Activity


🎯 Skills & Expertise

class NikeshChavhan:
    def __init__(self):
        self.name = "Nikesh Chavhan"
        self.role = "Data Engineer"
        self.location = "Nagpur, India"
        self.experience = "3+ years"
        
    def get_skills(self):
        return {
            "big_data": ["Apache Spark", "Kafka", "Flink", "AWS Kinesis"],
            "cloud_etl": ["AWS (S3, Lambda, Redshift, EMR, Glue)", "Airflow", "DBT"],
            "automation": ["Selenium", "Puppeteer", "BeautifulSoup", "Scrapy"],
            "genai_rag": ["LLMs (GPT, LLaMA)", "RAG Pipelines", "Prompt Engineering"],
            "ml_analytics": ["XGBoost", "LightGBM", "scikit-learn", "Pandas"],
            "devops": ["Docker", "Kubernetes", "Terraform", "GitHub Actions"],
            "monitoring": ["Prometheus", "Grafana", "ELK Stack"],
            "databases": ["Snowflake", "Redshift", "PostgreSQL", "MongoDB"]
        }
    
    def current_focus(self):
        return [
            "🔥 Real-time data pipelines with Spark Streaming",
            "🤖 Building production-grade RAG systems",
            "🌐 Web automation with Selenium/Puppeteer",
            "☁️ Auto-scaling ETL pipelines on AWS",
            "📊 Streaming analytics with Kafka + Redshift"
        ]

💡 What I'm Good At

  • ✅ Real-Time Data Pipelines: Kafka + Spark Streaming for sub-second processing
  • ✅ Web Automation & Scraping: Selenium, Puppeteer for intelligent data extraction
  • ✅ GenAI & RAG Systems: LLM-powered pipelines with vector search + generation
  • ✅ Cloud ETL: AWS (S3, Lambda, Glue, Redshift, EMR) + Airflow orchestration
  • ✅ ML & Analytics: XGBoost, scikit-learn for fraud detection and predictions
  • ✅ Auto-Scaling Infrastructure: Cost-optimized pipelines with Terraform + K8s
  • ✅ CI/CD Automation: Docker + Kubernetes + GitHub Actions
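
The "vector search + generation" pattern named in the list above can be illustrated with a toy retrieval step. This is a minimal sketch, not a description of any pipeline in this profile: the bag-of-words `embed` function and the `cosine`, `retrieve`, and `build_prompt` helpers are hypothetical stand-ins, and a real system would use a learned embedding model, a vector store, and an LLM call for the generation step.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble retrieved context and the question into one prompt string."""
    return "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query}"
```

The string returned by `build_prompt` is what would be sent to the language model; the generation call itself is left out because it depends on the provider.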

📫 Connect With Me

LinkedIn GitHub Email


🎓 Education & Certifications

Shivaji Science College, Nagpur | BS in Computer Science (2017-2021) | CGPA: 8/10

Certifications:

  • 🏆 Data Engineering Associate (ongoing) - AWS
  • 🏆 Data Engineering Professional (ongoing) - Google Cloud
  • 🏆 Meta Database Engineer Professional - Coursera
  • 🏆 Data Scientist Professional - DataCamp

📝 Latest Blog Posts


🌟 "Data is the new oil, but insights are the fuel that drives innovation"

Profile Views

⭐ From Nikesh Chavhan | Data Engineer & Automation Enthusiast
