Nikesh Chavhan vicky-dx

Hi there, I'm Nikesh Chavhan 👋

🚀 Data Engineer | Big Data Specialist | Cloud ETL Developer | GenAI & RAG Enthusiast

Building real-time data pipelines, intelligent ETL workflows, and scalable AI-powered solutions with 3+ years of hands-on experience

💼 About Me

🔭 Data Engineer specializing in Big Data, Cloud ETL, and Automation
🌱 Expertise in Apache Spark, Kafka, AWS, Airflow, Selenium, GenAI & RAG Systems
💡 Passionate about real-time streaming analytics, web automation, and intelligent data pipelines
🎯 Built production-grade systems for fraud detection, RAG pipelines, and ETL automation
📫 Reach me: vicky0x07@gmail.com
⚡ Fun fact: I automate everything - from data pipelines to web scraping!

🛠️ Tech Stack

Languages & Core

Big Data & Streaming

Cloud & ETL

Automation & Web Scraping

GenAI & ML

Infrastructure & DevOps

Monitoring & Visualization

🌟 Featured Projects

🎓 O'Reilly Course Downloader

Production-ready automation tool for downloading complete O'Reilly courses with automatic organization

🎥 Video + Transcript extraction with Selenium automation
🚀 Headless mode with Chrome DevTools Protocol
📁 Smart chapter-based organization
⚡ 10x faster transcript-only mode
🔄 Resume capability for interrupted downloads

Tech: Python, Selenium, FFmpeg, Chrome DevTools Protocol

🚀 More Projects Coming Soon...

Currently working on exciting data engineering and automation projects. Stay tuned!

📊 GitHub Stats

🔥 Recent Activity

🎯 Skills & Expertise

class NikeshChavhan:
    def __init__(self):
        self.name = "Nikesh Chavhan"
        self.role = "Data Engineer"
        self.location = "Nagpur, India"
        self.experience = "3+ years"
        
    def get_skills(self):
        return {
            "big_data": ["Apache Spark", "Kafka", "Flink", "AWS Kinesis"],
            "cloud_etl": ["AWS (S3, Lambda, Redshift, EMR, Glue)", "Airflow", "dbt"],
            "automation": ["Selenium", "Puppeteer", "BeautifulSoup", "Scrapy"],
            "genai_rag": ["LLMs (GPT, LLaMA)", "RAG Pipelines", "Prompt Engineering"],
            "ml_analytics": ["XGBoost", "LightGBM", "scikit-learn", "Pandas"],
            "devops": ["Docker", "Kubernetes", "Terraform", "GitHub Actions"],
            "monitoring": ["Prometheus", "Grafana", "ELK Stack"],
            "databases": ["Snowflake", "Redshift", "PostgreSQL", "MongoDB"]
        }
    
    def current_focus(self):
        return [
            "🔥 Real-time data pipelines with Spark Streaming",
            "🤖 Building production-grade RAG systems",
            "🌐 Web automation with Selenium/Puppeteer",
            "☁️ Auto-scaling ETL pipelines on AWS",
            "📊 Streaming analytics with Kafka + Redshift"
        ]

💡 What I'm Good At

✅ Real-Time Data Pipelines: Kafka + Spark Streaming for sub-second processing
✅ Web Automation & Scraping: Selenium, Puppeteer for intelligent data extraction
✅ GenAI & RAG Systems: LLM-powered pipelines with vector search + generation
✅ Cloud ETL: AWS (S3, Lambda, Glue, Redshift, EMR) + Airflow orchestration
✅ ML & Analytics: XGBoost, scikit-learn for fraud detection and predictions
✅ Auto-Scaling Infrastructure: Cost-optimized pipelines with Terraform + K8s
✅ CI/CD Automation: Docker + Kubernetes + GitHub Actions

📫 Connect With Me

🎓 Education & Certifications

Shivaji Science College, Nagpur | BS in Computer Science (2017-2021) | CGPA: 8/10

Certifications:

🏆 Data Engineering Associate (ongoing) - AWS
🏆 Data Engineering Professional (ongoing) - Google Cloud
🏆 Meta Database Engineer Professional - Coursera
🏆 Data Scientist Professional - DataCamp

📝 Latest Blog Posts

🌟 "Data is the new oil, but insights are the fuel that drives innovation"

⭐ From Nikesh Chavhan | Data Engineer & Automation Enthusiast
