My central hub for data science, ETL, and automation projects — all open for review, reuse, and contributions.
Below is a curated overview of the projects in this repository, grouped by theme:
| Project | Overview |
|---|---|
| data_epic_capstone | AI agents directory with EC2/S3 automation (Bash, PowerShell, GitHub Actions). Part of a larger ETL orchestration setup. |
| etl_pipeline | End-to-end ETL workflow: extracts data, transforms it with Pandas, and loads it into PostgreSQL, with a FastAPI layer on top (see the sketch below). |
| aws_project | Utilities and scripts for AWS automation, covering EC2/S3, IAM, and infrastructure provisioning. |
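
To give a feel for the `etl_pipeline` flow above, here is a minimal sketch, assuming a local CSV source, a placeholder `sales` table, and an illustrative PostgreSQL connection string; the actual project may organize these steps differently.

```python
# Minimal ETL sketch: extract a CSV, transform with Pandas, load into PostgreSQL.
# File name, column names, table name, and connection string are placeholders.
import pandas as pd
from sqlalchemy import create_engine


def extract(path: str) -> pd.DataFrame:
    # Extract: read raw data from a CSV file.
    return pd.read_csv(path)


def transform(df: pd.DataFrame) -> pd.DataFrame:
    # Transform: drop duplicates, normalize column names, add a derived column.
    df = df.drop_duplicates()
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
    df["total_price"] = df["quantity"] * df["unit_price"]  # assumed columns
    return df


def load(df: pd.DataFrame, table: str, conn_str: str) -> None:
    # Load: write the cleaned frame into PostgreSQL via SQLAlchemy.
    engine = create_engine(conn_str)
    df.to_sql(table, engine, if_exists="replace", index=False)


if __name__ == "__main__":
    raw = extract("sales.csv")
    clean = transform(raw)
    load(clean, "sales", "postgresql+psycopg2://user:password@localhost:5432/etl_db")
```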
| Project | Overview |
|---|---|
| data-processing-api | API for ingesting and analyzing an e-commerce dataset using Pandas, Polars, and FastAPI. |
| ecommerce-api | Cleans and enriches sales data, then serves analytics on top products and regional sales via FastAPI + SQLAlchemy (see the sketch below). |
| web-scraping-api | Weather data scraper (BeautifulSoup + Requests), cleaned with Pandas and exposed via FastAPI. |
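
All three API projects share the same pattern: ingest and clean data with Pandas, then expose insights through FastAPI. Here is a hedged sketch of what an analytics endpoint like the ones in `ecommerce-api` might look like; the route, column names, and CSV source are assumptions, not the project's actual API.

```python
# Sketch of a FastAPI analytics endpoint over a cleaned sales dataset.
# Route, column names, and the CSV source are illustrative assumptions.
import pandas as pd
from fastapi import FastAPI

app = FastAPI(title="ecommerce-analytics-sketch")

# Load the cleaned dataset once at import time (placeholder file name).
sales = pd.read_csv("cleaned_sales.csv")


@app.get("/analytics/top-products")
def top_products(limit: int = 5):
    # Aggregate revenue per product and return the top `limit` entries.
    top = (
        sales.groupby("product_name")["total_price"]
        .sum()
        .sort_values(ascending=False)
        .head(limit)
    )
    return [{"product": name, "revenue": float(rev)} for name, rev in top.items()]
```

Run it with `uvicorn app:app --reload` (module name assumed) and query `/analytics/top-products?limit=3`.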
| Project | Overview |
|---|---|
| ETL Pipeline (Movies Dataset) | Combines multiple movie datasets pulled via RapidAPI; transformations with Python and SQLAlchemy; insights served through APIs backed by PostgreSQL (see the sketch below). |
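
As a rough illustration of the extract step, the snippet below pulls records from a RapidAPI-hosted endpoint and loads them into PostgreSQL with SQLAlchemy. The endpoint URL, host, and response shape are hypothetical; only the standard RapidAPI header convention (`X-RapidAPI-Key` / `X-RapidAPI-Host`) is taken as given.

```python
# Sketch: fetch movie records from a RapidAPI endpoint, load into PostgreSQL.
# Endpoint URL, host, and response keys are hypothetical placeholders.
import os

import pandas as pd
import requests
from sqlalchemy import create_engine

API_URL = "https://example-movies.p.rapidapi.com/titles"  # placeholder endpoint
HEADERS = {
    "X-RapidAPI-Key": os.environ["RAPIDAPI_KEY"],
    "X-RapidAPI-Host": "example-movies.p.rapidapi.com",    # placeholder host
}


def fetch_movies() -> pd.DataFrame:
    # Fetch a page of movie records and flatten the JSON into a DataFrame.
    resp = requests.get(API_URL, headers=HEADERS, timeout=30)
    resp.raise_for_status()
    return pd.json_normalize(resp.json()["results"])       # assumed response key


def load_movies(df: pd.DataFrame) -> None:
    # Append the records to a PostgreSQL table via SQLAlchemy.
    engine = create_engine("postgresql+psycopg2://user:password@localhost:5432/movies_db")
    df.to_sql("movies", engine, if_exists="append", index=False)


if __name__ == "__main__":
    load_movies(fetch_movies())
```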
| Project | Overview |
|---|---|
| team_agile | Demonstrates agile team collaboration principles, such as documentation, role assignment, and process design. |
- Automation-first mindset: Multiple projects integrate CI/CD, scripting, and cloud orchestration, reflecting my ambition to build scalable and maintainable pipelines.
- Data science at the core: From e-commerce analytics to scraping and transformation, I have put data insights front and center.
- Modern tooling and architecture: I work with FastAPI, SQLAlchemy, PostgreSQL, and libraries like Polars—all central to today's data workflows.
- Full-stack capability: I build both the backend data pipelines and the APIs that expose analytics—showing my breadth of data science practice and capabilities.
- Navigate to the project folder you're interested in.
- Check the `README.md` inside each project (if available) for detailed setup, tools, and usage.
- Look for a `Dockerfile`, GitHub Actions workflows, or deployment configs to see how automation is set up.
- Run sample data inputs or call the API endpoints to explore functionality (see the sketch below).
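
For example, once one of the FastAPI services is running locally (e.g. via `uvicorn`), its endpoints can be explored with a few lines of Python; the base URL and route below are placeholders, so check the project's README for the real ones.

```python
# Quick exploration sketch: call a locally running FastAPI service.
# Base URL and route are assumptions; see each project's README for actual routes.
import requests

BASE_URL = "http://127.0.0.1:8000"

resp = requests.get(f"{BASE_URL}/analytics/top-products", params={"limit": 3}, timeout=10)
resp.raise_for_status()
print(resp.json())
```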
- Expand `data_epic_capstone` with an ETL dashboard and make it deployable via Netlify or a cloud provider.
- Add README badges per project for CI status, main language, or live API links.
- Enhance docs with architecture diagrams (Mermaid or PNG) to visualize workflows.
Feel free to reach out — I'd love to discuss data engineering, automation workflows, or cloud-native ETL practices: [iyanuvicky@gmail.com](mailto:iyanuvicky@gmail.com)
⭐ From Iyanuvicky22