Hello! I'm a software engineer based in Los Angeles, CA, with expertise in distributed systems, data engineering, and machine learning. I believe iteration is essential and that great ideas can come from anyone, so it's important to keep an open mind.
1. homelab
A personal homelab setup and configuration repository. This project documents my home infrastructure with automation scripts, configuration files, and deployment procedures. It uses infrastructure-as-code principles to maintain a robust, scalable home lab environment for experimentation and learning.
Tech Stack: Python
Ray Vector Database Input/Output Utilities. This project provides tools for integrating Ray, a distributed computing framework, with various vector databases. It is designed to optimize data pipeline workflows involving vector embeddings for AI applications such as semantic search and recommendation systems.
Key Features:
- Streamlined data loading/extraction from vector databases
- Integration with Ray's distributed computing ecosystem
- Support for various vector database formats
An example project demonstrating how to combine Apache Spark Streaming, Kafka, and Parquet to build a robust data pipeline. This project transforms JSON objects streamed over Kafka into Parquet files stored in S3, showcasing a complete ETL workflow.
Tech Stack: Scala, Apache Spark, Kafka, Parquet, S3
Key Features:
- Real-time data processing with Spark Streaming
- JSON message consumption from Kafka topics
- Efficient storage using Parquet file format
- Cloud integration with Amazon S3
A demonstration of how to build microservices with Spring Boot, using Redis for authentication, feature flags, and HyperLogLog-based user counting.
Tech Stack: Java, Spring Boot, Redis
Key Features:
- Microservices architecture patterns
- Redis integration for caching and data storage
- Authentication and authorization examples
- Feature flag implementation
- User count tracking with HyperLogLog
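To illustrate the idea behind HyperLogLog user counting, here is a minimal pure-Python sketch of the data structure. It is not code from the repository and not Redis's implementation (Redis exposes HyperLogLog through the `PFADD` and `PFCOUNT` commands). The class name, the register count, and the choice of SHA-1 as the hash are assumptions made for illustration.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog: estimates distinct items using 2**p registers."""

    def __init__(self, p: int = 12) -> None:
        self.p = p
        self.m = 1 << p                       # number of registers (4096 by default)
        self.registers = [0] * self.m
        # Bias-correction constant, valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item: str) -> None:
        # Take 64 bits of a hash of the item (SHA-1 is an arbitrary choice here).
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)              # top p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)
        # Rank: position of the leftmost 1-bit in the remaining 64 - p bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self) -> int:
        # Harmonic-mean estimate over all registers.
        raw = self.alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return round(self.m * math.log(self.m / zeros))
        return round(raw)


visitors = HyperLogLog()
for user_id in ["alice", "bob", "alice", "carol"]:
    visitors.add(user_id)
print(visitors.count())  # an estimate; repeated IDs do not raise it
```

The appeal for user counting is that memory stays fixed (one small register per slot) no matter how many users are added, at the cost of a small estimation error.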
A personal repository for experimenting with Rust language concepts and patterns. It serves as a learning environment and reference for Rust development.
Tech Stack: Rust
A tutorial repository demonstrating how to effectively use regular expressions with Apache Spark. Covers pattern matching, text extraction, and data transformation techniques using Spark's regex capabilities.
Tech Stack: Apache Spark, Scala
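As a rough illustration of the extraction and transformation techniques mentioned above, the sketch below mimics the behavior of Spark SQL's `regexp_extract` and `regexp_replace` functions using Python's standard `re` module. It is a stand-in, not code from the tutorial: the sample log line and patterns are made up, and Spark uses Java regex syntax, which differs from Python's in some details (for example, group references in replacements).

```python
import re


def regexp_extract(text: str, pattern: str, idx: int) -> str:
    """Return capture group `idx` of the first match, or "" if nothing matched.

    Mirrors Spark's documented behavior of returning an empty string on no match.
    """
    match = re.search(pattern, text)
    if match is None or match.group(idx) is None:
        return ""
    return match.group(idx)


def regexp_replace(text: str, pattern: str, replacement: str) -> str:
    """Replace every match of `pattern` in `text` with `replacement`."""
    return re.sub(pattern, replacement, text)


# Hypothetical access-log line used only for this example.
line = "10.0.0.7 [2024-05-01] GET /users/42 200"

date = regexp_extract(line, r"\[(\d{4}-\d{2}-\d{2})\]", 1)      # "2024-05-01"
path = regexp_extract(line, r"(GET|POST) (\S+)", 2)             # "/users/42"
masked = regexp_replace(line, r"\d+\.\d+\.\d+\.\d+", "<ip>")    # IP address masked
missing = regexp_extract("no date here", r"\[(\d{4})\]", 1)     # ""
```

In Spark itself, the same calls would be applied to DataFrame columns rather than plain strings.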
- Languages: Python, Scala, Java, Rust
- Big Data: Apache Airflow, Apache Spark, Kafka, Ray, Parquet
- Databases: Redis/Valkey, Cassandra, PostgreSQL, Qdrant
- Cloud: AWS, GCP
- DevOps: Docker, Kubernetes, GitHub Actions
Feel free to reach out to me through my LinkedIn profile or visit my personal website to learn more about my work.