I am a Research Software Engineer with a passion for building scalable machine learning systems and developing robust software tools for data-intensive applications.
- Applied ML: Architecting and deploying machine learning models for complex challenges, including spatio-temporal forecasting and large-scale sequence analysis.
- Data Engineering & Geospatial: Building cloud-native data platforms, like STAC APIs, to efficiently manage, process, and serve large-scale datasets.
- ML & AI: Advancing skills in modern machine learning, including statistical modeling, Conformal Prediction for reliable uncertainty quantification, and efficient fine-tuning methods (e.g., LoRA) for large transformer models.
- Cloud & MLOps: Designing and automating CI/CD pipelines for model deployment, data updates, and infrastructure management using tools like Terraform and GitHub Actions.
- Languages & Libraries: Python, R, PyTorch, TensorFlow, scikit-learn, Pandas, NumPy, Hugging Face
- ML & AI: Supervised & Unsupervised Learning, Deep Learning, Generative AI (LLMs), Statistical Modeling, Conformal Prediction
- Cloud & MLOps: Azure, Google Cloud Platform (GCP), AWS, Docker, Terraform, CI/CD, GitHub Actions, Git
- Data Engineering & Geospatial: SQL, PostgreSQL, ETL, Data Pipelines, STAC API
Real-time Spatio-Temporal Forecasting System
- Pioneered the application of Conformal Prediction to generate distribution-free 95% prediction intervals for time-series forecasts (a minimal sketch follows this list).
- Developed advanced statistical modeling approaches for zero-inflated count data, improving prediction accuracy by 20%.
- Architected and deployed a production-ready API on Azure (using RestRServe, Docker, and Terraform) for real-time data surveillance, reducing analysis time by 40%.
- Implemented a cloud-native STAC API to ingest and manage large-scale geospatial datasets (e.g., CHIRPS, MODIS), ensuring high data integrity (see the query sketch below).
- Established a CI/CD pipeline with GitHub Actions to automate monthly data updates, cutting run times by up to 95%.
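
The bullets above mention Conformal Prediction only at a high level; here is a minimal sketch of the split conformal recipe, assuming a generic regressor and synthetic lagged features as stand-ins for the actual forecasting model and surveillance data.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Toy lagged-feature dataset standing in for the real spatio-temporal inputs.
X = rng.normal(size=(500, 8))
y = X[:, 0] * 2.0 + rng.normal(scale=0.5, size=500)

# Split into a proper training set and a held-out calibration set.
X_train, X_cal = X[:350], X[350:]
y_train, y_cal = y[:350], y[350:]

model = GradientBoostingRegressor().fit(X_train, y_train)

# Split conformal: absolute residuals on the calibration set are the
# nonconformity scores; take the finite-sample-corrected quantile.
alpha = 0.05
scores = np.abs(y_cal - model.predict(X_cal))
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

# 95% prediction interval for a new observation.
x_new = rng.normal(size=(1, 8))
y_hat = model.predict(x_new)
lower, upper = y_hat - q, y_hat + q
print(f"forecast {y_hat[0]:.2f}, 95% interval [{lower[0]:.2f}, {upper[0]:.2f}]")
```

The same wrapper works around any point forecaster, since the coverage guarantee comes from the calibration split rather than from the model itself.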
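For the STAC side, a short sketch of how such a catalog is typically queried from Python with `pystac-client`. The endpoint, collection name, bounding box, and date range are placeholders: the public Planetary Computer API and a Sentinel-2 collection stand in for the project's own CHIRPS/MODIS catalog.

```python
from pystac_client import Client

# Placeholder endpoint: the project's own STAC catalog would be queried
# the same way; the public Planetary Computer API stands in here.
catalog = Client.open("https://planetarycomputer.microsoft.com/api/stac/v1")

search = catalog.search(
    collections=["sentinel-2-l2a"],   # a CHIRPS or MODIS collection in practice
    bbox=[33.9, -4.7, 41.9, 5.5],     # illustrative bounding box
    datetime="2023-01-01/2023-01-31",
    max_items=10,
)

for item in search.items():
    # Each STAC item carries its acquisition time and links to the underlying assets.
    print(item.id, item.datetime, list(item.assets))
```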
Efficient Transformer Model Adaptation
- Implemented and optimized parameter-efficient fine-tuning (PEFT) methods such as LoRA for large transformer architectures, reducing computational resource needs by 60% while retaining 95% of full fine-tuning performance (a LoRA sketch follows this list).
- Engineered custom data preprocessing pipelines for complex, large-scale sequence data, enabling analysis of inputs 40% larger than the models' standard input limits allow (see the windowing sketch below).
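
A minimal sketch of LoRA-style fine-tuning with the Hugging Face `peft` library; the DistilBERT base model, rank, and target modules are illustrative choices, not the configuration used in the project.

```python
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, get_peft_model, TaskType

# Placeholder base model; the same recipe applies to larger transformer backbones.
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

# LoRA: freeze the base weights and learn low-rank update matrices on the
# attention projections, where most of the adaptation capacity lives.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,                                # rank of the low-rank update
    lora_alpha=16,                      # scaling factor
    lora_dropout=0.05,
    target_modules=["q_lin", "v_lin"],  # DistilBERT's query/value projections
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()      # typically well under 1% of the base model
```

Because only the adapter weights are updated, the optimizer states and gradients shrink accordingly, which is where the resource savings come from.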
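The preprocessing pipeline itself is project-specific, but one common way to handle sequences longer than a transformer's input limit is overlapping-window tokenization; the sketch below illustrates that approach with placeholder model and window sizes.

```python
from transformers import AutoTokenizer

# Illustrative only: tokenize a long sequence into overlapping windows so each
# window fits the model's limit; predictions are aggregated per window downstream.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

long_text = "… a sequence far longer than the model's maximum input length …"

encoded = tokenizer(
    long_text,
    truncation=True,
    max_length=512,                  # model's hard input limit
    stride=64,                       # overlap between consecutive windows
    return_overflowing_tokens=True,  # emit every window, not just the first
)

# Each entry in input_ids is now one overlapping window of the original sequence.
print(f"{len(encoded['input_ids'])} windows of up to 512 tokens")
```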
- Deepening my understanding of advanced statistical models for complex, high-dimensional data.
- Exploring and implementing MLOps strategies to enhance the reproducibility, scalability, and monitoring of machine learning workflows.
- Researching novel approaches for applying large language models to structured and unstructured data extraction and analysis tasks.
- GitHub
- Email: maruf3141@outlook.com



