AI-focused Data Scientist and current PhD Candidate (GPA: 3.89, expected 2027) with 4+ years of experience specializing in transforming complex datasets into strategic initiatives that boost business and clinical performance. Experienced in healthcare, sales analytics, crime prediction, forecasting, customer segmentation, and education, with projects that include developing predictive models with over 96% accuracy and boosting student engagement by 70% through data-driven curriculum enhancements. Committed to leveraging AI, advanced analytics, and machine learning techniques to support innovation and drive insightful solutions in dynamic settings.
PhD Candidate in Data Science at National University (graduation expected in 2027)
Data Scientist at Bizmpya.com with nearly 4 years of hands-on experience driving data-driven insights and strategies.
Data Science Intern at Huntershightech.com (6 months) specializing in predictive modeling and data analytics.
- Harnessing AI and machine learning to solve complex, real-world problems.
- Applying advanced predictive analytics to transform industries like Healthcare and Biopharmaceuticals with cutting-edge ML techniques.
- Predictive Analytics in healthcare, business, real estate, sales, crime prediction, education, and census data: Using machine learning to extract insights and forecast trends.
- Inventory Forecasting: Expertise in using time series models (ETS, ARIMA, SARIMA) to optimize store inventory management.
- Classification & Regression: Proficient in ensemble methods (XGBoost, Random Forest, Bagging), decision trees, LDA, logistic regression, and KNN.
- Linear Regression: Applying regression techniques to real-world scenarios, such as predicting housing prices, college GPA, and used vehicle prices.
- Dimensionality Reduction: Extensive experience with PCA to reduce data complexity while retaining key information.
- Regularized Regression: Skilled in using Elastic Net, Lasso, and Ridge regression for model optimization.
- Advanced Predictive Methods: Strong expertise in SVM, Naive Bayes, and Polynomial Regression for classification and regression tasks.
- Clustering & Customer Segmentation: Proficient in K-means and Hierarchical Clustering for market segmentation and targeted strategies.
- Programming: Python, R, MySQL.
- Data Science Libraries: Pandas, NumPy, Scikit-learn, TensorFlow, Keras, and more.
- Advanced ML Techniques: Hyperparameter tuning, cross-validation, and feature engineering to maximize model performance.
- Proven expertise in Sales, Marketing, and Entrepreneurship, combining data insights with strategic decision-making.
- Able to convey complex scientific knowledge to stakeholders in an easy-to-understand way, using custom data visualizations to show and tell.
- Languages & Tools:
- Programming: Python (Pandas, NumPy, Scikit-learn, Seaborn, Matplotlib), R, MySQL
- Analysis & Visualization: Jupyter, Google Colab, Tableau
- Machine Learning: Supervised/Unsupervised Learning, Neural Networks (PyTorch, TensorFlow), Ensemble Methods (Random Forest, XGBoost), SVM
- Advanced Modeling: Polynomial Regression, Spline Regression, ETS, ARIMA/SARIMA
- Platforms: KNIME, GitHub, Cloud (Azure, AWS, GCP)
- CRM: Veeva, Salesforce.com
- Identifies key risk factors for mortality using machine learning models
- Implements Logistic Regression, Random Forest, and XGBoost
- Provides insights to improve preventive healthcare strategies
- Predicts future store sales using time-series models
- Applies ETS and ARIMA forecasting techniques
- Helps businesses optimize inventory and sales decisions
- Clusters wine samples based on characteristics using unsupervised learning
- Uses PCA to reduce dimensionality and K-means for clustering
- Supports sommelier decision-making and product differentiation
- Classifies obesity risk factors using various machine learning models
- Implements XGBoost, Random Forest, Decision Tree, Elastic Net, Bagging, LDA, SVM, Naive Bayes, and Multinomial Logistic Regression
- Can assist healthcare professionals in early intervention strategies
- Predicts the number of rings in abalones (age indicator)
- Applies Generalized Additive Models (GAM), Cubic Spline Regression, Principal Component Regression (PCR), and Elastic Net, and compares Random Forest with XGBoost
- Enhances seafood industry insights for sustainable harvesting
- Analyzes vehicle fuel efficiency trends using non-linear regression
- Uses Polynomial Regression, Cubic Spline, and GAM models
- Helps inform automotive engineering decisions for improved fuel economy
- Conducts detailed statistical analysis for data insights
- Utilizes descriptive statistics and visualization techniques
- Supports data-driven decision-making across industries
- Develops and analyzes a structured database using MySQL Workbench
- Focuses on database management and optimization
- Provides business intelligence insights
- A curated collection showcasing various data science projects
- Organizes work for professional presentation
- Demonstrates proficiency in version control and project organization
- Includes cloned repositories and hands-on GitHub exploration
- LinkedIn: linkedin.com/in/gladysmurage
- Email: gmurage@outlook.com
- GitHub: github.com/gmurage
Thanks for stopping by my profile. Let's build innovative solutions together!