Machine Learning Projects
-
Fashion MNIST Classification with Dimensionality Reduction
- Develop an effective classification model for Fashion MNIST that maximizes accuracy.
- Evaluate Dimensionality Reduction Impact: Assess whether dimensionality reduction improves classification performance or leads to information loss.
- Hyperparameter Optimization: Identify the best hyperparameter settings for Random Forest and KNN.
- Comparison of Model Performance: Determine which classifier (Random Forest vs. KNN) is more effective for this dataset.
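A minimal sketch of the comparison, assuming scikit-learn; the bundled 8x8 digits dataset stands in for Fashion MNIST (which would normally be loaded via keras.datasets or torchvision), and the hyperparameter values are placeholders, not tuned settings.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Small stand-in dataset so the sketch runs quickly.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

results = {}
for name, clf in [("RandomForest", RandomForestClassifier(n_estimators=100, random_state=42)),
                  ("KNN", KNeighborsClassifier(n_neighbors=5))]:
    # Raw pixels vs. PCA keeping 95% of the variance, for each classifier.
    raw = clf.fit(X_train, y_train).score(X_test, y_test)
    reduced = make_pipeline(PCA(n_components=0.95, random_state=42), clf) \
        .fit(X_train, y_train).score(X_test, y_test)
    results[name] = (raw, reduced)
    print(f"{name}: raw={raw:.3f}, PCA(95% var)={reduced:.3f}")
```

Comparing the raw and reduced accuracies per classifier is exactly the information-loss question above: if the PCA column drops noticeably, the discarded variance carried class information.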
-
Fashion MNIST & Lena: Clustering for Pattern Discovery and Compression
- Explore clustering techniques for grouping images and color quantization.
- Evaluate the Effectiveness of Clustering on Image Data:
- Understand how different clustering methods perform on Fashion MNIST.
- Determine if they align with human-perceived categories.
- Optimize Image Compression via Clustering: Reduce the number of colors in the Lena image while maintaining visual quality.
- Compare Clustering Methods Based on Performance Metrics:
- Measure agreement between clusters and ground-truth categories using the Rand Index.
- Determine the best clustering approach for image grouping.
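A minimal sketch of the color-quantization side, assuming scikit-learn and NumPy; a random RGB array stands in for the Lena image (which would be loaded from disk), and the palette size of 16 is an arbitrary choice. Cluster agreement on Fashion MNIST would be scored separately with `sklearn.metrics.rand_score` against the true labels.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Synthetic 64x64 RGB image standing in for Lena.
img = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

k = 16  # target palette size
pixels = img.reshape(-1, 3).astype(float)
km = KMeans(n_clusters=k, n_init=4, random_state=0).fit(pixels)

# Replace every pixel with its cluster centroid -> a k-colour version of the image.
quantized = km.cluster_centers_[km.labels_].reshape(img.shape).astype(np.uint8)
print("distinct colours:", len(np.unique(quantized.reshape(-1, 3), axis=0)))
```

Visual quality vs. compression is then a matter of sweeping `k` and inspecting (or scoring, e.g. by mean squared pixel error) the quantized images.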
-
Customer Churn Prediction Using SVM & Logistic Regression
- Build an effective churn prediction model that accurately identifies customers likely to leave.
- Evaluate SVM Performance: Compare different SVM kernels (Linear, Polynomial, RBF) to determine the best-performing model.
- Compare Against Logistic Regression: Investigate whether regularized logistic regression can outperform SVM models in churn prediction.
- Optimize Model Hyperparameters: Use GridSearchCV to tune C, gamma, and kernel parameters for the best model performance.
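A minimal sketch of the GridSearchCV tuning step, assuming scikit-learn; synthetic imbalanced data stands in for the churn dataset, and the grid values are illustrative, not the project's actual search space.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the churn data (~20% positive class).
X, y = make_classification(n_samples=500, n_features=10, weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# Scaling inside the pipeline so CV folds are scaled without leakage.
pipe = make_pipeline(StandardScaler(), SVC())
param_grid = {
    "svc__kernel": ["linear", "poly", "rbf"],
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", 0.1],
}
search = GridSearchCV(pipe, param_grid, cv=3, n_jobs=-1).fit(X_train, y_train)
print("best params:", search.best_params_)
print("test accuracy:", search.best_estimator_.score(X_test, y_test))
```

The same grid can be reused for the logistic-regression comparison by swapping the estimator and searching over its `C` (regularization strength) instead.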
-
MLP vs. Gradient Boosting: A Comparative Analysis for Regression and Classification
- Evaluate MLP architectures for both regression and classification and compare their performance with Gradient Boosting models.
- Identify the best MLP structure for regression and classification.
- Compare different layer depths and neuron distributions.
- Compare Neural Networks with Gradient Boosting.
- Determine whether ensemble methods outperform deep learning for structured tabular data.
- Optimize Performance Metrics.
- Regression: Use Mean Squared Error (MSE) and R² score.
- Classification: Use Accuracy and Mean Squared Error (MSE).
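A minimal sketch of the regression half of the comparison, assuming scikit-learn; the data is synthetic, and the (64, 32) hidden-layer shape is one arbitrary architecture, not the "best structure" the project seeks. Targets are standardized so the MLP converges quickly.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic tabular regression data standing in for the project's dataset.
X, y = make_regression(n_samples=600, n_features=12, noise=10.0, random_state=1)
y = (y - y.mean()) / y.std()  # standardized targets for stable MLP training
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

models = {
    "MLP (64-32)": make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=1)),
    "GradientBoosting": GradientBoostingRegressor(random_state=1),
}
scores = {}
for name, model in models.items():
    pred = model.fit(X_train, y_train).predict(X_test)
    scores[name] = (mean_squared_error(y_test, pred), r2_score(y_test, pred))
    print(f"{name}: MSE={scores[name][0]:.3f}, R2={scores[name][1]:.3f}")
```

Varying `hidden_layer_sizes` over several depths and widths in the same loop gives the layer-depth comparison; the classification half follows the same pattern with `MLPClassifier` and `GradientBoostingClassifier`.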
-
Naive Bayes Classification on German Credit Card Data
- Build an accurate and interpretable credit risk prediction model using Naive Bayes classifiers.
- Compare Naive Bayes Variants:
- Evaluate how well each classifier (GNB, CNB, MNB) performs.
- Identify the best approach for credit risk classification.
- Feature Importance Analysis
- Use correlation analysis and Cramér’s V to determine the most influential features.
- Optimize Model Performance
- Tune alpha values for categorical features.
- Balance model performance between precision and recall.
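A minimal sketch of the variant comparison, assuming scikit-learn and reading "CNB" as `CategoricalNB`; the categorical features and labels are synthetic stand-ins for the German credit data, and `alpha=1.0` is the default smoothing, not a tuned value.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import CategoricalNB, GaussianNB, MultinomialNB

rng = np.random.default_rng(0)
# Synthetic stand-in: 5 categorical features (4 levels each), binary risk label.
X = rng.integers(0, 4, size=(500, 5))
y = (X.sum(axis=1) + rng.integers(0, 3, size=500) > 9).astype(int)

results = {}
for clf in (GaussianNB(), CategoricalNB(alpha=1.0), MultinomialNB(alpha=1.0)):
    results[type(clf).__name__] = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{type(clf).__name__}: CV accuracy = {results[type(clf).__name__]:.3f}")
```

Tuning `alpha` is a one-dimensional grid search over the same cross-validation loop; the precision/recall balance comes from replacing the default accuracy scoring with `scoring="precision"` or `scoring="recall"`.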
-
Ensemble Learning for Bank Churn Prediction
- Use ensemble learning to improve customer churn prediction accuracy.
- Evaluate the effectiveness of different ensemble methods: Compare Voting, Bagging, Boosting, and Stacking.
- Tune Hyperparameters for Best Performance: Optimize individual models before combining them.
- Compare Accuracy Between Base Learners and Ensemble Models: Identify whether ensemble models outperform individual classifiers.
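A minimal sketch of the base-learner vs. ensemble comparison, assuming scikit-learn; the data is synthetic, the three base learners are illustrative choices, and only Voting is shown (Bagging, Boosting, and Stacking slot into the same loop via their scikit-learn classes).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the bank churn data.
X, y = make_classification(n_samples=600, n_features=10, random_state=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=2)

base = [("lr", LogisticRegression(max_iter=1000)),
        ("knn", KNeighborsClassifier()),
        ("tree", DecisionTreeClassifier(random_state=2))]

# Score each base learner alone, then the soft-voting ensemble of all three.
scores = {name: clf.fit(X_train, y_train).score(X_test, y_test) for name, clf in base}
scores["voting"] = VotingClassifier(base, voting="soft").fit(X_train, y_train) \
    .score(X_test, y_test)
for name, s in scores.items():
    print(f"{name}: {s:.3f}")
```

Tuning each base model (e.g. with GridSearchCV) before passing it to the ensemble is the "optimize individual models before combining them" step above.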
-
Sentiment Analysis on Yelp Reviews Using Naive Bayes
- Develop a Naive Bayes-based sentiment classification model for Yelp reviews.
- Compare Feature Extraction Methods: Evaluate Bag-of-Words vs. TF-IDF performance.
- Compare Different Naive Bayes Models: Assess MultinomialNB vs. GaussianNB for text classification.
- Optimize Model Performance: Tune alpha (for MNB) and var_smoothing (for GNB) using GridSearchCV.
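A minimal sketch of the Bag-of-Words vs. TF-IDF comparison, assuming scikit-learn; the six hand-written reviews and their labels are made up to stand in for the Yelp corpus, and only MultinomialNB is shown (GaussianNB would additionally need dense feature arrays).

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny hand-made corpus standing in for the Yelp reviews.
texts = ["great food and friendly staff", "terrible service, cold food",
         "loved the ambiance, will return", "awful experience, never again",
         "delicious meals, highly recommend", "rude waiter and bland dishes"]
labels = [1, 0, 1, 0, 1, 0]  # 1 = positive, 0 = negative

preds = {}
for name, vec in [("BoW", CountVectorizer()), ("TF-IDF", TfidfVectorizer())]:
    model = make_pipeline(vec, MultinomialNB(alpha=1.0)).fit(texts, labels)
    preds[name] = int(model.predict(["friendly staff and great meals"])[0])
    print(name, "->", "positive" if preds[name] else "negative")
```

Wrapping the pipeline in GridSearchCV over `multinomialnb__alpha` gives the tuning step; swapping in GaussianNB requires a densifying transformer and a grid over `var_smoothing`.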
-
Uncovering News Topics: LDA vs. NMF for ABC News Headlines
- Extract meaningful topics from ABC News headlines using LDA and NMF.
- Compare LDA and NMF on the quality and interpretability of the extracted topics.
- Analyze how different feature extraction methods (BoW vs. TF-IDF) affect results.
- Determine the best model for news topic discovery.
-
Book Recommendation System Using Collaborative Filtering and Matrix Factorization
- Develop a robust book recommendation system that provides personalized suggestions.
- Compare different recommendation methods: Analyze differences between User-Based, Item-Based, and Model-Based approaches.
- Evaluate prediction accuracy: Assess predicted ratings for books across methods.
- Optimize computational efficiency: Reduce the processing time for large-scale recommendation systems.
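A minimal sketch of the user-based approach, assuming NumPy only; the 4x4 rating matrix and its values are made up for illustration, with 0 meaning unrated (and, naively, treated as 0 in the similarity computation).

```python
import numpy as np

# Toy user x book rating matrix (rows = users, columns = books); values are hypothetical.
R = np.array([[5, 4, 0, 1],
              [4, 5, 1, 2],
              [1, 2, 5, 4],
              [2, 1, 4, 5]], dtype=float)

def cosine(u, v):
    return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

# User-based CF: predict user 0's rating for book 2 as a similarity-weighted
# average of the other users' ratings for that book.
target_user, target_item = 0, 2
others = [u for u in range(len(R)) if u != target_user]
sims = np.array([cosine(R[target_user], R[u]) for u in others])
ratings = np.array([R[u, target_item] for u in others])
prediction = (sims @ ratings) / sims.sum()
print(f"predicted rating: {prediction:.2f}")
```

Item-based CF transposes the same computation over columns; the model-based approach factorizes `R` (e.g. truncated SVD) and reconstructs missing entries, which is also where the efficiency gains for large matrices come from.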
-
Extra Credit: License Plate Character and Digit Recognition Using CNNs
- Develop a high-accuracy CNN model to classify license plate characters.
- Use data augmentation to improve model robustness.
- Optimize CNN hyperparameters for better performance.
- Generate a prediction file for unseen test images.
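A minimal sketch of the augmentation step only, in plain NumPy; in practice Keras' ImageDataGenerator or torchvision transforms would handle this (plus rotations and zooms), and the 28x28 block image is a fake stand-in for a cropped plate character.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image, max_shift=2):
    """Randomly translate a 2-D grayscale character image by up to max_shift pixels."""
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    shifted = np.roll(np.roll(image, dy, axis=0), dx, axis=1)
    # Zero out the pixels that wrapped around so the shift is a true translation.
    if dy > 0:
        shifted[:dy] = 0
    elif dy < 0:
        shifted[dy:] = 0
    if dx > 0:
        shifted[:, :dx] = 0
    elif dx < 0:
        shifted[:, dx:] = 0
    return shifted

# A fake 28x28 "character" image; real inputs would be cropped plate characters.
img = np.zeros((28, 28))
img[8:20, 10:18] = 1.0
aug = augment(img)
print("pixels kept:", int(aug.sum()), "of", int(img.sum()))
```

Feeding shifted copies like these alongside the originals is what makes the CNN robust to imperfect character cropping on real plates.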
-
Midterm: Handling Categorical Data for Dimensionality Reduction and Clustering in Housing Prices
- Develop a dimensionality reduction and clustering pipeline for mixed data types in real estate price prediction.
- Evaluate whether dimensionality reduction improves regression performance.
- Assess clustering quality using Normalized Mutual Information (NMI).
- Compare the effectiveness of PCA + MCA vs. full feature set in Ridge Regression.
- Optimize the number of clusters in K-Medoids using Gower distance.
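A minimal sketch of the NMI scoring step, assuming scikit-learn; the tier and cluster labels are hypothetical. (K-Medoids and Gower distance themselves need extra packages such as `scikit-learn-extra` and `gower`, so only the evaluation metric is shown.)

```python
from sklearn.metrics import normalized_mutual_info_score

# Hypothetical example: ground-truth price tiers vs. K-Medoids cluster labels.
# NMI is permutation-invariant, so cluster IDs need not match tier IDs.
true_tiers = [0, 0, 0, 1, 1, 1, 2, 2, 2]
clusters   = [1, 1, 1, 0, 0, 2, 2, 2, 2]
nmi = normalized_mutual_info_score(true_tiers, clusters)
print(f"NMI = {nmi:.3f}")
```

Sweeping the number of clusters and plotting NMI (or silhouette scores on the Gower distance matrix) is the cluster-count optimization above.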
-
Project: The Influence of Credit Limit Variability on Credit Score Tiers
- Predict credit score categories accurately based on financial and credit behavior.
- Evaluate the influence of changed_credit_limit on credit scores.
- Compare performance of different classification models.
- Use clustering to uncover hidden patterns in credit behavior.
- Evaluate an SVM model as one of the classification approaches.