Final XGBoost Model with Preprocessing & Hyperparameter Optimization
This repository contains the full implementation and documentation for our Kaggle competition submission predicting y_passXtremeDurability, the probability that an aluminum cold-rolled sheet passes an extreme durability test.
Our final pipeline consisted of:
- Data cleaning and preprocessing
- Feature encoding including ordinal factors
- Design matrices using model.matrix()
- An initial XGBoost baseline model to establish an internal logloss benchmark
- A hyperparameter grid search to identify the parameter set with the lowest cross-validated (CV) logloss
- Final optimized XGBoost model and CSV output
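
As a rough illustration of the encoding, design-matrix, and logloss steps listed above, here is a minimal base-R sketch. The toy data frame, its column names (`thickness`, `grade`, `alloy`), the `logloss` helper, and the grid values are all hypothetical and chosen only for illustration; they are not taken from the competition data or from the repository's scripts.

```r
# Toy data; column names and values are hypothetical, not from the competition.
toy <- data.frame(
  thickness = c(1.2, 0.8, 1.5, 1.0),
  grade = factor(c("low", "high", "medium", "low"),
                 levels = c("low", "medium", "high"), ordered = TRUE),
  alloy = factor(c("A", "B", "A", "C"))
)

# Encode the ordinal factor as integer ranks (low = 1, medium = 2, high = 3).
# Left as an ordered factor, model.matrix() would expand it with polynomial
# contrasts instead of a single rank column.
toy$grade <- as.integer(toy$grade)

# Build a numeric design matrix; "- 1" drops the intercept column, so the
# nominal factor `alloy` is expanded into one indicator column per level.
X <- model.matrix(~ . - 1, data = toy)

# Binary logloss, with predictions clipped away from 0 and 1 to avoid log(0).
logloss <- function(y, p, eps = 1e-15) {
  p <- pmin(pmax(p, eps), 1 - eps)
  -mean(y * log(p) + (1 - y) * log(1 - p))
}

# Candidate hyperparameter combinations (illustrative values only).
grid <- expand.grid(max_depth = c(3, 6), eta = c(0.05, 0.1))
```

In a grid search of this shape, each row of `grid` would be scored by cross-validated logloss and the row with the lowest score kept for the final fit. The sketch stops short of fitting any model so that it runs without the xgboost package installed.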