🚢 Titanic Survival Prediction

A machine learning project that predicts passenger survival on the Titanic using various classification algorithms with hyperparameter optimization.

📋 Description

This repository contains a machine learning solution for the Kaggle Titanic survival prediction competition. It trains several classification algorithms, tunes each one with hyperparameter optimization, and predicts whether a passenger survived the Titanic disaster from features such as age, gender, ticket class, fare, and cabin.

The solution employs a systematic approach:

- Data exploration and visualization
- Feature engineering and preprocessing
- Model training with hyperparameter optimization
- Model evaluation and selection
- Prediction generation for test data
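
The evaluation and selection step can be sketched as follows. This is a minimal illustration, not the code in titanicPredictSurvival.py: the synthetic data from make_classification stands in for the preprocessed Titanic features, and the compare_models helper, the candidate settings, and the choice of 5-fold cross-validated accuracy as the selection metric are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed Titanic features (assumption).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# Hypothetical candidate settings; the real script tunes these with Hyperopt.
candidates = {
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "knn": KNeighborsClassifier(n_neighbors=5),
}


def compare_models(models, X, y, folds=5):
    """Return {name: mean cross-validated accuracy} for each candidate."""
    return {
        name: float(
            cross_val_score(model, X, y, cv=folds, scoring="accuracy").mean()
        )
        for name, model in models.items()
    }


scores = compare_models(candidates, X, y)
best_name = max(scores, key=scores.get)

# Refit the winning model on all training data before predicting.
best_model = candidates[best_name].fit(X, y)
print(f"Selected {best_name} with mean accuracy {scores[best_name]:.3f}")
```

Scoring every candidate on the same folds keeps the comparison fair; a held-out validation split would be an equally reasonable choice.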

✨ Features

- Exploratory Data Analysis: Analysis of the Titanic dataset, with visualizations of feature relationships and survival patterns
- Feature Engineering: New features such as family size, titles extracted from passenger names, and family survival correlation
- Hyperparameter Optimization: Uses the Hyperopt library to search for well-performing parameters for each model
- Multiple Classification Algorithms:
  - Decision Tree Classifier
  - Random Forest Classifier
  - Gradient Boosting Classifier
  - XGBoost Classifier
  - K-Nearest Neighbors
  - Support Vector Machine (implementation available but not used in the main script)
  - Neural Network with Keras (implementation available but not used in the main script)
- Model Comparison: Automatic selection of the best-performing model for final predictions
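
As a rough illustration of the feature engineering listed above, the sketch below derives family size and a title from the standard Kaggle columns Name, SibSp, and Parch. The add_engineered_features helper, the IsAlone flag, and the regular expression are assumptions for this example and may differ from what the main script does; the family survival correlation feature is not covered here.

```python
import pandas as pd


def add_engineered_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with FamilySize, IsAlone, and Title columns."""
    out = df.copy()

    # Family size = siblings/spouses + parents/children + the passenger.
    out["FamilySize"] = out["SibSp"] + out["Parch"] + 1
    out["IsAlone"] = (out["FamilySize"] == 1).astype(int)

    # Kaggle names look like "Braund, Mr. Owen Harris": the title sits
    # between the comma and the first period.
    out["Title"] = (
        out["Name"].str.extract(r",\s*([^.]+)\.", expand=False).str.strip()
    )
    return out


# Three rows in the format of Kaggle's train.csv.
sample = pd.DataFrame(
    {
        "Name": [
            "Braund, Mr. Owen Harris",
            "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
            "Heikkinen, Miss. Laina",
        ],
        "SibSp": [1, 1, 0],
        "Parch": [0, 0, 0],
    }
)

features = add_engineered_features(sample)
print(features[["FamilySize", "IsAlone", "Title"]])
```

Rare titles are often grouped into a single category before encoding, since many of them appear only a handful of times in the training data.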

🛠️ Setup

Prerequisites

Python 3.x
Required libraries:
- pandas
- numpy
- scikit-learn
- matplotlib
- seaborn
- hyperopt
- xgboost
- keras (optional, for neural network implementation)

Installation

Clone this repository:

git clone https://github.com/yourusername/KaggleTitanticSurvivalClassify.git
cd KaggleTitanticSurvivalClassify

Install required packages:

pip install pandas numpy scikit-learn matplotlib seaborn hyperopt xgboost keras
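
To confirm the installation worked, a small check like the one below can help. It is a hypothetical helper, not part of this repository, and uses only the standard library. Note that the scikit-learn package is imported under the module name sklearn.

```python
import importlib.util

# Maps pip package name -> importable module name.
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "hyperopt": "hyperopt",
    "xgboost": "xgboost",
}
# Only needed for the neural network implementation.
OPTIONAL = {"keras": "keras"}


def missing_packages(packages):
    """Return the pip names whose modules cannot be found."""
    return [
        pip_name
        for pip_name, module_name in packages.items()
        if importlib.util.find_spec(module_name) is None
    ]


missing_required = missing_packages(REQUIRED)
missing_optional = missing_packages(OPTIONAL)

if missing_required:
    print("Missing required packages:", ", ".join(missing_required))
else:
    print("All required packages are installed.")
if missing_optional:
    print("Missing optional packages:", ", ".join(missing_optional))
```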

📊 Usage

Run the main script to train models and generate predictions:

python titanicPredictSurvival.py

The script will:
- Load and preprocess the training and test data
- Perform feature engineering
- Train multiple models with hyperparameter optimization
- Select the best performing model
- Generate predictions for the test set
- Save predictions to test_set_prediction.csv

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

