Data mining is the process of discovering and extracting hidden information, patterns, and relationships in large volumes of data, with the aim of predicting future events and outcomes. Colic in horses is often associated with inadequate care and can involve impaction, intestinal twisting, stomach ulcers, stomach rupture, or intestinal obstruction. It usually occurs suddenly and unexpectedly, so identifying the cause as quickly as possible is important for controlling the condition and preventing its progression.
Using data mining techniques, including data preprocessing, feature selection, model training, and evaluation, this project seeks to uncover patterns and relationships within the dataset to improve prediction accuracy. Exploratory data analysis and visualization provide insights into the factors that influence horse survival outcomes, supporting better-informed decision-making in veterinary medicine and animal healthcare.
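The four stages named above (preprocessing, feature selection, model training, and evaluation) can be sketched as a single scikit-learn pipeline. This is only an illustration under stated assumptions, not the project's actual code: the data is synthetic, the presence of missing values is assumed, and the median imputation, `SelectKBest` selector, decision tree model, and every function name are hypothetical choices.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier


def make_synthetic_data(n_samples=200, n_features=8, seed=0):
    """Build a toy stand-in for the dataset (NOT the real horse colic data)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_features))
    # Hypothetical rule: the label depends on the first two columns only.
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    # Blank out roughly 10% of the values to mimic missing measurements.
    X[rng.random(X.shape) < 0.1] = np.nan
    return X, y


def build_pipeline(k_features=4):
    """Chain preprocessing, feature selection, and a classifier."""
    return Pipeline([
        # Preprocessing: fill missing values with the column median.
        ("impute", SimpleImputer(strategy="median")),
        # Feature selection: keep the k features with the highest ANOVA F-score.
        ("select", SelectKBest(score_func=f_classif, k=k_features)),
        # Model: a shallow decision tree (an assumed choice for this sketch).
        ("model", DecisionTreeClassifier(max_depth=3, random_state=0)),
    ])


def run_sketch():
    """Fit on a training split and return accuracy on the held-out split."""
    X, y = make_synthetic_data()
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0, stratify=y
    )
    pipeline = build_pipeline()
    pipeline.fit(X_train, y_train)
    return accuracy_score(y_test, pipeline.predict(X_test))


if __name__ == "__main__":
    print(f"Held-out accuracy: {run_sketch():.2f}")
```

Wrapping the steps in a `Pipeline` means the imputer and the feature selector are fitted on the training split only, which keeps information from the test split out of the evaluation.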
- dataset/: This directory contains the dataset files used in the project.
- report/: This directory contains the project report, including documentation, analysis results, and visualizations.
- src/: This directory contains the source code for data preprocessing, model training, and evaluation, including Jupyter notebooks and Python scripts.
- requirements.txt: This file lists all dependencies required to run the project.
- Python
- NumPy
- Pandas
- Scikit-learn
- graphviz
- Matplotlib
- Seaborn
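
To confirm that the libraries listed above are available in the current environment, a short check along these lines may help. It is a minimal sketch using only the standard library; `REQUIRED_MODULES` and `find_missing` are hypothetical names introduced here and are not part of the repository.

```python
from importlib.util import find_spec

# Display name -> import name. Scikit-learn is imported as "sklearn".
REQUIRED_MODULES = {
    "NumPy": "numpy",
    "Pandas": "pandas",
    "Scikit-learn": "sklearn",
    "graphviz": "graphviz",
    "Matplotlib": "matplotlib",
    "Seaborn": "seaborn",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the display names of libraries whose module cannot be found."""
    return [name for name, module in modules.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All listed libraries are importable.")
```

Note that the `graphviz` Python package is only a wrapper: rendering graphs also requires the Graphviz system binaries, which this check does not detect.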
- Clone the repository:
git clone https://github.com/Faridghr/horse-survival-data-mining.git
- Navigate to the project directory:
cd horse-survival-data-mining
- Install dependencies:
pip install -r requirements.txt
- Run the script to preprocess data, train models, and evaluate performance.
This dataset was originally published in the UCI Machine Learning Repository.