This project classifies asteroids from their physical and orbital characteristics using a machine learning model. The dataset is cleaned, preprocessed, and analyzed to find patterns that improve classification accuracy.
- Loads and preprocesses asteroid data, removing redundant and irrelevant features.
- Encodes categorical features like NEO (Near-Earth Object) and PHA (Potentially Hazardous Asteroid).
- Implements feature engineering to refine data representation.
- Trains a classification model to predict asteroid types.
- Evaluates model performance using various metrics.
- Python (Primary language)
- Scikit-learn (Machine Learning models)
- Pandas & NumPy (Data handling)
- Matplotlib & Seaborn (Data visualization)
- Jupyter Notebook (Exploratory data analysis)
- Removed unnecessary attributes like `id`, `name`, and redundant time representations (`epoch_mjd`, `epoch_cal`).
- Eliminated missing values and irrelevant attributes.
- Unified redundant features (`per`, `per_y`) that had different scales.
- Encoded categorical variables (`neo`, `pha`, `class`) into numerical representations.
- Computed missing `neo` values using perihelion distance (`q`).
- Ensured feature consistency and normalization for model training.
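
The preprocessing steps above could look roughly like the following. This is a minimal sketch, not the project's actual code: the `preprocess` function name, the `"Y"`/`"N"` encoding of the `neo` and `pha` flags, and the assumption that `per` is in days while `per_y` is in years are all illustrative guesses. The 1.3 au cutoff is the standard definition of a Near-Earth Object.

```python
import numpy as np
import pandas as pd

# Standard NEO definition: perihelion distance q below 1.3 au.
NEO_Q_MAX_AU = 1.3


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleaning pipeline mirroring the steps listed above."""
    df = df.copy()

    # Drop identifiers and redundant epoch representations.
    df = df.drop(columns=["id", "name", "epoch_mjd", "epoch_cal"], errors="ignore")

    # Keep a single orbital-period column.
    # Assumption: `per` is in days and `per_y` is in years.
    if "per" in df.columns and "per_y" in df.columns:
        df["per_y"] = df["per_y"].fillna(df["per"] / 365.25)
        df = df.drop(columns=["per"])

    # Derive missing `neo` flags from perihelion distance where q is known.
    missing = df["neo"].isna() & df["q"].notna()
    if missing.any():
        df.loc[missing, "neo"] = np.where(
            df.loc[missing, "q"] < NEO_Q_MAX_AU, "Y", "N"
        )

    # Remove rows that still contain missing values.
    df = df.dropna()

    # Encode flags as 0/1 and the class label as integer codes.
    # Assumption: flags are stored as "Y"/"N" strings.
    flag_map = {"Y": 1, "N": 0}
    df["neo"] = df["neo"].map(flag_map).astype(int)
    df["pha"] = df["pha"].map(flag_map).astype(int)
    df["class"] = df["class"].astype("category").cat.codes

    return df
```

Deriving `neo` before dropping incomplete rows keeps records whose only gap was the flag itself; rows that lack both `neo` and `q` are still removed.
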
- Applied Supervised Learning techniques to classify asteroids.
- Split dataset into training and testing sets.
- Evaluated model performance using accuracy, precision, recall, and F1-score.
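
The training and evaluation steps can be sketched with scikit-learn as shown below. The README does not name the classifier, so `RandomForestClassifier` is a stand-in assumption, as are the `train_and_evaluate` helper, the 80/20 split, and macro averaging of the per-class metrics. The input is assumed to be a fully numeric DataFrame with the label in a `class` column.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split


def train_and_evaluate(df: pd.DataFrame, target: str = "class",
                       test_size: float = 0.2, seed: int = 42):
    """Hypothetical helper: split, fit a classifier, and report metrics."""
    X = df.drop(columns=[target])
    y = df[target]

    # Stratify so class proportions match in both splits.
    # Note: this requires at least two samples of every class.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )

    # Assumption: a random forest stands in for the unspecified model.
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    # Macro averaging weights every class equally, which matters when
    # some asteroid classes are much rarer than others.
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="macro", zero_division=0
    )
    metrics = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
    return model, metrics
```

Calling `model, metrics = train_and_evaluate(clean_df)` on a preprocessed DataFrame returns the fitted model and a dictionary with the four scores. Very rare classes may need to be merged or dropped before a stratified split will succeed.
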
- Successfully classified asteroids based on orbital and physical properties.
- Encoding `neo` based on perihelion distance (`q`) improved classification accuracy.
- Some features were redundant, and removing them improved model efficiency.