Humanity has forgotten how fragile life is. It would be wise to keep an eye on the starry abyss and determine whether a near-Earth object has the potential to be hazardous.

brandontnavarrete/nasa-neow-python

The Great Space Race to Save the Earth

Python Pandas NumPy Matplotlib seaborn sklearn SciPy Space

A Near Earth Object Classification project by Brandon Navarrete

🌎 Goal

Humanity has forgotten how fragile life is. It would be wise to keep an eye on the starry abyss and determine whether a near-Earth object has the potential to be hazardous. With the world's climate ever changing, we have lost sight of protecting the one resource we all share: EARTH.

  • Here I will develop a model that can classify hazardous asteroids given their respective features of diameter, magnitude, and velocity.

  • This will be presented in a report that is easy for any viewer to read and interpret.

🗺️ Data Overview

  • This data was pulled from Kaggle (2023), which in turn sourced it from NASA's API

  • 90,836 rows, each its own object or asteroid, with 10 columns of features

Initial Questions

  • How many of our objects are inert?

  • Will diameter make a big difference in determining hazard status?

  • Will relative velocity make a big difference in determining hazard status?

  • Will absolute magnitude make a big difference in determining hazard status?

📂 Data Dictionary

| Variable | Value | Meaning |
|---|---|---|
| ID | Numerical | Unique identifier for each asteroid |
| Name | String | Name given by NASA |
| est_diameter_min | Float | Minimum estimated diameter in kilometers |
| est_diameter_max | Float | Maximum estimated diameter in kilometers |
| relative_velocity | Float | Velocity relative to Earth |
| orbiting_body | String | Earth |
| sentry_object | False | Included in Sentry, an automated collision monitoring system |
| absolute_magnitude | Float | Describes intrinsic luminosity |
| Hazardous | Boolean | Shows whether the asteroid is harmful or not |

Project Plan / Process

1️⃣ Data Acquisition

Gather data from the Kaggle database
  • Import the CSV into local files

  • Read/create the data dictionary and extract meaningful columns

acquire.py
  • Create acquire.py with a user-defined function to import data from the CSV
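
A minimal sketch of what such an acquire function might look like. The function name `get_neo_data` and the default filename `neo.csv` are assumptions for illustration; they are not taken from the repository.

```python
from pathlib import Path

import pandas as pd


def get_neo_data(csv_path="neo.csv"):
    """Read the near-Earth object CSV into a DataFrame.

    The default path is a placeholder (an assumption); point it at
    wherever the Kaggle CSV was saved locally.
    """
    path = Path(csv_path)
    if not path.exists():
        # Fail early with a clear message instead of a long pandas traceback.
        raise FileNotFoundError(f"CSV not found: {path}")
    return pd.read_csv(path)
```

Calling `get_neo_data("path/to/file.csv")` returns the raw data, ready for the preparation step.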

2️⃣ Data Preparation

Data Cleaning
  • Missing values:

    • No missing values in the Kaggle dataset
  • Outliers:

    • Outliers were kept
  • Dropped columns:

    • The id, name, orbiting_body, and sentry_object columns were dropped because they carry no useful information.
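
The cleaning step above could be sketched as follows. The function name `prep_neo_data` is hypothetical, and the column names are assumed to match the data dictionary (with `sentry_object` assumed to be the sentry column).

```python
import pandas as pd

# Assumed column names, taken from the data dictionary; adjust if the CSV differs.
COLUMNS_TO_DROP = ["id", "name", "orbiting_body", "sentry_object"]


def prep_neo_data(df):
    """Return a new DataFrame without the identifier and constant columns."""
    # errors="ignore" skips any listed column that is not present.
    return df.drop(columns=COLUMNS_TO_DROP, errors="ignore")
```
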
Data Splitting
  • Create a function to split the data into train, validate, and test sets

  • Call the function and store the 3 data samples as separate DataFrames
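
One way the splitting function might look, as a sketch. The name `split_data`, the target column name `hazardous`, the split proportions, and the random seed are all assumptions; the README does not state them. Stratifying on the target keeps the share of hazardous objects similar in every split.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def split_data(df, target="hazardous", seed=42):
    """Split df into train, validate, and test DataFrames.

    Assumed proportions: 20% test, then 30% of the remainder as validate.
    """
    train_validate, test = train_test_split(
        df, test_size=0.2, random_state=seed, stratify=df[target]
    )
    train, validate = train_test_split(
        train_validate,
        test_size=0.3,
        random_state=seed,
        stratify=train_validate[target],
    )
    return train, validate, test
```
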

3️⃣ Exploratory Analysis

  • Ask questions to find what are the key features that are associated with hazard status

  • Explore each feature's correlation with status

  • Use visualizations to better understand the relationship between features

4️⃣ Statistical Testing & Modeling

  • Conduct Mann-Whitney U tests

  • Draw conclusions on the hypotheses and address the initial questions
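
A sketch of how one such test could be run with SciPy, comparing a feature between hazardous and non-hazardous objects. The helper name `compare_by_hazard`, the target column name, the two-sided alternative, and the 0.05 significance level are assumptions, not details from the project.

```python
import pandas as pd
from scipy.stats import mannwhitneyu


def compare_by_hazard(train, feature, target="hazardous", alpha=0.05):
    """Two-sided Mann-Whitney U test of `feature` between hazard groups.

    Assumes `target` is a boolean column.
    """
    is_hazardous = train[target].astype(bool)
    hazardous = train.loc[is_hazardous, feature]
    inert = train.loc[~is_hazardous, feature]
    stat, p_value = mannwhitneyu(hazardous, inert, alternative="two-sided")
    # The last value says whether the null hypothesis is rejected at alpha.
    return stat, p_value, bool(p_value < alpha)
```

Running it once per feature (diameter, relative velocity, absolute magnitude) addresses each of the initial questions in turn.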

5️⃣ Modeling Evaluation

  • Find the number of features that generates the highest performance (recall)

  • Build an XGBoost model; fit and transform the train dataset into features

  • Pick the model with the highest accuracy and evaluate it on the test dataset
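
The evaluation loop can be sketched with a small helper that works for any classifier exposing `fit` and `predict`. The name `fit_and_score` is hypothetical, and no hyperparameters are shown because the README does not state them.

```python
from sklearn.metrics import accuracy_score, recall_score


def fit_and_score(model, X_train, y_train, X_eval, y_eval):
    """Fit model on the train split, then score it on an evaluation split."""
    model.fit(X_train, y_train)
    predictions = model.predict(X_eval)
    return {
        # Recall: share of truly hazardous objects the model flags.
        "recall": recall_score(y_eval, predictions),
        # Accuracy: share of all objects classified correctly.
        "accuracy": accuracy_score(y_eval, predictions),
    }
```

With the xgboost package installed, `xgboost.XGBClassifier()` can be passed as `model`, since it follows the same fit/predict interface. Candidate models would be compared on the validate split, with the test split reserved for the final pick.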

🏅 Key Findings

  • About 10% of the data was classified as hazardous
  • All 3 features above show promise in determining hazard status
  • The best performing model was XGBoost, which was able to detect 98% of hazardous asteroids

Recommendation

This model finds a high percentage of the hazardous asteroids (high recall) at the cost of low accuracy, due to false positives

  • This model should be used UNTIL a better model is developed
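
To make the trade-off concrete, here is a small worked example. The confusion-matrix counts are invented purely for illustration and are not the project's results.

```python
def recall_and_accuracy(tp, fp, fn, tn):
    """Compute recall and accuracy from confusion-matrix counts."""
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return recall, accuracy


# Hypothetical counts: 100 objects, 10 of them hazardous.
# The model catches 9 of the 10, but also raises 30 false alarms.
recall, accuracy = recall_and_accuracy(tp=9, fp=30, fn=1, tn=60)
print(f"recall={recall:.2f}, accuracy={accuracy:.2f}")
# prints "recall=0.90, accuracy=0.69"
```

False alarms lower accuracy but leave recall untouched, which is why a model tuned to catch hazardous objects can look weak on accuracy alone.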

Next Steps

  • Use the API to gather more relevant features and try to increase the hazardous object capture rate.

  • Combine with image recognition and try to automate the process for 24/7 observation and protection

Steps To Clone:

  1. Clone this repo
  2. Import NASA's CSV
  3. Run Notebook

    Some dependencies, such as 'xgboost', may need to be installed
