Exploratory data analysis of the quasar candidates catalog by Richards et al., ApJS 219 (2015). The analysis contains thorough data cleaning and feature engineering, exploration of important correlations between features, and significance testing of statistical hypotheses. After the analysis, the data are ready for investigation with regression and classification algorithms. Ideas for specific next steps in the analysis are provided.
quasars-report.pdf contains the full report of the analysis. Check it out!
quasars_eda.ipynb is the Jupyter (IPython) notebook where the whole analysis was performed.
cand.dat is the raw catalog on which the analysis is based (accessed remotely because it is a large file).
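
Since cand.dat is accessed remotely, a quick way to peek at it without downloading the whole file is to stream only the first few rows. The snippet below is a minimal sketch and not the code used in the notebook: `CATALOG_URL` is a hypothetical placeholder, the helper names `read_rows` and `preview_remote` are made up for illustration, and the rows are assumed to be whitespace-delimited. If the catalog uses fixed-width columns, split by the column positions documented with the catalog instead.

```python
import itertools
import urllib.request

# Hypothetical placeholder: substitute the real location of cand.dat.
CATALOG_URL = "https://example.org/path/to/cand.dat"


def read_rows(lines, max_rows=None):
    """Return an iterator over whitespace-split fields of non-empty lines.

    Assumes whitespace-delimited rows; stops after max_rows rows if given.
    """
    rows = (line.split() for line in lines if line.strip())
    return itertools.islice(rows, max_rows)


def preview_remote(url=CATALOG_URL, max_rows=5):
    """Stream only the first rows of a remote text file."""
    with urllib.request.urlopen(url) as response:
        # Decode line by line so that only the needed part is read.
        decoded = (raw.decode("ascii", errors="replace") for raw in response)
        return list(read_rows(decoded, max_rows))
```

Calling `preview_remote(real_url, max_rows=10)` with the actual catalog address would return the first ten rows as lists of strings, which is usually enough to check the column layout before loading the full file.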