The code is separated into several notebooks:

- `eda_before_preprocessing.ipynb` contains the exploratory data analysis (EDA).
- `data_preprocessing.ipynb` preprocesses the data.
- `model_testing.ipynb` contains the models tried during and after EDA to find a suitable one.
- `LIME.ipynb` contains the LIME explanations.
- `SHAP.ipynb` contains the SHAP explanations.
- `PDP_ICE.ipynb` contains the PDP/ICE explanations.

This study uses the Predict Students' Dropout and Academic Success dataset from the UC Irvine Machine Learning Repository.
The dataset is located in the `data` directory of this repository: `data.csv` is the original data and `formatted_data` is the preprocessed data.
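
As a quick sanity check before opening the notebooks, the original file can be inspected with the Python standard library alone. This is a minimal sketch, not code taken from the notebooks: the helper name `summarize_csv` is hypothetical, and it assumes that `data/data.csv` has a header row, that its delimiter is either `;` or `,`, and that the label column is named `Target`.

```python
import csv
from collections import Counter
from pathlib import Path


def summarize_csv(lines, target_column="Target"):
    """Return (column_names, class_counts) for CSV text given as an iterable of lines.

    Hypothetical helper for illustration; not part of the repository.
    """
    lines = iter(lines)
    header = next(lines)
    # Assumption: the delimiter is ';' or ','; pick whichever is more frequent in the header.
    delimiter = ";" if header.count(";") > header.count(",") else ","
    columns = [name.strip() for name in next(csv.reader([header], delimiter=delimiter))]
    target_index = columns.index(target_column)
    counts = Counter()
    for row in csv.reader(lines, delimiter=delimiter):
        if row:  # skip blank lines
            counts[row[target_index].strip()] += 1
    return columns, counts


if __name__ == "__main__":
    path = Path("data") / "data.csv"  # location described above
    if path.exists():
        # utf-8-sig also reads plain UTF-8 and drops a byte-order mark if present.
        with path.open(newline="", encoding="utf-8-sig") as handle:
            columns, counts = summarize_csv(handle)
        print(len(columns), "columns")
        print(dict(counts))
```

Run from the repository root, this prints the number of columns and how many rows fall into each target class, which is a quick way to confirm the file was read with the right delimiter.
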
| Variable Name | Role | Type | Demographic | Description | Units | Missing Values |
|---|---|---|---|---|---|---|
| Marital Status | Feature | Integer | Marital Status | 1 – single; 2 – married; 3 – widower; 4 – divorced; 5 – facto union; 6 – legally separated | | no |
| Application mode | Feature | Integer | | 1 - 1st phase - general contingent; 2 - Ordinance No. 612/93; ... | | no |
| Application order | Feature | Integer | | Application order (between 0 - first choice; and 9 last choice) | | no |
| Course | Feature | Integer | | 33 - Biofuel Production Technologies; 171 - Animation and Multimedia Design; ... | | no |
| Daytime/evening attendance | Feature | Integer | | 1 – daytime; 0 - evening | | no |
| Previous qualification | Feature | Integer | Education Level | 1 - Secondary education; 2 - Higher education - bachelor's degree; ... | | no |
| Previous qualification (grade) | Feature | Continuous | | Grade of previous qualification (between 0 and 200) | | no |
| Nationality | Feature | Integer | Nationality | 1 - Portuguese; 2 - German; 6 - Spanish; ... | | no |
| Mother's qualification | Feature | Integer | Education Level | 1 - Secondary Education - 12th Year of Schooling or Eq.; ... | | no |
| Father's qualification | Feature | Integer | Education Level | 1 - Secondary Education - 12th Year of Schooling or Eq.; ... | | no |
| Mother's occupation | Feature | Integer | Occupation | 0 - Student; 1 - Representatives of the Legislative Power and Executive Bodies, Directors, Directors and Executive Managers; ... | | no |
| Father's occupation | Feature | Integer | Occupation | 0 - Student; 1 - Representatives of the Legislative Power and Executive Bodies, Directors, Directors and Executive Managers; ... | | no |
| Admission grade | Feature | Continuous | | Admission grade (between 0 and 200) | | no |
| Displaced | Feature | Integer | | 1 – yes; 0 – no | | no |
| Educational special needs | Feature | Integer | | 1 – yes; 0 – no | | no |
| Debtor | Feature | Integer | | 1 – yes; 0 – no | | no |
| Tuition fees up to date | Feature | Integer | | 1 – yes; 0 – no | | no |
| Gender | Feature | Integer | Gender | 1 – male; 0 – female | | no |
| Scholarship holder | Feature | Integer | | 1 – yes; 0 – no | | no |
| Age at enrollment | Feature | Integer | Age | Age of student at enrollment | | no |
| International | Feature | Integer | | 1 – yes; 0 – no | | no |
| Curricular units 1st sem (credited) | Feature | Integer | | Number of curricular units credited in the 1st semester | | no |
| Curricular units 1st sem (enrolled) | Feature | Integer | | Number of curricular units enrolled in the 1st semester | | no |
| Curricular units 1st sem (evaluations) | Feature | Integer | | Number of evaluations to curricular units in the 1st semester | | no |
| Curricular units 1st sem (approved) | Feature | Integer | | Number of curricular units approved in the 1st semester | | no |
| Curricular units 1st sem (grade) | Feature | Integer | | Grade average in the 1st semester (between 0 and 20) | | no |
| Curricular units 1st sem (without evaluations) | Feature | Integer | | Number of curricular units without evaluations in the 1st semester | | no |
| Curricular units 2nd sem (credited) | Feature | Integer | | Number of curricular units credited in the 2nd semester | | no |
| Curricular units 2nd sem (enrolled) | Feature | Integer | | Number of curricular units enrolled in the 2nd semester | | no |
| Curricular units 2nd sem (evaluations) | Feature | Integer | | Number of evaluations to curricular units in the 2nd semester | | no |
| Curricular units 2nd sem (approved) | Feature | Integer | | Number of curricular units approved in the 2nd semester | | no |
| Curricular units 2nd sem (grade) | Feature | Integer | | Grade average in the 2nd semester (between 0 and 20) | | no |
| Curricular units 2nd sem (without evaluations) | Feature | Integer | | Number of curricular units without evaluations in the 2nd semester | | no |
| Unemployment rate | Feature | Continuous | | Unemployment rate (%) | | no |
| Inflation rate | Feature | Continuous | | Inflation rate (%) | | no |
| GDP | Feature | Continuous | | GDP | | no |
| Target | Target | Categorical | | Target. The problem is formulated as a three category classification task (dropout, enrolled, and graduate) at the end of the normal duration of the course | | no |
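
Because most features are stored as integer codes, it can help to translate them back into labels when reading plots or explanations. The following is a minimal sketch under stated assumptions: `CODEBOOK` and `decode_row` are hypothetical names, not part of the notebooks, and the keys assume that the CSV column names match the "Variable Name" entries in the table above. Only columns whose code lists are given in full in the table are covered; lists truncated with "..." are deliberately left out.

```python
# Code-to-label mappings copied from the table above.
YES_NO = {1: "yes", 0: "no"}

CODEBOOK = {
    "Marital Status": {
        1: "single",
        2: "married",
        3: "widower",
        4: "divorced",
        5: "facto union",
        6: "legally separated",
    },
    "Daytime/evening attendance": {1: "daytime", 0: "evening"},
    "Gender": {1: "male", 0: "female"},
    "Displaced": YES_NO,
    "Educational special needs": YES_NO,
    "Debtor": YES_NO,
    "Tuition fees up to date": YES_NO,
    "Scholarship holder": YES_NO,
    "International": YES_NO,
}


def decode_row(row):
    """Return a copy of `row` with known integer codes replaced by their labels.

    Columns without a codebook entry, and codes missing from the codebook,
    are left unchanged.
    """
    decoded = dict(row)
    for column, labels in CODEBOOK.items():
        if column in decoded:
            code = int(decoded[column])  # accepts 2 as well as "2"
            decoded[column] = labels.get(code, decoded[column])
    return decoded


example = {"Marital Status": "2", "Gender": 0, "Admission grade": "127.3"}
print(decode_row(example))
```

Keeping the mapping in one dictionary means the same labels can be reused for axis ticks or feature descriptions without editing the underlying data.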