-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy path270_GAM.Rmd
27 lines (14 loc) · 1.27 KB
/
270_GAM.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
<h2><a name="27_GAM" style="pointer-events: none; cursor: default;" ><font color="black"> 2.7. Generalised Additive Model</font></a></h2>
The GAMs are additive models that enable to overcome the linearity constraint trough the introduction of more flexible functions of the regressors. In the case of binary classification, we can write such a model as follows:
$$\log{ \biggl( \frac{P(Y=1|X)}{1-P(Y=1|X) } \biggr)}=\alpha + \sum_{k=1}^p{f_k(X)}$$
As far as our application is concerned, we consider smoothing splines as transformations of the input features. The optimal degrees of freedom for these functions are selected by 10-fold cross-validation, with a grid search in the range $[1,10]$. The best model turns out to be the one with 7 degrees of freedom, achieving an accuracy of 0.7735 e and an AUC equal to 0.8305.
```{r, fig.width=9.5, fig.height=4, echo=FALSE}
file = "results/MODELING/CLASSIFICATION/ply_val_gam.Rdata"
load( file )
plot
```
<br></br>
<br></br>
Nonetheless, the improvement introduced by this greater complexity is minimal, so it may be worth simplifying the model in favour of a more parsimonious one. For this reason, one may wish to select $df=4$ since it corresponds to a natural cubic spline and it is subject to lower variability.
<br></br>
<br></br>