import numpy as np # needed for np.nan below (if not already imported at the top of the module)
import scipy as scp
import math
from ...models import GAMM
from .utils import correct_VB
import warnings

def compare_CDL(model1:GAMM,
                model2:GAMM,
                correct_V:bool=True,
                correct_t1:bool=True,
                perform_GLRT:bool=True,
                lR=20,
                nR=5,
                n_c=10,
                alpha=0.05,
                grid='JJJ',
                verbose=False):

    """
    (Optionally) performs an approximate GLRT on twice the difference in unpenalized likelihood between model1 and model2 (see Wood, 2017). Also computes the AIC difference (see Wood et al., 2016).
    For the GLRT to be appropriate, model1 should be set to the model containing more effects and model2 should be a nested, simpler variant of model1.

    For the degrees of freedom of the test, the expected degrees of freedom (EDF) of each model are used (i.e., this is the conditional test discussed in Wood, 2017: 6.12.4).
    The difference in EDF between the models serves as the DoF for computing the Chi-square statistic. Similarly, for each model 2*edf is added to twice the negative (conditional) likelihood to
    compute the AIC (see Wood et al., 2016).
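
    Concretely (mirroring the computation performed further below)::

        stat  = 2 * (llk1 - llk2)          # compared against a Chi-square distribution with DOF1 - DOF2 degrees of freedom
        aic_m = -2 * llk_m + 2 * edf_m     # AIC of model m, with llk_m its unpenalized (conditional) likelihood and edf_m its EDF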

    By default (``correct_V=True``), ``mssm`` will attempt to correct the edf for uncertainty in the estimated \lambda parameters. This requires computing a costly
    correction (see Greven & Scheipl, 2016 and the ``correct_VB`` function in the utils module), which will take quite some time for reasonably large models with more than 3-4 smoothing parameters.
    In that case, relying on CIs and penalty-based comparisons might be preferable (see Marra & Wood, 2011 for details on the latter).

    In case ``correct_t1=True`` and ``correct_V=True``, the EDF used for the GLRT will be set to the smoothness-uncertainty-corrected and smoothness-bias-corrected expected degrees of freedom (t1 in section 6.1.2 of Wood, 2017),
    following the recommendation given in section 6.12.4 of Wood (2017). The AIC (Wood et al., 2016) of both models will still be based on the regular (smoothness-uncertainty-corrected) edf.
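
    Roughly (a sketch only; the exact computation, including the correction for uncertainty in \lambda, is handled by ``correct_VB``): with ``F = (X.T@X + S_\lambda)^{-1} @ X.T @ X``, the edf is ``t = F.trace()``
    and the bias-corrected variant used for the GLRT is ``t1 = 2*t - (F@F).trace()`` (Wood, 2017: 6.1.2).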

    The computation here is different to the one performed by the ``compareML`` function in the R-package ``itsadug``, which rather performs a version of the marginal GLRT
    (also discussed in Wood, 2017: 6.12.4). The p-value is approximate - very **very** much so if ``correct_V=False`` - and the test should not be used to compare models differing in their random effect structures
    (see Wood, 2017: 6.12.4).
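
    Example (a minimal sketch; assumes ``model1`` and ``model2`` are fitted ``GAMM`` models, with ``model2`` a nested, simpler variant of ``model1``, and that ``compare_CDL`` has been imported)::

        result = compare_CDL(model1, model2, correct_V=True, correct_t1=True, perform_GLRT=True)

        print(result["p"], result["H1"])   # approximate GLRT p-value and whether the NULL is rejected at alpha
        print(result["aic_diff"])          # aic1 - aic2; negative values favor model1

        # If the lambda-uncertainty correction is too costly, it can be skipped:
        result = compare_CDL(model1, model2, correct_V=False, correct_t1=False, perform_GLRT=False)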

    References:
     - Marra, G., & Wood, S. N. (2011). Practical variable selection for generalized additive models.
     - Wood, S. N., Pya, N., & Saefken, B. (2016). Smoothing Parameter and Model Selection for General Smooth Models.
     - Greven, S., & Scheipl, F. (2016). Comment on: Smoothing Parameter and Model Selection for General Smooth Models.
     - Wood, S. N. (2017). Generalized Additive Models: An Introduction with R, Second Edition (2nd ed.).
     - ``compareML`` function from ``itsadug`` R-package: https://rdrr.io/cran/itsadug/man/compareML.html
     - ``anova.gam`` function from ``mgcv``, see: https://www.rdocumentation.org/packages/mgcv/versions/1.9-1/topics/anova.gam
    """

    if type(model1.family) != type(model2.family):
        raise ValueError("Both models should be estimated using the same family.")

    if perform_GLRT and model1.formula.n_coef < model2.formula.n_coef:
        raise ValueError("For the GLRT, model1 needs to be set to the more complex model (i.e., needs to have more coefficients than model2).")

    # Collect total DOF, correcting for uncertainty in \lambda via the correction proposed by Greven & Scheipl (2016)
    if correct_V:
        if verbose:
            print("Correcting for uncertainty in lambda estimates...\n")
        _,_,DOF1,DOF12 = correct_VB(model1,nR=nR,lR=lR,n_c=n_c,form_t1=correct_t1,grid_type=grid,verbose=verbose)
        _,_,DOF2,DOF22 = correct_VB(model2,nR=nR,lR=lR,n_c=n_c,form_t1=correct_t1,grid_type=grid,verbose=verbose)

        if correct_t1:
            # Section 6.12.4 of Wood (2017) suggests replacing t (edf) with t1 (2*t - (F@F).trace()) with F=(X.T@X+S_\lambda)^{-1}@X.T@X for the GLRT - with the latter also being corrected for
            # uncertainty in lambda. However, Wood et al. (2016) suggest that the AIC should be computed based on t - so some book-keeping is necessary.
            aic_DOF1 = DOF1
            aic_DOF2 = DOF2
            DOF1 = DOF12
            DOF2 = DOF22

    else:
        DOF1 = model1.edf
        DOF2 = model2.edf

    # Compute the un-penalized likelihood based on the scale estimate of the more complex model (in terms of edf - so actually the more complex one), if a scale was estimated (see section 3.1.4, Wood, 2017).
    ext_scale = None
    if model1.family.twopar:
        if DOF1 > DOF2:
            _,ext_scale = model1.get_pars()
        else:
            _,ext_scale = model2.get_pars()

    llk1 = model1.get_llk(penalized=False,ext_scale=ext_scale)
    llk2 = model2.get_llk(penalized=False,ext_scale=ext_scale)

    # Compute Chi-square statistic...
    stat = 2 * (llk1 - llk2)
    test_stat = stat

    # ... and degrees of freedom under NULL (see Wood, 2017)
    DOF_diff = DOF1-DOF2
    test_DOF_diff = abs(DOF_diff)

    # Multiple scenarios that this test needs to cover...
    # 1) LLK1 < LLK2, DOF1 < DOF2; This is a valid test, essentially model2 turns out to be the more complicated one.
    # 2) LLK1 < LLK2, DOF1 > DOF2; This makes no sense. Model 1 - the more complex one - has a worse llk but more DOF.
    # 3) LLK1 > LLK2, DOF1 < DOF2; Notationally correct: model1 should after all be more complex. But in terms of edf it makes little sense (as pointed out by Wood, 2017).
    # 4) LLK1 > LLK2, DOF1 > DOF2; Valid, the inverse of case 1.

    # Personally, I think cases 2 & 3 should both return NAs for p-values... But anova.gam from mgcv returns a p-value for case 3, so we will do the same here
    # and just raise a warning. For case 1, we need to take -1*test_stat.
    if llk1 < llk2 and DOF1 < DOF2:
        test_stat = -1*test_stat

    # Compute p-value under the reference distribution.
    if not perform_GLRT or test_stat < 0: # Correct for the aforementioned possibility 2: model 1 has lower llk and higher edf.
        H1 = np.nan
        p = np.nan
    else:
        if llk1 > llk2 and DOF1 < DOF2:
            warnings.warn("Model with more coefficients has higher likelihood but lower expected degrees of freedom. Interpret results with caution.")

        p = 1 - scp.stats.chi2.cdf(test_stat,test_DOF_diff)

        # Reject NULL?
        H1 = p <= alpha

    # Also correct the AIC for the GAM (see Wood et al., 2016)
    if correct_V and correct_t1: # aic_DOF1/aic_DOF2 only exist if the \lambda-uncertainty correction was performed
        aic1 = -2*llk1 + 2*aic_DOF1
        aic2 = -2*llk2 + 2*aic_DOF2
    else:
        aic1 = -2*llk1 + 2*DOF1
        aic2 = -2*llk2 + 2*DOF2

    result = {"H1":H1,
              "p":p,
              "chi^2":stat,
              "DOF1":DOF1,
              "DOF2":DOF2,
              "Res. DOF":DOF_diff,
              "aic1":aic1,
              "aic2":aic2,
              "aic_diff":aic1-aic2}

    return result