Reimplement the following metrics, plus normalized log-likelihood, number of parameters, and maybe AIC or BIC for counts. But we've argued that these need to be normalized.
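A rough sketch of what the normalized versions could look like. Two assumptions here are not from the note: that "counts" means count data modeled with a Poisson likelihood, and that "normalized" means dividing by the number of observations. The helper names (`poisson_log_likelihood`, `summarize`) are hypothetical and only for illustration.

```python
import math


def poisson_log_likelihood(counts, rates):
    """Total log-likelihood of counts under per-observation Poisson rates.

    Assumption: a Poisson model for count data; rates must be > 0.
    """
    return sum(
        k * math.log(r) - r - math.lgamma(k + 1)  # lgamma(k + 1) == log(k!)
        for k, r in zip(counts, rates)
    )


def aic(log_lik, n_params):
    """Standard AIC: 2k - 2 log L."""
    return 2 * n_params - 2 * log_lik


def bic(log_lik, n_params, n_obs):
    """Standard BIC: k log(n) - 2 log L."""
    return n_params * math.log(n_obs) - 2 * log_lik


def summarize(counts, rates, n_params):
    """Report each metric divided by the number of observations.

    Assumption: "normalized" means per-observation.
    """
    n_obs = len(counts)
    log_lik = poisson_log_likelihood(counts, rates)
    return {
        "n_params": n_params,
        "log_lik_per_obs": log_lik / n_obs,
        "aic_per_obs": aic(log_lik, n_params) / n_obs,
        "bic_per_obs": bic(log_lik, n_params, n_obs) / n_obs,
    }


# Toy data: a constant-rate model with the rate set to the sample mean.
counts = [0, 1, 2, 3]
rate = sum(counts) / len(counts)
report = summarize(counts, [rate] * len(counts), n_params=1)
print(report)
```

Dividing by the number of observations keeps the values comparable across datasets of different sizes, which is presumably the point of normalizing. Other normalizations (per count, per time bin) would only change the divisor in `summarize`.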