Machine Learning: A Probabilistic Perspective
Exercise Solutions

Subderivative of the hinge loss function
Reproducing kernel property
Orthogonal matrices
Eigenvectors by hand
Uncorrelated does not imply independent
Uncorrelated and Gaussian does not imply independent unless {\em jointly Gaussian}
Correlation coefficient is between -1 and +1
Correlation coefficient for linearly related variables is $\pm 1$
Normalization constant for a multidimensional Gaussian
Bivariate Gaussian
Conditioning a bivariate Gaussian
Derivation of information form formulae for marginalizing and conditioning an MVN
Sensor fusion with known variances in 1d
Linear combinations of random variables
Legal reasoning
Expected value of the minimum of two rv's
Probabilities are sensitive to the form of the question that was used to generate the answer
Convolution of two Gaussians is a Gaussian
Variance of a sum
Bayes rule for medical diagnosis
Conditional independence
Pairwise independence does not imply mutual independence
Conditional independence iff joint factorizes
Deriving the inverse gamma density
Normalization constant for a 1D Gaussian
Mean, mode, variance for the beta distribution
MVN in exponential family form
Optimal threshold on classification probability
Reject option in classifiers
More reject options
Newsvendor problem
Bayes factors and ROC curves
Decision rule for trading off FPs and FNs
Posterior median is optimal estimate under L1 loss
Gaussian posterior credible interval
MAP estimation for 1D Gaussians
A mixture of conjugate priors is conjugate
BIC for Gaussians
BIC for a 2d discrete distribution
KL divergence and the number game
Deriving the posterior predictive density for the healthy levels game
Conjugate prior for univariate Gaussian in exponential family form
Laplace approximation to $p(\mu, \log \sigma \mid \mathcal{D})$ for a univariate Gaussian
Pessimism of LOOCV
James-Stein estimator for Gaussian means \matlabex
MLE for the univariate Gaussian
ML estimator $\sigmaSqMle$ is biased
Estimation of $\sigma^2$ when $\mu$ is known
Variance and MSE of estimators for Gaussian variance
Expressing mutual information in terms of entropies
Deriving the decomposition of joint entropy
Relationship between $D(p\|q)$ and the $\chi^2$ statistic
Fun with entropies
Mutual information for correlated normals
A measure of correlation (normalized mutual information)
Conditional mutual information and naive Bayes classifiers
Mutual information between class and binary features
Fayyad-Irani binning
Inference in a simple Bayes net for fish classification
Removing leaves in BN2O networks
Handling negative findings in the QMR network
Variable elimination
Message passing on a tree
Inference in 2D lattice MRFs
Graphcuts for MAP estimation in binary submodular MRFs
Graphcuts for alpha-beta swap
Constant factor optimality for alpha-expansion
Dual decomposition for pose segmentation
ELBO for univariate Gaussians
ELBO for GMMs
Derivation of $\expect{\log \pi_k}$
Alternative derivation of the mean field updates for the Ising model
Forwards vs reverse KL divergence
Derivation of the structured mean field updates for FHMM
Variational EM for binary FA with sigmoid link
Derivation of the EP updates for TrueSkill
Sampling from a Cauchy
Optimal proposal for particle filtering with linear-Gaussian measurement model
Sampling from a truncated beta posterior using MH \matlabex
Gibbs sampling from a 2D Gaussian
Gibbs sampling for a 1D Gaussian mixture model
Gibbs sampling for robust linear regression with a Student likelihood
Gibbs sampling for probit regression
Gibbs sampling for logistic regression with the Student approximation
Dummy encoding and linear models
Multi-output linear regression
Centering and ridge regression
MLE for $\sigma^2$ for linear regression
MLE for the offset term in linear regression
Sufficient statistics for online linear regression
Bayesian linear regression in 1d with known $\sigma^2$
Derivation of the gradient for linear regression with Student likelihood
EM for robust linear regression with a Student likelihood
Gradient and Hessian of log-likelihood for multinomial logistic regression
Symmetric version of $\ell_2$ regularized multinomial logistic regression
Elementary properties of $\ell_2$ regularized logistic regression
Regularizing separate terms in 2d logistic regression
Logistic regression vs LDA/QDA
Add-one smoothing for language models
Spam classification using logistic regression
Spam classification using naive Bayes
Partial derivative of the RSS
EM for ARD
Fixed point iteration for ARD
Reducing elastic net to lasso
Shrinkage in linear regression
Prior for the Bernoulli rate parameter in the spike and slab model
Deriving the E step for the GSM prior
GSM representation of group lasso
Projected gradient descent for $\ell_1$ regularized least squares
Fitting an SVM classifier by hand
Linear separability
Gaussian DAGs vs Gaussian MRFs
I-maps for a DGM
Bayes Ball
Markov blanket for a DGM
Hidden variables in DGMs
Bayes net for a rainy day
Moralization does not introduce new independence statements
Conditional independence properties of GMs
Causal reasoning in the sprinkler network
EM for FA
Heuristic for assessing applicability of PCA
Deriving the second principal component
Deriving the residual error for PCA
Derivation of Fisher's linear discriminant
PCA via successive deflation
PPCA variance terms
Posterior inference in PPCA
Imputation in a FA model
Efficiently evaluating the PPCA density
Two-filter approach to smoothing in HMMs
Derivation of the $Q$ function for an HMM
EM for HMMs with mixture of Gaussian observations
EM for HMMs with tied mixtures
EM for LG-SSM
Seasonal LG-SSM model in standard form