- What is broadcasting in the context of linear algebra?
- What are scalars, vectors, matrices, and tensors?
- What is the Hadamard product of two matrices?
- What is an inverse matrix?
- If the inverse of a matrix exists, how do you calculate it?
- What is the determinant of a square matrix? How is it calculated (Laplace expansion)? How is the determinant related to the eigenvalues?
- Discuss span and linear dependence.
- What is Ax = b? When does Ax = b have a unique solution?
- In Ax = b, what happens when A is fat or tall?
- When does the inverse of A exist?
- What is a norm? What are the L1, L2, and L-infinity norms?
- What are the conditions a norm has to satisfy?
- Why is the squared L2 norm often preferred in ML over the L2 norm itself?
- When is the L1 norm preferred over the L2 norm?
- Can the number of nonzero elements in a vector be defined as an L0 norm? If not, why not?
- What is the Frobenius norm?
- What is a diagonal matrix? (D_i,j = 0 for i != j)
- Why is multiplication by a diagonal matrix computationally cheap? How does the multiplication differ for square vs. non-square diagonal matrices?
- Under what conditions does the inverse of a diagonal matrix exist? (square and all diagonal elements non-zero)
- What is a symmetric matrix? (one that equals its transpose)
- What is a unit vector?
- When are two vectors x and y orthogonal? (x.T * y = 0)
- In R^n, what is the maximum possible number of mutually orthogonal vectors with non-zero norm?
- When are two vectors x and y orthonormal? (x.T * y = 0 and both have unit norm)
- What is an orthogonal matrix? Why is it computationally preferred? (A square matrix whose rows are mutually orthonormal and whose columns are mutually orthonormal.)
- What are eigendecomposition, eigenvectors, and eigenvalues?
- How do you find the eigenvalues of a matrix?
- Write the eigendecomposition formula for a matrix. If the matrix is real symmetric, how will this change?
- Is the eigendecomposition guaranteed to be unique? If not, then how do we represent it?
- What are positive definite, negative definite, positive semidefinite, and negative semidefinite matrices?
- What is SVD? Why do we use it? Why not just use eigendecomposition (ED)?
- Given a matrix A, how will you calculate its SVD?
- What are singular values, left-singular vectors, and right-singular vectors?
- What is the connection between the SVD of A and functions of A?
- Why are singular values always non-negative?
- What is the Moore-Penrose pseudoinverse, and how is it calculated?
- If we apply the Moore-Penrose pseudoinverse to Ax = b, which solution does it provide if A is fat? Which solution does it provide if A is tall?
- Which matrices can be decomposed by ED? (Any NxN square matrix with N linearly independent eigenvectors)
- Which matrices can be decomposed by SVD? (Any matrix; V appears as its conjugate transpose if A is complex, or as its ordinary transpose if A is real)
- What is the trace of a matrix?
- How do you write the Frobenius norm of a matrix A in terms of the trace?
- Why is the trace of a product of matrices invariant to cyclic permutations?
- What is the trace of a scalar?
- What is underflow and overflow?
- How do you tackle underflow or overflow in the softmax or log-softmax function?
- What is poor conditioning?
- What is the condition number?
- What are grad, div and curl?
- What are critical or stationary points in multiple dimensions?
- Why should you do gradient descent when you want to minimize a function?
- What is line search?
- What is hill climbing?
- What is a Jacobian matrix?
- What is curvature?
- What is a Hessian matrix?
- Compare frequentist probability and Bayesian probability.
- What is a random variable?
- What is a probability distribution?
- What is a probability mass function?
- What is a probability density function?
- What is a joint probability distribution?
- What are the conditions for a function to be a probability mass function?
- What are the conditions for a function to be a probability density function?
- What is a marginal probability? Given the joint probability function, how will you calculate it?
- What is conditional probability? Given the joint probability function, how will you calculate it?
- State the chain rule of conditional probabilities.
- What are the conditions for independence and conditional independence of two random variables?
- What are expectation, variance and covariance?
- Compare covariance and independence.
- What is the covariance for a vector of random variables?
- What is a Bernoulli distribution? Calculate the expectation and variance of a random variable that follows a Bernoulli distribution.
- What is a multinoulli distribution?
- What is a normal distribution?
- Why is the normal distribution a default choice for a prior over a set of real numbers?
- What is the central limit theorem?
- What are the exponential and Laplace distributions?
- What are the Dirac distribution and the empirical distribution?
- What is a mixture of distributions?
- Name two common examples of mixture distributions. (Empirical distribution and Gaussian mixture)
- Is a Gaussian mixture model a universal approximator of densities?
- Write the formulae for the logistic and softplus functions.
- Write the formula for Bayes' rule.
- What is meant by measure zero and almost everywhere?
- If two random variables are related in a deterministic way, how are the PDFs related?
- Define self-information. What are its units?
- What are Shannon entropy and differential entropy?
- What is the Kullback-Leibler (KL) divergence?
- Can KL divergence be used as a distance measure?
- Define cross-entropy.
- What are structured probabilistic models or graphical models?
- In the context of structured probabilistic models, what are directed and undirected models? How are they represented? What are cliques in undirected structured probabilistic models?
- What are the population mean and the sample mean?
- What are the population standard deviation and the sample standard deviation?
- Why does the population s.d. have N degrees of freedom while the sample s.d. has N-1? In other words, why is there 1/N inside the root for the population s.d. and 1/(N-1) inside the root for the sample s.d.?
- What is the formula for calculating the s.d. of the sample mean?
- What is a confidence interval?
- What is the standard error?
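
For the closing questions on sample statistics, a minimal sketch may help make the 1/N versus 1/(N-1) distinction concrete. The data values are made up for illustration, and the 1.96 multiplier assumes a normal approximation (for a sample this small a t critical value would usually be more appropriate), so treat this as one possible worked example rather than a reference answer.

```python
import math
import statistics

# Hypothetical observations, chosen so the arithmetic is easy to follow.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
mean = sum(data) / n

# Sum of squared deviations from the mean, shared by both formulas.
squared_devs = sum((x - mean) ** 2 for x in data)

# Population s.d.: divide by N (the data are treated as the whole population).
pop_sd = math.sqrt(squared_devs / n)

# Sample s.d.: divide by N - 1 (the mean was itself estimated from the data).
sample_sd = math.sqrt(squared_devs / (n - 1))

# Standard error of the sample mean: s / sqrt(N).
standard_error = sample_sd / math.sqrt(n)

# Approximate 95% confidence interval for the mean.
# Assumption: normal critical value 1.96, used here only for brevity.
ci_low = mean - 1.96 * standard_error
ci_high = mean + 1.96 * standard_error

print(f"mean={mean:.3f} pop_sd={pop_sd:.3f} sample_sd={sample_sd:.3f}")
print(f"standard_error={standard_error:.3f} ci=({ci_low:.3f}, {ci_high:.3f})")
```

The hand-written formulas agree with `statistics.pstdev` and `statistics.stdev` from the Python standard library, which can be used directly in practice.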