Using LaTeX and MathJax
Joram Soch edited this page Aug 26, 2020
Source code for proofs and definitions in "The Book of Statistical Proofs" uses a combination of Markdown, MathJax and LaTeX. On this page, we collect a set of rules, recommendations and suggestions for applying LaTeX markup to typeset formulas.
- Use `$...$` for in-line math, e.g.

  ```
  Let $X$ be an $n \times 1$ random vector.
  ```

- Use `$$...$$` for stand-alone equations, e.g.

  ```
  $$
  y = Ax + b \sim \mathcal{N}(A\mu + b, A \Sigma A^\mathrm{T}) \; .
  $$
  ```

- Use `$$ \begin{split} ...&... \\ ...&... \end{split} $$` to write multi-line equations, e.g.

  ```
  $$
  \begin{split}
  M_y(t) &= \exp \left[ t^\mathrm{T} b \right] \cdot M_x(At) \\
  &= \exp \left[ t^\mathrm{T} b \right] \cdot \exp \left[ t^\mathrm{T} A \mu + \frac{1}{2} t^\mathrm{T} A \Sigma A^\mathrm{T} t \right] \\
  &= \exp \left[ t^\mathrm{T} \left( A \mu + b \right) + \frac{1}{2} t^\mathrm{T} A \Sigma A^\mathrm{T} t \right] \; .
  \end{split}
  $$
  ```

- Label each stand-alone equation (including split ones) using `\label{eq:XYZ}`, e.g.

  ```
  $$ \label{eq:mvn-pdf}
  f_X(x) = \frac{1}{\sqrt{(2 \pi)^n |\Sigma|}} \cdot \exp \left[ -\frac{1}{2} (x-\mu)^\mathrm{T} \Sigma^{-1} (x-\mu) \right] \; .
  $$
  ```

- You can then reference them in in-line math or other equations using `\eqref{eq:XYZ}`, e.g.

  ```
  $$ \label{eq:y-mgf-s2}
  \begin{split}
  M_y(t) &\overset{\eqref{eq:y-mgf-s1}}{=} \exp \left[ t^\mathrm{T} b \right] \cdot M_x(At) \\
  &\overset{\eqref{eq:mvn-mgf}}{=} \exp \left[ t^\mathrm{T} b \right] \cdot \exp \left[ t^\mathrm{T} A \mu + \frac{1}{2} t^\mathrm{T} A \Sigma A^\mathrm{T} t \right] \\
  &= \exp \left[ t^\mathrm{T} \left( A \mu + b \right) + \frac{1}{2} t^\mathrm{T} A \Sigma A^\mathrm{T} t \right] \; .
  \end{split}
  $$
  ```

- Do not use a vertical bar (`|`) in in-line math, because it will be interpreted as indicating a table. Solution: Use `\vert`, `\lvert`, `\rvert` or `\mid`, depending on your specific formula and context.

- Do not use two consecutive opening curly braces (`{{`) in any equation, because they will cause a build error. Solution: Put a space between the two braces in order to avoid the build error: `{ {`.
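
  The two pitfalls above can be sketched together in one hypothetical snippet (the univariate normal density is used only for illustration; the nested grouping `{(2 \pi)}` is a made-up case where `{{` would otherwise arise):

  ```
  Write $p(x \mid y)$ rather than $p(x|y)$ for in-line conditional probability, and space out nested braces:

  $$
  f_X(x) = \frac{1}{\sqrt{ {(2 \pi)} \sigma^2 }} \cdot \exp \left[ -\frac{(x-\mu)^2}{2 \sigma^2} \right] \; .
  $$
  ```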
- `A, B, C` – arbitrary random events
- `A_1, \ldots, A_k` – mutually exclusive random events
- `\bar{A}, \bar{B}, \bar{C}` – complements of random events
- `X, Y, Z` – scalar random variables, random vectors or random matrices
- `x, y, z` – realizations or values of random variables (exception: random matrices)
- `\mathcal{X}, \mathcal{Y}, \mathcal{Z}` – sets of possible values of random variables
- `x \in \mathcal{X}, y \in \mathcal{Y}, z \in \mathcal{Z}` – indexing all possible values
- `p(x), q(x)` – probability densities or probability masses
- `\mathrm{Pr}(X=a), \mathrm{Pr}(X \in A)` – specific statements about random variables
- `p(x,y)` – joint probability
- `p(x|y)` – conditional probability
- `f_X(x)` – probability density function (PDF) or probability mass function (PMF)
- `F_X(x)` – cumulative distribution function (CDF)
- `Q_X(p)` – quantile function (QF), a.k.a. inverse CDF
- `M_X(t)` – moment-generating function (MGF)
- `\mathrm{E}(X)` – expected value (mean)
- `\mathrm{Var}(X)` – variance
- `\mathrm{Cov}(X,Y)` – covariance
- `\mathrm{Corr}(X,Y)` – correlation
- `\Sigma_{XX}` – covariance matrix
- `C_{XX}` – correlation matrix
- `\mu_n` – n-th (central) moment
- `\mathrm{H}(X)` – (Shannon) entropy
- `\mathrm{H}(X|Y)` – conditional entropy
- `\mathrm{H}(X,Y)` – joint entropy (of two random variables)
- `\mathrm{H}(P,Q)` – cross-entropy (of two probability distributions)
- `\mathrm{h}(X)` – differential entropy
- `\mathrm{h}(X|Y)` – conditional differential entropy
- `\mathrm{h}(X,Y)` – joint differential entropy (of two random variables)
- `\mathrm{h}(P,Q)` – differential cross-entropy (of two probability distributions)
- `\mathrm{I}(X,Y)` – mutual information
- `\mathrm{KL}[P||Q]` – Kullback-Leibler divergence (between two probability distributions)
- `\mathrm{KL}[p(x)||q(x)]` – Kullback-Leibler divergence (between two PMFs or PDFs)
- `\lambda` – hyper-parameters, parameters of a distribution
- `\mathcal{D}(\lambda)` – parametrized probability distribution
- `X \sim \mathcal{D}(\lambda)` – random variable following probability distribution
- `p(x|\lambda) = \mathcal{D}(x; \lambda)` – PDF or PMF of probability distribution
- `\int_{-\infty}^x \mathcal{D}(z; \lambda) \, \mathrm{d}z` – CDF of probability distribution
- `Y = AX + b` – linear transformation of random variable
- `\mu` – mean of random variable
- `\Sigma` – covariance of random variable
- `\mathcal{N}(\mu, \Sigma)` – multivariate normal distribution
- `\mathrm{E}(X)` – expected value of random variable
- `\mathrm{median}(X)` – median of random variable
- `\mathrm{mode}(X)` – mode of random variable
- `\mathrm{Var}(X)` – variance of random variable
- `\mathrm{Cov}(X)` – covariance of random vector
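
Combining these conventions, a distribution statement might read as follows (a sketch mirroring the linear-transformation example from the first section; the label `eq:mvn-ltt` is a hypothetical name):

```
Let $x \sim \mathcal{N}(\mu, \Sigma)$ be a multivariate normal random vector. Then,
the linear transformation $y = Ax + b$ also follows a multivariate normal distribution:

$$ \label{eq:mvn-ltt}
y = Ax + b \sim \mathcal{N}(A \mu + b, A \Sigma A^\mathrm{T}) \; .
$$
```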
- `y, Y` – univariate/multivariate measured data
- `x, X` – single predictor/design matrix
- `\beta, B` – univariate/multivariate regression coefficients
- `\varepsilon, E` – univariate/multivariate noise
- `\sigma^2, \Sigma` – noise variance/measurement covariance
- `I_n` – noise covariance matrix (i.i.d.)
- `V` – noise covariance matrix (not i.i.d.)
- `n` – number of observations
- `v` – number of measurements
- `p` – number of regressors
- `y_i` – i-th observation
- `y_j` – j-th measurement
- `y_{ij}` – i-th observation of j-th measurement
- `m` – generative model
- `\theta` – model parameters
- `\lambda` – model hyper-parameters
- `p(y|\theta,m)` – likelihood function
- `\mathrm{LL}(\theta)` – log-likelihood function
- `\hat{\theta}` – estimated model parameters
- `\hat{y}` – fitted/predicted data
- `p(\theta|m)` – prior distribution
- `p(\theta|y,m)` – posterior distribution
- `p(y|m)` – marginal likelihood
- `\log p(y|m)` – log model evidence
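
The Bayesian quantities above relate to each other via Bayes' theorem, which in the notation just listed could be typeset as (the label `eq:bayes-theorem` is a hypothetical name):

```
$$ \label{eq:bayes-theorem}
p(\theta|y,m) = \frac{p(y|\theta,m) \cdot p(\theta|m)}{p(y|m)} \; .
$$
```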
- `\sigma^2` – noise variance
- `\hat{\sigma}^2` – residual variance
- `R^2` – coefficient of determination
- `R^2_\mathrm{adj}` – adjusted coefficient of determination
- `\mathrm{SNR}` – signal-to-noise ratio
- `y` – measured data
- `m` – generative model
- `f` – generative model family
- `n` – number of observations
- `k` – number of free model parameters
- `\mathrm{MLL}(m)` – maximum log-likelihood
- `\mathrm{IC}(m)` – information criterion
- `p(y|m)` – model evidence
- `\mathrm{LME}(m)` – log model evidence
- `\mathrm{Acc}(m)` – (Bayesian) model accuracy (term)
- `\mathrm{Com}(m)` – (Bayesian) model complexity (penalty)
- `m \in f` – indexing all models in a family
- `p(y|f)` – family evidence
- `\mathrm{LFE}(f)` – log family evidence
- `\mathrm{BF}_{12}` – Bayes factor
- `\mathrm{LBF}_{12}` – log Bayes factor
- `p(m|y)` – posterior model probability
- `p(\theta|y)` – marginal posterior distribution
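
As an illustration of the model-comparison notation, the Bayes factor and its logarithm could be typeset as (a sketch; `m_1` and `m_2` are hypothetical model names, and the label `eq:bf-lbf` is made up):

```
$$ \label{eq:bf-lbf}
\mathrm{BF}_{12} = \frac{p(y|m_1)}{p(y|m_2)}
\quad \text{and} \quad
\mathrm{LBF}_{12} = \log \mathrm{BF}_{12} = \mathrm{LME}(m_1) - \mathrm{LME}(m_2) \; .
$$
```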