added 3 proofs

JoramSoch · web-flow · commit 1bfc4f4b4d6f · 2020-11-19T09:02:14.000+01:00
diff --git a/P/beta-cdf.md b/P/beta-cdf.md
@@ -0,0 +1,71 @@
+---
+layout: proof
+mathjax: true
+
+author: "Joram Soch"
+affiliation: "BCCN Berlin"
+e_mail: "joram.soch@bccn-berlin.de"
+date: 2020-11-19 08:01:00
+
+title: "Cumulative distribution function of the beta distribution"
+chapter: "Probability Distributions"
+section: "Univariate continuous distributions"
+topic: "Beta distribution"
+theorem: "Cumulative distribution function"
+
+sources:
+  - authors: "Wikipedia"
+    year: 2020
+    title: "Beta function"
+    in: "Wikipedia, the free encyclopedia"
+    pages: "retrieved on 2020-11-19"
+    url: "https://en.wikipedia.org/wiki/Beta_function#Incomplete_beta_function"
+
+proof_id: "P195"
+shortcut: "beta-cdf"
+username: "JoramSoch"
+---
+
+
+**Theorem:** Let $X$ be a positive [random variable](/D/rvar) following a [beta distribution](/D/beta):
+
+$$ \label{eq:beta}
+X \sim \mathrm{Bet}(\alpha, \beta) \; .
+$$
+
+Then, the [cumulative distribution function](/D/cdf) of $X$ is
+
+$$ \label{eq:beta-cdf}
+F_X(x) = \frac{B(x; \alpha, \beta)}{B(\alpha, \beta)}
+$$
+
+where $B(a,b)$ is the beta function and $B(x;a,b)$ is the incomplete beta function.
+
+
+**Proof:** The [probability density function of the beta distribution](/P/beta-pdf) is:
+
+$$ \label{eq:beta-pdf}
+f_X(x) = \frac{1}{\mathrm{B}(\alpha, \beta)} \, x^{\alpha-1} \, (1-x)^{\beta-1} \; .
+$$
+
+Thus, the [cumulative distribution function](/D/cdf) is:
+
+$$ \label{eq:beta-cdf-app}
+\begin{split}
+F_X(x) &= \int_{0}^{x} \mathrm{Bet}(z; \alpha, \beta) \, \mathrm{d}z \\
+&= \int_{0}^{x} \frac{1}{\mathrm{B}(\alpha, \beta)} \, z^{\alpha-1} \, (1-z)^{\beta-1} \, \mathrm{d}z \\
+&= \frac{1}{\mathrm{B}(\alpha, \beta)} \int_{0}^{x} z^{\alpha-1} \, (1-z)^{\beta-1} \, \mathrm{d}z \; .
+\end{split}
+$$
+
+With the definition of the incomplete beta function
+
+$$ \label{eq:inc-beta-fct}
+B(x;a,b) = \int_{0}^{x} t^{a-1} \, (1-t)^{b-1} \, \mathrm{d}t \; ,
+$$
+
+we arrive at the final result given by equation \eqref{eq:beta-cdf}:
+
+$$ \label{eq:beta-cdf-qed}
+F_X(x) = \frac{B(x; \alpha, \beta)}{B(\alpha, \beta)} \; .
+$$
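A rough numerical sanity check of the result in `P/beta-cdf.md`, offered as a sketch rather than as part of the proof. It approximates the incomplete beta function $B(x;a,b)$ by a midpoint rule and compares $B(x;\alpha,\beta)/B(\alpha,\beta)$ with the polynomial CDF that follows by hand for $\alpha = 2$, $\beta = 3$. The function names, the step count `n`, and the restriction to $a, b \geq 1$ (so that the integrand is bounded) are assumptions made for this illustration.

```python
import math


def beta_function(a, b):
    """Complete beta function B(a, b), expressed through the gamma function."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def incomplete_beta(x, a, b, n=20000):
    """Incomplete beta function B(x; a, b) via the composite midpoint rule.

    Assumes 0 <= x <= 1 and a, b >= 1, so that the integrand stays bounded.
    """
    h = x / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h  # midpoint of the i-th subinterval of [0, x]
        total += t ** (a - 1) * (1.0 - t) ** (b - 1)
    return total * h


def beta_cdf(x, a, b):
    """CDF of Bet(a, b) as the ratio B(x; a, b) / B(a, b)."""
    return incomplete_beta(x, a, b) / beta_function(a, b)


def beta_cdf_2_3(x):
    """Closed-form CDF for alpha = 2, beta = 3: the integral of 12 z (1 - z)^2."""
    return 6 * x**2 - 8 * x**3 + 3 * x**4


if __name__ == "__main__":
    for x in (0.1, 0.5, 0.9):
        print(x, beta_cdf(x, 2, 3), beta_cdf_2_3(x))
```

For $\alpha = 2$, $\beta = 3$ the two columns should agree to several decimal places; at $x = 0.5$ the closed form gives $0.6875$.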
diff --git a/P/gam-qf.md b/P/gam-qf.md
@@ -0,0 +1,82 @@
+---
+layout: proof
+mathjax: true
+
+author: "Joram Soch"
+affiliation: "BCCN Berlin"
+e_mail: "joram.soch@bccn-berlin.de"
+date: 2020-11-19 07:31:00
+
+title: "Quantile function of the gamma distribution"
+chapter: "Probability Distributions"
+section: "Univariate continuous distributions"
+topic: "Gamma distribution"
+theorem: "Quantile function"
+
+sources:
+  - authors: "Wikipedia"
+    year: 2020
+    title: "Incomplete gamma function"
+    in: "Wikipedia, the free encyclopedia"
+    pages: "retrieved on 2020-11-19"
+    url: "https://en.wikipedia.org/wiki/Incomplete_gamma_function#Definition"
+
+proof_id: "P194"
+shortcut: "gam-qf"
+username: "JoramSoch"
+---
+
+
+**Theorem:** Let $X$ be a [random variable](/D/rvar) following a [gamma distribution](/D/gam):
+
+$$ \label{eq:gam}
+X \sim \mathrm{Gam}(a,b) \; .
+$$
+
+Then, the [quantile function](/D/qf) of $X$ is
+
+$$ \label{eq:gam-qf}
+Q_X(p) = \left\{
+\begin{array}{rl}
+-\infty \; , & \text{if} \; p = 0 \\
+\gamma^{-1}(a, \Gamma(a) \cdot p)/b \; , & \text{if} \; p > 0
+\end{array}
+\right.
+$$
+
+where $\gamma^{-1}(s, y)$ is the inverse of the lower incomplete gamma function $\gamma(s, x)$ with respect to its second argument.
+
+
+**Proof:** The [cumulative distribution function of the gamma distribution](/P/gam-cdf) is:
+
+$$ \label{eq:gam-cdf}
+F_X(x) = \left\{
+\begin{array}{rl}
+0 \; , & \text{if} \; x < 0 \\
+\frac{\gamma(a,bx)}{\Gamma(a)} \; , & \text{if} \; x \geq 0 \; .
+\end{array}
+\right.
+$$
+
+The quantile function $Q_X(p)$ [is defined as](/D/qf) the smallest $x$, such that $F_X(x) = p$:
+
+$$ \label{eq:qf}
+Q_X(p) = \min \left\lbrace x \in \mathbb{R} \, \vert \, F_X(x) = p \right\rbrace \; .
+$$
+
+Thus, we have $Q_X(p) = -\infty$, if $p = 0$. When $p > 0$, [it holds that](/P/qf-cdf)
+
+$$ \label{eq:gam-qf-s1}
+Q_X(p) = F_X^{-1}(p) \; .
+$$
+
+This can be derived by rearranging equation \eqref{eq:gam-cdf}:
+
+$$ \label{eq:gam-qf-s2}
+\begin{split}
+p &= \frac{\gamma(a,bx)}{\Gamma(a)} \\
+\Gamma(a) \cdot p &= \gamma(a,bx) \\
+\gamma^{-1}(a, \Gamma(a) \cdot p) &= bx \\
+x &= \frac{\gamma^{-1}(a, \Gamma(a) \cdot p)}{b} \; .
+\end{split}
+$$
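The rearrangement in `P/gam-qf.md` can be illustrated numerically; the following is a minimal sketch, not a production implementation. It approximates $\gamma(s, x)$ by a midpoint rule, inverts it in $x$ by bisection, and then evaluates $\gamma^{-1}(a, \Gamma(a) \cdot p)/b$. The helper names, the step count, the tolerance, the bracket limit, and the restriction to $a \geq 1$ and $0 < p < 1$ are all assumptions introduced here.

```python
import math


def lower_inc_gamma(s, x, n=4000):
    """Lower incomplete gamma function gamma(s, x) via the midpoint rule.

    Assumes s >= 1 and x >= 0, so that the integrand t^(s-1) exp(-t) is bounded.
    """
    h = x / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += t ** (s - 1) * math.exp(-t)
    return total * h


def inv_lower_inc_gamma(s, y, tol=1e-10):
    """Solve gamma(s, x) = y for x by bisection, for 0 < y < Gamma(s)."""
    lo, hi = 0.0, 1.0
    # Grow the upper bracket until gamma(s, hi) >= y.
    while lower_inc_gamma(s, hi) < y:
        hi *= 2.0
        if hi > 100.0:
            raise ValueError("could not bracket the root; is y < Gamma(s)?")
    # gamma(s, x) is increasing in x, so plain bisection is sufficient.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lower_inc_gamma(s, mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def gamma_quantile(p, a, b):
    """Quantile of Gam(a, b) for 0 < p < 1: gamma^{-1}(a, Gamma(a) * p) / b."""
    return inv_lower_inc_gamma(a, math.gamma(a) * p) / b


if __name__ == "__main__":
    # For a = 1 the gamma distribution is exponential with rate b,
    # whose quantile function is -ln(1 - p) / b.
    print(gamma_quantile(0.5, 1.0, 2.0), math.log(2.0) / 2.0)
```

The special case $a = 1$ is a convenient check because $\gamma(1, x) = 1 - e^{-x}$, so the quantile reduces to $-\ln(1-p)/b$.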
diff --git a/P/norm-kl.md b/P/norm-kl.md
@@ -0,0 +1,104 @@
+---
+layout: proof
+mathjax: true
+
+author: "Joram Soch"
+affiliation: "BCCN Berlin"
+e_mail: "joram.soch@bccn-berlin.de"
+date: 2020-11-19 07:08:00
+
+title: "Kullback-Leibler divergence for the normal distribution"
+chapter: "Probability Distributions"
+section: "Univariate continuous distributions"
+topic: "Normal distribution"
+theorem: "Kullback-Leibler divergence"
+
+sources:
+
+proof_id: "P193"
+shortcut: "norm-kl"
+username: "JoramSoch"
+---
+
+
+**Theorem:** Let $X$ be a [random variable](/D/rvar). Assume two [normal distributions](/D/norm) $P$ and $Q$ specifying the probability distribution of $X$ as
+
+$$ \label{eq:norms}
+\begin{split}
+P: \; X &\sim \mathcal{N}(\mu_1, \sigma_1^2) \\
+Q: \; X &\sim \mathcal{N}(\mu_2, \sigma_2^2) \; . \\
+\end{split}
+$$
+
+Then, the [Kullback-Leibler divergence](/D/kl) of $P$ from $Q$ is given by
+
+$$ \label{eq:norm-KL}
+\mathrm{KL}[P\,||\,Q] = \frac{1}{2} \left[ \frac{(\mu_2 - \mu_1)^2}{\sigma_2^2} + \frac{\sigma_1^2}{\sigma_2^2} - \ln \frac{\sigma_1^2}{\sigma_2^2} - 1 \right] \; .
+$$
+
+
+**Proof:** The [KL divergence for a continuous random variable](/D/kl) is given by 
+
+$$ \label{eq:KL-cont}
+\mathrm{KL}[P\,||\,Q] = \int_{\mathcal{X}} p(x) \, \ln \frac{p(x)}{q(x)} \, \mathrm{d}x
+$$
+
+which, applied to the [normal distributions](/D/norm) in \eqref{eq:norms}, yields
+
+$$ \label{eq:norm-KL-s1}
+\begin{split}
+\mathrm{KL}[P\,||\,Q] &= \int_{-\infty}^{+\infty} \mathcal{N}(x; \mu_1, \sigma_1^2) \, \ln \frac{\mathcal{N}(x; \mu_1, \sigma_1^2)}{\mathcal{N}(x; \mu_2, \sigma_2^2)} \, \mathrm{d}x \\
+&= \left\langle \ln \frac{\mathcal{N}(x; \mu_1, \sigma_1^2)}{\mathcal{N}(x; \mu_2, \sigma_2^2)} \right\rangle_{p(x)} \; .
+\end{split}
+$$
+
+Using the [probability density function of the normal distribution](/P/norm-pdf), this becomes:
+
+$$ \label{eq:norm-KL-s2}
+\begin{split}
+\mathrm{KL}[P\,||\,Q] &= \left\langle \ln \frac{ \frac{1}{\sqrt{2 \pi} \sigma_1} \cdot \exp \left[ -\frac{1}{2} \left( \frac{x-\mu_1}{\sigma_1} \right)^2 \right] }{ \frac{1}{\sqrt{2 \pi} \sigma_2} \cdot \exp \left[ -\frac{1}{2} \left( \frac{x-\mu_2}{\sigma_2} \right)^2 \right] } \right\rangle_{p(x)} \\
+&= \left\langle \ln \left( \sqrt \frac{\sigma_2^2}{\sigma_1^2} \cdot \exp\left[ -\frac{1}{2} \left( \frac{x-\mu_1}{\sigma_1} \right)^2 + \frac{1}{2} \left( \frac{x-\mu_2}{\sigma_2} \right)^2 \right] \right) \right\rangle_{p(x)} \\
+&= \left\langle \frac{1}{2} \ln \frac{\sigma_2^2}{\sigma_1^2} -\frac{1}{2} \left( \frac{x-\mu_1}{\sigma_1} \right)^2 + \frac{1}{2} \left( \frac{x-\mu_2}{\sigma_2} \right)^2 \right\rangle_{p(x)} \\
+&= \frac{1}{2} \left\langle - \left( \frac{x-\mu_1}{\sigma_1} \right)^2 + \left( \frac{x-\mu_2}{\sigma_2} \right)^2 - \ln \frac{\sigma_1^2}{\sigma_2^2} \right\rangle_{p(x)} \\
+&= \frac{1}{2} \left\langle - \frac{(x-\mu_1)^2}{\sigma_1^2} + \frac{x^2 - 2 \mu_2 x + \mu_2^2}{\sigma_2^2} - \ln \frac{\sigma_1^2}{\sigma_2^2} \right\rangle_{p(x)} \; .
+\end{split}
+$$
+
+Because the [expected value](/D/mean) is a linear operator, the expectation can be applied to each term separately, and the expectation of a constant is the constant itself:
+
+$$ \label{eq:norm-KL-s3}
+\begin{split}
+\mathrm{KL}[P\,||\,Q] &= \frac{1}{2} \left[ - \frac{\left\langle (x-\mu_1)^2 \right\rangle}{\sigma_1^2} + \frac{\left\langle x^2 - 2 \mu_2 x + \mu_2^2 \right\rangle}{\sigma_2^2} - \left\langle \ln \frac{\sigma_1^2}{\sigma_2^2} \right\rangle \right] \\
+&= \frac{1}{2} \left[ - \frac{\left\langle (x-\mu_1)^2 \right\rangle}{\sigma_1^2} + \frac{\left\langle x^2 \right\rangle - \left\langle 2 \mu_2 x \right\rangle + \left\langle \mu_2^2 \right\rangle}{\sigma_2^2} - \ln \frac{\sigma_1^2}{\sigma_2^2} \right] \; .
+\end{split}
+$$
+
+The first expectation corresponds to the [variance](/D/var)
+
+$$ \label{eq:var}
+\left\langle (X-\mu)^2 \right\rangle = \mathrm{E}[(X-\mathrm{E}(X))^2] = \mathrm{Var}(X)
+$$
+
+and the [variance of a normally distributed random variable](/P/norm-var) is
+
+$$ \label{eq:norm-var}
+X \sim \mathcal{N}(\mu, \sigma^2) \quad \Rightarrow \quad \mathrm{Var}(X) = \sigma^2 \; .
+$$
+
+Additionally applying the [raw moments of the normal distribution](/P/norm-mgf)
+
+$$ \label{eq:norm-mom-raw}
+X \sim \mathcal{N}(\mu, \sigma^2) \quad \Rightarrow \quad \left\langle x \right\rangle = \mu \quad \text{and} \quad \left\langle x^2 \right\rangle = \mu^2 + \sigma^2 \; ,
+$$
+
+the Kullback-Leibler divergence in \eqref{eq:norm-KL-s3} becomes
+
+$$ \label{eq:norm-KL-s4}
+\begin{split}
+\mathrm{KL}[P\,||\,Q] &= \frac{1}{2} \left[ - \frac{\sigma_1^2}{\sigma_1^2} + \frac{\mu_1^2 + \sigma_1^2 - 2 \mu_2 \mu_1 + \mu_2^2}{\sigma_2^2} - \ln \frac{\sigma_1^2}{\sigma_2^2} \right] \\
+&= \frac{1}{2} \left[ \frac{\mu_1^2 - 2 \mu_1 \mu_2 + \mu_2^2}{\sigma_2^2} + \frac{\sigma_1^2}{\sigma_2^2} - \ln \frac{\sigma_1^2}{\sigma_2^2} - 1 \right] \\
+&= \frac{1}{2} \left[ \frac{(\mu_2 - \mu_1)^2}{\sigma_2^2} + \frac{\sigma_1^2}{\sigma_2^2} - \ln \frac{\sigma_1^2}{\sigma_2^2} - 1 \right]
+\end{split}
+$$
+
+which is equivalent to \eqref{eq:norm-KL}.
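
The closed form in `P/norm-kl.md` can be compared against a direct numerical evaluation of $\int p(x) \ln \frac{p(x)}{q(x)} \, \mathrm{d}x$. The sketch below uses a midpoint rule on $\mu_1 \pm 10 \sigma_1$ and writes the log-ratio in the expanded form from the derivation, which avoids underflow in $q(x)$. The function names, the integration width, and the step count are assumptions made for this illustration; the second and fourth arguments are variances, matching $\mathcal{N}(\mu, \sigma^2)$.

```python
import math


def normal_pdf(x, mu, var):
    """Density of N(mu, var), parameterized by the variance."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def kl_closed_form(mu1, var1, mu2, var2):
    """KL[P || Q] for P = N(mu1, var1) and Q = N(mu2, var2)."""
    return 0.5 * ((mu2 - mu1) ** 2 / var2 + var1 / var2 - math.log(var1 / var2) - 1.0)


def kl_numerical(mu1, var1, mu2, var2, n=20000, width=10.0):
    """Midpoint-rule approximation of the KL integral over mu1 +/- width * sigma1."""
    sd1 = math.sqrt(var1)
    lo = mu1 - width * sd1
    h = 2.0 * width * sd1 / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        # ln p(x)/q(x), expanded so that q(x) itself is never evaluated
        log_ratio = (0.5 * math.log(var2 / var1)
                     - 0.5 * (x - mu1) ** 2 / var1
                     + 0.5 * (x - mu2) ** 2 / var2)
        total += normal_pdf(x, mu1, var1) * log_ratio
    return total * h


if __name__ == "__main__":
    print(kl_closed_form(0.0, 1.0, 1.0, 4.0))
    print(kl_numerical(0.0, 1.0, 1.0, 4.0))
```

The two printed values should agree closely, and `kl_closed_form` returns zero when both distributions coincide, as expected for a divergence.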