
Commit cb6bfd3

Merge pull request #103 from JoramSoch/master
added 3 definitions and 3 proofs
2 parents 65f5c8f + 6b25601 commit cb6bfd3

7 files changed

Lines changed: 504 additions & 15 deletions


D/eblme.md

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
---
layout: definition
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2020-11-25 07:43:00

title: "Empirical Bayesian log model evidence"
chapter: "Model Selection"
section: "Bayesian model selection"
topic: "Log model evidence"
definition: "Empirical Bayesian log model evidence"

sources:
- authors: "Wikipedia"
  year: 2020
  title: "Empirical Bayes method"
  in: "Wikipedia, the free encyclopedia"
  pages: "retrieved on 2020-11-25"
  url: "https://en.wikipedia.org/wiki/Empirical_Bayes_method#Introduction"

def_id: "D114"
shortcut: "eblme"
username: "JoramSoch"
---


**Definition:** Let $m$ be a [generative model](/D/gm) with model parameters $\theta$ and hyper-parameters $\lambda$ implying the [likelihood function](/D/lf) $p(y \vert \theta, \lambda, m)$ and [prior distribution](/D/prior) $p(\theta \vert \lambda, m)$. Then, the [Empirical Bayesian](/D/eb) [log model evidence](/D/lme) is the logarithm of the [marginal likelihood](/D/ml), maximized with respect to the hyper-parameters:

$$ \label{eq:ebLME}
\mathrm{ebLME}(m) = \log p(y \vert \hat{\lambda}, m)
$$

where

$$ \label{eq:ML}
p(y \vert \lambda, m) = \int p(y \vert \theta, \lambda, m) \, p(\theta \vert \lambda, m) \, \mathrm{d}\theta
$$

and

$$ \label{eq:EB}
\hat{\lambda} = \operatorname*{arg\,max}_{\lambda} \log p(y \vert \lambda, m) \; .
$$
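The definition can be illustrated numerically. The following Python sketch (a hypothetical example, not part of the source) applies Empirical Bayes to the model $y_i = \theta + \varepsilon_i$ with known noise variance $\sigma^2$ and prior $\theta \sim \mathcal{N}(0, \lambda)$: integrating $\theta$ out makes the marginal likelihood $p(y \vert \lambda, m)$ a multivariate normal in $y$, and $\hat{\lambda}$ is obtained by numerical maximization.

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(1)
sigma2 = 1.0                     # known noise variance
n = 50
y = rng.normal(loc=0.7, scale=np.sqrt(sigma2), size=n)

def log_ml(lam):
    # log marginal likelihood log p(y | lambda, m): for y_i = theta + e_i with
    # theta ~ N(0, lam), integrating theta out gives y ~ N(0, sigma2*I + lam*J)
    cov = sigma2 * np.eye(n) + lam * np.ones((n, n))
    return stats.multivariate_normal(mean=np.zeros(n), cov=cov).logpdf(y)

# Empirical Bayes: maximize the log marginal likelihood over the hyper-parameter
res = optimize.minimize_scalar(lambda lam: -log_ml(lam),
                               bounds=(1e-6, 10.0), method="bounded")
lam_hat = res.x
eblme = log_ml(lam_hat)          # ebLME(m) = log p(y | lambda_hat, m)
```

Any other choice of $\lambda$ can only decrease the log marginal likelihood, which is what the assertions below check.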

D/uplme.md

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
---
layout: definition
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2020-11-25 07:28:00

title: "Uniform-prior log model evidence"
chapter: "Model Selection"
section: "Bayesian model selection"
topic: "Log model evidence"
definition: "Uniform-prior log model evidence"

sources:
- authors: "Wikipedia"
  year: 2020
  title: "Lindley's paradox"
  in: "Wikipedia, the free encyclopedia"
  pages: "retrieved on 2020-11-25"
  url: "https://en.wikipedia.org/wiki/Lindley%27s_paradox#Bayesian_approach"

def_id: "D113"
shortcut: "uplme"
username: "JoramSoch"
---


**Definition:** Assume a [generative model](/D/gm) $m$ with [likelihood function](/D/lf) $p(y \vert \theta, m)$ and a [uniform](/D/prior-uni) [prior distribution](/D/prior) $p_{\mathrm{uni}}(\theta \vert m)$. Then, the [log model evidence](/D/lme) of this model is called "log model evidence with uniform prior" or "uniform-prior log model evidence" (upLME):

$$ \label{eq:upLME}
\mathrm{upLME}(m) = \log \int p(y \vert \theta, m) \, p_{\mathrm{uni}}(\theta \vert m) \, \mathrm{d}\theta \; .
$$
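As a concrete illustration (a hypothetical example, not from the source), consider $n$ Bernoulli trials with $k$ successes and a uniform prior over the success probability $\theta \in [0,1]$. The integral then has the closed form $\mathrm{B}(k+1, n-k+1)$, which this sketch checks against numerical integration:

```python
import numpy as np
from scipy import integrate, special

# hypothetical data: k successes in n Bernoulli trials
n, k = 10, 7

def likelihood(theta):
    # p(y | theta, m) for the observed sequence of trials
    return theta**k * (1.0 - theta)**(n - k)

# the uniform prior density p_uni(theta | m) = 1 on [0,1], so the
# marginal likelihood is just the integral of the likelihood
ml, _ = integrate.quad(likelihood, 0.0, 1.0)
uplme = np.log(ml)

# closed form: int theta^k (1-theta)^(n-k) dtheta = B(k+1, n-k+1)
uplme_exact = special.betaln(k + 1, n - k + 1)
```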

D/vblme.md

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
---
layout: definition
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2020-11-25 08:10:00

title: "Variational Bayesian log model evidence"
chapter: "Model Selection"
section: "Bayesian model selection"
topic: "Log model evidence"
definition: "Variational Bayesian log model evidence"

sources:
- authors: "Wikipedia"
  year: 2020
  title: "Variational Bayesian methods"
  in: "Wikipedia, the free encyclopedia"
  pages: "retrieved on 2020-11-25"
  url: "https://en.wikipedia.org/wiki/Variational_Bayesian_methods#Evidence_lower_bound"
- authors: "Bishop CM"
  year: 2006
  title: "Variational Inference"
  in: "Pattern Recognition and Machine Learning"
  pages: "pp. 462-474, eqs. 10.2-10.4"
  url: "https://www.springer.com/gp/book/9780387310732"

def_id: "D115"
shortcut: "vblme"
username: "JoramSoch"
---


**Definition:** Let $m$ be a [generative model](/D/gm) with model parameters $\theta$ implying the [likelihood function](/D/lf) $p(y \vert \theta, m)$. Moreover, assume a [prior distribution](/D/prior) $p(\theta \vert m)$, a resulting [posterior distribution](/D/post) $p(\theta \vert y, m)$ and an [approximate](/D/vb) [posterior distribution](/D/post) $q(\theta)$. Then, the [Variational Bayesian](/D/vb) [log model evidence](/D/lme) is the evidence lower bound $\mathcal{L}\left[q(\theta)\right]$, i.e. the [log model evidence](/D/lme) minus the [Kullback-Leibler divergence](/D/kl) between approximate posterior and true posterior distribution:

$$ \label{eq:vbLME}
\mathrm{vbLME}(m) = \mathcal{L}\left[q(\theta)\right] = \log p(y \vert m) - \mathrm{KL}\left[q(\theta) \, || \, p(\theta \vert y, m)\right]
$$

where

$$ \label{eq:ELL}
\mathcal{L}\left[q(\theta)\right] = \int q(\theta) \log \frac{p(y,\theta \vert m)}{q(\theta)} \, \mathrm{d}\theta
$$

and

$$ \label{eq:KL}
\mathrm{KL}\left[q(\theta) \, || \, p(\theta \vert y, m)\right] = \int q(\theta) \log \frac{q(\theta)}{p(\theta \vert y, m)} \, \mathrm{d}\theta \; .
$$
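These quantities can be checked numerically in a conjugate normal model (a hypothetical example, not part of the source), where $\mathcal{L}[q]$ and the exact log model evidence both have closed forms. When $q(\theta)$ equals the true posterior, the KL term vanishes and the bound is tight; for any other $q(\theta)$, $\mathcal{L}[q]$ falls below the log model evidence.

```python
import numpy as np

# conjugate normal sketch: y ~ N(theta, s2), prior theta ~ N(0, t2)
y, s2, t2 = 1.3, 1.0, 2.0

def elbo(mq, vq):
    # L[q] = E_q[log p(y|theta)] + E_q[log p(theta)] + H[q] for q = N(mq, vq)
    e_lik = -0.5 * np.log(2 * np.pi * s2) - ((y - mq) ** 2 + vq) / (2 * s2)
    e_pri = -0.5 * np.log(2 * np.pi * t2) - (mq ** 2 + vq) / (2 * t2)
    entropy = 0.5 * np.log(2 * np.pi * np.e * vq)
    return e_lik + e_pri + entropy

# exact log model evidence: y ~ N(0, s2 + t2) after integrating theta out
lme = -0.5 * np.log(2 * np.pi * (s2 + t2)) - y ** 2 / (2 * (s2 + t2))

# exact posterior N(pm, pv); with q equal to it, the KL term is zero
pv = 1.0 / (1.0 / s2 + 1.0 / t2)
pm = pv * y / s2
```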

I/Table_of_Contents.md

Lines changed: 21 additions & 15 deletions
@@ -257,18 +257,19 @@ title: "Table of Contents"
 &emsp;&ensp; 3.2.3. **[Relation to standard normal distribution](/P/norm-snorm)** (1) <br>
 &emsp;&ensp; 3.2.4. **[Relation to standard normal distribution](/P/norm-snorm2)** (2) <br>
 &emsp;&ensp; 3.2.5. **[Relation to standard normal distribution](/P/norm-snorm3)** (3) <br>
-&emsp;&ensp; 3.2.6. **[Probability density function](/P/norm-pdf)** <br>
-&emsp;&ensp; 3.2.7. **[Moment-generating function](/P/norm-mgf)** <br>
-&emsp;&ensp; 3.2.8. **[Cumulative distribution function](/P/norm-cdf)** <br>
-&emsp;&ensp; 3.2.9. **[Cumulative distribution function without error function](/P/norm-cdfwerf)** <br>
-&emsp;&ensp; 3.2.10. **[Quantile function](/P/norm-qf)** <br>
-&emsp;&ensp; 3.2.11. **[Mean](/P/norm-mean)** <br>
-&emsp;&ensp; 3.2.12. **[Median](/P/norm-med)** <br>
-&emsp;&ensp; 3.2.13. **[Mode](/P/norm-mode)** <br>
-&emsp;&ensp; 3.2.14. **[Variance](/P/norm-var)** <br>
-&emsp;&ensp; 3.2.15. **[Full width at half maximum](/P/norm-fwhm)** <br>
-&emsp;&ensp; 3.2.16. **[Differential entropy](/P/norm-dent)** <br>
-&emsp;&ensp; 3.2.17. **[Kullback-Leibler divergence](/P/norm-kl)** <br>
+&emsp;&ensp; 3.2.6. **[Gaussian integral](/P/norm-gi)** <br>
+&emsp;&ensp; 3.2.7. **[Probability density function](/P/norm-pdf)** <br>
+&emsp;&ensp; 3.2.8. **[Moment-generating function](/P/norm-mgf)** <br>
+&emsp;&ensp; 3.2.9. **[Cumulative distribution function](/P/norm-cdf)** <br>
+&emsp;&ensp; 3.2.10. **[Cumulative distribution function without error function](/P/norm-cdfwerf)** <br>
+&emsp;&ensp; 3.2.11. **[Quantile function](/P/norm-qf)** <br>
+&emsp;&ensp; 3.2.12. **[Mean](/P/norm-mean)** <br>
+&emsp;&ensp; 3.2.13. **[Median](/P/norm-med)** <br>
+&emsp;&ensp; 3.2.14. **[Mode](/P/norm-mode)** <br>
+&emsp;&ensp; 3.2.15. **[Variance](/P/norm-var)** <br>
+&emsp;&ensp; 3.2.16. **[Full width at half maximum](/P/norm-fwhm)** <br>
+&emsp;&ensp; 3.2.17. **[Differential entropy](/P/norm-dent)** <br>
+&emsp;&ensp; 3.2.18. **[Kullback-Leibler divergence](/P/norm-kl)** <br>
 
 3.3. Gamma distribution <br>
 &emsp;&ensp; 3.3.1. *[Definition](/D/gam)* <br>
@@ -297,12 +298,14 @@ title: "Table of Contents"
 3.5. Chi-square distribution <br>
 &emsp;&ensp; 3.5.1. *[Definition](/D/chi2)* <br>
 &emsp;&ensp; 3.5.2. **[Special case of gamma distribution](/P/chi2-gam)** <br>
-&emsp;&ensp; 3.5.3. **[Moments](/P/chi2-mom)** <br>
+&emsp;&ensp; 3.5.3. **[Probability density function](/P/chi2-pdf)** <br>
+&emsp;&ensp; 3.5.4. **[Moments](/P/chi2-mom)** <br>
 
 3.6. Beta distribution <br>
 &emsp;&ensp; 3.6.1. *[Definition](/D/beta)* <br>
 &emsp;&ensp; 3.6.2. **[Probability density function](/P/beta-pdf)** <br>
-&emsp;&ensp; 3.6.3. **[Cumulative distribution function](/P/beta-cdf)** <br>
+&emsp;&ensp; 3.6.3. **[Moment-generating function](/P/beta-mgf)** <br>
+&emsp;&ensp; 3.6.4. **[Cumulative distribution function](/P/beta-cdf)** <br>
 
 3.7. Wald distribution <br>
 &emsp;&ensp; 3.7.1. *[Definition](/D/wald)* <br>
@@ -472,7 +475,10 @@ title: "Table of Contents"
 &emsp;&ensp; 3.1.1. *[Definition](/D/lme)* <br>
 &emsp;&ensp; 3.1.2. **[Derivation](/P/lme-der)** <br>
 &emsp;&ensp; 3.1.3. **[Partition into accuracy and complexity](/P/lme-anc)** <br>
-&emsp;&ensp; 3.1.4. *[Cross-validated log model evidence](/D/cvlme)* <br>
+&emsp;&ensp; 3.1.4. *[Uniform-prior log model evidence](/D/uplme)* <br>
+&emsp;&ensp; 3.1.5. *[Cross-validated log model evidence](/D/cvlme)* <br>
+&emsp;&ensp; 3.1.6. *[Empirical Bayesian log model evidence](/D/eblme)* <br>
+&emsp;&ensp; 3.1.7. *[Variational Bayesian log model evidence](/D/vblme)* <br>
 
 3.2. Log family evidence <br>
 &emsp;&ensp; 3.2.1. *[Definition](/D/lfe)* <br>

P/beta-mgf.md

Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2020-11-25 06:55:00

title: "Moment-generating function of the beta distribution"
chapter: "Probability Distributions"
section: "Univariate continuous distributions"
topic: "Beta distribution"
theorem: "Moment-generating function"

sources:
- authors: "Wikipedia"
  year: 2020
  title: "Beta distribution"
  in: "Wikipedia, the free encyclopedia"
  pages: "retrieved on 2020-11-25"
  url: "https://en.wikipedia.org/wiki/Beta_distribution#Moment_generating_function"
- authors: "Wikipedia"
  year: 2020
  title: "Confluent hypergeometric function"
  in: "Wikipedia, the free encyclopedia"
  pages: "retrieved on 2020-11-25"
  url: "https://en.wikipedia.org/wiki/Confluent_hypergeometric_function#Kummer's_equation"

proof_id: "P198"
shortcut: "beta-mgf"
username: "JoramSoch"
---


**Theorem:** Let $X$ be a positive [random variable](/D/rvar) following a [beta distribution](/D/beta):

$$ \label{eq:beta}
X \sim \mathrm{Bet}(\alpha, \beta) \; .
$$

Then, the [moment-generating function](/D/mgf) of $X$ is

$$ \label{eq:beta-mgf}
M_X(t) = 1 + \sum_{n=1}^{\infty} \left( \prod_{m=0}^{n-1} \frac{\alpha + m}{\alpha + \beta + m} \right) \frac{t^n}{n!} \; .
$$


**Proof:** The [probability density function of the beta distribution](/P/beta-pdf) is

$$ \label{eq:beta-pdf}
f_X(x) = \frac{1}{\mathrm{B}(\alpha, \beta)} \, x^{\alpha-1} \, (1-x)^{\beta-1}
$$

and the [moment-generating function](/D/mgf) is defined as

$$ \label{eq:mgf-var}
M_X(t) = \mathrm{E} \left[ e^{tX} \right] \; .
$$

Using the [expected value for continuous random variables](/D/mean), the moment-generating function of $X$ therefore is

$$ \label{eq:beta-mgf-s1}
\begin{split}
M_X(t) &= \int_{0}^{1} \exp[tx] \cdot \frac{1}{\mathrm{B}(\alpha, \beta)} \, x^{\alpha-1} \, (1-x)^{\beta-1} \, \mathrm{d}x \\
&= \frac{1}{\mathrm{B}(\alpha, \beta)} \int_{0}^{1} e^{tx} \, x^{\alpha-1} \, (1-x)^{\beta-1} \, \mathrm{d}x \; .
\end{split}
$$

With the relationship between beta function and gamma function

$$ \label{eq:beta-gam-fct}
\mathrm{B}(\alpha, \beta) = \frac{\Gamma(\alpha) \, \Gamma(\beta)}{\Gamma(\alpha+\beta)}
$$

and the integral representation of the confluent hypergeometric function (Kummer's function of the first kind)

$$ \label{eq:con-hyp-geo-fct-int}
{}_1 F_1(a,b,z) = \frac{\Gamma(b)}{\Gamma(a) \, \Gamma(b-a)} \int_{0}^{1} e^{zu} \, u^{a-1} \, (1-u)^{(b-a)-1} \, \mathrm{d}u \; ,
$$

the moment-generating function can be written as

$$ \label{eq:beta-mgf-s2}
M_X(t) = {}_1 F_1(\alpha,\alpha+\beta,t) \; .
$$

Note that the series equation for the confluent hypergeometric function (Kummer's function of the first kind) is

$$ \label{eq:con-hyp-geo-fct-ser}
{}_1 F_1(a,b,z) = \sum_{n=0}^{\infty} \frac{a^{\overline{n}}}{b^{\overline{n}}} \, \frac{z^n}{n!}
$$

where $m^{\overline{n}}$ is the rising factorial

$$ \label{eq:fact-rise}
m^{\overline{n}} = \prod_{i=0}^{n-1} (m+i) \; ,
$$

so that the moment-generating function can be written as

$$ \label{eq:beta-mgf-s3}
M_X(t) = \sum_{n=0}^{\infty} \frac{\alpha^{\overline{n}}}{(\alpha+\beta)^{\overline{n}}} \, \frac{t^n}{n!} \; .
$$

Applying the rising factorial equation \eqref{eq:fact-rise} and using the empty-product convention $m^{\overline{0}} = 1$ (just as $x^0 = 0! = 1$), we finally have:

$$ \label{eq:beta-mgf-s4}
M_X(t) = 1 + \sum_{n=1}^{\infty} \left( \prod_{m=0}^{n-1} \frac{\alpha + m}{\alpha + \beta + m} \right) \frac{t^n}{n!} \; .
$$
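The series in the theorem can be cross-checked numerically (an illustration only, not part of the proof): a truncated version of the sum should agree both with SciPy's Kummer function ${}_1 F_1(\alpha, \alpha+\beta, t)$ and with direct numerical integration of $e^{tx}$ against the beta density.

```python
import numpy as np
from scipy import integrate, special

alpha, beta_, t = 2.0, 3.0, 0.7   # hypothetical parameter values

def mgf_series(a, b, t, n_max=50):
    # truncated series: 1 + sum_n (prod_m (a+m)/(a+b+m)) * t^n / n!
    total, ratio, t_pow = 1.0, 1.0, 1.0
    for n in range(1, n_max + 1):
        ratio *= (a + n - 1) / (a + b + n - 1)   # rising-factorial ratio
        t_pow *= t / n                           # accumulates t^n / n!
        total += ratio * t_pow
    return total

# Kummer's confluent hypergeometric function 1F1(alpha, alpha+beta, t)
mgf_kummer = special.hyp1f1(alpha, alpha + beta_, t)

# direct numerical integration of exp(tx) against the beta density
dens = lambda x: x**(alpha - 1) * (1 - x)**(beta_ - 1) / special.beta(alpha, beta_)
mgf_direct, _ = integrate.quad(lambda x: np.exp(t * x) * dens(x), 0.0, 1.0)
```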
