
Commit cb6bfd3

Merge pull request #103 from JoramSoch/master
added 3 definitions and 3 proofs
2 parents 65f5c8f + 6b25601 commit cb6bfd3

7 files changed

Lines changed: 504 additions & 15 deletions


D/eblme.md

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
---
layout: definition
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2020-11-25 07:43:00

title: "Empirical Bayesian log model evidence"
chapter: "Model Selection"
section: "Bayesian model selection"
topic: "Log model evidence"
definition: "Empirical Bayesian log model evidence"

sources:
- authors: "Wikipedia"
  year: 2020
  title: "Empirical Bayes method"
  in: "Wikipedia, the free encyclopedia"
  pages: "retrieved on 2020-11-25"
  url: "https://en.wikipedia.org/wiki/Empirical_Bayes_method#Introduction"

def_id: "D114"
shortcut: "eblme"
username: "JoramSoch"
---


**Definition:** Let $m$ be a [generative model](/D/gm) with model parameters $\theta$ and hyper-parameters $\lambda$ implying the [likelihood function](/D/lf) $p(y \vert \theta, \lambda, m)$ and [prior distribution](/D/prior) $p(\theta \vert \lambda, m)$. Then, the [Empirical Bayesian](/D/eb) [log model evidence](/D/lme) is the logarithm of the [marginal likelihood](/D/ml), maximized with respect to the hyper-parameters:

$$ \label{eq:ebLME}
\mathrm{ebLME}(m) = \log p(y \vert \hat{\lambda}, m)
$$

where

$$ \label{eq:ML}
p(y \vert \lambda, m) = \int p(y \vert \theta, \lambda, m) \, p(\theta \vert \lambda, m) \, \mathrm{d}\theta
$$

and

$$ \label{eq:EB}
\hat{\lambda} = \operatorname*{arg\,max}_{\lambda} \log p(y \vert \lambda, m) \; .
$$
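The definition can be illustrated numerically. The following Python sketch (a hypothetical example, not part of the source) applies Empirical Bayes to the model $y_i = \theta + \varepsilon_i$ with known noise variance $\sigma^2$ and prior $\theta \sim \mathcal{N}(0, \lambda)$: integrating $\theta$ out makes the marginal likelihood $p(y \vert \lambda, m)$ a multivariate normal in $y$, and $\hat{\lambda}$ is obtained by numerical maximization.

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(1)
sigma2 = 1.0                     # known noise variance
n = 50
y = rng.normal(loc=0.7, scale=np.sqrt(sigma2), size=n)

def log_ml(lam):
    # log marginal likelihood log p(y | lambda, m): for y_i = theta + e_i with
    # theta ~ N(0, lam), integrating theta out gives y ~ N(0, sigma2*I + lam*J)
    cov = sigma2 * np.eye(n) + lam * np.ones((n, n))
    return stats.multivariate_normal(mean=np.zeros(n), cov=cov).logpdf(y)

# Empirical Bayes: maximize the log marginal likelihood over the hyper-parameter
res = optimize.minimize_scalar(lambda lam: -log_ml(lam),
                               bounds=(1e-6, 10.0), method="bounded")
lam_hat = res.x
eblme = log_ml(lam_hat)          # ebLME(m) = log p(y | lambda_hat, m)
```

Any other choice of $\lambda$ can only decrease the log marginal likelihood, which is what the assertions below check.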

D/uplme.md

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
---
layout: definition
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2020-11-25 07:28:00

title: "Uniform-prior log model evidence"
chapter: "Model Selection"
section: "Bayesian model selection"
topic: "Log model evidence"
definition: "Uniform-prior log model evidence"

sources:
- authors: "Wikipedia"
  year: 2020
  title: "Lindley's paradox"
  in: "Wikipedia, the free encyclopedia"
  pages: "retrieved on 2020-11-25"
  url: "https://en.wikipedia.org/wiki/Lindley%27s_paradox#Bayesian_approach"

def_id: "D113"
shortcut: "uplme"
username: "JoramSoch"
---


**Definition:** Assume a [generative model](/D/gm) $m$ with [likelihood function](/D/lf) $p(y \vert \theta, m)$ and a [uniform](/D/prior-uni) [prior distribution](/D/prior) $p_{\mathrm{uni}}(\theta \vert m)$. Then, the [log model evidence](/D/lme) of this model is called "log model evidence with uniform prior" or "uniform-prior log model evidence" (upLME):

$$ \label{eq:upLME}
\mathrm{upLME}(m) = \log \int p(y \vert \theta, m) \, p_{\mathrm{uni}}(\theta \vert m) \, \mathrm{d}\theta \; .
$$
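As a concrete illustration (a hypothetical example, not from the source), consider $n$ Bernoulli trials with $k$ successes and a uniform prior over the success probability $\theta \in [0,1]$. The integral then has the closed form $\mathrm{B}(k+1, n-k+1)$, which this sketch checks against numerical integration:

```python
import numpy as np
from scipy import integrate, special

# hypothetical data: k successes in n Bernoulli trials
n, k = 10, 7

def likelihood(theta):
    # p(y | theta, m) for the observed sequence of trials
    return theta**k * (1.0 - theta)**(n - k)

# the uniform prior density p_uni(theta | m) = 1 on [0,1], so the
# marginal likelihood is just the integral of the likelihood
ml, _ = integrate.quad(likelihood, 0.0, 1.0)
uplme = np.log(ml)

# closed form: int theta^k (1-theta)^(n-k) dtheta = B(k+1, n-k+1)
uplme_exact = special.betaln(k + 1, n - k + 1)
```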

D/vblme.md

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
---
layout: definition
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2020-11-25 08:10:00

title: "Variational Bayesian log model evidence"
chapter: "Model Selection"
section: "Bayesian model selection"
topic: "Log model evidence"
definition: "Variational Bayesian log model evidence"

sources:
- authors: "Wikipedia"
  year: 2020
  title: "Variational Bayesian methods"
  in: "Wikipedia, the free encyclopedia"
  pages: "retrieved on 2020-11-25"
  url: "https://en.wikipedia.org/wiki/Variational_Bayesian_methods#Evidence_lower_bound"
- authors: "Bishop CM"
  year: 2006
  title: "Variational Inference"
  in: "Pattern Recognition and Machine Learning"
  pages: "pp. 462-474, eqs. 10.2-10.4"
  url: "https://www.springer.com/gp/book/9780387310732"

def_id: "D115"
shortcut: "vblme"
username: "JoramSoch"
---


**Definition:** Let $m$ be a [generative model](/D/gm) with model parameters $\theta$ implying the [likelihood function](/D/lf) $p(y \vert \theta, m)$. Moreover, assume a [prior distribution](/D/prior) $p(\theta \vert m)$, a resulting [posterior distribution](/D/post) $p(\theta \vert y, m)$ and an [approximate](/D/vb) [posterior distribution](/D/post) $q(\theta)$. Then, the [Variational Bayesian](/D/vb) [log model evidence](/D/lme) is the evidence lower bound $\mathcal{L}\left[q(\theta)\right]$, i.e. the [log model evidence](/D/lme) minus the [Kullback-Leibler divergence](/D/kl) between approximate posterior and true posterior distribution:

$$ \label{eq:vbLME}
\mathrm{vbLME}(m) = \mathcal{L}\left[q(\theta)\right] = \log p(y \vert m) - \mathrm{KL}\left[q(\theta) \, || \, p(\theta \vert y, m)\right]
$$

where

$$ \label{eq:ELL}
\mathcal{L}\left[q(\theta)\right] = \int q(\theta) \log \frac{p(y,\theta \vert m)}{q(\theta)} \, \mathrm{d}\theta
$$

and

$$ \label{eq:KL}
\mathrm{KL}\left[q(\theta) \, || \, p(\theta \vert y, m)\right] = \int q(\theta) \log \frac{q(\theta)}{p(\theta \vert y, m)} \, \mathrm{d}\theta \; .
$$
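These quantities can be checked numerically in a conjugate normal model (a hypothetical example, not part of the source), where $\mathcal{L}[q]$ and the exact log model evidence both have closed forms. When $q(\theta)$ equals the true posterior, the KL term vanishes and the bound is tight; for any other $q(\theta)$, $\mathcal{L}[q]$ falls below the log model evidence.

```python
import numpy as np

# conjugate normal sketch: y ~ N(theta, s2), prior theta ~ N(0, t2)
y, s2, t2 = 1.3, 1.0, 2.0

def elbo(mq, vq):
    # L[q] = E_q[log p(y|theta)] + E_q[log p(theta)] + H[q] for q = N(mq, vq)
    e_lik = -0.5 * np.log(2 * np.pi * s2) - ((y - mq) ** 2 + vq) / (2 * s2)
    e_pri = -0.5 * np.log(2 * np.pi * t2) - (mq ** 2 + vq) / (2 * t2)
    entropy = 0.5 * np.log(2 * np.pi * np.e * vq)
    return e_lik + e_pri + entropy

# exact log model evidence: y ~ N(0, s2 + t2) after integrating theta out
lme = -0.5 * np.log(2 * np.pi * (s2 + t2)) - y ** 2 / (2 * (s2 + t2))

# exact posterior N(pm, pv); with q equal to it, the KL term is zero
pv = 1.0 / (1.0 / s2 + 1.0 / t2)
pm = pv * y / s2
```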

I/Table_of_Contents.md

Lines changed: 21 additions & 15 deletions
@@ -257,18 +257,19 @@ title: "Table of Contents"
 &emsp;&ensp; 3.2.3. **[Relation to standard normal distribution](/P/norm-snorm)** (1) <br>
 &emsp;&ensp; 3.2.4. **[Relation to standard normal distribution](/P/norm-snorm2)** (2) <br>
 &emsp;&ensp; 3.2.5. **[Relation to standard normal distribution](/P/norm-snorm3)** (3) <br>
-&emsp;&ensp; 3.2.6. **[Probability density function](/P/norm-pdf)** <br>
-&emsp;&ensp; 3.2.7. **[Moment-generating function](/P/norm-mgf)** <br>
-&emsp;&ensp; 3.2.8. **[Cumulative distribution function](/P/norm-cdf)** <br>
-&emsp;&ensp; 3.2.9. **[Cumulative distribution function without error function](/P/norm-cdfwerf)** <br>
-&emsp;&ensp; 3.2.10. **[Quantile function](/P/norm-qf)** <br>
-&emsp;&ensp; 3.2.11. **[Mean](/P/norm-mean)** <br>
-&emsp;&ensp; 3.2.12. **[Median](/P/norm-med)** <br>
-&emsp;&ensp; 3.2.13. **[Mode](/P/norm-mode)** <br>
-&emsp;&ensp; 3.2.14. **[Variance](/P/norm-var)** <br>
-&emsp;&ensp; 3.2.15. **[Full width at half maximum](/P/norm-fwhm)** <br>
-&emsp;&ensp; 3.2.16. **[Differential entropy](/P/norm-dent)** <br>
-&emsp;&ensp; 3.2.17. **[Kullback-Leibler divergence](/P/norm-kl)** <br>
+&emsp;&ensp; 3.2.6. **[Gaussian integral](/P/norm-gi)** <br>
+&emsp;&ensp; 3.2.7. **[Probability density function](/P/norm-pdf)** <br>
+&emsp;&ensp; 3.2.8. **[Moment-generating function](/P/norm-mgf)** <br>
+&emsp;&ensp; 3.2.9. **[Cumulative distribution function](/P/norm-cdf)** <br>
+&emsp;&ensp; 3.2.10. **[Cumulative distribution function without error function](/P/norm-cdfwerf)** <br>
+&emsp;&ensp; 3.2.11. **[Quantile function](/P/norm-qf)** <br>
+&emsp;&ensp; 3.2.12. **[Mean](/P/norm-mean)** <br>
+&emsp;&ensp; 3.2.13. **[Median](/P/norm-med)** <br>
+&emsp;&ensp; 3.2.14. **[Mode](/P/norm-mode)** <br>
+&emsp;&ensp; 3.2.15. **[Variance](/P/norm-var)** <br>
+&emsp;&ensp; 3.2.16. **[Full width at half maximum](/P/norm-fwhm)** <br>
+&emsp;&ensp; 3.2.17. **[Differential entropy](/P/norm-dent)** <br>
+&emsp;&ensp; 3.2.18. **[Kullback-Leibler divergence](/P/norm-kl)** <br>
 
 3.3. Gamma distribution <br>
 &emsp;&ensp; 3.3.1. *[Definition](/D/gam)* <br>
@@ -297,12 +298,14 @@ title: "Table of Contents"
 3.5. Chi-square distribution <br>
 &emsp;&ensp; 3.5.1. *[Definition](/D/chi2)* <br>
 &emsp;&ensp; 3.5.2. **[Special case of gamma distribution](/P/chi2-gam)** <br>
-&emsp;&ensp; 3.5.3. **[Moments](/P/chi2-mom)** <br>
+&emsp;&ensp; 3.5.3. **[Probability density function](/P/chi2-pdf)** <br>
+&emsp;&ensp; 3.5.4. **[Moments](/P/chi2-mom)** <br>
 
 3.6. Beta distribution <br>
 &emsp;&ensp; 3.6.1. *[Definition](/D/beta)* <br>
 &emsp;&ensp; 3.6.2. **[Probability density function](/P/beta-pdf)** <br>
-&emsp;&ensp; 3.6.3. **[Cumulative distribution function](/P/beta-cdf)** <br>
+&emsp;&ensp; 3.6.3. **[Moment-generating function](/P/beta-mgf)** <br>
+&emsp;&ensp; 3.6.4. **[Cumulative distribution function](/P/beta-cdf)** <br>
 
 3.7. Wald distribution <br>
 &emsp;&ensp; 3.7.1. *[Definition](/D/wald)* <br>
@@ -472,7 +475,10 @@ title: "Table of Contents"
 &emsp;&ensp; 3.1.1. *[Definition](/D/lme)* <br>
 &emsp;&ensp; 3.1.2. **[Derivation](/P/lme-der)** <br>
 &emsp;&ensp; 3.1.3. **[Partition into accuracy and complexity](/P/lme-anc)** <br>
-&emsp;&ensp; 3.1.4. *[Cross-validated log model evidence](/D/cvlme)* <br>
+&emsp;&ensp; 3.1.4. *[Uniform-prior log model evidence](/D/uplme)* <br>
+&emsp;&ensp; 3.1.5. *[Cross-validated log model evidence](/D/cvlme)* <br>
+&emsp;&ensp; 3.1.6. *[Empirical Bayesian log model evidence](/D/eblme)* <br>
+&emsp;&ensp; 3.1.7. *[Variational Bayesian log model evidence](/D/vblme)* <br>
 
 3.2. Log family evidence <br>
 &emsp;&ensp; 3.2.1. *[Definition](/D/lfe)* <br>

P/beta-mgf.md

Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2020-11-25 06:55:00

title: "Moment-generating function of the beta distribution"
chapter: "Probability Distributions"
section: "Univariate continuous distributions"
topic: "Beta distribution"
theorem: "Moment-generating function"

sources:
- authors: "Wikipedia"
  year: 2020
  title: "Beta distribution"
  in: "Wikipedia, the free encyclopedia"
  pages: "retrieved on 2020-11-25"
  url: "https://en.wikipedia.org/wiki/Beta_distribution#Moment_generating_function"
- authors: "Wikipedia"
  year: 2020
  title: "Confluent hypergeometric function"
  in: "Wikipedia, the free encyclopedia"
  pages: "retrieved on 2020-11-25"
  url: "https://en.wikipedia.org/wiki/Confluent_hypergeometric_function#Kummer's_equation"

proof_id: "P198"
shortcut: "beta-mgf"
username: "JoramSoch"
---


**Theorem:** Let $X$ be a positive [random variable](/D/rvar) following a [beta distribution](/D/beta):

$$ \label{eq:beta}
X \sim \mathrm{Bet}(\alpha, \beta) \; .
$$

Then, the [moment-generating function](/D/mgf) of $X$ is

$$ \label{eq:beta-mgf}
M_X(t) = 1 + \sum_{n=1}^{\infty} \left( \prod_{m=0}^{n-1} \frac{\alpha + m}{\alpha + \beta + m} \right) \frac{t^n}{n!} \; .
$$


**Proof:** The [probability density function of the beta distribution](/P/beta-pdf) is

$$ \label{eq:beta-pdf}
f_X(x) = \frac{1}{\mathrm{B}(\alpha, \beta)} \, x^{\alpha-1} \, (1-x)^{\beta-1}
$$

and the [moment-generating function](/D/mgf) is defined as

$$ \label{eq:mgf-var}
M_X(t) = \mathrm{E} \left[ e^{tX} \right] \; .
$$

Using the [expected value for continuous random variables](/D/mean), the moment-generating function of $X$ therefore is

$$ \label{eq:beta-mgf-s1}
\begin{split}
M_X(t) &= \int_{0}^{1} \exp[tx] \cdot \frac{1}{\mathrm{B}(\alpha, \beta)} \, x^{\alpha-1} \, (1-x)^{\beta-1} \, \mathrm{d}x \\
&= \frac{1}{\mathrm{B}(\alpha, \beta)} \int_{0}^{1} e^{tx} \, x^{\alpha-1} \, (1-x)^{\beta-1} \, \mathrm{d}x \; .
\end{split}
$$

With the relationship between beta function and gamma function

$$ \label{eq:beta-gam-fct}
\mathrm{B}(\alpha, \beta) = \frac{\Gamma(\alpha) \, \Gamma(\beta)}{\Gamma(\alpha+\beta)}
$$

and the integral representation of the confluent hypergeometric function (Kummer's function of the first kind)

$$ \label{eq:con-hyp-geo-fct-int}
{}_1 F_1(a,b,z) = \frac{\Gamma(b)}{\Gamma(a) \, \Gamma(b-a)} \int_{0}^{1} e^{zu} \, u^{a-1} \, (1-u)^{(b-a)-1} \, \mathrm{d}u \; ,
$$

the moment-generating function can be written as

$$ \label{eq:beta-mgf-s2}
M_X(t) = {}_1 F_1(\alpha,\alpha+\beta,t) \; .
$$

Note that the series equation for the confluent hypergeometric function (Kummer's function of the first kind) is

$$ \label{eq:con-hyp-geo-fct-ser}
{}_1 F_1(a,b,z) = \sum_{n=0}^{\infty} \frac{a^{\overline{n}}}{b^{\overline{n}}} \, \frac{z^n}{n!}
$$

where $m^{\overline{n}}$ is the rising factorial

$$ \label{eq:fact-rise}
m^{\overline{n}} = \prod_{i=0}^{n-1} (m+i) \; ,
$$

so that the moment-generating function can be written as

$$ \label{eq:beta-mgf-s3}
M_X(t) = \sum_{n=0}^{\infty} \frac{\alpha^{\overline{n}}}{(\alpha+\beta)^{\overline{n}}} \, \frac{t^n}{n!} \; .
$$

Applying the rising factorial equation \eqref{eq:fact-rise} and using the empty-product convention $m^{\overline{0}} = 1$ (just as $x^0 = 0! = 1$), we finally have:

$$ \label{eq:beta-mgf-s4}
M_X(t) = 1 + \sum_{n=1}^{\infty} \left( \prod_{m=0}^{n-1} \frac{\alpha + m}{\alpha + \beta + m} \right) \frac{t^n}{n!} \; .
$$
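The series in the theorem can be cross-checked numerically (an illustration only, not part of the proof): a truncated version of the sum should agree both with SciPy's Kummer function ${}_1 F_1(\alpha, \alpha+\beta, t)$ and with direct numerical integration of $e^{tx}$ against the beta density.

```python
import numpy as np
from scipy import integrate, special

alpha, beta_, t = 2.0, 3.0, 0.7   # hypothetical parameter values

def mgf_series(a, b, t, n_max=50):
    # truncated series: 1 + sum_n (prod_m (a+m)/(a+b+m)) * t^n / n!
    total, ratio, t_pow = 1.0, 1.0, 1.0
    for n in range(1, n_max + 1):
        ratio *= (a + n - 1) / (a + b + n - 1)   # rising-factorial ratio
        t_pow *= t / n                           # accumulates t^n / n!
        total += ratio * t_pow
    return total

# Kummer's confluent hypergeometric function 1F1(alpha, alpha+beta, t)
mgf_kummer = special.hyp1f1(alpha, alpha + beta_, t)

# direct numerical integration of exp(tx) against the beta density
dens = lambda x: x**(alpha - 1) * (1 - x)**(beta_ - 1) / special.beta(alpha, beta_)
mgf_direct, _ = integrate.quad(lambda x: np.exp(t * x) * dens(x), 0.0, 1.0)
```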
