Commit 2c0a75a

Merge pull request #188 from JoramSoch/master

added 4 proofs

2 parents a3b4ef4 + 2ea0b3d, commit 2c0a75a

5 files changed: 340 additions & 3 deletions

File tree

I/ToC.md

Lines changed: 7 additions & 3 deletions

@@ -665,9 +665,13 @@ title: "Table of Contents"
 
 3.1. Binomial observations <br>
 &emsp;&ensp; 3.1.1. *[Definition](/D/bin-data)* <br>
-&emsp;&ensp; 3.1.2. **[Conjugate prior distribution](/P/bin-prior)** <br>
-&emsp;&ensp; 3.1.3. **[Posterior distribution](/P/bin-post)** <br>
-&emsp;&ensp; 3.1.4. **[Log model evidence](/P/bin-lme)** <br>
+&emsp;&ensp; 3.1.2. **[Maximum likelihood estimation](/P/bin-mle)** <br>
+&emsp;&ensp; 3.1.3. **[Maximum log-likelihood](/P/bin-mll)** <br>
+&emsp;&ensp; 3.1.4. **[Conjugate prior distribution](/P/bin-prior)** <br>
+&emsp;&ensp; 3.1.5. **[Posterior distribution](/P/bin-post)** <br>
+&emsp;&ensp; 3.1.6. **[Log model evidence](/P/bin-lme)** <br>
+&emsp;&ensp; 3.1.7. **[Log Bayes factor](/P/bin-lbf)** <br>
+&emsp;&ensp; 3.1.8. **[Posterior probability](/P/bin-pp)** <br>
 
 3.2. Multinomial observations <br>
 &emsp;&ensp; 3.2.1. *[Definition](/D/mult-data)* <br>

P/bin-lbf.md

Lines changed: 84 additions & 0 deletions (new file)

---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2022-11-25 14:40:00

title: "Log Bayes factor for binomial observations"
chapter: "Statistical Models"
section: "Count data"
topic: "Binomial observations"
theorem: "Log Bayes factor"

sources:

proof_id: "P383"
shortcut: "bin-lbf"
username: "JoramSoch"
---


**Theorem:** Let $y$ be the number of successes resulting from $n$ independent trials with unknown success probability $p$, such that $y$ follows a [binomial distribution](/D/bin):

$$ \label{eq:Bin}
y \sim \mathrm{Bin}(n,p) \; .
$$

Moreover, assume two [statistical models](/D/fpm), one assuming that $p$ is 0.5 ([null model](/D/h0)), the other imposing a [beta distribution](/P/bin-prior) as the [prior distribution](/D/prior) on the model parameter $p$ ([alternative](/D/h1)):

$$ \label{eq:Bin-m01}
\begin{split}
m_0&: \; y \sim \mathrm{Bin}(n,p), \; p = 0.5 \\
m_1&: \; y \sim \mathrm{Bin}(n,p), \; p \sim \mathrm{Bet}(\alpha_0, \beta_0) \; .
\end{split}
$$

Then, the [log Bayes factor](/D/lbf) in favor of $m_1$ against $m_0$ is

$$ \label{eq:Bin-LBF}
\mathrm{LBF}_{10} = \log B(\alpha_n,\beta_n) - \log B(\alpha_0,\beta_0) - n \log \left( \frac{1}{2} \right)
$$

where $B(x,y)$ is the beta function and $\alpha_n$ and $\beta_n$ are the [posterior hyperparameters for binomial observations](/P/bin-post) which are functions of the [number of trials](/D/bin) $n$ and the [number of successes](/D/bin) $y$.


**Proof:** [The log Bayes factor is equal to the difference of two log model evidences](/P/lbf-lme):

$$ \label{eq:LBF-LME}
\mathrm{LBF}_{12} = \mathrm{LME}(m_1) - \mathrm{LME}(m_2) \; .
$$

The LME of the alternative $m_1$ is equal to the [log model evidence for binomial observations](/P/bin-lme):

$$ \label{eq:Bin-LME-m1}
\mathrm{LME}(m_1) = \log p(y|m_1) = \log {n \choose y} + \log B(\alpha_n,\beta_n) - \log B(\alpha_0,\beta_0) \; .
$$

Because the null model $m_0$ has no free parameter, its [log model evidence](/D/lme) (logarithmized [marginal likelihood](/D/ml)) is equal to the [log-likelihood function for binomial observations](/P/bin-mle) at the value $p = 0.5$:

$$ \label{eq:Bin-LME-m0}
\begin{split}
\mathrm{LME}(m_0) = \log p(y|p=0.5) &= \log {n \choose y} + y \log(0.5) + (n-y) \log (1-0.5) \\
&= \log {n \choose y} + n \log \left( \frac{1}{2} \right) \; .
\end{split}
$$

Subtracting the LME of the null model from the LME of the alternative model, the LBF emerges as

$$ \label{eq:Bin-LBF-m10}
\mathrm{LBF}_{10} = \log B(\alpha_n,\beta_n) - \log B(\alpha_0,\beta_0) - n \log \left( \frac{1}{2} \right)
$$

where the [posterior hyperparameters](/D/post) [are given by](/P/bin-post)

$$ \label{eq:Bin-post-par}
\begin{split}
\alpha_n &= \alpha_0 + y \\
\beta_n &= \beta_0 + (n-y)
\end{split}
$$

with the [number of trials](/D/bin) $n$ and the [number of successes](/D/bin) $y$.
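The theorem can be checked numerically. The following sketch (not part of the proof page) implements the LBF with Python's standard library, computing $\log B(a,b)$ via log-gamma to stay in log space; the values $y = 7$, $n = 10$ and the flat prior $\alpha_0 = \beta_0 = 1$ are illustrative assumptions:

```python
from math import comb, lgamma, log

def log_beta(a: float, b: float) -> float:
    """Log of the beta function, log B(a, b) = lgamma(a) + lgamma(b) - lgamma(a+b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def lbf_binomial(y: int, n: int, a0: float, b0: float) -> float:
    """LBF_10 for m1 (p ~ Bet(a0, b0)) against m0 (p = 0.5), per the theorem."""
    an, bn = a0 + y, b0 + (n - y)   # posterior hyperparameters
    return log_beta(an, bn) - log_beta(a0, b0) - n * log(0.5)

# Cross-check against the difference of the two log model evidences:
y, n, a0, b0 = 7, 10, 1.0, 1.0
lme1 = log(comb(n, y)) + log_beta(a0 + y, b0 + n - y) - log_beta(a0, b0)
lme0 = log(comb(n, y)) + y * log(0.5) + (n - y) * log(0.5)
assert abs(lbf_binomial(y, n, a0, b0) - (lme1 - lme0)) < 1e-12
```

Working in log space matters here because $B(\alpha_n, \beta_n)$ underflows quickly as $n$ grows, while its logarithm stays well behaved.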

P/bin-mle.md

Lines changed: 79 additions & 0 deletions (new file)

---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2022-11-23 18:17:00

title: "Maximum likelihood estimation for binomial observations"
chapter: "Statistical Models"
section: "Count data"
topic: "Binomial observations"
theorem: "Maximum likelihood estimation"

sources:
- authors: "Wikipedia"
  year: 2022
  title: "Binomial distribution"
  in: "Wikipedia, the free encyclopedia"
  pages: "retrieved on 2022-11-23"
  url: "https://en.wikipedia.org/wiki/Binomial_distribution#Statistical_inference"

proof_id: "P381"
shortcut: "bin-mle"
username: "JoramSoch"
---


**Theorem:** Let $y$ be the number of successes resulting from $n$ independent trials with unknown success probability $p$, such that $y$ follows a [binomial distribution](/D/bin):

$$ \label{eq:Bin}
y \sim \mathrm{Bin}(n,p) \; .
$$

Then, the [maximum likelihood estimator](/D/mle) of $p$ is

$$ \label{eq:Bin-MLE}
\hat{p} = \frac{y}{n} \; .
$$


**Proof:** With the [probability mass function of the binomial distribution](/P/bin-pmf), equation \eqref{eq:Bin} implies the following [likelihood function](/D/lf):

$$ \label{eq:Bin-LF}
\begin{split}
\mathrm{p}(y|p) &= \mathrm{Bin}(y; n, p) \\
&= {n \choose y} \, p^y \, (1-p)^{n-y} \; .
\end{split}
$$

Thus, the [log-likelihood function](/D/llf) is given by

$$ \label{eq:Bin-LL}
\begin{split}
\mathrm{LL}(p) &= \log \mathrm{p}(y|p) \\
&= \log {n \choose y} + y \log p + (n-y) \log (1-p) \; .
\end{split}
$$

The derivative of the log-likelihood function \eqref{eq:Bin-LL} with respect to $p$ is

$$ \label{eq:dLL-dp}
\frac{\mathrm{d}\mathrm{LL}(p)}{\mathrm{d}p} = \frac{y}{p} - \frac{n-y}{1-p}
$$

and setting this derivative to zero gives the MLE for $p$:

$$ \label{eq:p-MLE}
\begin{split}
\frac{\mathrm{d}\mathrm{LL}(p)}{\mathrm{d}\hat{p}} &= 0 \\
0 &= \frac{y}{\hat{p}} - \frac{n-y}{1-\hat{p}} \\
\frac{n-y}{1-\hat{p}} &= \frac{y}{\hat{p}} \\
(n-y) \, \hat{p} &= y \, (1-\hat{p}) \\
n \, \hat{p} - y \, \hat{p} &= y - y \, \hat{p} \\
n \, \hat{p} &= y \\
\hat{p} &= \frac{y}{n} \; .
\end{split}
$$
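As an informal check of the closed-form estimator $\hat{p} = y/n$ (not part of the proof page), one can compare it against a brute-force grid search over the log-likelihood; the sample values $y = 13$, $n = 20$ are illustrative assumptions:

```python
from math import comb, log

def binomial_log_lik(p: float, y: int, n: int) -> float:
    """Log-likelihood of success probability p, given y successes in n trials."""
    return log(comb(n, y)) + y * log(p) + (n - y) * log(1 - p)

y, n = 13, 20
p_hat = y / n                                  # closed-form MLE from the proof

# A grid search over (0, 1) should not find a better p than y/n.
grid = [i / 1000 for i in range(1, 1000)]
best = max(grid, key=lambda p: binomial_log_lik(p, y, n))
assert abs(best - p_hat) < 1e-3
```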

P/bin-mll.md

Lines changed: 84 additions & 0 deletions (new file)

---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2022-11-24 14:19:00

title: "Maximum log-likelihood for binomial observations"
chapter: "Statistical Models"
section: "Count data"
topic: "Binomial observations"
theorem: "Maximum log-likelihood"

sources:

proof_id: "P382"
shortcut: "bin-mll"
username: "JoramSoch"
---


**Theorem:** Let $y$ be the number of successes resulting from $n$ independent trials with unknown success probability $p$, such that $y$ follows a [binomial distribution](/D/bin):

$$ \label{eq:Bin}
y \sim \mathrm{Bin}(n,p) \; .
$$

Then, the [maximum log-likelihood](/D/mll) for this model is

$$ \label{eq:Bin-MLL}
\begin{split}
\mathrm{MLL} &= \log \Gamma(n+1) - \log \Gamma(y+1) - \log \Gamma(n-y+1) \\
&- n \log (n) + y \log (y) + (n-y) \log (n-y) \; .
\end{split}
$$


**Proof:** Following from the [probability mass function of the binomial distribution](/P/bin-pmf), the [log-likelihood function for binomial data](/P/bin-mle) is given by

$$ \label{eq:Bin-LL}
\mathrm{LL}(p) = \log {n \choose y} + y \log p + (n-y) \log (1-p)
$$

and the [maximum likelihood estimate of the success probability](/P/bin-mle) $p$ is

$$ \label{eq:Bin-MLE}
\hat{p} = \frac{y}{n} \; .
$$

Plugging \eqref{eq:Bin-MLE} into \eqref{eq:Bin-LL}, we obtain the [maximum log-likelihood](/D/mll) of the binomial observation model in \eqref{eq:Bin} as

$$ \label{eq:Bin-MLL-s1}
\begin{split}
\mathrm{MLL} &= \mathrm{LL}(\hat{p}) \\
&= \log {n \choose y} + y \log \left( \frac{y}{n} \right) + (n-y) \log \left( 1 - \frac{y}{n} \right) \\
&= \log {n \choose y} + y \log \left( \frac{y}{n} \right) + (n-y) \log \left( \frac{n-y}{n} \right) \\
&= \log {n \choose y} + y \log (y) + (n-y) \log (n-y) - n \log (n) \; .
\end{split}
$$

With the definition of the binomial coefficient

$$ \label{eq:bin-coeff}
{n \choose k} = \frac{n!}{k! \, (n-k)!}
$$

and the definition of the gamma function

$$ \label{eq:gam-fct}
\Gamma(n) = (n-1)! \; ,
$$

the MLL finally becomes

$$ \label{eq:Bin-MLL-s2}
\begin{split}
\mathrm{MLL} &= \log \Gamma(n+1) - \log \Gamma(y+1) - \log \Gamma(n-y+1) \\
&- n \log (n) + y \log (y) + (n-y) \log (n-y) \; .
\end{split}
$$
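The gamma-function form of the MLL can be checked against direct evaluation of the log-likelihood at $\hat{p} = y/n$. Below is a sketch (not part of the proof page) with assumed toy values $y = 4$, $n = 9$; note that the closed form requires $0 < y < n$ so that $\log y$ and $\log(n-y)$ are defined:

```python
from math import comb, lgamma, log

def binomial_mll(y: int, n: int) -> float:
    """Maximum log-likelihood via the gamma-function form of the theorem."""
    return (lgamma(n + 1) - lgamma(y + 1) - lgamma(n - y + 1)
            - n * log(n) + y * log(y) + (n - y) * log(n - y))

# Cross-check: plugging p_hat = y/n into the log-likelihood gives the same value.
y, n = 4, 9
p_hat = y / n
ll_at_mle = log(comb(n, y)) + y * log(p_hat) + (n - y) * log(1 - p_hat)
assert abs(binomial_mll(y, n) - ll_at_mle) < 1e-12
```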

P/bin-pp.md

Lines changed: 86 additions & 0 deletions (new file)

---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2022-11-26 11:42:00

title: "Posterior probability of the alternative model for binomial observations"
chapter: "Statistical Models"
section: "Count data"
topic: "Binomial observations"
theorem: "Posterior probability"

sources:

proof_id: "P384"
shortcut: "bin-pp"
username: "JoramSoch"
---


**Theorem:** Let $y$ be the number of successes resulting from $n$ independent trials with unknown success probability $p$, such that $y$ follows a [binomial distribution](/D/bin):

$$ \label{eq:Bin}
y \sim \mathrm{Bin}(n,p) \; .
$$

Moreover, assume two [statistical models](/D/fpm), one assuming that $p$ is 0.5 ([null model](/D/h0)), the other imposing a [beta distribution](/P/bin-prior) as the [prior distribution](/D/prior) on the model parameter $p$ ([alternative](/D/h1)):

$$ \label{eq:Bin-m01}
\begin{split}
m_0&: \; y \sim \mathrm{Bin}(n,p), \; p = 0.5 \\
m_1&: \; y \sim \mathrm{Bin}(n,p), \; p \sim \mathrm{Bet}(\alpha_0, \beta_0) \; .
\end{split}
$$

Then, the [posterior probability](/D/pmp) of the [alternative model](/D/h1) is given by

$$ \label{eq:Bin-PP1}
p(m_1|y) = \frac{1}{1 + 2^{-n} \left[ B(\alpha_0,\beta_0) / B(\alpha_n,\beta_n) \right]}
$$

where $B(x,y)$ is the beta function and $\alpha_n$ and $\beta_n$ are the [posterior hyperparameters for binomial observations](/P/bin-post) which are functions of the [number of trials](/D/bin) $n$ and the [number of successes](/D/bin) $y$.


**Proof:** [The posterior probability for one of two models is a function of the log Bayes factor in favor of this model](/P/pmp-lbf):

$$ \label{eq:PP-LBF}
p(m_1|y) = \frac{\exp(\mathrm{LBF}_{12})}{\exp(\mathrm{LBF}_{12}) + 1} \; .
$$

The [log Bayes factor in favor of the alternative model for binomial observations](/P/bin-lbf) is given by

$$ \label{eq:Bin-LBF10}
\mathrm{LBF}_{10} = \log B(\alpha_n,\beta_n) - \log B(\alpha_0,\beta_0) - n \log \left( \frac{1}{2} \right)
$$

and the corresponding [Bayes factor](/D/bf), i.e. the [exponentiated log Bayes factor](/P/lbf-der), is equal to

$$ \label{eq:Bin-BF10}
\mathrm{BF}_{10} = \exp(\mathrm{LBF}_{10}) = 2^n \cdot \frac{B(\alpha_n,\beta_n)}{B(\alpha_0,\beta_0)} \; .
$$

Thus, the posterior probability of the alternative model, assuming a prior distribution over the probability $p$, compared to the null model, assuming a fixed probability $p = 0.5$, follows as

$$ \label{eq:Bin-PP1-qed}
\begin{split}
p(m_1|y) &\overset{\eqref{eq:PP-LBF}}{=} \frac{\exp(\mathrm{LBF}_{10})}{\exp(\mathrm{LBF}_{10}) + 1} \\
&\overset{\eqref{eq:Bin-BF10}}{=} \frac{2^n \cdot \frac{B(\alpha_n,\beta_n)}{B(\alpha_0,\beta_0)}}{2^n \cdot \frac{B(\alpha_n,\beta_n)}{B(\alpha_0,\beta_0)} + 1} \\
&= \frac{2^n \cdot \frac{B(\alpha_n,\beta_n)}{B(\alpha_0,\beta_0)}}{2^n \cdot \frac{B(\alpha_n,\beta_n)}{B(\alpha_0,\beta_0)} \left( 1 + 2^{-n} \frac{B(\alpha_0,\beta_0)}{B(\alpha_n,\beta_n)} \right)} \\
&= \frac{1}{1 + 2^{-n} \left[ B(\alpha_0,\beta_0) / B(\alpha_n,\beta_n) \right]}
\end{split}
$$

where the [posterior hyperparameters](/D/post) [are given by](/P/bin-post)

$$ \label{eq:Bin-post-par}
\begin{split}
\alpha_n &= \alpha_0 + y \\
\beta_n &= \beta_0 + (n-y)
\end{split}
$$

with the [number of trials](/D/bin) $n$ and the [number of successes](/D/bin) $y$.
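The equivalence of the logistic form \eqref{eq:PP-LBF} and the closed form \eqref{eq:Bin-PP1} can be verified numerically. The sketch below (not part of the proof page) computes both with Python's standard library; the values $y = 82$, $n = 100$ and the flat prior $\alpha_0 = \beta_0 = 1$ are illustrative assumptions:

```python
from math import exp, lgamma, log

def log_beta(a: float, b: float) -> float:
    """Log of the beta function B(a, b) via log-gamma."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def posterior_prob_m1(y: int, n: int, a0: float, b0: float) -> float:
    """p(m1|y) via the logistic transform of the log Bayes factor."""
    an, bn = a0 + y, b0 + (n - y)
    lbf10 = log_beta(an, bn) - log_beta(a0, b0) - n * log(0.5)
    return exp(lbf10) / (exp(lbf10) + 1)

def posterior_prob_m1_closed(y: int, n: int, a0: float, b0: float) -> float:
    """p(m1|y) via the closed form stated in the theorem."""
    an, bn = a0 + y, b0 + (n - y)
    return 1 / (1 + 2 ** (-n) * exp(log_beta(a0, b0) - log_beta(an, bn)))

y, n = 82, 100
assert abs(posterior_prob_m1(y, n, 1, 1) - posterior_prob_m1_closed(y, n, 1, 1)) < 1e-12
```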
