Commit 2c0a75a

Merge pull request #188 from JoramSoch/master

added 4 proofs

2 parents a3b4ef4 + 2ea0b3d, commit 2c0a75a

5 files changed: 340 additions & 3 deletions

File tree

I/ToC.md

Lines changed: 7 additions & 3 deletions

@@ -665,9 +665,13 @@ title: "Table of Contents"
 
 3.1. Binomial observations <br>
 &emsp;&ensp; 3.1.1. *[Definition](/D/bin-data)* <br>
-&emsp;&ensp; 3.1.2. **[Conjugate prior distribution](/P/bin-prior)** <br>
-&emsp;&ensp; 3.1.3. **[Posterior distribution](/P/bin-post)** <br>
-&emsp;&ensp; 3.1.4. **[Log model evidence](/P/bin-lme)** <br>
+&emsp;&ensp; 3.1.2. **[Maximum likelihood estimation](/P/bin-mle)** <br>
+&emsp;&ensp; 3.1.3. **[Maximum log-likelihood](/P/bin-mll)** <br>
+&emsp;&ensp; 3.1.4. **[Conjugate prior distribution](/P/bin-prior)** <br>
+&emsp;&ensp; 3.1.5. **[Posterior distribution](/P/bin-post)** <br>
+&emsp;&ensp; 3.1.6. **[Log model evidence](/P/bin-lme)** <br>
+&emsp;&ensp; 3.1.7. **[Log Bayes factor](/P/bin-lbf)** <br>
+&emsp;&ensp; 3.1.8. **[Posterior probability](/P/bin-pp)** <br>
 
 3.2. Multinomial observations <br>
 &emsp;&ensp; 3.2.1. *[Definition](/D/mult-data)* <br>

P/bin-lbf.md

Lines changed: 84 additions & 0 deletions (new file)

---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2022-11-25 14:40:00

title: "Log Bayes factor for binomial observations"
chapter: "Statistical Models"
section: "Count data"
topic: "Binomial observations"
theorem: "Log Bayes factor"

sources:

proof_id: "P383"
shortcut: "bin-lbf"
username: "JoramSoch"
---


**Theorem:** Let $y$ be the number of successes resulting from $n$ independent trials with unknown success probability $p$, such that $y$ follows a [binomial distribution](/D/bin):

$$ \label{eq:Bin}
y \sim \mathrm{Bin}(n,p) \; .
$$

Moreover, assume two [statistical models](/D/fpm), one assuming that $p$ is 0.5 ([null model](/D/h0)), the other imposing a [beta distribution](/P/bin-prior) as the [prior distribution](/D/prior) on the model parameter $p$ ([alternative](/D/h1)):

$$ \label{eq:Bin-m01}
\begin{split}
m_0&: \; y \sim \mathrm{Bin}(n,p), \; p = 0.5 \\
m_1&: \; y \sim \mathrm{Bin}(n,p), \; p \sim \mathrm{Bet}(\alpha_0, \beta_0) \; .
\end{split}
$$

Then, the [log Bayes factor](/D/lbf) in favor of $m_1$ against $m_0$ is

$$ \label{eq:Bin-LBF}
\mathrm{LBF}_{10} = \log B(\alpha_n,\beta_n) - \log B(\alpha_0,\beta_0) - n \log \left( \frac{1}{2} \right)
$$

where $B(x,y)$ is the beta function and $\alpha_n$ and $\beta_n$ are the [posterior hyperparameters for binomial observations](/P/bin-post) which are functions of the [number of trials](/D/bin) $n$ and the [number of successes](/D/bin) $y$.


**Proof:** [The log Bayes factor is equal to the difference of two log model evidences](/P/lbf-lme):

$$ \label{eq:LBF-LME}
\mathrm{LBF}_{12} = \mathrm{LME}(m_1) - \mathrm{LME}(m_2) \; .
$$

The LME of the alternative $m_1$ is equal to the [log model evidence for binomial observations](/P/bin-lme):

$$ \label{eq:Bin-LME-m1}
\mathrm{LME}(m_1) = \log p(y|m_1) = \log {n \choose y} + \log B(\alpha_n,\beta_n) - \log B(\alpha_0,\beta_0) \; .
$$

Because the null model $m_0$ has no free parameter, its [log model evidence](/D/lme) (logarithmized [marginal likelihood](/D/ml)) is equal to the [log-likelihood function for binomial observations](/P/bin-mle) at the value $p = 0.5$:

$$ \label{eq:Bin-LME-m0}
\begin{split}
\mathrm{LME}(m_0) = \log p(y|p=0.5) &= \log {n \choose y} + y \log(0.5) + (n-y) \log (1-0.5) \\
&= \log {n \choose y} + n \log \left( \frac{1}{2} \right) \; .
\end{split}
$$

Subtracting the LME of the null model from the LME of the alternative model, the LBF emerges as

$$ \label{eq:Bin-LBF-m10}
\mathrm{LBF}_{10} = \log B(\alpha_n,\beta_n) - \log B(\alpha_0,\beta_0) - n \log \left( \frac{1}{2} \right)
$$

where the [posterior hyperparameters](/D/post) [are given by](/P/bin-post)

$$ \label{eq:Bin-post-par}
\begin{split}
\alpha_n &= \alpha_0 + y \\
\beta_n &= \beta_0 + (n-y)
\end{split}
$$

with the [number of trials](/D/bin) $n$ and the [number of successes](/D/bin) $y$.
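The theorem can be checked numerically. The following sketch (not part of the proof page) implements the LBF with Python's standard library, computing $\log B(a,b)$ via log-gamma to stay in log space; the values $y = 7$, $n = 10$ and the flat prior $\alpha_0 = \beta_0 = 1$ are illustrative assumptions:

```python
from math import comb, lgamma, log

def log_beta(a: float, b: float) -> float:
    """Log of the beta function, log B(a, b) = lgamma(a) + lgamma(b) - lgamma(a+b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def lbf_binomial(y: int, n: int, a0: float, b0: float) -> float:
    """LBF_10 for m1 (p ~ Bet(a0, b0)) against m0 (p = 0.5), per the theorem."""
    an, bn = a0 + y, b0 + (n - y)   # posterior hyperparameters
    return log_beta(an, bn) - log_beta(a0, b0) - n * log(0.5)

# Cross-check against the difference of the two log model evidences:
y, n, a0, b0 = 7, 10, 1.0, 1.0
lme1 = log(comb(n, y)) + log_beta(a0 + y, b0 + n - y) - log_beta(a0, b0)
lme0 = log(comb(n, y)) + y * log(0.5) + (n - y) * log(0.5)
assert abs(lbf_binomial(y, n, a0, b0) - (lme1 - lme0)) < 1e-12
```

Working in log space matters here because $B(\alpha_n, \beta_n)$ underflows quickly as $n$ grows, while its logarithm stays well behaved.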

P/bin-mle.md

Lines changed: 79 additions & 0 deletions (new file)

---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2022-11-23 18:17:00

title: "Maximum likelihood estimation for binomial observations"
chapter: "Statistical Models"
section: "Count data"
topic: "Binomial observations"
theorem: "Maximum likelihood estimation"

sources:
- authors: "Wikipedia"
  year: 2022
  title: "Binomial distribution"
  in: "Wikipedia, the free encyclopedia"
  pages: "retrieved on 2022-11-23"
  url: "https://en.wikipedia.org/wiki/Binomial_distribution#Statistical_inference"

proof_id: "P381"
shortcut: "bin-mle"
username: "JoramSoch"
---


**Theorem:** Let $y$ be the number of successes resulting from $n$ independent trials with unknown success probability $p$, such that $y$ follows a [binomial distribution](/D/bin):

$$ \label{eq:Bin}
y \sim \mathrm{Bin}(n,p) \; .
$$

Then, the [maximum likelihood estimator](/D/mle) of $p$ is

$$ \label{eq:Bin-MLE}
\hat{p} = \frac{y}{n} \; .
$$


**Proof:** With the [probability mass function of the binomial distribution](/P/bin-pmf), equation \eqref{eq:Bin} implies the following [likelihood function](/D/lf):

$$ \label{eq:Bin-LF}
\begin{split}
\mathrm{p}(y|p) &= \mathrm{Bin}(y; n, p) \\
&= {n \choose y} \, p^y \, (1-p)^{n-y} \; .
\end{split}
$$

Thus, the [log-likelihood function](/D/llf) is given by

$$ \label{eq:Bin-LL}
\begin{split}
\mathrm{LL}(p) &= \log \mathrm{p}(y|p) \\
&= \log {n \choose y} + y \log p + (n-y) \log (1-p) \; .
\end{split}
$$

The derivative of the log-likelihood function \eqref{eq:Bin-LL} with respect to $p$ is

$$ \label{eq:dLL-dp}
\frac{\mathrm{d}\mathrm{LL}(p)}{\mathrm{d}p} = \frac{y}{p} - \frac{n-y}{1-p}
$$

and setting this derivative to zero gives the MLE for $p$:

$$ \label{eq:p-MLE}
\begin{split}
\frac{\mathrm{d}\mathrm{LL}(p)}{\mathrm{d}\hat{p}} &= 0 \\
0 &= \frac{y}{\hat{p}} - \frac{n-y}{1-\hat{p}} \\
\frac{n-y}{1-\hat{p}} &= \frac{y}{\hat{p}} \\
(n-y) \, \hat{p} &= y \, (1-\hat{p}) \\
n \, \hat{p} - y \, \hat{p} &= y - y \, \hat{p} \\
n \, \hat{p} &= y \\
\hat{p} &= \frac{y}{n} \; .
\end{split}
$$
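As an informal check of the closed-form estimator $\hat{p} = y/n$ (not part of the proof page), one can compare it against a brute-force grid search over the log-likelihood; the sample values $y = 13$, $n = 20$ are illustrative assumptions:

```python
from math import comb, log

def binomial_log_lik(p: float, y: int, n: int) -> float:
    """Log-likelihood of success probability p, given y successes in n trials."""
    return log(comb(n, y)) + y * log(p) + (n - y) * log(1 - p)

y, n = 13, 20
p_hat = y / n                                  # closed-form MLE from the proof

# A grid search over (0, 1) should not find a better p than y/n.
grid = [i / 1000 for i in range(1, 1000)]
best = max(grid, key=lambda p: binomial_log_lik(p, y, n))
assert abs(best - p_hat) < 1e-3
```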

P/bin-mll.md

Lines changed: 84 additions & 0 deletions (new file)

---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2022-11-24 14:19:00

title: "Maximum log-likelihood for binomial observations"
chapter: "Statistical Models"
section: "Count data"
topic: "Binomial observations"
theorem: "Maximum log-likelihood"

sources:

proof_id: "P382"
shortcut: "bin-mll"
username: "JoramSoch"
---


**Theorem:** Let $y$ be the number of successes resulting from $n$ independent trials with unknown success probability $p$, such that $y$ follows a [binomial distribution](/D/bin):

$$ \label{eq:Bin}
y \sim \mathrm{Bin}(n,p) \; .
$$

Then, the [maximum log-likelihood](/D/mll) for this model is

$$ \label{eq:Bin-MLL}
\begin{split}
\mathrm{MLL} &= \log \Gamma(n+1) - \log \Gamma(y+1) - \log \Gamma(n-y+1) \\
&- n \log (n) + y \log (y) + (n-y) \log (n-y) \; .
\end{split}
$$


**Proof:** Following from the [probability mass function of the binomial distribution](/P/bin-pmf), the [log-likelihood function for binomial data](/P/bin-mle) is given by

$$ \label{eq:Bin-LL}
\mathrm{LL}(p) = \log {n \choose y} + y \log p + (n-y) \log (1-p)
$$

and the [maximum likelihood estimate of the success probability](/P/bin-mle) $p$ is

$$ \label{eq:Bin-MLE}
\hat{p} = \frac{y}{n} \; .
$$

Plugging \eqref{eq:Bin-MLE} into \eqref{eq:Bin-LL}, we obtain the [maximum log-likelihood](/D/mll) of the binomial observation model in \eqref{eq:Bin} as

$$ \label{eq:Bin-MLL-s1}
\begin{split}
\mathrm{MLL} &= \mathrm{LL}(\hat{p}) \\
&= \log {n \choose y} + y \log \left( \frac{y}{n} \right) + (n-y) \log \left( 1 - \frac{y}{n} \right) \\
&= \log {n \choose y} + y \log \left( \frac{y}{n} \right) + (n-y) \log \left( \frac{n-y}{n} \right) \\
&= \log {n \choose y} + y \log (y) + (n-y) \log (n-y) - n \log (n) \; .
\end{split}
$$

With the definition of the binomial coefficient

$$ \label{eq:bin-coeff}
{n \choose k} = \frac{n!}{k! \, (n-k)!}
$$

and the definition of the gamma function

$$ \label{eq:gam-fct}
\Gamma(n) = (n-1)! \; ,
$$

the MLL finally becomes

$$ \label{eq:Bin-MLL-s2}
\begin{split}
\mathrm{MLL} &= \log \Gamma(n+1) - \log \Gamma(y+1) - \log \Gamma(n-y+1) \\
&- n \log (n) + y \log (y) + (n-y) \log (n-y) \; .
\end{split}
$$
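The gamma-function form of the MLL can be checked against direct evaluation of the log-likelihood at $\hat{p} = y/n$. Below is a sketch (not part of the proof page) with assumed toy values $y = 4$, $n = 9$; note that the closed form requires $0 < y < n$ so that $\log y$ and $\log(n-y)$ are defined:

```python
from math import comb, lgamma, log

def binomial_mll(y: int, n: int) -> float:
    """Maximum log-likelihood via the gamma-function form of the theorem."""
    return (lgamma(n + 1) - lgamma(y + 1) - lgamma(n - y + 1)
            - n * log(n) + y * log(y) + (n - y) * log(n - y))

# Cross-check: plugging p_hat = y/n into the log-likelihood gives the same value.
y, n = 4, 9
p_hat = y / n
ll_at_mle = log(comb(n, y)) + y * log(p_hat) + (n - y) * log(1 - p_hat)
assert abs(binomial_mll(y, n) - ll_at_mle) < 1e-12
```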

P/bin-pp.md

Lines changed: 86 additions & 0 deletions (new file)

---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2022-11-26 11:42:00

title: "Posterior probability of the alternative model for binomial observations"
chapter: "Statistical Models"
section: "Count data"
topic: "Binomial observations"
theorem: "Posterior probability"

sources:

proof_id: "P384"
shortcut: "bin-pp"
username: "JoramSoch"
---


**Theorem:** Let $y$ be the number of successes resulting from $n$ independent trials with unknown success probability $p$, such that $y$ follows a [binomial distribution](/D/bin):

$$ \label{eq:Bin}
y \sim \mathrm{Bin}(n,p) \; .
$$

Moreover, assume two [statistical models](/D/fpm), one assuming that $p$ is 0.5 ([null model](/D/h0)), the other imposing a [beta distribution](/P/bin-prior) as the [prior distribution](/D/prior) on the model parameter $p$ ([alternative](/D/h1)):

$$ \label{eq:Bin-m01}
\begin{split}
m_0&: \; y \sim \mathrm{Bin}(n,p), \; p = 0.5 \\
m_1&: \; y \sim \mathrm{Bin}(n,p), \; p \sim \mathrm{Bet}(\alpha_0, \beta_0) \; .
\end{split}
$$

Then, the [posterior probability](/D/pmp) of the [alternative model](/D/h1) is given by

$$ \label{eq:Bin-PP1}
p(m_1|y) = \frac{1}{1 + 2^{-n} \left[ B(\alpha_0,\beta_0) / B(\alpha_n,\beta_n) \right]}
$$

where $B(x,y)$ is the beta function and $\alpha_n$ and $\beta_n$ are the [posterior hyperparameters for binomial observations](/P/bin-post) which are functions of the [number of trials](/D/bin) $n$ and the [number of successes](/D/bin) $y$.


**Proof:** [The posterior probability for one of two models is a function of the log Bayes factor in favor of this model](/P/pmp-lbf):

$$ \label{eq:PP-LBF}
p(m_1|y) = \frac{\exp(\mathrm{LBF}_{12})}{\exp(\mathrm{LBF}_{12}) + 1} \; .
$$

The [log Bayes factor in favor of the alternative model for binomial observations](/P/bin-lbf) is given by

$$ \label{eq:Bin-LBF10}
\mathrm{LBF}_{10} = \log B(\alpha_n,\beta_n) - \log B(\alpha_0,\beta_0) - n \log \left( \frac{1}{2} \right)
$$

and the corresponding [Bayes factor](/D/bf), i.e. the [exponentiated log Bayes factor](/P/lbf-der), is equal to

$$ \label{eq:Bin-BF10}
\mathrm{BF}_{10} = \exp(\mathrm{LBF}_{10}) = 2^n \cdot \frac{B(\alpha_n,\beta_n)}{B(\alpha_0,\beta_0)} \; .
$$

Thus, the posterior probability of the alternative model, assuming a prior distribution over the probability $p$, compared to the null model, assuming a fixed probability $p = 0.5$, follows as

$$ \label{eq:Bin-PP1-qed}
\begin{split}
p(m_1|y) &\overset{\eqref{eq:PP-LBF}}{=} \frac{\exp(\mathrm{LBF}_{10})}{\exp(\mathrm{LBF}_{10}) + 1} \\
&\overset{\eqref{eq:Bin-BF10}}{=} \frac{2^n \cdot \frac{B(\alpha_n,\beta_n)}{B(\alpha_0,\beta_0)}}{2^n \cdot \frac{B(\alpha_n,\beta_n)}{B(\alpha_0,\beta_0)} + 1} \\
&= \frac{2^n \cdot \frac{B(\alpha_n,\beta_n)}{B(\alpha_0,\beta_0)}}{2^n \cdot \frac{B(\alpha_n,\beta_n)}{B(\alpha_0,\beta_0)} \left( 1 + 2^{-n} \frac{B(\alpha_0,\beta_0)}{B(\alpha_n,\beta_n)} \right)} \\
&= \frac{1}{1 + 2^{-n} \left[ B(\alpha_0,\beta_0) / B(\alpha_n,\beta_n) \right]}
\end{split}
$$

where the [posterior hyperparameters](/D/post) [are given by](/P/bin-post)

$$ \label{eq:Bin-post-par}
\begin{split}
\alpha_n &= \alpha_0 + y \\
\beta_n &= \beta_0 + (n-y)
\end{split}
$$

with the [number of trials](/D/bin) $n$ and the [number of successes](/D/bin) $y$.
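The equivalence of the logistic form \eqref{eq:PP-LBF} and the closed form \eqref{eq:Bin-PP1} can be verified numerically. The sketch below (not part of the proof page) computes both with Python's standard library; the values $y = 82$, $n = 100$ and the flat prior $\alpha_0 = \beta_0 = 1$ are illustrative assumptions:

```python
from math import exp, lgamma, log

def log_beta(a: float, b: float) -> float:
    """Log of the beta function B(a, b) via log-gamma."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def posterior_prob_m1(y: int, n: int, a0: float, b0: float) -> float:
    """p(m1|y) via the logistic transform of the log Bayes factor."""
    an, bn = a0 + y, b0 + (n - y)
    lbf10 = log_beta(an, bn) - log_beta(a0, b0) - n * log(0.5)
    return exp(lbf10) / (exp(lbf10) + 1)

def posterior_prob_m1_closed(y: int, n: int, a0: float, b0: float) -> float:
    """p(m1|y) via the closed form stated in the theorem."""
    an, bn = a0 + y, b0 + (n - y)
    return 1 / (1 + 2 ** (-n) * exp(log_beta(a0, b0) - log_beta(an, bn)))

y, n = 82, 100
assert abs(posterior_prob_m1(y, n, 1, 1) - posterior_prob_m1_closed(y, n, 1, 1)) < 1e-12
```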
