Commit f95dcfe

authored
added proof "mult-map"
1 parent 0406ae2 commit f95dcfe

1 file changed

Lines changed: 80 additions & 0 deletions

P/mult-map.md

@@ -0,0 +1,80 @@
1+
---
2+
layout: proof
3+
mathjax: true
4+
5+
author: "Joram Soch"
6+
affiliation: "BCCN Berlin"
7+
e_mail: "joram.soch@bccn-berlin.de"
8+
date: 2023-12-08 15:14:47
9+
10+
title: "Maximum-a-posteriori estimation for multinomial observations"
11+
chapter: "Statistical Models"
12+
section: "Count data"
13+
topic: "Multinomial observations"
14+
theorem: "Maximum-a-posteriori estimation"
15+
16+
sources:
17+
18+
proof_id: "P428"
19+
shortcut: "mult-map"
20+
username: "JoramSoch"
21+
---
22+
23+
24+
**Theorem:** Let $y = [y_1, \ldots, y_k]$ be the number of observations in $k$ categories resulting from $n$ independent trials with unknown category probabilities $p = [p_1, \ldots, p_k]$, such that $y$ follows a [multinomial distribution](/D/mult):
25+
26+
$$ \label{eq:Mult}
27+
y \sim \mathrm{Mult}(n,p) \; .
28+
$$
29+
30+
Moreover, assume a [Dirichlet prior distribution](/P/mult-prior) over the model parameter $p$:
31+
32+
$$ \label{eq:Mult-prior}
33+
\mathrm{p}(p) = \mathrm{Dir}(p; \alpha_0) \; .
34+
$$
35+
36+
Then, the [maximum-a-posteriori estimates](/D/map) of $p$ are
37+
38+
$$ \label{eq:Mult-MAP}
39+
\hat{p}_\mathrm{MAP} = \frac{\alpha_0+y-1}{\sum_{j=1}^k \alpha_{0j} + n - k} \; .
40+
$$
41+
42+
43+
**Proof:** Given the [prior distribution](/D/prior) in \eqref{eq:Mult-prior}, the [posterior distribution](/D/post) for [multinomial observations](/D/mult-data) [is also a Dirichlet distribution](/P/mult-post)
44+
45+
$$ \label{eq:Mult-post}
46+
\mathrm{p}(p|y) = \mathrm{Dir}(p; \alpha_n)
47+
$$
48+
49+
where the [posterior hyperparameters](/D/post) are equal to
50+
51+
$$ \label{eq:Mult-post-par}
52+
\alpha_{nj} = \alpha_{0j} + y_j, \; j = 1,\ldots,k \; .
53+
$$
54+
55+
Provided that all $\alpha_i > 1$, the [mode of the Dirichlet distribution](/P/dir-mode) is given by:
56+
57+
$$ \label{eq:Dir-mode}
58+
X \sim \mathrm{Dir}(\alpha) \quad \Rightarrow \quad \mathrm{mode}(X_i) = \frac{\alpha_i-1}{\sum_j \alpha_j - k} \; .
59+
$$
60+
61+
Applying \eqref{eq:Dir-mode} to \eqref{eq:Mult-post} with \eqref{eq:Mult-post-par}, the [maximum-a-posteriori estimates](/D/map) of $p$ follow as
62+
63+
$$ \label{eq:Mult-MAP-s1}
64+
\begin{split}
65+
\hat{p}_{i,\mathrm{MAP}} &= \frac{\alpha_{ni} - 1}{\sum_j \alpha_{nj} - k} \\
66+
&\overset{\eqref{eq:Mult-post-par}}{=} \frac{\alpha_{0i} + y_i - 1}{\sum_j (\alpha_{0j} + y_j) - k} \\
67+
&= \frac{\alpha_{0i} + y_i - 1}{\sum_j \alpha_{0j} + \sum_j y_j - k} \; .
68+
\end{split}
69+
$$
70+
71+
Since $y_1 + \ldots + y_k = n$ [by definition](/D/mult-data), this becomes
72+
73+
$$ \label{eq:Mult-MAP-s2}
74+
\hat{p}_{i,\mathrm{MAP}} = \frac{\alpha_{0i} + y_i - 1}{\sum_j \alpha_{0j} + n - k} $$
75+
76+
which, using the $1 \times k$ [vectors](/D/mult-data) $y$, $p$ and $\alpha_0$, can be written as:
77+
78+
$$ \label{eq:Mult-MAP-qed}
79+
\hat{p}_\mathrm{MAP} = \frac{\alpha_0+y-1}{\sum_{j=1}^k \alpha_{0j} + n - k} \; .
80+
$$
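
The closed-form result can be checked numerically. The following is a minimal sketch, not part of the committed proof: the function name `multinomial_map` and the example counts and prior are hypothetical, and the code assumes that every posterior hyperparameter $\alpha_{0j} + y_j$ exceeds 1, as the Dirichlet mode formula requires.

```python
def multinomial_map(y, alpha0):
    """MAP estimate of p for y ~ Mult(n, p) with prior Dir(p; alpha0).

    Hypothetical helper for illustration. Assumes alpha0[j] + y[j] > 1
    for every category j, so that the posterior mode is interior.
    """
    if len(y) != len(alpha0):
        raise ValueError("y and alpha0 must have the same length")
    k = len(y)   # number of categories
    n = sum(y)   # number of trials, since y_1 + ... + y_k = n
    # Posterior hyperparameters: alpha_nj = alpha_0j + y_j
    alpha_n = [a + c for a, c in zip(alpha0, y)]
    if any(a <= 1 for a in alpha_n):
        raise ValueError("mode formula requires all alpha_0j + y_j > 1")
    # Denominator: sum_j alpha_0j + n - k
    denom = sum(alpha0) + n - k
    return [(a - 1) / denom for a in alpha_n]


# Example with made-up counts and a symmetric Dir(2, 2, 2) prior
p_map = multinomial_map([3, 5, 2], [2, 2, 2])
print(p_map, sum(p_map))

# With a flat prior Dir(1, 1, 1), the estimate reduces to y / n
print(multinomial_map([3, 5, 2], [1, 1, 1]))
```

The estimates sum to one by construction, because the numerators $\alpha_{0j} + y_j - 1$ add up to the denominator $\sum_j \alpha_{0j} + n - k$.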
