|
| 1 | +--- |
| 2 | +layout: proof |
| 3 | +mathjax: true |
| 4 | + |
| 5 | +author: "Joram Soch" |
| 6 | +affiliation: "BCCN Berlin" |
| 7 | +e_mail: "joram.soch@bccn-berlin.de" |
| 8 | +date: 2023-12-23 21:41:54 |
| 9 | + |
| 10 | +title: "Multinomial test" |
| 11 | +chapter: "Statistical Models" |
| 12 | +section: "Count data" |
| 13 | +topic: "Multinomial observations" |
| 14 | +theorem: "Multinomial test" |
| 15 | + |
| 16 | +sources: |
| 17 | + - authors: "Wikipedia" |
| 18 | + year: 2023 |
| 19 | + title: "Multinomial test" |
| 20 | + in: "Wikipedia, the free encyclopedia" |
| 21 | + pages: "retrieved on 2023-12-23" |
| 22 | + url: "https://en.wikipedia.org/wiki/Multinomial_test" |
| 23 | + |
| 24 | +proof_id: "P430" |
| 25 | +shortcut: "mult-test" |
| 26 | +username: "JoramSoch" |
| 27 | +--- |
| 28 | + |
| 29 | + |
| 30 | +**Theorem:** Let $y = [y_1, \ldots, y_k]$ be the number of observations in $k$ categories resulting from $n$ independent trials with unknown category probabilities $p = [p_1, \ldots, p_k]$, such that $y$ follows a [multinomial distribution](/D/mult): |
| 31 | + |
| 32 | +$$ \label{eq:Mult} |
| 33 | +y \sim \mathrm{Mult}(n,p) \; . |
| 34 | +$$ |
| 35 | + |
| 36 | +Then, the [null hypothesis](/D/h0) |
| 37 | + |
| 38 | +$$ \label{eq:mult-test-h0} |
| 39 | +H_0: \; p = p_0 = [p_{01}, \ldots, p_{0k}] |
| 40 | +$$ |
| 41 | + |
| 42 | +is [rejected](/D/test) at [significance level](/D/alpha) $\alpha$, if |
| 43 | + |
| 44 | +$$ \label{eq:mult-test-rej} |
| 45 | +\mathrm{Pr}_\mathrm{sig} = \sum_{x: \; \mathrm{Pr}_0(x) \leq \mathrm{Pr}_0(y)} \mathrm{Pr}_0(x) < \alpha |
| 46 | +$$ |
| 47 | + |
| 48 | +where $\mathrm{Pr}_0(x)$ is the probability of observing the numbers of occurrences $x = [x_1, \ldots, x_k]$ under the null hypothesis: |
| 49 | + |
| 50 | +$$ \label{eq:mult-test-prob} |
| 51 | +\mathrm{Pr}_0(x) = n! \prod_{j=1}^k \frac{p_{0j}^{x_j}}{x_j!} \; . |
| 52 | +$$ |
| 53 | + |
| 54 | + |
| 55 | +**Proof:** The [alternative hypothesis](/D/h1) relative to $H_0$ is |
| 56 | + |
| 57 | +$$ \label{eq:mult-test-h1} |
| 58 | +H_1: \; p_j \neq p_{0j} \quad \text{for at least one} \quad j = 1, \ldots, k \; . |
| 59 | +$$ |
| 60 | + |
| 61 | +We can use $y$ as a [test statistic](/D/tstat). Its [sampling distribution](/D/dist-samp) is given by \eqref{eq:Mult}. The [probability mass function](/D/pmf) (PMF) of the test statistic under the null hypothesis is thus equal to the [probability mass function of the multinomial distribution](/P/mult-pmf) with [category probabilities](/D/mult) $p_0$: |
| 62 | + |
| 63 | +$$ \label{eq:y-pmf} |
| 64 | +\mathrm{Pr}(y = x \vert H_0) = \mathrm{Mult}(x; n, p_0) = {n \choose {x_1, \ldots, x_k}} \, \prod_{j=1}^k {p_{0j}}^{x_j} \; . |
| 65 | +$$ |
| 66 | + |
| 67 | +The multinomial coefficient in this equation is equal to |
| 68 | + |
| 69 | +$$ \label{eq:mult-coeff} |
| 70 | +{n \choose {x_1, \ldots, x_k}} = \frac{n!}{x_1! \cdot \ldots \cdot x_k!} \; , |
| 71 | +$$ |
| 72 | + |
| 73 | +such that the probability of observing the counts $y$, given $H_0$, is |
| 74 | + |
| 75 | +$$ \label{eq:Pr0-y} |
| 76 | +\mathrm{Pr}(y \vert H_0) = n! \prod_{j=1}^k \frac{{p_{0j}}^{y_j}}{y_j!} \; . |
| 77 | +$$ |
| 78 | + |
| 79 | +The probability of observing any other set of counts $x$, given $H_0$, is |
| 80 | + |
| 81 | +$$ \label{eq:Pr0-x} |
| 82 | +\mathrm{Pr}(x \vert H_0) = n! \prod_{j=1}^k \frac{{p_{0j}}^{x_j}}{x_j!} \; . |
| 83 | +$$ |
| 84 | + |
| 85 | +The [p-value](/D/pval) is the probability of observing a value of the [test statistic](/D/tstat) that is as extreme or more extreme than the actually observed test statistic. Any set of counts $x$ might be considered as extreme or more extreme than the actually observed counts $y$, if the former is equally probable or less probable than the latter: |
| 86 | + |
| 87 | +$$ \label{eq:mult-test-cond} |
| 88 | +\mathrm{Pr}_0(x) \leq \mathrm{Pr}_0(y) \; . |
| 89 | +$$ |
| 90 | + |
| 91 | +Thus, the [p-value](/D/pval) for the data in \eqref{eq:Mult} is equal to |
| 92 | + |
| 93 | +$$ \label{eq:mult-test-p} |
| 94 | +p = \sum_{x: \; \mathrm{Pr}_0(x) \leq \mathrm{Pr}_0(y)} \mathrm{Pr}_0(x) |
| 95 | +$$ |
| 96 | + |
| 97 | +and the null hypothesis in \eqref{eq:mult-test-h0} is [rejected](/D/test), if |
| 98 | + |
| 99 | +$$ \label{eq:mult-test-rej-qed} |
| 100 | +p < \alpha \; . |
| 101 | +$$ |
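
The rejection rule can be illustrated numerically. The following is a minimal Python sketch, not part of the proof: it enumerates every count vector $x$ with $\sum_j x_j = n$, evaluates $\mathrm{Pr}_0(x)$ in log space, and sums the probabilities that do not exceed $\mathrm{Pr}_0(y)$. The function names `multinomial_pmf`, `compositions` and `multinomial_test` are hypothetical, and the relative tolerance `rel_tol` used to treat floating-point ties as equal is an assumption that does not appear in the theorem. Exhaustive enumeration is only practical for small $n$ and $k$.

```python
from math import exp, lgamma, log


def multinomial_pmf(x, p0):
    """Pr_0(x) = n! * prod_j p0_j^x_j / x_j!, evaluated in log space."""
    n = sum(x)
    log_pr = lgamma(n + 1)  # log(n!)
    for x_j, p_j in zip(x, p0):
        if x_j > 0:
            if p_j == 0.0:
                return 0.0  # a category with zero probability was observed
            log_pr += x_j * log(p_j) - lgamma(x_j + 1)
    return exp(log_pr)


def compositions(n, k):
    """Yield all length-k tuples of non-negative integers summing to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest


def multinomial_test(y, p0, rel_tol=1e-9):
    """Exact p-value: sum of Pr_0(x) over all x with Pr_0(x) <= Pr_0(y).

    rel_tol is an assumption of this sketch: it keeps outcomes that are
    equally probable in exact arithmetic from being dropped due to rounding.
    """
    n, k = sum(y), len(y)
    pr_y = multinomial_pmf(y, p0)
    threshold = pr_y * (1.0 + rel_tol)
    return sum(
        pr_x
        for pr_x in (multinomial_pmf(x, p0) for x in compositions(n, k))
        if pr_x <= threshold
    )


if __name__ == "__main__":
    y = (3, 0, 0)              # observed counts, n = 3 trials
    p0 = (1 / 3, 1 / 3, 1 / 3)  # category probabilities under H0
    alpha = 0.05
    p_value = multinomial_test(y, p0)
    print("p-value:", p_value, "reject H0:", p_value < alpha)
```

For $y = [3, 0, 0]$ and uniform $p_0$, only the three outcomes that place all trials in one category are as improbable as $y$ (each has probability $1/27$), so the p-value is $3/27 = 1/9$ and $H_0$ is not rejected at $\alpha = 0.05$.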