|
| 1 | +--- |
| 2 | +layout: proof |
| 3 | +mathjax: true |
| 4 | + |
| 5 | +author: "Joram Soch" |
| 6 | +affiliation: "BCCN Berlin" |
| 7 | +e_mail: "joram.soch@bccn-berlin.de" |
| 8 | +date: 2020-10-22 08:04:00 |
| 9 | + |
| 10 | +title: "Exceedance probabilities for the Dirichlet distribution" |
| 11 | +chapter: "Probability Distributions" |
| 12 | +section: "Multivariate continuous distributions" |
| 13 | +topic: "Dirichlet distribution" |
| 14 | +theorem: "Exceedance probabilities" |
| 15 | + |
| 16 | +sources: |
| 17 | + - authors: "Soch J, Allefeld C" |
| 18 | + year: 2016 |
| 19 | + title: "Exceedance Probabilities for the Dirichlet Distribution" |
| 20 | + in: "arXiv stat.AP" |
| 21 | + pages: "1611.01439" |
| 22 | + url: "https://arxiv.org/abs/1611.01439" |
| 23 | + |
| 24 | +proof_id: "P181" |
| 25 | +shortcut: "dir-ep" |
| 26 | +username: "JoramSoch" |
| 27 | +--- |
| 28 | + |
| 29 | + |
| 30 | +**Theorem:** Let $r = [r_1, \ldots, r_k]$ be a [random vector](/D/rvec) following a [Dirichlet distribution](/D/dir) with concentration parameters $\alpha = [\alpha_1, \ldots, \alpha_k]$: |
| 31 | + |
| 32 | +$$ \label{eq:r-Dir} |
| 33 | +r \sim \mathrm{Dir}(\alpha) \; . |
| 34 | +$$ |
| 35 | + |
| 36 | +<br> |
| 37 | +1) If $k = 2$, then the [exceedance probability](/D/prob-exc) for $r_1$ is |
| 38 | + |
| 39 | +$$ \label{eq:Dir2-EP} |
| 40 | +\varphi_1 = 1 - \frac{\mathrm{B}\left( \frac{1}{2};\alpha_1,\alpha_2 \right)}{\mathrm{B}(\alpha_1,\alpha_2)} |
| 41 | +$$ |
| 42 | + |
| 43 | +where $\mathrm{B}(x,y)$ is the beta function and $\mathrm{B}(x;a,b)$ is the incomplete beta function. |
| 44 | + |
| 45 | +<br> |
| 46 | +2) If $k > 2$, then the [exceedance probability](/D/prob-exc) for $r_i$ is |
| 47 | + |
| 48 | +$$ \label{eq:Dir-EP} |
| 49 | +\varphi_i = \int_0^\infty \prod_{j \neq i} \left( \frac{\gamma(\alpha_j,q_i)}{\Gamma(\alpha_j)} \right) \, \frac{q_i^{\alpha_i-1} \exp[-q_i]}{\Gamma(\alpha_i)} \, \mathrm{d}q_i \; . |
| 50 | +$$ |
| 51 | + |
| 52 | +where $\Gamma(x)$ is the gamma function and $\gamma(s,x)$ is the lower incomplete gamma function. |
| 53 | + |
| 54 | + |
| 55 | +**Proof:** In the context of the [Dirichlet distribution](/D/dir), the [exceedance probability](/D/prob-exc) for a particular $r_i$ is defined as: |
| 56 | + |
| 57 | +$$ \label{eq:Dir-EP-def} |
| 58 | +\begin{split} |
| 59 | +\varphi_i &= p \Bigl( \forall j \in \left\lbrace 1, \ldots, k \Bigm| j \neq i \right\rbrace: \, r_i > r_j \Bigm| \alpha \Bigr) \\ |
| 60 | +&= p \Bigl( \bigwedge_{j \neq i} r_i > r_j \Bigm| \alpha \Bigr) \; . |
| 61 | +\end{split} |
| 62 | +$$ |
| 63 | + |
| 64 | +The [probability density function of the Dirichlet distribution](/P/dir-pdf) is given by: |
| 65 | + |
| 66 | +$$ \label{eq:Dir-pdf} |
| 67 | +\mathrm{Dir}(r; \alpha) = \frac{\Gamma\left( \sum_{i=1}^k \alpha_i \right)}{\prod_{i=1}^k \Gamma(\alpha_i)} \, \prod_{i=1}^k {r_i}^{\alpha_i-1} \; . |
| 68 | +$$ |
| 69 | + |
| 70 | +Note that this expression for the probability density function applies only if |
| 71 | + |
| 72 | +$$ \label{eq:Dir-req} |
| 73 | +r_i \in [0,1] \quad \text{for} \quad i = 1,\ldots,k \quad \text{and} \quad \sum_{i=1}^k r_i = 1 \; , |
| 74 | +$$ |
| 75 | + |
| 76 | +and that the density is [defined to be zero otherwise](/D/dir). |
| 77 | + |
| 78 | +<br> |
| 79 | +1) If $k = 2$, the [probability density function of the Dirichlet distribution](/P/dir-pdf) reduces to |
| 80 | + |
| 81 | +$$ \label{eq:Dir2-pdf} |
| 82 | +p(r) = \frac{\Gamma(\alpha_1 + \alpha_2)}{\Gamma(\alpha_1) \, \Gamma(\alpha_2)} \, r_1^{\alpha_1-1} \, r_2^{\alpha_2-1} |
| 83 | +$$ |
| 84 | + |
| 85 | +which is equivalent to the [probability density function of the beta distribution](/P/beta-pdf) |
| 86 | + |
| 87 | +$$ \label{eq:Beta-pdf} |
| 88 | +p(r_1) = \frac{r_1^{\alpha_1-1} \, (1-r_1)^{\alpha_2-1}}{\mathrm{B}(\alpha_1,\alpha_2)} |
| 89 | +$$ |
| 90 | + |
| 91 | +with the beta function given by |
| 92 | + |
| 93 | +$$ \label{eq:beta-fct} |
| 94 | +\mathrm{B}(x,y) = \frac{\Gamma(x) \, \Gamma(y)}{\Gamma(x + y)} \; . |
| 95 | +$$ |
| 96 | + |
| 97 | +With \eqref{eq:Dir-req}, the exceedance probability for this bivariate case simplifies to |
| 98 | + |
| 99 | +$$ \label{eq:Dir2-EP-def} |
| 100 | +\varphi_1 = p(r_1 > r_2) = p(r_1 > 1 - r_1) = p(r_1 > 1/2) = \int_{\frac{1}{2}}^1 p(r_1) \, \mathrm{d}r_1 \; . |
| 101 | +$$ |
| 102 | + |
| 103 | +Using the [cumulative distribution function of the beta distribution](/P/beta-cdf), it evaluates to |
| 104 | + |
| 105 | +$$ \label{eq:Dir2-EP-qed} |
| 106 | +\varphi_1 = 1 - \int_0^{\frac{1}{2}} p(r_1) \, \mathrm{d}r_1 = 1 - \frac{\mathrm{B}\left( \frac{1}{2};\alpha_1,\alpha_2 \right)}{\mathrm{B}(\alpha_1,\alpha_2)} |
| 107 | +$$ |
| 108 | + |
| 109 | +with the incomplete beta function |
| 110 | + |
| 111 | +$$ \label{eq:inc-beta-fct} |
| 112 | +\mathrm{B}(x; a, b) = \int_0^x t^{a-1} \, (1-t)^{b-1} \, \mathrm{d}t \; . |
| 113 | +$$ |
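
The bivariate result can be checked numerically. The following is a minimal sketch, not part of the original proof: the function name `beta_exceedance` and the step count `n` are hypothetical, the incomplete beta function is approximated with Simpson's rule, and the code assumes $\alpha_1 \geq 1$ so that the integrand stays finite at zero.

```python
import math

def beta_exceedance(a1, a2, n=2000):
    """Approximate phi_1 = 1 - B(1/2; a1, a2) / B(a1, a2) for k = 2.

    Assumes a1 >= 1 (integrand finite at t = 0) and an even n.
    """
    def f(t):
        # Integrand of the incomplete beta function
        return t ** (a1 - 1) * (1 - t) ** (a2 - 1)

    # Simpson's rule on [0, 1/2] for the incomplete beta function B(1/2; a1, a2)
    h = 0.5 / n
    s = f(0.0) + f(0.5)
    for m in range(1, n):
        s += (4 if m % 2 else 2) * f(m * h)
    inc_beta = s * h / 3

    # Complete beta function via gamma functions
    beta = math.gamma(a1) * math.gamma(a2) / math.gamma(a1 + a2)
    return 1 - inc_beta / beta

phi_1 = beta_exceedance(3, 2)
phi_2 = beta_exceedance(2, 3)
print(phi_1, phi_2, phi_1 + phi_2)
```

For $k = 2$, the two exceedance probabilities should sum to one, and for $\alpha_1 = \alpha_2$ each should equal $1/2$ by symmetry.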
| 114 | + |
| 115 | +<br> |
| 116 | +2) If $k > 2$, there is no similarly simple expression, because in general |
| 117 | + |
| 118 | +$$ \label{eq:Dir-EP-ineq} |
| 119 | +\varphi_i = p(r_i = \mathrm{max}(r)) > p(r_i > 1/2) \quad \text{for} \quad i = 1, \ldots, k \; , |
| 120 | +$$ |
| 121 | + |
| 122 | +i.e. exceedance probabilities cannot be evaluated using a simple threshold on $r_i$, because $r_i$ might be the maximal element in $r$ without being larger than $1/2$. Instead, we make use of the [relationship between the Dirichlet and the gamma distribution](/P/gam-dir) which states that |
| 123 | + |
| 124 | +$$ \label{eq:Gam-Dir} |
| 125 | +\begin{split} |
| 126 | +& Y_1 \sim \mathrm{Gam}(\alpha_1,\beta), \, \ldots, \, Y_k \sim \mathrm{Gam}(\alpha_k,\beta), \, Y_s = \sum_{j=1}^k Y_j \\ |
| 127 | +\Rightarrow \; & X = (X_1, \ldots, X_k) = \left( \frac{Y_1}{Y_s}, \ldots, \frac{Y_k}{Y_s} \right) \sim \mathrm{Dir}(\alpha_1, \ldots, \alpha_k) \; . |
| 128 | +\end{split} |
| 129 | +$$ |
| 130 | + |
| 131 | +The [probability density function of the gamma distribution](/P/gam-pdf) is given by |
| 132 | + |
| 133 | +$$ \label{eq:Gam-pdf} |
| 134 | +\mathrm{Gam}(x; a, b) = \frac{{b}^{a}}{\Gamma(a)} \, x^{a-1} \, \exp[-b x] \quad \text{for} \quad x > 0 \; . |
| 135 | +$$ |
| 136 | + |
| 137 | +Consider the [gamma random variables](/D/gam) |
| 138 | + |
| 139 | +$$ \label{eq:Gam-Dir-A} |
| 140 | +q_1 \sim \mathrm{Gam}(\alpha_1,1), \, \ldots, \, q_k \sim \mathrm{Gam}(\alpha_k,1), \, q_s = \sum_{j=1}^k q_j |
| 141 | +$$ |
| 142 | + |
| 143 | +and the [Dirichlet random vector](/D/dir) |
| 144 | + |
| 145 | +$$ \label{eq:Gam-Dir-B} |
| 146 | +r = (r_1, \ldots, r_k) = \left( \frac{q_1}{q_s}, \ldots, \frac{q_k}{q_s} \right) \sim \mathrm{Dir}(\alpha_1, \ldots, \alpha_k) \; . |
| 147 | +$$ |
| 148 | + |
| 149 | +Because all components of $r$ share the positive denominator $q_s$, it holds that |
| 150 | + |
| 151 | +$$ \label{eq:Gam-Dir-eq} |
| 152 | +r_i > r_j \; \Leftrightarrow \; q_i > q_j \quad \text{for} \quad i,j = 1, \ldots, k \quad \text{with} \quad j \neq i \; . |
| 153 | +$$ |
| 154 | + |
| 155 | +Therefore, consider the probability that $q_i$ is larger than $q_j$, given $q_i$ is known. This probability is equal to the probability that $q_j$ is smaller than $q_i$, given $q_i$ is known |
| 156 | + |
| 157 | +$$ \label{eq:Gam-EP0} |
| 158 | +p(q_i > q_j|q_i) = p(q_j < q_i|q_i) |
| 159 | +$$ |
| 160 | + |
| 161 | +which can be expressed in terms of the [cumulative distribution function of the gamma distribution](/P/gam-cdf) as |
| 162 | + |
| 163 | +$$ \label{eq:Gam-EP1} |
| 164 | +p(q_j < q_i|q_i) = \int_0^{q_i} \mathrm{Gam}(q_j;\alpha_j,1) \, \mathrm{d}q_j = \frac{\gamma(\alpha_j,q_i)}{\Gamma(\alpha_j)} |
| 165 | +$$ |
| 166 | + |
| 167 | +where $\Gamma(x)$ is the gamma function and $\gamma(s,x)$ is the lower incomplete gamma function. Since the gamma variates are independent of each other, these probabilities factorize: |
| 168 | + |
| 169 | +$$ \label{eq:Gam-EP2} |
| 170 | +p(\forall_{j \neq i} \left[ q_i > q_j \right]|q_i) = \prod_{j \neq i} p(q_i > q_j|q_i) = \prod_{j \neq i} \frac{\gamma(\alpha_j,q_i)}{\Gamma(\alpha_j)} \; . |
| 171 | +$$ |
| 172 | + |
| 173 | +In order to obtain the exceedance probability $\varphi_i$, the dependency on $q_i$ in this probability still has to be removed. From equations (\ref{eq:Dir-EP-def}) and (\ref{eq:Gam-Dir-eq}), it follows that |
| 174 | + |
| 175 | +$$ \label{eq:Dir-EP2a} |
| 176 | +\varphi_i = p(\forall_{j \neq i} \left[ r_i > r_j \right]) = p(\forall_{j \neq i} \left[ q_i > q_j \right]) \; . |
| 177 | +$$ |
| 178 | + |
| 179 | +Using the [law of marginal probability](/D/prob-marg), we have |
| 180 | + |
| 181 | +$$ \label{eq:Dir-EP2b} |
| 182 | +\varphi_i = \int_0^\infty p(\forall_{j \neq i} \left[ q_i > q_j \right]|q_i) \, p(q_i) \, \mathrm{d}q_i \; . |
| 183 | +$$ |
| 184 | + |
| 185 | +With (\ref{eq:Gam-EP2}) and (\ref{eq:Gam-Dir-A}), this becomes |
| 186 | + |
| 187 | +$$ \label{eq:Dir-EP2c} |
| 188 | +\varphi_i = \int_0^\infty \prod_{j \neq i} \left( p(q_i > q_j|q_i) \right) \cdot \mathrm{Gam}(q_i;\alpha_i,1) \, \mathrm{d}q_i \; . |
| 189 | +$$ |
| 190 | + |
| 191 | +And with (\ref{eq:Gam-EP1}) and (\ref{eq:Gam-pdf}), it becomes |
| 192 | + |
| 193 | +$$ \label{eq:Dir-EP-qed} |
| 194 | +\varphi_i = \int_0^\infty \prod_{j \neq i} \left( \frac{\gamma(\alpha_j,q_i)}{\Gamma(\alpha_j)} \right) \cdot \frac{q_i^{\alpha_i-1} \exp[-q_i]}{\Gamma(\alpha_i)} \, \mathrm{d}q_i \; . |
| 195 | +$$ |
| 196 | + |
| 197 | +In other words, the [exceedance probability](/D/prob-exc) for one element from a [Dirichlet-distributed](/D/dir) [random vector](/D/rvec) is an integral from zero to infinity where the first term in the integrand conforms to a product of [gamma](/D/gam) [cumulative distribution functions](/D/cdf) and the second term is a [gamma](/D/gam) [probability density function](/D/pdf). |
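
The integral can also be evaluated numerically and compared against direct sampling from the gamma representation. The sketch below is an illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the regularized lower incomplete gamma function is computed from its power series, the integral is truncated at `upper = 60` (reasonable only for moderate concentration parameters), and Simpson's rule replaces the infinite integral.

```python
import math
import random

def reg_lower_gamma(a, x, terms=500):
    """gamma(a, x) / Gamma(a), via the power series of the lower incomplete gamma function."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, terms):
        term *= x / (a + n)
        total += term
        if term < 1e-16 * total:
            break
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))

def dirichlet_exceedance(alpha, i, upper=60.0, n=4000):
    """Approximate phi_i by Simpson's rule on [0, upper] (assumed truncation, even n)."""
    def integrand(q):
        if q <= 0.0:
            return 0.0  # assumes sum(alpha) > 1, so the integrand vanishes at zero
        # Gamma(alpha_i, 1) density of q_i, computed on the log scale
        log_pdf = (alpha[i] - 1) * math.log(q) - q - math.lgamma(alpha[i])
        # Product of gamma CDFs for all j != i
        prod = 1.0
        for j, a_j in enumerate(alpha):
            if j != i:
                prod *= reg_lower_gamma(a_j, q)
        return prod * math.exp(log_pdf)

    h = upper / n
    s = integrand(0.0) + integrand(upper)
    for m in range(1, n):
        s += (4 if m % 2 else 2) * integrand(m * h)
    return s * h / 3

def dirichlet_exceedance_mc(alpha, i, draws=50_000, seed=0):
    """Monte Carlo estimate of phi_i using independent Gam(alpha_j, 1) draws."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        q = [rng.gammavariate(a, 1.0) for a in alpha]
        if all(q[i] > q[j] for j in range(len(q)) if j != i):
            hits += 1
    return hits / draws

alpha = [2.0, 3.5, 1.5]
phi = [dirichlet_exceedance(alpha, i) for i in range(len(alpha))]
print(phi, sum(phi))
print([dirichlet_exceedance_mc(alpha, i) for i in range(len(alpha))])
```

Two sanity checks follow from the theorem itself: the exceedance probabilities over all $i$ should sum to one, and for $\alpha = [1, 1, 1]$ each should equal $1/3$ by symmetry.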