Commit fb45bf6

Merge pull request #86 from tomfaulkenberry/master
added [em,bf-ep] / modified [bf,bf-sddr]
2 parents ef93153 + d8c2db0 commit fb45bf6

4 files changed: 151 additions & 23 deletions

D/bf.md

Lines changed: 5 additions & 5 deletions
@@ -28,22 +28,22 @@ username: "tomfaulkenberry"
 ---


-**Definition:** Consider two competing [generative models](/D/gm) $\mathcal{M}_1$ and $\mathcal{M}_2$ for observed data $y$. Then the Bayes factor in favor $\mathcal{M}_1$ over $\mathcal{M}_2$ is the ratio of [marginal likelihoods](/D/ml) of $\mathcal{M}_1$ and $\mathcal{M}_2$:
+**Definition:** Consider two competing [generative models](/D/gm) $m_1$ and $m_2$ for observed data $y$. Then the Bayes factor in favor of $m_1$ over $m_2$ is the ratio of [marginal likelihoods](/D/ml) of $m_1$ and $m_2$:

 $$ \label{eq:BF}
-\text{BF}_{12} = \frac{p(y\mid \mathcal{M}_1)}{p(y\mid \mathcal{M}_2)}.
+\text{BF}_{12} = \frac{p(y\mid m_1)}{p(y\mid m_2)}.
 $$

 Note: by [Bayes Theorem](/P/bayes-th), the ratio of [posterior model probabilities](/D/pmp) (i.e., the posterior model odds) can be written as

 $$ \label{eq:odds}
-\frac{p(\mathcal{M}_1\mid y)}{p(\mathcal{M}_2\mid y)} = \frac{p(\mathcal{M}_1)}{p(\mathcal{M}_2)} \cdot \frac{p(y\mid \mathcal{M}_1)}{p(y\mid \mathcal{M}_2)},
+\frac{p(m_1 \mid y)}{p(m_2 \mid y)} = \frac{p(m_1)}{p(m_2)} \cdot \frac{p(y\mid m_1)}{p(y\mid m_2)},
 $$

 or equivalently by \eqref{eq:BF},

 $$ \label{eq:odds2}
-\frac{p(\mathcal{M}_1\mid y)}{p(\mathcal{M}_2\mid y)} = \frac{p(\mathcal{M}_1)}{p(\mathcal{M}_2)} \cdot \text{BF}_{12}.
+\frac{p(m_1 \mid y)}{p(m_2 \mid y)} = \frac{p(m_1)}{p(m_2)} \cdot \text{BF}_{12}.
 $$

-In other words, the Bayes factor can be viewed as the factor by which the prior model odds are updated (after observing data $y$) to posterior model odds.
+In other words, the Bayes factor can be viewed as the factor by which the prior model odds are updated (after observing data $y$) to posterior model odds (see also [Bayes' rule](/P/bayes-rule)).
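
As an aside, the definition can be checked with a quick numeric sketch (a hypothetical example, not part of the commit): compare a point model $m_1: \theta = 0.5$ against $m_2: \theta \sim \text{Uniform}(0,1)$ for binomial data, where both marginal likelihoods are available in closed form.

```python
from math import comb

# Hypothetical data (not from the source): y = 7 successes in n = 10 trials.
n, y = 10, 7

# m1 fixes theta = 0.5, so its marginal likelihood is just the binomial pmf.
ml_m1 = comb(n, y) * 0.5**y * 0.5**(n - y)

# m2 puts theta ~ Uniform(0,1); the marginal likelihood integrates to 1/(n+1).
ml_m2 = 1 / (n + 1)

bf_12 = ml_m1 / ml_m2  # Bayes factor in favor of m1 over m2

# With equal prior model odds, the posterior odds equal the Bayes factor,
# illustrating that BF_12 is the update factor from prior to posterior odds.
prior_odds = 1.0
posterior_odds = prior_odds * bf_12
print(bf_12, posterior_odds)
```

With equal prior odds the data here are mildly uninformative between the two models, so neither is strongly favored.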

D/em.md

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+---
+layout: definition
+mathjax: true
+
+author: "Thomas J. Faulkenberry"
+affiliation: "Tarleton State University"
+e_mail: "faulkenberry@tarleton.edu"
+date: 2020-08-26 12:00:00
+
+title: "Encompassing model"
+chapter: "Model Selection"
+section: "Bayesian model selection"
+topic: "Bayes factor"
+definition: "Definition"
+
+sources:
+- authors: "Klugkist, I., Kato, B., and Hoijtink, H."
+  year: 2005
+  title: "Bayesian model selection using encompassing priors"
+  in: "Statistica Neerlandica"
+  pages: "vol. 59, no. 1, pp. 57-69"
+  url: "https://dx.doi.org/10.1111/j.1467-9574.2005.00279.x"
+  doi: "10.1111/j.1467-9574.2005.00279.x"
+
+def_id: "D93"
+shortcut: "em"
+username: "tomfaulkenberry"
+---
+
+
+**Definition:** Consider a family $f$ of [generative models](/D/gm) $m$ on data $y$, where each $m \in f$ is defined by placing an inequality constraint on model parameter(s) $\theta$ (e.g., $m: \theta > 0$). Then the encompassing model $m_e$ is constructed such that each $m$ is nested within $m_e$ and all inequality constraints on the parameter(s) $\theta$ are removed.
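
To make the definition concrete (a hypothetical sketch, not part of the commit): take an encompassing model $m_e: \theta \sim \mathcal{N}(0,1)$ with $\theta$ unconstrained, and the nested model $m_1: \theta > 0$. The prior of $m_1$ is then the encompassing prior truncated to the constrained region, and the prior mass in agreement with the constraint can be estimated by Monte Carlo.

```python
import random

random.seed(1)

# Hypothetical encompassing model m_e: theta ~ Normal(0, 1), unconstrained.
prior_draws = [random.gauss(0.0, 1.0) for _ in range(100_000)]

# Nested model m1 imposes theta > 0. The proportion of encompassing-prior
# draws satisfying the constraint estimates the prior mass in agreement
# with m1 (about 0.5 here, since the prior is symmetric about zero).
prop_constraint = sum(d > 0 for d in prior_draws) / len(prior_draws)
print(prop_constraint)
```

This proportion is exactly the quantity $1/c$ that appears in the encompassing prior method for computing Bayes factors.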

P/bf-ep.md

Lines changed: 97 additions & 0 deletions
@@ -0,0 +1,97 @@
+---
+layout: proof
+mathjax: true
+
+author: "Thomas J. Faulkenberry"
+affiliation: "Tarleton State University"
+e_mail: "faulkenberry@tarleton.edu"
+date: 2020-08-26 12:00:00
+
+title: "Encompassing Prior Method for computing Bayes Factors"
+chapter: "Model Selection"
+section: "Bayesian model selection"
+topic: "Bayes factor"
+theorem: "Computation using Encompassing Prior Method"
+
+sources:
+- authors: "Klugkist, I., Kato, B., and Hoijtink, H."
+  year: 2005
+  title: "Bayesian model selection using encompassing priors"
+  in: "Statistica Neerlandica"
+  pages: "vol. 59, no. 1, pp. 57-69"
+  url: "https://dx.doi.org/10.1111/j.1467-9574.2005.00279.x"
+  doi: "10.1111/j.1467-9574.2005.00279.x"
+
+- authors: "Faulkenberry, Thomas J."
+  year: 2019
+  title: "A tutorial on generalizing the default Bayesian t-test via posterior sampling and encompassing priors"
+  in: "Communications for Statistical Applications and Methods"
+  pages: "vol. 26, no. 2, pp. 217-238"
+  url: "https://dx.doi.org/10.29220/CSAM.2019.26.2.217"
+  doi: "10.29220/CSAM.2019.26.2.217"
+
+proof_id: "P157"
+shortcut: "bf-ep"
+username: "tomfaulkenberry"
+---
+
+
+**Theorem:** Consider two models $m_1$ and $m_e$, where $m_1$ is nested within an [encompassing model](/D/em) $m_e$ via an inequality constraint on some parameter $\theta$, and $\theta$ is unconstrained under $m_e$. Then
+
+$$
+B_{1e} = \frac{c}{d} = \frac{1/d}{1/c},
+$$
+
+where $1/d$ and $1/c$ represent the proportions of the posterior and prior of the encompassing model, respectively, that are in agreement with the inequality constraint imposed by the nested model $m_1$.
+
+**Proof:**
+
+Consider first that for any model $m_1$ on data $y$ with parameter $\theta$, [Bayes theorem](/P/bayes-th) implies
+
+$$ \label{eq:bayesth}
+p(\theta \mid y,m_1) = \frac{p(y \mid \theta,m_1) \cdot p(\theta \mid m_1)}{p(y \mid m_1)}.
+$$
+
+Rearranging Equation \eqref{eq:bayesth} allows us to write the [marginal likelihood](/D/ml) for $y$ under $m_1$ as
+
+$$ \label{eq:marginal}
+p(y \mid m_1) = \frac{p(y \mid \theta,m_1) \cdot p(\theta \mid m_1)}{p(\theta \mid y,m_1)}.
+$$
+
+Taking the ratio of the marginal likelihoods for $m_1$ and the [encompassing model](/D/em) $m_e$ yields the following [Bayes factor](/D/bf):
+
+$$ \label{eq:bayesfactor}
+B_{1e} = \frac{p(y \mid \theta,m_1) \cdot p(\theta \mid m_1) / p(\theta \mid y,m_1)}{p(y \mid \theta,m_e) \cdot p(\theta \mid m_e) / p(\theta \mid y,m_e)}.
+$$
+
+Now, both the constrained model $m_1$ and the [encompassing model](/D/em) $m_e$ contain the same parameter vector $\theta$. Choose a specific value of $\theta$, say $\theta'$, that lies in the support of both models $m_1$ and $m_e$ (we can do this because $m_1$ is nested within $m_e$). Then, for this parameter value $\theta'$, we have $p(y \mid \theta',m_1) = p(y \mid \theta',m_e)$, so the expression for the Bayes factor (Equation \eqref{eq:bayesfactor} above) reduces to an expression involving only the priors and posteriors for $\theta'$ under $m_1$ and $m_e$:
+
+$$ \label{eq:bayesfactor2}
+B_{1e} = \frac{p(\theta' \mid m_1) / p(\theta' \mid y,m_1)}{p(\theta' \mid m_e) / p(\theta' \mid y,m_e)}.
+$$
+
+Because $m_1$ is nested within $m_e$ via an inequality constraint, the prior $p(\theta' \mid m_1)$ is simply a truncation of the encompassing prior $p(\theta' \mid m_e)$. Thus, we can express $p(\theta' \mid m_1)$ in terms of the encompassing prior $p(\theta' \mid m_e)$ by multiplying the encompassing prior by an indicator function over $m_1$ and then normalizing the resulting product. That is,
+
+$$ \label{eq:normalize}
+\begin{split}
+p(\theta' \mid m_1) & = \frac{p(\theta' \mid m_e) \cdot I_{\theta' \in m_1}}{\int p(\theta' \mid m_e) \cdot I_{\theta' \in m_1} \, d\theta'} \\
+& = \Biggl(\frac{I_{\theta' \in m_1}}{\int p(\theta' \mid m_e) \cdot I_{\theta' \in m_1} \, d\theta'}\Biggr) \cdot p(\theta' \mid m_e),
+\end{split}
+$$
+
+where $I_{\theta' \in m_1}$ is an indicator function. For parameters $\theta' \in m_1$, this indicator function is identically equal to 1, so the expression in parentheses reduces to a constant, say $c$, allowing us to write the prior as
+
+$$ \label{eq:prior}
+p(\theta' \mid m_1) = c \cdot p(\theta' \mid m_e).
+$$
+
+By similar reasoning, we can write the posterior as
+
+$$ \label{eq:posterior}
+p(\theta' \mid y,m_1) = \Biggl(\frac{I_{\theta' \in m_1}}{\int p(\theta' \mid y,m_e) \cdot I_{\theta' \in m_1} \, d\theta'}\Biggr) \cdot p(\theta' \mid y,m_e) = d \cdot p(\theta' \mid y,m_e).
+$$
+
+This gives us
+
+$$ \label{eq:bayesfactor3}
+B_{1e} = \frac{c \cdot p(\theta' \mid m_e) \,/\, \bigl(d \cdot p(\theta' \mid y,m_e)\bigr)}{p(\theta' \mid m_e) \,/\, p(\theta' \mid y,m_e)} = \frac{c}{d} = \frac{1/d}{1/c},
+$$
+
+which completes the proof. Note that by definition, $1/d$ represents the proportion of the posterior distribution for $\theta$ under the [encompassing model](/D/em) $m_e$ that agrees with the constraints imposed by $m_1$. Similarly, $1/c$ represents the proportion of the prior distribution for $\theta$ under the [encompassing model](/D/em) $m_e$ that agrees with the constraints imposed by $m_1$.
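
The result $B_{1e} = (1/d)/(1/c)$ suggests a simple sampling recipe: draw from the encompassing prior and posterior, and take the ratio of the proportions of draws satisfying the constraint. Below is a minimal sketch under assumed conjugate choices (data, prior, and constraint are all hypothetical, not from the source): $y_i \sim \mathcal{N}(\theta, 1)$, encompassing prior $\theta \sim \mathcal{N}(0,1)$, and $m_1: \theta > 0$, so the posterior is available in closed form for an exact cross-check.

```python
import random
from statistics import NormalDist

random.seed(0)

# Hypothetical setup: y_i ~ Normal(theta, 1), encompassing prior
# theta ~ Normal(0, 1); the nested model m1 imposes theta > 0.
y = [0.8, 1.2, 0.3, 0.9, 1.1]  # simulated data
n, ybar = len(y), sum(y) / len(y)

# Conjugate posterior for theta under the encompassing model m_e:
post_mean = n * ybar / (n + 1)
post_sd = (1 / (n + 1)) ** 0.5

draws = 200_000
prior_draws = [random.gauss(0.0, 1.0) for _ in range(draws)]
post_draws = [random.gauss(post_mean, post_sd) for _ in range(draws)]

# 1/c: prior mass agreeing with the constraint; 1/d: posterior mass.
one_over_c = sum(t > 0 for t in prior_draws) / draws
one_over_d = sum(t > 0 for t in post_draws) / draws

bf_1e = one_over_d / one_over_c  # Bayes factor for m1 over m_e
print(bf_1e)

# Exact check via the normal CDF (possible here because of conjugacy):
bf_exact = (1 - NormalDist(post_mean, post_sd).cdf(0)) / 0.5
```

In practice the posterior draws would come from an MCMC sampler rather than a closed-form posterior; the ratio-of-proportions step is unchanged.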

P/bf-sddr.md

Lines changed: 18 additions & 18 deletions
@@ -28,46 +28,46 @@ username: "tomfaulkenberry"
 ---


-**Theorem:** Consider two competing [models](/D/gm) on data $y$ containing parameters $\delta$ and $\varphi$, namely $\mathcal{M}_0:\delta=\delta_0,\varphi$ and $\mathcal{M}_1:\delta,\varphi$. In this context, we say that $\delta$ is a parameter of interest, $\varphi$ is a nuisance parameter (i.e., common to both models), and $\mathcal{M}_0$ is a sharp point hypothesis nested within $\mathcal{M}_1$. Suppose further that the prior for the nuisance parameter $\varphi$ in $\mathcal{M}_0$ is equal to the prior for $\varphi$ in $\mathcal{M}_1$ after conditioning on the restriction -- that is, $p(\varphi\mid \mathcal{M}_0) = p(\varphi\mid \delta=\delta_0,\mathcal{M}_1)$. Then the [Bayes factor](/D/bf) for $\mathcal{M}_0$ over $\mathcal{M}_1$ can be computed as:
+**Theorem:** Consider two competing [models](/D/gm) on data $y$ containing parameters $\delta$ and $\varphi$, namely $m_0:\delta=\delta_0,\varphi$ and $m_1:\delta,\varphi$. In this context, we say that $\delta$ is a parameter of interest, $\varphi$ is a nuisance parameter (i.e., common to both models), and $m_0$ is a sharp point hypothesis nested within $m_1$. Suppose further that the prior for the nuisance parameter $\varphi$ in $m_0$ is equal to the prior for $\varphi$ in $m_1$ after conditioning on the restriction -- that is, $p(\varphi\mid m_0) = p(\varphi\mid \delta=\delta_0,m_1)$. Then the [Bayes factor](/D/bf) for $m_0$ over $m_1$ can be computed as:

 $$ \label{eq:sd}
-\text{BF}_{01} = \frac{p(\delta=\delta_0\mid y,\mathcal{M}_1)}{p(\delta=\delta_0\mid \mathcal{M}_1)}.
+\text{BF}_{01} = \frac{p(\delta=\delta_0\mid y,m_1)}{p(\delta=\delta_0\mid m_1)}.
 $$

 **Proof:**

-By [definition](/D/bf), the Bayes factor $\text{BF}_{01}$ is the ratio of marginal likelihoods of data $y$ over $\mathcal{M}_0$ and $\mathcal{M}_1$, respectively. That is,
+By [definition](/D/bf), the Bayes factor $\text{BF}_{01}$ is the ratio of the marginal likelihoods of data $y$ under $m_0$ and $m_1$, respectively. That is,

 $$ \label{eq:bf}
-\text{BF}_{01}=\frac{p(y \mid \mathcal{M}_0)}{p(y \mid \mathcal{M}_1)}.
+\text{BF}_{01}=\frac{p(y \mid m_0)}{p(y \mid m_1)}.
 $$

-The key idea in the proof is that we can use a "change of variables" technique to express $\text{BF}_{01}$ entirely in terms of the "encompassing" model $\mathcal{M}_1$. This proceeds by first unpacking the [marginal likelihood](/D/ml) for $\mathcal{M}_0$ over the nuisance parameter $\varphi$ and then using the fact that $\mathcal{M}_0$ is a sharp hypothesis nested within $\mathcal{M}_1$ to rewrite everything in terms of $\mathcal{H}_1$. Specifically,
+The key idea in the proof is that we can use a "change of variables" technique to express $\text{BF}_{01}$ entirely in terms of the "encompassing" model $m_1$. This proceeds by first unpacking the [marginal likelihood](/D/ml) for $m_0$ over the nuisance parameter $\varphi$ and then using the fact that $m_0$ is a sharp hypothesis nested within $m_1$ to rewrite everything in terms of $m_1$. Specifically,

 $$
-\begin{aligned}
-p(y \mid \mathcal{M}_0) &= \int p(y \mid \varphi,\mathcal{M}_0)p(\varphi\mid \mathcal{M}_0)d\varphi\\
-&= \int p(y \mid \varphi,\delta=\delta_0,\mathcal{M}_1)p(\varphi\mid \delta=\delta_0,\mathcal{M}_1)d\varphi\\
-&= p(y \mid \delta=\delta_0,\mathcal{M}_1).\\
-\end{aligned}
+\begin{split}
+p(y \mid m_0) &= \int p(y \mid \varphi,m_0) \, p(\varphi\mid m_0) \, d\varphi \\
+&= \int p(y \mid \varphi,\delta=\delta_0,m_1) \, p(\varphi\mid \delta=\delta_0,m_1) \, d\varphi \\
+&= p(y \mid \delta=\delta_0,m_1). \\
+\end{split}
 $$

 By [Bayes Theorem](/P/bayes-th), we can rewrite this last line as

 $$
-p(y \mid \delta=\delta_0,\mathcal{M}_1) = \frac{p(\delta=\delta_0\mid y,\mathcal{M}_1)p(y \mid \mathcal{M}_1)}{p(\delta=\delta_0\mid \mathcal{M}_1)}.
+p(y \mid \delta=\delta_0,m_1) = \frac{p(\delta=\delta_0\mid y,m_1) \, p(y \mid m_1)}{p(\delta=\delta_0\mid m_1)}.
 $$

 Thus we have

 $$
-\begin{aligned}
-\text{BF}_{01} &= \frac{p(y \mid \mathcal{M}_0)}{p(y \mid \mathcal{M}_1)}\\
-&= p(y \mid \mathcal{M}_0) \cdot \frac{1}{p(y \mid \mathcal{M}_1)}\\
-&= p(y \mid \delta=\delta_0,\mathcal{M}_1) \cdot \frac{1}{p(y \mid \mathcal{M}_1)}\\
-&= \frac{p(\delta=\delta_0\mid y,\mathcal{M}_1)p(y \mid \mathcal{M}_1)}{p(\delta=\delta_0\mid \mathcal{M}_1)} \cdot \frac{1}{p(y \mid \mathcal{M}_1)}\\
-&=\frac{p(\delta=\delta_0 \mid y,\mathcal{M}_1)}{p(\delta=\delta_0\mid \mathcal{M}_1)},
-\end{aligned}
+\begin{split}
+\text{BF}_{01} &= \frac{p(y \mid m_0)}{p(y \mid m_1)} \\
+&= p(y \mid m_0) \cdot \frac{1}{p(y \mid m_1)} \\
+&= p(y \mid \delta=\delta_0,m_1) \cdot \frac{1}{p(y \mid m_1)} \\
+&= \frac{p(\delta=\delta_0\mid y,m_1) \, p(y \mid m_1)}{p(\delta=\delta_0\mid m_1)} \cdot \frac{1}{p(y \mid m_1)} \\
+&= \frac{p(\delta=\delta_0 \mid y,m_1)}{p(\delta=\delta_0\mid m_1)},
+\end{split}
 $$

 which completes the proof of \eqref{eq:sd}.
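
The Savage-Dickey density ratio \eqref{eq:sd} can be exercised numerically. The sketch below uses an assumed conjugate example (data, prior, and the absence of a nuisance parameter are all hypothetical, not from the source): $y_i \sim \mathcal{N}(\delta, 1)$ with prior $\delta \sim \mathcal{N}(0,1)$ under $m_1$, and $m_0: \delta = \delta_0 = 0$. The density ratio is then cross-checked against the definitional ratio of marginal likelihoods, with $p(y \mid m_1)$ computed by brute-force numerical integration.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

# Hypothetical example: y_i ~ Normal(delta, 1) with no nuisance parameter,
# prior delta ~ Normal(0, 1) under m1, and m0 fixes delta = delta_0 = 0.
y = [0.4, -0.2, 0.7, 0.1]
n, ybar = len(y), sum(y) / len(y)
delta0 = 0.0

# Conjugate posterior for delta under m1: Normal(n*ybar/(n+1), 1/(n+1)).
prior = NormalDist(0.0, 1.0)
post = NormalDist(n * ybar / (n + 1), sqrt(1 / (n + 1)))

# Savage-Dickey: posterior density over prior density at delta_0.
bf_01 = post.pdf(delta0) / prior.pdf(delta0)

# Independent check: BF_01 as a ratio of marginal likelihoods, with
# p(y | m1) obtained by Riemann-sum integration over delta in [-8, 8].
def lik(delta):
    return exp(sum(-0.5 * (yi - delta) ** 2 for yi in y)) / (2 * pi) ** (n / 2)

step = 1e-3
ml_m1 = sum(lik(i * step) * prior.pdf(i * step) * step for i in range(-8000, 8001))
ml_m0 = lik(delta0)
print(bf_01, ml_m0 / ml_m1)  # the two estimates agree closely
```

The appeal of \eqref{eq:sd} is visible here: the density-ratio line needs only the posterior of $\delta$ under $m_1$, while the definitional route required an explicit integral over the model's parameter space.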
