**Definition:** Let $m$ be a [generative model](/D/gm) with [likelihood function](/D/lf) $p(y \vert \theta, m)$ and [prior distribution](/D/prior) $p(\theta \vert m)$. Then,
* the [prior distribution](/D/prior) is called "conjugate", if it, when combined with the [likelihood function](/D/lf), leads to a [posterior distribution](/D/post) that belongs to the same family of [probability distributions](/D/dist);
* the prior distribution is called "non-conjugate", if this is not the case.
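The classic example of conjugacy is the Beta prior combined with a Binomial likelihood: the posterior is again a Beta distribution. The sketch below (my own illustration, not part of the source) verifies this numerically on a grid, using hypothetical values for the prior hyperparameters and the data.

```python
# Illustrative sketch: the Beta prior is conjugate to the Binomial likelihood.
# Prior Beta(a, b) + data (k successes in n trials) -> posterior Beta(a+k, b+n-k).

def beta_binomial_posterior(a, b, k, n):
    """Posterior hyperparameters of a Beta(a, b) prior after observing
    k successes in n Binomial trials."""
    return a + k, b + n - k

# Hypothetical example: prior Beta(2, 2), data 7 successes in 10 trials.
a_post, b_post = beta_binomial_posterior(2, 2, 7, 10)
print(a_post, b_post)  # -> 9 5

# Numerical check on a grid: prior x likelihood is proportional to the
# Beta(9, 5) density, i.e. the posterior stays in the Beta family.
grid = [i / 100 for i in range(1, 100)]
unnorm_post = [t**(2 - 1) * (1 - t)**(2 - 1) * t**7 * (1 - t)**3 for t in grid]
beta_dens = [t**(9 - 1) * (1 - t)**(5 - 1) for t in grid]
ratios = [p / q for p, q in zip(unnorm_post, beta_dens)]
assert max(ratios) - min(ratios) < 1e-9  # constant ratio: same family
```

A non-conjugate pairing, by contrast (e.g. a Beta prior on the rate of a Poisson likelihood), yields a posterior outside the prior's family, so no such closed-form update exists.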
**Definition:** Let $m$ be a [generative model](/D/gm) with [likelihood function](/D/lf) $p(y \vert \theta, m)$ and [prior distribution](/D/prior) $p(\theta \vert \lambda, m)$ using [prior hyperparameters](/D/prior) $\lambda$. Let $p(y \vert \lambda, m)$ be the [marginal likelihood](/D/ml) when [integrating the parameters out of the joint likelihood](/P/ml-jl). Then, the prior distribution is called an "Empirical Bayes prior", if it maximizes the logarithmized marginal likelihood:
$$ \label{eq:prior-eb}
\lambda_{\mathrm{EB}} = \operatorname*{arg\,max}_{\lambda} \log p(y \vert \lambda, m) \; .
$$
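As a numerical sketch of this maximization (my own toy model, not from the source): let $y_j \vert \theta_j \sim \mathcal{N}(\theta_j, \sigma^2)$ with prior $\theta_j \sim \mathcal{N}(0, \lambda)$, so that marginally $y_j \sim \mathcal{N}(0, \lambda + \sigma^2)$. A grid search over $\lambda$ then recovers the closed-form Empirical Bayes estimate $\hat{\lambda} = \overline{y^2} - \sigma^2$.

```python
import math

# Toy model (assumed for illustration): y_j | theta_j ~ N(theta_j, sigma2),
# theta_j ~ N(0, lam); integrating theta out gives y_j ~ N(0, lam + sigma2).

def log_marginal_likelihood(y, lam, sigma2):
    """Log marginal likelihood log p(y | lambda) for the toy model."""
    v = lam + sigma2  # marginal variance after integrating theta out
    return sum(-0.5 * math.log(2 * math.pi * v) - yj**2 / (2 * v) for yj in y)

def empirical_bayes(y, sigma2, grid):
    """Grid search for lambda_EB = argmax_lambda log p(y | lambda)."""
    return max(grid, key=lambda lam: log_marginal_likelihood(y, lam, sigma2))

y = [1.8, -2.1, 0.9, 2.5, -1.2]       # hypothetical data
sigma2 = 1.0                          # noise variance assumed known
grid = [0.01 * i for i in range(1, 1001)]  # lambda in (0, 10]

lam_eb = empirical_bayes(y, sigma2, grid)
# Closed form for this model: lambda_EB = mean(y^2) - sigma2 (when positive)
closed_form = sum(yj**2 for yj in y) / len(y) - sigma2
print(lam_eb, closed_form)
```

The grid maximizer agrees with the analytical solution up to the grid resolution, which is the essential point: the prior hyperparameters are estimated from the data themselves via the marginal likelihood.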
**Definition:** Let $m$ be a [generative model](/D/gm) with [likelihood function](/D/lf) $p(y \vert \theta, m)$ and [prior distribution](/D/prior) $p(\theta \vert \lambda, m)$ using [prior hyperparameters](/D/prior) $\lambda$. Then, the prior distribution is called a "maximum entropy prior", if
1) when $\theta$ is a [discrete random variable](/D/rvar-disc), it maximizes the [entropy](/D/ent) of the prior [probability mass function](/D/pmf):

$$ \label{eq:prior-me-disc}
\lambda_{\mathrm{ME}} = \operatorname*{arg\,max}_{\lambda} \mathrm{H}\left[ p(\theta \vert \lambda, m) \right] \; ;
$$

2) when $\theta$ is a [continuous random variable](/D/rvar-cont), it maximizes the [differential entropy](/D/dent) of the prior [probability density function](/D/pdf):

$$ \label{eq:prior-me-cont}
\lambda_{\mathrm{ME}} = \operatorname*{arg\,max}_{\lambda} \mathrm{h}\left[ p(\theta \vert \lambda, m) \right] \; .
$$
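For the discrete case, a minimal sketch (my own example, not part of the source): with a finite parameter space and no further constraints, the uniform PMF attains the maximum entropy $\log K$, so the maximum entropy prior is the uniform prior.

```python
import math

# Sketch: for a discrete parameter with K values and no extra constraints,
# the uniform PMF maximizes the entropy H[p] = -sum p_i log p_i.

def entropy(p):
    """Shannon entropy of a PMF given as a list of probabilities."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

uniform = [0.25, 0.25, 0.25, 0.25]   # K = 4 values
skewed = [0.70, 0.10, 0.10, 0.10]    # any non-uniform PMF has lower entropy

assert entropy(uniform) > entropy(skewed)
print(entropy(uniform), math.log(4))  # uniform attains the maximum, log K
```

Under additional constraints (e.g. a fixed mean), the maximizer changes; the principle is always to commit to no more structure than the constraints require.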
**Definition:** Let $m$ be a [generative model](/D/gm) with [likelihood function](/D/lf) $p(y \vert \theta, m)$ and [prior distribution](/D/prior) $p(\theta \vert \lambda, m)$ using [prior hyperparameters](/D/prior) $\lambda$. Let $p(\theta \vert y, \lambda, m)$ be the [posterior distribution](/D/post) that is [proportional to the joint likelihood](/P/post-jl). Then, the prior distribution is called a "reference prior", if it maximizes the [expected](/D/mean) [Kullback-Leibler divergence](/D/kl) of the posterior distribution relative to the prior distribution:
$$ \label{eq:prior-ref}
\lambda_{\mathrm{ref}} = \operatorname*{arg\,max}_{\lambda} \left\langle \mathrm{KL} \left[ p(\theta \vert y, \lambda, m) \, || \, p(\theta \vert \lambda, m) \right] \right\rangle_{p(y \vert \lambda, m)} \; ,
$$

where the expectation is taken over the [marginal distribution](/D/ml) of the data $y$.
**Definition:** Let $p(\theta \vert m)$ be a [prior distribution](/D/prior) for the parameter $\theta \in \Theta$ of a [generative model](/D/gm) $m$. Then,
* the distribution is called a "uniform prior", if its [density](/D/pdf) is constant over the entire parameter space $\Theta$;
* the distribution is called a "non-uniform prior", if its [density](/D/pdf) is not constant over the parameter space $\Theta$.
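A consequence worth making concrete (my own toy example, not part of the source): under a uniform prior, the posterior is simply the likelihood renormalized, since the constant prior density cancels in Bayes' theorem.

```python
# Toy example: Bernoulli rate theta on a grid, uniform prior.
# Under a uniform prior, posterior = likelihood / sum(likelihood).

theta = [i / 100 for i in range(1, 100)]      # parameter grid over (0, 1)
n, k = 10, 7                                  # hypothetical data: 7 of 10 successes
lik = [t**k * (1 - t)**(n - k) for t in theta]

prior = [1 / len(theta)] * len(theta)         # uniform prior: constant density
joint = [p * l for p, l in zip(prior, lik)]
post = [j / sum(joint) for j in joint]        # posterior via Bayes' theorem

norm_lik = [l / sum(lik) for l in lik]        # likelihood normalized directly
assert max(abs(a - b) for a, b in zip(post, norm_lik)) < 1e-12
print("posterior equals normalized likelihood under a uniform prior")
```

With a non-uniform prior, the prior density no longer cancels, and the posterior is pulled away from the likelihood toward regions of higher prior mass.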