Commit 4836ff0

added proofs "bvn-mi", "mvn-mi"
1 parent 5289ab0 commit 4836ff0

2 files changed

Lines changed: 196 additions & 0 deletions

P/bvn-mi.md

Lines changed: 109 additions & 0 deletions
@@ -0,0 +1,109 @@
---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2024-11-01 11:51:06

title: "Mutual information of the bivariate normal distribution"
chapter: "Probability Distributions"
section: "Multivariate continuous distributions"
topic: "Bivariate normal distribution"
theorem: "Mutual information"

sources:
  - authors: "Krafft, Peter"
    year: 2013
    title: "Correlation and Mutual Information"
    in: "Princeton University Department of Computer Science: Laboratory for Intelligent Probabilistic Systems"
    pages: "February 13, 2013"
    url: "https://lips.cs.princeton.edu/correlation-and-mutual-information/"

proof_id: "P476"
shortcut: "bvn-mi"
username: "JoramSoch"
---


**Theorem:** Let $X$ and $Y$ follow a [bivariate normal distribution](/D/bvn):

$$ \label{eq:bvn}
\left[ \begin{matrix} X \\ Y \end{matrix} \right] \sim
\mathcal{N}\left( \left[ \begin{matrix} \mu_1 \\ \mu_2 \end{matrix} \right], \left[ \begin{matrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{matrix} \right] \right) \; .
$$

Then, the [mutual information](/D/mi) of $X$ and $Y$ is

$$ \label{eq:bvn-lincomb}
\mathrm{I}(X,Y) = -\frac{1}{2} \ln (1-\rho^2)
$$

where $\rho = \sigma_{12} / (\sigma_1 \sigma_2)$ is the [correlation](/D/corr) of $X$ and $Y$.


**Proof:** [Mutual information can be written in terms of marginal and joint differential entropy](/P/cmi-mjde):

$$ \label{eq:cmi-mjde}
\mathrm{I}(X,Y) = \mathrm{h}(X) + \mathrm{h}(Y) - \mathrm{h}(X,Y) \; .
$$

The marginal distributions of the multivariate normal distribution are also multivariate normal:

$$ \label{eq:mvn-marg}
\left[ \begin{matrix} X_1 \\ X_2 \end{matrix} \right] \sim
\mathcal{N}\left( \left[ \begin{matrix} \mu_1 \\ \mu_2 \end{matrix} \right], \left[ \begin{matrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{matrix} \right] \right)
\quad \Rightarrow \quad
X_1 \sim \mathcal{N}\left( \mu_1, \Sigma_{11} \right) \; ,
$$

such that the [marginals](/D/marg) of the [bivariate normal distribution](/D/bvn) are [univariate normal distributions](/D/norm):

$$ \label{eq:bvn-marg}
\left[ \begin{matrix} X \\ Y \end{matrix} \right] \sim
\mathcal{N}\left( \left[ \begin{matrix} \mu_1 \\ \mu_2 \end{matrix} \right], \left[ \begin{matrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{matrix} \right] \right)
\quad \Rightarrow \quad
X \sim \mathcal{N}\left( \mu_1, \sigma_1^2 \right)
\quad \text{and} \quad
Y \sim \mathcal{N}\left( \mu_2, \sigma_2^2 \right) \; .
$$

The [differential entropy of the univariate normal distribution](/P/norm-dent) is

$$ \label{eq:norm-dent}
\mathrm{h}(X) = \frac{1}{2} \ln\left( 2 \pi \sigma^2 e \right)
$$

and the [differential entropy of the multivariate normal distribution](/P/mvn-dent) is

$$ \label{eq:mvn-dent}
\mathrm{h}(x) = \frac{n}{2} \ln(2\pi) + \frac{1}{2} \ln|\Sigma| + \frac{1}{2} n
$$

where $\lvert \Sigma \rvert$ is the determinant of the [covariance matrix](/D/covmat) $\Sigma$. A two-dimensional [covariance matrix can be rewritten in terms of correlations](/P/covmat-corrmat) as follows:

$$ \label{eq:Sigma}
\begin{split}
\Sigma
&= \left[ \begin{matrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{matrix} \right] \left[ \begin{matrix} 1 & \rho \\ \rho & 1 \end{matrix} \right] \left[ \begin{matrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{matrix} \right] \\
&= \left[ \begin{matrix} \sigma_1^2 & \rho \, \sigma_1 \sigma_2 \\ \rho \, \sigma_1 \sigma_2 & \sigma_2^2 \end{matrix} \right] \; .
\end{split}
$$

Combining \eqref{eq:cmi-mjde} with \eqref{eq:norm-dent} and \eqref{eq:mvn-dent}, applying $n = 2$, we get:

$$ \label{eq:bvn-mi}
\begin{split}
\mathrm{I}(X,Y)
&\overset{\eqref{eq:cmi-mjde}}{=} \mathrm{h}(X) + \mathrm{h}(Y) - \mathrm{h}(X,Y) \\
&\overset{\eqref{eq:bvn-marg}}{=} \mathrm{h}\left[ \mathcal{N}\left( \mu_1, \sigma_1^2 \right) \right] + \mathrm{h}\left[ \mathcal{N}\left( \mu_2, \sigma_2^2 \right) \right] - \mathrm{h}\left[ \mathcal{N}\left( \mu, \Sigma \right) \right] \\
&\overset{\eqref{eq:Sigma}}{=} \left[ \frac{1}{2} \ln\left( 2 \pi \sigma_1^2 e \right) \right] + \left[ \frac{1}{2} \ln\left( 2 \pi \sigma_2^2 e \right) \right] - \left[ \frac{2}{2} \ln(2\pi) + \frac{1}{2} \ln \left| \left[ \begin{matrix} \sigma_1^2 & \rho \, \sigma_1 \sigma_2 \\ \rho \, \sigma_1 \sigma_2 & \sigma_2^2 \end{matrix} \right] \right| + \frac{1}{2} \cdot 2 \right] \\
&= \left( \frac{2}{2} \ln(2\pi) + \frac{2}{2} \ln(e) - \ln(2\pi) - 1 \right) + \left( \frac{1}{2} \ln\left( \sigma_1^2 \right) + \frac{1}{2} \ln\left( \sigma_2^2 \right) - \frac{1}{2} \ln \left| \left[ \begin{matrix} \sigma_1^2 & \rho \, \sigma_1 \sigma_2 \\ \rho \, \sigma_1 \sigma_2 & \sigma_2^2 \end{matrix} \right] \right| \right) \\
&= \frac{1}{2} \left[ \ln\left( \sigma_1^2 \right) + \ln\left( \sigma_2^2 \right) - \ln\left( \sigma_1^2 \sigma_2^2 - (\rho \, \sigma_1 \sigma_2)^2 \right) \right] \\
&= \frac{1}{2} \ln \left[ \frac{\sigma_1^2 \sigma_2^2}{\sigma_1^2 \sigma_2^2 - (\rho \, \sigma_1 \sigma_2)^2} \right] \\
&= \frac{1}{2} \ln \left[ \frac{\sigma_1^2 \sigma_2^2}{\sigma_1^2 \sigma_2^2 (1-\rho^2)} \right] \\
&= \frac{1}{2} \ln \left[ \frac{1}{1-\rho^2} \right] \\
&= -\frac{1}{2} \ln (1-\rho^2) \; .
\end{split}
$$
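
As a quick numerical sanity check of the bivariate result (not part of the commit), the following sketch evaluates both sides of the identity in plain Python. The function names `bvn_mi_closed_form` and `bvn_mi_from_entropies` and the parameter values are illustrative assumptions, and the code assumes $\lvert \rho \rvert < 1$ so that the determinant is positive.

```python
import math


def bvn_mi_closed_form(rho):
    """Closed form from the theorem: I(X,Y) = -1/2 ln(1 - rho^2)."""
    return -0.5 * math.log(1.0 - rho ** 2)


def bvn_mi_from_entropies(sigma1, sigma2, rho):
    """I(X,Y) = h(X) + h(Y) - h(X,Y), using the normal entropy formulas."""
    # Marginal entropies of the univariate normal distribution
    h_x = 0.5 * math.log(2.0 * math.pi * math.e * sigma1 ** 2)
    h_y = 0.5 * math.log(2.0 * math.pi * math.e * sigma2 ** 2)
    # Determinant of the 2x2 covariance matrix written via the correlation
    det_sigma = sigma1 ** 2 * sigma2 ** 2 - (rho * sigma1 * sigma2) ** 2
    # Joint entropy of the multivariate normal distribution with n = 2
    h_xy = math.log(2.0 * math.pi) + 0.5 * math.log(det_sigma) + 1.0
    return h_x + h_y - h_xy


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration
    for rho in (0.0, 0.3, -0.8):
        lhs = bvn_mi_from_entropies(1.5, 0.7, rho)
        rhs = bvn_mi_closed_form(rho)
        print(f"rho={rho:+.1f}  entropies={lhs:.12f}  closed form={rhs:.12f}")
```

Both columns should agree up to floating-point rounding, and the value should not depend on the choice of $\sigma_1$ and $\sigma_2$.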

P/mvn-mi.md

Lines changed: 87 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,87 @@
---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2024-11-01 12:36:44

title: "Mutual information of the multivariate normal distribution"
chapter: "Probability Distributions"
section: "Multivariate continuous distributions"
topic: "Multivariate normal distribution"
theorem: "Mutual information"

sources:
  - authors: "a06e"
    year: 2019
    title: "Mutual information between subsets of variables in the multivariate normal distribution"
    in: "StackExchange CrossValidated"
    pages: "retrieved on 2024-11-01"
    url: "https://stats.stackexchange.com/a/438613/270304"

proof_id: "P477"
shortcut: "mvn-mi"
username: "JoramSoch"
---


**Theorem:** Let $X \in \mathbb{R}^n$ and $Y \in \mathbb{R}^m$ be [random vectors](/D/rvec) that are [jointly multivariate normal](/D/mvn):

$$ \label{eq:bvn}
\left[ \begin{matrix} X \\ Y \end{matrix} \right] \sim
\mathcal{N}\left( \left[ \begin{matrix} \mu_1 \\ \mu_2 \end{matrix} \right], \left[ \begin{matrix} \Sigma_1 & \Sigma_{12} \\ \Sigma_{21} & \Sigma_2 \end{matrix} \right] \right) \; .
$$

Then, the [mutual information](/D/mi) of $X$ and $Y$ is

$$ \label{eq:bvn-lincomb}
\mathrm{I}(X,Y) = \frac{1}{2} \ln \left[ \frac{|\Sigma_1| |\Sigma_2|}{|\Sigma|} \right]
$$

where $\mu \in \mathbb{R}^p$ and $\Sigma \in \mathbb{R}^{p \times p}$ are the [mean](/D/mean) and [covariance matrix](/D/covmat) of the [random vector](/D/rvec) $\left[ \begin{matrix} X \\\\ Y \end{matrix} \right] \in \mathbb{R}^p$, respectively, where $p = n + m$.


**Proof:** [Mutual information can be written in terms of marginal and joint differential entropy](/P/cmi-mjde):

$$ \label{eq:cmi-mjde}
\mathrm{I}(X,Y) = \mathrm{h}(X) + \mathrm{h}(Y) - \mathrm{h}(X,Y) \; .
$$

The marginal distributions of the multivariate normal distribution are also multivariate normal:

$$ \label{eq:mvn-marg}
\left[ \begin{matrix} X_1 \\ X_2 \end{matrix} \right] \sim
\mathcal{N}\left( \left[ \begin{matrix} \mu_1 \\ \mu_2 \end{matrix} \right], \left[ \begin{matrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{matrix} \right] \right)
\quad \Rightarrow \quad
X_1 \sim \mathcal{N}\left( \mu_1, \Sigma_{11} \right) \; ,
$$

such that the [marginals](/D/marg) of $X$ and $Y$ are:

$$ \label{eq:X-Y-marg}
X \sim \mathcal{N}\left( \mu_1, \Sigma_1 \right)
\quad \text{and} \quad
Y \sim \mathcal{N}\left( \mu_2, \Sigma_2 \right) \; .
$$

The [differential entropy of the multivariate normal distribution](/P/mvn-dent) is

$$ \label{eq:mvn-dent}
\mathrm{h}(x) = \frac{n}{2} \ln(2\pi) + \frac{1}{2} \ln|\Sigma| + \frac{1}{2} n
$$

where $\lvert \Sigma \rvert$ is the determinant of $\Sigma$ and $n$ stands for the dimension of $x$, which is $n$ for $X$, $m$ for $Y$ and $p = n + m$ for the joint vector. Combining \eqref{eq:cmi-mjde} with \eqref{eq:X-Y-marg} and \eqref{eq:mvn-dent}, we get:

$$ \label{eq:bvn-mi}
\begin{split}
\mathrm{I}(X,Y)
&\overset{\eqref{eq:cmi-mjde}}{=} \mathrm{h}(X) + \mathrm{h}(Y) - \mathrm{h}(X,Y) \\
&\overset{\eqref{eq:X-Y-marg}}{=} \mathrm{h}\left[ \mathcal{N}\left( \mu_1, \Sigma_1 \right) \right] + \mathrm{h}\left[ \mathcal{N}\left( \mu_2, \Sigma_2 \right) \right] - \mathrm{h}\left[ \mathcal{N}\left( \mu, \Sigma \right) \right] \\
&\overset{\eqref{eq:mvn-dent}}{=} \left[ \frac{n}{2} \ln(2\pi) + \frac{1}{2} \ln|\Sigma_1| + \frac{1}{2} n \right] + \left[ \frac{m}{2} \ln(2\pi) + \frac{1}{2} \ln|\Sigma_2| + \frac{1}{2} m \right] - \left[ \frac{p}{2} \ln(2\pi) + \frac{1}{2} \ln|\Sigma| + \frac{1}{2} p \right] \\
&= \left( \frac{n+m-p}{2} \ln(2\pi) + \frac{1}{2}(n+m-p) \right) + \left( \frac{1}{2} \ln|\Sigma_1| + \frac{1}{2} \ln|\Sigma_2| - \frac{1}{2} \ln|\Sigma| \right) \\
&= \frac{1}{2} \left( \ln|\Sigma_1| + \ln|\Sigma_2| - \ln|\Sigma| \right) \\
&= \frac{1}{2} \ln \left[ \frac{|\Sigma_1| |\Sigma_2|}{|\Sigma|} \right] \; .
\end{split}
$$
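
The multivariate formula can be checked numerically in the same spirit (again not part of the commit). The sketch below uses only the standard library and computes log-determinants through a hand-written Cholesky factorization. The helper names `log_det_spd`, `split_cov`, `mvn_entropy` and `mvn_mi` and the example covariance matrix are assumptions made for illustration, and the code assumes the covariance matrix is symmetric positive definite.

```python
import math


def log_det_spd(a):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    k = len(a)
    low = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = sum(low[i][c] * low[j][c] for c in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(d)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    # |A| = prod(diag(L))^2, hence ln|A| = 2 * sum(ln(diag(L)))
    return 2.0 * sum(math.log(low[i][i]) for i in range(k))


def split_cov(cov, n):
    """Return the diagonal blocks Sigma_1 (n x n) and Sigma_2 (m x m)."""
    cov_x = [row[:n] for row in cov[:n]]
    cov_y = [row[n:] for row in cov[n:]]
    return cov_x, cov_y


def mvn_entropy(cov):
    """Differential entropy of a multivariate normal with covariance cov."""
    k = len(cov)
    return 0.5 * k * math.log(2.0 * math.pi) + 0.5 * log_det_spd(cov) + 0.5 * k


def mvn_mi(cov, n):
    """I(X,Y) = 1/2 ln(|Sigma_1| |Sigma_2| / |Sigma|), X = first n coordinates."""
    cov_x, cov_y = split_cov(cov, n)
    return 0.5 * (log_det_spd(cov_x) + log_det_spd(cov_y) - log_det_spd(cov))


if __name__ == "__main__":
    # Hypothetical 3x3 covariance matrix with n = 1 and m = 2
    cov = [[2.0, 0.5, 0.3],
           [0.5, 1.0, 0.2],
           [0.3, 0.2, 1.5]]
    cov_x, cov_y = split_cov(cov, 1)
    via_entropies = mvn_entropy(cov_x) + mvn_entropy(cov_y) - mvn_entropy(cov)
    print(via_entropies, mvn_mi(cov, 1))
```

For $n = m = 1$, the determinants reduce to $\lvert \Sigma_1 \rvert = \sigma_1^2$, $\lvert \Sigma_2 \rvert = \sigma_2^2$ and $\lvert \Sigma \rvert = \sigma_1^2 \sigma_2^2 (1-\rho^2)$, so `mvn_mi` should reproduce the bivariate value $-\frac{1}{2} \ln (1-\rho^2)$.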
