
Commit c5674ab

Merge pull request #193 from JoramSoch/master
added 6 proofs
2 parents c6f4d43 + 7f161cf

7 files changed

Lines changed: 560 additions & 29 deletions


I/ToC.md

Lines changed: 35 additions & 29 deletions
@@ -368,6 +368,8 @@ title: "Table of Contents"
 &emsp;&ensp; 3.1.6. **[Mean](/P/cuni-mean)** <br>
 &emsp;&ensp; 3.1.7. **[Median](/P/cuni-med)** <br>
 &emsp;&ensp; 3.1.8. **[Mode](/P/cuni-mode)** <br>
+&emsp;&ensp; 3.1.9. **[Variance](/P/cuni-var)** <br>
+&emsp;&ensp; 3.1.10. **[Differential entropy](/P/cuni-dent)** <br>
 
 3.2. Normal distribution <br>
 &emsp;&ensp; 3.2.1. *[Definition](/D/norm)* <br>
@@ -470,17 +472,18 @@ title: "Table of Contents"
 
 4.1. Multivariate normal distribution <br>
 &emsp;&ensp; 4.1.1. *[Definition](/D/mvn)* <br>
-&emsp;&ensp; 4.1.2. **[Special case of matrix-normal distribution](/P/mvn-matn)** <br>
-&emsp;&ensp; 4.1.3. **[Probability density function](/P/mvn-pdf)** <br>
-&emsp;&ensp; 4.1.4. **[Mean](/P/mvn-mean)** <br>
-&emsp;&ensp; 4.1.5. **[Covariance](/P/mvn-cov)** <br>
-&emsp;&ensp; 4.1.6. **[Differential entropy](/P/mvn-dent)** <br>
-&emsp;&ensp; 4.1.7. **[Kullback-Leibler divergence](/P/mvn-kl)** <br>
-&emsp;&ensp; 4.1.8. **[Linear transformation](/P/mvn-ltt)** <br>
-&emsp;&ensp; 4.1.9. **[Marginal distributions](/P/mvn-marg)** <br>
-&emsp;&ensp; 4.1.10. **[Conditional distributions](/P/mvn-cond)** <br>
-&emsp;&ensp; 4.1.11. **[Conditions for independence](/P/mvn-ind)** <br>
-&emsp;&ensp; 4.1.12. **[Independence of products](/P/mvn-indprod)** <br>
+&emsp;&ensp; 4.1.2. **[Relationship to chi-squared distribution](/P/mvn-chi2)** <br>
+&emsp;&ensp; 4.1.3. **[Special case of matrix-normal distribution](/P/mvn-matn)** <br>
+&emsp;&ensp; 4.1.4. **[Probability density function](/P/mvn-pdf)** <br>
+&emsp;&ensp; 4.1.5. **[Mean](/P/mvn-mean)** <br>
+&emsp;&ensp; 4.1.6. **[Covariance](/P/mvn-cov)** <br>
+&emsp;&ensp; 4.1.7. **[Differential entropy](/P/mvn-dent)** <br>
+&emsp;&ensp; 4.1.8. **[Kullback-Leibler divergence](/P/mvn-kl)** <br>
+&emsp;&ensp; 4.1.9. **[Linear transformation](/P/mvn-ltt)** <br>
+&emsp;&ensp; 4.1.10. **[Marginal distributions](/P/mvn-marg)** <br>
+&emsp;&ensp; 4.1.11. **[Conditional distributions](/P/mvn-cond)** <br>
+&emsp;&ensp; 4.1.12. **[Conditions for independence](/P/mvn-ind)** <br>
+&emsp;&ensp; 4.1.13. **[Independence of products](/P/mvn-indprod)** <br>
 
 4.2. Multivariate t-distribution <br>
 &emsp;&ensp; 4.2.1. *[Definition](/D/mvt)* <br>
@@ -620,22 +623,24 @@ title: "Table of Contents"
 &emsp;&ensp; 1.5.10. *[Projection matrix](/D/pmat)* <br>
 &emsp;&ensp; 1.5.11. *[Residual-forming matrix](/D/rfmat)* <br>
 &emsp;&ensp; 1.5.12. **[Estimation, projection and residual-forming matrix](/P/mlr-mat)** <br>
-&emsp;&ensp; 1.5.13. **[Idempotence of projection and residual-forming matrix](/P/mlr-idem)** <br>
-&emsp;&ensp; 1.5.14. **[Independence of estimated parameters and residuals](/P/mlr-ind)** <br>
-&emsp;&ensp; 1.5.15. **[Distribution of estimated parameters, signal and residuals](/P/mlr-wlsdist)** <br>
-&emsp;&ensp; 1.5.16. **[Distribution of residual sum of squares](/P/mlr-rssdist)** <br>
-&emsp;&ensp; 1.5.17. **[Weighted least squares](/P/mlr-wls)** (1) <br>
-&emsp;&ensp; 1.5.18. **[Weighted least squares](/P/mlr-wls2)** (2) <br>
-&emsp;&ensp; 1.5.19. *[t-contrast](/D/tcon)* <br>
-&emsp;&ensp; 1.5.20. *[F-contrast](/D/fcon)* <br>
-&emsp;&ensp; 1.5.21. **[Contrast-based t-test](/P/mlr-t)** <br>
-&emsp;&ensp; 1.5.22. **[Contrast-based F-test](/P/mlr-f)** <br>
-&emsp;&ensp; 1.5.23. **[Maximum likelihood estimation](/P/mlr-mle)** <br>
-&emsp;&ensp; 1.5.24. **[Maximum log-likelihood](/P/mlr-mll)** <br>
-&emsp;&ensp; 1.5.25. **[Deviance function](/P/mlr-dev)** <br>
-&emsp;&ensp; 1.5.26. **[Akaike information criterion](/P/mlr-aic)** <br>
-&emsp;&ensp; 1.5.27. **[Bayesian information criterion](/P/mlr-bic)** <br>
-&emsp;&ensp; 1.5.28. **[Corrected Akaike information criterion](/P/mlr-aicc)** <br>
+&emsp;&ensp; 1.5.13. **[Symmetry of projection and residual-forming matrix](/P/mlr-symm)** <br>
+&emsp;&ensp; 1.5.14. **[Idempotence of projection and residual-forming matrix](/P/mlr-idem)** <br>
+&emsp;&ensp; 1.5.15. **[Independence of estimated parameters and residuals](/P/mlr-ind)** <br>
+&emsp;&ensp; 1.5.16. **[Distribution of OLS estimates, signal and residuals](/P/mlr-olsdist)** <br>
+&emsp;&ensp; 1.5.17. **[Distribution of WLS estimates, signal and residuals](/P/mlr-wlsdist)** <br>
+&emsp;&ensp; 1.5.18. **[Distribution of residual sum of squares](/P/mlr-rssdist)** <br>
+&emsp;&ensp; 1.5.19. **[Weighted least squares](/P/mlr-wls)** (1) <br>
+&emsp;&ensp; 1.5.20. **[Weighted least squares](/P/mlr-wls2)** (2) <br>
+&emsp;&ensp; 1.5.21. *[t-contrast](/D/tcon)* <br>
+&emsp;&ensp; 1.5.22. *[F-contrast](/D/fcon)* <br>
+&emsp;&ensp; 1.5.23. **[Contrast-based t-test](/P/mlr-t)** <br>
+&emsp;&ensp; 1.5.24. **[Contrast-based F-test](/P/mlr-f)** <br>
+&emsp;&ensp; 1.5.25. **[Maximum likelihood estimation](/P/mlr-mle)** <br>
+&emsp;&ensp; 1.5.26. **[Maximum log-likelihood](/P/mlr-mll)** <br>
+&emsp;&ensp; 1.5.27. **[Deviance function](/P/mlr-dev)** <br>
+&emsp;&ensp; 1.5.28. **[Akaike information criterion](/P/mlr-aic)** <br>
+&emsp;&ensp; 1.5.29. **[Bayesian information criterion](/P/mlr-bic)** <br>
+&emsp;&ensp; 1.5.30. **[Corrected Akaike information criterion](/P/mlr-aicc)** <br>
 
 1.6. Bayesian linear regression <br>
 &emsp;&ensp; 1.6.1. **[Conjugate prior distribution](/P/blr-prior)** <br>

@@ -738,8 +743,9 @@ title: "Table of Contents"
 
 1.1. Residual variance <br>
 &emsp;&ensp; 1.1.1. *[Definition](/D/resvar)* <br>
-&emsp;&ensp; 1.1.2. **[Maximum likelihood estimator is biased](/P/resvar-bias)** <br>
-&emsp;&ensp; 1.1.3. **[Construction of unbiased estimator](/P/resvar-unb)** <br>
+&emsp;&ensp; 1.1.2. **[Maximum likelihood estimator is biased (p = 1)](/P/resvar-bias)** <br>
+&emsp;&ensp; 1.1.3. **[Maximum likelihood estimator is biased (p > 1)](/P/resvar-biasp)** <br>
+&emsp;&ensp; 1.1.4. **[Construction of unbiased estimator](/P/resvar-unb)** <br>
 
 1.2. R-squared <br>
 &emsp;&ensp; 1.2.1. *[Definition](/D/rsq)* <br>

P/cuni-dent.md

Lines changed: 60 additions & 0 deletions
---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2022-12-20 18:21:00

title: "Differential entropy of the continuous uniform distribution"
chapter: "Probability Distributions"
section: "Univariate continuous distributions"
topic: "Continuous uniform distribution"
theorem: "Differential entropy"

sources:

proof_id: "P397"
shortcut: "cuni-dent"
username: "JoramSoch"
---


**Theorem:** Let $X$ be a [random variable](/D/rvar) following a [continuous uniform distribution](/D/cuni):

$$ \label{eq:cuni}
X \sim \mathcal{U}(a, b) \; .
$$

Then, the [differential entropy](/D/dent) of $X$ is

$$ \label{eq:cuni-dent}
\mathrm{h}(X) = \ln(b-a) \; .
$$


**Proof:** The [differential entropy](/D/dent) of a random variable is defined as

$$ \label{eq:dent}
\mathrm{h}(X) = - \int_{\mathcal{X}} p(x) \, \log_b p(x) \, \mathrm{d}x \; .
$$

To measure $\mathrm{h}(X)$ in nats, we set the logarithm base to $b = e$ (not to be confused with the upper end-point $b$ of the interval), such that

$$ \label{eq:dent-nats}
\mathrm{h}(X) = - \int_{\mathcal{X}} p(x) \, \ln p(x) \, \mathrm{d}x \; .
$$

With the [probability density function of the continuous uniform distribution](/P/cuni-pdf), the differential entropy of $X$ is:

$$ \label{eq:cuni-dent-qed}
\begin{split}
\mathrm{h}(X) &= - \int_a^b \frac{1}{b-a} \, \ln \left( \frac{1}{b-a} \right) \, \mathrm{d}x \\
&= \frac{1}{b-a} \cdot \int_a^b \ln(b-a) \, \mathrm{d}x \\
&= \frac{1}{b-a} \cdot \left[ x \cdot \ln(b-a) \right]_a^b \\
&= \frac{1}{b-a} \cdot \left[ b \cdot \ln(b-a) - a \cdot \ln(b-a) \right] \\
&= \frac{1}{b-a} (b-a) \ln(b-a) \\
&= \ln(b-a) \; .
\end{split}
$$
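Not part of the proof, but the closed form is easy to check numerically. The following Python sketch (the helper name is ours, not from the source) approximates the defining integral with a midpoint Riemann sum and should reproduce $\ln(b-a)$:

```python
import math

def uniform_differential_entropy(a, b, n=100000):
    """Approximate -integral of p(x)*ln(p(x)) over [a, b] for U(a, b)
    with a midpoint Riemann sum; should match ln(b - a)."""
    p = 1.0 / (b - a)            # constant density of U(a, b)
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx   # midpoint of the i-th subinterval
        total += -p * math.log(p) * dx
    return total

# h(X) for U(0, 5) should be ln(5) ~ 1.6094
print(uniform_differential_entropy(0.0, 5.0))
```

Note that differential entropy, unlike discrete entropy, can be negative: for $b - a < 1$, e.g. $\mathcal{U}(0, 0.5)$, the result is $\ln(0.5) < 0$.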

P/cuni-var.md

Lines changed: 60 additions & 0 deletions
---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2022-12-20 18:04:00

title: "Variance of the continuous uniform distribution"
chapter: "Probability Distributions"
section: "Univariate continuous distributions"
topic: "Continuous uniform distribution"
theorem: "Variance"

sources:

proof_id: "P396"
shortcut: "cuni-var"
username: "JoramSoch"
---


**Theorem:** Let $X$ be a [random variable](/D/rvar) following a [continuous uniform distribution](/D/cuni):

$$ \label{eq:cuni}
X \sim \mathcal{U}(a, b) \; .
$$

Then, the [variance](/D/var) of $X$ is

$$ \label{eq:cuni-var}
\mathrm{Var}(X) = \frac{1}{12} (b-a)^2 \; .
$$


**Proof:** The [variance](/D/var) is the probability-weighted average of the squared deviation from the [mean](/D/mean):

$$ \label{eq:var}
\mathrm{Var}(X) = \int_{\mathbb{R}} (x - \mathrm{E}(X))^2 \cdot f_\mathrm{X}(x) \, \mathrm{d}x \; .
$$

With the [expected value](/P/cuni-mean) and [probability density function](/P/cuni-pdf) of the continuous uniform distribution, this reads:

$$ \label{eq:cuni-var-qed}
\begin{split}
\mathrm{Var}(X) &= \int_a^b \left( x - \frac{1}{2} (a+b) \right)^2 \cdot \frac{1}{b-a} \, \mathrm{d}x \\
&= \frac{1}{b-a} \cdot \int_a^b \left( x - \frac{a+b}{2} \right)^2 \, \mathrm{d}x \\
&= \frac{1}{b-a} \cdot \left[ \frac{1}{3} \left( x - \frac{a+b}{2} \right)^3 \right]_a^b \\
&= \frac{1}{3(b-a)} \cdot \left[ \left( \frac{2x-(a+b)}{2} \right)^3 \right]_a^b \\
&= \frac{1}{3(b-a)} \cdot \left[ \frac{1}{8} ( 2x-a-b )^3 \right]_a^b \\
&= \frac{1}{24(b-a)} \cdot \left[ ( 2x-a-b )^3 \right]_a^b \\
&= \frac{1}{24(b-a)} \cdot \left[ ( 2b-a-b )^3 - ( 2a-a-b )^3 \right] \\
&= \frac{1}{24(b-a)} \cdot \left[ ( b-a )^3 - ( a-b )^3 \right] \\
&= \frac{1}{24(b-a)} \cdot \left[ ( b-a )^3 + (-1)^3 ( a-b )^3 \right] \\
&= \frac{1}{24(b-a)} \cdot \left[ ( b-a )^3 + ( b-a )^3 \right] \\
&= \frac{2}{24(b-a)} (b-a)^3 \\
&= \frac{1}{12} (b-a)^2 \; .
\end{split}
$$
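As a quick sanity check outside the proof, one can compare the closed form $(b-a)^2/12$ with the sample variance of simulated draws. A minimal stdlib-only sketch (function name and parameters are ours):

```python
import random
import statistics

def check_uniform_variance(a, b, n=200000, seed=1):
    """Return (sample variance of n draws from U(a, b),
    closed-form variance (b - a)^2 / 12) for comparison."""
    rng = random.Random(seed)             # seeded for reproducibility
    samples = [rng.uniform(a, b) for _ in range(n)]
    return statistics.pvariance(samples), (b - a) ** 2 / 12

sample_var, exact_var = check_uniform_variance(2.0, 8.0)
print(sample_var, exact_var)  # both close to (8-2)^2/12 = 3.0
```

With a couple hundred thousand draws, the two values should agree to about two decimal places.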

P/mlr-olsdist.md

Lines changed: 127 additions & 0 deletions
---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2022-12-23 16:36:00

title: "Distributions of estimated parameters, fitted signal and residuals in multiple linear regression upon ordinary least squares"
chapter: "Statistical Models"
section: "Univariate normal data"
topic: "Multiple linear regression"
theorem: "Distribution of OLS estimates, fitted signal and residuals"

sources:
- authors: "Koch, Karl-Rudolf"
  year: 2007
  title: "Linear Model"
  in: "Introduction to Bayesian Statistics"
  pages: "Springer, Berlin/Heidelberg, 2007, ch. 4, eqs. 4.2, 4.30"
  url: "https://www.springer.com/de/book/9783540727231"
  doi: "10.1007/978-3-540-72726-2"
- authors: "Penny, William"
  year: 2006
  title: "Multiple Regression"
  in: "Mathematics for Brain Imaging"
  pages: "ch. 1.5, pp. 39-41, eqs. 1.106-1.110"
  url: "https://ueapsylabs.co.uk/sites/wpenny/mbi/mbi_course.pdf"

proof_id: "P400"
shortcut: "mlr-olsdist"
username: "JoramSoch"
---


**Theorem:** Assume a [linear regression model](/D/mlr) with independent observations

$$ \label{eq:mlr}
y = X\beta + \varepsilon, \; \varepsilon_i \overset{\mathrm{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2)
$$

and consider estimation using [ordinary least squares](/P/mlr-ols). Then, the estimated parameters, fitted signal and residuals are distributed as

$$ \label{eq:mlr-dist}
\begin{split}
\hat{\beta} &\sim \mathcal{N}\left( \beta, \sigma^2 (X^\mathrm{T} X)^{-1} \right) \\
\hat{y} &\sim \mathcal{N}\left( X \beta, \sigma^2 P \right) \\
\hat{\varepsilon} &\sim \mathcal{N}\left( 0, \sigma^2 (I_n - P) \right)
\end{split}
$$

where $P$ is the [projection matrix](/D/pmat) for [ordinary least squares](/P/mlr-ols)

$$ \label{eq:mlr-pmat}
P = X (X^\mathrm{T} X)^{-1} X^\mathrm{T} \; .
$$


**Proof:** We will use the [linear transformation theorem for the multivariate normal distribution](/P/mvn-ltt):

$$ \label{eq:mvn-ltt}
x \sim \mathcal{N}(\mu, \Sigma) \quad \Rightarrow \quad y = Ax + b \sim \mathcal{N}(A\mu + b, A \Sigma A^\mathrm{T}) \; .
$$

The distributional assumption in \eqref{eq:mlr} [is equivalent to](/P/mvn-ind):

$$ \label{eq:mlr-vect}
y = X\beta + \varepsilon, \; \varepsilon \sim \mathcal{N}(0, \sigma^2 I_n) \; .
$$

Applying \eqref{eq:mvn-ltt} to \eqref{eq:mlr-vect}, the measured data are distributed as

$$ \label{eq:y-dist}
y \sim \mathcal{N}\left( X \beta, \sigma^2 I_n \right) \; .
$$

1) The [parameter estimates from ordinary least squares](/P/mlr-ols) are given by

$$ \label{eq:b-est}
\hat{\beta} = (X^\mathrm{T} X)^{-1} X^\mathrm{T} y
$$

and thus, by applying \eqref{eq:mvn-ltt} to \eqref{eq:b-est}, they are distributed as

$$ \label{eq:b-est-dist}
\begin{split}
\hat{\beta} &\sim \mathcal{N}\left( \left[ (X^\mathrm{T} X)^{-1} X^\mathrm{T} \right] X \beta, \, \sigma^2 \left[ (X^\mathrm{T} X)^{-1} X^\mathrm{T} \right] I_n \left[ X (X^\mathrm{T} X)^{-1} \right] \right) \\
&\sim \mathcal{N}\left( \beta, \, \sigma^2 (X^\mathrm{T} X)^{-1} \right) \; .
\end{split}
$$

2) The [fitted signal in multiple linear regression](/P/mlr-mat) is given by

$$ \label{eq:y-est}
\hat{y} = X \hat{\beta} = X (X^\mathrm{T} X)^{-1} X^\mathrm{T} y = P y
$$

and thus, by applying \eqref{eq:mvn-ltt} to \eqref{eq:y-est}, it is distributed as

$$ \label{eq:y-est-dist}
\begin{split}
\hat{y} &\sim \mathcal{N}\left( X \beta, \, \sigma^2 X (X^\mathrm{T} X)^{-1} X^\mathrm{T} \right) \\
&\sim \mathcal{N}\left( X \beta, \, \sigma^2 P \right) \; .
\end{split}
$$

3) The [residuals of the linear regression model](/P/mlr-mat) are given by

$$ \label{eq:e-est}
\hat{\varepsilon} = y - X \hat{\beta} = \left( I_n - X (X^\mathrm{T} X)^{-1} X^\mathrm{T} \right) y = \left( I_n - P \right) y
$$

and thus, by applying \eqref{eq:mvn-ltt} to \eqref{eq:e-est}, they are distributed as

$$ \label{eq:e-est-dist-s1}
\begin{split}
\hat{\varepsilon} &\sim \mathcal{N}\left( \left[ I_n - X (X^\mathrm{T} X)^{-1} X^\mathrm{T} \right] X \beta, \, \sigma^2 \left[ I_n - P \right] I_n \left[ I_n - P \right]^\mathrm{T} \right) \\
&\sim \mathcal{N}\left( X \beta - X \beta, \, \sigma^2 \left[ I_n - P \right] \left[ I_n - P \right]^\mathrm{T} \right) \; .
\end{split}
$$

Because the [residual-forming matrix](/D/rfmat) is [symmetric](/P/mlr-symm) and [idempotent](/P/mlr-idem), this becomes:

$$ \label{eq:e-est-dist-s2}
\hat{\varepsilon} \sim \mathcal{N}\left( 0, \sigma^2 (I_n - P) \right) \; .
$$
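The result for $\hat{\beta}$ can be checked by simulation. The sketch below (our own illustration, assuming NumPy is available; all names and parameter values are ours) repeatedly draws $y = X\beta + \varepsilon$ and compares the empirical mean and covariance of the OLS estimates with $\beta$ and $\sigma^2 (X^\mathrm{T} X)^{-1}$:

```python
import numpy as np

# Fixed design, true parameters and noise variance (arbitrary choices)
rng = np.random.default_rng(42)
n, p, sigma2 = 50, 3, 1.5
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([2.0, -1.0, 0.5])
XtX_inv = np.linalg.inv(X.T @ X)

# Simulate many data sets and collect the OLS estimates
B = np.empty((10000, p))
for s in range(B.shape[0]):
    y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    B[s] = XtX_inv @ X.T @ y          # beta_hat = (X'X)^{-1} X'y

emp_mean = B.mean(axis=0)             # should approach beta
emp_cov = np.cov(B, rowvar=False)     # should approach sigma^2 (X'X)^{-1}
print(emp_mean)
print(np.max(np.abs(emp_cov - sigma2 * XtX_inv)))
```

The same scheme extends to $\hat{y} = P y$ and $\hat{\varepsilon} = (I_n - P) y$ by storing those vectors instead of $\hat{\beta}$.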
