
Commit c5674ab

Merge pull request #193 from JoramSoch/master
added 6 proofs
2 parents c6f4d43 + 7f161cf

7 files changed

Lines changed: 560 additions & 29 deletions


I/ToC.md

Lines changed: 35 additions & 29 deletions
@@ -368,6 +368,8 @@ title: "Table of Contents"
 &emsp;&ensp; 3.1.6. **[Mean](/P/cuni-mean)** <br>
 &emsp;&ensp; 3.1.7. **[Median](/P/cuni-med)** <br>
 &emsp;&ensp; 3.1.8. **[Mode](/P/cuni-mode)** <br>
+&emsp;&ensp; 3.1.9. **[Variance](/P/cuni-var)** <br>
+&emsp;&ensp; 3.1.10. **[Differential entropy](/P/cuni-dent)** <br>
 
 3.2. Normal distribution <br>
 &emsp;&ensp; 3.2.1. *[Definition](/D/norm)* <br>
@@ -470,17 +472,18 @@ title: "Table of Contents"
 
 4.1. Multivariate normal distribution <br>
 &emsp;&ensp; 4.1.1. *[Definition](/D/mvn)* <br>
-&emsp;&ensp; 4.1.2. **[Special case of matrix-normal distribution](/P/mvn-matn)** <br>
-&emsp;&ensp; 4.1.3. **[Probability density function](/P/mvn-pdf)** <br>
-&emsp;&ensp; 4.1.4. **[Mean](/P/mvn-mean)** <br>
-&emsp;&ensp; 4.1.5. **[Covariance](/P/mvn-cov)** <br>
-&emsp;&ensp; 4.1.6. **[Differential entropy](/P/mvn-dent)** <br>
-&emsp;&ensp; 4.1.7. **[Kullback-Leibler divergence](/P/mvn-kl)** <br>
-&emsp;&ensp; 4.1.8. **[Linear transformation](/P/mvn-ltt)** <br>
-&emsp;&ensp; 4.1.9. **[Marginal distributions](/P/mvn-marg)** <br>
-&emsp;&ensp; 4.1.10. **[Conditional distributions](/P/mvn-cond)** <br>
-&emsp;&ensp; 4.1.11. **[Conditions for independence](/P/mvn-ind)** <br>
-&emsp;&ensp; 4.1.12. **[Independence of products](/P/mvn-indprod)** <br>
+&emsp;&ensp; 4.1.2. **[Relationship to chi-squared distribution](/P/mvn-chi2)** <br>
+&emsp;&ensp; 4.1.3. **[Special case of matrix-normal distribution](/P/mvn-matn)** <br>
+&emsp;&ensp; 4.1.4. **[Probability density function](/P/mvn-pdf)** <br>
+&emsp;&ensp; 4.1.5. **[Mean](/P/mvn-mean)** <br>
+&emsp;&ensp; 4.1.6. **[Covariance](/P/mvn-cov)** <br>
+&emsp;&ensp; 4.1.7. **[Differential entropy](/P/mvn-dent)** <br>
+&emsp;&ensp; 4.1.8. **[Kullback-Leibler divergence](/P/mvn-kl)** <br>
+&emsp;&ensp; 4.1.9. **[Linear transformation](/P/mvn-ltt)** <br>
+&emsp;&ensp; 4.1.10. **[Marginal distributions](/P/mvn-marg)** <br>
+&emsp;&ensp; 4.1.11. **[Conditional distributions](/P/mvn-cond)** <br>
+&emsp;&ensp; 4.1.12. **[Conditions for independence](/P/mvn-ind)** <br>
+&emsp;&ensp; 4.1.13. **[Independence of products](/P/mvn-indprod)** <br>
 
 4.2. Multivariate t-distribution <br>
 &emsp;&ensp; 4.2.1. *[Definition](/D/mvt)* <br>
@@ -620,22 +623,24 @@ title: "Table of Contents"
 &emsp;&ensp; 1.5.10. *[Projection matrix](/D/pmat)* <br>
 &emsp;&ensp; 1.5.11. *[Residual-forming matrix](/D/rfmat)* <br>
 &emsp;&ensp; 1.5.12. **[Estimation, projection and residual-forming matrix](/P/mlr-mat)** <br>
-&emsp;&ensp; 1.5.13. **[Idempotence of projection and residual-forming matrix](/P/mlr-idem)** <br>
-&emsp;&ensp; 1.5.14. **[Independence of estimated parameters and residuals](/P/mlr-ind)** <br>
-&emsp;&ensp; 1.5.15. **[Distribution of estimated parameters, signal and residuals](/P/mlr-wlsdist)** <br>
-&emsp;&ensp; 1.5.16. **[Distribution of residual sum of squares](/P/mlr-rssdist)** <br>
-&emsp;&ensp; 1.5.17. **[Weighted least squares](/P/mlr-wls)** (1) <br>
-&emsp;&ensp; 1.5.18. **[Weighted least squares](/P/mlr-wls2)** (2) <br>
-&emsp;&ensp; 1.5.19. *[t-contrast](/D/tcon)* <br>
-&emsp;&ensp; 1.5.20. *[F-contrast](/D/fcon)* <br>
-&emsp;&ensp; 1.5.21. **[Contrast-based t-test](/P/mlr-t)** <br>
-&emsp;&ensp; 1.5.22. **[Contrast-based F-test](/P/mlr-f)** <br>
-&emsp;&ensp; 1.5.23. **[Maximum likelihood estimation](/P/mlr-mle)** <br>
-&emsp;&ensp; 1.5.24. **[Maximum log-likelihood](/P/mlr-mll)** <br>
-&emsp;&ensp; 1.5.25. **[Deviance function](/P/mlr-dev)** <br>
-&emsp;&ensp; 1.5.26. **[Akaike information criterion](/P/mlr-aic)** <br>
-&emsp;&ensp; 1.5.27. **[Bayesian information criterion](/P/mlr-bic)** <br>
-&emsp;&ensp; 1.5.28. **[Corrected Akaike information criterion](/P/mlr-aicc)** <br>
+&emsp;&ensp; 1.5.13. **[Symmetry of projection and residual-forming matrix](/P/mlr-symm)** <br>
+&emsp;&ensp; 1.5.14. **[Idempotence of projection and residual-forming matrix](/P/mlr-idem)** <br>
+&emsp;&ensp; 1.5.15. **[Independence of estimated parameters and residuals](/P/mlr-ind)** <br>
+&emsp;&ensp; 1.5.16. **[Distribution of OLS estimates, signal and residuals](/P/mlr-olsdist)** <br>
+&emsp;&ensp; 1.5.17. **[Distribution of WLS estimates, signal and residuals](/P/mlr-wlsdist)** <br>
+&emsp;&ensp; 1.5.18. **[Distribution of residual sum of squares](/P/mlr-rssdist)** <br>
+&emsp;&ensp; 1.5.19. **[Weighted least squares](/P/mlr-wls)** (1) <br>
+&emsp;&ensp; 1.5.20. **[Weighted least squares](/P/mlr-wls2)** (2) <br>
+&emsp;&ensp; 1.5.21. *[t-contrast](/D/tcon)* <br>
+&emsp;&ensp; 1.5.22. *[F-contrast](/D/fcon)* <br>
+&emsp;&ensp; 1.5.23. **[Contrast-based t-test](/P/mlr-t)** <br>
+&emsp;&ensp; 1.5.24. **[Contrast-based F-test](/P/mlr-f)** <br>
+&emsp;&ensp; 1.5.25. **[Maximum likelihood estimation](/P/mlr-mle)** <br>
+&emsp;&ensp; 1.5.26. **[Maximum log-likelihood](/P/mlr-mll)** <br>
+&emsp;&ensp; 1.5.27. **[Deviance function](/P/mlr-dev)** <br>
+&emsp;&ensp; 1.5.28. **[Akaike information criterion](/P/mlr-aic)** <br>
+&emsp;&ensp; 1.5.29. **[Bayesian information criterion](/P/mlr-bic)** <br>
+&emsp;&ensp; 1.5.30. **[Corrected Akaike information criterion](/P/mlr-aicc)** <br>
 
 1.6. Bayesian linear regression <br>
 &emsp;&ensp; 1.6.1. **[Conjugate prior distribution](/P/blr-prior)** <br>

@@ -738,8 +743,9 @@ title: "Table of Contents"
 
 1.1. Residual variance <br>
 &emsp;&ensp; 1.1.1. *[Definition](/D/resvar)* <br>
-&emsp;&ensp; 1.1.2. **[Maximum likelihood estimator is biased](/P/resvar-bias)** <br>
-&emsp;&ensp; 1.1.3. **[Construction of unbiased estimator](/P/resvar-unb)** <br>
+&emsp;&ensp; 1.1.2. **[Maximum likelihood estimator is biased (p = 1)](/P/resvar-bias)** <br>
+&emsp;&ensp; 1.1.3. **[Maximum likelihood estimator is biased (p > 1)](/P/resvar-biasp)** <br>
+&emsp;&ensp; 1.1.4. **[Construction of unbiased estimator](/P/resvar-unb)** <br>
 
 1.2. R-squared <br>
 &emsp;&ensp; 1.2.1. *[Definition](/D/rsq)* <br>

P/cuni-dent.md

Lines changed: 60 additions & 0 deletions
---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2022-12-20 18:21:00

title: "Differential entropy of the continuous uniform distribution"
chapter: "Probability Distributions"
section: "Univariate continuous distributions"
topic: "Continuous uniform distribution"
theorem: "Differential entropy"

sources:

proof_id: "P397"
shortcut: "cuni-dent"
username: "JoramSoch"
---


**Theorem:** Let $X$ be a [random variable](/D/rvar) following a [continuous uniform distribution](/D/cuni):

$$ \label{eq:cuni}
X \sim \mathcal{U}(a, b) \; .
$$

Then, the [differential entropy](/D/dent) of $X$ is

$$ \label{eq:cuni-dent}
\mathrm{h}(X) = \ln(b-a) \; .
$$


**Proof:** The [differential entropy](/D/dent) of a random variable is defined as

$$ \label{eq:dent}
\mathrm{h}(X) = - \int_{\mathcal{X}} p(x) \, \log_b p(x) \, \mathrm{d}x \; .
$$

To measure $\mathrm{h}(X)$ in nats, we set the logarithm base to $b = e$ (not to be confused with the upper end-point $b$ of the interval), such that

$$ \label{eq:dent-nats}
\mathrm{h}(X) = - \int_{\mathcal{X}} p(x) \, \ln p(x) \, \mathrm{d}x \; .
$$

With the [probability density function of the continuous uniform distribution](/P/cuni-pdf), the differential entropy of $X$ is:

$$ \label{eq:cuni-dent-qed}
\begin{split}
\mathrm{h}(X) &= - \int_a^b \frac{1}{b-a} \, \ln \left( \frac{1}{b-a} \right) \, \mathrm{d}x \\
&= \frac{1}{b-a} \cdot \int_a^b \ln(b-a) \, \mathrm{d}x \\
&= \frac{1}{b-a} \cdot \left[ x \cdot \ln(b-a) \right]_a^b \\
&= \frac{1}{b-a} \cdot \left[ b \cdot \ln(b-a) - a \cdot \ln(b-a) \right] \\
&= \frac{1}{b-a} (b-a) \ln(b-a) \\
&= \ln(b-a) \; .
\end{split}
$$
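Not part of the proof, but the closed form is easy to check numerically. The following Python sketch (the helper name is ours, not from the source) approximates the defining integral with a midpoint Riemann sum and should reproduce $\ln(b-a)$:

```python
import math

def uniform_differential_entropy(a, b, n=100000):
    """Approximate -integral of p(x)*ln(p(x)) over [a, b] for U(a, b)
    with a midpoint Riemann sum; should match ln(b - a)."""
    p = 1.0 / (b - a)            # constant density of U(a, b)
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx   # midpoint of the i-th subinterval
        total += -p * math.log(p) * dx
    return total

# h(X) for U(0, 5) should be ln(5) ~ 1.6094
print(uniform_differential_entropy(0.0, 5.0))
```

Note that differential entropy, unlike discrete entropy, can be negative: for $b - a < 1$, e.g. $\mathcal{U}(0, 0.5)$, the result is $\ln(0.5) < 0$.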

P/cuni-var.md

Lines changed: 60 additions & 0 deletions
---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2022-12-20 18:04:00

title: "Variance of the continuous uniform distribution"
chapter: "Probability Distributions"
section: "Univariate continuous distributions"
topic: "Continuous uniform distribution"
theorem: "Variance"

sources:

proof_id: "P396"
shortcut: "cuni-var"
username: "JoramSoch"
---


**Theorem:** Let $X$ be a [random variable](/D/rvar) following a [continuous uniform distribution](/D/cuni):

$$ \label{eq:cuni}
X \sim \mathcal{U}(a, b) \; .
$$

Then, the [variance](/D/var) of $X$ is

$$ \label{eq:cuni-var}
\mathrm{Var}(X) = \frac{1}{12} (b-a)^2 \; .
$$


**Proof:** The [variance](/D/var) is the probability-weighted average of the squared deviation from the [mean](/D/mean):

$$ \label{eq:var}
\mathrm{Var}(X) = \int_{\mathbb{R}} (x - \mathrm{E}(X))^2 \cdot f_\mathrm{X}(x) \, \mathrm{d}x \; .
$$

With the [expected value](/P/cuni-mean) and [probability density function](/P/cuni-pdf) of the continuous uniform distribution, this reads:

$$ \label{eq:cuni-var-qed}
\begin{split}
\mathrm{Var}(X) &= \int_a^b \left( x - \frac{1}{2} (a+b) \right)^2 \cdot \frac{1}{b-a} \, \mathrm{d}x \\
&= \frac{1}{b-a} \cdot \int_a^b \left( x - \frac{a+b}{2} \right)^2 \, \mathrm{d}x \\
&= \frac{1}{b-a} \cdot \left[ \frac{1}{3} \left( x - \frac{a+b}{2} \right)^3 \right]_a^b \\
&= \frac{1}{3(b-a)} \cdot \left[ \left( \frac{2x-(a+b)}{2} \right)^3 \right]_a^b \\
&= \frac{1}{3(b-a)} \cdot \left[ \frac{1}{8} ( 2x-a-b )^3 \right]_a^b \\
&= \frac{1}{24(b-a)} \cdot \left[ ( 2x-a-b )^3 \right]_a^b \\
&= \frac{1}{24(b-a)} \cdot \left[ ( 2b-a-b )^3 - ( 2a-a-b )^3 \right] \\
&= \frac{1}{24(b-a)} \cdot \left[ ( b-a )^3 - ( a-b )^3 \right] \\
&= \frac{1}{24(b-a)} \cdot \left[ ( b-a )^3 + (-1)^3 ( a-b )^3 \right] \\
&= \frac{1}{24(b-a)} \cdot \left[ ( b-a )^3 + ( b-a )^3 \right] \\
&= \frac{2}{24(b-a)} (b-a)^3 \\
&= \frac{1}{12} (b-a)^2 \; .
\end{split}
$$
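As a quick sanity check outside the proof, one can compare the closed form $(b-a)^2/12$ with the sample variance of simulated draws. A minimal stdlib-only sketch (function name and parameters are ours):

```python
import random
import statistics

def check_uniform_variance(a, b, n=200000, seed=1):
    """Return (sample variance of n draws from U(a, b),
    closed-form variance (b - a)^2 / 12) for comparison."""
    rng = random.Random(seed)             # seeded for reproducibility
    samples = [rng.uniform(a, b) for _ in range(n)]
    return statistics.pvariance(samples), (b - a) ** 2 / 12

sample_var, exact_var = check_uniform_variance(2.0, 8.0)
print(sample_var, exact_var)  # both close to (8-2)^2/12 = 3.0
```

With a couple hundred thousand draws, the two values should agree to about two decimal places.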

P/mlr-olsdist.md

Lines changed: 127 additions & 0 deletions
---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2022-12-23 16:36:00

title: "Distributions of estimated parameters, fitted signal and residuals in multiple linear regression upon ordinary least squares"
chapter: "Statistical Models"
section: "Univariate normal data"
topic: "Multiple linear regression"
theorem: "Distribution of OLS estimates, fitted signal and residuals"

sources:
- authors: "Koch, Karl-Rudolf"
  year: 2007
  title: "Linear Model"
  in: "Introduction to Bayesian Statistics"
  pages: "Springer, Berlin/Heidelberg, 2007, ch. 4, eqs. 4.2, 4.30"
  url: "https://www.springer.com/de/book/9783540727231"
  doi: "10.1007/978-3-540-72726-2"
- authors: "Penny, William"
  year: 2006
  title: "Multiple Regression"
  in: "Mathematics for Brain Imaging"
  pages: "ch. 1.5, pp. 39-41, eqs. 1.106-1.110"
  url: "https://ueapsylabs.co.uk/sites/wpenny/mbi/mbi_course.pdf"

proof_id: "P400"
shortcut: "mlr-olsdist"
username: "JoramSoch"
---


**Theorem:** Assume a [linear regression model](/D/mlr) with independent observations

$$ \label{eq:mlr}
y = X\beta + \varepsilon, \; \varepsilon_i \overset{\mathrm{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2)
$$

and consider estimation using [ordinary least squares](/P/mlr-ols). Then, the estimated parameters, fitted signal and residuals are distributed as

$$ \label{eq:mlr-dist}
\begin{split}
\hat{\beta} &\sim \mathcal{N}\left( \beta, \sigma^2 (X^\mathrm{T} X)^{-1} \right) \\
\hat{y} &\sim \mathcal{N}\left( X \beta, \sigma^2 P \right) \\
\hat{\varepsilon} &\sim \mathcal{N}\left( 0, \sigma^2 (I_n - P) \right)
\end{split}
$$

where $P$ is the [projection matrix](/D/pmat) for [ordinary least squares](/P/mlr-ols)

$$ \label{eq:mlr-pmat}
P = X (X^\mathrm{T} X)^{-1} X^\mathrm{T} \; .
$$


**Proof:** We will use the [linear transformation theorem for the multivariate normal distribution](/P/mvn-ltt):

$$ \label{eq:mvn-ltt}
x \sim \mathcal{N}(\mu, \Sigma) \quad \Rightarrow \quad y = Ax + b \sim \mathcal{N}(A\mu + b, A \Sigma A^\mathrm{T}) \; .
$$

The distributional assumption in \eqref{eq:mlr} [is equivalent to](/P/mvn-ind):

$$ \label{eq:mlr-vect}
y = X\beta + \varepsilon, \; \varepsilon \sim \mathcal{N}(0, \sigma^2 I_n) \; .
$$

Applying \eqref{eq:mvn-ltt} to \eqref{eq:mlr-vect}, the measured data are distributed as

$$ \label{eq:y-dist}
y \sim \mathcal{N}\left( X \beta, \sigma^2 I_n \right) \; .
$$

1) The [parameter estimates from ordinary least squares](/P/mlr-ols) are given by

$$ \label{eq:b-est}
\hat{\beta} = (X^\mathrm{T} X)^{-1} X^\mathrm{T} y
$$

and thus, by applying \eqref{eq:mvn-ltt} to \eqref{eq:b-est}, they are distributed as

$$ \label{eq:b-est-dist}
\begin{split}
\hat{\beta} &\sim \mathcal{N}\left( \left[ (X^\mathrm{T} X)^{-1} X^\mathrm{T} \right] X \beta, \, \sigma^2 \left[ (X^\mathrm{T} X)^{-1} X^\mathrm{T} \right] I_n \left[ X (X^\mathrm{T} X)^{-1} \right] \right) \\
&\sim \mathcal{N}\left( \beta, \, \sigma^2 (X^\mathrm{T} X)^{-1} \right) \; .
\end{split}
$$

2) The [fitted signal in multiple linear regression](/P/mlr-mat) is given by

$$ \label{eq:y-est}
\hat{y} = X \hat{\beta} = X (X^\mathrm{T} X)^{-1} X^\mathrm{T} y = P y
$$

and thus, by applying \eqref{eq:mvn-ltt} to \eqref{eq:y-est}, it is distributed as

$$ \label{eq:y-est-dist}
\begin{split}
\hat{y} &\sim \mathcal{N}\left( X \beta, \, \sigma^2 X (X^\mathrm{T} X)^{-1} X^\mathrm{T} \right) \\
&\sim \mathcal{N}\left( X \beta, \, \sigma^2 P \right) \; .
\end{split}
$$

3) The [residuals of the linear regression model](/P/mlr-mat) are given by

$$ \label{eq:e-est}
\hat{\varepsilon} = y - X \hat{\beta} = \left( I_n - X (X^\mathrm{T} X)^{-1} X^\mathrm{T} \right) y = \left( I_n - P \right) y
$$

and thus, by applying \eqref{eq:mvn-ltt} to \eqref{eq:e-est}, they are distributed as

$$ \label{eq:e-est-dist-s1}
\begin{split}
\hat{\varepsilon} &\sim \mathcal{N}\left( \left[ I_n - X (X^\mathrm{T} X)^{-1} X^\mathrm{T} \right] X \beta, \, \sigma^2 \left[ I_n - P \right] I_n \left[ I_n - P \right]^\mathrm{T} \right) \\
&\sim \mathcal{N}\left( X \beta - X \beta, \, \sigma^2 \left[ I_n - P \right] \left[ I_n - P \right]^\mathrm{T} \right) \; .
\end{split}
$$

Because the [residual-forming matrix](/D/rfmat) is [symmetric](/P/mlr-symm) and [idempotent](/P/mlr-idem), this becomes:

$$ \label{eq:e-est-dist-s2}
\hat{\varepsilon} \sim \mathcal{N}\left( 0, \sigma^2 (I_n - P) \right) \; .
$$
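The result for $\hat{\beta}$ can be checked by simulation. The sketch below (our own illustration, assuming NumPy is available; all names and parameter values are ours) repeatedly draws $y = X\beta + \varepsilon$ and compares the empirical mean and covariance of the OLS estimates with $\beta$ and $\sigma^2 (X^\mathrm{T} X)^{-1}$:

```python
import numpy as np

# Fixed design, true parameters and noise variance (arbitrary choices)
rng = np.random.default_rng(42)
n, p, sigma2 = 50, 3, 1.5
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([2.0, -1.0, 0.5])
XtX_inv = np.linalg.inv(X.T @ X)

# Simulate many data sets and collect the OLS estimates
B = np.empty((10000, p))
for s in range(B.shape[0]):
    y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    B[s] = XtX_inv @ X.T @ y          # beta_hat = (X'X)^{-1} X'y

emp_mean = B.mean(axis=0)             # should approach beta
emp_cov = np.cov(B, rowvar=False)     # should approach sigma^2 (X'X)^{-1}
print(emp_mean)
print(np.max(np.abs(emp_cov - sigma2 * XtX_inv)))
```

The same scheme extends to $\hat{y} = P y$ and $\hat{\varepsilon} = (I_n - P) y$ by storing those vectors instead of $\hat{\beta}$.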
