Commit b5014be

Merge pull request #70 from StatProofBook/master: update to master

2 parents 73bf2e5 + 805202b

9 files changed: 30 additions & 12 deletions

Lines changed: 6 additions & 1 deletion
@@ -4,7 +4,7 @@ title: "Proof by Author"
 ---
 
 
-### JoramSoch (270 proofs)
+### JoramSoch (275 proofs)
 
 - [(Non-)Multiplicativity of the expected value](/P/mean-mult)
 - [Accuracy and complexity for the univariate Gaussian](/P/ug-anc)
@@ -60,6 +60,7 @@ title: "Proof by Author"
 - [Differential entropy of the multivariate normal distribution](/P/mvn-dent)
 - [Differential entropy of the normal distribution](/P/norm-dent)
 - [Differential entropy of the normal-gamma distribution](/P/ng-dent)
+- [Distribution of parameter estimates for simple linear regression](/P/slr-olsdist)
 - [Distribution of the inverse general linear model](/P/iglm-dist)
 - [Distribution of the transformed general linear model](/P/tglm-dist)
 - [Distributional transformation using cumulative distribution function](/P/cdf-dt)
@@ -223,6 +224,7 @@ title: "Proof by Author"
 - [Probability under mutual exclusivity](/P/prob-exc)
 - [Probability under statistical independence](/P/prob-ind)
 - [Projection matrix and residual-forming matrix are idempotent](/P/mlr-idem)
+- [Projection of a data point to the regression line](/P/slr-proj)
 - [Quantile function is inverse of strictly monotonically increasing cumulative distribution function](/P/qf-cdf)
 - [Quantile function of the continuous uniform distribution](/P/cuni-qf)
 - [Quantile function of the discrete uniform distribution](/P/duni-qf)
@@ -258,10 +260,13 @@ title: "Proof by Author"
 - [Relationship between signal-to-noise ratio and R²](/P/snr-rsq)
 - [Scaling of the variance upon multiplication with a constant](/P/var-scal)
 - [Second central moment is variance](/P/momcent-2nd)
+- [Simple linear regression is a special case of multiple linear regression](/P/slr-mlr)
+- [Sums of squares for simple linear regression](/P/slr-sss)
 - [The regression line goes through the center of mass point](/P/slr-comp)
 - [The residuals and the covariate are uncorrelated in simple linear regression](/P/slr-rescorr)
 - [The sum of residuals is zero in simple linear regression](/P/slr-ressum)
 - [Transformation matrices for ordinary least squares](/P/mlr-mat)
+- [Transformation matrices for simple linear regression](/P/slr-mat)
 - [Transposition of a matrix-normal random variable](/P/matn-trans)
 - [Two-sample t-test for independent observations](/P/ug-ttest2)
 - [Two-sample z-test for independent observations](/P/ugkv-ztest2)

I/PbN.md

Lines changed: 5 additions & 0 deletions
@@ -286,3 +286,8 @@ title: "Proof by Number"
 | P278 | slr-resvar | [Relationship between residual variance and sample variance in simple linear regression](/P/slr-resvar) | JoramSoch | 2021-10-27 |
 | P279 | slr-corr | [Relationship between correlation coefficient and slope estimate in simple linear regression](/P/slr-corr) | JoramSoch | 2021-10-27 |
 | P280 | slr-rsq | [Relationship between coefficient of determination and correlation coefficient in simple linear regression](/P/slr-rsq) | JoramSoch | 2021-10-27 |
+| P281 | slr-mlr | [Simple linear regression is a special case of multiple linear regression](/P/slr-mlr) | JoramSoch | 2021-11-09 |
+| P282 | slr-olsdist | [Distribution of parameter estimates for simple linear regression](/P/slr-olsdist) | JoramSoch | 2021-11-09 |
+| P283 | slr-proj | [Projection of a data point to the regression line](/P/slr-proj) | JoramSoch | 2021-11-09 |
+| P284 | slr-sss | [Sums of squares for simple linear regression](/P/slr-sss) | JoramSoch | 2021-11-09 |
+| P285 | slr-mat | [Transformation matrices for simple linear regression](/P/slr-mat) | JoramSoch | 2021-11-09 |

I/PbT.md

Lines changed: 5 additions & 0 deletions
@@ -69,6 +69,7 @@ title: "Proof by Topic"
 - [Differential entropy of the multivariate normal distribution](/P/mvn-dent)
 - [Differential entropy of the normal distribution](/P/norm-dent)
 - [Differential entropy of the normal-gamma distribution](/P/ng-dent)
+- [Distribution of parameter estimates for simple linear regression](/P/slr-olsdist)
 - [Distribution of the inverse general linear model](/P/iglm-dist)
 - [Distribution of the transformed general linear model](/P/tglm-dist)
 - [Distributional transformation using cumulative distribution function](/P/cdf-dt)
@@ -271,6 +272,7 @@ title: "Proof by Topic"
 - [Probability under mutual exclusivity](/P/prob-exc)
 - [Probability under statistical independence](/P/prob-ind)
 - [Projection matrix and residual-forming matrix are idempotent](/P/mlr-idem)
+- [Projection of a data point to the regression line](/P/slr-proj)
 - [Proof Template](/P/-temp-)
 
 ### Q
@@ -317,13 +319,16 @@ title: "Proof by Topic"
 - [Savage-Dickey Density Ratio for computing Bayes Factors](/P/bf-sddr)
 - [Scaling of the variance upon multiplication with a constant](/P/var-scal)
 - [Second central moment is variance](/P/momcent-2nd)
+- [Simple linear regression is a special case of multiple linear regression](/P/slr-mlr)
+- [Sums of squares for simple linear regression](/P/slr-sss)
 
 ### T
 
 - [The regression line goes through the center of mass point](/P/slr-comp)
 - [The residuals and the covariate are uncorrelated in simple linear regression](/P/slr-rescorr)
 - [The sum of residuals is zero in simple linear regression](/P/slr-ressum)
 - [Transformation matrices for ordinary least squares](/P/mlr-mat)
+- [Transformation matrices for simple linear regression](/P/slr-mat)
 - [Transitivity of Bayes Factors](/P/bf-trans)
 - [Transposition of a matrix-normal random variable](/P/matn-trans)
 - [Two-sample t-test for independent observations](/P/ug-ttest2)

I/PwS.md

Lines changed: 3 additions & 0 deletions
@@ -86,6 +86,9 @@ title: "Proofs without Source"
 - [Relationship between R² and maximum log-likelihood](/P/rsq-mll)
 - [Relationship between second raw moment, variance and mean](/P/momraw-2nd)
 - [Relationship between signal-to-noise ratio and R²](/P/snr-rsq)
+- [Simple linear regression is a special case of multiple linear regression](/P/slr-mlr)
+- [Sums of squares for simple linear regression](/P/slr-sss)
+- [Transformation matrices for simple linear regression](/P/slr-mat)
 - [Transitivity of Bayes Factors](/P/bf-trans)
 - [Transposition of a matrix-normal random variable](/P/matn-trans)
 - [Variance of the Wald distribution](/P/wald-var)

P/prob-exh.md

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@ username: "JoramSoch"
 ---
 
 
-**Theorem:** Let $B_1, \ldots, B_n$ be mutually exclusive and collectively exhaustive subsets of a [sample space](/D/samp-spc) \Omega. Then, their [total probability](/P/prob-tot) is one:
+**Theorem:** Let $B_1, \ldots, B_n$ be mutually exclusive and collectively exhaustive subsets of a [sample space](/D/samp-spc) $\Omega$. Then, their [total probability](/P/prob-tot) is one:
 
 $$ \label{eq:prob-exh}
 \sum_i P(B_i) = 1 \; .

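Not part of the commit, but the statement this fix touches is easy to illustrate: for any partition of a sample space into mutually exclusive, collectively exhaustive events, the event probabilities sum to one. A toy sketch with a fair die (the partition chosen here is illustrative):

```python
from fractions import Fraction

# partition of a fair die's sample space Omega = {1,...,6} into
# mutually exclusive, collectively exhaustive events
P = {
    "low":  Fraction(2, 6),   # {1, 2}
    "mid":  Fraction(2, 6),   # {3, 4}
    "high": Fraction(2, 6),   # {5, 6}
}

# total probability of the partition is one
assert sum(P.values()) == 1
```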
P/slr-mat.md

Lines changed: 3 additions & 3 deletions
@@ -68,7 +68,7 @@ E &= (X^\mathrm{T} X)^{-1} X^\mathrm{T} \\
 &= \left( \left[ \begin{matrix} n & n\bar{x} \\ n\bar{x} & x^\mathrm{T} x \end{matrix} \right] \right)^{-1} \left[ \begin{matrix} 1_n^\mathrm{T} \\ x^\mathrm{T} \end{matrix} \right] \\
 &= \frac{1}{n x^\mathrm{T} x - (n\bar{x})^2} \left[ \begin{matrix} x^\mathrm{T} x & -n\bar{x} \\ -n\bar{x} & n \end{matrix} \right] \left[ \begin{matrix} 1_n^\mathrm{T} \\ x^\mathrm{T} \end{matrix} \right] \\
 &= \frac{1}{x^\mathrm{T} x - n\bar{x}^2} \left[ \begin{matrix} x^\mathrm{T} x/n & -\bar{x} \\ -\bar{x} & 1 \end{matrix} \right] \left[ \begin{matrix} 1_n^\mathrm{T} \\ x^\mathrm{T} \end{matrix} \right] \\
-&= \frac{1}{(n-1)\,s_x^2} \left[ \begin{matrix} (x^\mathrm{T} x/n) \, 1_n^\mathrm{T} - \bar{x} \, x^\mathrm{T} \\ - \bar{x} \, 1_n^\mathrm{T} + x^\mathrm{T} \end{matrix} \right] \; .
+&\overset{\eqref{eq:b-est-cov-den}}{=} \frac{1}{(n-1)\,s_x^2} \left[ \begin{matrix} (x^\mathrm{T} x/n) \, 1_n^\mathrm{T} - \bar{x} \, x^\mathrm{T} \\ - \bar{x} \, 1_n^\mathrm{T} + x^\mathrm{T} \end{matrix} \right] \; .
 \end{split}
 $$
 
@@ -83,7 +83,7 @@ which is an $n \times n$ matrix and can be reformulated as follows:
 
 $$ \label{eq:P-qed}
 \begin{split}
-P &= X \, E = \left[ \begin{matrix} 1_n & x \end{matrix} \right] \left[ \begin{matrix} e_1 \\ e_2 \end{matrix} \right] \\
+P &= X \, E = \left[ 1_n, \, x \right] \left[ \begin{matrix} e_1 \\ e_2 \end{matrix} \right] \\
 &= \frac{1}{(n-1)\,s_x^2} \left[ \begin{matrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{matrix} \right] \left[ \begin{matrix} (x^\mathrm{T} x/n) - \bar{x} x_1 & \cdots & (x^\mathrm{T} x/n) - \bar{x} x_n \\ -\bar{x} + x_1 & \cdots & -\bar{x} + x_n \end{matrix} \right] \\
 &= \frac{1}{(n-1)\,s_x^2} \left[ \begin{matrix} (x^\mathrm{T} x/n) - 2 \bar{x} x_1 + x_1^2 & \cdots & (x^\mathrm{T} x/n) - \bar{x} (x_1 + x_n) + x_1 x_n \\ \vdots & \ddots & \vdots \\ (x^\mathrm{T} x/n) - \bar{x} (x_1 + x_n) + x_1 x_n & \cdots & (x^\mathrm{T} x/n) - 2 \bar{x} x_n + x_n^2 \end{matrix} \right] \; .
 \end{split}
@@ -101,7 +101,7 @@ which also is an $n \times n$ matrix and can be reformulated as follows:
 $$ \label{eq:R-qed}
 \begin{split}
 R &= I_n - P = \left[ \begin{matrix} 1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 1 \end{matrix} \right] - \left[ \begin{matrix} p_{11} & \cdots & p_{1n} \\ \vdots & \ddots & \vdots \\ p_{n1} & \cdots & p_{nn} \end{matrix} \right] \\
-&= \frac{1}{(n-1)\,s_x^2} \left[ \begin{matrix} x^\mathrm{T} x - n\bar{x}^2 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & x^\mathrm{T} x - n\bar{x}^2 \end{matrix} \right] \\
+&\overset{\eqref{eq:b-est-cov-den}}{=} \frac{1}{(n-1)\,s_x^2} \left[ \begin{matrix} x^\mathrm{T} x - n\bar{x}^2 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & x^\mathrm{T} x - n\bar{x}^2 \end{matrix} \right] \\
 &- \frac{1}{(n-1)\,s_x^2} \left[ \begin{matrix} (x^\mathrm{T} x/n) - 2 \bar{x} x_1 + x_1^2 & \cdots & (x^\mathrm{T} x/n) - \bar{x} (x_1 + x_n) + x_1 x_n \\ \vdots & \ddots & \vdots \\ (x^\mathrm{T} x/n) - \bar{x} (x_1 + x_n) + x_1 x_n & \cdots & (x^\mathrm{T} x/n) - 2 \bar{x} x_n + x_n^2 \end{matrix} \right] \\
 &= \frac{1}{(n-1)\,s_x^2} \left[ \begin{matrix} (n-1) (x^\mathrm{T} x/n) + \bar{x} (2 x_1 - n\bar{x}) - x_1^2 & \cdots & -(x^\mathrm{T} x/n) + \bar{x} (x_1 + x_n) - x_1 x_n \\ \vdots & \ddots & \vdots \\ -(x^\mathrm{T} x/n) + \bar{x} (x_1 + x_n) - x_1 x_n & \cdots & (n-1) (x^\mathrm{T} x/n) + \bar{x} (2 x_n - n\bar{x}) - x_n^2 \end{matrix} \right] \; .
 \end{split}

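Not part of the commit, but the closed-form matrices this file derives are easy to check numerically. The sketch below (assuming NumPy is available; the data are arbitrary) verifies that the closed form for the estimation matrix $E = (X^\mathrm{T} X)^{-1} X^\mathrm{T}$ matches a direct computation, and that $P = XE$ and $R = I_n - P$ are idempotent:

```python
# Numerical sanity check of the slr-mat derivation (illustrative, not from the commit).
import numpy as np

rng = np.random.default_rng(1)
n = 10
x = rng.normal(size=n)                  # covariate values x_1, ..., x_n
X = np.column_stack([np.ones(n), x])    # design matrix [1_n, x]

xbar = x.mean()
s2x = x.var(ddof=1)                     # sample variance s_x^2

# closed form: E = 1/((n-1) s_x^2) [ (x'x/n) 1_n' - xbar x' ; -xbar 1_n' + x' ]
E_closed = (1.0 / ((n - 1) * s2x)) * np.vstack([
    (x @ x / n) * np.ones(n) - xbar * x,    # first row of E
    -xbar * np.ones(n) + x,                 # second row of E
])
E_direct = np.linalg.inv(X.T @ X) @ X.T
assert np.allclose(E_closed, E_direct)

# projection matrix and residual-forming matrix are idempotent
P = X @ E_direct
R = np.eye(n) - P
assert np.allclose(P @ P, P)
assert np.allclose(R @ R, R)
```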
P/slr-olsdist.md

Lines changed: 1 addition & 1 deletion
@@ -39,7 +39,7 @@ $$ \label{eq:slr-olsdist}
 \left[ \begin{matrix} \hat{\beta}_0 \\ \hat{\beta}_1 \end{matrix} \right] \sim \mathcal{N}\left( \left[ \begin{matrix} \beta_0 \\ \beta_1 \end{matrix} \right], \, \frac{\sigma^2}{(n-1) \, s_x^2} \cdot \left[ \begin{matrix} x^\mathrm{T}x/n & -\bar{x} \\ -\bar{x} & 1 \end{matrix} \right] \right)
 $$
 
-where $s_x^2$ is the [sample variance](/D/var-samp) of $x$.
+where $\bar{x}$ is the [sample mean](/D/mean-samp) and $s_x^2$ is the [sample variance](/D/var-samp) of $x$.
 
 
 **Proof:** [Simple linear regression is a special case of multiple linear regression](/P/slr-mlr) with

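Not part of the commit: the covariance matrix in the theorem above can be checked against the general OLS covariance $\sigma^2 (X^\mathrm{T} X)^{-1}$. A minimal NumPy sketch (arbitrary data, assuming NumPy is available):

```python
# Check: sigma^2/((n-1) s_x^2) [[x'x/n, -xbar], [-xbar, 1]] == sigma^2 (X'X)^{-1}
import numpy as np

rng = np.random.default_rng(0)
n, sigma2 = 12, 2.5
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])    # design matrix [1_n, x]
xbar, s2x = x.mean(), x.var(ddof=1)

cov_closed = sigma2 / ((n - 1) * s2x) * np.array([
    [x @ x / n, -xbar],
    [-xbar,      1.0],
])
cov_glm = sigma2 * np.linalg.inv(X.T @ X)   # general linear model covariance
assert np.allclose(cov_closed, cov_glm)
```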
P/slr-proj.md

Lines changed: 4 additions & 4 deletions
@@ -40,7 +40,7 @@ P\left(w \mid \hat{\beta}_0 + \hat{\beta}_1 w\right) \quad \text{with} \quad w =
 $$
 
 
-**Proof:** The intersection point of the regression line with the y-axis is
+**Proof:** The intersection point of the [regression line](/D/regline) with the y-axis is
 
 $$ \label{eq:S}
 S(0 \vert \hat{\beta}_0) \; .
@@ -89,8 +89,8 @@ With \eqref{eq:a} and \eqref{eq:b}, $w$ can be calculated as
 $$ \label{eq:w-qed}
 \begin{split}
 w &= \frac{a^\mathrm{T} b}{a^\mathrm{T} a} \\
-&= \frac{\left( \begin{matrix} 1 \\ \hat{\beta}_1 \end{matrix} \right)^\mathrm{T} \left( \begin{matrix} x_o \\ y_o - \hat{\beta}_0 \end{matrix} \right)}{\left( \begin{matrix} 1 \\ \hat{\beta}_1 \end{matrix} \right)^\mathrm{T} \left( \begin{matrix} 1 \\ \hat{\beta}_1 \end{matrix} \right)} \\
-&= \frac{x_0 + (y_o - \hat{\beta}_0) \hat{\beta}_1}{1 + \hat{\beta}_1^2}
+w &= \frac{\left( \begin{matrix} 1 \\ \hat{\beta}_1 \end{matrix} \right)^\mathrm{T} \left( \begin{matrix} x_o \\ y_o - \hat{\beta}_0 \end{matrix} \right)}{\left( \begin{matrix} 1 \\ \hat{\beta}_1 \end{matrix} \right)^\mathrm{T} \left( \begin{matrix} 1 \\ \hat{\beta}_1 \end{matrix} \right)} \\
+w &= \frac{x_0 + (y_o - \hat{\beta}_0) \hat{\beta}_1}{1 + \hat{\beta}_1^2}
 \end{split}
 $$
 
@@ -100,4 +100,4 @@ $$ \label{eq:P-qed}
 \left( \begin{matrix} x_p \\ y_p \end{matrix} \right) = \left( \begin{matrix} 0 \\ \hat{\beta}_0 \end{matrix} \right) + w \cdot \left( \begin{matrix} 1 \\ \hat{\beta}_1 \end{matrix} \right) = \left( \begin{matrix} w \\ \hat{\beta}_0 + \hat{\beta}_1 w \end{matrix} \right) \; .
 $$
 
-Together, \eqref{eq:P-qed} and \eqref{eq:w-qed} constitute the proof of \eqref{eq:slr-proj}.
+Together, \eqref{eq:P-qed} and \eqref{eq:w-qed} constitute the proof of equation \eqref{eq:slr-proj}.

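Not part of the commit: the projection formula in this proof, $w = \bigl(x_o + (y_o - \hat{\beta}_0)\hat{\beta}_1\bigr)/(1 + \hat{\beta}_1^2)$, can be verified by checking that the vector from the data point to the projected point is orthogonal to the line direction $a = (1, \hat{\beta}_1)^\mathrm{T}$. A small sketch with arbitrary example values:

```python
# Orthogonality check for the projection onto the regression line (illustrative values).
import numpy as np

b0, b1 = 1.0, 0.5       # example intercept and slope estimates
xo, yo = 2.0, 4.0       # data point to be projected

# scalar position along the line direction a = (1, b1)^T
w = (xo + (yo - b0) * b1) / (1 + b1 ** 2)
xp, yp = w, b0 + b1 * w                     # projected point (x_p, y_p)

d = np.array([xo - xp, yo - yp])            # connecting vector
a = np.array([1.0, b1])                     # line direction
assert abs(d @ a) < 1e-12                   # orthogonal to the line
assert np.isclose(yp, b0 + b1 * xp)         # projected point lies on the line
```

For these values, $w = (2 + 3 \cdot 0.5)/1.25 = 2.8$, so the projection lands at $(2.8, 2.4)$.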
P/slr-sss.md

Lines changed: 2 additions & 2 deletions
@@ -92,12 +92,12 @@ $$ \label{eq:RSS-qed}
 \begin{split}
 \mathrm{RSS} &= \sum_{i=1}^n (y_i - \hat{y}_i)^2 \\
 &= \sum_{i=1}^n (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2 \\
-&= \sum_{i=1}^n (y_i - \bar{y} + \hat{\beta}_1 \bar{x} - \hat{\beta}_1 x_i)^2 \\
+&\overset{\eqref{eq:slr-ols}}{=} \sum_{i=1}^n (y_i - \bar{y} + \hat{\beta}_1 \bar{x} - \hat{\beta}_1 x_i)^2 \\
 &= \sum_{i=1}^n \left( (y_i - \bar{y}) - \hat{\beta}_1 (x_i - \bar{x}) \right)^2 \\
 &= \sum_{i=1}^n \left( (y_i - \bar{y})^2 - 2 \hat{\beta}_1 (x_i - \bar{x}) (y_i - \bar{y}) + \hat{\beta}_1^2 (x_i - \bar{x})^2 \right) \\
 &= \sum_{i=1}^n (y_i - \bar{y})^2 - 2 \hat{\beta}_1 \sum_{i=1}^n (x_i - \bar{x}) (y_i - \bar{y}) + \hat{\beta}_1^2 \sum_{i=1}^n (x_i - \bar{x})^2 \\
 &= (n-1) \, s_y^2 - 2 (n-1) \, \hat{\beta}_1 \, s_{xy} + (n-1) \, \hat{\beta}_1^2 \, s_x^2 \\
-&= (n-1) \, s_y^2 - 2 (n-1) \left( \frac{s_{xy}}{s_x^2} \right) s_{xy} + (n-1) \left( \frac{s_{xy}}{s_x^2} \right)^2 s_x^2 \\
+&\overset{\eqref{eq:slr-ols}}{=} (n-1) \, s_y^2 - 2 (n-1) \left( \frac{s_{xy}}{s_x^2} \right) s_{xy} + (n-1) \left( \frac{s_{xy}}{s_x^2} \right)^2 s_x^2 \\
 &= (n-1) \, s_y^2 - (n-1) \, \frac{s_{xy}^2}{s_x^2} \\
 &= (n-1) \left( s_y^2 - \frac{s_{xy}^2}{s_x^2} \right) \; .
 \end{split}

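Not part of the commit: the final identity of this derivation, $\mathrm{RSS} = (n-1)\bigl(s_y^2 - s_{xy}^2/s_x^2\bigr)$, can be confirmed numerically against a direct residual sum. A sketch with simulated data (assuming NumPy is available):

```python
# Check RSS closed form against the direct residual sum of squares.
import numpy as np

rng = np.random.default_rng(42)
n = 20
x = rng.normal(size=n)
y = 1.5 + 0.8 * x + rng.normal(scale=0.3, size=n)

xbar, ybar = x.mean(), y.mean()
s2x, s2y = x.var(ddof=1), y.var(ddof=1)            # sample variances
sxy = ((x - xbar) @ (y - ybar)) / (n - 1)          # sample covariance

# ordinary least squares estimates for simple linear regression
b1 = sxy / s2x
b0 = ybar - b1 * xbar

rss_direct = np.sum((y - b0 - b1 * x) ** 2)
rss_closed = (n - 1) * (s2y - sxy ** 2 / s2x)      # last line of eq. RSS-qed
assert np.isclose(rss_direct, rss_closed)
```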