---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2021-10-27 08:56:00

title: "Ordinary least squares for simple linear regression"
chapter: "Statistical Models"
section: "Univariate normal data"
topic: "Simple linear regression"
theorem: "Ordinary least squares"

sources:
  - authors: "Penny, William"
    year: 2006
    title: "Linear regression"
    in: "Mathematics for Brain Imaging"
    pages: "ch. 1.2.2, pp. 14-16, eqs. 1.24/1.25"
    url: "https://ueapsylabs.co.uk/sites/wpenny/mbi/mbi_course.pdf"
  - authors: "Wikipedia"
    year: 2021
    title: "Proofs involving ordinary least squares"
    in: "Wikipedia, the free encyclopedia"
    pages: "retrieved on 2021-10-27"
    url: "https://en.wikipedia.org/wiki/Proofs_involving_ordinary_least_squares#Derivation_of_simple_linear_regression_estimators"

proof_id: "P271"
shortcut: "slr-ols"
username: "JoramSoch"
---


**Theorem:** Given a [simple linear regression model](/D/slr) with independent observations

$$ \label{eq:slr}
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \; \varepsilon_i \sim \mathcal{N}(0, \sigma^2), \; i = 1,\ldots,n \; ,
$$

the parameters minimizing the [residual sum of squares](/D/rss) are given by

$$ \label{eq:slr-ols}
\begin{split}
\hat{\beta}_0 &= \bar{y} - \hat{\beta}_1 \bar{x} \\
\hat{\beta}_1 &= \frac{s_{xy}}{s_x^2}
\end{split}
$$

where $\bar{x}$ and $\bar{y}$ are the [sample means](/D/mean-samp), $s_x^2$ is the [sample variance](/D/var-samp) of $x$ and $s_{xy}$ is the [sample covariance](/D/cov-samp) between $x$ and $y$.
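As a quick numerical sketch (not part of the proof), the closed-form estimates can be checked against a generic least-squares solver; the data below are simulated with arbitrary, hypothetical parameter values and assume NumPy is available:

```python
import numpy as np

# Simulate data from a simple linear regression model (hypothetical values)
rng = np.random.default_rng(1)
n = 100
x = rng.normal(0, 2, n)
y = 3.0 + 1.5 * x + rng.normal(0, 0.5, n)

# Closed-form OLS estimates from the theorem
s_xy = np.cov(x, y, ddof=1)[0, 1]   # sample covariance s_xy
s_xx = np.var(x, ddof=1)            # sample variance s_x^2
beta1 = s_xy / s_xx                 # slope: s_xy / s_x^2
beta0 = y.mean() - beta1 * x.mean() # intercept: ybar - beta1 * xbar

# Reference solution from a generic least-squares solver
X = np.column_stack([np.ones(n), x])
beta_ref = np.linalg.lstsq(X, y, rcond=None)[0]

assert np.allclose([beta0, beta1], beta_ref)
```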


**Proof:** The [residual sum of squares](/D/rss) is defined as

$$ \label{eq:rss}
\mathrm{RSS}(\beta_0,\beta_1) = \sum_{i=1}^n \varepsilon_i^2 = \sum_{i=1}^n (y_i - \beta_0 - \beta_1 x_i)^2 \; .
$$

The derivatives of $\mathrm{RSS}(\beta_0,\beta_1)$ with respect to $\beta_0$ and $\beta_1$ are

$$ \label{eq:rss-der}
\begin{split}
\frac{\mathrm{d}\mathrm{RSS}(\beta_0,\beta_1)}{\mathrm{d}\beta_0} &= \sum_{i=1}^n 2 (y_i - \beta_0 - \beta_1 x_i) (-1) \\
&= -2 \sum_{i=1}^n (y_i - \beta_0 - \beta_1 x_i) \\
\frac{\mathrm{d}\mathrm{RSS}(\beta_0,\beta_1)}{\mathrm{d}\beta_1} &= \sum_{i=1}^n 2 (y_i - \beta_0 - \beta_1 x_i) (-x_i) \\
&= -2 \sum_{i=1}^n (x_i y_i - \beta_0 x_i - \beta_1 x_i^2)
\end{split}
$$

and setting these derivatives to zero

$$ \label{eq:rss-der-zero}
\begin{split}
0 &= -2 \sum_{i=1}^n (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) \\
0 &= -2 \sum_{i=1}^n (x_i y_i - \hat{\beta}_0 x_i - \hat{\beta}_1 x_i^2)
\end{split}
$$

yields the following normal equations:

$$ \label{eq:slr-norm-eq}
\begin{split}
\hat{\beta}_1 \sum_{i=1}^n x_i + \hat{\beta}_0 \cdot n &= \sum_{i=1}^n y_i \\
\hat{\beta}_1 \sum_{i=1}^n x_i^2 + \hat{\beta}_0 \sum_{i=1}^n x_i &= \sum_{i=1}^n x_i y_i \; .
\end{split}
$$
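As a sanity check outside the proof itself, one can verify numerically that the closed-form estimates satisfy both equations above; the data and parameter values here are arbitrary, and NumPy is assumed:

```python
import numpy as np

# Simulated data with arbitrary, hypothetical parameters
rng = np.random.default_rng(0)
n = 50
x = rng.uniform(-1, 1, n)
y = 2.0 - 0.7 * x + rng.normal(0, 0.1, n)

# Closed-form OLS estimates
beta1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
beta0 = y.mean() - beta1 * x.mean()

# First equation: beta1 * sum(x_i) + beta0 * n = sum(y_i)
assert np.isclose(beta1 * x.sum() + beta0 * n, y.sum())
# Second equation: beta1 * sum(x_i^2) + beta0 * sum(x_i) = sum(x_i * y_i)
assert np.isclose(beta1 * (x**2).sum() + beta0 * x.sum(), (x * y).sum())
```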

From the first equation, we can derive the estimate for the intercept:

$$ \label{eq:slr-ols-int}
\begin{split}
\hat{\beta}_0 &= \frac{1}{n} \sum_{i=1}^n y_i - \hat{\beta}_1 \cdot \frac{1}{n} \sum_{i=1}^n x_i \\
&= \bar{y} - \hat{\beta}_1 \bar{x} \; .
\end{split}
$$

From the second equation, we can derive the estimate for the slope:

$$ \label{eq:slr-ols-sl}
\begin{split}
\hat{\beta}_1 \sum_{i=1}^n x_i^2 + \hat{\beta}_0 \sum_{i=1}^n x_i &= \sum_{i=1}^n x_i y_i \\
\hat{\beta}_1 \sum_{i=1}^n x_i^2 + \left( \bar{y} - \hat{\beta}_1 \bar{x} \right) \sum_{i=1}^n x_i &\overset{\eqref{eq:slr-ols-int}}{=} \sum_{i=1}^n x_i y_i \\
\hat{\beta}_1 \left( \sum_{i=1}^n x_i^2 - \bar{x} \sum_{i=1}^n x_i \right) &= \sum_{i=1}^n x_i y_i - \bar{y} \sum_{i=1}^n x_i \\
\hat{\beta}_1 &= \frac{\sum_{i=1}^n x_i y_i - \bar{y} \sum_{i=1}^n x_i}{\sum_{i=1}^n x_i^2 - \bar{x} \sum_{i=1}^n x_i} \; .
\end{split}
$$

Note that the numerator can be rewritten as

$$ \label{eq:slr-ols-sl-num}
\begin{split}
\sum_{i=1}^n x_i y_i - \bar{y} \sum_{i=1}^n x_i &= \sum_{i=1}^n x_i y_i - n \bar{x} \bar{y} \\
&= \sum_{i=1}^n x_i y_i - n \bar{x} \bar{y} - n \bar{x} \bar{y} + n \bar{x} \bar{y} \\
&= \sum_{i=1}^n x_i y_i - \bar{y} \sum_{i=1}^n x_i - \bar{x} \sum_{i=1}^n y_i + \sum_{i=1}^n \bar{x} \bar{y} \\
&= \sum_{i=1}^n \left( x_i y_i - x_i \bar{y} - \bar{x} y_i + \bar{x} \bar{y} \right) \\
&= \sum_{i=1}^n (x_i - \bar{x}) (y_i - \bar{y})
\end{split}
$$

and that the denominator can be rewritten as

$$ \label{eq:slr-ols-sl-den}
\begin{split}
\sum_{i=1}^n x_i^2 - \bar{x} \sum_{i=1}^n x_i &= \sum_{i=1}^n x_i^2 - n \bar{x}^2 \\
&= \sum_{i=1}^n x_i^2 - 2 n \bar{x} \bar{x} + n \bar{x}^2 \\
&= \sum_{i=1}^n x_i^2 - 2 \bar{x} \sum_{i=1}^n x_i + \sum_{i=1}^n \bar{x}^2 \\
&= \sum_{i=1}^n \left( x_i^2 - 2 \bar{x} x_i + \bar{x}^2 \right) \\
&= \sum_{i=1}^n (x_i - \bar{x})^2 \; .
\end{split}
$$
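These two algebraic identities hold for any data set, which a short numerical check (assuming NumPy, with arbitrary random data) can illustrate:

```python
import numpy as np

# Arbitrary data: the identities are purely algebraic, no model needed
rng = np.random.default_rng(42)
x = rng.normal(size=30)
y = rng.normal(size=30)
xbar, ybar = x.mean(), y.mean()

# Numerator identity: sum(x*y) - ybar*sum(x) == sum((x - xbar)*(y - ybar))
assert np.isclose((x * y).sum() - ybar * x.sum(),
                  ((x - xbar) * (y - ybar)).sum())
# Denominator identity: sum(x^2) - xbar*sum(x) == sum((x - xbar)^2)
assert np.isclose((x**2).sum() - xbar * x.sum(),
                  ((x - xbar)**2).sum())
```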

With \eqref{eq:slr-ols-sl-num} and \eqref{eq:slr-ols-sl-den}, the estimate from \eqref{eq:slr-ols-sl} can be simplified as follows:

$$ \label{eq:slr-ols-sl-qed}
\begin{split}
\hat{\beta}_1 &= \frac{\sum_{i=1}^n x_i y_i - \bar{y} \sum_{i=1}^n x_i}{\sum_{i=1}^n x_i^2 - \bar{x} \sum_{i=1}^n x_i} \\
&= \frac{\sum_{i=1}^n (x_i - \bar{x}) (y_i - \bar{y})}{\sum_{i=1}^n (x_i - \bar{x})^2} \\
&= \frac{\frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x}) (y_i - \bar{y})}{\frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2} \\
&= \frac{s_{xy}}{s_x^2} \; .
\end{split}
$$

Together, \eqref{eq:slr-ols-int} and \eqref{eq:slr-ols-sl-qed} constitute the ordinary least squares parameter estimates for simple linear regression.