---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2021-10-27 14:37:00

title: "Relationship between residual variance and sample variance in simple linear regression"
chapter: "Statistical Models"
section: "Univariate normal data"
topic: "Simple linear regression"
theorem: "Residual variance in terms of sample variance"

sources:
  - authors: "Penny, William"
    year: 2006
    title: "Relation to correlation"
    in: "Mathematics for Brain Imaging"
    pages: "ch. 1.2.3, p. 18, eq. 1.28"
    url: "https://ueapsylabs.co.uk/sites/wpenny/mbi/mbi_course.pdf"
  - authors: "Wikipedia"
    year: 2021
    title: "Simple linear regression"
    in: "Wikipedia, the free encyclopedia"
    pages: "retrieved on 2021-10-27"
    url: "https://en.wikipedia.org/wiki/Simple_linear_regression#Numerical_properties"

proof_id: "P278"
shortcut: "slr-resvar"
username: "JoramSoch"
---


**Theorem:** Assume a [simple linear regression model](/D/slr) with independent observations

$$ \label{eq:slr}
y = \beta_0 + \beta_1 x + \varepsilon, \; \varepsilon_i \sim \mathcal{N}(0, \sigma^2), \; i = 1,\ldots,n
$$

and consider estimation using [ordinary least squares](/P/slr-ols). Then, [residual variance](/D/resvar) and [sample variance](/D/var-samp) are related to each other via the [correlation coefficient](/D/corr):

$$ \label{eq:slr-vars}
\hat{\sigma}^2 = \left( 1 - r_{xy}^2 \right) s_y^2 \; .
$$

**Proof:** The [residual variance](/D/resvar) can be expressed in terms of the [residual sum of squares](/D/rss):

$$ \label{eq:slr-res}
\hat{\sigma}^2 = \frac{1}{n-1} \, \mathrm{RSS}(\hat{\beta}_0,\hat{\beta}_1)
$$

and [the residual sum of squares for simple linear regression](/P/slr-sss) is

$$ \label{eq:slr-rss}
\mathrm{RSS}(\hat{\beta}_0,\hat{\beta}_1) = (n-1) \left( s_y^2 - \frac{s_{xy}^2}{s_x^2} \right) \; .
$$
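
For reference (this recap is not part of the cited sources), the sample statistics appearing in the derivation are the Bessel-corrected estimates, with $\bar{x}$ and $\bar{y}$ denoting the sample means; this matches the $(n-1)$ factor used above:

$$
s_x^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \; , \quad
s_y^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2 \; , \quad
s_{xy} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x}) (y_i - \bar{y}) \; .
$$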

Combining \eqref{eq:slr-res} and \eqref{eq:slr-rss}, we obtain:

$$ \label{eq:slr-vars-s1}
\begin{split}
\hat{\sigma}^2 &= \left( s_y^2 - \frac{s_{xy}^2}{s_x^2} \right) \\
&= \left( 1 - \frac{s_{xy}^2}{s_x^2 s_y^2} \right) s_y^2 \\
&= \left( 1 - \left( \frac{s_{xy}}{s_x \, s_y} \right)^2 \right) s_y^2 \; .
\end{split}
$$

Using the [relationship between correlation, covariance and standard deviation](/D/corr)

$$ \label{eq:corr-cov-std}
\mathrm{Corr}(X,Y) = \frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)} \sqrt{\mathrm{Var}(Y)}} \; ,
$$

which also holds for the sample correlation, [sample covariance](/D/cov-samp) and sample [standard deviations](/D/std)

$$ \label{eq:corr-cov-std-samp}
r_{xy} = \frac{s_{xy}}{s_x \, s_y} \; ,
$$

we get the final result:

$$ \label{eq:slr-vars-s2}
\hat{\sigma}^2 = \left( 1 - r_{xy}^2 \right) s_y^2 \; .
$$
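
As a quick sanity check (not part of the proof), the identity can be verified numerically on simulated data. The script below is a minimal sketch assuming NumPy; the residual variance is computed as $\mathrm{RSS}/(n-1)$ to match the convention used above, and all coefficient values are arbitrary choices for illustration:

```python
import numpy as np

# simulate data from a simple linear regression model (arbitrary parameters)
rng = np.random.default_rng(1)
n = 100
x = rng.normal(size=n)
y = 2.0 + 0.5 * x + rng.normal(scale=0.3, size=n)

# sample covariance and variances (Bessel-corrected, i.e. divided by n-1)
s_xy = np.cov(x, y, ddof=1)[0, 1]
s_x2 = np.var(x, ddof=1)
s_y2 = np.var(y, ddof=1)

# ordinary least squares estimates for simple linear regression
b1 = s_xy / s_x2                 # slope
b0 = y.mean() - b1 * x.mean()    # intercept

# left-hand side: residual variance as RSS/(n-1)
rss = np.sum((y - b0 - b1 * x) ** 2)
sigma_hat2 = rss / (n - 1)

# right-hand side: (1 - r_xy^2) * s_y^2
r_xy = s_xy / np.sqrt(s_x2 * s_y2)
rhs = (1.0 - r_xy ** 2) * s_y2

print(np.isclose(sigma_hat2, rhs))  # → True (the identity holds exactly)
```

Since the relationship is an algebraic identity rather than an asymptotic statement, the two sides agree to floating-point precision for any sample, not just in expectation.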