**Definition:** Let there be a [simple linear regression with independent observations](/D/slr) using dependent variable $y$ and independent variable $x$:

$$
y = \beta_0 + \beta_1 x + \varepsilon, \; \varepsilon_i \sim \mathcal{N}(0, \sigma^2), \; i = 1,\ldots,n \; .
$$

Then, given some parameters $\beta_0, \beta_1 \in \mathbb{R}$, the set
$$ \label{eq:regline}
L(\beta_0, \beta_1) = \left\lbrace (x,y) \in \mathbb{R}^2 \mid y = \beta_0 + \beta_1 x \right\rbrace
$$

is called a "regression line" and the set

$$ \label{eq:regline-ols}
L(\hat{\beta}_0, \hat{\beta}_1) = \left\lbrace (x,y) \in \mathbb{R}^2 \mid y = \hat{\beta}_0 + \hat{\beta}_1 x \right\rbrace
$$
is called the "fitted regression line", with estimated regression coefficients $\hat{\beta}_0, \hat{\beta}_1$, e.g. obtained via [ordinary least squares](/P/slr-ols).
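As a concrete illustration, the fitted regression line from the definition above can be computed numerically. This is a minimal sketch using NumPy on simulated data; the true parameter values and all variable names are chosen purely for illustration:

```python
import numpy as np

# Simulated data for a simple linear regression (illustrative values)
rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 2.0 + 0.5 * x + rng.normal(scale=0.3, size=100)

# Ordinary least squares estimates of slope and intercept
beta1_hat = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
beta0_hat = y.mean() - beta1_hat * x.mean()

def fitted(t):
    """Evaluate the fitted regression line L(beta0_hat, beta1_hat) at t."""
    return beta0_hat + beta1_hat * t
```

With enough observations, `beta0_hat` and `beta1_hat` recover the simulated intercept and slope up to sampling noise.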
**Definition:** Let $y$ and $x$ be two $n \times 1$ vectors.
Then, a statement asserting a linear relationship between $x$ and $y$

$$ \label{eq:slr-model}
y = \beta_0 + \beta_1 x + \varepsilon \; ,
$$

together with a statement asserting a [normal distribution](/D/mvn) for $\varepsilon$

$$ \label{eq:slr-noise}
\varepsilon \sim \mathcal{N}(0, \sigma^2 V)
$$

is called a univariate simple regression model or simply, "simple linear regression".

* $y$ is called "dependent variable", "measured data" or "signal";
* $x$ is called "independent variable", "predictor" or "covariate";
* $V$ is called "covariance matrix" or "covariance structure";
* $\beta_1$ is called "slope of the [regression line](/D/regline)";
* $\beta_0$ is called "intercept of the [regression line](/D/regline)";
* $\varepsilon$ is called "noise", "errors" or "error terms";
* $\sigma^2$ is called "noise variance" or "error variance";
* $n$ is the number of observations.
When the covariance structure $V$ is equal to the $n \times n$ identity matrix, this is called simple linear regression with independent and identically distributed (i.i.d.) observations:
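The i.i.d. special case ($V = I_n$) can be simulated directly. A minimal sketch, with all parameter values chosen for illustration:

```python
import numpy as np

# True parameters (illustrative)
n = 1000               # number of observations
beta0, beta1 = 1.5, -0.8
sigma2 = 0.25          # noise variance

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=n)                # independent variable
eps = rng.normal(0, np.sqrt(sigma2), size=n)  # i.i.d. noise, i.e. V = I_n
y = beta0 + beta1 * x + eps                   # dependent variable
```

Because the errors are independent with common variance, the sample variance of `eps` concentrates around `sigma2` as $n$ grows.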
**Theorem:** In [simple linear regression](/D/slr), the [regression line](/D/regline) estimated using [ordinary least squares](/P/slr-ols) includes the point $M(\bar{x},\bar{y})$.
**Proof:** The [fitted regression line](/D/regline) is described by the equation

$$ \label{eq:slr-ols-regline}
y = \hat{\beta}_0 + \hat{\beta}_1 x \quad \text{where} \quad x,y \in \mathbb{R} \; .
$$

Plugging in the coordinates of $M$ and the [ordinary least squares estimate of the intercept](/P/slr-ols), $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$, we obtain

$$
\bar{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x} = \left( \bar{y} - \hat{\beta}_1 \bar{x} \right) + \hat{\beta}_1 \bar{x} = \bar{y}
$$
which is a true statement. Thus, the [regression line](/D/regline) goes through the center of mass point $(\bar{x},\bar{y})$, if [the model](/D/slr) includes an intercept term $\beta_0$.
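The theorem can be checked numerically: for any data set, the OLS line evaluated at $\bar{x}$ returns $\bar{y}$ up to floating-point error. A sketch with arbitrary illustrative data:

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=50)
y = 3.0 - 1.2 * x + rng.normal(scale=0.5, size=50)

# OLS estimates of slope and intercept
beta1_hat = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
beta0_hat = y.mean() - beta1_hat * x.mean()

# The fitted line evaluated at the mean of x equals the mean of y
line_at_xbar = beta0_hat + beta1_hat * x.mean()
```

This holds by construction of the intercept estimate, independently of the data values.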
**Theorem:** Assume a [simple linear regression model](/D/slr) with independent observations
$$ \label{eq:slr}
y = \beta_0 + \beta_1 x + \varepsilon, \; \varepsilon_i \sim \mathcal{N}(0, \sigma^2), \; i = 1,\ldots,n
$$

and consider estimation using [ordinary least squares](/P/slr-ols). Then, the [correlation coefficient](/D/corr) and the estimated value of the [slope parameter](/D/slr) are related to each other via the sample [standard deviations](/D/std):

$$ \label{eq:slr-corr}
r_{xy} = \frac{s_x}{s_y} \, \hat{\beta}_1 \; .
$$

**Proof:** The [ordinary least squares estimate of the slope](/P/slr-ols) is given by

$$ \label{eq:slr-ols-sl}
\hat{\beta}_1 = \frac{s_{xy}}{s_x^2} \; .
$$

Using the [relationship between covariance and correlation](/D/cov-corr), $s_{xy} = r_{xy} \, s_x s_y$, this becomes

$$
\hat{\beta}_1 = \frac{r_{xy} \, s_x s_y}{s_x^2} = \frac{s_y}{s_x} \, r_{xy} \; ,
$$

and rearranging yields $r_{xy} = \frac{s_x}{s_y} \, \hat{\beta}_1$.
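The identity relating the correlation coefficient and the OLS slope is exact, not asymptotic, so it can be verified on any sample. A minimal sketch with illustrative simulated data:

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(size=200)
y = 0.5 + 2.0 * x + rng.normal(size=200)

s_x = np.std(x, ddof=1)               # sample standard deviation of x
s_y = np.std(y, ddof=1)               # sample standard deviation of y
s_xy = np.cov(x, y, ddof=1)[0, 1]     # sample covariance

beta1_hat = s_xy / s_x**2             # OLS slope estimate
r_xy = s_xy / (s_x * s_y)             # sample correlation coefficient

# r_xy == (s_x / s_y) * beta1_hat holds up to floating-point error
```

Both sides reduce to $s_{xy} / (s_x s_y)$ algebraically, which is why the agreement is exact for any data set.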