Commit cfb4582

added 2 proofs
1 parent 2aa0d1a commit cfb4582

2 files changed

Lines changed: 136 additions & 0 deletions

P/corr-range.md

Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
1+
---
2+
layout: proof
3+
mathjax: true
4+
5+
author: "Joram Soch"
6+
affiliation: "BCCN Berlin"
7+
e_mail: "joram.soch@bccn-berlin.de"
8+
date: 2021-12-14 02:08:00
9+
10+
title: "Correlation always falls between -1 and +1"
11+
chapter: "General Theorems"
12+
section: "Probability theory"
13+
topic: "Correlation"
14+
theorem: "Range"
15+
16+
sources:
17+
- authors: "Dor Leventer"
18+
year: 2021
19+
title: "How can I simply prove that the pearson correlation coefficient is between -1 and 1?"
20+
in: "StackExchange Mathematics"
21+
pages: "retrieved on 2021-12-14"
22+
url: "https://math.stackexchange.com/a/4260655/480910"
23+
24+
proof_id: "P300"
25+
shortcut: "corr-range"
26+
username: "JoramSoch"
27+
---
28+
29+
30+
**Theorem:** Let $X$ and $Y$ be two [random variables](/D/rvar) with finite, non-zero variances. Then, the correlation of $X$ and $Y$ lies between $-1$ and $+1$, inclusive:
31+
32+
$$ \label{eq:corr-range}
33+
-1 \leq \mathrm{Corr}(X,Y) \leq +1 \; .
34+
$$
35+
36+
37+
**Proof:** Consider the [variance](/D/var) of the sum or difference of $X$ and $Y$, each divided by its [standard deviation](/D/std):
38+
39+
$$ \label{eq:var-XY}
40+
\mathrm{Var}\left( \frac{X}{\sigma_X} \pm \frac{Y}{\sigma_Y} \right) \; .
41+
$$
42+
43+
Because the [variance is non-negative](/P/var-nonneg), this term is larger than or equal to zero:
44+
45+
$$ \label{eq:var-XY-0}
46+
0 \leq \mathrm{Var}\left( \frac{X}{\sigma_X} \pm \frac{Y}{\sigma_Y} \right) \; .
47+
$$
48+
49+
Using the [variance of a linear combination](/P/var-lincomb), it can also be written as:
50+
51+
$$ \label{eq:var-XY-s1}
52+
\begin{split}
53+
\mathrm{Var}\left( \frac{X}{\sigma_X} \pm \frac{Y}{\sigma_Y} \right) &= \mathrm{Var}\left( \frac{X}{\sigma_X} \right) + \mathrm{Var}\left( \frac{Y}{\sigma_Y} \right) \pm 2 \, \mathrm{Cov}\left( \frac{X}{\sigma_X}, \frac{Y}{\sigma_Y} \right) \\
54+
&= \frac{1}{\sigma_X^2} \mathrm{Var}(X) + \frac{1}{\sigma_Y^2} \mathrm{Var}(Y) \pm 2 \, \frac{1}{\sigma_X \sigma_Y} \, \mathrm{Cov}(X,Y) \\
55+
&= \frac{1}{\sigma_X^2} \sigma_X^2 + \frac{1}{\sigma_Y^2} \sigma_Y^2 \pm 2 \, \frac{1}{\sigma_X \sigma_Y} \, \sigma_{XY} \; .
56+
\end{split}
57+
$$
58+
59+
Using the [relationship between covariance and correlation](/P/cov-corr), we have:
60+
61+
$$ \label{eq:var-XY-s2}
62+
\mathrm{Var}\left( \frac{X}{\sigma_X} \pm \frac{Y}{\sigma_Y} \right) = 1 + 1 \pm 2 \, \mathrm{Corr}(X,Y) \; .
63+
$$
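
The identity above can be checked numerically. The following is a minimal sketch, assuming simulated data and sample moments with an $n-1$ denominator in place of the population quantities; the helper `sample_cov`, the seed and the data-generating constants are hypothetical choices made only for illustration.

```python
import math
import random
import statistics


def sample_cov(a, b):
    """Sample covariance (n-1 denominator) of two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)


random.seed(0)  # arbitrary fixed seed, for reproducibility only
x = [random.gauss(0.0, 2.0) for _ in range(1000)]
y = [0.5 * xi + random.gauss(0.0, 1.0) for xi in x]  # correlated with x by construction

s_x, s_y = statistics.stdev(x), statistics.stdev(y)
corr = sample_cov(x, y) / (s_x * s_y)

for sign in (+1, -1):
    # Standardize both variables, then add or subtract them.
    combo = [xi / s_x + sign * yi / s_y for xi, yi in zip(x, y)]
    lhs = statistics.variance(combo)  # Var(X/s_X +/- Y/s_Y)
    rhs = 2 + sign * 2 * corr         # 2 +/- 2 Corr(X,Y)
    assert math.isclose(lhs, rhs, rel_tol=1e-9, abs_tol=1e-9)
    assert lhs >= 0                   # a variance is never negative
```

Because the same $n-1$ normalization is used for the variances and the covariance, the identity holds for the sample quantities up to floating-point error, not just approximately.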
64+
65+
Thus, the combination of \eqref{eq:var-XY-0} with \eqref{eq:var-XY-s2} yields
66+
67+
$$ \label{eq:var-XY-ineq}
68+
0 \leq 2 \pm 2 \, \mathrm{Corr}(X,Y)
69+
$$
70+
71+
which, taking the plus sign to obtain the lower bound and the minus sign to obtain the upper bound, is equivalent to
72+
73+
$$ \label{eq:corr-range-qed}
74+
-1 \leq \mathrm{Corr}(X,Y) \leq +1 \; .
75+
$$
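
As a worked example of when a bound is attained, assume $Y = aX + b$ with constants $a > 0$ and $b$ (symbols introduced here only for illustration). Then $\sigma_Y = a \, \sigma_X$, the difference of the standardized variables is constant, and its variance vanishes:

$$
\frac{X}{\sigma_X} - \frac{Y}{\sigma_Y} = \frac{X}{\sigma_X} - \frac{aX + b}{a \, \sigma_X} = -\frac{b}{a \, \sigma_X}
\quad \Rightarrow \quad
0 = \mathrm{Var}\left( \frac{X}{\sigma_X} - \frac{Y}{\sigma_Y} \right) = 2 - 2 \, \mathrm{Corr}(X,Y)
\quad \Rightarrow \quad
\mathrm{Corr}(X,Y) = +1 \; .
$$

For $a < 0$, we have $\sigma_Y = \lvert a \rvert \, \sigma_X$, and the same argument applied to the sum of the standardized variables gives $\mathrm{Corr}(X,Y) = -1$.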

P/corr-z.md

Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
1+
---
2+
layout: proof
3+
mathjax: true
4+
5+
author: "Joram Soch"
6+
affiliation: "BCCN Berlin"
7+
e_mail: "joram.soch@bccn-berlin.de"
8+
date: 2021-12-14 02:31:00
9+
10+
title: "Correlation coefficient in terms of standard scores"
11+
chapter: "General Theorems"
12+
section: "Probability theory"
13+
topic: "Correlation"
14+
theorem: "Relationship to standard scores"
15+
16+
sources:
17+
- authors: "Wikipedia"
18+
year: 2021
19+
title: "Pearson correlation coefficient"
20+
in: "Wikipedia, the free encyclopedia"
21+
pages: "retrieved on 2021-12-14"
22+
url: "https://en.wikipedia.org/wiki/Pearson_correlation_coefficient#For_a_sample"
23+
24+
proof_id: "P299"
25+
shortcut: "corr-z"
26+
username: "JoramSoch"
27+
---
28+
29+
30+
**Theorem:** Let $x = \left\lbrace x_1, \ldots, x_n \right\rbrace$ and $y = \left\lbrace y_1, \ldots, y_n \right\rbrace$ be [samples](/D/samp) from [random variables](/D/rvar) $X$ and $Y$. Then, the [sample correlation coefficient](/D/corr-samp) $r_{xy}$ can be expressed in terms of the [standard scores](/D/z) of $x$ and $y$:
31+
32+
$$ \label{eq:corr-z}
33+
r_{xy} = \frac{1}{n-1} \sum_{i=1}^n z_i^{(x)} \cdot z_i^{(y)} = \frac{1}{n-1} \sum_{i=1}^n \left( \frac{x_i-\bar{x}}{s_x} \right) \left( \frac{y_i-\bar{y}}{s_y} \right)
34+
$$
35+
36+
where $\bar{x}$ and $\bar{y}$ are the [sample means](/D/mean-samp) and $s_x$ and $s_y$ are the sample standard deviations, i.e. the square roots of the [sample variances](/D/var-samp) $s_x^2$ and $s_y^2$.
37+
38+
39+
**Proof:** The [sample correlation coefficient](/D/corr-samp) is defined as
40+
41+
$$ \label{eq:corr-samp}
42+
r_{xy} = \frac{\sum_{i=1}^n (x_i-\bar{x}) (y_i-\bar{y})}{\sqrt{\sum_{i=1}^n (x_i-\bar{x})^2} \sqrt{\sum_{i=1}^n (y_i-\bar{y})^2}} \; .
43+
$$
44+
45+
Using the [sample variances](/D/var-samp) of $x$ and $y$, we can write:
46+
47+
$$ \label{eq:corr-z-s1}
48+
r_{xy} = \frac{\sum_{i=1}^n (x_i-\bar{x}) (y_i-\bar{y})}{\sqrt{(n-1) s_x^2} \sqrt{(n-1) s_y^2}} \; .
49+
$$
50+
51+
Rearranging the terms, we arrive at:
52+
53+
$$ \label{eq:corr-z-s2}
54+
r_{xy} = \frac{1}{(n-1) \, s_x \, s_y} \sum_{i=1}^n (x_i-\bar{x}) (y_i-\bar{y}) \; .
55+
$$
56+
57+
Further simplifying, the result is:
58+
59+
$$ \label{eq:corr-z-s3}
60+
r_{xy} = \frac{1}{n-1} \sum_{i=1}^n \left( \frac{x_i-\bar{x}}{s_x} \right) \left( \frac{y_i-\bar{y}}{s_y} \right) \; .
61+
$$
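
The equivalence of the two forms can be illustrated numerically. This is a minimal sketch using only the Python standard library; the toy data and all variable names are hypothetical and chosen purely for illustration.

```python
import math
import statistics

# Hypothetical toy data, for illustration only.
x = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
y = [1.0, 3.0, 2.0, 6.0, 8.0, 7.0]
n = len(x)

x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
s_x, s_y = statistics.stdev(x), statistics.stdev(y)  # sample standard deviations (n-1 denominator)

# Sums of squared deviations and of cross-products.
ss_x = sum((xi - x_bar) ** 2 for xi in x)
ss_y = sum((yi - y_bar) ** 2 for yi in y)
sp_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Intermediate step: sum of squares equals (n-1) times the sample variance.
assert math.isclose(ss_x, (n - 1) * s_x ** 2, rel_tol=1e-9)
assert math.isclose(ss_y, (n - 1) * s_y ** 2, rel_tol=1e-9)

# Definition of the sample correlation coefficient.
r_def = sp_xy / (math.sqrt(ss_x) * math.sqrt(ss_y))

# Standard-score form: sum of z-score products divided by n-1.
z_x = [(xi - x_bar) / s_x for xi in x]
z_y = [(yi - y_bar) / s_y for yi in y]
r_z = sum(zx * zy for zx, zy in zip(z_x, z_y)) / (n - 1)

assert math.isclose(r_def, r_z, rel_tol=1e-9)
```

Note that the agreement relies on the standard scores being computed with the $n-1$ sample standard deviation; with an $n$ denominator, the prefactor would have to be $1/n$ instead.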
