
Commit 8dbe7ec

Merge pull request #272 from salbalkus/master
add median minimizes MAE
2 parents 9c8f7e0 + 9f2c343 commit 8dbe7ec

2 files changed

Lines changed: 87 additions & 2 deletions

File tree

I/ToC.md

Lines changed: 3 additions & 2 deletions
@@ -192,7 +192,8 @@ title: "Table of Contents"
192192
<p id="Measures of central tendency"></p>
193193
1.15. Measures of central tendency <br>
194194
&emsp;&ensp; 1.15.1. *[Median](/D/med)* <br>
195-
&emsp;&ensp; 1.15.2. *[Mode](/D/mode)* <br>
195+
&emsp;&ensp; 1.15.2. **[The median minimizes mean absolute error](/P/med-mae)** <br>
196+
&emsp;&ensp; 1.15.3. *[Mode](/D/mode)* <br>
196197

197198
<p id="Measures of statistical dispersion"></p>
198199
1.16. Measures of statistical dispersion <br>
@@ -998,4 +999,4 @@ title: "Table of Contents"
998999
3.5. Bayesian model averaging <br>
9991000
&emsp;&ensp; 3.5.1. *[Definition](/D/bma)* <br>
10001001
&emsp;&ensp; 3.5.2. **[Derivation](/P/bma-der)** <br>
1001-
&emsp;&ensp; 3.5.3. **[Calculation from log model evidences](/P/bma-lme)** <br>
1002+
&emsp;&ensp; 3.5.3. **[Calculation from log model evidences](/P/bma-lme)** <br>

P/med-mae.md

Lines changed: 84 additions & 0 deletions
@@ -0,0 +1,84 @@
1+
---
2+
layout: proof
3+
mathjax: true
4+
5+
author: "Salvador Balkus"
6+
affiliation: "Harvard T.H. Chan School of Public Health"
7+
e_mail: "sbalkus@g.harvard.edu"
8+
date: 2024-09-23 23:30:00
9+
10+
title: "The median minimizes the mean absolute error"
11+
chapter: "General Theorems"
12+
section: "Probability theory"
13+
topic: "Measures of central tendency"
14+
theorem: "The median minimizes the mean absolute error"
15+
16+
sources:
17+
- authors: "Wikipedia"
18+
year: 2024
19+
title: "Derivative test"
20+
in: "Wikipedia, the free encyclopedia"
21+
pages: "retrieved on 2024-09-23"
22+
url: "https://en.wikipedia.org/wiki/Derivative_test"
23+
- authors: "Wikipedia"
24+
year: 2024
25+
title: "Leibniz integral rule"
26+
in: "Wikipedia, the free encyclopedia"
27+
pages: "retrieved on 2024-09-23"
28+
url: "https://en.wikipedia.org/wiki/Leibniz_integral_rule"
29+
- authors: "Wikipedia"
30+
year: 2024
31+
title: "Jensen's Inequality"
32+
in: "Wikipedia, the free encyclopedia"
33+
pages: "retrieved on 2024-09-23"
34+
url: "https://en.wikipedia.org/wiki/Jensen%27s_inequality"
35+
- authors: "Wikipedia"
36+
year: 2024
37+
title: "Convex Function"
38+
in: "Wikipedia, the free encyclopedia"
39+
pages: "retrieved on 2024-09-23"
40+
url: "https://en.wikipedia.org/wiki/Convex_function"
41+
42+
proof_id: "P470"
43+
shortcut: "med-mae"
44+
username: "salbalkus"
45+
---
46+
47+
48+
**Theorem:** Let $X_1, \ldots, X_n$ be a collection of continuous [random variables](/D/rvar) drawn from the [probability density function](/D/pdf) $f(x)$ supported on $(-\infty, \infty)$ with common [median](/D/med) $m$. Then, $m$ minimizes the mean absolute error:
49+
50+
$$ \label{eq:med-mae}
51+
m = \operatorname*{arg\,min}_{a \in \mathbb{R}} \mathrm{E}\left[ \lvert X_i - a \rvert \right] \; .
52+
$$
53+
54+
55+
**Proof:** We can find the optimum by performing a derivative test. First, since the absolute value function is not differentiable at zero, we split the objective function into two integrals whose integrands are each differentiable in $a$:
56+
57+
$$ \label{eq:med-mae-split}
58+
\mathrm{E}\left[ \lvert X_i - a \rvert \right] = \int_{-\infty}^a (a - x) f(x)dx + \int_{a}^\infty (x - a) f(x)dx \; .
59+
$$
60+
61+
Now note that $\lvert\frac{\partial}{\partial a}(a - x)f(x)\rvert = \lvert\frac{\partial}{\partial a}(x - a)f(x)\rvert = f(x)$. Moreover, $\int_{-\infty}^a f(x)dx = P(X_i < a)$ and $\int_{a}^\infty f(x)dx = P(X_i > a)$, both of which are finite by the [axioms of probability](/D/prob-ax). Therefore, these integrals meet the conditions for [Leibniz's rule](https://en.wikipedia.org/wiki/Leibniz_integral_rule) to be applied.
62+
63+
Applying Leibniz's rule, we can differentiate the objective function as follows:
64+
65+
$$ \label{eq:med-mae-deriv}
66+
\frac{\partial}{\partial a} \Big(\int_{-\infty}^a (a - x) f(x)dx + \int_{a}^\infty (x - a) f(x)dx\Big) = (a - a)f(a) + \int_{-\infty}^a f(x)dx - (a - a)f(a) - \int_{a}^\infty f(x)dx
67+
$$
68+
69+
The boundary terms are evaluated at $x = a$ and therefore vanish. Setting the remaining derivative to 0, it must be true that
70+
71+
$$\label{eq:dmed-da}
72+
\int_{-\infty}^a f(x)dx - \int_{a}^\infty f(x)dx = 0 \implies P(X_i < a) = P(X_i > a)
73+
$$
74+
75+
Because $X_i$ is continuous, $P(X_i = a) = 0$ and hence $P(X_i > a) = 1 - P(X_i < a)$. This yields the implication
76+
77+
$$\label{eq:med-mae-qed}
78+
P(X_i < a) = P(X_i > a) \implies P(X_i < a) = 1 - P(X_i < a) \implies P(X_i < a) = 0.5
79+
$$
80+
81+
As a result, any critical point $a$ of the objective function satisfies the [definition of a median](/D/med), i.e. $a = m$.
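
As a numerical sanity check of the derivative computed above, the following Python sketch compares a finite-difference derivative of $\mathrm{E}\left[ \lvert X_i - a \rvert \right]$ with $P(X_i < a) - P(X_i > a) = 2P(X_i < a) - 1$. It assumes a standard normal density for $f(x)$, which is not part of the proof; the function names (`pdf`, `cdf`, `mae`, `mae_derivative`), the integration range and the step sizes are illustrative choices only.

```python
import math

def pdf(x):
    # Assumed density for illustration: standard normal.
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def cdf(x):
    # Standard normal CDF, P(X < x), via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def mae(a, lo=-10.0, hi=10.0, n=20000):
    # Midpoint-rule approximation of the integral of |x - a| f(x) over [lo, hi].
    # The tails beyond +/-10 are negligible for the standard normal.
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * h
        total += abs(x - a) * pdf(x)
    return total * h

def mae_derivative(a, eps=1e-3):
    # Central finite difference of the objective with respect to a.
    return (mae(a + eps) - mae(a - eps)) / (2.0 * eps)

for a in (-1.0, 0.0, 0.5, 2.0):
    # The two printed values should agree up to numerical error.
    print(a, mae_derivative(a), 2.0 * cdf(a) - 1.0)
```

Under these assumptions, the two columns should agree closely, and both should be close to zero at $a = 0$, the median of the standard normal distribution.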
82+
83+
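Finally, the absolute value is a [convex function](https://en.wikipedia.org/wiki/Convex_function), so $\lvert x - a \rvert$ is convex in $a$ for every fixed $x$. Taking expectations preserves the defining inequality of convexity (the two-point form of [Jensen's inequality](https://en.wikipedia.org/wiki/Jensen%27s_inequality)), so the objective function is convex in $a$ as well. Every critical point of a convex function is a global minimum, and the median is the sole critical point. Therefore, the median minimizes the mean absolute error, completing the proof.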
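
The statement can also be illustrated empirically. The sketch below is not part of the proof: it draws a sample from an assumed normal distribution with mean 1 and standard deviation 2, and checks that the sample median attains the smallest empirical mean absolute error among a grid of candidate values $a$. The distribution, seed, sample size, grid and the helper name `mean_abs_error` are all hypothetical choices made for illustration.

```python
import random
import statistics

def mean_abs_error(sample, a):
    # Empirical analogue of E[|X_i - a|].
    return sum(abs(x - a) for x in sample) / len(sample)

random.seed(0)
# Assumed distribution for illustration: normal with mean 1, standard deviation 2.
sample = [random.gauss(1.0, 2.0) for _ in range(501)]

# Sample median (unique middle value, since the sample size is odd).
m_hat = statistics.median(sample)

# Candidate values a on a grid from -5 to 7 with spacing 0.02.
grid = [-5.0 + 0.02 * k for k in range(601)]
best_on_grid = min(grid, key=lambda a: mean_abs_error(sample, a))

# The best grid point should lie within one grid step of the sample median.
print(m_hat, best_on_grid)
```

Because the empirical objective is convex and piecewise linear in $a$, the best grid point can differ from the sample median by at most one grid step.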
84+
