|
| 1 | +--- |
| 2 | +layout: proof |
| 3 | +mathjax: true |
| 4 | + |
| 5 | +author: "Salvador Balkus" |
| 6 | +affiliation: "Harvard T.H. Chan School of Public Health" |
| 7 | +e_mail: "sbalkus@g.harvard.edu" |
| 8 | +date: 2024-09-23 23:30:00 |
| 9 | + |
| 10 | +title: "The median minimizes the mean absolute error" |
| 11 | +chapter: "General Theorems" |
| 12 | +section: "Probability theory" |
| 13 | +topic: "Measures of central tendency" |
| 14 | +theorem: "The median minimizes the mean absolute error" |
| 15 | + |
| 16 | +sources: |
| 17 | + - authors: "Wikipedia" |
| 18 | + year: 2024 |
| 19 | + title: "Derivative test" |
| 20 | + in: "Wikipedia, the free encyclopedia" |
| 21 | + pages: "retrieved on 2024-09-23" |
| 22 | + url: "https://en.wikipedia.org/wiki/Derivative_test" |
| 23 | + - authors: "Wikipedia" |
| 24 | + year: 2024 |
| 25 | + title: "Leibniz integral rule" |
| 26 | + in: "Wikipedia, the free encyclopedia" |
| 27 | + pages: "retrieved on 2024-09-23" |
| 28 | + url: "https://en.wikipedia.org/wiki/Leibniz_integral_rule" |
| 29 | + - authors: "Wikipedia" |
| 30 | + year: 2024 |
| 31 | + title: "Jensen's Inequality" |
| 32 | + in: "Wikipedia, the free encyclopedia" |
| 33 | + pages: "retrieved on 2024-09-23" |
| 34 | + url: "https://en.wikipedia.org/wiki/Jensen%27s_inequality" |
| 35 | + - authors: "Wikipedia" |
| 36 | + year: 2024 |
| 37 | + title: "Convex Function" |
| 38 | + in: "Wikipedia, the free encyclopedia" |
| 39 | + pages: "retrieved on 2024-09-23" |
| 40 | + url: "https://en.wikipedia.org/wiki/Convex_function" |
| 41 | + |
| 42 | +proof_id: "P470" |
| 43 | +shortcut: "med-mae" |
| 44 | +username: "salbalkus" |
| 45 | +--- |
| 46 | + |
| 47 | + |
| 48 | +**Theorem:** Let $X_1, \ldots, X_n$ be continuous [random variables](/D/rvar), each with [probability density function](/D/pdf) $f(x)$ supported on $(-\infty, \infty)$, with common [median](/D/med) $m$ and finite $\mathrm{E}\left[ \lvert X_i \rvert \right]$. Then, $m$ minimizes the mean absolute error: |
| 49 | + |
| 50 | +$$ \label{eq:med-mae} |
| 51 | +m = \operatorname*{arg\,min}_{a \in \mathbb{R}} \mathrm{E}\left[ \lvert X_i - a \rvert \right] \; . |
| 52 | +$$ |
| 53 | + |
| 54 | + |
| 55 | +**Proof:** We can find the optimum by performing a derivative test. First, since the absolute value function is not differentiable at zero, rewrite the objective function by splitting the expectation into two separate integrals: |
| 56 | + |
| 57 | +$$ \label{eq:med-mae-split} |
| 58 | +\mathrm{E}\left[ \lvert X_i - a \rvert \right] = \int_{-\infty}^a (a - x) f(x) \, dx + \int_{a}^\infty (x - a) f(x) \, dx \; . |
| 59 | +$$ |
| 60 | + |
| 61 | +Now note that $\left\lvert \frac{\partial}{\partial a}(a - x)f(x) \right\rvert = \left\lvert \frac{\partial}{\partial a}(x - a)f(x) \right\rvert = f(x)$. Consequently, the integrals of the absolute partial derivatives are $\int_{-\infty}^a f(x) \, dx = P(X_i < a)$ and $\int_{a}^\infty f(x) \, dx = P(X_i > a)$, both of which must be finite by the [axioms of probability](/D/prob-ax). Therefore, these integrals meet the conditions for [Leibniz's rule](https://en.wikipedia.org/wiki/Leibniz_integral_rule) to be applied. |
| 62 | + |
| 63 | +Applying Leibniz's rule, we can differentiate the objective function as follows: |
| 64 | + |
| 65 | +$$ \label{eq:med-mae-deriv} |
| 66 | +\frac{\partial}{\partial a} \left( \int_{-\infty}^a (a - x) f(x) \, dx + \int_{a}^\infty (x - a) f(x) \, dx \right) = (a - a)f(a) + \int_{-\infty}^a f(x) \, dx - (a - a)f(a) - \int_{a}^\infty f(x) \, dx \; . |
| 67 | +$$ |
| 68 | + |
| 69 | +The two boundary terms are evaluated at $x = a$ and therefore vanish. Setting the remaining derivative to zero, any critical point $a$ must satisfy |
| 70 | + |
| 71 | +$$\label{eq:dmed-da} |
| 72 | +\int_{-\infty}^a f(x)dx - \int_{a}^\infty f(x)dx = 0 \implies P(X_i < a) = P(X_i > a) |
| 73 | +$$ |
| 74 | + |
| 75 | +Because $X_i$ is continuous, $P(X_i > a) = 1 - P(X_i < a)$, which yields the implication |
| 76 | + |
| 77 | +$$\label{eq:med-mae-qed} |
| 78 | +P(X_i < a) = P(X_i > a) \implies P(X_i < a) = 1 - P(X_i < a) \implies P(X_i < a) = 0.5 |
| 79 | +$$ |
| 80 | + |
| 81 | +As a result, any critical point $a$ of the objective function satisfies the [definition of a median](/D/med). |
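
The derivative used in the proof can be checked numerically. The sketch below is an illustration only and not part of the proof: it assumes the specific example $X \sim \mathcal{N}(0, 1)$, for which the split integrals evaluate to the closed form $\mathrm{E}\left[ \lvert X - a \rvert \right] = a \left( 2\Phi(a) - 1 \right) + 2\varphi(a)$, and compares a central finite difference of that expression with $P(X < a) - P(X > a)$. All function names are hypothetical helpers introduced for this example.

```python
import math


def normal_pdf(x):
    """Density of the standard normal distribution (assumed example)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Distribution function of the standard normal, via math.erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mean_abs_error(a):
    """Closed form of E|X - a| for X ~ N(0, 1)."""
    return a * (2.0 * normal_cdf(a) - 1.0) + 2.0 * normal_pdf(a)


def derivative_from_proof(a):
    """P(X < a) - P(X > a), the derivative obtained with Leibniz's rule."""
    return normal_cdf(a) - (1.0 - normal_cdf(a))


def central_difference(g, a, h=1e-5):
    """Symmetric finite-difference approximation of g'(a)."""
    return (g(a + h) - g(a - h)) / (2.0 * h)


for a in (-2.0, -0.5, 0.0, 0.7, 3.0):
    numeric = central_difference(mean_abs_error, a)
    exact = derivative_from_proof(a)
    print(f"a = {a:+.1f}  numeric = {numeric:+.6f}  proof = {exact:+.6f}")
```

In this example the two columns should agree up to discretization error, and the derivative changes sign at $a = 0$, the median of the standard normal distribution.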
| 82 | + |
| 83 | +Finally, for every fixed $x$, the function $a \mapsto \lvert x - a \rvert$ is a [convex function](https://en.wikipedia.org/wiki/Convex_function), and taking the expected value preserves convexity because expectation is linear and monotone (see also [Jensen's inequality](https://en.wikipedia.org/wiki/Jensen%27s_inequality)). Every critical point of a convex function is a global minimum, so the median must minimize the mean absolute error, completing the proof. |
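
The statement of the theorem can also be illustrated on simulated data. The following is a minimal sketch under stated assumptions, not a substitute for the proof: it draws a sample from a hypothetical skewed mixture of two normal distributions (chosen only so that mean and median differ), minimizes the empirical mean absolute error over a grid of candidate values $a$, and compares the minimizer with the sample median. The mixture weights, grid range, step size and seed are arbitrary choices made for this example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is reproducible


def draw():
    """One draw from an assumed mixture: 80% N(0, 1) and 20% N(4, 1)."""
    if random.random() < 0.8:
        return random.gauss(0.0, 1.0)
    return random.gauss(4.0, 1.0)


def empirical_mae(a, xs):
    """Sample analogue of E|X - a|."""
    return sum(abs(x - a) for x in xs) / len(xs)


# Odd sample size, so the sample median is a single observation.
sample = [draw() for _ in range(1001)]

# Grid of candidate values a from -3 to 6 in steps of 0.01.
step = 0.01
grid = [-3.0 + i * step for i in range(901)]

best_a = min(grid, key=lambda a: empirical_mae(a, sample))
sample_median = statistics.median(sample)
sample_mean = statistics.fmean(sample)

print(f"grid minimizer of the MAE: {best_a:.2f}")
print(f"sample median:             {sample_median:.4f}")
print(f"sample mean:               {sample_mean:.4f}")
print(f"MAE at median: {empirical_mae(sample_median, sample):.4f}")
print(f"MAE at mean:   {empirical_mae(sample_mean, sample):.4f}")
```

Because the empirical objective is convex in $a$, the grid minimizer lies within one grid step of the sample median, while the sample mean gives a mean absolute error that is no smaller.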
| 84 | + |