| 1 | +--- |
| 2 | +layout: proof |
| 3 | +mathjax: true |
| 4 | + |
| 5 | +author: "Salvador Balkus" |
| 6 | +affiliation: "Harvard T.H. Chan School of Public Health" |
| 7 | +e_mail: "sbalkus@g.harvard.edu" |
| 8 | +date: 2024-09-13 23:30:00 |
| 9 | + |
| 10 | +title: "The expected value minimizes the mean squared error" |
| 11 | +chapter: "General Theorems" |
| 12 | +section: "Probability theory" |
| 13 | +topic: "Expected value" |
| 14 | +theorem: "Expected value minimizes squared error" |
| 15 | + |
| 16 | +sources: |
| 17 | + - authors: "Wikipedia" |
| 18 | + year: 2024 |
| 19 | + title: "Derivative test" |
| 20 | + in: "Wikipedia, the free encyclopedia" |
| 21 | + pages: "retrieved on 2024-09-13" |
| 22 | + url: "https://en.wikipedia.org/wiki/Derivative_test" |
| 23 | + |
| 24 | +proof_id: "P469" |
| 25 | +shortcut: "mean-mse" |
| 26 | +username: "salbalkus" |
| 27 | +--- |
| 28 | + |
| 29 | + |
| 30 | +**Theorem:** Let $X_1, \ldots, X_n$ be a collection of [random variables](/D/rvar) with common [mean](/D/mean) $E(X_i) = \mu$. Then, $\mu$ minimizes the mean squared error: |
| 31 | + |
| 32 | +$$ \label{eq:mean-mse}
| 33 | +\mu = \operatorname*{arg\,min}_{a \in \mathbb{R}} E\left[ (X_i - a)^2 \right] \; .
| 34 | +$$
| 35 | +
| 36 | +**Proof:** Using the [linearity of expectation](/P/mean-lin), we can simplify the objective function:
| 37 | +
| 38 | +$$ \label{eq:mse}
| 39 | +E\left[ (X_i - a)^2 \right] = E\left[ X_i^2 - 2aX_i + a^2 \right] = a^2 - 2a\mu + E(X_i^2) \; .
| 40 | +$$
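
As an illustrative numerical check (not part of the formal proof), the expansion above can be verified on a simulated sample, assuming NumPy is available; the distribution, seed, and the value of $a$ are arbitrary choices:

```python
# Sanity check of E[(X - a)^2] = a^2 - 2*a*mu + E[X^2] on a simulated sample.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=100_000)  # any distribution works
a = 1.5

lhs = np.mean((x - a) ** 2)                       # empirical E[(X - a)^2]
rhs = a**2 - 2 * a * np.mean(x) + np.mean(x**2)   # a^2 - 2a*mu + E[X^2]

print(abs(lhs - rhs))  # zero up to floating-point error
```

Because the expansion is an algebraic identity applied term by term to the sample, the two sides agree exactly up to floating-point rounding.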
| 41 | + |
| 42 | +Setting the first derivative |
| 43 | + |
| 44 | +$$ \label{eq:dmse-da} |
| 45 | +\frac{d}{da} \left[ a^2 - 2a\mu + E(X_i^2) \right] = 2a - 2\mu
| 46 | +$$ |
| 47 | + |
| 48 | +equal to zero, as required by the first-derivative test, we obtain:
| 49 | + |
| 50 | +$$ \label{eq:mean-mse-qed} |
| 51 | +2a - 2\mu = 0 \quad \Leftrightarrow \quad a = \mu \; . |
| 52 | +$$ |
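
The differentiation step can also be checked symbolically; the sketch below assumes SymPy is available and uses the symbol `m2` as a stand-in for the constant $E(X_i^2)$:

```python
# Symbolic check of d/da [a^2 - 2a*mu + E(X^2)] = 2a - 2*mu and its root a = mu.
import sympy as sp

a, mu, m2 = sp.symbols("a mu m2")   # m2 stands in for E[X_i^2], a constant in a
objective = a**2 - 2 * a * mu + m2

deriv = sp.diff(objective, a)               # 2*a - 2*mu
critical = sp.solve(sp.Eq(deriv, 0), a)     # [mu]
print(deriv, critical)
```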
| 53 | + |
| 54 | +The second derivative is equal to $2$, which is greater than $0$, so by the second-derivative test, $a = \mu$ is a local minimum. Since the objective function is a quadratic in $a$ and $a = \mu$ is its sole critical point, this value is the unique global minimum. This completes the proof that the expected value minimizes the mean squared error.
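
As a final empirical illustration (again assuming NumPy; the distribution, seed, grid width, and grid resolution are arbitrary), minimizing the empirical mean squared error over a grid of candidate values recovers the sample mean:

```python
# Grid search showing that a -> mean((x - a)^2) is minimized at the sample mean.
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=50_000)

# Symmetric grid around the sample mean; the mean itself is the middle grid point.
candidates = np.linspace(x.mean() - 3, x.mean() + 3, 2001)
mse = np.array([np.mean((x - a) ** 2) for a in candidates])
a_star = candidates[np.argmin(mse)]

print(a_star, x.mean())  # the minimizer coincides with the sample mean
```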