|
| 1 | +--- |
| 2 | +layout: proof |
| 3 | +mathjax: true |
| 4 | + |
| 5 | +author: "Joram Soch" |
| 6 | +affiliation: "BCCN Berlin" |
| 7 | +e_mail: "joram.soch@bccn-berlin.de" |
| 8 | +date: 2022-12-23 16:36:00 |
| 9 | + |
| 10 | +title: "Distributions of estimated parameters, fitted signal and residuals in multiple linear regression upon ordinary least squares" |
| 11 | +chapter: "Statistical Models" |
| 12 | +section: "Univariate normal data" |
| 13 | +topic: "Multiple linear regression" |
| 14 | +theorem: "Distribution of OLS estimates, fitted signal and residuals" |
| 15 | + |
| 16 | +sources: |
| 17 | + - authors: "Koch, Karl-Rudolf" |
| 18 | + year: 2007 |
| 19 | + title: "Linear Model" |
| 20 | + in: "Introduction to Bayesian Statistics" |
| 21 | + pages: "Springer, Berlin/Heidelberg, 2007, ch. 4, eqs. 4.2, 4.30" |
| 22 | + url: "https://www.springer.com/de/book/9783540727231" |
| 23 | + doi: "10.1007/978-3-540-72726-2" |
| 24 | + - authors: "Penny, William" |
| 25 | + year: 2006 |
| 26 | + title: "Multiple Regression" |
| 27 | + in: "Mathematics for Brain Imaging" |
| 28 | + pages: "ch. 1.5, pp. 39-41, eqs. 1.106-1.110" |
| 29 | + url: "https://ueapsylabs.co.uk/sites/wpenny/mbi/mbi_course.pdf" |
| 30 | + |
| 31 | +proof_id: "P400" |
| 32 | +shortcut: "mlr-olsdist" |
| 33 | +username: "JoramSoch" |
| 34 | +--- |
| 35 | + |
| 36 | + |
| 37 | +**Theorem:** Assume a [linear regression model](/D/mlr) with independent observations |
| 38 | + |
| 39 | +$$ \label{eq:mlr} |
| 40 | +y = X\beta + \varepsilon, \; \varepsilon_i \overset{\mathrm{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2) |
| 41 | +$$ |
| 42 | + |
| 43 | +and consider estimation using [ordinary least squares](/P/mlr-ols). Then, the estimated parameters, fitted signal and residuals are distributed as |
| 44 | + |
| 45 | +$$ \label{eq:mlr-dist} |
| 46 | +\begin{split} |
| 47 | +\hat{\beta} &\sim \mathcal{N}\left( \beta, \sigma^2 (X^\mathrm{T} X)^{-1} \right) \\ |
| 48 | +\hat{y} &\sim \mathcal{N}\left( X \beta, \sigma^2 P \right) \\ |
| 49 | +\hat{\varepsilon} &\sim \mathcal{N}\left( 0, \sigma^2 (I_n - P) \right) |
| 50 | +\end{split} |
| 51 | +$$ |
| 52 | + |
| 53 | +where $P$ is the [projection matrix](/D/pmat) for [ordinary least squares](/P/mlr-ols) |
| 54 | + |
| 55 | +$$ \label{eq:mlr-pmat} |
| 56 | +P = X (X^\mathrm{T} X)^{-1} X^\mathrm{T} \; . |
| 57 | +$$ |
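
As a numerical sanity check (not part of the proof), the projection matrix from \eqref{eq:mlr-pmat} can be verified to be symmetric and idempotent, and the residual-forming matrix $I_n - P$ to annihilate the design matrix. The design matrix below is a hypothetical example; any full-rank $X$ would do:

```python
import numpy as np

# hypothetical design matrix (n = 5 observations, p = 2 regressors:
# an intercept plus one covariate); any full-rank X works here
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])

# projection matrix P = X (X^T X)^{-1} X^T
P = X @ np.linalg.inv(X.T @ X) @ X.T

assert np.allclose(P, P.T)                     # P is symmetric
assert np.allclose(P @ P, P)                   # P is idempotent
assert np.allclose(X.T @ (np.eye(5) - P), 0)   # (I_n - P) X = 0
```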
| 58 | + |
| 59 | + |
| 60 | +**Proof:** We will use the [linear transformation theorem for the multivariate normal distribution](/P/mvn-ltt): |
| 61 | + |
| 62 | +$$ \label{eq:mvn-ltt} |
| 63 | +x \sim \mathcal{N}(\mu, \Sigma) \quad \Rightarrow \quad y = Ax + b \sim \mathcal{N}(A\mu + b, A \Sigma A^\mathrm{T}) \; . |
| 64 | +$$ |

Note that \eqref{eq:mvn-ltt} holds for any fixed matrix $A$ and vector $b$; below, it is applied with $b = 0$ and $A$ equal to the matrix mapping $y$ to the respective estimate.
| 65 | + |
| 66 | +The distributional assumption in \eqref{eq:mlr} [is equivalent to](/D/mvn-ind): |
| 67 | + |
| 68 | +$$ \label{eq:mlr-vect} |
| 69 | +y = X\beta + \varepsilon, \; \varepsilon \sim \mathcal{N}(0, \sigma^2 I_n) \; . |
| 70 | +$$ |
| 71 | + |
| 72 | +Applying \eqref{eq:mvn-ltt} to \eqref{eq:mlr-vect}, the measured data are distributed as |
| 73 | + |
| 74 | +$$ \label{eq:y-dist} |
| 75 | +y \sim \mathcal{N}\left( X \beta, \sigma^2 I_n \right) \; . |
| 76 | +$$ |
| 77 | + |
| 78 | +1) The [parameter estimates from ordinary least squares](/P/mlr-ols) are given by
| 79 | + |
| 80 | +$$ \label{eq:b-est} |
| 81 | +\hat{\beta} = (X^\mathrm{T} X)^{-1} X^\mathrm{T} y |
| 82 | +$$ |
| 83 | + |
| 84 | +and thus, by applying \eqref{eq:mvn-ltt} to \eqref{eq:b-est}, they are distributed as |
| 85 | + |
| 86 | +$$ \label{eq:b-est-dist} |
| 87 | +\begin{split} |
| 88 | +\hat{\beta} &\sim \mathcal{N}\left( \left[ (X^\mathrm{T} X)^{-1} X^\mathrm{T} \right] X \beta, \, \sigma^2 \left[ (X^\mathrm{T} X)^{-1} X^\mathrm{T} \right] I_n \left[ X (X^\mathrm{T} X)^{-1} \right] \right) \\ |
| 89 | +&\sim \mathcal{N}\left( \beta, \, \sigma^2 (X^\mathrm{T} X)^{-1} \right) \; . |
| 90 | +\end{split} |
| 91 | +$$ |
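
Because $\hat{\beta} = A y$ with $A = (X^\mathrm{T} X)^{-1} X^\mathrm{T}$ is an exact linear transformation, the moments in \eqref{eq:b-est-dist} can be checked deterministically (a sketch with an assumed random full-rank design and an arbitrary noise variance):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, s2 = 20, 3, 2.5             # assumed sizes and noise variance
X = rng.standard_normal((n, p))   # hypothetical full-rank design matrix
beta = rng.standard_normal(p)

# beta-hat = A y with A = (X^T X)^{-1} X^T; by the linear transformation
# theorem, its mean is A X beta and its covariance is A (s2 I_n) A^T
A = np.linalg.inv(X.T @ X) @ X.T
mean = A @ X @ beta
cov = s2 * A @ A.T                # A I_n A^T = A A^T

assert np.allclose(mean, beta)                        # E[beta-hat] = beta
assert np.allclose(cov, s2 * np.linalg.inv(X.T @ X))  # stated covariance
```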
| 92 | + |
| 93 | +2) The [fitted signal in multiple linear regression](/P/mlr-mat) is given by |
| 94 | + |
| 95 | +$$ \label{eq:y-est} |
| 96 | +\hat{y} = X \hat{\beta} = X (X^\mathrm{T} X)^{-1} X^\mathrm{T} y = P y |
| 97 | +$$ |
| 98 | + |
| 99 | +and thus, by applying \eqref{eq:mvn-ltt} to \eqref{eq:y-est}, it is distributed as
| 100 | + |
| 101 | +$$ \label{eq:y-est-dist} |
| 102 | +\begin{split} |
| 103 | +\hat{y} &\sim \mathcal{N}\left( X \beta, \, \sigma^2 X (X^\mathrm{T} X)^{-1} X^\mathrm{T} \right) \\ |
| 104 | +&\sim \mathcal{N}\left( X \beta, \, \sigma^2 P \right) \; . |
| 105 | +\end{split} |
| 106 | +$$ |
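
The covariance step in \eqref{eq:y-est-dist} uses $P \, (\sigma^2 I_n) \, P^\mathrm{T} = \sigma^2 P$, which follows from symmetry and idempotency of $P$; a minimal numerical check under the same assumptions as above:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, s2 = 15, 2, 0.5             # assumed sizes and noise variance
X = rng.standard_normal((n, p))   # hypothetical full-rank design matrix
P = X @ np.linalg.inv(X.T @ X) @ X.T

# y-hat = P y, so Cov(y-hat) = P (s2 I_n) P^T; since P is symmetric
# and idempotent, this collapses to s2 * P
assert np.allclose(s2 * P @ P.T, s2 * P)
```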
| 107 | + |
| 108 | +3) The [residuals of the linear regression model](/P/mlr-mat) are given by |
| 109 | + |
| 110 | +$$ \label{eq:e-est} |
| 111 | +\hat{\varepsilon} = y - X \hat{\beta} = \left( I_n - X (X^\mathrm{T} X)^{-1} X^\mathrm{T} \right) y = \left( I_n - P \right) y |
| 112 | +$$ |
| 113 | + |
| 114 | +and thus, by applying \eqref{eq:mvn-ltt} to \eqref{eq:e-est}, they are distributed as |
| 115 | + |
| 116 | +$$ \label{eq:e-est-dist-s1} |
| 117 | +\begin{split} |
| 118 | +\hat{\varepsilon} &\sim \mathcal{N}\left( \left[ I_n - X (X^\mathrm{T} X)^{-1} X^\mathrm{T} \right] X \beta, \, \sigma^2 \left[ I_n - P \right] I_n \left[ I_n - P \right]^\mathrm{T} \right) \\ |
| 119 | +&\sim \mathcal{N}\left( X \beta - X \beta, \, \sigma^2 \left[ I_n - P \right] \left[ I_n - P \right]^\mathrm{T} \right) \; . |
| 120 | +\end{split} |
| 121 | +$$ |
| 122 | + |
| 123 | +Because the [residual-forming matrix](/D/rfm) is [symmetric](/P/mlr-symm) and [idempotent](/P/mlr-idem), this becomes: |
| 124 | + |
| 125 | +$$ \label{eq:e-est-dist-s2} |
| 126 | +\hat{\varepsilon} \sim \mathcal{N}\left( 0, \sigma^2 (I_n - P) \right) \; . |
| 127 | +$$ |
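
Spelled out, the last simplification uses $(I_n - P)^\mathrm{T} = I_n - P$ (symmetry) and $(I_n - P)(I_n - P) = I_n - P$ (idempotency):

$$
\left( I_n - P \right) \left( I_n - P \right)^\mathrm{T} = \left( I_n - P \right) \left( I_n - P \right) = I_n - P \; ,
$$

while the zero mean follows from $P X = X (X^\mathrm{T} X)^{-1} X^\mathrm{T} X = X$.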