Commit 6df9772

Merge pull request #192 from JoramSoch/master
added 2 definitions and 6 proofs
2 parents 6add230 + ecd3db7 commit 6df9772

9 files changed

Lines changed: 880 additions & 8 deletions

D/fcon.md

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
1+
---
2+
layout: definition
3+
mathjax: true
4+
5+
author: "Joram Soch"
6+
affiliation: "BCCN Berlin"
7+
e_mail: "joram.soch@bccn-berlin.de"
8+
date: 2022-12-16 12:42:00
9+
10+
title: "F-contrast for contrast-based inference in multiple linear regression"
11+
chapter: "Statistical Models"
12+
section: "Univariate normal data"
13+
topic: "Multiple linear regression"
14+
definition: "F-contrast"
15+
16+
sources:
17+
- authors: "Stephan, Klaas Enno"
18+
year: 2010
19+
title: "Classical (frequentist) inference"
20+
in: "Methods and models for fMRI data analysis in neuroeconomics"
21+
pages: "Lecture 4, Slides 23/25"
22+
url: "http://www.socialbehavior.uzh.ch/teaching/methodsspring10.html"
23+
24+
def_id: "D186"
25+
shortcut: "fcon"
26+
username: "JoramSoch"
27+
---
28+
29+
30+
**Definition:** Consider a [linear regression model](/D/mlr) with $n \times p$ design matrix $X$ and $p \times 1$ regression coefficients $\beta$:
31+
32+
$$ \label{eq:mlr}
33+
y = X\beta + \varepsilon, \; \varepsilon \sim \mathcal{N}(0, \sigma^2 V) \; .
34+
$$
35+
36+
Then, an F-contrast is specified by a $p \times q$ matrix $C$, yielding a $q \times 1$ vector $\gamma = C^\mathrm{T} \beta$, and it entails the [null hypothesis](/D/h0) that each value in this vector is zero:
37+
38+
$$ \label{eq:mlr-f-h0}
39+
H_0: \; \gamma_1 = 0 \wedge \ldots \wedge \gamma_q = 0 \; .
40+
$$
41+
42+
Consequently, the [alternative hypothesis](/D/h1) of the [statistical test](/D/test) is that at least one entry of this vector is non-zero:
43+
44+
$$ \label{eq:mlr-f-h1}
45+
H_1: \; \gamma_1 \neq 0 \vee \ldots \vee \gamma_q \neq 0 \; .
46+
$$
47+
48+
Here, $C$ is called the "contrast matrix" and $C^\mathrm{T} \beta$ are called the "contrast values". With estimated regression coefficients, $C^\mathrm{T} \hat{\beta}$ are called the "estimated contrast values".
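
As a minimal sketch of the definition above, the following computes estimated contrast values $C^\mathrm{T} \hat{\beta}$ for a small example. The function name `contrast_values`, the contrast matrix and the coefficient values are hypothetical and chosen only for illustration; they do not come from the source.

```python
def contrast_values(C, beta):
    """Return gamma = C^T beta for a p x q contrast matrix C (given as a list of rows)."""
    p, q = len(C), len(C[0])
    if len(beta) != p:
        raise ValueError("C must have as many rows as beta has entries")
    # Entry j of gamma is the inner product of column j of C with beta.
    return [sum(C[i][j] * beta[i] for i in range(p)) for j in range(q)]


# Hypothetical example: p = 3 regression coefficients, q = 2 contrasts.
# This C selects the first two coefficients, so H0 states that both are zero.
C = [[1, 0],
     [0, 1],
     [0, 0]]
beta_hat = [0.5, -1.25, 3.0]  # assumed estimates, for illustration only

gamma_hat = contrast_values(C, beta_hat)
print(gamma_hat)
```

With this choice of $C$, the null hypothesis concerns only the first two coefficients; the third one is left unconstrained.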

D/tcon.md

Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
1+
---
2+
layout: definition
3+
mathjax: true
4+
5+
author: "Joram Soch"
6+
affiliation: "BCCN Berlin"
7+
e_mail: "joram.soch@bccn-berlin.de"
8+
date: 2022-12-16 12:35:00
9+
10+
title: "t-contrast for contrast-based inference in multiple linear regression"
11+
chapter: "Statistical Models"
12+
section: "Univariate normal data"
13+
topic: "Multiple linear regression"
14+
definition: "t-contrast"
15+
16+
sources:
17+
- authors: "Stephan, Klaas Enno"
18+
year: 2010
19+
title: "Classical (frequentist) inference"
20+
in: "Methods and models for fMRI data analysis in neuroeconomics"
21+
pages: "Lecture 4, Slides 7/9"
22+
url: "http://www.socialbehavior.uzh.ch/teaching/methodsspring10.html"
23+
24+
def_id: "D185"
25+
shortcut: "tcon"
26+
username: "JoramSoch"
27+
---
28+
29+
30+
**Definition:** Consider a [linear regression model](/D/mlr) with $n \times p$ design matrix $X$ and $p \times 1$ regression coefficients $\beta$:
31+
32+
$$ \label{eq:mlr}
33+
y = X\beta + \varepsilon, \; \varepsilon \sim \mathcal{N}(0, \sigma^2 V) \; .
34+
$$
35+
36+
Then, a t-contrast is specified by a $p \times 1$ vector $c$ and it entails the [null hypothesis](/D/h0) that the product of this vector and the regression coefficients is zero:
37+
38+
$$ \label{eq:mlr-t-h0}
39+
H_0: \; c^\mathrm{T} \beta = 0 \; .
40+
$$
41+
42+
Consequently, the [alternative hypothesis](/D/h1) of a [two-tailed t-test](/D/hyp-tail) is
43+
44+
$$ \label{eq:mlr-t-h1}
45+
H_1: \; c^\mathrm{T} \beta \neq 0
46+
$$
47+
48+
and the [alternative hypothesis](/D/h1) of a [one-tailed t-test](/D/hyp-tail) is either
49+
50+
$$ \label{eq:mlr-t-h1lr}
51+
H_1: \; c^\mathrm{T} \beta < 0 \quad \text{or} \quad H_1: \; c^\mathrm{T} \beta > 0 \; .
52+
$$
53+
54+
Here, $c$ is called the "contrast vector" and $c^\mathrm{T} \beta$ is called the "contrast value". With estimated regression coefficients, $c^\mathrm{T} \hat{\beta}$ is called the "estimated contrast value".
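
A minimal sketch of the definition above: the estimated contrast value $c^\mathrm{T} \hat{\beta}$ is an inner product. The function name `contrast_value`, the contrast vector and the coefficient values are hypothetical and serve only as an illustration.

```python
def contrast_value(c, beta):
    """Return the scalar c^T beta for a p x 1 contrast vector c."""
    if len(c) != len(beta):
        raise ValueError("c and beta must have the same length")
    return sum(ci * bi for ci, bi in zip(c, beta))


# Hypothetical example: compare the first and second regression coefficient.
# H0: beta_1 - beta_2 = 0.
c = [1, -1, 0]
beta_hat = [2.0, 0.5, 4.0]  # assumed estimates, for illustration only

value = contrast_value(c, beta_hat)
print(value)
```

A contrast vector such as $c = (1, -1, 0)^\mathrm{T}$ tests a difference between two coefficients, whereas $c = (1, 0, 0)^\mathrm{T}$ would test a single coefficient against zero.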

I/ToC.md

Lines changed: 16 additions & 8 deletions
@@ -480,6 +480,7 @@ title: "Table of Contents"
480480
&emsp;&ensp; 4.1.9. **[Marginal distributions](/P/mvn-marg)** <br>
481481
&emsp;&ensp; 4.1.10. **[Conditional distributions](/P/mvn-cond)** <br>
482482
&emsp;&ensp; 4.1.11. **[Conditions for independence](/P/mvn-ind)** <br>
483+
&emsp;&ensp; 4.1.12. **[Independence of products](/P/mvn-indprod)** <br>
483484

484485
4.2. Multivariate t-distribution <br>
485486
&emsp;&ensp; 4.2.1. *[Definition](/D/mvt)* <br>
@@ -620,14 +621,21 @@ title: "Table of Contents"
620621
&emsp;&ensp; 1.5.11. *[Residual-forming matrix](/D/rfmat)* <br>
621622
&emsp;&ensp; 1.5.12. **[Estimation, projection and residual-forming matrix](/P/mlr-mat)** <br>
622623
&emsp;&ensp; 1.5.13. **[Idempotence of projection and residual-forming matrix](/P/mlr-idem)** <br>
623-
&emsp;&ensp; 1.5.14. **[Weighted least squares](/P/mlr-wls)** (1) <br>
624-
&emsp;&ensp; 1.5.15. **[Weighted least squares](/P/mlr-wls2)** (2) <br>
625-
&emsp;&ensp; 1.5.16. **[Maximum likelihood estimation](/P/mlr-mle)** <br>
626-
&emsp;&ensp; 1.5.17. **[Maximum log-likelihood](/P/mlr-mll)** <br>
627-
&emsp;&ensp; 1.5.18. **[Deviance function](/P/mlr-dev)** <br>
628-
&emsp;&ensp; 1.5.19. **[Akaike information criterion](/P/mlr-aic)** <br>
629-
&emsp;&ensp; 1.5.20. **[Bayesian information criterion](/P/mlr-bic)** <br>
630-
&emsp;&ensp; 1.5.21. **[Corrected Akaike information criterion](/P/mlr-aicc)** <br>
624+
&emsp;&ensp; 1.5.14. **[Independence of estimated parameters and residuals](/P/mlr-ind)** <br>
625+
&emsp;&ensp; 1.5.15. **[Distribution of estimated parameters, signal and residuals](/P/mlr-wlsdist)** <br>
626+
&emsp;&ensp; 1.5.16. **[Distribution of residual sum of squares](/P/mlr-rssdist)** <br>
627+
&emsp;&ensp; 1.5.17. **[Weighted least squares](/P/mlr-wls)** (1) <br>
628+
&emsp;&ensp; 1.5.18. **[Weighted least squares](/P/mlr-wls2)** (2) <br>
629+
&emsp;&ensp; 1.5.19. *[t-contrast](/D/tcon)* <br>
630+
&emsp;&ensp; 1.5.20. *[F-contrast](/D/fcon)* <br>
631+
&emsp;&ensp; 1.5.21. **[Contrast-based t-test](/P/mlr-t)** <br>
632+
&emsp;&ensp; 1.5.22. **[Contrast-based F-test](/P/mlr-f)** <br>
633+
&emsp;&ensp; 1.5.23. **[Maximum likelihood estimation](/P/mlr-mle)** <br>
634+
&emsp;&ensp; 1.5.24. **[Maximum log-likelihood](/P/mlr-mll)** <br>
635+
&emsp;&ensp; 1.5.25. **[Deviance function](/P/mlr-dev)** <br>
636+
&emsp;&ensp; 1.5.26. **[Akaike information criterion](/P/mlr-aic)** <br>
637+
&emsp;&ensp; 1.5.27. **[Bayesian information criterion](/P/mlr-bic)** <br>
638+
&emsp;&ensp; 1.5.28. **[Corrected Akaike information criterion](/P/mlr-aicc)** <br>
631639

632640
1.6. Bayesian linear regression <br>
633641
&emsp;&ensp; 1.6.1. **[Conjugate prior distribution](/P/blr-prior)** <br>

P/mlr-f.md

Lines changed: 178 additions & 0 deletions
@@ -0,0 +1,178 @@
1+
---
2+
layout: proof
3+
mathjax: true
4+
5+
author: "Joram Soch"
6+
affiliation: "BCCN Berlin"
7+
e_mail: "joram.soch@bccn-berlin.de"
8+
date: 2022-12-13 12:36:00
9+
10+
title: "F-test for multiple linear regression using contrast-based inference"
11+
chapter: "Statistical Models"
12+
section: "Univariate normal data"
13+
topic: "Multiple linear regression"
14+
theorem: "Contrast-based F-test"
15+
16+
sources:
17+
- authors: "Stephan, Klaas Enno"
18+
year: 2010
19+
title: "Classical (frequentist) inference"
20+
in: "Methods and models for fMRI data analysis in neuroeconomics"
21+
pages: "Lecture 4, Slides 23/25"
22+
url: "http://www.socialbehavior.uzh.ch/teaching/methodsspring10.html"
23+
- authors: "Koch, Karl-Rudolf"
24+
year: 2007
25+
title: "Multivariate Distributions"
26+
in: "Introduction to Bayesian Statistics"
27+
pages: "Springer, Berlin/Heidelberg, 2007, ch. 2.5, eqs. 2.202, 2.213, 2.211"
28+
url: "https://www.springer.com/de/book/9783540727231"
29+
doi: "10.1007/978-3-540-72726-2"
30+
- authors: "jld"
31+
year: 2018
32+
title: "Understanding t-test for linear regression"
33+
in: "StackExchange CrossValidated"
34+
pages: "retrieved on 2022-12-13"
35+
url: "https://stats.stackexchange.com/a/344008"
36+
- authors: "Penny, William"
37+
year: 2006
38+
title: "Comparing nested GLMs"
39+
in: "Mathematics for Brain Imaging"
40+
pages: "ch. 2.3, pp. 51-52, eq. 2.9"
41+
url: "https://ueapsylabs.co.uk/sites/wpenny/mbi/mbi_course.pdf"
42+
43+
proof_id: "P392"
44+
shortcut: "mlr-f"
45+
username: "JoramSoch"
46+
---
47+
48+
49+
**Theorem:** Consider a [linear regression model](/D/mlr)
50+
51+
$$ \label{eq:mlr}
52+
y = X\beta + \varepsilon, \; \varepsilon \sim \mathcal{N}(0, \sigma^2 V)
53+
$$
54+
55+
and an [F-contrast](/D/fcon) on the model parameters
56+
57+
$$ \label{eq:fcon}
58+
\gamma = C^\mathrm{T} \beta \quad \text{where} \quad C \in \mathbb{R}^{p \times q} \; .
59+
$$
60+
61+
Then, the [test statistic](/D/tstat)
62+
63+
$$ \label{eq:mlr-f}
64+
F = \hat{\beta}^\mathrm{T} C \left( \hat{\sigma}^2 C^\mathrm{T} (X^\mathrm{T} V^{-1} X)^{-1} C \right)^{-1} C^\mathrm{T} \hat{\beta} / q
65+
$$
66+
67+
with the [parameter estimates](/P/mlr-mle)
68+
69+
$$ \label{eq:mlr-est}
70+
\begin{split}
71+
\hat{\beta} &= (X^\mathrm{T} V^{-1} X)^{-1} X^\mathrm{T} V^{-1} y \\
72+
\hat{\sigma}^2 &= \frac{1}{n-p} (y-X\hat{\beta})^\mathrm{T} V^{-1} (y-X\hat{\beta})
73+
\end{split}
74+
$$
75+
76+
follows an [F-distribution](/D/f)
77+
78+
$$ \label{eq:mlr-f-dist}
79+
F \sim \mathrm{F}(q, n-p)
80+
$$
81+
82+
under the [null hypothesis](/D/h0) $H_0$, tested against the [alternative hypothesis](/D/h1) $H_1$:
83+
84+
$$ \label{eq:mlr-f-h0}
85+
\begin{split}
86+
H_0: &\; \gamma_1 = 0 \wedge \ldots \wedge \gamma_q = 0 \\
87+
H_1: &\; \gamma_1 \neq 0 \vee \ldots \vee \gamma_q \neq 0 \; .
88+
\end{split}
89+
$$
90+
91+
92+
**Proof:**
93+
94+
1) We know that [the estimated regression coefficients in linear regression follow a multivariate normal distribution](/P/mlr-wlsdist):
95+
96+
$$ \label{eq:b-est-dist}
97+
\hat{\beta} \sim \mathcal{N}\left( \beta, \, \sigma^2 (X^\mathrm{T} V^{-1} X)^{-1} \right) \; .
98+
$$
99+
100+
Thus, the [estimated contrast vector](/D/fcon) $\hat{\gamma} = C^\mathrm{T} \hat{\beta}$ is also [distributed according to a multivariate normal distribution](/P/mvn-ltt):
101+
102+
$$ \label{eq:g-est-dist-cond}
103+
\hat{\gamma} \sim \mathcal{N}\left( C^\mathrm{T} \beta, \, \sigma^2 C^\mathrm{T} (X^\mathrm{T} V^{-1} X)^{-1} C \right) \; .
104+
$$
105+
106+
Substituting the noise variance $\sigma^2$ with the noise precision $\tau = 1/\sigma^2$, we can also write this down as a [conditional distribution](/D/dist-cond):
107+
108+
$$ \label{eq:g-est-tau-dist-cond}
109+
\hat{\gamma} \vert \tau \sim \mathcal{N}\left( C^\mathrm{T} \beta, (\tau Q)^{-1} \right) \quad \text{with} \quad Q = \left( C^\mathrm{T} (X^\mathrm{T} V^{-1} X)^{-1} C \right)^{-1} \; .
110+
$$
111+
112+
2) We also know that the [residual sum of squares](/D/rss), divided by the [true error variance](/D/mlr)
113+
114+
$$ \label{eq:mlr-rss}
115+
\frac{1}{\sigma^2} \sum_{i=1}^{n} \hat{\varepsilon}_i^2 = \frac{\hat{\varepsilon}^\mathrm{T} \hat{\varepsilon}}{\sigma^2} = \frac{1}{\sigma^2} (y-X\hat{\beta})^\mathrm{T} V^{-1} (y-X\hat{\beta})
116+
$$
117+
118+
[follows a chi-squared distribution](/P/mlr-rssdist):
119+
120+
$$ \label{eq:mlr-rss-dist}
121+
\frac{\hat{\varepsilon}^\mathrm{T} \hat{\varepsilon}}{\sigma^2} = \tau \, \hat{\varepsilon}^\mathrm{T} \hat{\varepsilon} \sim \chi^2(n-p) \; .
122+
$$
123+
124+
The [chi-squared distribution is related to the gamma distribution](/P/gam-chi2) in the following way:
125+
126+
$$ \label{eq:gam-chi2}
127+
X \sim \chi^2(k) \quad \Rightarrow \quad cX \sim \mathrm{Gam}\left( \frac{k}{2}, \frac{1}{2c} \right) \; .
128+
$$
129+
130+
Thus, applying \eqref{eq:gam-chi2} to \eqref{eq:mlr-rss-dist}, we obtain the [marginal distribution](/D/dist-marg) of $\tau$ as:
131+
132+
$$ \label{eq:tau-dist}
133+
\frac{1}{\hat{\varepsilon}^\mathrm{T} \hat{\varepsilon}} \left( \tau \, \hat{\varepsilon}^\mathrm{T} \hat{\varepsilon} \right) = \tau \sim \mathrm{Gam}\left( \frac{n-p}{2}, \frac{\hat{\varepsilon}^\mathrm{T} \hat{\varepsilon}}{2} \right) \; .
134+
$$
135+
136+
3) From \eqref{eq:g-est-tau-dist-cond} and \eqref{eq:tau-dist}, the [joint distribution](/D/dist-joint) of $\hat{\gamma}$ and $\tau$ is, [by definition, a normal-gamma distribution](/D/ng):
137+
138+
$$ \label{eq:g-est-tau-dist-joint}
139+
\hat{\gamma}, \tau \sim \mathrm{NG}\left( C^\mathrm{T} \beta, Q, \frac{n-p}{2}, \frac{\hat{\varepsilon}^\mathrm{T} \hat{\varepsilon}}{2} \right) \; .
140+
$$
141+
142+
The [marginal distribution of a normal-gamma distribution with respect to the normal random variable is a multivariate t-distribution](/P/ng-marg):
143+
144+
$$ \label{eq:ng-mvt}
145+
X, Y \sim \mathrm{NG}(\mu, \Lambda, a, b) \quad \Rightarrow \quad X \sim \mathrm{t}\left( \mu, \left( \frac{a}{b} \Lambda\right)^{-1}, 2a \right) \; .
146+
$$
147+
148+
Thus, the [marginal distribution](/D/dist-marg) of $\hat{\gamma}$ is:
149+
150+
$$ \label{eq:g-est-dist-marg}
151+
\hat{\gamma} \sim \mathrm{t}\left( C^\mathrm{T} \beta, \left( \frac{n-p}{\hat{\varepsilon}^\mathrm{T} \hat{\varepsilon}} Q \right)^{-1}, n-p \right) \; .
152+
$$
153+
154+
4) Because of the following [relationship between the multivariate t-distribution and the F-distribution](/P/mvt-f)
155+
156+
$$ \label{eq:mvt-f}
157+
X \sim \mathrm{t}(\mu, \Sigma, \nu) \quad \Rightarrow \quad (X-\mu)^\mathrm{T} \, \Sigma^{-1} (X-\mu)/n \sim \mathrm{F}(n, \nu) \; ,
158+
$$
159+
160+
the following quantity [is, by definition, F-distributed](/D/f)
161+
162+
$$ \label{eq:mlr-f-s1}
163+
F = \left( \hat{\gamma} - C^\mathrm{T} \beta \right)^\mathrm{T} \left( \frac{n-p}{\hat{\varepsilon}^\mathrm{T} \hat{\varepsilon}} Q \right) \left( \hat{\gamma} - C^\mathrm{T} \beta \right) / q
164+
$$
165+
166+
and under the [null hypothesis](/D/h0) \eqref{eq:mlr-f-h0}, it can be evaluated as:
167+
168+
$$ \label{eq:mlr-f-s2}
169+
\begin{split}
170+
F &\overset{\eqref{eq:mlr-f-s1}}{=} \left( \hat{\gamma} - C^\mathrm{T} \beta \right)^\mathrm{T} \left( \frac{n-p}{\hat{\varepsilon}^\mathrm{T} \hat{\varepsilon}} Q \right) \left( \hat{\gamma} - C^\mathrm{T} \beta \right) / q \\
171+
&\overset{\eqref{eq:mlr-f-h0}}{=} \hat{\gamma}^\mathrm{T} \left( \frac{n-p}{\hat{\varepsilon}^\mathrm{T} \hat{\varepsilon}} Q \right) \hat{\gamma} / q \\
172+
&\overset{\eqref{eq:fcon}}{=} \hat{\beta}^\mathrm{T} C \left( \frac{n-p}{\hat{\varepsilon}^\mathrm{T} \hat{\varepsilon}} Q \right) C^\mathrm{T} \hat{\beta} / q \\
173+
&\overset{\eqref{eq:g-est-tau-dist-cond}}{=} \hat{\beta}^\mathrm{T} C \left( \frac{n-p}{\hat{\varepsilon}^\mathrm{T} \hat{\varepsilon}} \left( C^\mathrm{T} (X^\mathrm{T} V^{-1} X)^{-1} C \right)^{-1} \right) C^\mathrm{T} \hat{\beta} / q \\
174+
&\overset{\eqref{eq:mlr-rss}}{=} \hat{\beta}^\mathrm{T} C \left( \frac{n-p}{(y-X\hat{\beta})^\mathrm{T} V^{-1} (y-X\hat{\beta})} \left( C^\mathrm{T} (X^\mathrm{T} V^{-1} X)^{-1} C \right)^{-1} \right) C^\mathrm{T} \hat{\beta} / q \\
175+
&\overset{\eqref{eq:mlr-est}}{=} \hat{\beta}^\mathrm{T} C \left( \frac{1}{\hat{\sigma}^2} \left( C^\mathrm{T} (X^\mathrm{T} V^{-1} X)^{-1} C \right)^{-1} \right) C^\mathrm{T} \hat{\beta} / q \\
176+
&= \hat{\beta}^\mathrm{T} C \left( \hat{\sigma}^2 C^\mathrm{T} (X^\mathrm{T} V^{-1} X)^{-1} C \right)^{-1} C^\mathrm{T} \hat{\beta} / q \; .
177+
\end{split}
178+
$$
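
The test statistic from the theorem can be sketched numerically as follows. This is only an illustration under stated assumptions: the helper functions `transpose`, `matmul`, `inverse` and `f_statistic` are hypothetical names written in plain Python for self-containment (a linear algebra library would normally be used instead), and the data, the identity covariance structure $V = I$ and the contrast matrix are made up.

```python
def transpose(A):
    return [list(row) for row in zip(*A)]


def matmul(A, B):
    # Plain matrix product of two lists of rows.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def inverse(A):
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix [A | I].
    n = len(A)
    M = [[float(x) for x in row] + [float(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [x / pv for x in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [x - f * z for x, z in zip(M[r], M[col])]
    return [row[n:] for row in M]


def f_statistic(y, X, V, C):
    """Return (F, q, n-p) for the contrast matrix C, following the theorem's formulas."""
    n, p, q = len(X), len(X[0]), len(C[0])
    V_inv = inverse(V)
    XtVi = matmul(transpose(X), V_inv)           # X^T V^-1
    cov_unscaled = inverse(matmul(XtVi, X))      # (X^T V^-1 X)^-1
    y_col = [[v] for v in y]
    beta = matmul(cov_unscaled, matmul(XtVi, y_col))
    resid = [[yi[0] - fi[0]] for yi, fi in zip(y_col, matmul(X, beta))]
    rss = matmul(transpose(resid), matmul(V_inv, resid))[0][0]
    sigma2 = rss / (n - p)                       # estimate of sigma^2
    gamma = matmul(transpose(C), beta)           # estimated contrast values
    CtSC = matmul(transpose(C), matmul(cov_unscaled, C))
    middle = inverse([[sigma2 * v for v in row] for row in CtSC])
    F = matmul(transpose(gamma), matmul(middle, gamma))[0][0] / q
    return F, q, n - p


# Made-up data: intercept plus one regressor, V = I, contrast on the slope only.
X = [[1, 0], [1, 1], [1, 2], [1, 3]]
y = [1, 3, 2, 5]
V = [[float(i == j) for j in range(4)] for i in range(4)]
C = [[0], [1]]

F, df1, df2 = f_statistic(y, X, V, C)
print(F, df1, df2)
```

For $q = 1$, as in this example, the statistic equals the square of the usual t-statistic for the slope. Obtaining a p-value would additionally require the survival function of the $\mathrm{F}(q, n-p)$ distribution, for example `scipy.stats.f.sf` if SciPy is available.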
