Model specification, parameter estimation, inference, and diagnostics
Ruff Figural Fluency Test (RFFT) is a cognitive assessment.
casenr | age | rfft |
---|---|---|
126 | 37 | 136 |
33 | 36 | 80 |
145 | 37 | 102 |
How much does cognitive ability as measured by RFFT decline with age on average?
Previously you found the best-fitting line:
\[ \text{RFFT} = 134.098 - 1.191 \times \text{age} \]
With each year of age, RFFT decreases by 1.191 points on average.
\[ \begin{aligned} \text{slope}: \quad-1.191 &= \text{cor}(\text{age}, \text{RFFT})\times\frac{SD(\text{RFFT})}{SD(\text{age})} \\ \text{intercept}: \quad134.098 &= \text{mean}(\text{RFFT}) - (-1.191)\times\text{mean}(\text{age}) \end{aligned} \]
Recall how you found this line:
Bias and error are measured via residuals: \[ \textcolor{red}{e_i} = y_i - \textcolor{blue}{\hat{y}_i} \]
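We said that the best-fitting line achieved two conditions: it is unbiased (the residuals average to zero) and it minimizes error (the sum of squared residuals, SSE, is as small as possible).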
The simple linear regression model is:
\[ Y = \textcolor{blue}{\underbrace{\beta_0 + \beta_1 x}_\text{mean}} + \textcolor{red}{\underbrace{\epsilon}_\text{error}} \]
The values that minimize error subject to the model being unbiased are:
\[\begin{aligned} \hat{\beta}_0 &= \bar{y} - \hat{\beta}_1 \bar{x} &\quad(\text{unbiased}) \\ \hat{\beta}_1 &= \frac{s_y}{s_x}\times r &\quad(\text{minimizes SSE}) \end{aligned}\]
These are called the least squares estimates.
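The least squares formulas above can be checked numerically. The sketch below is only an illustration: the prevend data are not reproduced here, so it uses R's built-in cars data (stopping distance against speed) as a stand-in, and the variable names are arbitrary.

```r
# Stand-in data: R's built-in `cars` data set, NOT the prevend data.
x <- cars$speed   # explanatory variable
y <- cars$dist    # response variable

# Least squares estimates from summary statistics
b1 <- cor(x, y) * sd(y) / sd(x)   # slope: r * s_y / s_x
b0 <- mean(y) - b1 * mean(x)      # intercept: ybar - b1 * xbar

# The same estimates computed by lm()
fit <- lm(dist ~ speed, data = cars)

# Compare the hand computation with lm()
print(c(b0 = b0, b1 = b1))
print(coef(fit))

# Unbiasedness: the residuals average to zero (up to rounding error)
print(mean(resid(fit)))
```

The two sets of coefficients should agree, and the mean residual should be zero up to floating-point error.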
According to the model, a one-unit increment in \(x\) corresponds to a \(\beta_1\)-unit change in mean \(Y\):
With each additional year of age, mean RFFT score decreases by an estimated 1.191 points.
The model is fit in R with lm(), which takes two arguments:
formula = <RESPONSE> ~ <EXPLANATORY> specifies the model
data = <DATAFRAME> specifies the observations
The residual standard deviation provides an estimate of error variability:
\[\textcolor{red}{\hat{\sigma}} = \sqrt{\frac{1}{n - 2} \sum_i e_i^2} \qquad\text{(estimated error variability)}\]
The proportion of variability explained by the model is: \[ R^2 = 1 - \frac{(n - 2)\textcolor{red}{\hat{\sigma}^2}}{(n - 1)\textcolor{darkgrey}{s_y^2}} \quad\left(1 - \frac{\text{error variability}}{\text{total variability}}\right) \]
Age explains 40.43% of variability in RFFT.
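Both quantities can be computed directly from the residuals. The sketch below again uses the built-in cars data as a stand-in for prevend (an assumption made only so the example is runnable) and compares the hand computation with the values stored by summary().

```r
# Stand-in data: R's built-in `cars` data set, NOT the prevend data.
fit <- lm(dist ~ speed, data = cars)

e <- resid(fit)    # residuals e_i = y_i - yhat_i
n <- nrow(cars)    # number of observations

# Residual standard deviation: sqrt of SSE / (n - 2)
sigma_hat <- sqrt(sum(e^2) / (n - 2))

# R-squared: 1 - (error variability) / (total variability)
r_squared <- 1 - ((n - 2) * sigma_hat^2) / ((n - 1) * var(cars$dist))

# Compare with the values reported by summary()
print(c(by_hand = sigma_hat, lm = summary(fit)$sigma))
print(c(by_hand = r_squared, lm = summary(fit)$r.squared))
```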
Standard errors for the coefficients are:
\[SE\left(\hat{\beta}_0\right) = \hat{\sigma}\sqrt{\frac{1}{n} + \frac{\bar{x}^2}{(n - 1)s_x^2}} \qquad\text{and}\qquad SE\left(\hat{\beta}_1\right) = \hat{\sigma}\sqrt{\frac{1}{(n - 1)s_x^2}}\]
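While you won’t need to know these formulae, do notice that both standard errors are proportional to \(\hat{\sigma}\) and shrink as the sample size \(n\) or the spread of the explanatory variable \(s_x\) increases.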
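As a check on the standard error formulas, the following sketch evaluates them by hand and compares the results with the Std. Error column that summary() reports. It uses the built-in cars data as a stand-in for prevend, and the variable names are illustrative.

```r
# Stand-in data: R's built-in `cars` data set, NOT the prevend data.
fit <- lm(dist ~ speed, data = cars)

x <- cars$speed
n <- length(x)
sigma_hat <- summary(fit)$sigma   # residual standard deviation

# Standard errors from the formulas (var(x) is s_x^2)
se_b0 <- sigma_hat * sqrt(1 / n + mean(x)^2 / ((n - 1) * var(x)))
se_b1 <- sigma_hat * sqrt(1 / ((n - 1) * var(x)))

# Standard errors reported by summary()
se_lm <- summary(fit)$coefficients[, "Std. Error"]

print(rbind(by_hand = c(se_b0, se_b1), lm = se_lm))
```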
If the errors are symmetric and unimodal, then the sampling distribution of \[ T = \frac{\hat{\beta}_1 - \beta_1}{SE\left(\hat{\beta}_1\right)} \] is well approximated by a \(t_{n - 2}\) model.
Significance test: \(\begin{cases} H_0: \beta_1 = 0 \\ H_A: \beta_1 \neq 0 \end{cases}\)
Confidence interval: \(\hat{\beta}_1 \pm c\times SE\left(\hat{\beta}_1\right)\), where \(c\) is the critical value from the \(t_{n - 2}\) model
Call:
lm(formula = rfft ~ age, data = prevend)
Residuals:
Min 1Q Median 3Q Max
-56.085 -14.690 -2.937 12.744 77.975
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 134.0981 6.0701 22.09 <2e-16 ***
age -1.1908 0.1007 -11.82 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 20.52 on 206 degrees of freedom
Multiple R-squared: 0.4043, Adjusted R-squared: 0.4014
F-statistic: 139.8 on 1 and 206 DF, p-value: < 2.2e-16
2.5 % 97.5 %
(Intercept) 122.130647 146.0654574
age -1.389341 -0.9922471
Fitted model: \[ \text{RFFT} = 134.098 - 1.191 \times \text{age} \]
Age explains an estimated 40.43% of variation in RFFT.
With each additional year of age, mean RFFT declines by an estimated 1.19 points (SE 0.10).
There is a significant association between age and mean RFFT score (T = -11.82 on 206 degrees of freedom, p < 0.0001).
With 95% confidence, each additional year of age is associated with an estimated decline in mean RFFT between 0.99 and 1.39 points.
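The test statistic, p-value, and interval above can be reproduced, up to rounding, from the slope estimate, standard error, and degrees of freedom printed in the summary output. A minimal sketch; the variable names are illustrative, and because the inputs are the rounded values from the output, the last digit may differ slightly from what R reports.

```r
# Values copied from the summary output for the age coefficient
est <- -1.1908   # slope estimate
se  <- 0.1007    # standard error of the slope
df  <- 206       # residual degrees of freedom, n - 2

# Test statistic for H0: beta_1 = 0
t_stat <- est / se

# Two-sided p-value from the t model with n - 2 degrees of freedom
p_value <- 2 * pt(-abs(t_stat), df)

# 95% confidence interval: estimate +/- critical value * SE
crit <- qt(0.975, df)
ci <- est + c(-1, 1) * crit * se

print(t_stat)
print(p_value)
print(ci)
```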
Kleiber’s law refers to the relationship between metabolic rate and body mass.
Call:
lm(formula = log.metab ~ log.mass, data = kleiber)
Residuals:
Min 1Q Median 3Q Max
-1.14216 -0.26466 -0.04889 0.25308 1.37616
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.63833 0.04709 119.73 <2e-16 ***
log.mass 0.73874 0.01462 50.53 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.4572 on 93 degrees of freedom
Multiple R-squared: 0.9649, Adjusted R-squared: 0.9645
F-statistic: 2553 on 1 and 93 DF, p-value: < 2.2e-16
There is a significant association between body mass and metabolism (p < 0.0001). Log body mass explains an estimated 96.49% of variation in log metabolism. With 95% confidence, a unit increment in log mass is associated with an estimated increase in mean log metabolism between 0.7097 and 0.7678, with a point estimate of 0.7387.
Exponentiating both sides of the fitted SLR model equation:
\[ \underbrace{\text{metabolism}}_{e^{\log(\text{metabolism})}} = \underbrace{280.99}_{e^{5.64}} \times \underbrace{\text{mass}^{0.74}}_{e^{0.74 \log(\text{mass})}} \]
So we’ve really estimated what’s known as a power law relationship: \(y = ax^b\).
The estimate and interval for \(\beta_1\) in the SLR model can be transformed appropriately for a more direct interpretation:
With 95% confidence, every doubling of body mass is associated with an estimated 63.55-70.26% increase in median metabolism.
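The transformation works because, under the power law \(y = ax^b\), doubling \(x\) multiplies \(y\) by \(2^b\), so the percent change is \(100\times(2^b - 1)\). The sketch below applies this to the estimate and interval endpoints, using only the numbers printed in the summary output; the variable names are illustrative, and results may differ in the last digit because the inputs are rounded.

```r
# Values copied from the summary output
intercept <- 5.63833   # estimated intercept on the log scale
est <- 0.73874         # slope estimate for log.mass
se  <- 0.01462         # standard error of the slope
df  <- 93              # residual degrees of freedom

# Multiplicative constant a in the power law y = a * x^b
a <- exp(intercept)

# 95% confidence interval for the slope on the log scale
ci_log <- est + c(-1, 1) * qt(0.975, df) * se

# Percent change in median metabolism per doubling of mass: 100 * (2^b - 1)
pct_ci <- 100 * (2^ci_log - 1)
pct_point <- 100 * (2^est - 1)

print(a)
print(ci_log)
print(pct_ci)
print(pct_point)
```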
STAT218