Final study guide

The information below provides an overview of the final exam: what it covers, how to prepare, the format, and practice exercises.

Scope

The final is comprehensive in scope – any material covered in lecture, labs, or assignments could appear. The material for the course broadly falls under two umbrellas:

statistical inference from continuous data
- one- and two-sample $t$-tests and confidence intervals
- analysis of variance
- nonparametric alternatives to $t$-tests and ANOVA
- simple linear regression
statistical inference from categorical data (weeks 8-10)
- exact and approximate tests and intervals for proportions
- $χ^{2}$ tests
- relative risk and odds ratios

All of the methods above are instances of population parameter estimation, hypothesis tests, and confidence intervals. Thus, you are expected to be fluent with respect to the following general concepts and their specific manifestations in the above-listed methods:

population/model parameters
sample statistics
point estimates
standard errors and sampling variability
sampling distributions
interval coverage and construction
statistical hypotheses/alternatives
test statistics
$p$-values
type I and type II errors
statistical power

These ideas are very general and provide a core conceptual framework for statistical methodology that extends well beyond the scope of this class.
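Several of these concepts (interval coverage, type I error, sampling variability) can be seen at once in a small simulation. The sketch below is illustrative only: the population mean, SD, sample size, and number of replications are made-up settings, not values from the course. It repeatedly samples from a population where the null hypothesis is true and records how often the 95% interval covers the true mean and how often the 5%-level test rejects.

```r
# simulation sketch with hypothetical settings (not course data)
set.seed(2024)
mu   <- 10     # true population mean (assumed)
n    <- 25     # sample size (assumed)
reps <- 2000   # number of simulated samples (assumed)

sim <- replicate(reps, {
  x   <- rnorm(n, mean = mu, sd = 2)
  out <- t.test(x, mu = mu)   # H0 is true by construction
  c(covered  = out$conf.int[1] <= mu && mu <= out$conf.int[2],
    rejected = out$p.value < 0.05)
})

# proportion of samples whose interval covers mu, and
# proportion of samples in which the true H0 is rejected (type I errors)
rates <- rowMeans(sim)
rates
```

The coverage proportion should land near 0.95 and the rejection proportion near 0.05. The two sum to one because a two-sided 5%-level test rejects exactly when the 95% interval excludes the hypothesized value.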

Preparation

I recommend preparing a set of review notes based on the above, your existing notes, lecture slides, assigned reading, and assignments and labs.

One straightforward strategy would be to take each of the main methods we discussed, list the “core concepts” above as they apply to that method, and identify one example from a past assignment. For instance, see the example method summary below.

Example method summary

The one-sample $t$-test.

Method

Population/model parameters: population mean $μ$
Sample statistics: sample mean and SD $\bar{x}, S_{x}$
Point estimate: $\hat{μ} = \bar{x}$
Standard error: $S E (\bar{x}) = \frac{S_{x}}{\sqrt{n}}$
Sampling distribution: $t_{n - 1}$ model for $\frac{\bar{x} - μ}{S E (\bar{x})}$
Interval: $\bar{x} \pm c \times S E (\bar{x})$, where $c$ is a quantile from the $t_{n - 1}$ model
Statistical hypothesis and alternatives: $H_{0} : μ = μ_{0}$ and $H_{A} : μ \begin{matrix} > \\ \neq \\ < \end{matrix} μ_{0}$
Test statistic: $T = \frac{\bar{x} - μ_{0}}{S E (\bar{x})}$
$p$-value: from the $t_{n - 1}$ model, the proportion of samples whose test statistic would be at least as extreme as the observed one in the direction of the alternative
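The summary above can be traced step by step in R. The following is a minimal sketch on a simulated sample: the vector `x`, its size, and the value of `mu0` are made up for illustration and are not the course data. Each quantity is computed by hand and then compared against `t.test()`.

```r
# hypothetical sample (simulated, NOT the course data)
set.seed(1)
x   <- rnorm(39, mean = 98.4, sd = 0.9)
mu0 <- 98.6                                 # hypothesized mean under H0

n     <- length(x)
xbar  <- mean(x)                            # point estimate of mu
se    <- sd(x) / sqrt(n)                    # standard error of the sample mean
tstat <- (xbar - mu0) / se                  # test statistic
pval  <- 2 * pt(-abs(tstat), df = n - 1)    # two-sided p-value from t model
crit  <- qt(0.975, df = n - 1)              # critical value for a 95% interval
ci    <- xbar + c(-1, 1) * crit * se        # interval: estimate +/- c * SE

# compare the by-hand results with the built-in test
out <- t.test(x, mu = mu0, alternative = 'two.sided')
all.equal(tstat, unname(out$statistic))
all.equal(pval, out$p.value)
all.equal(ci, as.numeric(out$conf.int))
```

For a one-sided alternative, the $p$-value would instead use a single tail, for example `pt(tstat, df = n - 1)` for $H_{A} : μ < μ_{0}$.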

Example application

Body temperature data. Is mean body temperature actually 98.6 degrees Fahrenheit?

$\begin{cases} H_{0} : μ = 98.6 \\ H_{A} : μ \neq 98.6 \end{cases}$

# load data
load('data/temps.RData')
btemp <- temps$body.temp

# two-sided test
t.test(btemp, mu = 98.6, alternative = 'two.sided')


    One Sample t-test

data:  btemp
t = -1.3283, df = 38, p-value = 0.192
alternative hypothesis: true mean is not equal to 98.6
95 percent confidence interval:
 98.10813 98.70213
sample estimates:
mean of x 
 98.40513

The data do not provide evidence that mean body temperature differs from 98.6 degrees Fahrenheit ($T = -1.3283$ on 38 df, $p = 0.192$). With 95% confidence, mean body temperature is estimated to be between 98.11 and 98.70 degrees Fahrenheit.
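As a check on how the pieces of the output fit together, the standard error and the critical value can be recovered from the reported numbers alone. This is a worked sketch using only the values printed above, rounded along the way, so the last digits are approximate.

$$
\begin{aligned}
S E (\bar{x}) &= \frac{\bar{x} - μ_{0}}{T} = \frac{98.40513 - 98.6}{-1.3283} \approx 0.1467 \\
\text{margin of error} &= \frac{98.70213 - 98.10813}{2} = 0.297 \\
c &= \frac{\text{margin of error}}{S E (\bar{x})} \approx \frac{0.297}{0.1467} \approx 2.02
\end{aligned}
$$

The recovered value of $c$ agrees with the 0.975 quantile of the $t_{38}$ model (about 2.02), and the interval is centered at the sample mean, since $(98.10813 + 98.70213)/2 = 98.40513$.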

You may also wish to repeat a few example problems from topics that you struggled with (or found easy, if you want a confidence boost).

Format

The test comprises a series of short data analyses in which quantitative results are provided for you. Each analysis has several prompts that ask you to interpret results in context or perform simple follow-up calculations.

During the exam you may consult any of your course notes and are allowed the use of a calculator. However, you are not permitted access to digital materials. My personal recommendation is that you prepare a few sheets of concise notes and bring or print relevant class notes.