library(epitools)
Test 3 practice
The problems below are intended to provide extra practice for the third test. They are similar in scope and format to the actual test problems, if a little shorter.
Practice problems
The American Community Survey (ACS) is conducted annually by the U.S. Census Bureau to gather data on socioeconomic and demographic composition of American households in communities across the nation. The acs dataset contains 1605 responses from the 2012 ACS.
- [L3] Compute the proportion of respondents of each employment status.
- [L6] Construct a 95% confidence interval for the 2012 unemployment rate (defined here as the share of U.S. adults who are unemployed).
- [L7] Test for an association between education level and employment (at the 5% level).
- [L8] Estimate the relative likelihood of unemployment by education level. Provide a 95% confidence interval.
- [L7] How do you explain the association given that there is not a significant difference in unemployment rates?
# load data
# proportion of respondents in each employment category
# CI for unemployment rate
# test for association between employment and education
# relative likelihood of unemployment
# confidence interval for relative likelihood
# inspect residuals
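One way the outline above might be filled in is sketched below. The counts are made up for illustration and are not taken from the acs data; the education and employment labels are also assumptions. In practice the table would come from cross-tabulating the two variables in the loaded dataset.

```r
# HYPOTHETICAL education-by-employment counts (NOT from the acs dataset)
counts <- matrix(
  c(300, 30, 170,
    380, 28,  92),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    education  = c("no degree", "degree"),
    employment = c("employed", "unemployed", "not in labor force")
  )
)

# proportion of respondents in each employment category
emp_totals <- colSums(counts)
emp_props  <- emp_totals / sum(counts)
emp_props

# 95% CI for the unemployment share
# (prop.test returns a score interval, continuity-corrected by default)
unemp_ci <- prop.test(emp_totals[["unemployed"]], sum(counts))$conf.int
unemp_ci

# chi-square test of association between education and employment
assoc <- chisq.test(counts)
assoc$p.value

# standardized residuals indicate which cells depart most from independence
round(assoc$stdres, 2)
```

Large residuals in a column other than the unemployed column would suggest that an association is driven by that category rather than by unemployment itself.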
The avandia dataset contains data from a 2010 JAMA study investigating the risk of cardiovascular problems (acute myocardial infarction or heart failure) among elderly patients on two common diabetes medications. Data were obtained from health records for 31,840 patients treated with pioglitazone and 13,674 patients treated with rosiglitazone.
- [L6] Which population proportions are estimable? Compute point estimates for the proportions you identify.
- [L6] Compute a point estimate for the difference in the risk of cardiovascular problems between the two medications. Interpret the estimate in context.
- [L7] Test at the 5% level for an association between medication and the rate of cardiovascular problems.
- [L8] Construct a 95% confidence interval for the relative risk of cardiovascular issues between the two treatments.
- [L8] Interpret your interval in context. Which drug is safer?
# load data
# point estimates for estimable population proportions
# estimate for difference in risk of cardiovascular issues
# test for association
# estimate measure of association
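A minimal sketch of these steps follows, using made-up counts and placeholder treatment names rather than the avandia data. It assumes the table is arranged the way riskratio from epitools expects by default: one row per group with the reference group first, and the outcome of interest in the second column.

```r
library(epitools)

# HYPOTHETICAL counts for illustration (NOT the avandia data)
tbl <- matrix(
  c(950, 50,
    460, 40),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    treatment = c("drug A", "drug B"),   # placeholder names; drug A is the reference
    outcome   = c("no event", "event")
  )
)

# estimated risk of an event within each treatment group
risk <- tbl[, "event"] / rowSums(tbl)
risk

# point estimate for the difference in risk (drug B minus drug A)
risk_diff <- risk[["drug B"]] - risk[["drug A"]]
risk_diff

# chi-square test of association between treatment and outcome
test <- chisq.test(tbl)
test$p.value

# relative risk of an event (drug B relative to drug A) with a 95% Wald interval
rr <- riskratio(tbl, method = "wald", conf.level = 0.95)
rr$measure
```

The risks here are computed within rows because the group sizes in such a table are fixed by how patients were selected, so only proportions conditional on treatment are meaningful.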
The ethanol dataset contains observations from a 2017 study comparing ethanol ablation treatments for superficial solid tumors with respect to clinical regression. In the study, hamsters with oral cancer tumors were randomly allocated to receive treatments; after a period of time, it was recorded whether the tumor had diminished (‘regressed’).
- [L6] What proportion of each treatment group showed regression?
- [L7] Test for a treatment effect at the 5% level.
- [L8] Estimate the relative likelihood of regression on the ethyl cellulose treatment compared with the pure ethanol treatment. Provide a point estimate and 95% confidence interval.
# load data
# proportion of regressions in each treatment group
# test for treatment effect
# estimate relative likelihood of regression
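The sketch below shows one possible approach with invented counts, not the ethanol data. Two choices here are assumptions rather than the intended solution: Fisher's exact test is used because animal studies often have small groups, although a chi-square test may be what is expected; and the rows are ordered so that pure ethanol is the reference group for the relative likelihood.

```r
library(epitools)

# HYPOTHETICAL counts (NOT the ethanol data); reference group listed first
tbl <- matrix(
  c(5, 2,
    1, 6),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    treatment = c("pure ethanol", "ethyl cellulose"),
    outcome   = c("no regression", "regression")
  )
)

# proportion of each treatment group (row) falling in each outcome category
group_props <- prop.table(tbl, margin = 1)
group_props

# exact test for a treatment effect; an alternative to chisq.test for small counts
effect_test <- fisher.test(tbl)
effect_test$p.value

# relative likelihood of regression, ethyl cellulose relative to pure ethanol,
# with a 95% Wald interval
rr <- riskratio(tbl, method = "wald", conf.level = 0.95)
rr$measure
```

Swapping the row order would instead report pure ethanol relative to ethyl cellulose, so it is worth checking which group appears first before interpreting the estimate.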
The antibiotics dataset contains observations of pre-existing medical conditions of 92 children involved in a study on the optimal duration of antibiotic use in treatment of tracheitis, which is an upper respiratory infection. Assume the subjects are a representative sample of children that develop tracheitis.
- [L3] Based on the data, which pre-existing condition is estimated to be most common among children who develop tracheitis?
- [L6] Perform an exact 5% level test to determine whether the prevalence of pre-existing respiratory conditions exceeds 10%.
- [L6] Perform an exact 5% level test to determine whether the prevalence of pre-existing cardiovascular conditions exceeds 10%.
# load data
# proportions of pre-existing conditions
# exact test of whether respiratory condition rate exceeds 10%
# exact test of whether cardiovascular condition rate exceeds 10%
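An exact one-sided test of this kind can be sketched with binom.test from base R. The sample size of 92 comes from the problem statement, but the count of children with the condition is made up here; the real count would come from tabulating the antibiotics data.

```r
n <- 92   # sample size stated in the problem
x <- 15   # HYPOTHETICAL number of children with the condition (NOT from the data)

# exact binomial test of H0: p = 0.10 against HA: p > 0.10
result <- binom.test(x, n, p = 0.10, alternative = "greater")

result$estimate   # sample proportion
result$p.value    # reject H0 at the 5% level if this is below 0.05
```

The same call, with the count for the other condition substituted for x, would cover the second test.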