Homework 7

Course

STAT218

Due

May 13, 2025

  1. The cholesterol dataset contains measurements of total serum cholesterol (mg/L) from a study in which participants were randomly allocated to one of two breakfast diets: corn flakes or oat bran.

    1. Construct boxplots of serum cholesterol by diet group. Why might a nonparametric method be more appropriate than the t test to compare groups?
    2. Test for an effect of diet on cholesterol using the rank sum procedure at the 5% level and interpret the result in context.
    3. Compute and interpret a point estimate for the location shift.
    4. Compare your results above with a parametric test. Do your conclusions differ depending on which procedure is used?
# load cholesterol data

# boxplots

# rank sum test (with point estimate)

# parametric test
  1. [your answer here]
  2. [your answer here]
  3. [your answer here]
  4. [your answer here]
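For reference on the rank sum statistic and the location-shift estimate asked for above, here is a minimal sketch. It assumes Python with only the standard library, since the assignment does not name a language. The helper names (`midranks`, `rank_sum_statistic`, `hodges_lehmann_shift`) and the data values are hypothetical and are not taken from the cholesterol dataset. The shift estimate shown is the Hodges-Lehmann estimator, the median of all pairwise between-group differences, which is the estimate usually paired with the rank sum test.

```python
from collections import Counter
from itertools import product
from statistics import median


def midranks(values):
    """Rank the values from 1 to n, giving tied values their average rank."""
    counts = Counter(values)
    rank_of, seen = {}, 0
    for v in sorted(counts):
        # Tied values occupy positions seen+1 .. seen+counts[v]; average them.
        rank_of[v] = seen + (counts[v] + 1) / 2
        seen += counts[v]
    return [rank_of[v] for v in values]


def rank_sum_statistic(x, y):
    """Sum of the pooled-sample ranks that belong to group x."""
    ranks = midranks(list(x) + list(y))
    return sum(ranks[: len(x)])


def hodges_lehmann_shift(x, y):
    """Median of all pairwise differences x_i - y_j."""
    return median(xi - yj for xi, yj in product(x, y))


# Made-up values for illustration only (NOT the cholesterol data).
group_a = [4.6, 6.4, 5.4, 4.5]
group_b = [3.8, 5.6, 5.9, 4.1]

w = rank_sum_statistic(group_a, group_b)
shift = hodges_lehmann_shift(group_a, group_b)
print("rank sum for group_a:", w)
print("estimated location shift (a - b):", shift)
```

A statistics package would normally report the p-value and a confidence interval alongside these two quantities; the sketch only shows how the statistic and the point estimate are formed.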
  2. The brfss dataset includes observations of desired and actual weight for a small subset of BRFSS survey respondents (a sample of U.S. adults). In this problem you’ll test whether the typical U.S. adult wishes to gain or lose weight.

    1. Construct a histogram of the differences between actual and desired weights among respondents (actual - desired). Inspect the values as well. Provide two reasons why a nonparametric method might be preferable to a t test.
    2. Use the sign test at the 1% level to test whether the median U.S. adult is 5 lbs heavier than they would like, and interpret the result in context.
    3. Compare your results above with the corresponding parametric inference. Do results differ?
    4. [extra credit] Suppose we define “satisfied with one’s weight” to mean within 5 lbs of desired weight. Estimate the proportion of U.S. adults satisfied with their weight. (Think carefully about endpoints; “within 5” means a difference strictly less than 5 lbs in either direction.)
    5. [extra credit] Test whether a majority of U.S. adults are satisfied with their weight.
# load brfss data

# compute weight differences

# construct histogram

# sign test 

# compare with parametric test

# extra: estimate proportion of adults within 5 lbs of desired weight

# extra: test whether a majority are satisfied with their weight
  1. [your answer here]
  2. [your answer here]
  3. [your answer here]
  4. [your answer here]
  5. [your answer here]
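The mechanics behind the sign test and the two extra-credit parts can be sketched as follows. This again assumes Python with only the standard library; the function names and the data are hypothetical and are not the brfss values. Two choices here are assumptions rather than requirements of the assignment: differences exactly equal to the hypothesized median are dropped before counting signs (one common convention), and the two-sided p-value is taken as twice the smaller binomial tail.

```python
from math import comb


def sign_test(differences, median0=0.0):
    """Two-sided exact sign test of H0: median of differences == median0.

    Returns (count above, count below, p-value). Values equal to median0
    are dropped, which is an assumed tie-handling convention.
    """
    above = sum(d > median0 for d in differences)
    below = sum(d < median0 for d in differences)
    n = above + below
    # Under H0 the number above median0 is Binomial(n, 1/2).
    k = min(above, below)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return above, below, min(1.0, 2 * tail)


def proportion_within(differences, tolerance):
    """Share of observations with |difference| strictly below tolerance."""
    return sum(abs(d) < tolerance for d in differences) / len(differences)


def binom_upper_tail(successes, n, p0=0.5):
    """P(X >= successes) for X ~ Binomial(n, p0): a one-sided exact test."""
    return sum(
        comb(n, i) * p0 ** i * (1 - p0) ** (n - i)
        for i in range(successes, n + 1)
    )


# Made-up differences (actual - desired) in lbs, NOT the brfss data.
diffs = [0, 5, 10, 25, -5, 15, 5, 30, 20, 4, -3, 12]

above, below, p_sign = sign_test(diffs, median0=5)
p_hat = proportion_within(diffs, tolerance=5)
satisfied = sum(abs(d) < 5 for d in diffs)
p_majority = binom_upper_tail(satisfied, len(diffs))
print(above, below, p_sign, p_hat, p_majority)
```

Note that `proportion_within` uses a strict inequality, so a respondent exactly 5 lbs from their desired weight is not counted as satisfied, matching the endpoint hint in the extra-credit question.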
  3. The famuss dataset contains measurements of percent change in dominant and nondominant arm strength after resistance training for 595 study participants, along with ACTN3 genotype, which is thought to be associated with muscle growth.

    1. Construct boxplots of change in nondominant arm strength by genotype. Why might a nonparametric approach be preferable to ANOVA in this circumstance?
    2. Test for an association between genotype and change in nondominant arm strength at the 5% level using the Kruskal-Wallis test. Interpret the result in context.
    3. Compare your result in the previous part with the parametric inference using ANOVA. Do conclusions differ?
# load famuss data

# boxplots of nondominant change by genotype

# test for association between genotype and change in nondominant arm strength

# compare with parametric ANOVA
  1. [your answer here]
  2. [your answer here]
  3. [your answer here]
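As a reference for how the Kruskal-Wallis statistic is built from pooled ranks, here is a minimal sketch, assuming Python with only the standard library. The function names and data are hypothetical and are not the famuss values. The p-value helper assumes exactly three groups, so the chi-square reference distribution has 2 degrees of freedom and its upper tail is exp(-H/2); with a different number of groups a general chi-square routine would be needed.

```python
from collections import Counter
from math import exp


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the standard correction for ties."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)

    # Midranks: tied values share the average of the positions they occupy.
    counts = Counter(pooled)
    rank_of, seen = {}, 0
    for v in sorted(counts):
        rank_of[v] = seen + (counts[v] + 1) / 2
        seen += counts[v]

    # H compares each group's rank sum with what equal distributions imply.
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)

    # Tie correction; equals 1 when all values are distinct.
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    return h / correction


def p_value_three_groups(h):
    """Upper tail of a chi-square with 2 df (valid for three groups only)."""
    return exp(-h / 2)


# Made-up percent changes for three hypothetical groups (NOT the famuss data).
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

h = kruskal_wallis_h(groups)
p = p_value_three_groups(h)
print("H =", h, "approximate p-value =", p)
```

The chi-square approximation is a large-sample result, so for groups as small as the made-up ones here the p-value is only illustrative.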