Biostatistics Summer Prep Course: Methods & Computing
This course consisted of nine 90-minute sessions designed to prepare incoming biostatistics students for their statistical methods course. The materials available here include slide sets that introduce fundamental statistical topics, along with exercises that give students practice both applying these methods and computing in R.
Flow control
- Description: practice with for and while loops, the apply family, and conditional statements in R
- Exercises
- Solutions
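As a rough illustration of the constructs this session covers (not taken from the course exercises), the sketch below computes the same sum of squares with a for loop, a while loop, and `sapply()`, then labels each value with a vectorized conditional. The variable names are made up for the example.

```r
x <- 1:5

# for loop: accumulate the sum of squares one element at a time
total_for <- 0
for (xi in x) {
  total_for <- total_for + xi^2
}

# while loop: same computation with an explicit index
total_while <- 0
i <- 1
while (i <= length(x)) {
  total_while <- total_while + x[i]^2
  i <- i + 1
}

# apply family: square each element with sapply(), then sum
total_sapply <- sum(sapply(x, function(xi) xi^2))

# conditional statement (vectorized): label each value as even or odd
parity <- ifelse(x %% 2 == 0, "even", "odd")

print(c(total_for, total_while, total_sapply))
print(parity)
```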
Probability distributions
- Description: overview of discrete and continuous probability distributions; expectation and variance of random variables; empirical CDFs; working with built-in probability distributions in R
- Slides
- Exercises
- Solutions
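R's built-in distributions follow a `d`/`p`/`q`/`r` naming pattern (density or mass, cumulative probability, quantile, random draws). The short sketch below, which is an illustration rather than part of the course exercises, uses that pattern for a binomial and a normal distribution and compares an empirical CDF with the true CDF. The parameter values are arbitrary.

```r
set.seed(1)

# Binomial(10, 0.3): P(X = 3) and P(X <= 3)
p_exact3  <- dbinom(3, size = 10, prob = 0.3)
p_atmost3 <- pbinom(3, size = 10, prob = 0.3)

# Standard normal: the 97.5th percentile
q975 <- qnorm(0.975)

# Random draws from N(2, 1) and their empirical CDF
draws <- rnorm(1000, mean = 2, sd = 1)
Fhat  <- ecdf(draws)

# The empirical CDF at 2 should be close to the true value pnorm(2, 2, 1) = 0.5
print(c(empirical = Fhat(2), true = pnorm(2, mean = 2, sd = 1)))
```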
Graphing and data analysis
- Description: graphing and data manipulation using the tidyverse; Central Limit Theorem simulation exercise
- Exercises
- Solutions
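A Central Limit Theorem simulation can be sketched in a few lines. The version below is an assumption about what such an exercise might look like, not the course's own solution, and it uses base R graphics rather than the tidyverse so that it runs without extra packages. It draws repeated samples from a skewed Exponential(1) distribution and checks that the sample means are approximately normal with standard deviation 1 / sqrt(n).

```r
set.seed(2024)

n      <- 30    # size of each sample
n_reps <- 5000  # number of simulated samples

# Mean of each simulated sample from a skewed Exponential(1) distribution
sample_means <- replicate(n_reps, mean(rexp(n, rate = 1)))

# Exponential(1) has mean 1 and sd 1, so the CLT predicts
# sample means centered at 1 with sd 1 / sqrt(n)
print(c(mean = mean(sample_means),
        sd = sd(sample_means),
        clt_sd = 1 / sqrt(n)))

# Histogram of the sample means with the CLT normal approximation overlaid
hist(sample_means, breaks = 40, freq = FALSE,
     main = "Sample means of Exponential(1), n = 30", xlab = "Sample mean")
curve(dnorm(x, mean = 1, sd = 1 / sqrt(n)), add = TRUE, lwd = 2)
```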
Linear regression
- Description: introduction to linear regression; interpreting regression parameters; motivation for ordinary least squares; implementation in R using the lm() function; more involved exercises
- Slides
- Exercises
- Solutions
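To connect ordinary least squares with `lm()`, the following sketch fits a simple linear regression to simulated data and reproduces the coefficients from the normal equations. The data-generating values (intercept 1, slope 2) are arbitrary choices for illustration and do not come from the course exercises.

```r
set.seed(10)

# Simulated data with a known linear relationship
n <- 200
x <- runif(n, min = 0, max = 10)
y <- 1 + 2 * x + rnorm(n, sd = 1.5)

# Fit by ordinary least squares using lm()
fit <- lm(y ~ x)
print(coef(fit))

# Same estimates from the normal equations: solve (X'X) beta = X'y
X <- cbind(1, x)
beta_hat <- solve(t(X) %*% X, t(X) %*% y)
print(as.vector(beta_hat))
```

The slope estimate is interpreted as the expected change in `y` for a one-unit increase in `x`.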
Simulation studies
- Description: introduction to Monte Carlo simulation; commonly reported operating characteristics; confounding bias simulation exercise in R
- Slides
- Exercises
- Solutions
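The sketch below shows one way a confounding-bias Monte Carlo study could be set up. It is a hypothetical example with made-up parameter values, not the course exercise. A confounder affects both exposure and outcome, and two operating characteristics (bias and empirical standard error) are compared for an unadjusted and an adjusted regression estimator.

```r
set.seed(42)

n_sims      <- 500  # number of simulated datasets
n           <- 500  # sample size per dataset
true_effect <- 1    # true effect of exposure on outcome

one_sim <- function() {
  conf <- rnorm(n)                                   # confounder
  a    <- rbinom(n, size = 1, prob = plogis(conf))   # exposure depends on the confounder
  y    <- true_effect * a + 2 * conf + rnorm(n)      # outcome depends on both
  c(unadjusted = unname(coef(lm(y ~ a))["a"]),
    adjusted   = unname(coef(lm(y ~ a + conf))["a"]))
}

# 2 x n_sims matrix of estimates, one column per simulated dataset
est <- replicate(n_sims, one_sim())

# Operating characteristics: bias and empirical standard error
bias   <- rowMeans(est) - true_effect
emp_se <- apply(est, 1, sd)
print(rbind(bias = bias, emp_se = emp_se))
```

Under this setup the unadjusted estimator should show a clear positive bias, while the adjusted estimator should be approximately unbiased.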
Maximum Likelihood Estimation
- Description: overview of maximum likelihood theory with application to linear and logistic regression; very brief introduction to GLMs; practice with the optim() function in R
- Slides
- Exercises
- Solutions
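As a minimal sketch of maximum likelihood with `optim()`, assuming a simple logistic regression with one covariate, the code below minimizes the negative log-likelihood numerically and compares the result with `glm()`. The simulated data and the function name `neg_loglik` are illustrative and not drawn from the course materials.

```r
set.seed(7)

# Simulated data from a logistic model with intercept -0.5 and slope 1
n <- 1000
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1 * x))

# Negative log-likelihood for logistic regression
# log L(beta) = sum( y * eta - log(1 + exp(eta)) ), with eta = b0 + b1 * x
neg_loglik <- function(beta) {
  eta <- beta[1] + beta[2] * x
  -sum(y * eta - log1p(exp(eta)))
}

# optim() minimizes, so pass the negative log-likelihood
opt <- optim(par = c(0, 0), fn = neg_loglik, method = "BFGS")

# glm() fits the same model, so the estimates should agree closely
fit <- glm(y ~ x, family = binomial)
print(rbind(optim = opt$par, glm = unname(coef(fit))))
```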
Bootstrap
- Description: motivation for bootstrap; algorithm for nonparametric bootstrap; implementation in R (with connection to MLE exercises)
- Slides
- Exercises
- Solutions
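The nonparametric bootstrap resamples observations with replacement, recomputes the estimate on each resample, and uses the spread of those estimates to quantify uncertainty. The sketch below applies this to the slope of a logistic regression, echoing the MLE topic. It is an illustrative assumption about how such an exercise could look; the helper `slope()` and all parameter values are hypothetical.

```r
set.seed(99)

# Simulated data from a logistic model
n   <- 300
dat <- data.frame(x = rnorm(n))
dat$y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1 * dat$x))

# Estimate of interest: the MLE of the slope
slope <- function(d) {
  unname(coef(glm(y ~ x, family = binomial, data = d))["x"])
}

# Nonparametric bootstrap: resample rows with replacement and re-estimate
B <- 500
boot_slopes <- replicate(B, {
  idx <- sample.int(n, size = n, replace = TRUE)
  slope(dat[idx, ])
})

boot_se <- sd(boot_slopes)                         # bootstrap standard error
boot_ci <- quantile(boot_slopes, c(0.025, 0.975))  # percentile interval

# Model-based standard error from glm() for comparison
model_se <- summary(glm(y ~ x, family = binomial, data = dat))$coefficients["x", "Std. Error"]
print(c(boot_se = boot_se, model_se = model_se))
print(boot_ci)
```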
Additional handouts
Source code available here
Slide sets and some exercises were adapted from previous instructors: Katrina Devick (2017), Jessica Gronsbell (2016), and Emma Schwager (2015).