Description
This course teaches the basic skills and concepts needed to use the R statistical programming language for working with data and running reproducible analyses. R is a powerful and flexible tool for data analysis, data visualization, and statistics. Further, because R is free and open source, software licensing, black-box technologies, and opaque cost structures are not a concern. Yet, as with many programming languages, the initial learning curve can be steep and somewhat daunting.
The aim of this course is not to make learners experts in R; that path is longer than a single course can cover. Rather, the intent is to help learners get over the initial climb and to provide the basic skills and experience needed to work with the R programming language, ask the right questions, and manipulate data to conduct exploratory data analysis and visualization.
With each lesson, learners will be presented with data, R code, and video instruction on how to use R to analyze data. The course teaches how to ingest, manipulate, summarize, and visualize data. Additionally, the course covers the basics of setting up an R project for reproducibility by emphasizing rules, best practices, and frameworks for combining code with prose commentary.
Intended Audience
This course is for analysts and researchers who are interested in learning R. Learners should have basic data management and analysis knowledge but need not have experience using R or other programming languages. This is not a statistics course, but it draws upon ideas and techniques from the field of statistics. This is also not a computer science course, but it draws on topics, technologies, and practices that are closely aligned with that field.
Learners need a Mac, Linux, or Windows computer (not a tablet or Chromebook) on which they have administrative privileges. Learners must be able to download and install R, RStudio, and R packages on their computers, all of which are available for free.
Learning Objectives
After completing this course, learners should be able to do the following:
- Program and document code in the R language using the integrated development environment RStudio.
- Perform fundamental data analysis tasks, including
  - importing and managing data from files,
  - manipulating, transforming, and cleaning data to prepare for analysis, and
  - performing exploratory data analysis.
- Visualize, communicate, and document their work systematically in a reproducible framework.
Structure
The course is organized into lessons that are meant to be completed sequentially. Each lesson includes a video introduction to the topic or technique, with an example or demo of it being implemented, followed by a practical exercise that puts some or all of the lesson's material into practice.
| Lesson | Lesson Objectives |
|---|---|
| 1: Why Use R? | Discuss the benefits of using open-source software. Compare using R and Excel for data analysis. |
| 2: Install R and RStudio | Describe the difference between R and RStudio. Install R. Install RStudio. |
| 3: RStudio Orientation | Configure RStudio for data analysis projects. Create an RStudio Project. Identify panes in RStudio and what each is used for. |
| 4: R Packages | Recognize why and when R packages are useful. Install an R package. Attach an R package. |
| 5: Importing Data | Organize files in an RStudio Project. Create a dataset in R by importing CSV and Excel files. Explore the contents of a dataset in R. |
| 6: Data Visualization I | Create a chart using the ggplot2 package. Specify which type of chart to create. Change the colors used in the chart. |
| 7: Programming Basics | Create new objects using the assignment operator. Define a data frame. Identify key data types. |
| 8: Filter and Sort Rows | Remove rows from a data frame using logical operators. Sort rows in a data frame. |
| 9: Select and Rename Columns | Remove columns from a data frame. Rename columns in a data frame. Reorder columns in a data frame. |
| 10: Create New Columns | Add columns to a data frame. Modify columns in a data frame. |
| 11: Data Visualization II | Modify the order of bars in plots. Choose custom colors for plots. Add titles and labels to plots. |
| 12: Summarize Data | Calculate summary statistics of a data frame. Aggregate rows of a data frame by group. |
| 13: Reshape Data | Define tidy data. Pivot data from wide to long format. Pivot data from long to wide format. |
| 14: Join Data | Compare inner, left, and full joins. Merge two data frames. Recognize common join errors. |
| 15: Data Visualization III | Change the format of axis scales. Use palettes to change plot colors. Create a small multiple series of plots. |
| 16: Dates and Times | Convert text to dates and times. Calculate the amount of time between two dates. |
| 17: Strings | Create new strings from existing data. Change the case of a string. Extract elements from a string. Filter a data frame using the contents of a string. |
| 18: Missing Values | Check for missing values in a data frame. Exclude missing values from calculations. |
| 19: Export Data | Write a data frame to a CSV file. Generate sheets in an Excel workbook from multiple data frames. |
| 20: Getting Help | Identify online resources to answer programming questions. Create a reproducible example to debug code. |
| 21: Literate Programming | Combine code, output, and prose in a document. Use Quarto to create HTML and Word documents. |
| 22: Put It All Together | Develop an automated workflow to generate a monthly report by importing, cleaning, aggregating, and visualizing data. |
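To give a sense of where the lessons lead, the import, clean, aggregate, and visualize workflow named in Lesson 22 might look roughly like the sketch below. This is an illustration only, not course material: the sales data and the column names (`month`, `region`, `units`) are hypothetical, and the sketch uses base R functions so that it runs on a fresh R installation, whereas the course itself teaches charting with the ggplot2 package.

```r
# Hypothetical monthly sales records, inlined as CSV text so the sketch is
# self-contained; in practice the data would come from a file on disk.
csv_text <- "month,region,units
2024-01,North,10
2024-01,South,
2024-01,North,5
2024-02,South,7
2024-02,North,3"

# Import: read the CSV into a data frame (blank numeric fields become NA).
sales <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Clean: drop rows where the unit count is missing.
sales_clean <- sales[!is.na(sales$units), ]

# Aggregate: total units sold per month.
monthly <- aggregate(units ~ month, data = sales_clean, FUN = sum)

# Visualize: a simple bar chart of the monthly totals.
barplot(monthly$units, names.arg = monthly$month,
        main = "Units sold per month", ylab = "Units")
```

Each commented step corresponds to a group of lessons: importing (Lesson 5), handling missing values (Lesson 18), summarizing by group (Lesson 12), and visualization (Lessons 6, 11, and 15).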
Resources
- Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett Grolemund, R for Data Science, Second Edition (O’Reilly Media, 2023). [Free E-book]
- Chester Ismay and Albert Y. Kim, Statistical Inference via Data Science: A ModernDive into R and the Tidyverse, Second Edition (Chapman and Hall/CRC, 2025). [Free E-book]
- Jenny Bryan, STAT 545: Data wrangling, exploration, and analysis with R (2019). [Free E-book]
Estimated Time to Complete
10 hours