Data import and dates/times

Author: Prof Amanda Luby
Affiliation: Carleton College
Stat 220 - Spring 2025

Click the “code” button above to copy and paste the source code into RStudio

Warm Up

Use read_csv() to import the desserts data set from

https://stat220-s25.github.io/data/desserts.csv

# your code here
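As a minimal sketch of how read_csv() can parse a date column at import time, the example below reads a small inline CSV instead of the course file. The column names, the values, and the day-month-year date format are assumptions made for illustration and may not match desserts.csv.

```r
library(readr)

# Hypothetical inline data standing in for a real file or URL;
# I() tells read_csv() to treat the string as literal CSV text.
toy_csv <- I("baker,uk_airdate\nAnnetha,17 August 2010\nDavid,24 August 2010\n")

toy <- read_csv(
  toy_csv,
  col_types = cols(
    baker = col_character(),
    # Assumed format: day, full month name, four-digit year
    uk_airdate = col_date(format = "%d %B %Y")
  )
)

toy
```

Passing a URL string as the first argument in place of toy_csv should work the same way, since read_csv() accepts paths, connections, and URLs.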

readr practice

Use the appropriate read_<type>() function to import the following data sets:

  • https://stat220-s25.github.io/data/data-4.csv

  • https://stat220-s25.github.io/data/tricky-1.csv

If you hit any errors/problems, be sure to explore them and identify the issue, even if you can’t “fix” it.
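One way to explore import issues is readr's problems() function, sketched below on made-up inline data. The column names and the bad value are hypothetical; nothing here is taken from data-4.csv or tricky-1.csv.

```r
library(readr)

# Hypothetical CSV text where one score cannot be parsed as a number
csv_text <- I("id,score\n1,10\n2,abc\n3,30\n")

scores <- read_csv(
  csv_text,
  col_types = cols(id = col_integer(), score = col_double())
)

# The unparseable value becomes NA, and readr records what went wrong
scores
problems(scores)
```

The tibble returned by problems() lists the row, column, expected type, and actual value for each parsing failure, which can help identify an issue even when the fix is not obvious.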

data-4.csv

# your code here

tricky-1.csv

# your code here

read_excel practice

Use the appropriate read_<type>() function to import the following data set:

  • https://stat220-s25.github.io/data/sales.xlsx

Step 1: read in the data so it looks like the following:

# A tibble: 9 × 2
  id      n    
  <chr>   <chr>
1 Brand 1 n    
2 1234    8    
3 8721    2    
4 1822    3    
5 Brand 2 n    
6 3333    1    
7 2156    3    
8 3987    6    
9 3216    5    

# your code here
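read_excel() expects a path to a local file, so a workbook hosted at a URL generally needs to be downloaded first. The sketch below shows the col_names, skip, and col_types arguments on a workbook bundled with {readxl}, used here as a stand-in for sales.xlsx. The sheet name and the replacement column names are specific to that example file, not to the sales data.

```r
library(readxl)

# readxl_example() returns the path to a workbook shipped with the package,
# used here so the sketch runs without a download.
# For a hosted file, one option is to download it first, e.g.
#   download.file(url, destfile = "sales.xlsx", mode = "wb")
path <- readxl_example("datasets.xlsx")

iris_text <- read_excel(
  path,
  sheet = "iris",
  skip = 1,  # skip the original header row
  col_names = c("sepal_length", "sepal_width",
                "petal_length", "petal_width", "species"),
  col_types = "text"  # a single type is recycled to every column
)

iris_text
```

Reading every column as text is one way to get <chr> columns like those in the target output above, leaving type conversion for a later step.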

Step 2 (Stretch goal): Manipulate the data so that it looks like the following:

# A tibble: 7 × 3
  brand   id    n    
  <chr>   <chr> <chr>
1 Brand 1 1234  8    
2 Brand 1 8721  2    
3 Brand 1 1822  3    
4 Brand 2 3333  1    
5 Brand 2 2156  3    
6 Brand 2 3987  6    
7 Brand 2 3216  5    

# your code here
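One possible approach to the stretch goal, sketched under the assumption that Step 1 produced exactly the 9 × 2 tibble shown earlier: flag the brand header rows, carry each brand name downward with tidyr::fill(), then drop the header rows. The tibble is rebuilt by hand here so the example does not depend on the file, and the object names sales and sales_tidy are made up for illustration.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Hand-built copy of the Step 1 output
sales <- tibble(
  id = c("Brand 1", "1234", "8721", "1822",
         "Brand 2", "3333", "2156", "3987", "3216"),
  n  = c("n", "8", "2", "3", "n", "1", "3", "6", "5")
)

sales_tidy <- sales |>
  # Keep the brand label on header rows, NA everywhere else
  mutate(brand = if_else(str_detect(id, "^Brand"), id, NA_character_)) |>
  # Fill each NA with the most recent brand label above it
  fill(brand) |>
  # Header rows have the literal string "n" in the n column
  filter(n != "n") |>
  select(brand, id, n)

sales_tidy
```

Other routes (for example, reading each brand's cell range separately and binding the pieces) would also work; this one keeps everything in a single pipeline.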

lubridate practice

Task 1

Create a new copy of the desserts data set, but do not parse the uk_airdate column within read_csv(). Instead, leave it as a character vector and parse the date using {lubridate} functions. Which approach do you prefer?

Then, create a new column called how_long_ago that measures the time elapsed between the UK airdate of the episode and today. Can you express this column:

  • in years
  • in months
  • in weeks
  • in days

(Hint: see ?time_length)

# your code here
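A minimal sketch of the lubridate workflow on a single made-up value: parse a character date, build an interval, and convert it with time_length(). The date string and its day-month-year format are assumptions, and a fixed reference date is used in place of today() so the numbers are reproducible.

```r
library(lubridate)

# Hypothetical character date; the format is assumed, so pick the
# parser (dmy, mdy, ymd, ...) that matches the real column
airdate_chr <- "17 August 2010"
airdate <- dmy(airdate_chr)

# Fixed reference date for reproducibility; use today() in the exercise
reference <- ymd("2020-08-17")

# An interval records both endpoints, so calendar units are exact
elapsed <- interval(airdate, reference)

time_length(elapsed, "years")
time_length(elapsed, "months")
time_length(elapsed, "weeks")
time_length(elapsed, "days")
```

Inside a data frame, the same calls can go in mutate(), for example a how_long_ago column computed from interval(uk_airdate, today()).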