26-sql
Warm Up
Using nycflights23::flights
, answer the following using {dplyr} code:
- How many flights are in this dataset?
- How many flights flew to MSP from JFK in 2023?
- What was the average flight delay for MSP flights in Sept 2023?
nycflights23::flights
# A tibble: 435,352 × 19
year month day dep_time sched_dep_time dep_delay arr_time sched_arr_time
<int> <int> <int> <int> <int> <dbl> <int> <int>
1 2023 1 1 1 2038 203 328 3
2 2023 1 1 18 2300 78 228 135
3 2023 1 1 31 2344 47 500 426
4 2023 1 1 33 2140 173 238 2352
5 2023 1 1 36 2048 228 223 2252
6 2023 1 1 503 500 3 808 815
7 2023 1 1 520 510 10 948 949
8 2023 1 1 524 530 -6 645 710
9 2023 1 1 537 520 17 926 818
10 2023 1 1 547 545 2 845 852
# ℹ 435,342 more rows
# ℹ 11 more variables: arr_delay <dbl>, carrier <chr>, flight <int>,
# tailnum <chr>, origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>,
# hour <dbl>, minute <dbl>, time_hour <dttm>
Intro to SQL
Your turn: Set up database connection
db <- dbConnect_scidb(dbname = "imdb")
db %>%
dbGetQuery("SELECT * FROM kind_type
LIMIT 0,10;")
What kind_type
corresponds to movies?
Your turn: simple queries
- Find your favorite movie in the
title
table - Find Viola Davis in the
name
table - Make note of Viola Davis’s
person_id
Your turn: finding data info
List the fields in the name
, cast_info
, and char_name
tables. Do they have fields in common? How do you think these tables fit together?
dbListTables(db)
dbListFields(db, "title")
Your turn: JOIN
ing Viola Davis
Find all the rows in
cast_info
that correspond to Viola Davis as an actressAdd the names of the characters she played by
JOIN
ing thechar_name
table
Your turn: full filmography
Find Viola Davis’s full filmography, in chronological order. Include each movie’s title
, production_year
, and the name
of the character that she played.