Overview

Rebrickable is a website that shows you which LEGO sets you can build from the LEGO sets and parts that you already own. To do this, Rebrickable maintains a database of the entire LEGO catalog. In this document, we’ll summarize the LEGO sets in this database.

The data set was originally obtained from the Tidy Tuesday repository for 2022-09-06.

Summarizing the LEGO data set

The data set consists of 19798 LEGO sets (i.e., rows) and 6 variables (i.e., columns). It includes LEGO sets from 1949 to 2022. The average number of parts in a set is 161.1, with a standard deviation of 402.62. However, 3630 sets in the data set are listed with 0 parts, which pulls the average down and makes these summary statistics an unreliable description of a typical set.
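The figures above can be reproduced with a short dplyr summary. The following is a minimal sketch, not the code originally used for this report: `toy_sets` and `summarise_parts()` are hypothetical names introduced for illustration, and the toy values are made up so the example runs on its own. Only the column names `year` and `num_parts` are taken from the plotting code in this document.

```r
library(dplyr)

# Hypothetical toy data standing in for the real `sets` table.
# Two of the five rows have 0 parts, mimicking the issue described above.
toy_sets <- tibble(
    set_num   = c("a-1", "b-1", "c-1", "d-1", "e-1"),
    year      = c(1949L, 1980L, 2001L, 2015L, 2022L),
    num_parts = c(0L, 120L, 0L, 300L, 1500L)
)

# Summarise the number of sets, the year range, and the part counts,
# both over all sets and over sets with at least one part.
summarise_parts <- function(sets) {
    sets |>
        summarise(
            n_sets         = n(),
            first_year     = min(year),
            last_year      = max(year),
            n_zero_parts   = sum(num_parts == 0),
            mean_parts     = mean(num_parts),
            sd_parts       = sd(num_parts),
            mean_parts_pos = mean(num_parts[num_parts > 0]),
            sd_parts_pos   = sd(num_parts[num_parts > 0])
        )
}

toy_summary <- summarise_parts(toy_sets)
toy_summary
```

Calling `summarise_parts(sets)` on the real table would give the corresponding values for the full data set. Comparing `mean_parts` with `mean_parts_pos` shows how much the 0-part sets shift the average.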

The scatterplot below, with a smoothed trend line, shows how the typical number of parts in a set has changed from 1949 to 2022.

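library(tidyverse)

# `sets` is assumed to be the Rebrickable sets table, already loaded
# as a data frame with one row per LEGO set.
ggplot(sets, aes(year, num_parts)) +
    # Jitter and fade the points to reduce overplotting
    geom_jitter(alpha = 0.2, size = .5) +
    geom_smooth(color = "skyblue") +
    # Log scale, since part counts span several orders of magnitude
    scale_y_continuous(trans = "log10") +
    coord_fixed(ratio = 10) +
    labs(x = "Year", y = "Number of parts",
         title = "LEGO sets are getting larger over the years",
         caption = "Data source: rebrickable.com") +
    theme_light()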
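
Because log10(0) is -Inf, the sets recorded with 0 parts have no finite position on the log-scaled y-axis. One option is to drop them explicitly before plotting, so the trend is based only on sets with at least one part. The following is a minimal sketch on hypothetical toy data (`toy_sets` and `sets_with_parts` are names introduced here, not part of the original analysis):

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data standing in for the real `sets` table
toy_sets <- tibble(
    year      = c(1949L, 1980L, 2001L, 2015L, 2022L),
    num_parts = c(0L, 120L, 0L, 300L, 1500L)
)

# Keep only sets with at least one part, so log10() stays finite
sets_with_parts <- toy_sets |>
    filter(num_parts > 0)

# Same kind of plot as above, on the filtered data
p <- ggplot(sets_with_parts, aes(year, num_parts)) +
    geom_point(alpha = 0.5) +
    scale_y_log10() +
    labs(x = "Year", y = "Number of parts (log scale)") +
    theme_light()
```

Whether to exclude these sets is a judgment call: filtering removes rows with no finite log value, but it also changes which sets the trend describes, so the choice is worth stating alongside the plot.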