class: center, middle, inverse, title-slide

# Lab 01b: BMI 5/625
## Introducing ggplot
### Steven Bedrick

---

## Introducing `ggplot`

* Goal for this session: a very quick `ggplot` refresher

--

* Meet our dataset:

--

  * [Palmer Penguins](https://allisonhorst.github.io/palmerpenguins/)

--

```r
glimpse(penguins)
```

```
Rows: 344
Columns: 8
$ species           <fct> Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Adel…
$ island            <fct> Torgersen, Torgersen, Torgersen, Torgersen, Torgerse…
$ bill_length_mm    <dbl> 39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.9, 39.2, 34.1, …
$ bill_depth_mm     <dbl> 18.7, 17.4, 18.0, NA, 19.3, 20.6, 17.8, 19.6, 18.1, …
$ flipper_length_mm <int> 181, 186, 195, NA, 193, 190, 181, 195, 193, 190, 186…
$ body_mass_g       <int> 3750, 3800, 3250, NA, 3450, 3650, 3625, 4675, 3475, …
$ sex               <fct> male, female, female, NA, female, male, female, male…
$ year              <int> 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007…
```

---

## Sidebar: Why not Fisher's Irises?

<img src="01b-slides_files/figure-html/unnamed-chunk-2-1.png" width="80%" style="display: block; margin: auto;" />

--

* Hint: The original citation for that data is:

--

  * R. A. Fisher (1936) "The use of multiple measurements in taxonomic problems." _Annals of Eugenics_ 7(2): 179-188

--

* 🤦

---

## So: Penguins!

```r
glimpse(penguins)
```

```
Rows: 344
Columns: 8
$ species           <fct> Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Adel…
$ island            <fct> Torgersen, Torgersen, Torgersen, Torgersen, Torgerse…
$ bill_length_mm    <dbl> 39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.9, 39.2, 34.1, …
$ bill_depth_mm     <dbl> 18.7, 17.4, 18.0, NA, 19.3, 20.6, 17.8, 19.6, 18.1, …
$ flipper_length_mm <int> 181, 186, 195, NA, 193, 190, 181, 195, 193, 190, 186…
$ body_mass_g       <int> 3750, 3800, 3250, NA, 3450, 3650, 3625, 4675, 3475, …
$ sex               <fct> male, female, female, NA, female, male, female, male…
$ year              <int> 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007…
```

---

## Core `ggplot` concepts

--

* _Aesthetics_ map dimensions of our data to visual properties of the plot

--

* _Geometries_ ("geoms") actually put "ink on the page"

--

  * Each type of graph (scatterplot, etc.) has a corresponding geom

--

  * Different geoms attend to different aesthetics

--

      * e.g., `geom_point` cares about `x` and `y`

--

  * Multiple geoms can be combined on the same plot

--

* _Scales_ control axes, fills, etc.

--

* _Themes_ control visual properties (fonts, background colors, etc.)

---
class: center, middle

## `ggplot` is an _opinionated_ tool!

---

## Data considerations

--

* By default, `ggplot` assumes "tidy" data

--

* Think: one "row" per mark on the graph...

--

* ... and then the various properties for each mark in columns.

---

## Our first plot

```r
penguins %>%
  ggplot(mapping=aes(
    x=bill_length_mm,
    y=bill_depth_mm
  )) +
  geom_point()
```

--

<img src="01b-slides_files/figure-html/unnamed-chunk-5-1.png" width="80%" style="display: block; margin: auto;" />

---

## What about color?

```r
penguins %>%
  ggplot(mapping=aes(
    x=bill_length_mm,
    y=bill_depth_mm,
    color=species
  )) +
  geom_point()
```

--

<img src="01b-slides_files/figure-html/unnamed-chunk-6-1.png" width="80%" style="display: block; margin: auto;" />

---

## Want a different plot?
Try a different geom

```r
penguins %>%
  ggplot(mapping=aes(
    x=bill_length_mm,
    fill=species
  )) +
  geom_histogram()
```

<img src="01b-slides_files/figure-html/unnamed-chunk-7-1.png" width="80%" style="display: block; margin: auto;" />

---

## We can specify aesthetic values by hand:

```r
penguins %>%
  ggplot(
    mapping=aes(x=bill_length_mm, fill=species)
  ) +
  geom_histogram(alpha=0.6)
```

<img src="01b-slides_files/figure-html/unnamed-chunk-8-1.png" width="80%" style="display: block; margin: auto;" />

---

## Different geoms have different options:

```r
penguins %>%
  ggplot(
    mapping=aes(x=bill_length_mm, fill=species)
  ) +
  geom_histogram(alpha=0.6)
```

<img src="01b-slides_files/figure-html/unnamed-chunk-9-1.png" width="80%" style="display: block; margin: auto;" />

---

## Different geoms have different options:

```r
penguins %>%
  ggplot(
    mapping=aes(x=bill_length_mm, fill=species)
  ) +
  geom_histogram(alpha=0.6, position="identity")
```

<img src="01b-slides_files/figure-html/unnamed-chunk-10-1.png" width="80%" style="display: block; margin: auto;" />

---

## Customizing other aspects of the plot

```r
penguins %>%
  ggplot(
    mapping=aes(x=bill_length_mm, fill=species)
  ) +
  geom_histogram(alpha=0.6, position="identity") +
  labs(x="Bill Length (mm)", y="Frequency") +
  ggtitle("Bill Length, by species")
```

<img src="01b-slides_files/figure-html/unnamed-chunk-11-1.png" width="80%" style="display: block; margin: auto;" />

---

## Workflow tip: Save a plot for later

```r
basic.plot <- penguins %>%
  ggplot(aes(x=bill_length_mm, y=bill_depth_mm)) +
  geom_point()
```

---

## Workflow tip: Save a plot for later

```r
basic.plot + labs(
  x="Bill Length (mm)",
  y="Bill Depth (mm)"
)
```

<img src="01b-slides_files/figure-html/unnamed-chunk-13-1.png" width="300px" height="300px" style="display: block; margin: auto;" />

--

* This is useful when building a complex plot!
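
---

## Workflow tip: Build on a saved plot

A minimal sketch of how a saved plot can grow one step at a time. It assumes `penguins` comes from the `palmerpenguins` package. The names `penguins_complete`, `labeled.plot`, and `colored.plot` are made up for illustration, and `theme_minimal()` is just one possible theme choice.

```r
library(ggplot2)
library(palmerpenguins)  # assumption: this package supplies `penguins`

# Keep only rows with both bill measurements, so geom_point() has no NAs to drop
penguins_complete <- subset(
  penguins,
  !is.na(bill_length_mm) & !is.na(bill_depth_mm)
)

# Step 1: save the bare scatterplot
basic.plot <- ggplot(
  penguins_complete,
  aes(x=bill_length_mm, y=bill_depth_mm)
) +
  geom_point()

# Step 2: add axis labels; basic.plot itself is left unchanged
labeled.plot <- basic.plot +
  labs(x="Bill Length (mm)", y="Bill Depth (mm)")

# Step 3: add a color mapping and a theme on top of the labeled version
colored.plot <- labeled.plot +
  aes(color=species) +
  theme_minimal()

print(colored.plot)
```

* Each step returns a new plot object, so earlier versions stay available for comparison.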