class: center, middle, inverse, title-slide

# Getting Started in R
an introduction to data analysis and visualisation
## Visualising Data
### Réka Solymosi, Sam Langton & Emily Buehler
### 4 July 2019

---
class: inverse, center, middle

# Data viz

---

### The grammar of graphics

![](img/gg1.png)

[Wickham, H. (2010). A layered grammar of graphics. Journal of Computational and Graphical Statistics, 19(1), 3-28.](http://vita.had.co.nz/papers/layered-grammar.pdf)

---

### The grammar of graphics

![](img/gg2.png)

[Wickham, H. (2010). A layered grammar of graphics. Journal of Computational and Graphical Statistics, 19(1), 3-28.](http://vita.had.co.nz/papers/layered-grammar.pdf)

---

### The grammar of graphics

![](img/gg3.png)

[Wickham, H. (2010). A layered grammar of graphics. Journal of Computational and Graphical Statistics, 19(1), 3-28.](http://vita.had.co.nz/papers/layered-grammar.pdf)

---

### The grammar of graphics

![](img/layers.png)

[Wickham, H. (2010). A layered grammar of graphics. Journal of Computational and Graphical Statistics, 19(1), 3-28.](http://vita.had.co.nz/papers/layered-grammar.pdf)

---

### The grammar of graphics

![](img/combined.png)

[Wickham, H. (2010). A layered grammar of graphics. Journal of Computational and Graphical Statistics, 19(1), 3-28.](http://vita.had.co.nz/papers/layered-grammar.pdf)

---

### The grammar of graphics

![](https://www.science-craft.com/wp-content/uploads/2014/06/ggplot-1.png)

---

### Creating a ggplot

```r
library(ggplot2)
```

```r
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))
```

<img src="viz_slides_files/figure-html/unnamed-chunk-2-1.png" height="400px" />

---

### Pseudocode

```r
ggplot(data = <DATA>) +
  <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))
```

---

### 3 variables to 1 plot: `colour =`

```r
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, colour = class))
```

<img src="viz_slides_files/figure-html/unnamed-chunk-4-1.png" height="400px" />

---

### 3 variables to 1 plot: `size =`

```r
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, size = class))
```

```
## Warning: Using size for a discrete variable is not advised.
```

<img src="viz_slides_files/figure-html/unnamed-chunk-5-1.png" height="400px" />

---

### 3 variables to 1 plot: `shape =`

```r
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, shape = class))
```

```
## Warning: The shape palette can deal with a maximum of 6 discrete values
## because more than 6 becomes difficult to discriminate; you have 7.
## Consider specifying shapes manually if you must have them.
```

```
## Warning: Removed 62 rows containing missing values (geom_point).
```

<img src="viz_slides_files/figure-html/unnamed-chunk-6-1.png" height="330px" />

---

### 3 variables to 1 plot: `facet_wrap()`

```r
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  facet_wrap(~ class, nrow = 2)
```

<img src="viz_slides_files/figure-html/unnamed-chunk-7-1.png" height="400px" />

---

### Geoms

How are these two plots similar?

<img src="http://r4ds.had.co.nz/visualize_files/figure-html/unnamed-chunk-18-1.png" width="400px" /><img src="http://r4ds.had.co.nz/visualize_files/figure-html/unnamed-chunk-18-2.png" width="400px" />

---

### Geoms

```r
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  geom_smooth(mapping = aes(x = displ, y = hwy))
```

<img src="viz_slides_files/figure-html/unnamed-chunk-9-1.png" height="400px" />

---

### Geoms ++

![](https://www.rstudio.com/wp-content/uploads/2016/11/ggplot2-cheatsheet-2-1.png)

---

### Geoms ++

[Online companions](https://www.trafforddatalab.io/graphics_companion/)

![](img/gg_online.png)

---

### Making this plot

![](viz_slides_files/figure-html/unnamed-chunk-10-1.png)<!-- -->

---

### Data + aesthetics + geom

```r
bp <- ggplot(data = PlantGrowth) +
  geom_boxplot(mapping = aes(x = group, y = weight))
bp
```

---

### Initial tweaks

```r
bp <- ggplot(data = PlantGrowth) +
  geom_boxplot(mapping = aes(x = group, y = weight, fill = group),
               size = 1.2, alpha = 0.8)
bp
```

---

### Axes

```r
bp + scale_x_discrete(labels = c("control", "treat1", "treat2"))
```

```r
# Hide x tick marks, labels, and grid lines
bp + scale_x_discrete(breaks = NULL)
```

---

### Labels

```r
bp + labs(title = "Figure 1: group distributions", x = " ")
```

---

### Colour brewers

```r
bp + scale_fill_brewer(palette = "Accent")
```

---

### Themes

```r
bp + theme_minimal()
```

---

### Themes (specific options)

```r
bp + theme(legend.position = "none",
           axis.text.x = element_text(color = "#707070", size = 12),
           axis.title.x = element_text(size = 14),
           axis.text.y = element_text(color = "#707070", size = 10, angle = 45))
```

---

### Making this plot

```r
bp <- ggplot(data = PlantGrowth) +
  geom_boxplot(mapping = aes(x = group, y = weight, fill = group),
               size = 1.2, alpha = 0.8) +
  scale_x_discrete(labels = c("control", "treat1", "treat2")) +
  labs(title = "Figure 1: group distributions", x = " ") +
  scale_fill_brewer(palette = "Accent") +
  theme_minimal() +
  theme(legend.position = "none",
        plot.title = element_text(size = 18),
        axis.text.x = element_text(color = "#707070", size = 12),
        axis.title.y = element_text(size = 16),
        axis.text.y = element_text(color = "#707070", size = 10, angle = 45))
bp
```

---

### Making this plot interactive

```r
library(plotly)  # provides ggplotly()
ggplotly(bp)
```

---

### Making this plot interactive
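
A minimal sketch of the conversion, assuming the `plotly` package is installed (it is the package that provides `ggplotly()`). A simplified version of the boxplot is rebuilt here so the chunk runs on its own, and the name `bp_interactive` is only illustrative:

```r
library(ggplot2)
library(plotly)  # assumption: installed separately, e.g. install.packages("plotly")

# Simplified version of the boxplot built in the earlier slides
bp <- ggplot(data = PlantGrowth) +
  geom_boxplot(mapping = aes(x = group, y = weight, fill = group))

# Convert the static ggplot object into an interactive htmlwidget
bp_interactive <- ggplotly(bp)
bp_interactive
```

Printing the result in RStudio or in a knitted HTML document should show a version of the plot with hover tooltips and zooming.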
---

### RColorBrewer

[Sequential, diverging and qualitative colour scales from colorbrewer.org](http://ggplot2.tidyverse.org/reference/scale_brewer.html) & [Colour summaries from http://www.cookbook-r.com](http://www.cookbook-r.com/Graphs/Colors_(ggplot2))

![](img/gg_colours.png)

---

### It's all the same!

![](img/gg_examples.png)
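
---

### It's all the same!

One way to read this, following the template from the Pseudocode slide: the data and the aesthetic mappings stay fixed, and only the geom changes. A small sketch using the built-in `mpg` data; the object names `base`, `p_box`, `p_violin` and `p_points` are illustrative only:

```r
library(ggplot2)

# Shared data and aesthetic mappings, declared once
base <- ggplot(data = mpg, mapping = aes(x = class, y = hwy))

# Same grammar each time; only the geom layer differs
p_box    <- base + geom_boxplot()
p_violin <- base + geom_violin()
p_points <- base + geom_jitter(width = 0.2)

p_box
```

Swapping `p_box` for `p_violin` or `p_points` on the last line draws the same variables with a different geometric object.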