class: center, middle, inverse, title-slide

# Data Visualisation in R
## with ggplot2
### Jens Hüsers
### 2018/17/08 (updated: 2018-10-19)

---

<link rel="stylesheet" href="https://use.fontawesome.com/releases/v5.2.0/css/all.css" integrity="sha384-hWVjflwFxL6sNzntih27bfxkr27PmbbK/iSvJ+a4+0owXq79v+lsFkW54bOGbiDQ" crossorigin="anonymous">

---
class: center, middle

# Why is visualisation important?

<img src="slides-ggplot-gmds18_files/figure-html/unnamed-chunk-1-1.png" width="576" />

---
class: middle, center

# Why is visualisation important?

<img src="slides-ggplot-gmds18_files/figure-html/unnamed-chunk-2-1.png" width="576" />

---
class: middle, center

# Why is visualisation important?

<img src="slides-ggplot-gmds18_files/figure-html/unnamed-chunk-3-1.png" width="576" />

---
class: middle, center

# Why is visualisation important?

<img src="slides-ggplot-gmds18_files/figure-html/unnamed-chunk-4-1.png" width="576" />

---
class: middle, center

# Why is visualisation important?

<img src="slides-ggplot-gmds18_files/figure-html/unnamed-chunk-5-1.png" width="576" />

---
class: middle, center

# Why is visualisation important?

<img src="slides-ggplot-gmds18_files/figure-html/unnamed-chunk-6-1.png" width="576" />

---
class: center, middle

.very-large[.red[gg]plot]

<br>

.very-large[.red[G]rammar] .very-large[of] .very-large[.red[G]raphics]

---

# Analogy - Written Grammar

.pull-left[
<img src="img/lazy-dog.png" width="566" style="display: block; margin: auto;" />
]

.pull-right[

- Grammar of Graphics
- Plotting **Framework**
- Leland Wilkinson, 1999
- 2 Principles
  - Graphics: Distinct layers of grammatical elements (like adjectives and nouns)
  - Meaningful plots through aesthetic mapping (grammatical rules for how to assemble the "vocabulary")

]

---
class: middle, center

# Overview of Grammatical Elements

Element | Description
------------- | -------------
.small-code[.code-orange[Data]] | The dataset being plotted
.small-code[.code-green[Aesthetics]] | The scales onto which we map our data
.small-code[.code-blue[Geometrics]] | The visual elements used for our data
.small-code[Facet] | Plotting small multiples
.small-code[Statistic] | Representations of our data to aid understanding
.small-code[Coordinates] | The space on which the data will be plotted
.small-code[Themes] | All non-data ink

---
class: middle, center

.large-code[**ggplot2**]

Essential grammatical elements

.large-code[.code-orange[data]] .large-code[+] .large-code[.code-green[aesthetics]] .large-code[+] .large-code[.code-blue[geometrics]]

Optional grammatical elements

.large-code[Facet] .large-code[Statistic] .large-code[Coordinates] .large-code[Themes]

---
class: middle, center

# Overview

.large-code[.code-orange[Data]] | .large-code[+] | .large-code[.code-blue[aesthetics]] | .large-code[+] | .large-code[.code-green[geometrics]]
------------- | ------|----------|------------|-------------------------
.small-code[.code-orange[Dataframe with variable]] | | .small-code[.code-blue[x-Axis]] | | .small-code[.code-green[Point]]
.small-code[.code-orange[of interest]] | | .small-code[.code-blue[y-Axis]] | | .small-code[.code-green[Line]]
 | | .small-code[.code-blue[Color]] | | .small-code[.code-green[Histogram]]
 | | .small-code[.code-blue[Size]] | | .small-code[.code-green[Boxplot]]
 | | .small-code[.code-blue[Shape]] | | .small-code[.code-green[...]]
 | | .small-code[.code-blue[Fill]] | | .small-code[.code-green[]]
 | | .small-code[.code-blue[...]] | | .small-code[.code-green[]]

---

# How to speak with the Grammar of Graphics

## ggplot Syntax

![](img/ggplot-syntax-plain.png)

---

# How to speak with the Grammar of Graphics

## ggplot Syntax

![](img/ggplot-syntax-explained.png)

---

# Scatterplot

```r
library(ggplot2)
library(gapminder)  # provides the gapminder data frame

ggplot(data = gapminder,
       mapping = aes(x = lifeExp, y = gdpPercap)) +
  geom_point()
```

.center[
<img src="slides-ggplot-gmds18_files/figure-html/unnamed-chunk-9-1.png" width="576" />
]

---

# Scatterplot - adding further .code-green[aesthetics]

```r
# Map continent to point colour
ggplot(data = gapminder,
       mapping = aes(x = lifeExp, y = gdpPercap,
                     color = continent)) +
  geom_point()
```

.center[
<img src="slides-ggplot-gmds18_files/figure-html/unnamed-chunk-11-1.png" width="576" />
]

---

# Scatterplot - adding further .code-green[aesthetics]

```r
# Additionally map population to point size
ggplot(data = gapminder,
       mapping = aes(x = lifeExp, y = gdpPercap,
                     color = continent, size = pop)) +
  geom_point()
```

.center[
<img src="slides-ggplot-gmds18_files/figure-html/unnamed-chunk-13-1.png" width="576" />
]

---

# Barplots

```r
# lifeExp_cat is not a gapminder column; it is assumed to be a
# categorical life-expectancy variable derived beforehand
ggplot(data = gapminder,
       mapping = aes(x = continent, fill = lifeExp_cat)) +
  geom_bar()
```

.center[
<img src="slides-ggplot-gmds18_files/figure-html/unnamed-chunk-15-1.png" width="576" />
]

---

# Boxplots

```r
ggplot(data = gapminder,
       mapping = aes(x = continent, y = lifeExp)) +
  geom_errorbar(stat = "boxplot") +  # adds caps to the whisker ends
  geom_boxplot()
```

<img src="slides-ggplot-gmds18_files/figure-html/unnamed-chunk-16-1.png" width="576" />

---

# Histogram and Density Plots

.pull-left[
```r
library(magrittr)  # provides the %>% pipe

gapminder %>%
  ggplot(mapping = aes(x = lifeExp)) +
  geom_histogram()
```
]

.pull-right[
<img src="slides-ggplot-gmds18_files/figure-html/unnamed-chunk-18-1.png" width="576" />
]

---

# Histogram and Density Plots

.pull-left[
```r
gapminder %>%
  ggplot(mapping = aes(x = lifeExp,
*                      fill = continent)) +
  geom_histogram() +
* viridis::scale_fill_viridis(discrete = TRUE) +
* theme_minimal()
```
]

.pull-right[
<img src="slides-ggplot-gmds18_files/figure-html/unnamed-chunk-20-1.png" width="576" />
]

---
background-image: url(img/ignaz_semmelweis_1860.jpeg)
background-size: 200px
background-position: 95% 30%

# Timeline Data

.pull-left-wide[
This time series analysis is based on data collected by Ignaz Semmelweis between 1841 and 1849. Semmelweis observed infant mortality in several clinics and noticed a relationship between the hygiene conditions of individual clinics and their newborn mortality rates. The better the physicians' hygiene, in particular hand washing before contact with the newborns, the lower the mortality. Semmelweis published his observations as tables and shared his conclusions with his colleagues in an open letter. They ignored his instructions, however. Data visualisation was still uncommon at the time. His observations might have received more attention from his colleagues had they been summarised in a time series plot.
]

<img src="slides-ggplot-gmds18_files/figure-html/unnamed-chunk-21-1.png" width="864" />

---
class: middle, center

# Publication-Ready Plots

<img src="slides-ggplot-gmds18_files/figure-html/unnamed-chunk-22-1.png" width="576" />

---
class: middle, center

# Publication-Ready Plots

<img src="slides-ggplot-gmds18_files/figure-html/unnamed-chunk-23-1.png" width="432" />

---
class: middle, center
background-image: url(../img/logo-hs-os-kompakt.jpg)

# Contact

| | |
| :--- | :--- |
| <a href="mailto:j.huesers@hs-osnabrueck.de">.black[<i class="fas fa-envelope"></i>]</a> | j.huesers@hs-osnabrueck.de |
| <a href="">.black[<i class="fa fa-link fa-fw"></i>]</a> | sciphy-stats.com |
| <a href="">.black[<i class="fab fa-twitter"></i>]</a> | @jnshsrs |
| <a href="http://github.com/jnshsrs">.black[<i class="fab fa-github"></i>]</a> | @jnshsrs |