NYGC/AMNH Workshop on Microbial Ecology - Introduction to R

Claus Wilke remixed by Joseph Elsherbini
2022/08/24

Aesthetic Mapping Exercise

Introduction

In this worksheet, we will discuss a core concept of ggplot, the mapping of data values onto aesthetics.

We will be using the R package tidyverse, which loads the ggplot2 package with its ggplot() function and related functions.

Make a new R Markdown document called aesthetic_mapping_exercise.Rmd, and copy the following code chunk:


```{r}
library(tidyverse)

temperatures <- read_csv("https://wilkelab.org/SDS375/datasets/tempnormals.csv") %>%
  mutate(
    location = factor(
      location, levels = c("Death Valley", "Houston", "San Diego", "Chicago")
    )
  ) %>%
  select(location, day_of_year, month, temperature)

temps_houston <- filter(temperatures, location == "Houston")
```
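The factor() call in the chunk above fixes the order of the four locations, and ggplot uses that order for legend entries and discrete axes. As a minimal sketch of what the levels argument does, here is the same idea on a made-up character vector (the names `cities` and `cities_ordered` are hypothetical and not part of the dataset):

```{r}
# Hypothetical vector for illustration; not part of the temperature dataset
cities <- c("Houston", "Chicago", "Death Valley", "San Diego")

# Without explicit levels, factor() sorts the levels alphabetically
levels(factor(cities))

# With explicit levels, the order we specify is kept
cities_ordered <- factor(
  cities, levels = c("Death Valley", "Houston", "San Diego", "Chicago")
)
levels(cities_ordered)
```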

Whenever you see a code chunk, copy it into your document, and if there are blanks, try filling them in to complete the prompt. You can also copy the prose between code chunks, or add your own notes.

The dataset we will be working with contains the average temperature for each day of the year for Houston, TX:


```{r}
temps_houston
```

# A tibble: 366 × 4
   location day_of_year month temperature
   <fct>          <dbl> <chr>       <dbl>
 1 Houston            1 01           53.9
 2 Houston            2 01           53.8
 3 Houston            3 01           53.8
 4 Houston            4 01           53.8
 5 Houston            5 01           53.8
 6 Houston            6 01           53.7
 7 Houston            7 01           53.7
 8 Houston            8 01           53.7
 9 Houston            9 01           53.7
10 Houston           10 01           53.7
# … with 356 more rows

Basic use of ggplot

In the most basic use of ggplot, we call the ggplot() function with a dataset and an aesthetic mapping (created with aes()), and then we add a geom, such as geom_line() to draw lines or geom_point() to draw points.
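To see the pattern on a different dataset first, here is a small sketch that uses `women`, a dataset that ships with R (columns height and weight). The choice of dataset and the name `p_women` are just for illustration; loading ggplot2 directly is enough here, although library(tidyverse) loads it as well.

```{r}
library(ggplot2)

# `women` ships with R; it has two numeric columns, height and weight
p_women <- ggplot(women, aes(x = height, y = weight)) +  # data + aesthetic mapping
  geom_line()                                            # geom that draws the data as a line

# Typing the name of the plot object displays it
p_women
```

The three pieces are always the same: a dataset, an aes() mapping from columns to aesthetics, and at least one geom added with +.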

Try this for yourself. Map the column day_of_year onto the x axis and the column temperature onto the y axis, and use geom_line() to display the data.


```{r ggplot}
ggplot(temps_houston, aes(x = ___, y = ___)) +
  ___()
```

Try again. Now use geom_point() instead of geom_line().


```{r ggplot2}
ggplot(temps_houston, aes(x = day_of_year, y = temperature)) +
  ___()
```

And now swap which column you map to x and which to y.


```{r ggplot3}
ggplot(temps_houston, aes(x = ___, y = ___)) +
  geom_point()
```

More complex geoms

You can use other geoms to make different types of plots. For example, geom_boxplot() will make boxplots. For boxplots, we frequently want categorical data on the x or y axis. For example, we might want a separate boxplot for each month. Try this out. Put month on the x axis, temperature on the y axis, and use geom_boxplot().


```{r ggplot-boxplot}
ggplot(temps_houston, aes(x = ___, y = ___)) +
  ___()
```

Now put the month on the y axis and the temperature on the x axis.


```{r ggplot-boxplot2}
ggplot(___) +
  ___()
```
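For comparison, the same two orientations can be sketched on `iris`, a dataset that ships with R, where Species is categorical and Sepal.Length is numeric. The dataset and the object names are chosen only for illustration. Swapping x and y to get horizontal boxplots assumes a reasonably recent ggplot2 (version 3.3.0 or later).

```{r}
library(ggplot2)

# Categorical variable on x: one vertical box per species
p_vertical <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot()

# Categorical variable on y: the same boxes, drawn horizontally
p_horizontal <- ggplot(iris, aes(x = Sepal.Length, y = Species)) +
  geom_boxplot()

p_vertical
p_horizontal
```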

Adding color

Next we will be working with the dataset temperatures, which is similar to temps_houston but contains data for three more locations:


```{r temperatures}
temperatures
```

# A tibble: 1,464 × 4
   location     day_of_year month temperature
   <fct>              <dbl> <chr>       <dbl>
 1 Death Valley           1 01           51  
 2 Death Valley           2 01           51.2
 3 Death Valley           3 01           51.3
 4 Death Valley           4 01           51.4
 5 Death Valley           5 01           51.6
 6 Death Valley           6 01           51.7
 7 Death Valley           7 01           51.9
 8 Death Valley           8 01           52  
 9 Death Valley           9 01           52.2
10 Death Valley          10 01           52.3
# … with 1,454 more rows

Make a line plot of temperature against day_of_year, using the color aesthetic to color the lines by location.


```{r ggplot-color}
ggplot(temperatures, aes(x = ___, y = ___, color = ___)) +
  ___()
```

Try again, this time mapping location onto the y axis and temperature onto the color aesthetic. This plot requires geom_point() to look good.


```{r ggplot-color2}
ggplot(___) +
  ___()
```

(Hint: Try geom_point(size = 5) to create larger points.)
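The two uses of color in this section can also be sketched on `Orange`, a dataset that ships with R (columns Tree, age, and circumference). The dataset and the object names are assumptions made for illustration only. In the first plot a categorical column is mapped to color, so each tree gets its own line; in the second a numeric column is mapped to color, so ggplot uses a continuous color gradient.

```{r}
library(ggplot2)

# Categorical column on color: one colored line per tree
p_lines <- ggplot(Orange, aes(x = age, y = circumference, color = Tree)) +
  geom_line()

# Numeric column on color, categorical column on y: large colored points
p_points <- ggplot(Orange, aes(x = age, y = Tree, color = circumference)) +
  geom_point(size = 5)

p_lines
p_points
```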

Using the fill aesthetic

Some geoms use a fill aesthetic, which is similar to color but applies to shaded areas. (color applies to lines and points.) For example, we can use the fill aesthetic with geom_boxplot() to color the interior of the box. Try this yourself. Plot month on x, temperature on y, and color the interior of the box by location.


```{r ggplot-fill}
ggplot(temperatures, ___) +
  ___()
```

Can you color the lines of the boxplot by location and the interior by month? Try it.


```{r ggplot-color-fill}
ggplot(temperatures, ___) +
  geom_boxplot()
```
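As a point of comparison, here is a minimal sketch of fill versus color on `ToothGrowth`, a dataset that ships with R (columns len, supp, and dose). The dataset and object names are chosen for illustration; dose is numeric, so it is wrapped in factor() to get one box per dose.

```{r}
library(ggplot2)

# fill colors the interior of each box by supplement type
p_fill <- ggplot(ToothGrowth, aes(x = factor(dose), y = len, fill = supp)) +
  geom_boxplot()

# color changes the outline of each box instead of the interior
p_outline <- ggplot(ToothGrowth, aes(x = factor(dose), y = len, color = supp)) +
  geom_boxplot()

p_fill
p_outline
```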

Using aesthetics as parameters

Many aesthetics (such as color, fill, and size, which changes point size or line thickness) can be used as parameters inside a geom rather than inside an aes() statement. The difference is that when you use an aesthetic as a parameter, you set one fixed value, such as color = "blue", rather than a mapping, such as aes(color = location). Notice the difference: inside the aes() function, we don’t specify the actual color values; ggplot chooses them for us. We only say that we want the data values of the location column to correspond to different colors. (We will learn later how to tell ggplot to use specific colors in this mapping.)
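The contrast between a mapping and a parameter can be sketched with a scatter plot of `iris`, a dataset that ships with R. The dataset and the names `p_mapped` and `p_fixed` are assumptions made for illustration and are not part of the worksheet data.

```{r}
library(ggplot2)

# Mapping: color depends on the data, and ggplot picks one color per species
p_mapped <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point()

# Parameter: every point gets the same fixed color and size, set inside the geom
p_fixed <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(color = "blue", size = 3)

p_mapped
p_fixed
```

The first plot has a legend, because the colors carry information about the data. The second has none, because the color is a constant.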

Try this with the boxplot example from the previous section. Map location onto the fill aesthetic but set the color of the lines to "navyblue".


```{r ggplot-params}
ggplot(temperatures, ___) +
  ___(___)
```

Now do the reverse. Map location onto the line colors but fill the box with the color "navyblue".


```{r ggplot-params2}
ggplot(temperatures, ___) +
  ___(___)
```