Chapter 12 Modify Legend

In this chapter, we will focus on modifying the appearance of legend of plots when the aesthetics are mapped to variables.

12.1 Color

We will learn to modify the following when color is mapped to categorical variables:

  • title
  • breaks
  • limits
  • labels
  • values

Basic Plot

Let us start with a scatter plot examining the relationship between displacement and miles per gallon from the mtcars data set. We will map the color of the points to the cyl variable.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, color = factor(cyl)))

As you can see, the legend acts as a guide for the color aesthetic. Now, let us learn to modify the different aspects of the legend.

Values

To change the default colors in the legend, use the values argument and supply a character vector of color names. The number of colors specified must be equal to the number of levels in the categorical variable mapped. In the below example, cyl has 3 levels (4, 6, 8) and hence we have specified 3 colors.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, color = factor(cyl))) +
  scale_color_manual(values = c("red", "blue", "green"))

Title

In the previous example, the title of the legend (factor(cyl)) is not very intuitive. If the user does not know the underlying data, they will not be able to make any sense out of it. Let us change it to Cylinders using the name argument.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, color = factor(cyl))) +
  scale_color_manual(name = "Cylinders", 
    values = c("red", "blue", "green"))

Now, the user will know that the different colors represent number of cylinders in the car.

Limits

Let us assume that we want to modify the data to be displayed i.e. instead of examining the relationship between mileage and displacement for all cars, we desire to look at only cars with at least 6 cylinders. One way to approach this would be to filter the data using filter from dplyr and then visualize it. Instead, we will use the limits argument and filter the data for visualization.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, color = factor(cyl))) +
  scale_color_manual(values = c("red", "blue", "green"), limits = c(6, 8))
## Warning: Continuous limits supplied to discrete scale.
## Did you mean `limits = factor(...)` or `scale_*_continuous()`?
## Warning: Removed 11 rows containing missing values (geom_point).

As you can see above, ggplot2 returns a warning message indicating data related to 4 cylinders has been dropped. If you observe the legend, it now represents only 4 and 6 cylinders.

Labels

The labels in the legend can be modified using the labels argument. Let us change the labels to Four, Six and Eight in the next example. Ensure that the labels are intuitive and easy to interpret for the end user of the plot.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, color = factor(cyl))) +
  scale_color_manual(values = c("red", "blue", "green"),
    labels = c('Four', 'Six', 'Eight'))

Breaks

When there are large number of levels in the mapped variable, you may not want the labels in the legend to represent all of them. In such cases, we can use the breaks argument and specify the labels to be used. In the below case, we use the breaks argument to ensure that the labels in legend represent two levels (4, 8) of the mapped variable.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, color = factor(cyl))) +
  scale_color_manual(values = c("red", "blue", "green"),
    breaks = c(4, 8))

Putting it all together…

ggplot(mtcars) +
  geom_point(aes(disp, mpg, color = factor(cyl))) +
  scale_color_manual(name = "Cylinders", values = c("red", "blue", "green"),
    labels = c('Four', 'Six', 'Eight'), limits = c(4, 6, 8), breaks = c(4, 6, 8))
## Warning: Continuous limits supplied to discrete scale.
## Did you mean `limits = factor(...)` or `scale_*_continuous()`?

12.2 Fill

we will learn to modify the following using scale_fill_manual() when fill is mapped to categorical variables:

  • title
  • breaks
  • limits
  • labels
  • values

Plot

Let us start with a scatter plot examining the relationship between displacement and miles per gallon from the mtcars data set. We will map fill to the cyl variable.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, fill = factor(cyl)), shape = 22)

As you can see, the legend acts as a guide for the color aesthetic. Now, let us learn to modify the different aspects of the legend.

Title

The title of the legend (factor(cyl)) is not very intuitive. If the user does not know the underlying data, they will not be able to make any sense out of it. Let us change it to Cylinders using the name argument.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, fill = factor(cyl)), shape = 22) +
  scale_fill_manual(name = "Cylinders", 
    values = c("red", "blue", "green"))

Values

To change the default colors in the legend, use the values argument and supply a character vector of color names. The number of colors specified must be equal to the number of levels in the categorical variable mapped. In the below example, cyl has 3 levels (4, 6, 8) and hence we have specified 3 colors.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, fill = factor(cyl)), shape = 22) +
  scale_fill_manual(values = c("red", "blue", "green"))

Labels

The labels in the legend can be modified using the labels argument. Let us change the labels to Four, Six and Eight in the next example. Ensure that the labels are intuitive and easy to interpret for the end user of the plot.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, fill = factor(cyl)), shape = 22) +
  scale_fill_manual(values = c("red", "blue", "green"),
    labels = c('Four', 'Six', 'Eight'))

Limits

Let us assume that we want to modify the data to be displayed i.e. instead of examining the relationship between mileage and displacement for all cars, we desire to look at only cars with at least 6 cylinders. One way to approach this would be to filter the data using filter from dplyr and then visualize it. Instead, we will use the limits argument and filter the data for visualization.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, fill = factor(cyl)), shape = 22) +
  scale_fill_manual(values = c("red", "blue", "green"),
    limits = c(6, 8))
## Warning: Continuous limits supplied to discrete scale.
## Did you mean `limits = factor(...)` or `scale_*_continuous()`?

As you can see above, ggplot2 returns a warning message indicating data related to 4 cylinders has been dropped. If you observe the legend, it now represents only 4 and 6 cylinders.

Breaks

When there are large number of levels in the mapped variable, you may not want the labels in the legend to represent all of them. In such cases, we can use the breaks argument and specify the labels to be used. In the below case, we use the breaks argument to ensure that the labels in legend represent two levels (4, 8) of the mapped variable.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, fill = factor(cyl)), shape = 22) +
  scale_fill_manual(values = c("red", "blue", "green"),
    breaks = c(4, 8))

Putting it all together…

ggplot(mtcars) +
  geom_point(aes(disp, mpg, fill = factor(cyl)), shape = 22) +
  scale_fill_manual(name = "Cylinders", values = c("red", "blue", "green"),
    labels = c('Four', 'Six', 'Eight'), limits = c(4, 6, 8), breaks = c(4, 6, 8))
## Warning: Continuous limits supplied to discrete scale.
## Did you mean `limits = factor(...)` or `scale_*_continuous()`?

12.3 Shape

We will learn to modify the following using scale_shape_manual when shape is mapped to categorical variables:

  • title
  • breaks
  • limits
  • labels
  • values

Plot

Let us start with a scatter plot examining the relationship between displacement and miles per gallon from the mtcars data set. We will map the shape of the points to the cyl variable.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, shape = factor(cyl)))

As you can see, the legend acts as a guide for the shape aesthetic. Now, let us learn to modify the different aspects of the legend.

Title

The title of the legend (factor(cyl)) is not very intuitive. If the user does not know the underlying data, they will not be able to make any sense out of it. Let us change it to Cylinders using the name argument.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, shape = factor(cyl))) +
  scale_shape_manual(name = "Cylinders", values = c(4, 12, 24))

If you have mapped shape/size to a discrete variable which has less than six categories, you can use scale_shape().

ggplot(mtcars) +
  geom_point(aes(disp, mpg, shape = factor(cyl))) +
  scale_shape(name = 'Cylinders')

Values

To change the default shapes in the legend, use the values argument and supply a numeric vector of shapes. The number of shapes specified must be equal to the number of levels in the categorical variable mapped. In the below example, cyl has 3 levels (4, 6, 8) and hence we have specified 3 different shapes.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, shape = factor(cyl))) +
  scale_shape_manual(values = c(4, 12, 24))

Labels

The labels in the legend can be modified using the labels argument. Let us change the labels to Four, Six and Eight in the next example. Ensure that the labels are intuitive and easy to interpret for the end user of the plot.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, shape = factor(cyl))) +
  scale_shape_manual(values = c(4, 12, 24), labels = c('Four', 'Six', 'Eight'))

Limits

Let us assume that we want to modify the data to be displayed i.e. instead of examining the relationship between mileage and displacement for all cars, we desire to look at only cars with at least 6 cylinders. One way to approach this would be to filter the data using filter from dplyr and then visualize it. Instead, we will use the limits argument and filter the data for visualization.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, shape = factor(cyl))) +
  scale_shape_manual(values = c(4, 24), limits = c(6, 8))
## Warning: Continuous limits supplied to discrete scale.
## Did you mean `limits = factor(...)` or `scale_*_continuous()`?
## Warning: Removed 11 rows containing missing values (geom_point).

As you can see above, ggplot2 returns a warning message indicating data related to 4 cylinders has been dropped. If you observe the legend, it now represents only 4 and 6 cylinders.

Breaks

When there are large number of levels in the mapped variable, you may not want the labels in the legend to represent all of them. In such cases, we can use the breaks argument and specify the labels to be used. In the below case, we use the breaks argument to ensure that the labels in legend represent two levels (4, 8) of the mapped variable.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, shape = factor(cyl))) +
  scale_shape_manual(values = c(4, 12, 24), breaks = c(4, 8))

Putting it all together…

ggplot(mtcars) +
  geom_point(aes(disp, mpg, shape = factor(cyl))) +
  scale_shape_manual(name = "Cylinders", labels = c('Six', 'Eight'),  
     values = c(4, 24), limits = c(6, 8), breaks = c(6, 8))
## Warning: Continuous limits supplied to discrete scale.
## Did you mean `limits = factor(...)` or `scale_*_continuous()`?
## Warning: Removed 11 rows containing missing values (geom_point).

12.4 Size

We will learn to modify the following using scale_size_continuous when size aesthetic is mapped to variables:

  • title
  • breaks
  • limits
  • range
  • labels
  • values

Plot

Let us start with a scatter plot examining the relationship between displacement and miles per gallon from the mtcars data set. We will map the size of the points to the hp variable. Remember, size must always be mapped to a continuous variable.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, size = hp))

As you can see, the legend acts as a guide for the size aesthetic. Now, let us learn to modify the different aspects of the legend.

Title

The title of the legend (hp) is not very intuitive. If the user does not know the underlying data, they will not be able to make any sense out of it. Let us change it to Horsepower using the name argument.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, size = hp)) +
  scale_size_continuous(name = "Horsepower")

Range

The range of the size of points can be modified using the range argument. We need to specify a lower and upper range using a numeric vector. In the below example, we use range and supply the lower and upper limits as 3 and 6. The size of the points will now lie between 3 and 6 only.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, size = hp)) +
  scale_size_continuous(range = c(3, 6))

Limits

Let us assume that we want to modify the data to be displayed i.e. instead of examining the relationship between mileage and displacement for all cars, we desire to look at only cars whose horsepower is between 100 and 350. One way to approach this would be to filter the data using filter from dplyr and then visualize it. Instead, we will use the limits argument and filter the data for visualization.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, size = hp)) +
  scale_size_continuous(limits = c(100, 350))
## Warning: Removed 9 rows containing missing values (geom_point).

Breaks

When the range of the variable mapped to size is large, you may not want the labels in the legend to represent all of them. In such cases, we can use the breaks argument and specify the labels to be used. In the below case, we use the breaks argument to ensure that the labels in legend represent certain midpoints (125, 200, 275) of the mapped variable.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, size = hp)) +
  scale_size_continuous(breaks = c(125, 200, 275))

Labels

The labels in the legend can be modified using the labels argument. Let us change the labels to “1 Hundred”, “2 Hundred” and “3 Hundred” in the next example. Ensure that the labels are intuitive and easy to interpret for the end user of the plot.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, size = hp)) +
  scale_size_continuous(breaks = c(100, 200, 300),
    labels = c("1 Hundred", "2 Hundred", "3 Hundred"))

Putting it all together…

ggplot(mtcars) +
  geom_point(aes(disp, mpg, size = hp)) +
  scale_size_continuous(name = "Horsepower", range = c(3, 6), 
    limits = c(0, 400), breaks = c(100, 200, 300),
    labels = c("1 Hundred", "2 Hundred", "3 Hundred"))

12.5 Transparency

We will learn to modify the following using scale_alpha_continuous() when alpha or transparency is mapped to variables:

  • title
  • breaks
  • limits
  • range
  • labels
  • values

Plot

Let us start with a scatter plot examining the relationship between displacement and miles per gallon from the mtcars data set. We will map the transparency of the points to the hp variable. Remember, alpha must always be mapped to a continuous variable.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, alpha = hp), color = 'blue')

As you can see, the legend acts as a guide for the alpha aesthetic. Now, let us learn to modify the different aspects of the legend.

Title

The title of the legend (hp) is not very intuitive. If the user does not know the underlying data, they will not be able to make any sense out of it. Let us change it to Horsepower using the name argument.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, alpha = hp), color = 'blue') +
  scale_alpha_continuous("Horsepower")

Breaks

When the range of the variable mapped to size is large, you may not want the labels in the legend to represent all of them. In such cases, we can use the breaks argument and specify the labels to be used. In the below case, we use the breaks argument to ensure that the labels in legend represent certain midpoints (125, 200, 275) of the mapped variable.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, alpha = hp), color = 'blue') +
  scale_alpha_continuous(breaks = c(125, 200, 275))

Limits

Let us assume that we want to modify the data to be displayed i.e. instead of examining the relationship between mileage and displacement for all cars, we desire to look at only cars whose horsepower is between 100 and 350. One way to approach this would be to filter the data using filter from dplyr and then visualize it. Instead, we will use the limits argument and filter the data for visualization.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, alpha = hp), color = 'blue') +
  scale_alpha_continuous(limits = c(100, 350))

Range

The range of the transparency of points can be modified using the range argument. We need to specify a lower and upper range using a numeric vector. In the below example, we use range and supply the lower and upper limits as 0.4 and 0.8. The transparency of the points will now lie between 0.4 and 0.8 only.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, alpha = hp), color = 'blue') +
  scale_alpha_continuous(range = c(0.4, 0.8))

Labels

The labels in the legend can be modified using the labels argument. Let us change the labels to “1 Hundred”, “2 Hundred” and “3 Hundred” in the next example. Ensure that the labels are intuitive and easy to interpret for the end user of the plot.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, alpha = hp), color = 'blue') +
  scale_alpha_continuous(breaks = c(100, 200, 300),
    labels = c("1 Hundred", "2 Hundred", 
      "3 Hundred"))

Putting it all together…

ggplot(mtcars) +
  geom_point(aes(disp, mpg, alpha = hp), color = 'blue') +
  scale_alpha_continuous("Horsepower", breaks = c(100, 200, 300),
    limits = c(100, 350), range = c(0.4, 0.8),
    labels = c("1 Hundred", "2 Hundred", "3 Hundred"))

12.6 Guide

In this section, we will learn to modify

  • title
  • label
  • and bar

So far, we have learnt to modify the components of a legend using scale_* family of functions. Now, we will use the guide argument and supply it values using the guide_legend() function.

Title

Title Alignment

The horizontal alignment of the title can be managed using the title.hjust argument. It can take any value between 0 and 1.

  • 0 (left)
  • 1 (right)

In the below example, we align the title to the center by assigning the value 0.5.

ggplot(mtcars) + geom_point(aes(disp, mpg, color = factor(cyl))) +
  scale_color_manual(values = c("red", "blue", "green"),
    guide = guide_legend(title = "Cylinders", title.hjust = 0.5))

Title Alignment (Vertical)

To manage the vertical alignment of the title, use title.vjust.

ggplot(mtcars) + geom_point(aes(disp, mpg, color = hp)) +
  scale_color_continuous(guide = guide_colorbar(
    title = "Horsepower", title.position = "top", title.vjust = 1))

Title Position

The position of the title can be managed using title.posiiton argument. It can be positioned at:

  • top
  • bottom
  • left
  • right
ggplot(mtcars) + geom_point(aes(disp, mpg, color = factor(cyl))) +
  scale_color_manual(values = c("red", "blue", "green"),
    guide = guide_legend(title = "Cylinders", title.hjust = 0.5, 
      title.position = "top"))

Label

Label Position

The position of the label can be managed using the label.position argument. It can be positioned at:

  • top
  • bottom
  • left
  • right

In the below example, we position the label at right.

ggplot(mtcars) + geom_point(aes(disp, mpg, color = factor(cyl))) +
  scale_color_manual(values = c("red", "blue", "green"),
    guide = guide_legend(label.position = "right"))

Label Alignment

The horizontal alignment of the label can be managed using the label.hjust argument. It can take any value between 0 and 1.

  • 0 (left)
  • 1 (right)

In the below example, we align the label to the center by assigning the value 0.5.

  • alignment
    • 0 (left)
    • 1 (right)
ggplot(mtcars) + geom_point(aes(disp, mpg, color = factor(cyl))) +
  scale_color_manual(values = c("red", "blue", "green"),
    guide = guide_legend(label.hjust = 0.5))

Labels Alignment (Vertical)

The vertical alignment of the label can be managed using the label.vjust argument.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, color = hp)) +
  scale_color_continuous(guide = guide_colorbar(
    label.vjust = 0.8))

Direction

The direction of the label can be either horizontal or veritcal and it can be set using the direction argument.

ggplot(mtcars) + geom_point(aes(disp, mpg, color = factor(cyl))) +
  scale_color_manual(values = c("red", "blue", "green"),
    guide = guide_legend(direction = "horizontal"))

Rows

The label can be spread across multiple rows using the nrow argument. In the below example, the label is spread across 2 rows.

ggplot(mtcars) + geom_point(aes(disp, mpg, color = factor(cyl))) +
  scale_color_manual(values = c("red", "blue", "green"),
    guide = guide_legend(nrow = 2))

Reverse

The order of the labels can be reversed using the reverse argument. We need to supply logical values i.e. either TRUE or FALSE. If TRUE, the order will be reversed.

ggplot(mtcars) + geom_point(aes(disp, mpg, color = factor(cyl))) +
  scale_color_manual(values = c("red", "blue", "green"),
    guide = guide_legend(reverse = TRUE))

Putting it all together…

ggplot(mtcars) + geom_point(aes(disp, mpg, color = factor(cyl))) +
  scale_color_manual(values = c("red", "blue", "green"),
    guide = guide_legend(title = "Cylinders", title.hjust = 0.5, 
      title.position = "top", label.position = "right", 
      direction = "horizontal", label.hjust = 0.5, nrow = 2, reverse = TRUE)
  )

Legend Bar

So far we have looked at modifying components of the legend when it acts as a guide for color, fill or shape i.e. when the aesthetics have been mapped to a categorical variable. In this section, you will learn about guide_colorbar() which will allow us to modify the legend when the aesthetics are mapped to a continuous variable.

Plot

Let us start with a scatter plot examining the relationship between displacement and miles per gallon from the mtcars data set. We will map the color of the points to the hp variable.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, color = hp))

Width

The width of the bar can be modified using the barwidth argument. It is used inside the guide_colorbar() function which itself is supplied to the guide argument of scale_color_continuous().

ggplot(mtcars) +
  geom_point(aes(disp, mpg, color = hp)) +
  scale_color_continuous(guide = guide_colorbar(
    barwidth = 10))

Height

Similarly, the height of the bar can be modified using the barheight argument.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, color = hp)) +
  scale_color_continuous(guide = guide_colorbar(
    barheight = 3))

Bins

The nbin argument allows us to specify the number of bins in the bar.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, color = hp)) +
  scale_color_continuous(guide = guide_colorbar(
    nbin = 4))

Ticks

The ticks of the bar can be removed using the ticks argument and setting it to FALSE.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, color = hp)) +
  scale_color_continuous(guide = guide_colorbar(
    ticks = FALSE))

Upper/Lower Limits

The upper and lower limits of the bars can be drawn or undrawn using the draw.ulim and draw.llim arguments. They both accept logical values.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, color = hp)) +
  scale_color_continuous(guide = guide_colorbar(
    draw.ulim = TRUE, draw.llim = FALSE))

12.6.0.1 Guides: Color, Shape & Size

The guides() function can be used to create multiple legends to act as a guide for color, shape, size etc. as shown below. First, we map color, shape and size to different variables. Next, in the guides() function, we supply values to each of the above aesthetics to indicate the type of legend.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, color = hp, 
    size = qsec, shape = factor(gear))) + 
  guides(color = "colorbar", shape = "legend", size = "legend")

Guides: Title

To modify the components of the different legends, we must use the guide_* family of functions. In the below example, we use guide_colorbar() for the legend acting as guide for color mapped to a continuous variable and guide_legend() for the legends acting as guide for shape/size mapped to categorical variables.

ggplot(mtcars) +
  geom_point(aes(disp, mpg, color = hp, size = wt, shape = factor(gear))) + 
  guides(color = guide_colorbar(title = "Horsepower"),
    shape = guide_legend(title = "Weight"), size = guide_legend(title = "Gear")
  )