Dual y-Axis
An R tutorial on how to create plots with dual y-axes
Introduction
In some cases, a graph with two y-axes is desired for visualizing two different sets of data. However, this is sometimes frowned upon since the required scaling of the data can be adjusted to fit the desired narrative.
With that said, there are still situations where dual y-axes are appropriate. This vignette will show you how to create them with ggplot2, despite the package authors' disagreement.
Prepare Data
## # A tibble: 214 × 4
## Date Day.Length Soil.Temperature Air.Temperature
## <date> <dbl> <dbl> <dbl>
## 1 2016-11-29 9.38 5.56 8.56
## 2 2016-11-30 9.35 5.34 8.74
## 3 2016-12-01 9.33 4.65 8.75
## 4 2016-12-02 9.31 4.17 8.68
## 5 2016-12-03 9.29 3.53 8.99
## 6 2016-12-04 9.27 3.66 10.9
## 7 2016-12-05 9.26 3.21 10.6
## 8 2016-12-06 9.24 2.89 9.21
## 9 2016-12-07 9.22 2.96 11.0
## 10 2016-12-08 9.21 3.02 9.90
## # ℹ 204 more rows
Single y-axis
First, let's create a plot of Air Temperature.
# Assumes ggplot2 is loaded and that xx, myCaption and theme_agData() are already defined
mp1 <- ggplot(xx, aes(x = Date)) +
  geom_line(aes(y = Air.Temperature, color = "1"),
            alpha = 0.7, size = 1.25) +
  scale_color_manual(name = NULL, values = "red", labels = "Air Temp") +
  scale_x_date(date_labels = "%b", date_breaks = "1 month") +
  labs(title = "Environmental Data", x = NULL,
       y = "Temperature (\u00B0C)", caption = myCaption) +
  theme_agData(legend.position = "bottom")
mp1
Now let's add a second set of data (Soil Temperature), which uses the same unit as the first (°C).
But when we add another data set (Day Length) that uses different units (hours), problems arise.
Data Scaling
In this case, the ranges of Day Length and Temperature are drastically different.
## [1] 31.9055
## [1] 5.78
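The code that produced the two values above is not shown. They appear to be the spread (maximum minus minimum) of each variable, which is an assumption on our part. A minimal sketch of that calculation, using a few values from the tibble printed earlier and the hypothetical names `temperature`, `day_length`, `temp_spread` and `day_spread`:

```r
# Toy vectors taken from the first rows of the tibble above (not the full data)
temperature <- c(8.56, 8.74, 8.75, 5.56, 5.34, 4.65)  # air + soil, in degrees C
day_length  <- c(9.38, 9.35, 9.33)                    # in hours

# Spread of each variable: max minus min
temp_spread <- diff(range(temperature))
day_spread  <- diff(range(day_length))

temp_spread
day_spread
```

On the full data set the same two calls would be applied to the complete columns rather than to these toy vectors.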
In order to present the data better, we need to rescale it:
$$y_{scaled} = (y_{2i} - \min(y_2)) \times \frac{\max(y_1) - \min(y_1)}{\max(y_2) - \min(y_2)} + \min(y_1)$$
where:
- y1 = Set of values you want to scale to
- y2 = Set of values to be rescaled to min and max of y1
- y2i = Value from the y2 set to be rescaled
In our case:
- y1 = Air + Soil Temperature
- y2 = Day Length
- y2i = Day Length on a specific day
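The rescaling can be sketched in base R as follows. This is an illustration only: the data frame holds just the first three rows of the tibble shown earlier, and the way `y1_min`, `y1_max`, `y2_min`, `y2_max` and `Day.Length_scaled` are computed here is an assumption about how the names used later in this vignette were defined.

```r
# Toy data: the first three rows of the tibble printed above
xx <- data.frame(
  Date             = as.Date(c("2016-11-29", "2016-11-30", "2016-12-01")),
  Day.Length       = c(9.38, 9.35, 9.33),
  Soil.Temperature = c(5.56, 5.34, 4.65),
  Air.Temperature  = c(8.56, 8.74, 8.75)
)

# y1: the values that define the target range (air + soil temperature)
y1 <- c(xx$Air.Temperature, xx$Soil.Temperature)
y1_min <- min(y1)
y1_max <- max(y1)

# y2: the values to be rescaled (day length)
y2_min <- min(xx$Day.Length)
y2_max <- max(xx$Day.Length)

# Apply the rescaling formula to every day-length value
xx$Day.Length_scaled <- (xx$Day.Length - y2_min) *
  (y1_max - y1_min) / (y2_max - y2_min) + y1_min

# The scaled values now span y1_min to y1_max
range(xx$Day.Length_scaled)
```

After this step the shortest day sits at the lowest temperature and the longest day at the highest, so both series share one plotting range.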
Scaling the data can also be done with the rescale function from the scales package.
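As a rough sketch of that alternative, `scales::rescale()` takes the vector to rescale and a `to` argument giving the target range. The toy values below are the first three day lengths and the temperature range of the first three tibble rows; the names `day_length` and `scaled` are hypothetical.

```r
# Requires the scales package (a dependency of ggplot2)
day_length <- c(9.38, 9.35, 9.33)  # hours, from the tibble above

# Map the observed day-length range onto the temperature range 4.65 to 8.75
scaled <- scales::rescale(day_length, to = c(4.65, 8.75))
scaled
```

By default the input range is taken from the data itself; the `from` argument can be supplied when a fixed input range is wanted instead.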
That looks better. However, we still need to add the second y-axis, which will require some more math.
$$y_{2i} = (y_{scaled} - \min(y_1)) \times \frac{\max(y_2) - \min(y_2)}{\max(y_1) - \min(y_1)} + \min(y_2)$$
Double y-axis
# Prep sec_axis
mySA <- sec_axis(~ (. - y1_min) * (y2_max - y2_min) / (y1_max - y1_min) + y2_min,
                 name = "Hours", breaks = 9:14)
# Plot
mp5 <- mp2 +
  geom_line(data = xx, aes(y = Day.Length_scaled, color = "4"),
            alpha = 0.7, size = 1.25) +
  scale_color_manual(name = NULL, values = c("red", "darkred", "darkblue"),
                     labels = c("Air Temp", "Soil Temp", "Day Length")) +
  scale_y_continuous(sec.axis = mySA)
mp5
To help better visualize this rescaling, we will use a simpler example.
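$$y_{scaled} = (y_{2i} - \min(y_2)) \times \frac{\max(y_1) - \min(y_1)}{\max(y_2) - \min(y_2)} + \min(y_1)$$
$$y_{scaled} = (7.5 - 5) \times \frac{40 - 20}{10 - 5} + 20 = 30$$
$$y_{2i} = (y_{scaled} - \min(y_1)) \times \frac{\max(y_2) - \min(y_2)}{\max(y_1) - \min(y_1)} + \min(y_2)$$
$$y_{2i} = (30 - 20) \times \frac{10 - 5}{40 - 20} + 5 = 7.5$$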
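The arithmetic above can be checked in R. The helper names `rescale_to()` and `unscale_from()` are hypothetical and exist only for this sketch; they simply wrap the two formulas, with y1 spanning 20 to 40 and y2 spanning 5 to 10.

```r
# Forward formula: map a y2 value onto the range of y1
rescale_to <- function(y2i, y1, y2) {
  (y2i - min(y2)) * (max(y1) - min(y1)) / (max(y2) - min(y2)) + min(y1)
}

# Inverse formula: map a scaled value back onto the range of y2
unscale_from <- function(y_scaled, y1, y2) {
  (y_scaled - min(y1)) * (max(y2) - min(y2)) / (max(y1) - min(y1)) + min(y2)
}

y1 <- c(20, 40)  # range of the primary axis
y2 <- c(5, 10)   # range of the secondary axis

y_scaled <- rescale_to(7.5, y1, y2)
y_scaled  # 30

y_back <- unscale_from(y_scaled, y1, y2)
y_back    # 7.5
```

The round trip returning the original 7.5 is what lets the secondary axis show y2 units while the line itself is drawn in y1 units.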
xx <- data.frame(x = 1:20, y = 1:20)
ggplot(xx, aes(x = x, y = y)) +
  # Horizontal line at 30 on the primary axis, which reads as 7.5 on the secondary axis
  geom_hline(yintercept = 30, color = "blue", alpha = 0.7, size = 2) +
  theme_agData(axis.text.y = element_text(color = "red", size = 10)) +
  scale_y_continuous(limits = c(20, 40),
                     # Inverse formula: maps the primary range 20-40 back to 5-10
                     sec.axis = sec_axis(~ (. - 20) * (10 - 5) / (40 - 20) + 5,
                                         name = "y2", breaks = 5:10)) +
  labs(caption = myCaption)