From 80408111bfb1f315fd1d17e249abe920bfa3002b Mon Sep 17 00:00:00 2001
From: Gabriela Palomo
Date: Sat, 7 Sep 2024 07:29:48 -0400
Subject: [PATCH] Update 10-Position_scales_and_axes.Rmd (#55)

Changes to chapter 10.
---
 10-Position_scales_and_axes.Rmd | 723 +++++++++++++-------------------
 1 file changed, 297 insertions(+), 426 deletions(-)

diff --git a/10-Position_scales_and_axes.Rmd b/10-Position_scales_and_axes.Rmd
index 97f0dab..09493b8 100644
--- a/10-Position_scales_and_axes.Rmd
+++ b/10-Position_scales_and_axes.Rmd
@@ -9,51 +9,57 @@
 - What are the defining components of an axis?
 - What is the relationship between scale and axis?

-```{r 10-01}
+```{r 10-01, warning=FALSE, message=FALSE}
 library(ggplot2)
 library(dplyr)
 library(stringr) # for demo of labels and some other stuff
 ```

-## Introduction / preliminaries / asides
+## Preliminaries / asides {-}

-This chapter introduces position scales and axes. It may also be helpful to understand position scales and axes as position scales and _guides_, because axes they share the same API as guides for non-positional scales like color legends. The parallel will be clearer in the next chapter.
+- This chapter introduces position scales and axes (guides).
+- Recommended: read the documentation of the `{scales}` package, since it handles a lot of the (re-)scaling and transformation under the hood.
+  - Start with the [rstudio::conf2020 talk on scales](https://www.rstudio.com/resources/rstudioconf-2020/the-little-package-that-could-taking-visualizations-to-the-next-level-with-the-scales-package/).

-It's worthwhile to read documentations of the `{scales}` package to learn more about scales, since that handles a lot of the (re-)scaling and transformation under the hood. It may be good to start with the [rstudio::conf2020 talk on scales](https://www.rstudio.com/resources/rstudioconf-2020/the-little-package-that-could-taking-visualizations-to-the-next-level-with-the-scales-package/).
+- It should also be noted that there's some discussion about revamping the `scales_*` API. See [issue #4269](https://github.com/tidyverse/ggplot2/issues/4269) and [PR #4271](https://github.com/tidyverse/ggplot2/pull/4271)

-```{r 10-02}
-knitr::include_url("https://www.rstudio.com/resources/rstudioconf-2020/the-little-package-that-could-taking-visualizations-to-the-next-level-with-the-scales-package/")
+## Introduction {-}
+
+- Position scales control the locations of visual entities in a plot and how those locations are mapped to data values.
+  - Usually these are the x- and y-axes.
+  - However, some plots need only one axis to be specified: `geom_histogram()` computes a `count` variable that gets mapped to the y aesthetic.
+
+```{r, warning=FALSE, message=FALSE, fig.align='center'}
+ggplot(mpg,
+       aes(x = displ)) + # only specifies x-axis but not y-axis
+  geom_histogram()
 ```

-It should also be noted that there's some discussion about revamping the `scales_*` API. See [issue #4269](https://github.com/tidyverse/ggplot2/issues/4269) and [PR #4271](https://github.com/tidyverse/ggplot2/pull/4271)
+## Themes to discuss {-}

-Lastly, a small aside on the book's `after_stat()` example it he intro, continuing nicely from our discussion on ggplot internals last week.
+- Here we will discuss
+  - Continuous position scales, including transformations and zooming in and out of a plot.
+  - Date/time scales, which are a special case of continuous scales.
+  - Discrete position scales, including limits, breaks, and labels, and axis label customisation.
+  - Binned position scales.

-```{r 10-03}
-# Grab the layer object created by `geom_histogram()`
-histogram_layer <- geom_histogram()
-# Check what stat ggproto it uses
-class(histogram_layer$stat)[1]
-# Confirm that the stat has `after_stat(count)` as the default aes
-StatBin$default_aes # or, `histogram_layer$stat$default_aes`
-# The stat takes an x or y aesthetic, so it does implicitly maps
-# `after_stat(count)` to the unspecified aes
-StatBin$required_aes
-geom_histogram(aes(x = displ))$mapping
-# 'orientation' argument allows horizontal bars without `coord_flip()`
-# as of v.3.3.0 (Dec 2020)
-ggplot(mpg, aes(y = displ)) +
-  geom_histogram(orientation = "y")
-```

-## 10.1 Numeric
+## Numeric position scales {-}

-### 10.1.1 Limits
+- `scale_x_continuous()`
+- `scale_y_continuous()`
+- Both map linearly from the data value to a location on the plot.
+- The limits should be a numeric vector of length two; use `NA` for an end that should keep its default.
+- Other scales used for transformations:
+  - `scale_x_log10()`
+  - `scale_x_reverse()`
+
+## Numeric position scales: Limits {-}

-The book doesn't have content for this section (??)
+- All scales have limits that specify the values of the aesthetic over which the scale is defined, i.e. the ranges of the axes.
+- By default, limits are calculated from the range of the data variable, but this can be overridden with the `limits` argument of the `scale_*()` function.

-But we know that you can set limits with `xlim()`/`ylim()` or `scale_x|y_*(limits = )`

 ```{r 10-04}
 lim_plot <- ggplot(mtcars, aes(x = hp, y = disp)) +
@@ -74,66 +80,55 @@ lim_plot +
 ```

-### 10.1.2 Out of bounds values
+- Alternatively, you can also use `lims()`.
+- Or just `xlim()` or `ylim()`.

-NOTE: A big theme of the `{scales}` package as of v1.1.1 (May 2020) is that they have very transparent function names. For example, the family of functions for Out Of Bounds (oob) handling are all named `oob_*()`. This is an intentional (re-)design of the package to work nicely with autocomplete.
+```{r}
+lim_plot +
+  lims(x = c(0,500),
+       y = c(0,500))

-```{r 10-05}
-str_subset(ls(envir = asNamespace("scales")), "^oob_")
+# Specifying just one axis
+lim_plot +
+  xlim(c(0,500))
 ```

-By default, data outside scales are set to `NA`. This is because the `oob` argument is set to `oob_censor()`/`censor()`. Note that oob only applies to continuous scales, since values of a discrete scale form a fixed set.
-```{r 10-06}
-body(scale_x_continuous)
-formals(continuous_scale)$oob
-```
+## Zooming in {-}

-Book's examples:
+- If your goal is to zoom in on part of the plot, it is usually better to use the `xlim` and `ylim` arguments of `coord_cartesian()`.
+- When you truncate the scale limits, some data points will fall outside the boundaries you set, and ggplot2 has to decide what to do with them. The default behavior is to convert any data values outside the scale limits to `NA`.

-```{r 10-07}
+```{r, fig.align='center'}
 base <- ggplot(mpg, aes(drv, hwy)) +
   geom_hline(yintercept = 28, colour = "red") +
-  geom_boxplot(alpha = .2) # I set alpha here for a later demo
+  geom_boxplot()

+# Base plot
 base
-base + coord_cartesian(ylim = c(10, 35))
-base + ylim(10, 35)
-```
-Equivalent solutions with `oob_*()`
+# Zoom in with coord_cartesian() works well!
+base + coord_cartesian(ylim = c(10, 35)) # works as expected

-```{r 10-08}
-# zoom only (keeps out of bounds values for the stat computation)
-base + scale_y_continuous(limits = c(10, 35), oob = scales::oob_keep)
-# default that removes out of bounds values
-base + scale_y_continuous(limits = c(10, 35), oob = scales::oob_censor)
-# squish option (plots outliers at the uper limit of y = 35)
-base + scale_y_continuous(limits = c(10, 35), oob = scales::oob_squish)
+# Zooming in with ylim() does not work well: look at how the red line has moved.
+# The boxplot is not the same
+base + ylim(10, 35) # distorts the boxplot
+#> Warning: Removed 6 rows containing non-finite values (`stat_boxplot()`).
 ```

-You can use oob functions for non-positional scales
-
-```{r 10-09}
-df <- data.frame(x = 1:6, y = 8:13)
-base <- ggplot(df, aes(x, y)) +
-  geom_col(aes(fill = x)) + # bar chart
-  geom_vline(xintercept = 3.5, colour = "red") # for visual clarity only
-
-base
-base + scale_fill_gradient(limits = c(1, 3)) # oob = scales::oob_censor
-base + scale_fill_gradient(limits = c(1, 3), oob = scales::squish) # scales::oob_squish
-```
+## Visual range expansion {-}

-### 10.1.3 Visual range expansion
+- The visual range of the axes actually extends a little bit past the numeric limits that we have specified.
+- Override the default setting with the `expand` argument, which expects a numeric vector.

-Book examples:
+- For example, one case where it’s usually preferable to remove this space is when using `geom_raster()`, which we can achieve by setting `expand = expansion(0)`:

 ```{r 10-10}
 f_plot <- ggplot(faithfuld, aes(waiting, eruptions)) +
   geom_raster(aes(fill = density)) +
   theme(legend.position = "none")
+
 f_plot
 f_plot +
@@ -141,35 +136,44 @@ f_plot +
   scale_y_continuous(expand = c(0,0)) # expand = 0
 ```

-With `expansion()` from v3.3.0 (Dec 2020)
+- `expand` argument: For position scales, a vector of range expansion constants used to add some padding around the data to ensure that they are placed some distance away from the axes. Use the convenience function `expansion()` to generate the values for the `expand` argument. The defaults are to expand the scale by 5% on each side for continuous variables, and by 0.6 units on each side for discrete variables.
+
+
+- With `expansion()`:
+- Additive factor: specifies a constant space added outside the nominal axis limits.
+- Multiplicative factor: adds space defined as a proportion of the size of the axis limit.
+- These correspond to the `add` and `mult` arguments of `expansion()`, which can be length one (if the expansion is the same on both sides) or length two (to set different expansions on each side).
+- The `add` argument is specified on the same scale as the data variable, whereas the `mult` argument is specified relative to the axis range.

 ```{r 10-11}
 formals(expansion)
 ```

-```{r 10-12}
-f_plot +
-  scale_y_continuous(expand = expansion(mult = 0)) # mult = c(0, 0)
-f_plot +
-  scale_y_continuous(expand = expansion(mult = 1))
-f_plot +
-  scale_y_continuous(expand = expansion(mult = c(0, 1)))
-f_plot +
-  scale_y_continuous(expand = expansion(mult = c(0, 1))) +
-  scale_x_continuous(expand = expansion(add = c(0, 10)))
-```
+```{r }
+# Additive expansion of three units on both axes
+f_plot +
+  scale_x_continuous(expand = expansion(add = 3)) +
+  scale_y_continuous(expand = expansion(add = 3))

+# Multiplicative expansion of 20% on both axes
+f_plot +
+  scale_x_continuous(expand = expansion(mult = .2)) +
+  scale_y_continuous(expand = expansion(mult = .2))

-### 10.1.4 Exercises
+# Multiplicative expansion of 5% at the lower end of each axis,
+# and 20% at the upper end; for the y-axis the expansion is
+# set directly instead of using expansion()
+f_plot +
+  scale_x_continuous(expand = expansion(mult = c(.05, .2))) +
+  scale_y_continuous(expand = c(.05, 0, .2, 0))
+```

-### 10.1.5 Breaks
+## Breaks {-}

-```{r 10-13}
-str_subset(ls(envir = asNamespace("scales")), "^breaks_")
-```
+- Axis tick marks and legend tick marks are special cases of scale breaks, which are set with the `breaks` argument of the `scale_*()` function.

-Book example:
+- Let's see an example:

 ```{r 10-14}
 toy <- data.frame(
@@ -180,330 +184,272 @@ toy <- data.frame(
   log = c(2, 5, 10, 2000)
 )
 toy
-#>   const up txt  big  log
-#> 1     1  1   a 1000    2
-#> 2     1  2   b 2000    5
-#> 3     1  3   c 3000   10
-#> 4     1  4   d 4000 2000
+```
+
+- To set breaks manually, pass a vector of data values to `breaks`, or set `breaks = NULL` to remove them along with the corresponding tick marks.
+
+```{r}
 axs <- ggplot(toy, aes(big, const)) +
   geom_point() +
   labs(x = NULL, y = NULL)
+
 axs
-axs + scale_x_continuous(breaks = scales::breaks_extended())
-axs + scale_x_continuous(breaks = scales::breaks_extended(n = 2))
+
 axs + scale_x_continuous(breaks = NULL)
 ```

-Demo from `{scales}`:
+- Grid lines move along with breaks.

-```{r 10-15}
-scales::demo_continuous(c(1000, 4000), breaks = scales::breaks_extended())
-scales::demo_continuous(c(1000, 4000), breaks = scales::breaks_extended(n = 2))
-scales::demo_continuous(c(1000, 4000), NULL)
+```{r}
+axs + scale_x_continuous(breaks = c(1000, 2000, 4000))
+axs + scale_x_continuous(breaks = c(1000, 1500, 2000, 4000))
 ```

-At the vector level:

-```{r 10-16}
-scales::breaks_extended()(c(1000, 4000))
-scales::breaks_extended(n = 2)(c(1000, 4000))
-```
+- You can also pass a function to the `breaks` argument; the `scales` package provides several break functions that help tweak the breaks:
+
+  - `scales::breaks_extended()` creates automatic breaks for numeric axes.
+  - `scales::breaks_log()` creates breaks appropriate for log axes.
+  - `scales::breaks_pretty()` creates “pretty” breaks for date/times.
+  - `scales::breaks_width()` creates equally spaced breaks.
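
The break helpers listed above are function factories: each call returns a function that takes a range and returns break positions. A minimal sketch at the vector level, assuming the `{scales}` package is installed and using the range of `toy$big`; `rng` is an illustrative name that is not part of the chapter.

```r
library(scales)

# Range of the `big` column in the toy data (1000 to 4000)
rng <- c(1000, 4000)

# Each breaks_*() call returns a function; applying it to a range
# yields the break positions a scale would use
breaks_extended()(rng)       # automatic breaks
breaks_extended(n = 2)(rng)  # ask for roughly two breaks
breaks_width(500)(rng)       # one break every 500 units
```

Inspecting breaks this way can be quicker than re-rendering a plot when choosing `n` or `width`.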
+
 Other breaks:

 ```{r 10-17}
-my_range <- c(1, 101)
-scales::breaks_extended()(my_range)
-scales::breaks_width(width = 10)(my_range)
-scales::breaks_pretty(width = 10)(my_range) # pretty(1:101)
-scales::breaks_log()(my_range)
-```
+axs +
+  scale_x_continuous(breaks = scales::breaks_extended())

-Debugging arguments in `scale_*()` that take function factories
-
-```{r 10-18, eval = FALSE}
-browserer <- function(...) {
-  params <- list(...)
-  browser()
-  if (exists("result")) {
-    return(result)
-  }
-}
-axs + scale_x_continuous(breaks = browserer)
+axs +
+  scale_x_continuous(breaks = scales::breaks_extended(n = 2))
 ```

+- With the `scales::breaks_width()` function you can define the spacing between breaks.
+  - `width` sets the distance between breaks. It is either a number or, for dates and times, a single string of the form "{n} {unit}", e.g., "1 month", "4 sec".
+  - `offset`: use this if you don't want breaks to start at zero, or on a conventional date or time boundary such as the 1st of January or midnight. A negative offset shifts the starting point in the opposite direction.

-### 10.1.6 Minor breaks
-
-Book example:
+```{r}
+axs +
+  scale_x_continuous(breaks = scales::breaks_width(500))

-```{r 10-19}
-mb <- unique(as.numeric(1:10 %o% 10 ^ (0:3)))
-mb
+# The offset shifts all the breaks by a specified amount
+axs +
+  scale_x_continuous(breaks = scales::breaks_width(500, offset = 100))

-log_base <- ggplot(toy, aes(log, const)) +
-  geom_point()
-
-log_base + scale_x_log10()
-log_base + scale_x_log10(minor_breaks = mb)
+axs +
+  scale_x_continuous(breaks = scales::breaks_width(500, offset = -100))
 ```

-There are also minor break functions:
-
-```{r 10-20}
-str_subset(ls(envir = asNamespace("scales")), "^minor_breaks_")
-```
+## Minor breaks {-}

+- You can adjust the minor breaks (the unlabeled faint grid lines that appear between the major grid lines).
+- You can also supply a function to `minor_breaks`, such as `scales::minor_breaks_n()` or `scales::minor_breaks_width()`.

-### 10.1.7 Labels
+- First let's create a vector of minor break values.

-```{r 10-21}
-str_subset(ls(envir = asNamespace("scales")), "^label_")
+```{r 10-19}
+# %o% computes an outer product (a multiplication table)
+mb <- unique(as.numeric(1:10 %o% 10 ^ (0:3)))
+mb
 ```

-Book examples:
+- Now let's create a plot:

-```{r 10-22}
-axs + scale_y_continuous(labels = scales::label_percent())
-axs + scale_y_continuous(labels = scales::label_dollar(prefix = "", suffix = "€"))
-```
+```{r}
+log_base <- ggplot(toy,
+                   aes(log, const)) + geom_point()

-```{r 10-23}
-tibble(
-  x = c("cat1", "cat2 with a really really realy long name", "cat3"),
-  y = 1:3
-) %>%
-  ggplot(aes(x, y)) +
-  geom_col()
-```
+log_base

-```{r 10-24}
-tibble(
-  x = c("cat1", "cat2 with a really really realy long name", "cat3"),
-  y = 1:3
-) %>%
-  ggplot(aes(x, y)) +
-  geom_col() +
-  scale_x_discrete(labels = scales::label_wrap(width = 30))
+# Transforming x-axis to log10
+log_base + scale_x_log10()
+log_base + scale_x_log10(breaks = c(0, 2, 5, 10, 50, 100, 500, 1000, 2000)) # major breaks
+# Using the vector mb created earlier
+log_base + scale_x_log10(minor_breaks = mb) # minor breaks
 ```

-### 10.1.8 Exercises
+## Labels {-}

+- Every break is associated with a label, and labels can be changed.
+- You can suppress labels with `labels = NULL`.
+- Let's see an example:

-### 10.1.9 Transformations
+```{r}
+base <- ggplot(toy, aes(big, const)) +
+  geom_point() +
+  labs(x = NULL, y = NULL) +
+  scale_y_continuous(breaks = NULL)

-Book example:
+base

-```{r 10-25}
-ggplot(diamonds, aes(price, carat)) +
-  geom_bin2d()
-# log transform x and y axes
-ggplot(diamonds, aes(price, carat)) +
-  geom_bin2d() +
-  scale_x_continuous(trans = "log10") +
-  scale_y_continuous(trans = "log10")
+base +
+  scale_x_continuous(
+    breaks = c(2000, 4000),
+    labels = c("2k", "4k")) # specify the labels for each break
 ```

+- Useful label functions from the `scales` package are:
+  - `scales::label_bytes()` formats numbers as kilobytes, megabytes etc.
+  - `scales::label_comma()` formats numbers as decimals with commas added.
+  - `scales::label_dollar()` formats numbers as currency.
+  - `scales::label_ordinal()` formats numbers in rank order: 1st, 2nd, 3rd etc.
+  - `scales::label_percent()` formats numbers as percentages.
+  - `scales::label_pvalue()` formats numbers as p-values: <.05, <.01, .34, etc.

-> The transformation is carried out by a “transformer”, which describes the transformation, its inverse, and how to draw the labels. You can construct your own transformer using `scales::trans_new()`
+```{r}
+base <- ggplot(toy, aes(big, const)) +
+  geom_point() +
+  labs(x = NULL, y = NULL) +
+  scale_x_continuous(breaks = NULL)

-Case study: make reversed log x-axis
+base

-```{r 10-26}
-ggplot(starwars, aes(x = mass)) +
-  geom_histogram()
+base + scale_y_continuous(labels = scales::label_percent(accuracy = 0))

-ggplot(starwars, aes(x = mass)) +
-  geom_histogram() +
-  scale_x_log10()
+base + scale_y_continuous(labels = scales::label_percent(accuracy = 0.5))

-ggplot(starwars, aes(x = mass)) +
-  geom_histogram() +
-  scale_x_reverse()
+base + scale_y_continuous(
+  labels = scales::label_dollar(prefix = "", suffix = "€")
+)
 ```

-```{r 10-27}
-scale_x_log10()$trans # scales::log10_trans()
-scale_x_reverse()$trans # scales::reverse_trans()
-```

-```{r 10-28}
-formals(scales::trans_new)
-log10_reverse <- scales::trans_new(
-  name = "log-10-reverse",
-  transform = function(x) -log(x, 10),
-  inverse = function(x) 10^(-x),
-  breaks = scales::log10_trans()$breaks,
-  minor_breaks = scales::log10_trans()$minor_breaks,
-  domain = scales::log10_trans()$domain
-)
-ggplot(starwars, aes(x = mass)) +
-  geom_histogram() +
-  scale_x_continuous(trans = log10_reverse)
-```

-> Regardless of which method you use, the transformation occurs before any statistical summaries. To transform after statistical computation use `coord_trans()`
+## Transformations {-}

-From the docs:
+- Several scale functions apply a transformation to the x- or y-axis.
+- These transformations do not change the data themselves; they only modify the axes.
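
A transformation object bundles a function with its inverse, which is how a scale can place points in transformed space while still labelling the axis in the original units. A minimal sketch, assuming scales 1.3.0 or later (where the `transform_*()` constructors are available); `recip` and `hwy_values` are illustrative names that are not part of the chapter.

```r
library(scales)

# Assumes scales >= 1.3.0; older releases name this reciprocal_trans()
recip <- transform_reciprocal()

hwy_values <- c(20, 25, 40)

# Forward transformation: what the scale applies to position values
transformed <- recip$transform(hwy_values)
transformed

# The inverse maps transformed positions back to the original units
recip$inverse(transformed)
```

Checking the round trip like this is a quick sanity test before passing a transformation to a scale.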

-```{r 10-29}
-ggplot(diamonds, aes(carat, price)) +
-  geom_point() +
-  scale_x_log10() +
-  scale_y_log10()
+```{r}
+base <- ggplot(mpg, aes(displ, hwy)) + geom_point()

-ggplot(diamonds, aes(carat, price)) +
-  geom_point() +
-  coord_trans(x = "log10", y = "log10")
+base
+base + scale_x_reverse()
+base + scale_y_reverse()
 ```

-Example where stat transformation matters:
+- Every continuous scale takes a `transform` argument that lets you apply a transformation:

-```{r 10-30}
-trans_plot <- ggplot(mpg, aes(drv, hwy)) +
-  geom_boxplot()
-trans_plot +
-  scale_y_log10()
-trans_plot +
-  coord_trans(y = "log10") # scales::log10_trans()
+```{r}
+# convert from fuel economy to fuel consumption
+ggplot(mpg, aes(displ, hwy)) +
+  geom_point() +
+  scale_y_continuous(transform = "reciprocal")
+
+# log transform x and y axes
+ggplot(diamonds, aes(price, carat)) +
+  geom_bin2d() +
+  scale_x_continuous(transform = "log10") +
+  scale_y_continuous(transform = "log10")
 ```

+- You can construct your own transform using `scales::new_transform()`.

-```{r 10-31}
-layer_data(trans_plot) %>%
-  select(x, starts_with("y"))
-layer_data(trans_plot + scale_y_log10()) %>%
-  select(x, starts_with("y"))
-layer_data(trans_plot + coord_trans(y = "log10")) %>%
-  select(x, starts_with("y"))
-```
+- The following table lists some of the more common variants:

+| Name           | Transformer                      | Function $f(x)$         | Inverse $f^{-1}(y)$      |
+|----------------|----------------------------------|-------------------------|--------------------------|
+| `"asn"`        | `scales::transform_asn()`        | $2\sin^{-1}(\sqrt{x})$  | $\sin^2(y/2)$            |
+| `"exp"`        | `scales::transform_exp()`        | $e ^ x$                 | $\log(y)$                |
+| `"identity"`   | `scales::transform_identity()`   | $x$                     | $y$                      |
+| `"log"`        | `scales::transform_log()`        | $\log(x)$               | $e ^ y$                  |
+| `"log10"`      | `scales::transform_log10()`      | $\log_{10}(x)$          | $10 ^ y$                 |
+| `"log2"`       | `scales::transform_log2()`       | $\log_2(x)$             | $2 ^ y$                  |
+| `"logit"`      | `scales::transform_logit()`      | $\log(\frac{x}{1 - x})$ | $\frac{1}{1 + e^{-y}}$   |
+| `"probit"`     | `scales::transform_probit()`     | $\Phi^{-1}(x)$          | $\Phi(y)$                |
+| `"reciprocal"` | `scales::transform_reciprocal()` | $x^{-1}$                | $y^{-1}$                 |
+| `"reverse"`    | `scales::transform_reverse()`    | $-x$                    | $-y$                     |
+| `"sqrt"`       | `scales::transform_sqrt()`       | $x^{1/2}$               | $y ^ 2$                  |

-### ASIDE - A little more on transformations

-`transform()` method of the [Scales ggproto](https://ggplot2.tidyverse.org/reference/ggplot2-ggproto.html#scales):
+- Let's see an example:

-> `transform()` Transforms a vector of values using self$trans. This occurs before the Stat is calculated.
+```{r}
+ggplot(mpg, aes(displ, hwy)) +
+  geom_point() +
+  scale_y_continuous(transform = "reciprocal")

-Transformation changes the layer data
+ggplot(mpg, aes(displ, hwy)) +
+  geom_point() +
+  scale_y_continuous(transform = scales::transform_reciprocal())
+```

-```{r 10-32}
-toy # from Ch 10.1.5
-ggplot(toy, aes(big, txt)) +
-  geom_point()
+- Remember that you can instead transform the data manually and leave the axes untransformed.
+- The appearance of the geom will be the same, but the tick labels will be different.
+  - If you transform the data, the axes will be labelled in the transformed space.
+  - If you use a transformed scale, the axes will be labelled in the original data space.
+- **Regardless of which method you use, the transformation occurs before any statistical summaries. To transform after statistical computation use `coord_trans()`.**

-reversed_plot <- ggplot(toy, aes(big, txt)) +
+```{r}
+# Original data
+ggplot(mpg, aes(displ, hwy)) +
   geom_point() +
-  scale_x_reverse()
-reversed_plot
-layer_data(reversed_plot)
+  labs(title = "Untransformed data or axes")

-rev_trans <- scales::reverse_trans()
-scales::reverse_trans
-str(rev_trans)
-rev_trans$transform(toy$big)
-rev_trans$inverse(rev_trans$transform(toy$big))
-rev_trans$format(rev_trans$breaks(range(toy$big)))
-```
+# manual transformation
+ggplot(mpg, aes(log10(displ), hwy)) +
+  geom_point() +
+  labs(title = "Data transformed first")

-Most useful for positioning purposes (ex: [`time_trans()`](https://scales.r-lib.org/reference/time_trans.html))
+# transform using scales
+ggplot(mpg, aes(displ, hwy)) +
+  geom_point() +
+  scale_x_log10() +
+  labs(title = "Transformation applied to x-axis")

-```{r 10-33}
-hours <- seq(ISOdate(2000,3,20, tz = ""), by = "hour", length.out = 10)
-t <- scales::time_trans()
-t$transform(hours)
-t$inverse(t$transform(hours))
-t$format(t$breaks(range(hours)))
 ```

-```{r 10-34}
-date_trans_plot <- ggplot(tibble(hours = hours), aes(x = hours, y = 0)) +
-  geom_point()
-layer_data(date_trans_plot)
-```
+## Date-time {-}

-## 10.2 Date-time
+- Assuming you have appropriately formatted data mapped to the x aesthetic, ggplot2 will use `scale_x_date()` as the default scale for dates and `scale_x_datetime()` as the default scale for date-time data.

-### 10.2.1 Breaks
+- We've already seen a useful break function for these scales: `scales::breaks_pretty()`, which creates “pretty” breaks for date/times.

-Book example:
+## Breaks {-}
+
+- The `date_breaks` argument allows you to position breaks by date units (years, months, weeks, days, hours, minutes, and seconds).
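
A `date_breaks` string such as `"25 years"` is, as far as I understand, handed to `scales::breaks_width()`, so the resulting breaks can be inspected without drawing a plot. A minimal sketch, assuming `{scales}` is installed; `century20` and `breaks_25y` are illustrative names.

```r
library(scales)

# A date range spanning the twentieth century
century20 <- as.Date(c("1900-01-01", "1999-12-31"))

# breaks_width() accepts "{n} {unit}" strings for dates
breaks_25y <- breaks_width("25 years")

# Apply the break function to the range to see the break dates
breaks_25y(century20)
```

The same function can be passed directly as `breaks = breaks_25y` to `scale_x_date()`.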

 ```{r 10-35}
+#| layout-ncol: 2
+
 date_base <- ggplot(economics, aes(date, psavert)) +
   geom_line(na.rm = TRUE) +
   labs(x = NULL, y = NULL)

 date_base
-date_base + scale_x_date(date_breaks = "25 years")
+date_base + scale_x_date(date_breaks = "15 years")
 ```

-Making it explicit:
+- Remember that `scales::breaks_width()` also accepts date units for `width` (e.g. `"1 month"`) together with an `offset`.

-```{r 10-36}
-date_base + scale_x_date(breaks = scales::breaks_width("25 years"))
-```
-
-Book example:
+## Labels {-}

-```{r 10-37}
-century20 <- as.Date(c("1900-01-01", "1999-12-31"))
-breaks <- scales::breaks_width("25 years")
-breaks(century20)
-```
+- The book recommends using the `date_labels` argument, which takes the format strings below.

-Using `offset` argument (unit = days):
-
-```{r 10-38}
-breaks2 <- scales::breaks_width("25 years", offset = 31) # offsets to Feb
-breaks2(century20)
-```
+| String | Meaning                            |
+|:-------|:-----------------------------------|
+| `%S`   | second (00-59)                     |
+| `%M`   | minute (00-59)                     |
+| `%l`   | hour, in 12-hour clock (1-12)      |
+| `%I`   | hour, in 12-hour clock (01-12)     |
+| `%p`   | am/pm                              |
+| `%H`   | hour, in 24-hour clock (00-23)     |
+| `%a`   | day of week, abbreviated (Mon-Sun) |
+| `%A`   | day of week, full (Monday-Sunday)  |
+| `%e`   | day of month (1-31)                |
+| `%d`   | day of month (01-31)               |
+| `%m`   | month, numeric (01-12)             |
+| `%b`   | month, abbreviated (Jan-Dec)       |
+| `%B`   | month, full (January-December)     |
+| `%y`   | year, without century (00-99)      |
+| `%Y`   | year, with century (0000-9999)     |

-Calculating the offset:
-
-```{r 10-39}
-diff.Date(c(as.Date("1900-01-01"), as.Date("1900-02-01"))) # as.integer() to get value
-```
-
-### 10.2.2 Minor breaks
-
-Book examples:
-
-```{r 10-40}
-date_base + scale_x_date(
-  limits = as.Date(c("2003-01-01", "2003-04-01")),
-  date_breaks = "1 month"
-)
-
-date_base + scale_x_date(
-  limits = as.Date(c("2003-01-01", "2003-04-01")),
-  date_breaks = "1 month",
-  date_minor_breaks = "1 week"
-)
-```
+```{r}

-> In the second plot, the major and minor beaks follow slightly different patterns: the minor breaks are always spaced 7 days apart but the major breaks are 1 month apart. Because the months vary in length, this leads to slightly uneven spacing.
+#| layout-ncol: 2

-Explicit:
-
-```{r 10-41}
-date_base + scale_x_date(
-  limits = as.Date(c("2003-01-01", "2003-04-01")),
-  breaks = scales::breaks_width("1 month")
-)
-date_base + scale_x_date(
-  limits = as.Date(c("2003-01-01", "2003-04-01")),
-  breaks = scales::breaks_width("1 month"),
-  minor_breaks = scales::breaks_width("1 week")
-)
-```
-
-
-### 10.2.3 Labels
-
-Book examples:
-
-```{r 10-42}
 base <- ggplot(economics, aes(date, psavert)) +
   geom_line(na.rm = TRUE) +
   labs(x = NULL, y = NULL)
@@ -512,20 +458,25 @@ base + scale_x_date(date_breaks = "5 years")
 base + scale_x_date(date_breaks = "5 years", date_labels = "%y")
 ```

-```{r 10-43}
-base + scale_x_date(labels = scales::label_date_short())
-lim <- as.Date(c("2004-01-01", "2005-01-01"))
-base + scale_x_date(limits = lim, labels = scales::label_date_short())
-```
+- Remember that you can include a line break character `\n` in the format string.

+```{r}
+#| layout-ncol: 2
+lim <- as.Date(c("2004-01-01", "2005-01-01"))
+base + scale_x_date(limits = lim, date_labels = "%b %y")
+base + scale_x_date(limits = lim, date_labels = "%B\n%Y")
+```


-## 10.3 Discrete

-Book examples:
+## Discrete position scales {-}
+
+- `scale_x_discrete()` and `scale_y_discrete()`

 ```{r 10-44}
+
+#| layout-ncol: 3
 ggplot(mpg, aes(x = hwy, y = class)) +
   geom_point()

@@ -536,132 +487,52 @@ ggplot(mpg, aes(x = hwy, y = class)) +
 ggplot(mpg, aes(x = hwy, y = class)) +
   geom_point() +
-  annotate("text", x = 5, y = 1:7, label = 1:7)
+  annotate("text", color = "blue", x = 5, y = 1:7, label = 1:7)
 ```

-### 10.3.1 Limits
+## Limits, breaks, labels {-}

 > For discrete scales, limits should be a character vector that enumerates all possible values.

-Censors missing categories in the set:
-
-```{r 10-45}
-ggplot(mpg, aes(x = hwy, y = class)) +
-  geom_point() +
-  scale_y_discrete(limits = unique(mpg$class)[-1])
-```
+- Limits

-Adds new categories without value:
-
-```{r 10-46}
-ggplot(mpg, aes(x = hwy, y = class)) +
-  geom_point() +
-  scale_y_discrete(limits = c("A", unique(mpg$class)))
-```
-
-Same effect with `drop = FALSE` with unused factor levels
-
-```{r 10-47}
-ggplot(mpg, aes(x = hwy, y = factor(class, levels = c("A", unique(class))))) +
-  geom_point() +
-  scale_y_discrete(drop = FALSE)
-```
-
-It drops unused factor levels by default, though
-
-```{r 10-48}
-ggplot(mpg, aes(x = hwy, y = factor(class, levels = c("A", unique(class))))) +
-  geom_point() # + scale_y_discrete(drop = TRUE)
-```
-
-
-### 10.3.2 Scale labels
-
-```{r 10-49}
-layer_data(last_plot()) %>%
-  ggplot(aes(x = x, y = y, group = group)) +
-  geom_point()
-```
-
-### 10.3.2 Scale labels
-
-Book example:
-
-```{r 10-50}
+```{r}
 base <- ggplot(toy, aes(const, txt)) +
-  geom_point() +
+  geom_label(aes(label = txt)) +
+  scale_x_continuous(breaks = NULL) +
   labs(x = NULL, y = NULL)

-base
-base + scale_y_discrete(labels = c(c = "carrot", b = "banana"))
-```
-
-```{r 10-51}
-base + scale_y_discrete(labels = str_to_title)
+base
+base + scale_y_discrete(limits = c("a", "b", "c", "d", "e"))
+base + scale_y_discrete(limits = c("d", "c", "a", "b"))
 ```

-Debugging strategy
+- Breaks and labels

-```{r 10-52, eval = FALSE}
-browserer <- function(...) {
-  params <- list(...)
-  browser()
-}
-base + scale_y_discrete(labels = browserer)
+```{r}
+base + scale_y_discrete(breaks = c("b", "c"))
+base + scale_y_discrete(labels = c(c = "carrot", b = "banana"))
 ```

-### 10.3.3 `guide_axis()`
-
-Book examples:
+- Label positions: it is often necessary to keep axis labels from overlapping, which `guide_axis()` can do.

-```{r 10-53}
+```{r}
 base <- ggplot(mpg, aes(manufacturer, hwy)) + geom_boxplot()
+base
 base + guides(x = guide_axis(n.dodge = 3))
 base + guides(x = guide_axis(angle = 90))
-```

-More guides in `{ggh4x}` - [https://teunbrand.github.io/ggh4x/](https://teunbrand.github.io/ggh4x/index.html)
-
-```{r 10-54}
-library(ggh4x)
-tibble(
-  item = c("Coffee", "Tea", "Apple", "Pear", "Car"),
-  type = c("Drink", "Drink", "Fruit", "Fruit", ""),
-  amount = c(5, 1, 2, 3, 1)
-) %>%
-  ggplot(aes(interaction(item, type), amount)) +
-  geom_col() +
-  scale_x_discrete(guide = guide_axis_nested()) # guides(x = "axis_nested")
 ```

-## 10.4 Binned
-
-Book example:
-
-```{r 10-55}
-base <- ggplot(mpg, aes(hwy, class)) + geom_count()
-
-base
-base + scale_x_binned(n.breaks = 10)
-```
-
-```{r 10-56}
-ggplot(mtcars, aes(hp)) +
-  geom_histogram(binwidth = 20)
-
-ggplot(mtcars, aes(hp)) +
-  geom_bar() +
-  scale_x_binned(breaks = scales::breaks_width(width = 20))
-```

-## ASIDE - `geom_sf()` + limits
+## ASIDE - `geom_sf()` + limits {-}

-### Example from Twitter:
+### Example from Twitter: {-}

 [https://twitter.com/Josh_Ebner/status/1470818469801299970?s=20](https://twitter.com/Josh_Ebner/status/1470818469801299970?s=20)

-### Reprexes from Ryan S:
+### Reprexes from Ryan S: {-}

 ```{r 10-57}
 library(sf)
@@ -720,7 +591,7 @@ sf_plygn_1_wlims <- sf_plygn_1 +
 sf_plygn_1_wlims
 ```

-### Further exploration
+### Further exploration {-}

 Using `geom_sf()` adds `CoordSF` by default

@@ -778,7 +649,7 @@ Interesting note from the [docs](https://ggplot2.tidyverse.org/reference/ggsf.ht

 > ... specifying limits via position scales or xlim()/ylim() is strongly discouraged, as it can result in data points being dropped from the plot even though they would be visible in the final plot region.

-### Internals
+### Internals {-}

 ```{r 10-68}
 library(ggtrace) # v.0.4.5
@@ -822,7 +693,7 @@ dplyr::bind_cols(layer_grob(sf_plygn_1_wlims)[[1]][c("x", "y")]) # x for fifth r


-## Meeting Videos
+## Meeting Videos {-}

 ### Cohort 1