This notebook is best used alongside the recording of the training session, available at https://youtu.be/5klSpGC2puU, and the Advanced R presentation, available at https://gitlab.com/SManzi/r-for-healthcare-training.
Import the ggplot2 library
library(ggplot2)
Import the ‘mpg’ dataset that comes with ggplot2 and pass it to ggplot() to create a plot object
mpg <- ggplot2::mpg #import data
ggplot(data=mpg) #create a ggplot object from the data (an empty plot until layers are added)
Scatter plots and basic aesthetics
In the examples below, ‘displ’ means engine displacement (in litres) and ‘hwy’ means highway fuel efficiency (in miles per gallon)
# Map displ to the x-axis and hwy to the y-axis
ggplot(data=mpg) +
geom_point(mapping=aes(x=displ,y=hwy))
# Colour the points by class of vehicle
ggplot(data=mpg) +
geom_point(mapping=aes(x=displ,y=hwy, color=class))
# Change the size of the points based on the class of vehicle
# (ggplot2 warns that size is not advised for a discrete variable)
ggplot(data=mpg) +
geom_point(mapping=aes(x=displ,y=hwy, size=class))
# Change the opacity (alpha) of the points based on the class of vehicle
# (ggplot2 warns that alpha is not advised for a discrete variable)
ggplot(data=mpg) +
geom_point(mapping=aes(x=displ,y=hwy, alpha=class))
# Shape of the points is determined by the class of the vehicle
# (only 6 shapes are used by default, so the 7th class, suv, is not plotted)
ggplot(data=mpg) +
geom_point(mapping=aes(x=displ,y=hwy, shape=class))
# Colour all points blue by setting color outside aes()
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy), color = "blue")
# Add a title and axis labels to the plot using the 'labs' component
ggplot(data=mpg) +
geom_point(mapping=aes(x=displ, y=hwy),
color="blue") +
labs(title="Example scatterplot",
x="Displacement", y="Highway efficiency")
Facet wrap
Facet wrapping splits a plot into subplots, one for each value of a variable, and controls how they are arranged. For example, to plot our data by the class variable arranged in 2 rows, we use the facet_wrap() function.
# separate into individual plots by class and arrange in 2 rows
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
facet_wrap(~ class, nrow = 2)
Faceting also allows you to split the plot by two variables to enable comparisons. We use the facet_grid() function for this.
# separate into individual plots by drive type and
# number of cylinders arranged as a grid
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
facet_grid(drv ~ cyl)
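As a small sketch of the same idea, facet_grid() can also split by a single variable: replacing one side of the formula with a dot leaves that dimension unfaceted. The object names p_cols and p_rows are illustrative only.

```r
library(ggplot2)

# Facet by number of cylinders only, arranged as columns
p_cols <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  facet_grid(. ~ cyl)

# Facet by drive type only, arranged as rows
p_rows <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  facet_grid(drv ~ .)

p_cols
p_rows
```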
Multiple geometry layers
There are lots of geometry layers; see https://rpubs.com/hadley/ggplot2-layers for an overview.
# Plotting a smooth curve through the data
ggplot(data = mpg) +
geom_smooth(mapping = aes(x = displ, y = hwy))
# Plotting points and smooth curve
ggplot(data = mpg) +
geom_point(mapping=aes(x=displ,y=hwy, color=drv)) +
geom_smooth(mapping = aes(x = displ, y = hwy, linetype = drv))
Global mapping
If you define the mapping and aesthetics in the ggplot() call, they are applied to every subsequent layer, reducing repetition in the code.
# global mapping of the x and y axis variables
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
geom_point(mapping=aes(color=drv)) +
geom_smooth(mapping=aes(linetype=drv))
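A layer can still add to or override the global settings for itself. The sketch below is one possible illustration: the points add a colour mapping, while the smooth curve is given its own data (only the subcompact cars, chosen arbitrarily here). The name p_override is illustrative only.

```r
library(ggplot2)

p_override <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  # adds a colour mapping for this layer only; x and y come from the global mapping
  geom_point(mapping = aes(color = class)) +
  # this layer replaces the global data with a subset; se = FALSE hides the confidence band
  geom_smooth(data = subset(mpg, class == "subcompact"), se = FALSE)

p_override
```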
Bar plot
# quick bar plot
ggplot(data=mpg) +
geom_bar(mapping=aes(x=class))
Statistical transformations
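There are a number of built-in statistical transformations that can be performed on your data to produce new values to plot. For example, stat_count() counts the rows in each group; it is the default transformation used by geom_bar(), so the code below draws the same plot as the bar plot above.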
ggplot(data=mpg) +
stat_count(mapping=aes(x=class))
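To see the values a statistical transformation produces, one option is ggplot2's layer_data() function, which returns the data frame computed for a layer. The sketch below compares it with counts taken directly from the data; the names p_count and counts are illustrative only.

```r
library(ggplot2)

p_count <- ggplot(data = mpg) +
  stat_count(mapping = aes(x = class))

# layer_data() returns the computed data for a layer (the first layer by default)
counts <- layer_data(p_count)
counts[, c("x", "count")]

# The same counts computed directly from the data, for comparison
table(mpg$class)
```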
Coordinate transformations
Different coordinate systems can be used in your plots. For example, coord_flip() swaps the x and y axes.
ggplot(data = mpg, mapping = aes(x = class, y = hwy)) +
geom_boxplot()
ggplot(data = mpg, mapping = aes(x = class, y = hwy)) +
geom_boxplot() +
coord_flip()
Saving your plots
bar <- ggplot(data = mpg) +
geom_bar(
mapping = aes(x = class, fill = class),
show.legend = FALSE,
width = 1
) +
theme(aspect.ratio = 1) +
labs(x = NULL, y = NULL)
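# Display the bar plot with flipped and with polar coordinates
bar + coord_flip()
bar + coord_polar()
# Save the original plot stored in 'bar' (not the flipped or polar versions).
# The plot will be saved to your working directory
# unless otherwise specified
ggsave("my_plot.png", plot=bar)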
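ggsave() also accepts arguments that control the size and resolution of the saved image. The sketch below is one possible usage; the temporary directory, the file name and the dimensions are arbitrary choices for illustration, not recommendations from the training session.

```r
library(ggplot2)

p_save <- ggplot(data = mpg) +
  geom_bar(mapping = aes(x = class))

# Illustrative output location: a temporary directory rather than the working directory
out_file <- file.path(tempdir(), "my_plot_sized.png")

# width and height are interpreted in the given units; dpi sets the resolution
ggsave(out_file, plot = p_save, width = 15, height = 10, units = "cm", dpi = 300)
```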
Histogram
# ggplot2 uses 30 bins by default; set bins or binwidth to change this
ggplot(data=mpg) +
geom_histogram(mapping=aes(x=hwy),
col="black",
fill="grey")
Exercise
Using the ‘midwest’ dataset, create a graph or graphs that show something interesting about the data.
mid <- ggplot2::midwest