VizWhiz 3
The ggplot
fanciness we covered in Lesson 1 and Lesson 2 is great, but sometimes your boss/grant agency/publication outlet insists on a bar or column graph. Or perhaps you simply need to visualise two continuous variables. In this lesson, we will show you how to plot bar and column graphs with error bars and how to plot a scatter plot.
Lesson Outcomes
By the end of the lesson, you should:
- 3.1 Understand the difference between
geom_bar
andgeom_col
and use them to plot frequency vs. summary data - 3.2 Be able to use
summarise
to calculate standard error andgeom_errorbar
to add error bars to a plot - 3.3 Be able to use
geom_point
to create a scatter plot of two continuous variables
3.1 Bars vs. columns
There are two geoms in ggplot that draw bar plots, geom_bar
and geom_col
.
When you want to plot frequency/count data and are happy to let R to do the counting autoamtically, use geom_bar
. It only requires that you tell it what you want on the x axis, and it will put frequency on the y axis.
If you want the height of the bar to represent a value you have calculated, then use geom_col
. For this geom, you need to tell it what you want for both the x axis and the y axis.
In this screencast, we’ll review:
- How to use
geom_bar
to plot count/frequency data - How to combine
summarise
andgeom_col
to plot mean bug levels by year - How to
group_by
more than one variable and usefacet_wrap
to plot mean bug levels by year, separately for each site
Here’s the plot for reference:
Watch the video and then carry out the following steps:
- Use
geom_bar
andfacet_wrap
to plot the number of readings that were taken each year, separately for each site - Use
group_by
andsummarise
withgeom_col
to plot the mean beach bug levels averaged across all the sites each year - Use
group_by
andsummarise
withgeom_col
to plot the mean beach bug levels each year, usingfacet_wrap
to plot each site separately
3.2 Error bars
Of course, good practice dictats that you need error bars on those columns. Never fear! Using summarise
, it is easy to calculate standard error.
In this screencast, we’ll review:
- How to use
summarise
to calculate the mean, standard deviation and standard error - How to add a
geom_errorbar
layer to your plot to display the mean beach bugs data in a column graph with error bars
Here’s the plot for reference:
Watch the video and then carry out the following steps:
- Use
group_by
andsummarise
to calculate the mean, standard deviation, N, and standard error for each site - Pipe that
summarise
into ageom_col
adding acoorid_flip
andgeom_errorbar
layer
3.3 Scatter plots
Sometimes you want to visualise the relationship between two continuous variables using a scatterplot. Our original beachbugs dataset doesn’t include any interesting variables that might be correlated with the bacteria levels, so we have pulled in some weather data to see whether bacteria levels might be related to rainfall or temperature, or some combination of the two.
You can download the rain_temp_beachbugs.csv data here
STOP THE PRESS
#RYouWithMe now has a data package! Check out https://ryouwithme.netlify.app/ for details about how to install and use the package to read the bakers, beach summary and frames.
In this screencast, we’ll review:
- How to import the rain_temp_beachbugs.csv dataset into R
- How to use
geom_point
andgeom_smooth
to plot a scatter plot and best fit line - How to use point color to illustrate potential interactions in your data
Here’s the plot for reference:
Watch the video and then carry out the following steps:
- use
read_csv
andhere
to read the rain_temp data into RStudio (need help? revisit Unit 1 Basic Basics Lesson 3: Loading Data - use
geom_point
to plot the relation between rainfall and beach bugs filter
the data for values more than 500 and add ageom_smooth
layer to see a regression line- colour the points by the temperature variable
Now it’s your turn!
Are bar plots your bread and butter? Or maybe you primarily work with bivariate continuous data? Either way, we want to see your plots! Tweet them to us at @RLadiesSydney using the hashtag: #vizwhiz!
Sydney-based RLadies are encouraged to share your successes, frustrations, and of course your plots, in our slack channel, #ryouwithme_3_vizwhiz.
Now that you’ve got the structural components of several of the most popular plots down, it’s time to learn how to customise the appearance of those plots! Onward to [Lesson 4](../03-VizWhiz-4/!