For the second lesson in Basic Basics we’re going to talk about packages. You likely won’t get very far in R without packages. Sure, you could write all the functions you need for your analysis yourself if you wanted, but the great thing about the #rstats community is that people write code, bundle it into packages, and then give them away FOR FREE because they are so terribly nice. Once you are an #rstats expert, you can create your own packages and give back, but for now let’s learn what packages are and how to use them.
By the end of the lesson, you should:
Know how to use the library() function to load packages at the top of your script (and understand why it is best to do it there and not in the console!)
2.4 Know how to find useful information about how to use a particular package when you are trying something new.
2.1 What is a package?
A package is a bundle of code that a generous person has written, tested, and then given away. Most of the time packages are designed to solve a specific problem, so they pull together functions related to a particular data science task (e.g., data wrangling, visualisation, inference). Anyone can write a package, and you can get packages from lots of different places, but for beginners the best thing to do is get packages from CRAN, the Comprehensive R Archive Network. It’s easier than any of the alternatives, and people tend to wait until their package is stable before submitting it to CRAN, so you’re less likely to run into problems. You can find a list of all the packages on CRAN here.
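To make "a bundle of code" concrete, here is a small sketch using stats, a package that ships with every R installation. It is chosen purely for illustration because there is nothing to install; it is not one of the packages this lesson asks you to get from CRAN.

```r
# stats comes with R and is loaded automatically, so no install is needed
library(stats)

# ls() with "package:<name>" lists everything a loaded package provides
stats_functions <- ls("package:stats")

# How many objects does this one package bundle together?
length(stats_functions)

# Peek at the first few names
head(stats_functions)

# Familiar functions such as median() live inside this package
"median" %in% stats_functions
```

Packages from CRAN work the same way: each one is a named collection of functions that becomes available once the package is loaded.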
Some packages are bundles of packages. For example, the tidyverse is an umbrella package that pulls together lots of individual data wrangling and visualisation packages, so that when you install tidyverse you get 8 packages for the price of 1 (actually they are free, but you get what I mean). The packages in the tidyverse include:
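If you would like to see the bundle for yourself, the sketch below is one way to do it. It assumes tidyverse may or may not be installed yet (installation is covered in 2.2), so it checks first. tidyverse_packages() is a helper that comes with the tidyverse package; the list it prints is longer than the handful of packages that library(tidyverse) loads, because it also counts supporting packages.

```r
# Check whether tidyverse is installed on this computer (TRUE or FALSE)
tidyverse_installed <- requireNamespace("tidyverse", quietly = TRUE)

if (tidyverse_installed) {
  # Print the name of every package that belongs to the tidyverse bundle
  print(tidyverse::tidyverse_packages())
} else {
  message("tidyverse is not installed yet; see section 2.2")
}
```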
2.2 How do I install and load packages?
In this screencast, we’ll cover:
- How to install packages
- How to use the library() function to load packages when you want to use them
Watch the video and then carry out the following steps:
- Install the
- Add a section label to your script and library() calls to the top of your script to load the here packages. You’ll be using these packages to read in data in Basic Basics Lesson 3!
- Browse the list of packages on CRAN, find one that looks interesting and install it (just because now you know how!) Hint - we will probably use the skimr packages soon, so have a go at installing those.
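Putting those steps together, a script might look something like the sketch below. It assumes you are working in RStudio with a CRAN mirror already set (the default) and an internet connection. The wording of the section label is only an example, and here and skimr are used because they are the packages named above.

```r
# Installing only needs to happen once per computer, so these lines are
# usually typed into the console rather than saved in the script.
install.packages("here")
install.packages("skimr")

# load packages -----
# Loading has to happen every time you start R, so the library() calls
# live at the top of the script where anyone running it will see them.
library(here)
library(skimr)
```

In RStudio, a comment that ends in four or more dashes is treated as a section label, which makes long scripts easier to navigate.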
2.3 I’ve installed a package… now what?
Installing and loading packages is all well and good - but knowing what they do is pretty important when you want to use them! CRAN requires that package authors write documentation that goes with their package and these documents are designed to give you an idea of what functions are included and what the package can be used for.
When you are looking for information about a package there are a few places to look. Let’s use the janitor package as an example…
Step 1: Look on CRAN
The README file is most often pretty useful. janitor README
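Much of the documentation that CRAN hosts can also be opened from inside R. The sketch below assumes janitor is already installed (for example with install.packages("janitor")), which this lesson has not explicitly asked you to do. clean_names() is one of janitor's functions, picked here just as an example.

```r
# Assumes janitor is installed: install.packages("janitor")

# Index of the package's help pages, listing every function it contains
help(package = "janitor")

# Longer, tutorial-style guides (vignettes) that come with the package
vignette(package = "janitor")

# Help page for a single function, without loading the whole package
help("clean_names", package = "janitor")
```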
Step 3: Google it
When we google “how to use the janitor package R”, the first results are CRAN documentation, but below those there are links to documentation and blog posts by other R users who have found the package helpful and written about it.
Step 4: Have a look on Twitter
Searching Twitter is also a great way of locating people who might have written a blog post about how to use a package. Search [#rstats janitor] to find people who have liked the package enough to bother tweeting about it.
Now that you have the here (and a few other potentially useful packages) installed, let’s read in some data.
On to Lesson 3!