## 2018/11/05

For the second lesson in Basic Basics we’re going to talk about packages. You likely won’t get very far in R without packages. Sure, you could write all the functions you need for your analysis yourself if you wanted, but the great thing about the #rstats community is that people write code, bundle it into packages, and then give them away FOR FREE because they are so terribly nice. Once you are an #rstats expert, you can create your own packages and give back, but for now let’s learn what packages are and how to use them.

## Lesson Outcomes

By the end of the lesson, you should:

• 2.1 Understand what a package is.
• 2.2 Understand how to install packages from the console (quadrant 2).
• 2.3 Be able to use the library function to load packages at the top of your script (and understand why it is best to do it there and not in the console!)
• 2.4 Know how to find useful information about how to use a particular package when you are trying something new.

## 2.1 What is a package?

A package is a bundle of code that a generous person has written, tested, and then given away. Most of the time packages are designed to solve a specific problem, so they pull together functions related to a particular data science problem (e.g., data wrangling, visualisation, inference). Anyone can write a package, and you can get packages from lots of different places, but for beginners the best thing to do is get packages from CRAN, the Comprehensive R Archive Network. It’s easier than any of the alternatives, and people tend to wait until their package is stable before submitting it to CRAN, so you’re less likely to run into problems. You can find a list of all the packages on CRAN here.

Some packages are bundles of packages. For example, the tidyverse is an umbrella package that pulls together lots of individual data wrangling and visualisation packages, so that when you install tidyverse you get 8 packages for the price of 1 (actually they are free, but you get what I mean). The packages in the tidyverse include:

• ggplot2
• dplyr
• tidyr
• readr
• purrr
• tibble
• stringr
• forcats
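
If you’d like to see the bundle for yourself, here is a minimal sketch. It assumes you have already installed the tidyverse (installing is covered in the next section), and the exact list printed depends on which version of the tidyverse you have, so it may be longer than the list above.

```r
# Load the tidyverse (assumes it is already installed)
library(tidyverse)

# Ask the tidyverse which packages it bundles together
pkgs <- tidyverse_packages()
pkgs
```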

## 2.2 How do I install and load packages?

In this screencast, we’ll cover:

• How to install packages
• How to use the library function to load packages when you want to use them

Watch the video and then carry out the following steps:

1. Install the tidyverse and here packages
2. Add a section label to your script and library() calls to the top of your script to load the tidyverse and here packages. You’ll be using these packages to read in data in Basic Basics Lesson 3!
3. Browse the list of packages on CRAN, find one that looks interesting and install it (just because now you know how!). Hint: we will probably use the janitor and skimr packages soon, so have a go at installing those.
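
Steps 1 and 2 might look something like the sketch below. The install lines are commented out because they belong in the console and only need to be run once; the section label text (“load packages”) is just a suggestion, so call it whatever makes sense to you.

```r
# In the console, run once per computer (remove the # to run):
# install.packages("tidyverse")
# install.packages("here")

# load packages ----
library(tidyverse)  # data wrangling and visualisation
library(here)       # builds file paths relative to your project folder
```

In RStudio, a comment that ends in four or more dashes becomes a section label. Keeping the library() calls at the top of the script means that anyone who runs the script (including future you) loads the same packages every time, whereas anything typed into the console isn’t saved with your script.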

## 2.3 I’ve installed a package… now what?

Installing and loading packages is all well and good - but knowing what they do is pretty important when you want to use them! CRAN requires that package authors write documentation that goes with their package and these documents are designed to give you an idea of what functions are included and what the package can be used for.
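
You can pull up that documentation without leaving RStudio. Here is a small sketch, assuming you have already installed janitor; clean_names() is one of the functions in janitor and is used here only as an example.

```r
# Open the documentation index for a package:
# a list of its functions, each with a short description
help(package = "janitor")

# Open the help page for one particular function in that package
fn_help <- help("clean_names", package = "janitor")
fn_help
```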

When you are looking for information about a package there are a few places to look. Let’s use the janitor package as an example…

### Step 2: See if the author wrote a package vignette

A vignette is a long form guide to using the package. For the janitor package, there is a link to the vignette in the README file. janitor vignette
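
You can also check for vignettes from inside R. This is a rough sketch that assumes janitor is already installed; not every package comes with a vignette, so for other packages the list may come back empty.

```r
# List the vignettes that were installed with a package
janitor_vignettes <- vignette(package = "janitor")
janitor_vignettes

# Open a page in your browser with links to all of that package's vignettes
browseVignettes("janitor")
```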

Now that you have the tidyverse and here (and a few other potentially useful packages) installed, let’s read in some data.