Package Management

Working in R and RStudio

Athanasia Mo Mowinckel

What are packages / libraries in R?

  • Extensions to the programming language
    • contain functions and their documentation
    • standardised so they are easy to install and use across users
  • Often written by other users to solve specific tasks in R
    • solving particular statistical problems
    • new types of data visualisation

Common package issues in pipelines

A package masks the function of another package I use

  • example: select() from {dplyr} being masked by select() from {MASS}

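Solution 1:

Load libraries in reverse order of importance: the package loaded last masks functions of the same name from packages loaded earlier.

  • i.e. load {dplyr} last, and its select() will be prioritised over the one from {MASS}.

library(MASS)   # loaded first: its select() gets masked
library(dplyr)  # loaded last: dplyr's select() takes priority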
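
The effect of load order can be tried out without installing anything. The sketch below is only an illustration: it simulates two packages with attached lists, and the names toy_MASS and toy_dplyr are made up for this example. It assumes a session where no other attached package defines select().

```r
# Two toy "packages", simulated as attached lists (hypothetical names)
attach(list(select = function() "select() from toy_MASS"), name = "toy_MASS")
# R reports here that select is masked from toy_MASS, just like library() does
attach(list(select = function() "select() from toy_dplyr"), name = "toy_dplyr")

# The one attached last sits earlier on the search path, so it wins
winner <- select()
print(winner)  # "select() from toy_dplyr"

# find() lists every attached location that defines select(), in lookup order
print(find("select"))

# Clean up the search path again
detach("toy_dplyr", character.only = TRUE)
detach("toy_MASS", character.only = TRUE)
```

With real packages, find("select") after library(MASS); library(dplyr) should show the same pattern, with dplyr listed first.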

Solution 2:

If you only need a single function from another package, call that function directly rather than loading the entire library.

  • i.e. run MASS::lm.gls() rather than library(MASS); lm.gls().

  • the double colon :: lets you access a library’s functions without loading the entire library.

    • this is called namespacing
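
A small sketch of namespacing in action. To keep it runnable on any R installation, it uses tools::file_ext() as a stand-in for MASS::lm.gls(); the file name is made up for the example.

```r
# Record what is attached before the namespaced call
before <- search()

# Call one function from {tools} without library(tools)
ext <- tools::file_ext("analysis.R")
print(ext)  # "R"

# The search path is unchanged, so nothing new can mask your other functions
after <- search()
print(identical(before, after))
```

The same pattern should apply to any installed package, e.g. dplyr::select() when {MASS} has been loaded after {dplyr}.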

Common package issues in pipelines

Preventative measures:

  • load all libraries you need at the top of your scripts

    • this helps you control the order things are loaded and possible function masking

    • makes it clear to anyone else running the script what the dependencies are

  • When introducing a new package to your script

    • add it to the top of the script

    • restart your R session so it is clean

    • start re-running your code to make sure everything still works as expected

    • if it doesn’t, make sure it’s the first library to be loaded, and try again in a fresh R session
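
One way the top of a script could look, as a sketch rather than a fixed recipe. The dependency list is hypothetical ({stats} and {utils} are used only because they ship with R); swap in the packages your script actually needs.

```r
# Hypothetical dependency list for this script; replace with your own packages
required_packages <- c("stats", "utils")

# Fail early, with a clear message, if anything is not installed
is_installed <- vapply(required_packages, requireNamespace, logical(1), quietly = TRUE)
missing_packages <- required_packages[!is_installed]
if (length(missing_packages) > 0) {
  stop("Install these packages first: ", paste(missing_packages, collapse = ", "))
}

# Load in reverse order of importance (most important last)
library(utils)
library(stats)

# List functions that are defined in more than one attached location
masked <- conflicts(detail = TRUE)
print(names(masked))
```

Checking conflicts() right after the library() calls, in a fresh session, gives a quick overview of what is being masked before the rest of the script runs.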

Common package issues in pipelines

The package I used 1 year ago no longer exists

There are archives of packages, like MRAN, that will likely help you. But you’ll need to know which version you used, etc.

Can be tricky, but the {checkpoint} package might be of great assistance!
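
If you do know the version, one option not covered in these slides is CRAN's own archive of older source packages. The sketch below only builds the download address; the helper name cran_archive_url() is made up for this example, and {dplyr} 1.0.0 is just an illustration. Note that only superseded versions live in the archive, not the current release.

```r
# Hypothetical helper: build the address of an archived source package on CRAN
cran_archive_url <- function(package, version) {
  sprintf(
    "https://cran.r-project.org/src/contrib/Archive/%s/%s_%s.tar.gz",
    package, package, version
  )
}

url <- cran_archive_url("dplyr", "1.0.0")
print(url)

# To install from that address (needs internet and build tools), uncomment:
# install.packages(url, repos = NULL, type = "source")
```

Installing from source this way does not pin the package's own dependencies, which is part of why dedicated package managers exist.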

R package managers

  • RStudio package manager

    • RStudio enterprise solution
  • packrat

    • one of the first

    • a little complicated to use, requiring some knowledge of package paths etc.

  • checkpoint

    • I have no experience with it, but it looks great!
  • renv

    • new and widely used

    • creates fairly simple procedures to follow

    • can become complicated in some situations

Use renv to make your R projects more:

  • Isolated
    • installing or updating a package for one project won’t break other projects
    • gives each project its own private package library
  • Portable
    • transport your projects from one computer to another
    • makes it easy to install the packages your project depends on.
  • Reproducible
    • records the exact package versions you depend on
    • ensures those exact versions are the ones that get installed wherever you go.
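
These three properties roughly map onto three renv functions. The sketch below is meant to be read first and run later: the run_renv_workflow switch is a made-up safety guard for this example, because the calls modify the project they are run in.

```r
# Safety switch (hypothetical, for illustration): these calls change your project,
# so they only run once you set this to TRUE in an interactive session.
run_renv_workflow <- FALSE

if (run_renv_workflow) {
  renv::init()      # isolated: create a project-private library and renv.lock
  renv::snapshot()  # reproducible: record exact package versions in renv.lock
  renv::restore()   # portable: reinstall the versions recorded in renv.lock
}
```

In day-to-day use you would typically call init() once per project, snapshot() after installing or updating packages, and restore() on a new computer.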

Using {renv}

renv should be used in projects

If you initiate renv when you’re not in a project, it will create one for you.

To initiate renv:

renv::init()

How did it find these packages?

It looks in all project files for:

  • library(package)
  • require(package)
  • requireNamespace("package")
  • package::function()

It also looks for packages listed in the DESCRIPTION file under “Depends” or “Imports”.
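
To make the idea concrete, here is a toy scanner for the four code patterns listed above. It is a simplified illustration and not how renv is implemented; the function name find_packages() and the example lines are made up. For real projects, renv::dependencies() does this job properly.

```r
# Toy scanner: NOT renv's implementation, only an illustration of the idea
find_packages <- function(code_lines) {
  pkg <- "[A-Za-z][A-Za-z0-9.]*"
  # Matches library(pkg), require(pkg) and requireNamespace("pkg")
  call_pattern <- paste0("(library|requireNamespace|require)\\([[:space:]]*[\"']?", pkg)
  # Matches pkg::function
  ns_pattern <- paste0(pkg, "::")

  call_hits <- unlist(regmatches(code_lines, gregexpr(call_pattern, code_lines)))
  ns_hits <- unlist(regmatches(code_lines, gregexpr(ns_pattern, code_lines)))

  # Strip everything except the package name from each match
  unique(c(
    sub("^.*\\([[:space:]]*[\"']?", "", call_hits),
    sub("::$", "", ns_hits)
  ))
}

# Made-up example script, one element per line
example_code <- c(
  "library(dplyr)",
  "require(MASS)",
  "if (requireNamespace(\"ggplot2\", quietly = TRUE)) message(\"ok\")",
  "long <- tidyr::pivot_longer(df, everything())",
  "# a comment that mentions no packages"
)

found <- find_packages(example_code)
print(found)
```

A plain text search like this is easily fooled, for example by package names inside comments or strings, so treat it only as a picture of what renv looks for.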

Demo

Your turn


Explore the changes {renv} makes to a project.


library(usethis)
# saves project on desktop by default for most users
use_course("capro-uio/wtf-explore-renv")
# use_course("capro-uio/wtf-explore-renv", destdir = "my/new/location")
# can alternatively download from 
# https://github.com/capro-uio/wtf-explore-renv


Read the README.md to get started.
