Introduction

Working in R and RStudio

Athanasia Mo Mowinckel

CAPRO

We specialize in data processing and capture of large life science data for social sciences and humanities

CAPRO

  • Data flow and processing

  • Data capture

  • Project data management

  • Registry data handling

  • HPC analyses

Meet your neighbors

Pre-workshop checklist



https://www.capro.dev/workshop_rproj/#preparations

Asking for help

I’m stuck or need help

All good, ready to move on

Schedule 1

Topic                                       Duration
Saving source and blank slates              1 hour
Project-oriented workflow                   1 hour
How to name files & practicing safe paths   1 hour
Lunch break                                 30 minutes
R package management                        1.5 hours

Go here now



https://www.capro.dev/workshop_rproj/

Getting started

Checklist


R installed? Pretty recent?

     Recommended R ≥ 4.3.0

RStudio installed?

     I’m on 2023.06.1 Build 524     

Packages?

# Run in R
install.packages(c(
  "remotes",      # installing packages from GitHub
  "rmarkdown",    # rendering reports
  "fs",           # file system operations
  "here",         # navigating paths
  "usethis",      # for course materials
  "tidyverse"     # data-wrangling
))
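The checklist can also be ticked off from the R console. A minimal sketch, assuming the package list above and the recommended R ≥ 4.3.0; the names `needed`, `r_ok` and `missing_pkgs` are illustrative only and not part of the workshop materials.

```r
# packages from the checklist above
needed <- c("remotes", "rmarkdown", "fs", "here", "usethis", "tidyverse")

# is this R at least the recommended version?
r_ok <- getRversion() >= "4.3.0"

# which checklist packages are not yet in any library of this session?
missing_pkgs <- setdiff(needed, rownames(installed.packages()))

if (!r_ok) message("Consider updating R; this is ", R.version.string)
if (length(missing_pkgs) > 0) {
  message("Still to install: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All checklist packages are installed.")
}
```

Nothing is installed by this check; it only reports what is missing.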


What Did They Forget to Teach You?

Learning objectives

  1. Establish the concept of the project as the basic organizational unit of work.

  2. Apply best practices for working in RStudio projects and leverage their benefits, including

  • Creating robust file paths that travel well in time and space.

  • Constructing human and machine readable file names that sort nicely.

  • Differentiating workflow elements, analysis inputs, and analysis outputs in project structure to create navigable programming interfaces.

  • Restarting R frequently, with a blank slate.
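Two of these objectives can be previewed in a few lines of base R. This is a sketch with made-up script names (`*_step.R`) and a hypothetical `R/` folder; the here package from the checklist is another option for building paths, but the example below sticks to base R so it runs anywhere.

```r
# hypothetical script numbers, for illustration only
step <- c(1, 2, 10)

unpadded <- paste0(step, "_step.R")        # "1_step.R", "2_step.R", "10_step.R"
padded   <- sprintf("%02d_step.R", step)   # "01_step.R", "02_step.R", "10_step.R"

sort(unpadded)  # "10_step.R" lands before "2_step.R"
sort(padded)    # stays in run order

# build paths from their parts, relative to the project folder,
# rather than hard-coding an absolute path from one computer
file.path("R", padded)
```

Zero-padding keeps the names in run order for both people and file browsers.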

Why do we care about project management?

Portability

The ability to move the project without breaking the code or having to adapt it

  • you will change computers
  • you will reorganise your file structure
  • you will share your code with others

Reproducibility

The ability to rerun the entire process from scratch

  • not just for reviews
  • not just for best-practice science
  • also for future (or even present) you
  • and for your collaborators/helpers

Be organized

Be organized as you go,
not “tomorrow”

Don’t fret over past mistakes.

Raise the bar for new work.

Be organized



self-explaining >>> wordy, needy explainers

Be organized

>>>

  file salad
  + an out-of-date README

Good enough practices in scientific computing

PLOS Computational Biology

Wilson, Bryan, Cranston, Kitzes, Nederbragt, Teal (2017)

https://doi.org/10.1371/journal.pcbi.1005510

http://bit.ly/good-enuff

Practical Example

Your R installation

R packages

  • the natural unit for distributing R code

base R

  • 14 base + 15 recommended packages

  • ship with all binary distributions of R

For example, have you used lattice recently? 🤷

  • it came with your R installation, so you can use it out of the box

  • library(lattice)
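To see which base and recommended packages your own installation has, `installed.packages()` can filter on priority. A small sketch; the variable names are illustrative, and the counts on your machine may differ from the slide depending on R version and how R was installed.

```r
# priority is "base" or "recommended" for packages that ship with R
base_pkgs <- rownames(installed.packages(priority = "base"))
rec_pkgs  <- rownames(installed.packages(priority = "recommended"))

length(base_pkgs)   # number of base packages
length(rec_pkgs)    # number of recommended packages

# is lattice among the recommended packages here?
"lattice" %in% rec_pkgs
```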

Additional packages


CRAN, ~20k packages

# install from CRAN
install.packages("remotes")
# attach
library(remotes)

GitHub, ??? packages

# install via remotes
remotes::install_github("rstats-wtf/wtfdbg")
# attach
library(wtfdbg)

Where do packages live locally?


By default, in the default library

.Library


All libraries for the current session

.libPaths()


All installed packages

installed.packages()
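The three pieces above can be combined to see how many packages sit in each library. A minimal sketch, with illustrative variable names:

```r
# the default library that came with this R installation
default_lib <- .Library

# every library searched in the current session, in search order
libs <- .libPaths()

# one row per installed package, with the library it lives in
pkgs <- as.data.frame(installed.packages(), stringsAsFactors = FALSE)

# how many packages live in each library?
table(pkgs$LibPath)

# is the default library among the session libraries?
normalizePath(default_lib) %in% normalizePath(libs)
```

If more than one library shows up, packages you installed yourself are usually kept apart from the ones that shipped with R.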

Syntax aside: pipes

  • 2014+ magrittr pipe %>%

  • 2021+ (R ≥ 4.1.0) native R pipe |>

Isabella Velásquez (2022). Understanding the native R pipe |>. https://ivelasq.rbind.io/blog/understanding-the-r-pipe/

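# schematic: `whatever` and its arguments are placeholders
whatever(arg1, arg2, arg3, ...)
# the pipe passes the left-hand side in as the first argument
arg1 |>
  whatever(arg2, arg3)

# a concrete example: both calls give the same result
mean(0:10)
0:10 |>
  mean()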
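The pipe pays off once calls are nested. A short sketch using only base R functions, assuming R ≥ 4.1.0 for the native pipe:

```r
# nested calls read inside-out
nested <- round(mean(sqrt(0:10)), 2)

# piped calls read top to bottom
piped <- 0:10 |>
  sqrt() |>
  mean() |>
  round(2)

identical(nested, piped)  # TRUE
```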

Syntax aside: namespacing

dplyr::select()

  • tells R explicitly to use the function select from the package dplyr

  • can help to avoid name conflicts (e.g., MASS::select())

  • does not require library(dplyr)

# with dplyr attached
library(dplyr)
select(mtcars, mpg, cyl)
mtcars |>
  select(mpg, cyl)

# library(dplyr) not needed
dplyr::select(mtcars, mpg, cyl)
mtcars |>
  dplyr::select(mpg, cyl)
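What `::` does behind the scenes can be checked with MASS, which is a recommended package and so should already be installed. A sketch meant for a fresh session; `f` is just an illustrative name.

```r
# grab the function without attaching the package
f <- MASS::select

# which namespace does the function belong to?
environmentName(environment(f))

# `::` loaded the namespace ...
isNamespaceLoaded("MASS")

# ... but did not attach it to the search path
# (FALSE unless library(MASS) was called earlier in the session)
"package:MASS" %in% search()
```

Because MASS is never attached, a plain `select()` still resolves to whichever package you did attach, for example dplyr.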

🧐 Explore your R installation

install.packages("usethis")
library(usethis)
# saves project on desktop by default for most users
use_course("rstats-wtf/wtf-explore-libraries")
# use_course("rstats-wtf/wtf-explore-libraries", destdir = "my/new/location")
# can alternatively download from 
# https://github.com/rstats-wtf/wtf-explore-libraries

Read the README.md to get started.

  • 01_explore-libraries_spartan.R
    (directions to explore without suggested code)

  • 01_explore-libraries_comfy.R
    (directions to explore with suggested code)

  • 01_explore-libraries_solution.R
    (directions to explore with code solutions)
