@przytu1
Last active January 23, 2017 15:43

RR

Concept

R developers need a tool to reproduce the runtime environments of their projects. The R ecosystem evolves quickly: the R interpreter, packages, and system libraries are all updated frequently. In addition, different data scientists work in different configurations. Together, these make results hard to reproduce.

Existing solutions

  • Packrat (only R libraries, private lib for a project)
  • Switchr (only switching between libraries using .libPaths())
  • Checkout (recreating older versions of packages)
  • Docker (DevOps tool, not very friendly for a regular R user)

We want to create an enhanced R interpreter. Like IPython for Python, it will be equipped with commands that make it easy to switch between R sessions running in specific, well-defined environments. The backend for these environments will be implemented with Docker containers.
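
To make the Docker backend more concrete, here is a minimal sketch of how a command such as use.R() might be translated into a `docker run` invocation. Everything in it is an assumption made for illustration: the helper name `rr_docker_command`, the `<repo>:<version>` image naming scheme, the use of `rocker/r-ver` as the image repository, and the `/project` mount point. None of it is part of RR itself. The sketch only builds the command and does not run Docker.

```r
# Hypothetical helper (not part of RR): build the `docker run` command that
# would start an interactive R session for a given R version.
# Assumption: images are named "<repo>:<version>". "rocker/r-ver" is only an
# example repository; a tag may not exist for every R version.
rr_docker_command <- function(r_version,
                              image_repo = "rocker/r-ver",
                              workdir = getwd()) {
  image <- paste0(image_repo, ":", r_version)
  args <- c(
    "run", "--rm", "-it",                 # throwaway, interactive container
    "-v", paste0(workdir, ":/project"),   # mount the project directory
    "-w", "/project",                     # start the session inside it
    image, "R"
  )
  list(command = "docker", args = args)
}

cmd <- rr_docker_command("3.3.2", workdir = "/home/user/analysis")
cat(cmd$command, paste(cmd$args, collapse = " "), "\n")
```

Returning the command as a list rather than executing it keeps the sketch easy to inspect; a real implementation could pass `cmd$command` and `cmd$args` to `system2()`.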

The final goal is to integrate the RR interpreter with RStudio to make reproducible research even more accessible.

Use Cases

  1. Publishing research and associated R code

    With RR, data scientists who download published research can reproduce its results without spending time recreating the environment. Research can be published together with its environment image, so anyone who comes back to the code years later should still be able to run it easily.

  2. Teamwork

    When you do research in a team, it is important that everyone runs the code in the same configuration. RR reduces the overhead of recreating the environment on different machines and of keeping it in sync as it changes during the research.

  3. Sharing Shiny applications

    When a teammate creates a Shiny app and you want to run it locally, you don't need to reproduce the environment manually.

  4. Experimenting with how code behaves under different R or package versions

    RR lets you switch environments with a single command, which makes this kind of experimentation practical.

Example Workflow

~ RR
R version 3.3.2 -- "Sincere Pumpkin Patch"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> use.env("appsilon/my-script")
Downloading environment...
Using environment appsilon/my-script. 30 packages available. R version is 3.1.0

> show.packages()
devtools - 1.12.1
shiny    - 0.13.2
...
stringr  - 1.13

> use.R("3.0.0")
Downloading environment...
Started new R 3.0.0 session. You have 0 packages available

> install.packages("dplyr")
> install.packages("DT")
> save.env("appsilon/project-r-3.0.0")
> publish.env()
Uploading environment...
Environment is available at http://rr.appsilondatascience.com/appsilon/project-r-3.0.0

> available.env()
Available environments on your machine:
appsilon/my-script
appsilon/project-r-3.0.0
R 3.0.0

> install.env("appsilon/cool-project")
Downloading environment...
Done!

> available.env()
Available environments on your machine:
appsilon/my-script
appsilon/project-r-3.0.0
R 3.0.0
appsilon/cool-project
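
The bookkeeping behind save.env(), install.env(), available.env() and publish.env() in the workflow above can be modelled in a few lines. This is a rough sketch under stated assumptions, not the RR implementation: the `rr_registry` constructor and its `add`, `available` and `url` functions are hypothetical names, the registry lives only in memory, and the URL scheme (base URL, then "/", then the environment name) is inferred from the single URL shown in the transcript.

```r
# Hypothetical in-memory model of the local environment registry.
# Nothing here is the real RR API; all names are illustrative.
rr_registry <- function(base_url = "http://rr.appsilondatascience.com") {
  local_envs <- character(0)

  list(
    # Record an environment as available locally (save.env / install.env).
    add = function(name) {
      local_envs <<- union(local_envs, name)  # union() drops duplicates
      invisible(name)
    },
    # Names of the environments on this machine (available.env).
    available = function() local_envs,
    # URL where a published environment would be served (publish.env).
    # Assumption: the URL is simply <base_url>/<environment name>.
    url = function(name) {
      stopifnot(name %in% local_envs)
      paste(base_url, name, sep = "/")
    }
  )
}

reg <- rr_registry()
reg$add("appsilon/my-script")
reg$add("appsilon/project-r-3.0.0")
reg$add("appsilon/my-script")  # adding the same name twice has no effect
print(reg$available())
print(reg$url("appsilon/project-r-3.0.0"))
```

A closure keeps the state private to one registry object, which makes the sketch easy to test; an actual tool would presumably persist this list on disk or query Docker for the images it has locally.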