a - Getting started

Non-technical intro to Nix

If you are familiar with the concept of package managers, you can skip to the next section.

To ensure that a project is reproducible you need to deal with at least four things: the version of R (or of any other language you use), the versions of the packages, the system dependencies those packages rely on, and the hardware architecture that all of this has to run on.

For the first three points, the consensus seems to be a mixture of Docker to deal with system dependencies, {renv} for the packages (or {groundhog}, or a fixed CRAN snapshot like those Posit provides) and the R installation manager to install the correct version of R (unless you use a Docker image as base that already ships the required version by default). As for the last point, the only way out is to be able to compile the software for the target architecture. There are a lot of moving pieces, and a lot of knowledge that you need to have to get it right.

But it turns out that this is not the only solution. Docker + {renv} (or some other way to deal with packages) is likely the most popular way to ensure reproducibility of your projects, but there are other tools to achieve this. One such tool is called Nix.

Nix is a package manager for Linux distributions, macOS and it even works on Windows if you enable WSL2. What’s a package manager? If you’re not a Linux user, you may not be aware. Let me explain it this way: in R, if you want to install a package to provide some functionality not included with a vanilla installation of R, you’d run this:

install.packages("dplyr")

It turns out that Linux distributions, like Ubuntu for example, work in a similar way, but for software that you’d usually install using an installer (at least on Windows). For example you could install Firefox on Ubuntu using:

sudo apt-get install firefox

(There are also graphical interfaces that make this process “more user-friendly”.) In Linux jargon, packages are simply what we call software (or I guess it’s all “apps” these days). These packages get downloaded from so-called repositories (think of CRAN, the repository of R packages) but for any type of software that you might need to make your computer work: web browsers, office suites, multimedia software and so on.

So Nix is just another package manager that you can use to install software.

But what interests us is not using Nix to install Firefox, but instead to install R and the R packages that we require for our analysis (or any other programming language that we need). But why use Nix instead of the usual ways to install software on our operating systems?

The first thing that you should know is that Nix’s repository, nixpkgs, is huge. Humongously huge. As I’m writing these lines, there are more than 100,000 pieces of software available, and the entirety of CRAN is also available through nixpkgs. So instead of installing R as you usually do and then using install.packages() to install packages, you could use Nix to handle everything. But still, why use Nix at all?

Nix has an interesting feature: using Nix, it is possible to install software in (relatively) isolated environments. So using Nix, you can install as many versions of R and R packages as you need. Suppose that you start working on a new project. As you start the project, with Nix, you would install a project-specific version of R and R packages that you would only use for that particular project. If you switch projects, you’d switch versions of R and R packages.

If you are familiar with {renv}, you should see that this is the same idea: the difference is that not only will you have a project-specific library of R packages, you will also have a project-specific R version. So if you start a project now, you’d have R version 4.2.3 installed (the latest version available in nixpkgs but not the latest version available, more on this later), with the accompanying versions of R packages, for as long as the project lives (which can be a long time). If you start a project next year, then that project will have its own R, maybe R version 4.4.2 or something like that, and the set of required R packages that would be current at that time. This is because Nix always installs the software that you need in separate, isolated environments on your computer. So you can define an environment for one specific project.

But Nix goes even further: not only can you install R and R packages using Nix (in isolated, project-specific environments), Nix even installs the required system dependencies. So for example if I need {rJava}, Nix will make sure to install the correct version of Java as well, always in that project-specific environment (so if you already have some Java version installed on your system, there won’t be any interference).

What’s also pretty awesome is that you can use a specific version of nixpkgs to always get exactly the same versions of all the software whenever you build that environment to run your project’s code. The environment gets defined in a simple plain-text file, and anyone using that file to build the environment will get exactly, byte by byte, the same environment as you when you initially started the project. This holds regardless of the operating system that is used.

The Nix package manager

Nix is a package manager that can be installed on your computer (regardless of OS) and can be used to install software like with any other package manager. If you’re familiar with the Ubuntu Linux distribution, you likely have used apt-get to install software. On macOS, you may have used homebrew for similar purposes. Nix functions in a similar way, but has many advantages over classic package managers. The main advantage of Nix, at least for our purposes, is that its repository of software is huge. As of writing, it contains more than 100,000 packages, and the entirety of CRAN and Bioconductor is available through Nix’s repositories.

This means that using Nix, it is possible to install not only R, but also all the packages required for your project. The obvious question is why use Nix instead of simply installing R and R packages as usual. The answer is that Nix makes sure to install every dependency of any package, up to required system libraries. For example, the {xlsx} package requires the Java programming language to be installed on your computer to successfully install. This can be difficult to achieve, and {xlsx} bullied many R developers throughout the years (especially those using a Linux distribution, sudo R CMD javareconf still plagues my nightmares).

But with Nix, it suffices to declare that we want the {xlsx} package for our project, and Nix figures out automatically that Java is required and installs and configures it. It all just happens without any required intervention from the user. The second advantage of Nix is that it is possible to pin a certain revision of the Nix packages’ repository (called nixpkgs) for our project. Pinning a revision ensures that every package that Nix installs will always be at exactly the same versions, regardless of when in the future the packages get installed.

Other “Nix”es

There are several implementations of the Nix package manager that you might want to try if you’re feeling adventurous… for now, we recommend sticking with Nix.

You can give Lix a try, or Tvix.

rix workflow

The idea of {rix} is for you to declare the environment you need using the provided rix() function. rix() is the package’s main function and generates a file called default.nix which is then used by the Nix package manager to build that environment. Ideally, you would set up such an environment for each of your projects. You can then use this environment to either work interactively, or run R scripts. It is possible to have as many environments as projects, and software that is common to environments will simply be re-used and not get re-installed to save space. Environments are isolated from each other, but can still interact with your system’s files, unlike with Docker where a volume must be mounted. Environments can also interact with the software installed on your computer through the usual means, which can sometimes lead to issues. For example, if you already have R installed, and a user library of R packages, more caution is required to properly use environments managed by Nix.

It is important at this stage to understand that you should not call install.packages() from a running Nix environment. If you want to add packages to a Nix environment while analyzing data, you need to add them to the default.nix expression and rebuild the environment. This is explained in greater detail in vignette("d1-installing-r-packages-in-a-nix-environment").

To avoid interference between your main R library of packages and your Nix environments, calling rix() will also run rix_init(), which will create a custom .Rprofile in the project’s directory. This .Rprofile will ensure that if you have a user library of packages, these won’t get loaded by an R version running in a Nix shell. It will also redefine install.packages() to throw an error if you try to use it.

rix() has several arguments. For example:

rix(
  r_ver = "latest",
  r_pkgs = c("dplyr", "chronicler"),
  ide = "other"
)

The call above writes a default.nix file in the current working directory. This default.nix can in turn be used by Nix to build an environment containing the latest version of R, with the {dplyr} and {chronicler} packages.

Take note of the ide = "other" argument: this argument, and the values it can take, are discussed further in vignette("e-interactive-use"), but first finish reading this vignette and then the vignettes whose names start with a “d”.

Using default.nix files

The Nix package manager can be used to build reproducible development environments according to the specifications found in the generated default.nix files, which contain a Nix expression. An expression is Nix jargon for a function with multiple inputs and one output, this output being our development environment. {rix} does not require Nix to be installed to generate valid expressions (but does require an internet connection), so you could generate expressions and use them on other machines. To actually build an environment using a default.nix file, go to where you chose to write it (ideally in a new, empty folder that will be the root folder of your project) and use the Nix package manager to build the environment. Run the following command in a terminal:

nix-build

Nix installs packages in a dedicated folder on your computer, called the Nix store.

Once Nix is done building the environment, you can start working in it interactively by using the following command in a terminal emulator (not the R console):

nix-shell

You will drop into a Nix shell which provides the installed software. It is not mandatory to call nix-build first: you can immediately call nix-shell. The advantage of using nix-build first is that it creates a file called result which will prevent the environment from being garbage collected if you clean the Nix store.

If you want to build an environment for an older version of R, you might get a warning telling you that you cannot build the expression, but that you can still drop into it directly with nix-shell.

If you want to completely isolate your Nix environment from the rest of the system, we recommend using nix-shell --pure to drop into the environment, as described in the documentation of rix_init().

Finally, if you want to delete an environment, delete the result file first (if you used nix-build) and then call nix-store --gc, which will delete all the orphaned packages.
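The commands from this section can be gathered into a few small shell functions. This is only a sketch: the function names and the DRY_RUN variable are made up for this example and are not part of Nix or {rix}. With DRY_RUN left at its default of 1, each command is printed rather than run, so you can check what would happen before anything touches the Nix store.

```shell
#!/bin/sh
# Sketch of the life cycle of a Nix environment defined by ./default.nix.
# Meant to be sourced from the project's root folder.
# DRY_RUN is a hypothetical variable used only in this example:
# 1 (the default) prints each command, 0 actually runs it.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Build the environment; nix-build also creates ./result, which
# protects the environment from garbage collection.
build_env() {
  run nix-build
}

# Drop into the environment, isolated from the rest of the system.
enter_env() {
  run nix-shell --pure
}

# Delete the environment: remove ./result first (-f so that nothing
# fails if nix-build was never called), then collect the garbage.
delete_env() {
  run rm -f result
  run nix-store --gc
}
```

Once these functions are sourced, calling build_env prints "+ nix-build"; setting DRY_RUN=0 beforehand runs the commands for real.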

Now that you know more about Nix and {rix}, it is time to get these tools installed on your system.