11  Continuous Integration with GitHub Actions

11.1 Introduction

We are almost at the end of our journey. In the previous chapters, we built reproducible environments with Nix, organised our code into pure functions, added robustness with monads, proved correctness with unit tests, managed our collaboration with Git, and bundled everything into shareable packages. We can now run our pipelines in a 100% reproducible way.

However, all of this still requires manual steps. And maybe that’s not a problem; if your environment is set up and users only need to drop into a Nix shell and run the pipeline, that’s already a huge improvement. But you should keep in mind that manual steps don’t scale. Imagine you are part of a team that needs to quickly ship products to clients. Several people contribute to the product, and you might need to work on multiple projects in the same day. You and your teammates should be focusing on writing code, not on repetitive tasks like building images or running tests. Ideally, we would want to automate these steps. That is what we are going to learn in this chapter.

This chapter will introduce you to Continuous Integration (CI) with GitHub Actions. You will learn how to set up workflows that automatically run your tests when you push code, how to build Docker images and recover artifacts, and how to run your pipelines directly from GitHub’s servers. Because we’re using Git to trigger all the events and automate the whole pipeline, this approach is sometimes called GitOps.

You may have heard the term “CI/CD,” where CD stands for Continuous Deployment or Continuous Delivery. We will focus on CI in this chapter. Continuous Deployment (automatically pushing results to a database, dashboard, or API) is highly specific to your organisation and infrastructure. What we cover here, however, gives you the foundation: once your pipeline runs reliably on CI, the “deployment” step is just one more workflow job pointing to wherever your results need to go.

11.2 Getting your repo ready for GitHub Actions

Obviously, you should use a project that is versioned on GitHub; use the package we’ve developed previously. If you go to its GitHub page, you should see an “Actions” tab at the top.

This will open a new view where you can select from many available, ready-to-use actions. “Actions” are pre-made scripts that execute commands you might need: setting up R or Python, running tests, and so on. Since we’re using Nix, we don’t really need any actions to set up our environments; however, we might still want some pre-made actions, for instance to upload artifacts.

To actually configure our repository to run actions, we need to add a file to our project under the .github/workflows directory (create it if needed). In it, write a YAML file called hello.yaml with the following content:

name: Hello world
on: [push]
jobs:
  say-hello:
    runs-on: ubuntu-latest
    steps:
      - run: echo "Hello from Github Actions!"
      - run: echo "This command is running from an Ubuntu VM each time you push."

Let’s study this workflow definition line by line:

name: Hello world

Simply gives a name to the workflow.

on: [push]

When should this workflow be triggered? Here, whenever something gets pushed.

jobs:

What should actually happen? This defines the list of jobs to run.

  say-hello:

This defines the say-hello job.

    runs-on: ubuntu-latest

This job should run on an Ubuntu VM. You can also run jobs on Windows or macOS VMs, but these use more compute minutes than a Linux VM (which doesn’t matter for public projects; for private projects, the amount of compute minutes is limited).

    steps:

What are the different steps of the job?

      - run: echo "Hello from Github Actions!"

First, run the command echo "Hello from Github Actions!". This command runs inside the VM. Then, run this next command:

      - run: echo "This command is running from an Ubuntu VM each time you push."

If we take a look at the commit we just pushed on GitHub, we see a yellow dot next to the commit name. This means that an action is running. We can then take a look at the output of the job, and see that our commands, defined with the run statements in the workflow file, succeeded and echoed what we asked them to.

11.3 Nix and GitHub Actions

To set up Nix on GitHub Actions, you can use the following steps (create a new file called run-tests.yaml):

- name: Install Nix
  uses: cachix/install-nix-action@v31
  with:
    nix_path: nixpkgs=https://github.com/rstats-on-nix/nixpkgs/archive/r-daily.tar.gz

- name: Setup Cachix
  uses: cachix/cachix-action@v15
  with:
    name: rstats-on-nix

If your repository contains a default.nix file, the same environment you’ve been using locally can be used on GitHub Actions just as easily. Alternatively, you can generate the default.nix from the gen-env.R script:

- name: Build dev env
  run: |
    nix-shell --expr "$(curl -sl https://raw.githubusercontent.com/ropensci/rix/main/inst/extdata/default.nix)" --run "Rscript -e 'source(\"gen-env.R\")'"

You can then use the shell to run whatever you need. For example, if you’re developing a package, you could run unit tests on each push:

- name: devtools::test() via nix-shell
  run: nix-shell --run "Rscript -e \"devtools::test(stop_on_failure = TRUE)\""

stop_on_failure = TRUE is needed to make the step fail if there’s an error; otherwise, the step would succeed even with failing tests.
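The reason this matters is that GitHub Actions marks a step as failed only when its command exits with a non-zero status. Here is a minimal shell sketch of that mechanic (the status_of helper is purely illustrative):

```shell
# GitHub Actions decides pass/fail per step from the exit status of
# its command: 0 means success, anything else fails the step.
status_of() {
  local status=0
  "$@" || status=$?
  echo "exit status: $status"
}
status_of true    # stands in for a test suite that passes
status_of false   # stands in for a test suite that fails
```

With stop_on_failure = TRUE, Rscript exits with a non-zero status when tests fail, which is what makes the CI step go red.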

Of course, if you’re developing a Python package, use nix-shell --run "pytest" instead to run the tests.

I highly recommend you run tests when pull requests get opened:

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]

This will ensure that if someone contributes to your project, you know immediately if what they did breaks tests or not. If it does, ask them to fix the code until tests pass.

11.4 Running a dockerized workflow

This next example can be found in this repository. It doesn’t use Nix, {rix}, or {rixpress}; the point here is to show how a Docker container can be executed on GitHub Actions, and how artifacts can be recovered. The process is always the same, regardless of what is inside the Docker image. If you want to follow along, fork this repository.

This is what our workflow file looks like:

name: Reproducible pipeline

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]

jobs:

  build:

    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v5
    - name: Build the Docker image
      run: docker build -t my-image-name .
    - name: Docker Run Action
      run: docker run --rm --name my_pipeline_container -v /github/workspace/fig/:/home/graphs/:rw my-image-name
    - uses: actions/upload-artifact@v4
      with:
        name: my-figures
        path: /github/workspace/fig/

For now, let’s focus on the run statements, because these should be familiar:

run: docker build -t my-image-name .

and:

run: docker run --rm --name my_pipeline_container -v /github/workspace/fig/:/home/graphs/:rw my-image-name

The only new thing here is that the path has been changed to /github/workspace/. This is, so to speak, the home directory of your repository on the VM. What is actually new is the uses keyword:

uses: actions/checkout@v5

This action checks out your repository inside the VM, so the files in the repo are available there. Then, there’s this action:

- uses: actions/upload-artifact@v4
  with:
    name: my-figures
    path: /github/workspace/fig/

This action takes what’s inside /github/workspace/fig/ (which will be the output of our pipeline) and makes the contents available as so-called “artifacts”. Artifacts are the outputs of your workflow; in our case, the output of the pipeline. So let’s run this by pushing a change, and take a look at these artifacts!

After the action is done running, you will be able to download a zip file containing the plots. It is thus possible to rerun our workflow in the cloud. This has the advantage that we can now focus on simply changing the code, and not bother with boring manual steps. For example, let’s change this target in the _targets.R file:

tar_target(
  commune_data,
  clean_unemp(
    unemp_data,
    place_name_of_interest = c(
      "Luxembourg", "Dippach",
      "Wiltz", "Esch/Alzette",
      "Mersch", "Dudelange"),
    col_of_interest = active_population)
)

I’ve added “Dudelange” to the list of communes to plot. Pushing this change to GitHub triggers the action we defined before. The plots (artifacts) get refreshed, and we can download them. Take a look and see that Dudelange was added to the communes.png plot!

It is also possible to “deploy” the plots directly to another branch, and do much, much more. I just wanted to give you a little taste of GitHub Actions (and, more generally, GitOps). The possibilities are virtually limitless, and I still can’t get over the fact that GitHub Actions is free for public repositories.

11.5 Building a Docker image and pushing it to a registry

It is also possible to build a Docker image and make it available on an image registry. You can see how this works in this repository. This image can then be used as a base for other reproducible pipelines, as in this repository. Why do this? Because of “separation of concerns”: you could have one repository that builds an image containing your development environment (this could be an image with a specific version of R and R packages built with Nix), and then have as many repositories as you have projects, each running its pipeline with that development environment image as a base. Simply add the project-specific packages that you need for each project.

11.6 Running a rixpress Pipeline from GitHub Actions

With Nix and {rixpress}, running your pipeline directly on GitHub Actions is straightforward. Because Nix handles all dependencies reproducibly, you don’t need Docker as an intermediary. The rixpress_demos repository contains several complete examples; here we will walk through the key steps.

The workflow triggers on pushes and pull requests to main:

on:
  pull_request:
    branches: [main, master]
  push:
    branches: [main, master]

After checking out the repository and installing Nix (with Cachix for faster builds), the first step generates or regenerates the development environment from the gen-env.R script:

- name: Build dev env
  run: |
    nix-shell -p R "rPackages.rix" "rPackages.rixpress" --run "Rscript gen-env.R"

Next, the pipeline definition is generated from gen-pipeline.R:

- name: Generate pipeline
  run: |
    nix-shell --quiet --run "Rscript gen-pipeline.R"

You can optionally visualise the DAG to verify the pipeline structure:

- name: Check DAG
  run: |
    nix-shell --quiet -p haskellPackages.stacked-dag --run "stacked-dag dot _rixpress/dag.dot"

Finally, build and inspect the pipeline:

- name: Build pipeline
  run: |
    nix-shell --quiet --run "Rscript -e 'rixpress::rxp_make()'"

- name: Inspect built derivations
  run: |
    nix-shell --quiet --run "Rscript -e 'rixpress::rxp_inspect()'"

- name: Show result
  run: |
    nix-shell --quiet --run "Rscript -e 'rixpress::rxp_read(\"confusion_matrix\")'"

11.6.1 Caching Pipeline Outputs Between Runs

While Nix caches derivations, CI runners are ephemeral: each run starts fresh. To avoid rebuilding the entire pipeline every time, {rixpress} provides rxp_export_artifacts() and rxp_import_artifacts() to persist outputs between runs.

Before building, check if cached outputs exist and import them:

- name: Import cached outputs if available
  run: |
    if [ -f "../outputs/my_pipeline/pipeline_outputs.nar" ]; then
      nix-shell --quiet --run "Rscript -e 'rixpress::rxp_import_artifacts(archive_file = \"../outputs/my_pipeline/pipeline_outputs.nar\")'"
    else
      echo "No cached outputs found, will build from scratch"
    fi
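The `if [ -f ... ]` test is what decides between a warm start (import the archive) and a cold build. The guard can be rehearsed locally with a throwaway path (the .nar path below is hypothetical, and the echo lines stand in for the real rixpress calls):

```shell
# First run: no archive yet, so we'd build from scratch.
cache_file="$(mktemp -d)/pipeline_outputs.nar"
if [ -f "$cache_file" ]; then
  echo "importing cached outputs"
else
  echo "no cached outputs found, building from scratch"
fi

# Simulate a previous run having exported the archive...
touch "$cache_file"

# ...so the next run takes the import branch instead.
if [ -f "$cache_file" ]; then
  echo "importing cached outputs"
fi
```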

After building, export the outputs so they can be reused:

- name: Export outputs to avoid rebuild
  run: |
    mkdir -p ../outputs/my_pipeline
    nix-shell --quiet --run "Rscript -e 'rixpress::rxp_export_artifacts(archive_file = \"../outputs/my_pipeline/pipeline_outputs.nar\")'"

Finally, commit the cached outputs back to the repository:

- name: Push cached outputs
  run: |
    cd ..
    git config --global user.name "GitHub Actions"
    git config --global user.email "actions@github.com"
    git pull --rebase --autostash origin main
    git add outputs/my_pipeline/pipeline_outputs.nar
    if git diff --cached --quiet; then
      echo "No changes to commit."
    else
      git commit -m "Update cached pipeline outputs"
      git push origin main
    fi
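The `git diff --cached --quiet` guard is what prevents empty commits: it exits with status 0 when nothing is staged. You can watch it take both branches in a throwaway repository (a sketch; names and messages mirror the workflow above):

```shell
# Reproduce the "commit only if something is staged" guard locally.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "GitHub Actions"
git config user.email "actions@github.com"
git commit -q --allow-empty -m "initial commit"

echo "artifact" > pipeline_outputs.nar
git add pipeline_outputs.nar
if git diff --cached --quiet; then
  echo "No changes to commit."
else
  git commit -qm "Update cached pipeline outputs"
  echo "Committed."
fi

# Re-adding the same, unchanged file stages nothing, so nothing is committed.
git add pipeline_outputs.nar
if git diff --cached --quiet; then
  echo "No changes to commit."
fi
```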

This pattern ensures that only changed derivations are rebuilt on subsequent runs. The .nar file format is Nix’s archive format and contains all the built outputs.

11.6.2 The Easy Way: rxp_ga()

If the above seems like a lot of boilerplate, {rixpress} provides a helper function that generates a complete GitHub Actions workflow for you:

rixpress::rxp_ga()

This creates a .github/workflows/run-rxp-pipeline.yaml file that handles everything: installing Nix, setting up Cachix, generating the environment and pipeline, importing and exporting artifacts, and storing them in a dedicated rixpress-runs orphan branch. Using an orphan branch keeps your main branch clean while persisting the cached outputs between runs.

For most projects, running rxp_ga() once and committing the generated workflow file is all you need to get your pipeline running on CI.

11.7 GitHub Actions without Nix

If you’re not using Nix, you’ll have to set up GitHub Actions manually. Suppose you have a package project and want to run unit tests on each push. See for example the {myPackage} package, in particular this file. This action runs on each push and pull request on Windows, Ubuntu and macOS:

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]

jobs:
  rcmdcheck:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]

Several steps are executed, all using pre-defined actions from the r-lib project:

    steps:
    - uses: actions/checkout@v4
    - uses: r-lib/actions/setup-r@v2
    - uses: r-lib/actions/setup-r-dependencies@v2
      with:
        extra-packages: any::rcmdcheck
        needs: check
    - uses: r-lib/actions/check-r-package@v2

An action such as r-lib/actions/setup-r@v2 will install R on any of the supported operating systems without requiring any configuration from you. Without such an action, you would need to define three separate steps: one executed on Windows, one on Ubuntu, and one on macOS, each installing R in its operating-system-specific way.

Check out the workflow results here to see how the package could be improved.

Here again, using Nix simplifies this process immensely. Look at this workflow file from {rix}’s repository here. Setting up the environment is much easier, as is running the actual test suite.

11.8 Advanced patterns

Now that you understand the basics, let’s look at some more advanced patterns that will make your CI workflows more efficient and informative.

11.8.1 Caching with Cachix

Building Nix environments from scratch on every CI run can be slow. Cachix solves this by providing a binary cache for your Nix derivations. Once you build something, subsequent runs can download the pre-built binaries instead of rebuilding from source.

To use Cachix, you first need to create a free account at cachix.org and create a cache. Then, generate an auth token and add it as a secret in your GitHub repository settings (under Settings → Secrets and variables → Actions). Call it something like CACHIX_AUTH.

Here is a workflow that builds your development environment and pushes the results to your Cachix cache:

name: Update Cachix cache

on:
  push:
    branches: [main]

jobs:
  build-and-cache:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install Nix
        uses: DeterminateSystems/nix-installer-action@main

      - uses: cachix/cachix-action@v15
        with:
          name: your-cache-name
          authToken: '${{ secrets.CACHIX_AUTH }}'

      - name: Build and push to cache
        run: |
          nix-build
          nix-store -qR --include-outputs $(nix-instantiate default.nix) | cachix push your-cache-name

The key line here is the nix-store command at the end. It queries all the dependencies of your build and pushes them to Cachix. The next time you or anyone else runs this workflow, the cachix/cachix-action will automatically pull from your cache, dramatically speeding up the build.

If you want to build on both Linux and macOS (since Nix binaries are platform-specific), you can use a matrix:

jobs:
  build-and-cache:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest]

11.8.2 Storing outputs in orphan branches

When running a pipeline on CI, you often want to keep the outputs (plots, data, reports) without committing them to your main branch. A clean solution is to store them in an orphan branch. An orphan branch has no commit history and is completely separate from your main code.

Here is the pattern:

- name: Check if outputs branch exists
  id: branch-exists
  run: git ls-remote --exit-code --heads origin pipeline-outputs
  continue-on-error: true

- name: Create orphan branch if needed
  if: steps.branch-exists.outcome != 'success'
  run: |
    git checkout --orphan pipeline-outputs
    git rm -rf .
    echo "Pipeline outputs" > README.md
    git add README.md
    git commit -m "Initial commit"
    git push origin pipeline-outputs
    git checkout -

- name: Push outputs to branch
  run: |
    git config --local user.name "GitHub Actions"
    git config --local user.email "actions@github.com"
    git fetch origin pipeline-outputs
    git worktree add ./outputs pipeline-outputs
    cp -r _outputs/* ./outputs/
    cd outputs
    git add .
    git commit -m "Update outputs" || echo "No changes"
    git push origin pipeline-outputs

This pattern first checks if the branch exists using git ls-remote. If not, it creates an orphan branch. Then it uses git worktree to work with both branches simultaneously, copies the outputs, and pushes them. The {rix} and {rixpress} packages use this pattern to store pipeline outputs between runs.
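The defining property of an orphan branch, that it shares no history with main, is easy to verify in a scratch repository (a sketch; file names are hypothetical):

```shell
# Show that an orphan branch starts its history from scratch.
scratch=$(mktemp -d)
cd "$scratch"
git init -q
git config user.name "demo"
git config user.email "demo@example.com"
echo "code" > analysis.R
git add . && git commit -qm "one of many code commits"

git checkout -q --orphan pipeline-outputs
git rm -r -f -q .
echo "Pipeline outputs" > README.md
git add README.md
git commit -qm "Initial commit"

# The new branch carries exactly one commit, none of the code history.
git rev-list --count HEAD
```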

11.8.3 Creating workflow summaries

GitHub Actions has a built-in feature for creating rich summaries that appear directly on the workflow run page. You write Markdown to a special file path stored in the GITHUB_STEP_SUMMARY environment variable.

- name: Create summary
  run: |
    echo "## Pipeline Results 🎉" >> $GITHUB_STEP_SUMMARY
    echo "" >> $GITHUB_STEP_SUMMARY
    echo "| Metric | Value |" >> $GITHUB_STEP_SUMMARY
    echo "|--------|-------|" >> $GITHUB_STEP_SUMMARY
    echo "| Tests passed | 42 |" >> $GITHUB_STEP_SUMMARY
    echo "| Coverage | 87% |" >> $GITHUB_STEP_SUMMARY

You can also generate the summary dynamically from your R or Python code:

- name: Generate summary from R
  run: |
    nix-shell --run "Rscript -e '
      results <- readRDS(\"results.rds\")
      cat(\"## Analysis Complete\n\n\", file = Sys.getenv(\"GITHUB_STEP_SUMMARY\"), append = TRUE)
      cat(paste(\"Processed\", nrow(results), \"observations\n\"), file = Sys.getenv(\"GITHUB_STEP_SUMMARY\"), append = TRUE)
    '"

This is particularly useful for:

  • Showing test results at a glance
  • Displaying key metrics from your analysis
  • Providing download links to artifacts
  • Reporting any warnings or issues

The summary appears right on the Actions tab, making it easy for collaborators to see what happened without digging through logs.

11.9 Conclusion

This chapter introduced Continuous Integration with GitHub Actions, the final piece of our reproducible workflow.

Key takeaways:

  • Automation removes manual steps: Every push triggers tests, builds, and deployments without human intervention
  • Nix simplifies CI setup: The same default.nix you use locally works on GitHub Actions, eliminating “works on my machine” problems
  • Cachix speeds up builds: By caching Nix derivations, subsequent runs avoid rebuilding unchanged dependencies
  • rxp_ga() handles the boilerplate: One function call generates a complete workflow for running {rixpress} pipelines on CI
  • Artifact caching persists outputs: Using rxp_export_artifacts() and an orphan branch, pipeline outputs survive between ephemeral CI runs

With continuous integration in place, your reproducible analytical pipeline is truly automated. Push your code, and GitHub takes care of the rest: running tests, building your environment, executing your pipeline, and storing the results. This frees you to focus on what matters: the analysis itself.