Adventures on the quest for long-term reproducible deployment

Ludovic Courtès — March 13, 2024

Rebuilding software five years later, how hard can it be? It can’t be that hard, especially when you pride yourself on having a tool that can travel in time and that does a good job at ensuring reproducible builds, right?

Continue reading…

Guix-HPC Activity Report, 2023

Céline Acary-Robert, Emmanuel Agullo, Ludovic Courtès, Marek Felšöci, Konrad Hinsen, Arun Isaac, Ontje Lünsdorf, Pjotr Prins, Simon Tournier, Philippe Virouleau, Ricardo Wurmus — February 16, 2024

This document is also available as PDF (printable booklet).

We are pleased to publish the sixth Guix-HPC annual report. Launched in 2017, Guix-HPC is a collaborative effort to bring reproducible software deployment to scientific workflows and high-performance computing (HPC). Guix-HPC builds upon the GNU Guix software deployment tool to empower HPC practitioners and scientists who need reliability, flexibility, and reproducibility; it aims to support Open Science and reproducible research.

Continue reading…

HIP and ROCm come to Guix

Ludovic Courtès, Thomas Gibson, Kjetil Haugen, Florent Pruvost — January 30, 2024

We have some exciting news to share: AMD has just contributed 100+ Guix packages adding several versions of the whole HIP and ROCm stack! ROCm is AMD’s Radeon Open Compute Platform, a set of low-level support tools for general-purpose computing on graphics processing units (GPGPUs), and HIP is the Heterogeneous Interface for Portability, a language one can use to write code (computational kernels) targeting GPUs or CPUs. The whole stack is free and “open source” software—a breath of fresh air!—and is seeing increasing adoption in HPC. And, it can now be deployed with Guix!

Continue reading…

Videos of the 2023 workshop are on-line

Céline Acary-Robert, Pierre-Antoine Bouttier, Ludovic Courtès, Alexandre Dehne-Garcia, Simon Tournier — January 29, 2024

Back in November, the First Workshop on Reproducible Software Environments for Research and High-Performance Computing was held in Montpellier, France. Coming from France primarily but also from Czechia, Germany, the Netherlands, Slovakia, Spain, and the United Kingdom to name a few, 120 people—scientists, high-performance computing (HPC) practitioners, system administrators, and enthusiasts alike—came to listen to the talks, attend the tutorials, and talk to one another.

Continue reading…

Announcing the First Workshop on Reproducible Software Environments

Simon Tournier, Ludovic Courtès — September 18, 2023

We’re excited to announce the First Workshop on Reproducible Software Environments for Research and High-Performance Computing (HPC), which will take place in Montpellier, France, on November 8–10th, 2023! The preliminary program is on-line, and now’s the time for you to register!

Continue reading…

Reproducible research hackathon: experience report

Simon Tournier, Ludovic Courtès — July 12, 2023

Two weeks ago, on June 27th, we held a second on-line hackathon on reproducible research issues. This hackathon was a collaborative effort to bring GNU Guix to concrete examples inspired by contributions to the online journal ReScience C.

Continue reading…

A guide to reproducible research papers

Ludovic Courtès, Marek Felšöci, Konrad Hinsen, Philippe Swartvagher — June 23, 2023

A core tenet of science is the ability to independently verify research results. When computations are involved, verifiability implies reproducibility: one should be able to re-run the computations to ensure they yield the same results, and then start experimenting with variants of the computational methods, feeding them different data sets, and so on. This is the motivation behind our work on Guix: we want to empower scientists by providing a tool in support of reproducible computations and experimentation.

Continue reading…

Reproducible Research Hackathon—let’s redo!

Simon Tournier — May 12, 2023

It's time to run the second Reproducible Research hackathon! The first one was from... 2020, already! The date: Tuesday, June 27th. Start: 9h30 (CEST). End: 17h30.

Continue reading…

Continuous integration and continuous delivery for HPC

Ludovic Courtès — March 6, 2023

Will those binaries actually work? This is a central question for HPC practitioners and one that’s sometimes hard to answer: increasingly complex software stacks are being deployed, often on a variety of clusters. Will that program pick the right libraries? Will it perform well? With each cluster having its own hardware characteristics, portability is often considered unachievable. As a result, HPC practitioners rarely take advantage of continuous integration and continuous delivery (CI/CD): building software locally on the cluster is common, and software validation is often a costly manual process that has to be repeated on each cluster.

Continue reading…

Guix-HPC Activity Report, 2022

Céline Acary-Robert, Ludovic Courtès, Yann Dupont, Marek Felšöci, Konrad Hinsen, Ontje Lünsdorf, Pjotr Prins, Philippe Swartvagher, Simon Tournier, Ricardo Wurmus — February 10, 2023

This document is also available as PDF (printable booklet).

Continue reading…

Guix-HPC at FOSDEM

Ludovic Courtès — January 24, 2023

As has been the case for 9 years (!), Guix will be present at FOSDEM, the big annual free software developer conference in Europe. There will be no less than ten Guix-related talks, of which the following are particularly relevant to the HPC and reproducible research communities:

Continue reading…

CRAN, a practical example for being reproducible at large scale using GNU Guix

Lars-Dominik Braun — December 21, 2022

A recent study published in Nature Scientific Data in February 2022 gives empirical insight into the success rate of reproducing R scripts obtained from Harvard’s Dataverse:

Continue reading…

Is reproducibility practical?

Ludovic Courtès — July 21, 2022

Our attention was recently caught by a nice slide deck on the methods and tools for reproducible research in R. Among those, the talk mentions Guix, stating that it is “for professional, sensitive applications that require ultimate reproducibility”, which is “probably a bit overkill for Reproducible Research”. While we were flattered to see Guix suggested as a good tool for reproducibility, the very notion that there’s a kind of “reproducibility” that is “ultimate” and, essentially, impractical, is something that left us wondering: What kind of reproducibility do scientists need, if not the “ultimate” kind? Is “reproducibility” practical at all, or is it more of a horizon?

Continue reading…

Celebrating 10 years of Guix in Paris, 16–18 September

Ludovic Courtès, Tanguy Le Carrour, Simon Tournier — June 13, 2022

It’s been ten years of GNU Guix! To celebrate, and to share knowledge and enthusiasm, a birthday event will take place on September 16–18th, 2022, in Paris, France. The program is being finalized, but you can already register!

Continue reading…

Back to the future: modules for Guix packages

Ludovic Courtès — May 6, 2022

Some things in our software world are timeless. The venerable Environment Modules are one of these. If you’ve ever used a high-performance cluster in the last three decades, chances are you’re already familiar with it. Modules is about managing software environments, just like Guix is—or, perhaps more accurately, guix shell.

Continue reading…

Guix-HPC Activity Report, 2021

Pierre-Antoine Bouttier, Ludovic Courtès, Yann Dupont, Marek Felšöci, Felix Gruber, Konrad Hinsen, Arun Isaac, Pjotr Prins, Philippe Swartvagher, Simon Tournier, Ricardo Wurmus — February 3, 2022

This document is also available as PDF (printable booklet).

Continue reading…

ccwl for concise and painless CWL workflows

Arun Isaac — January 10, 2022

In modern science, analysis is required to process data. When the data-flow is linear, such a process is easily represented by tools such as the standard Unix pipeline. However, this data-flow is often modeled by a directed graph: each processing node may have one or more inputs and the outputs may be directed to different processing nodes. This directed graph, mainly used in the fields of bioinformatics, medical imaging and astronomy, among many others, is called a workflow.

Continue reading…

Tuning packages for a CPU micro-architecture

Ludovic Courtès — January 6, 2022

It should come as no surprise that the execution speed of programs is a primary concern in high-performance computing (HPC). Many HPC practitioners would tell you that their top concerns include the performance of high-speed networks used by the Message Passing Interface (MPI) and the use of the latest vectorization extensions of modern CPUs.

Continue reading…

When Docker images become fixed-point

Simon Tournier — October 22, 2021

We like to say that Docker images are like smoothies: you can immediately tell whether it’s to your liking, but you can hardly guess what the ingredients are. Although containers are an efficient way to ship things, the core question is how these things are produced.

Continue reading…

What’s in a package

Ludovic Courtès — September 20, 2021

There is no shortage of package managers. Each tool makes its own set of tradeoffs regarding speed, ease of use, customizability, and reproducibility. Guix occupies a sweet spot, providing reproducibility by design as pioneered by Nix, package customization à la Spack from the command line, the ability to create container images without hassle, and more.

Continue reading…

HPC & reproducible research in Guix 1.3.0

Simon Tournier, Ludovic Courtès — May 19, 2021

Version 1.3.0 of GNU Guix was announced a few days ago. Some 212 people contributed more than 8,300 commits since version 1.2.0 was released in November 2020. This post focuses on important changes for HPC users, admins, and scientific practitioners.

Continue reading…

First French-speaking workshop on reproducible software environments

Ludovic Courtès, Konrad Hinsen, Simon Tournier — April 9, 2021

We are organizing the first French-speaking workshop on the reproducibility of software environments for scientists, engineers, and system administrators. The workshop will take place on-line on May 17–18th, 2021 from 09:00 to 12:30 CEST. Stay tuned for more reproducible research events!

We are pleased to announce the first French-speaking workshop on the reproducibility of software environments, which will take place on-line on the mornings of May 17 and 18, 2021; the program and practical information are available on the event page.

This workshop follows the interest that the French-speaking scientific computing community has shown in reproducibility issues, notably during the Action Nationale de Formation UST4HPC 2021 and with the reproducibility day of the Société Informatique de France (SIF), which will take place on May 10. It is also part of the activities of the Guix-HPC group.

On the program: seven experience reports from scientists and system administration leads on software deployment in computing centers with Guix as well as Spack or module, and on building reproducible research pipelines with Debian, Org-Mode, and Guix.

These talks will be followed by discussions on everyone’s expectations and proposals, from both the scientific and the computing-center administration perspectives.

Participation is free and open to all, but we nevertheless invite you to register.

Guix-HPC Activity Report, 2020

Lars-Dominik Braun, Ludovic Courtès, Pjotr Prins, Simon Tournier, Ricardo Wurmus — February 9, 2021

This document is also available as PDF (printable booklet).

Continue reading…

Guix-Jupyter 0.2.1 released!

Ludovic Courtès — January 25, 2021

We are pleased to announce Guix-Jupyter 0.2.1, a new release of our Guix-powered Jupyter kernel for self-contained and reproducible notebooks.

Continue reading…

More scientific packages for GNU Guix

Lars-Dominik Braun — January 4, 2021

With increased usage of GNU Guix at scientific institutions there are also growing needs for packaging software used in research and teaching. The best place for that has been and still is Guix’ main repository because there the software is accessible and maintainable by the entire Guix community.

Continue reading…

HPC & reproducible research in Guix 1.2.0

Simon Tournier, Ludovic Courtès — November 24, 2020

Version 1.2.0 of GNU Guix was announced yesterday. Some 200 people contributed more than 10,000 commits since the previous release. This post focuses on important changes for HPC users, admins, and practitioners made since version 1.1.0 was released in April 2020.

Continue reading…

Reproducible research hackathon: experience report

Simon Tournier, Ludovic Courtès — July 10, 2020

Last week, on July 3rd, we held an on-line hackathon on reproducible research issues. This hackathon was a collaborative effort to bring GNU Guix to concrete examples inspired by contributions to the recent Ten Years Reproducibility Challenge organized by ReScience.

Continue reading…

Reproducible Research Hackathon

Simon Tournier — June 30, 2020

Several submissions to the recent Ten Years Reproducibility Challenge organized by ReScience took advantage of GNU Guix, as discussed earlier.

Continue reading…

Reproducible research articles, from source code to PDF

Ludovic Courtès — June 16, 2020

Early this year, ReScience, which is concerned with publishing replications (successful or not) of previously-published articles, organized the Ten Years Reproducibility Challenge. The idea is simple: pick a paper of yours that is at least ten years old, and try to replicate its results. The first difficulty is usually to get the source code of the software used to produce the results and to get that code to build and run. This challenge helped highlight again ways in which research practices can and must be improved. We took it as an opportunity to devise new practices and tools to ensure reproducibility and provenance tracking for articles, end-to-end: from source code to PDF.

Continue reading…

Faster relocatable packs with Fakechroot

Ludovic Courtès — May 18, 2020

The guix pack command creates “application bundles” that can be used to deploy software on machines that do not run Guix (yet!), such as HPC clusters. Since its inception in 2017, it has seen a number of improvements, such as the ability to create Docker and Singularity container images. Some clusters lack these tools, though, and the addition of relocatable packs was a way to address that. This post looks at a new execution engine for relocatable packs that has just landed with the goal of improving performance.

Continue reading…

HPC & reproducible research in Guix 1.1.0

Ludovic Courtès — April 16, 2020

Version 1.1.0 of Guix was announced yesterday. As the announcement points out, some 200 people contributed more than 14,000 commits since the previous release. This post focuses on important changes for HPC users, admins, and scientists made since version 1.0.1 was released in May 2019.

Continue reading…

Guix-HPC Activity Report, 2019

Ludovic Courtès, Paul Garlick, Konrad Hinsen, Pjotr Prins, Ricardo Wurmus — February 17, 2020

This document is also available as PDF (printable booklet).

Continue reading…

Guix-HPC at FOSDEM

Ludovic Courtès — January 27, 2020

As in previous years, GNU Guix will be present at FOSDEM, the main yearly free software developer conference in Europe, with no less than nine Guix-related talks!

Continue reading…

Reproducible computations with Guix

Konrad Hinsen — January 14, 2020

This post is about reproducible computations, so let's start with a computation. A short, though rather uninteresting, C program is a good starting point. It computes π in three different ways:

Continue reading…

Optimized and portable Open MPI packaging

Ludovic Courtès — December 19, 2019

High-performance networks have constantly been evolving, in sometimes hard-to-decipher ways. Once upon a time, hardware vendors would pre-install an MPI implementation (often an in-house fork of one of the free MPI implementations) specially tailored for their hardware. Fortunately, this time appears to be gone. Despite that, there is still widespread belief that MPI cannot be packaged in a way that achieves best performance on a variety of contemporary high-speed networking hardware.

Continue reading…

Towards reproducible Jupyter notebooks

Ludovic Courtès — October 10, 2019

Jupyter Notebooks are becoming a key component of the researcher’s toolbox when it comes to sharing and reproducing computational experiments. Jupyter notebooks allow users to not only intermingle a narrative with supporting code in a way reminiscent of literate programming, they also make it easy to interact with the code and, thus, build on the work of each other.

Continue reading…

Chapter of “Evolutionary Genomics” on workflow tools and Guix

Ludovic Courtès — September 9, 2019

The book Evolutionary Genomics was published in July this year. Of particular interest to Guix-HPC is the chapter entitled “Scalable Workflows and Reproducible Data Analysis for Genomics”, by Francesco Strozzi et al.:

Continue reading…

GNU Guix 1.0: a solid foundation for HPC and reproducible science

Ludovic Courtès — May 6, 2019

GNU Guix 1.0.0 was released just a few days ago! This is a major milestone for Guix, which has been under development for seven years, with more than 40,000 commits made by 260 people, and no less than 19 “0.x” releases.

Continue reading…

Connecting reproducible deployment to a long-term source code archive

Ludovic Courtès — March 29, 2019

GNU Guix can be used as a “package manager” to install and upgrade software packages as is familiar to GNU/Linux users, or as an environment manager, but it can also provision containers or virtual machines, and manage the operating system running on your machine.

Continue reading…

Guix-HPC Activity Report, 2018

Eric Bavier, Ludovic Courtès, Paul Garlick, Pjotr Prins, Ricardo Wurmus — February 12, 2019

This document is also available as PDF (printable booklet).

Continue reading…

Creating a reproducible workflow with CWL

Pjotr Prins — January 21, 2019

In the quest for truly reproducible workflows I set out to create an example of a reproducible workflow using GNU Guix, IPFS, and CWL. GNU Guix provides content-addressable, reproducible, and verifiable software deployment. IPFS provides content-addressable storage, and CWL describes workflows that can run on specifically supported backend hardware systems. In principle, this combination of tools should be enough to provide reproducibility with provenance and improved security.

Continue reading…

PiGx paper awarded at the International Conference on Genomics (ICG-13)

Ricardo Wurmus — January 11, 2019

In December 2018, the Akalin lab at the Berlin Institute of Medical Systems Biology (BIMSB) published a paper about a collection of reproducible genomics pipelines called PiGx that are made available through GNU Guix. The article was awarded third place in the GigaScience ICG-13 Prize. Representing the authors, Ricardo Wurmus was invited to present the work on PiGx and Guix in Shenzhen, China at ICG-13.

Ricardo Wurmus presenting at ICG-13.

Ricardo urged the audience of wet lab scientists and bioinformaticians to apply the same rigorous standards of experimental design to experiments involving software: all variables need to be captured and constrained. To demonstrate that this does not need to be complicated, Ricardo reported the experiences of the Akalin lab in building a collection of reproducibly built automated genomics workflows using GNU Guix.

Due to technical difficulties the recording of the talk was lost, so Ricardo re-recorded the talk a few weeks later.

HPC & reproducible research in Guix 0.16.0

Ludovic Courtès — December 7, 2018

Version 0.16.0 of Guix was released yesterday. It’s slated to be the last release before 1.0, and as usual, it brings noteworthy packages and features for HPC and reproducible research.

Continue reading…

HPC goodies in Guix 0.15.0

Ludovic Courtès — July 6, 2018

Version 0.15.0 of Guix was released today. As usual, it brings packages and features that we hope HPC users and sysadmins will enjoy. This release brings us close to our goals for 1.0, so it’s probably one of the last zero-dot-something releases.

Continue reading…

Paper on reproducible bioinformatics pipelines with Guix

Ricardo Wurmus — May 9, 2018

I’m happy to announce that the bioinformatics group at the Max Delbrück Center that I’m working with has released a preprint of a paper on reproducibility with the title Reproducible genomics analysis pipelines with GNU Guix.

Continue reading…

Pre-built binaries vs. performance

Ludovic Courtès — January 31, 2018

Guix follows a transparent source/binary deployment model: it will download pre-built binaries when they’re available—like apt-get or yum—and otherwise falls back to building from source. Most of the time the project’s build farm provides binaries so that users don’t have to spend resources building from source. Pre-built binaries may be missing when you’re installing a custom package, or when the build farm hasn’t caught up yet. However, deployment of binaries is often seen as incompatible with high-performance requirements—binaries are “generic”, so how can they take advantage of cutting-edge HPC hardware? In this post, we explore the issue and solutions.

Continue reading…

Guix-HPC at FOSDEM

Ludovic Courtès — January 29, 2018

GNU Guix will be present at FOSDEM, the main yearly free software developer conference in Europe, and in particular in the HPC track.

Continue reading…

HPC goodies in Guix 0.14.0

Ludovic Courtès — December 8, 2017

Version 0.14.0 of Guix was announced yesterday. In this post we look at the many goodies that made it into Guix during this release cycle.

Continue reading…

Installing Guix on a cluster

Ludovic Courtès — November 23, 2017

Previously we discussed ways to use Guix-produced packages on a cluster where Guix is not installed. In this post we look at how a cluster sysadmin can install Guix for system-wide use, and discuss the various tradeoffs.

Continue reading…

Using Guix Without Being root

Ludovic Courtès — October 2, 2017

In the previous post, we saw that Guix’s build daemon needs to run as root, and for a good reason: that’s currently the only way to create isolated build environments for packages on GNU/Linux. This requirement means that you cannot use Guix on a cluster where the sysadmins have not already installed it. In this article, we discuss how to take advantage of Guix on clusters that lack a proper Guix installation.

Continue reading…

Reproducibility vs. root privileges

Ludovic Courtès — September 22, 2017

Guix is a good fit for multi-user environments such as clusters: it allows non-root users to install packages at will without interfering with each other. However, a common complaint is that installing Guix requires administrator privileges. More precisely, guix-daemon, the system-wide daemon that spawns package builds and downloads on behalf of users, must be running as root. This is not much of a problem on one's laptop but it surely makes it harder to adopt Guix on an HPC cluster.

Continue reading…

Guix-HPC debut!

Ludovic Courtès, Roel Janssen, Pjotr Prins, Ricardo Wurmus — September 5, 2017

This post marks the debut of Guix-HPC, an effort to optimize GNU Guix for reproducible scientific workflows in high-performance computing (HPC). Guix-HPC is a joint effort between Inria, the Max Delbrück Center for Molecular Medicine (MDC), and the Utrecht Bioinformatics Center (UBC). Ludovic Courtès, Ricardo Wurmus, Roel Janssen, and Pjotr Prins are driving the effort in each of these institutes, each one focusing on specific areas of interest within this overall Guix-HPC effort. Our institutes have in common that they are users of HPC, and that, as scientific research institutes, they have an interest in using reproducible methodologies to carry out their research.

Continue reading…
