Will those binaries actually work? This is a central question for HPC practitioners and one that’s sometimes hard to answer: increasingly complex software stacks being deployed, and often on a variety of clusters. Will that program pick the right libraries? Will it perform well? With each cluster having its own hardware characteristics, portability is often considered unachievable. As a result, HPC practitioners rarely take advantage of continuous integration and continuous delivery (CI/CD): building software locally on the cluster is common, and software validation is often a costly manual process that has to be repeated on each cluster.
Some things in our software world are timeless. The venerable
Environment Modules are one of these.
If you’ve ever used a high-performance cluster in the last three
decades, chances are you’re already familiar with it. Modules is about
managing software environments, just like Guix is—or, perhaps more
It should come as no surprise that the execution speed of programs is a primary concern in high-performance computing (HPC). Many HPC practitioners would tell you that, among their top concerns, is the performance of high-speed networks used by the Message Passing Interface (MPI) and use of the latest vectorization extensions of modern CPUs.
command creates “application bundles” that can be used to deploy
software on machines that do not run Guix (yet!), such as HPC clusters. Since
its inception in
it has seen a number of improvements, such as the ability to create
Docker and Singularity container images. Some clusters lack these
tools, though, and the addition of relocatable
was a way to address that. This post looks at a new execution engine
for relocatable packs that has just
landed with the goal of improving
High-performance networks have constantly been evolving, in sometimes hard-to-decipher ways. Once upon a time, hardware vendors would pre-install an MPI implementation (often an in-house fork of one of the free MPI implementations) specially tailored for their hardware. Fortunately, this time appears to be gone. Despite that, there is still widespread belief that MPI cannot be packaged in a way that achieves best performance on a variety of contemporary high-speed networking hardware.
December 2018 the Akalin lab at the Berlin Institute of Medical Systems Biology (BIMSB) published a paper about a collection of reproducible genomics pipelines called PiGx that are made available through GNU Guix. The article was awarded third place in the GigaScience ICG-13 Prize. Representing the authors, Ricardo Wurmus was invited to present the work on PiGx and Guix in Shenzhen, China at ICG-13.
Ricardo urged the audience of wet lab scientists and bioinformaticians to apply the same rigorous standards of experimental design to experiments involving software: all variables need to be captured and constrained. To demonstrate that this does not need to be complicated, Ricardo reported the experiences of the Akalin lab in building a collection of reproducibly built automated genomics workflows using GNU Guix.
Due to technical difficulties the recording of the talk was lost, so Ricardo re-recorded the talk a few weeks later.
I’m happy to announce that the bioinformatics group at the Max Delbrück Center that I’m working with has released a preprint of a paper on reproducibility with the title Reproducible genomics analysis pipelines with GNU Guix.
This post marks the debut of Guix-HPC, an effort to optimize GNU Guix for reproducible scientific workflows in high-performance computing (HPC). Guix-HPC is a joint effort between Inria, the Max Delbrück Center for Molecular Medicine (MDC), and the Utrecht Bioinformatics Center (UBC). Ludovic Courtès, Ricardo Wurmus, Roel Janssen, and Pjotr Prins are driving the effort in each of these institutes, each one focusing specific areas of interest within this overall Guix-HPC effort. Our institutes have in common that they are users of HPC, and that, as scientific research institutes, they have an interest in using reproducible methodologies to carry out their research.