Guix-HPC — Reproducible software deployment for high-performance computing

Adventures on the quest for long-term reproducible deployment
Ludovic Courtès, 2024-03-13

<p>Rebuilding software five years later, how hard can it be? It can’t be
<em>that</em> hard, especially when you pride yourself on having a tool that
can <a href="https://guix.gnu.org/manual/devel/en/html_node/Invoking-guix-time_002dmachine.html">travel in
time</a>
and that does a good job at ensuring <a href="https://reproducible-builds.org/docs/definition/">reproducible
builds</a>, right?</p><p>In hindsight, we can tell you: it’s more challenging than it
seems. Users attempting to travel 5 years back with <code>guix time-machine</code>
are (or <em>were</em>) unavoidably going to hit bumps on the road—a real
problem because that’s one of the use cases Guix aims to support well,
in particular in a <a href="https://hpc.guix.info/blog/tag/reproducibility/">reproducible
research</a> context.</p><p>In this post, we look at some of the challenges we face while traveling
back, how we are overcoming them, and open issues.</p><h1>The vision</h1><p>First of all, one clarification: Guix aims to support time travel, but
we’re talking of a time scale measured in years, not in decades. We
know all too well that this is already very ambitious—it’s something
that probably nobody except <a href="https://nixos.org">Nix</a> and Guix is even
trying. More importantly, software deployment at the scale of decades
calls for very different, more radical techniques; it’s the work of
archivists.</p><p>Concretely, Guix 1.0.0 was <a href="https://guix.gnu.org/en/blog/2019/gnu-guix-1.0.0-released/">released in
2019</a> and
our goal is to allow users to travel as far back as 1.0.0 and redeploy
software from there, as in this example:</p><pre><code>$ guix time-machine -q --commit=v1.0.0 -- \
environment --ad-hoc python2 -- python
guile: warning: failed to install locale
Python 2.7.15 (default, Jan 1 1970, 00:00:01)
[GCC 5.5.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>></code></pre><p>(The command above uses <code>guix environment</code>, the <a href="https://guix.gnu.org/en/blog/2021/from-guix-environment-to-guix-shell/">predecessor of <code>guix shell</code></a>,
which didn’t exist back then.)
It’s only 5 years ago but it’s pretty much remote history on the scale
of software evolution—in this case, that history comprises major
changes <a href="https://guix.gnu.org/en/blog/2021/the-big-change/">in Guix
itself</a> and
<a href="https://guix.gnu.org/en/blog/2020/guile-3-and-guix/">in Guile</a>.
How well does such a command work? Well, it depends.</p><p>The project has two build farms; <code>bordeaux.guix.gnu.org</code> has been
keeping substitutes (pre-built binaries) of everything it built since
roughly 2021, while <code>ci.guix.gnu.org</code> keeps substitutes for roughly two
years, but there is currently no guarantee on the duration
substitutes may be retained.
Time traveling to a period where substitutes are available is
fine: you end up downloading lots of binaries, but that’s OK, you rather
quickly have your software environment at hand.</p><h1>Bumps on the build road</h1><p>Things get more complicated when targeting a period in time for which
substitutes are no longer available, as was the case for <code>v1.0.0</code> above.
(And really, we should assume that substitutes won’t remain available
forever: fellow NixOS hackers recently had to seriously consider
<a href="https://discourse.nixos.org/t/nixos-s3-long-term-resolution-phase-1/36493">trimming their 20-year-long history of
substitutes</a>
because the costs are not sustainable.)</p><p>Apart from the long build times, the first problem that arises in the
absence of substitutes is source code unavailability. I’ll spare you
the details for this post—that problem alone would deserve a book.
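As a brief aside, content archives such as Software Heritage identify source code intrinsically: for a single file, the identifier (SWHID) is a Git-style “blob” SHA-1 of its bytes. A minimal sketch (file name and contents are made up for illustration):

```shell
# Sketch: compute the intrinsic Software Heritage identifier (SWHID) of
# a file's contents.  For a single file it is the Git "blob" SHA-1 of
# its bytes, independent of where the file came from.
printf 'hello\n' > /tmp/example.txt
printf 'swh:1:cnt:%s\n' "$(git hash-object /tmp/example.txt)"
# prints: swh:1:cnt:ce013625030ba8dba906f756967f9e9ca394464a
```

Because the identifier depends only on the bytes, a source file can be looked up in the archive years later without trusting any particular hosting URL.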
Suffice to say that we’re lucky that we started working on <a href="https://guix.gnu.org/en/blog/2019/connecting-reproducible-deployment-to-a-long-term-source-code-archive/">integrating
Guix with Software
Heritage</a>
years ago, and that there has been great progress over the last couple
of years to get closer to <a href="https://ngyro.com/pog-reports/latest/">full package source code
archival</a> (more precisely: 94% of
the source code of packages available in Guix in January 2024 is
archived, versus 72% of the packages available in May 2019).</p><p>So what happens when you run the <code>time-machine</code> command above? It
brings you to May 2019, a time for which none of the official build
farms had substitutes until a few days ago. Ideally, thanks to
<a href="https://guix.gnu.org/manual/devel/en/html_node/Build-Environment-Setup.html">isolated build
environments</a>,
you’d build things for hours or days, and in the end all those binaries
will be here just as they were 5 years ago. In practice though, there
are several problems that isolation as currently implemented does <em>not</em>
address.</p><p><img src="/static/images/blog/safety-last.jpg" alt="Screenshot of movie “Safety Last!” with Harold Lloyd hanging from a clock on a building’s façade." /></p><p>Among those, the most frequent problem is <em>time traps</em>: software build
processes that fail after a certain date (these are also referred to as
“time bombs” but we’ve had enough of these and would rather call for a
ceasefire). This plagues a handful of packages out of almost 30,000 but
unfortunately we’re talking about packages deep in the dependency graph.
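In its simplest form, a time trap boils down to a comparison against a hard-coded date, as in this illustrative shell sketch (the expiry date is made up, standing in for something like a test certificate’s <code>notAfter</code> field):

```shell
# Illustrative "time trap": a test that passes only before a hard-coded
# date, mimicking a test suite whose X.509 certificates expire.
# (The expiry date is made up.)
expiry=$(date -u -d '2020-01-01' +%s)  # e.g., a certificate's notAfter date
now=$(date -u +%s)
if [ "$now" -gt "$expiry" ]; then
    echo "test FAILED: certificate expired"
else
    echo "test passed"
fi
```

Run today, this prints “test FAILED: certificate expired”; run with the clock set before 2020, it passes.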
Here are some examples:</p><ul><li><a href="https://issues.guix.gnu.org/56137">OpenSSL</a> unit tests fail
after a certain date because some of the X.509 certificates they use
have expired.</li><li><a href="https://issues.guix.gnu.org/44559">GnuTLS</a> had similar issues;
newer versions rely on
<a href="https://packages.guix.gnu.org/packages/datefudge/">datefudge</a> to
fake the date while running the tests and thus avoid that problem
altogether.</li><li>Python 2.7, found in Guix 1.0.0, also <a href="https://issues.guix.gnu.org/65378">had that
problem</a> with its TLS-related
tests.</li><li>OpenJDK <a href="https://issues.guix.gnu.org/68333">would fail to build at some
point</a> with this interesting
message: <code>Error: time is more than 10 years from present: 1388527200000</code> (the build system would consider that its data about
currencies is likely outdated after 10 years).</li><li>Libgit2, a dependency of Guix, had (has?) <a href="https://issues.guix.gnu.org/55326">time-dependent
tests</a>.</li><li>MariaDB tests <a href="https://issues.guix.gnu.org/34351">started failing in
2019</a>.</li></ul><p>Someone traveling to <code>v1.0.0</code> will hit several of these, preventing
<code>guix time-machine</code> from completing. A serious bummer, especially to
those who’ve come to Guix from the perspective of making their <a href="https://hpc.guix.info/blog/2023/06/a-guide-to-reproducible-research-papers/">research
workflow
reproducible</a>.</p><p>Time traps are the main road block, but there’s more! In rare cases,
there’s software influenced by kernel details not controlled by the
build daemon:</p><ul><li>Tests of the hwloc hardware locality library <a href="https://issues.guix.gnu.org/54767">would fail when
running on a Btrfs file system</a>.</li></ul><p>In a handful of cases, but important ones, builds might fail when
performed on certain CPUs. We’re aware of at least two cases:</p><ul><li>Python 3.9 to 3.11 would set a signal handler stack <a href="https://github.com/python/cpython/issues/91124">too small for
use on Intel Sapphire Rapids Xeon
CPUs</a> (it’s more
complicated than this but the end result is: it will no longer build
on modern hardware).</li><li>Firefox would reportedly <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1882015">crash on Raptor Lake CPUs running a buggy
version of their
firmware</a>.</li></ul><p>Neither time traps nor those obscure hardware-related issues can be
avoided with the isolation mechanism currently used by the build daemon.
This harms time traveling when substitutes are unavailable. Giving up
is not in the ethos of this project though.</p><h1>Where to go from here?</h1><p>There are really two open questions here:</p><ol><li>How can we tell which packages need to be “fixed”, and how:
building at a specific date, on a specific CPU?</li><li>How can we keep those aspects of the build environment (time, CPU
variant) under control?</li></ol><p>Let’s start with #2. Before looking for a solution, it’s worth
remembering where we come from. The build daemon runs build processes
with a <a href="https://www.man7.org/linux/man-pages/man2/chroot.2.html">separate root file
system</a>, under
dedicated user IDs, and in separate <a href="https://www.man7.org/linux/man-pages/man7/namespaces.7.html">Linux
namespaces</a>,
thereby minimizing interference with the rest of the system and ensuring
a <a href="https://guix.gnu.org/manual/devel/en/html_node/Build-Environment-Setup.html">well-defined build
environment</a>.
This technique was
<a href="https://archive.softwareheritage.org/browse/revision/9397cd30c8a6ffd65fc3b85985ea59ecfb72672b/">implemented</a>
by Eelco Dolstra for Nix in 2007 (with namespace support <a href="https://archive.softwareheritage.org/browse/revision/df716c98d203ab64cdf05f9c17fdae565b7daa1c/">added
in
2012</a>),
at a time when the word <em>container</em> had to do with boats and before
“Docker” became the name of a software tool. In short, the approach
consists in <em>controlling the build environment</em> in every detail (it’s at
odds with the strategy that consists in achieving reproducible builds
<a href="https://tests.reproducible-builds.org/debian/index_variations.html"><em>in spite</em> of high build environment
variability</a>).
That these are mere processes with a bunch of bind mounts makes this
approach inexpensive and appealing.</p><p>Realizing we’d also want to control the build environment’s date,
we naturally turn to Linux namespaces to address that—Dolstra, Löh, and
Pierron already suggested something along these lines in the conclusion
of their <a href="https://edolstra.github.io/pubs/nixos-jfp-final.pdf">2010 <em>Journal of Functional Programming</em>
paper</a>. Turns out
there <em>is</em> now a <a href="https://www.man7.org/linux/man-pages/man7/time_namespaces.7.html">time
namespace</a>.
Unfortunately it’s limited to <code>CLOCK_MONOTONIC</code> and <code>CLOCK_BOOTTIME</code>
clocks; the manual page states:</p><blockquote><p>Note that time namespaces do not virtualize the <code>CLOCK_REALTIME</code>
clock. Virtualization of this clock was avoided for reasons of
complexity and overhead within the kernel.</p></blockquote><p>I hear you say: <em>What about
<a href="https://packages.guix.gnu.org/packages/datefudge/">datefudge</a> and
<a href="https://packages.guix.gnu.org/packages/libfaketime/">libfaketime</a>?</em>
These rely on the <code>LD_PRELOAD</code> environment variable to trick the dynamic
linker into pre-loading a library that provides symbols such as
<code>gettimeofday</code> and <code>clock_gettime</code>. This is a fine approach in some
cases, but it’s too fragile and too intrusive when targeting arbitrary
build processes.</p><p>That leaves us with essentially one viable option: virtual machines
(VMs). The full-system QEMU lets you specify the initial real-time
clock of the VM with the <code>-rtc</code> flag, which is exactly what we need
(“user-land” QEMU such as <code>qemu-x86_64</code> does not support it). And of
course, it lets you specify the CPU model to emulate.</p><h1>News from the past</h1><p>Now, the question is: where does the VM fit? The author considered
writing a <a href="https://guix.gnu.org/manual/devel/en/html_node/Package-Transformation-Options.html">package
transformation</a>
that would change a package such that it’s built in a well-defined VM.
However, that wouldn’t really help: this option didn’t exist in past
revisions, and it would lead to a different build anyway from the
perspective of the daemon—a different
<a href="https://guix.gnu.org/manual/devel/en/html_node/Derivations.html"><em>derivation</em></a>.</p><p>The best strategy appeared to be
<a href="https://guix.gnu.org/manual/devel/en/html_node/Daemon-Offload-Setup.html"><em>offloading</em></a>:
the build daemon can offload builds to different machines over SSH; we
just need to let it send builds to a suitably-configured VM. To do
that, we can reuse some of the machinery initially developed for
<a href="https://guix.gnu.org/manual/devel/en/html_node/Virtualization-Services.html#index-childhurd_002c-offloading"><em>childhurds</em></a>
that takes care of setting up offloading to the VM: creating substitute
signing keys and SSH keys, exchanging secret key material between the
host and the guest, and so on.</p><p>The end result is a <a href="https://guix.gnu.org/manual/devel/en/html_node/Virtualization-Services.html#Virtual-Build-Machines">service for Guix System
users</a>
that can be configured in a few lines:</p><pre><code class="language-scheme">(use-modules (gnu services virtualization))
(operating-system
  ;; …
  (services (append (list (service virtual-build-machine-service-type))
                    %base-services)))</code></pre><p>The default setting above provides a 4-core VM whose initial date is
January 2020, emulating a Skylake CPU from that time—the right setup for
someone willing to reproduce old binaries. You can check the
configuration like this:</p><pre><code>$ sudo herd configuration build-vm
CPU: Skylake-Client
number of CPU cores: 4
memory size: 2048 MiB
initial date: Wed Jan 01 00:00:00Z 2020</code></pre><p>To enable offloading to that VM, one has to explicitly start it, like
so:</p><pre><code>$ sudo herd start build-vm</code></pre><p>From there on, every native build is offloaded to the VM. The key part
is that with almost no configuration, you get everything set up to build
packages “in the past”. It’s a Guix System-only solution; if you run
Guix on another distro, you can set up a similar build VM, but you will
have to go through, by hand, the cumbersome process that is taken care
of automatically here.</p><p>Of course it’s possible to choose different configuration parameters:</p><pre><code class="language-scheme">(service virtual-build-machine-service-type
         (virtual-build-machine
          (date (make-date 0 0 00 00 01 10 2017 0)) ;further back in time
          (cpu "Westmere")
          (cpu-count 16)
          (memory-size (* 8 1024))
          (auto-start? #t)))</code></pre><p>With a build VM with its date set to January 2020, we have been able to
rebuild Guix and its dependencies along with a bunch of packages such as
<code>emacs-minimal</code> from <code>v1.0.0</code>, overcoming all the time traps and other
challenges described earlier. As a side effect, substitutes
are now available from <code>ci.guix.gnu.org</code> so you can even try this at
home without having to rebuild the world:</p><pre><code>$ guix time-machine -q --commit=v1.0.0 -- build emacs-minimal --dry-run
guile: warning: failed to install locale
substitute: updating substitutes from 'https://ci.guix.gnu.org'... 100.0%
38.5 MB would be downloaded:
/gnu/store/53dnj0gmy5qxa4cbqpzq0fl2gcg55jpk-emacs-minimal-26.2</code></pre><p>For the fun of it, we went as far as <code>v0.16.0</code>, <a href="https://guix.gnu.org/blog/2018/gnu-guix-and-guixsd-0.16.0-released/">released in December
2018</a>:</p><pre><code>$ guix time-machine -q --commit=v0.16.0 -- \
environment --ad-hoc vim -- vim --version</code></pre><p>This is the furthest we can go since
<a href="https://guix.gnu.org/manual/devel/en/html_node/Channels.html">channels</a>
and the underlying mechanisms that make time travel possible did not
exist before that date.</p><p>There’s one “interesting” case we stumbled upon in that process: in
OpenSSL 1.1.1g (released April 2020 and packaged <a href="https://archive.softwareheritage.org/browse/revision/c4868e38289baf3a9a74bdf32166d321f7365725/">in December
2020</a>),
some of the test certificates are not valid <em>before</em> April 2020, so the
build VM needs to have its clock set to May 2020 or thereabouts.
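Certificate validity windows, the culprit here, can be inspected directly with OpenSSL. A sketch that generates a throwaway self-signed certificate and prints its bounds (paths and subject are made up):

```shell
# Sketch: create a throwaway self-signed certificate and print its
# validity window -- the very dates that turn test suites into time
# traps.  (Paths and subject are made up for illustration.)
openssl req -x509 -newkey rsa:2048 -nodes -subj '/CN=example' \
    -keyout /tmp/key.pem -out /tmp/cert.pem -days 365 2>/dev/null
openssl x509 -in /tmp/cert.pem -noout -dates
# prints lines of the form:
#   notBefore=...
#   notAfter=...
```

Checking the <code>notBefore</code>/<code>notAfter</code> bounds of a package’s test certificates is a quick way to tell what clock value a build VM needs.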
Booting the build VM with a different date can be done without
reconfiguring the system:</p><pre><code>$ sudo herd stop build-vm
$ sudo herd start build-vm -- -rtc base=2020-05-01T00:00:00</code></pre><p>The <code>-rtc …</code> flags are passed straight to QEMU, which is handy when
exploring workarounds…</p><p>The <a href="https://ci.guix.gnu.org/jobset/time-travel"><code>time-travel</code> continuous integration
jobset</a> has been set up to
check that we can, at any time, travel back to one of the past releases.
This at least ensures that Guix itself and its dependencies have
substitutes available at <code>ci.guix.gnu.org</code>.</p><h1>Reproducible research workflows reproduced</h1><p>Incidentally, this effort rebuilding 5-year-old packages has allowed us
to fix embarrassing problems. Software that accompanies research papers
that followed our <a href="https://hpc.guix.info/blog/2023/06/a-guide-to-reproducible-research-papers/">reproducibility
guidelines</a>
could no longer be deployed, at least not without this clock twiddling
effort:</p><ul><li><a href="https://archive.softwareheritage.org/browse/origin/directory/?origin_url=https://gitlab.inria.fr/lcourtes-phd/edcc-2006-redone">code</a>
of <a href="https://doi.org/10.5281/zenodo.3886739"><em>[Re] Storage Tradeoffs in a Collaborative Backup Service for
Mobile Devices</em></a>, submitted
as part of the ReScience <a href="https://rescience.github.io/ten-years/"><em>Ten Years Reproducibility
Challenge</em></a> in June 2020,
and which is precisely about showcasing reproducible deployment with
Guix;</li><li><a href="https://archive.softwareheritage.org/browse/revision/707f00afef8f6ef1f29a7a4c961dd714f82833f5/">code</a>
of the 2022 Nature Scientific Data article entitled <a href="https://doi.org/10.1038/s41597-022-01720-9"><em>Toward
practical transparent verifiable and long-term reproducible research
using Guix</em></a>, which
relied on an April 2020 revision of Guix to deploy (Simon Tournier
who co-authored the paper <a href="https://simon.tournier.info/posts/2023-12-21-repro-paper.html">reported
earlier</a>
on a failed attempt showing just how challenging it was).</li></ul><p>It’s good news that we can now re-deploy these 5-year-old software
environments with minimum hassle; it’s bad news that holding this
promise took extra effort.</p><p>The ability to reproduce the environment of software that accompanies
research work should not be considered a mundanity or an exercise that’s
<a href="https://hpc.guix.info/blog/2022/07/is-reproducibility-practical/">“overkill”</a>.
The ability to rerun, inspect, and modify software is the natural
extension of the scientific method. Without a companion reproducible
software environment, research papers <em>are merely the advertisement of
scholarship</em>, to paraphrase Jon Claerbout.</p><h1>The future</h1><p>The astute reader surely noticed that we didn’t answer question #1
above:</p><blockquote><p>How can we tell which packages need to be “fixed”, and how: building
at a specific date, on a specific CPU?</p></blockquote><p>It’s a fact that Guix so far lacks information about the date, kernel,
or CPU model that should be used to build a given package.
<a href="https://guix.gnu.org/manual/devel/en/html_node/Derivations.html">Derivations</a>
purposefully lack that information on the grounds that it cannot be
enforced in user land and is <em>rarely</em> necessary—which is true, but
“rarely” is not the same as “never”, as we saw. Should we create a
catalog of date, CPU, and/or kernel annotations for packages found in
past revisions? Should we define, for the long-term, an
all-encompassing derivation format? If we did and effectively required
virtual build machines, what would that mean from a
<a href="https://guix.gnu.org/en/blog/tags/bootstrapping/">bootstrapping</a>
standpoint?</p><p>Here’s another option: build packages in VMs running in the year 2100,
say, and on a baseline CPU. We don’t need to require all users to set
up a virtual build machine—that would be impractical. It may be enough
to set up the project build farms so they build everything that way.
This would allow us to catch time traps and <a href="https://en.wikipedia.org/wiki/Year_2038_problem">year 2038
bugs</a> before they bite.</p><p>Before we can do that, the <code>virtual-build-machine</code> service needs to be
optimized. Right now, offloading to build VMs is as heavyweight as
offloading to a separate physical build machine: data is transferred
back and forth over SSH over TCP/IP. The first step will be to run SSH
over a paravirtualized transport instead, such as <a href="https://www.man7.org/linux/man-pages/man7/vsock.7.html"><code>AF_VSOCK</code>
sockets</a>.
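Schematically, and purely as an illustration (the guest CID and port are made up, and this assumes a socat build with vsock support), the client side of SSH-over-vsock could be a <code>~/.ssh/config</code> fragment like:

```
# Hypothetical ~/.ssh/config fragment: reach the build VM over an
# AF_VSOCK socket (guest CID 3, port 22 assumed) instead of TCP/IP,
# using socat as the transport.
Host build-vm
    ProxyCommand socat - VSOCK-CONNECT:3:22
```

This removes the virtual network interface and TCP stack from the data path while keeping the existing SSH-based offloading machinery intact.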
Another avenue would be to make <code>/gnu/store</code> in the guest VM an overlay
over the host store so that inputs do not need to be transferred and
copied.</p><p>Until then, happy software (re)deployment!</p><h1>Acknowledgments</h1><p>Thanks to Simon Tournier for insightful comments on a previous version
of this post.</p><blockquote><p><em>Originally published <a href="https://guix.gnu.org/blog/2024/adventures-on-the-quest-for-long-term-reproducible-deployment/">on the Guix
blog</a>.</em></p></blockquote>

Guix-HPC Activity Report, 2023
Céline Acary-Robert, Emmanuel Agullo, Ludovic Courtès, Marek Felšöci, Konrad Hinsen, Arun Isaac, Ontje Lünsdorf, Pjotr Prins, Simon Tournier, Philippe Virouleau, Ricardo Wurmus, 2024-02-16

<p>We are pleased to publish the sixth Guix-HPC annual report.
Launched in 2017, Guix-HPC is a collaborative effort to <strong>bring
reproducible software deployment to scientific workflows and
high-performance computing</strong> (HPC). Guix-HPC builds upon the
<a href="https://guix.gnu.org">GNU Guix</a> software deployment tool to
empower HPC practitioners and scientists who
need reliability, flexibility, and reproducibility; it aims to
support Open Science and reproducible research.</p><p>Guix-HPC started as a joint software development project involving three
research institutes: <a href="https://www.inria.fr/en/">Inria</a>, the <a href="https://www.mdc-berlin.de/">Max
Delbrück Center for Molecular Medicine
(MDC)</a>, and the <a href="https://ubc.uu.nl/">Utrecht Bioinformatics
Center (UBC)</a>. GNU Guix for HPC and reproducible
research has since received contributions from many individuals and
organizations, including <a href="https://www.cnrs.fr/en">CNRS</a>, <a href="https://u-paris.fr/en/">Université
Paris Cité</a>, the <a href="https://uthsc.edu/">University of Tennessee Health
Science Center</a> (UTHSC), <a href="https://www.csl.cornell.edu/">Cornell
University</a>, and
<a href="https://www.amd.com">AMD</a>. HPC remains a conservative domain but over
the years, we have reached out to many organizations and people who
share our goal of improving upon the status quo when it comes to
software deployment.</p><p>This report highlights key achievements of Guix-HPC between <a href="https://hpc.guix.info/blog/2022/02/guix-hpc-activity-report-2021/">our
previous
report</a>
a year ago and today, February 2024. This year was marked by exciting
developments for HPC and reproducible workflows: the organization of a
<a href="https://hpc.guix.info/events/2023/workshop">three-day workshop in
November</a> on this very topic
where 120 researchers and HPC practitioners met, the expansion of the
package collection available to Guix users—including significant
contributions by AMD and new channels giving access to all of
Bioconductor, along with more ground work to meet the needs of HPC and
reproducible research.</p><h1>Outline</h1><p>Guix-HPC aims to tackle the following high-level objectives:</p><ul><li><em>Reproducible scientific workflows.</em> Improve the GNU Guix tool set
to better support reproducible scientific workflows and to simplify
sharing and publication of software environments.</li><li><em>Cluster usage.</em> Streamline Guix deployment on HPC clusters, and
provide interoperability with clusters not running Guix.</li><li><em>Outreach & user support.</em> Reach out to the HPC and scientific
research communities and organize training sessions.</li></ul><p>The following sections detail work that has been carried out in each of
these areas.</p><h1>Reproducible Scientific Workflows</h1><p>Supporting reproducible research workflows is a major goal for Guix-HPC.
This section looks at progress made on packaging and tooling.</p><h2>Packages</h2><p>The package collection available from Guix keeps growing: as of this
writing, Guix itself provides more than 29,000 packages, all free
software, making it <a href="https://repology.org/"><strong>the fifth largest free software
distribution</strong></a>. With the addition of scientific
computing <em>channels</em>, users have access to more than 52,000 packages!</p><p>We updated the
<a href="https://github.com/UMCUGenetics/hpcguix-web">hpcguix-web</a> package
browser and <a href="https://hpc.guix.info/browse">its Guix-HPC instance</a> to make it
easier to search these channels, to navigate them, and to get set up
using them. The <a href="https://hpc.guix.info/channels">channels</a> page lists
channels commonly used by the scientific community. A noteworthy
example is <a href="https://github.com/guix-science/guix-science">Guix-Science</a>,
now home to hundreds of packages. Most of these channels are under
<em>continuous integration</em>, with pre-built binaries being published from
build farms such as <a href="https://guix.bordeaux.inria.fr">that hosted by
Inria</a>.</p><p><img src="/static/images/blog/guix-cran.png" alt="Logo of Guix-Bioc." /></p><p>Expanding on the introduction of the
<a href="https://github.com/guix-science/guix-cran"><code>guix-cran</code></a> channel last
year, we are happy to announce the new
<a href="https://github.com/guix-science/guix-bioc"><code>guix-bioc</code></a> channel.
This new channel makes most of the
<a href="https://bioconductor.org">Bioconductor</a> collection of R packages
available as Guix packages. Substitutes are provided by the build
farm at <code>guix.bordeaux.inria.fr</code> to speed up installation times. The
channel <strong>augments the collection of R packages</strong> provided by the
Guix default channel and the <code>guix-cran</code> channel. Creating and
updating <code>guix-bioc</code> is fully automated and happens without any human
intervention. The channel itself is always in a usable state, because
updates are tested with <code>guix pull</code> before committing and pushing
them. The same limitations of the <code>guix-cran</code> channel with regard to
potential build failures due to undeclared build or runtime
dependencies also apply to this channel. Improvements to the CRAN
importer in Guix, however, have allowed us to reduce the failure rate
and raise the quality of both channels.</p><p>These two automated channels grow the number of R packages available
in reproducible Guix environments by 21,635 to a total of 24,187.
Unlike other efforts that aim to provide binaries of R packages, the
collection of R packages in Guix fully captures all dependencies,
including those that would otherwise be considered “system
dependencies”, insulating Guix environments from system-level changes
over time. The increasing coverage of package sources archived by
<a href="https://www.softwareheritage.org">Software Heritage</a> puts Guix in a
unique position as a solid foundation for reliable long-term
reproducible research with R.</p><p><img src="/static/images/blog/rocm-logo.png" alt="AMD ROCm logo." /></p><p>A major highlight this year is the <a href="https://hpc.guix.info/blog/2024/01/hip-and-rocm-come-to-guix/"><strong>100+ packages contributed by
AMD</strong></a>
for its ROCm and HIP toolchain for GPUs. Those include 5 versions of
the entire <a href="https://hpc.guix.info/package/hipamd">HIP</a>/<a href="https://hpc.guix.info/package/rocm-toolchain">ROCm
toolchain</a>, all the way
down to LLVM and including support in communication libraries
<a href="https://hpc.guix.info/package/ucx">ucx</a> and
<a href="https://hpc.guix.info/package/openmpi">Open MPI</a>. Anyone who has tried
to package or to build this will understand that this is a major
contribution: the software stack is complex, requiring careful assembly
of the right versions or variants of each component.</p><p>Those packages are a boost for supercomputer users. We have been
able to use them to run HIP/ROCm benchmarks on the French national
supercomputer
<a href="https://genci.fr/en/centre-informatique-national-de-lenseignement-superieur-cines">Adastra</a>,
which features AMD Instinct MI250X GPUs, leveraging <code>guix pack</code> to ship
the code. We expect this joint effort with AMD to continue so we can
deliver other parts of the stack—e.g., rocBLAS, rocFFT, and related math
libraries—and to enable ROCm support in other packages such as PyTorch and Tensorflow.</p><p>For those systems where the HIP/ROCm stack cannot be used, the <a href="https://hpc.guix.info/channel/guix-science-nonfree">Guix
Science Nonfree
channel</a> provides
various versions of CUDA and cuDNN. This channel now also provides
CUDA-enabled variants of packages from the <a href="https://hpc.guix.info/channel/guix-science">Guix Science
channel</a> that only support
CPU-based inference. Of note is the addition of both the CPU- and
CUDA-enabled variants of JAX, the machine learning framework for
accelerated linear algebra and automated differentiation of numerical
functions. Recent versions of <strong>Tensorflow 2</strong> and related Tensorflow
libraries are now also available, thanks to the addition of a Bazel
build system abstraction in the <a href="https://hpc.guix.info/channel/guix-science">Guix Science
channel</a>.</p><p>Other notable additions to the <a href="https://hpc.guix.info/channel/guix-hpc">Guix-HPC
channel</a> include the plethora of
dependencies needed to build <a href="https://github.com/GEOS-DEV/GEOS">GEOS</a>, a
geophysical simulation framework, and
<a href="https://hpc.guix.info/package/medinria">medInria</a>, a medical image
processing and visualization package, both contributed by Inria
engineers.</p><h2>Guix Packager, a Packaging Assistant</h2><p><a href="https://guix.gnu.org/manual/devel/en/html_node/Defining-Packages.html">Defining
packages</a>
for Guix is not all that hard but, as always, it is much harder the first
time you do it, especially when starting from a blank page or not
being familiar with the programming environment of Guix. <a href="https://guix-hpc.gitlabpages.inria.fr/guix-packager/">Guix
Packager</a> is a <strong>new
web user interface to get you started</strong>.</p><p><img src="/static/images/blog/guix-packager.gif" alt="Screenshot of Guix Packager." /></p><p>The interface aims to be intuitive: fill in forms on the left and it
produces a correct, ready-to-use package definition on the right.
Importantly, it helps avoid pitfalls that trip up many newcomers:
adding an input adds the right variable name and modules; turning tests
on and off or adding configure flags can be achieved without prior
knowledge of the likes of keyword arguments and G-expressions.</p><p>While the tool's feature set provides a great starting point, there are still a
few things that may be worth implementing. For instance, only the GNU and
CMake build systems are supported so far; it would make sense to include
a few others (Python-related ones might be good candidates).</p><p>Ultimately, Guix Packager does not intend to provide a full package definition
editor, but rather a simple entry point for people looking into starting to
write package definitions.
It complements a set of steps we've taken over time to make packaging in Guix
approachable. Indeed, while package definitions are
actually code written in the Scheme language, the <code>package</code> “language”
was designed <a href="https://arxiv.org/abs/1305.4584">from the get-go</a> to be
fully declarative—think JSON with parentheses instead of curly braces and
semicolons.</p><h2>Nesting Containerized Environments</h2><p>The <a href="https://guix.gnu.org/manual/devel/en/html_node/Invoking-guix-shell.html"><code>guix shell --container</code></a>
(or <code>guix shell -C</code>) command lets users create isolated software
environments—<em>containers</em>—providing nothing but the packages specified
on the command line. This has proved to be a great way to ensure the
run-time environment of one’s software is fully controlled, free from
interference from the rest of the system.</p><p>Recently though, a new use case came up, calling for support of <strong>nested
containers</strong>. As <a href="https://issues.guix.gnu.org/62411">Konrad Hinsen
explained</a>, the need for nested
containers arises, for example, when dealing with workflow execution
engines such as Snakemake and CWL: users may wish to use Guix to
deploy both the engine itself <em>and</em> the software environment of the
tasks the engine spawns.</p><p>This is now possible thanks to the new <code>--nesting</code> or <code>-W</code> option, to be
used in conjunction with <code>--container</code> or <code>-C</code>. This option lets users
create <em>nested containerized environments</em> as in this example:</p><pre><code>guix shell --container --nesting coreutils -- \
guix shell --container python</code></pre><p>The “outer” <code>shell</code> creates a container that contains nothing but
<code>coreutils</code>—the package that provides <code>ls</code>, <code>cp</code>, and other core
utilities; the “inner” <code>shell</code> creates a new container that contains
nothing but Python. For a Snakemake workflow, one would run:</p><pre><code>guix shell --container --nesting snakemake -- \
snakemake …</code></pre><p>… which in turn allows the individual tasks of the workflow to run <code>guix shell</code> as well.</p><h2>Concise Common Workflow Language</h2><p>The <a href="https://hpc.guix.info/blog/2022/01/ccwl/">Concise Common Workflow Language
(ccwl)</a> is a <strong>concise syntax to
express Common Workflow Language (CWL) workflows</strong>. It is implemented as an
EDSL (Embedded Domain Specific Language) in Guile Scheme. Unlike workflow
languages such as the Guix Workflow Language (GWL), ccwl is agnostic to
deployment. It does not use Guix internally to deploy applications. It
merely picks up applications from <code>PATH</code> and thus interoperates well with Guix
and any other package managers of the user's choice. ccwl also compiles to
CWL and thus reuses all tooling built to run CWL workflows. Workflows written
in ccwl may be freely reused by CWL users without impediment, thus ensuring
smooth collaboration between ccwl and CWL users.</p><p><a href="https://ccwl.systemreboot.net">ccwl 0.3.0 was released in January 2024</a>.
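For a flavor of that concise syntax, here is a hello-world-style sketch modeled on ccwl's documented examples; treat the exact keywords as assumptions and refer to the ccwl manual for the authoritative syntax:

```scheme
;; Sketch of a ccwl workflow modeled on its documented examples;
;; keyword details may differ across ccwl versions.
(define print
  (command #:inputs (message #:type string)
           #:run "echo" message
           #:outputs (printed-message #:type stdout)))

(workflow ((message #:type string))
  (print #:message message))
```

Compiling such a file with ccwl produces plain CWL that any CWL runner can execute.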
ccwl 0.3.0 comes with significantly better compiler diagnostics, detecting
errors early and providing helpful messages to users. ccwl 0.3.0 also
adds new constructs to express scattering workflow steps and other more
complex workflows.</p><h2>Ensuring Source Code Availability</h2><p>Our joint effort with <a href="https://www.softwareheritage.org">Software
Heritage</a> (SWH) has made major strides
this year on the two main fronts: <strong>increasing archive coverage, and
improving source code recovery capabilities</strong>. The two are closely
related but involve different work; together, they contribute to making
Guix a tool of choice for reproducible research workflows.</p><p><img src="/static/images/blog/swh-guix.png" alt="Medley of the Software Heritage and Guix logos, by Marla Da Silva." /></p><p>Timothy Sample has been leading the archival effort and closely
monitoring it. His latest <a href="https://ngyro.com/pog-reports/2024-01-26/"><em>Preservation of Guix
Report</em></a>, published in
January 2024, reveals that 94% of the package source code referred to by
Guix at that time is archived in SWH. That number has been steadily
increasing since we started this effort in 2019. Archival coverage for
the entire 2019–2024 period is 85%. Having identified the missing bits,
the SWH team is now retroactively <a href="https://gitlab.softwareheritage.org/swh/infra/sysadm-environment/-/issues/5222">ingesting package source code of
historical Guix
revisions</a>.</p><p>Guix’s ability to recover source code from SWH has improved in part
thanks to the newly-added support for bzip2-compressed archives in
<a href="https://ngyro.com/software/disarchive.html">Disarchive</a>, the tool
designed to allow Guix to recover exact copies of source code <em>tarballs</em>
such as <code>.tar.gz</code> and <code>.tar.bz2</code> files.</p><p>A longstanding issue for automatic recovery from SWH is a mismatch
between the cryptographic hashes used in Guix and in SWH to refer to
content—a problem identified <a href="https://guix.gnu.org/en/blog/2019/connecting-reproducible-deployment-to-a-long-term-source-code-archive/">early
on</a>.
This has been addressed by a recent SWH feature deployed in
January 2024: SWH now computes and exposes nar SHA256 hashes for
directories—the very hashes used in Guix package definitions. Those
hashes are added as an extension of the SWH data model called <em>external
identifiers</em> or <em>ExtIDs</em>; the HTTP interface lets us obtain the SWHID
corresponding to a <code>nar-sha256</code> ExtID, which is exactly what was
necessary to ensure <em>content-addressed access</em> in all cases.
Consequently, the fallback code in Guix was changed to use that method.
This will allow Guix to recover source code for version control systems
(VCS) other than Git, which was previously not possible.</p><p><img src="/static/images/blog/hpcguix-web-swh-badge.png" alt="Software Heritage badge as shown by the hpcguix-web package browser." /></p><p>To make SWH archival more tangible to users and packagers, we modified
the hpcguix-web package browser, visible <a href="https://hpc.guix.info/browse">on the Guix-HPC web
site</a>, to include a <strong>source code archival
badge</strong> on every package page. The badge, served by SWH, is currently
shown both for packages whose source code is fetched from a Git
repository, and for packages whose source code is fetched from a
tarball. The information is comparable to that checked by the <code>guix lint -c archival</code> command.</p><h2>Reproducible Research in Practice</h2><p>In February 2023, Marek Felšöci defended his PhD thesis entitled <a href="https://theses.hal.science/tel-04077474"><em>Fast
solvers for high-frequency
aeroacoustics</em></a>. The thesis
was part of a collaboration between Inria and Airbus and deals with
direct methods for solving coupled sparse/dense linear systems.
Chapter 8 of the manuscript explains the strategy that was used to
achieve reproducible and verifiable results and how Guix, Software
Heritage, and other tools support it. It is another testimonial showing
how <strong>reproducible computational workflows</strong> can be achieved, even in a
demanding HPC context.</p><p>In a talk entitled <a href="https://hpc.guix.info/events/2023/workshop/video/everyone-can-learn-how-to-guix/"><em>Everyone Can Learn How to
Guix</em></a>,
medical doctor Nicolas Vallet defended a similar thesis: tools such as
Guix can support reproducible research workflows and be viewed as key
enablers even in scientific domains one might think of as detached from
software deployment considerations.</p><p><a href="https://numpex.org/">NumPEx</a> is the <strong>French national program for
exascale HPC</strong>, launched in mid-2023 with a 41 M€ budget for 6 years.
Its <a href="https://numpex.org/exadi-development-and-integration/">Development and Integration
project</a> aims to
ensure the dozens of HPC libraries and applications developed by French
researchers can easily be deployed on national and European clusters,
with high quality assurance levels. Guix is one of the deployment tools
used to achieve those goals and is well poised to do so. The project has
just recruited two engineers to help with packaging, continuous
integration, and training in this context.</p><p>We hope this will not only help create synergies with the broader Guix
community, but also contribute to increasing awareness about
reproducible deployment in HPC circles. Meanwhile, conducting
reproducible research on supercomputers that lack Guix is already possible:
by creating an image with <code>guix pack</code>, deploying it on
the supercomputer, and setting up the host environment properly.
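Concretely, producing such an image boils down to building a relocatable pack along these lines (a sketch; the package names are illustrative and should be replaced by the actual software stack):

```
# Build a doubly-relocatable tarball; package names are illustrative.
$ guix pack --relocatable --relocatable openmpi hwloc
# Copy the resulting archive to the cluster and unpack it, e.g. in the
# user's home directory.
```

Passing <code>--relocatable</code> twice makes the packed binaries fall back to PRoot-based relocation when unprivileged user namespaces are unavailable on the target kernel, a common situation on HPC systems.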
Experiments have shown that it
does not lead to any significant performance difference compared to
the same code and software stack deployed natively. The motivation,
technical details, and performance study were presented in a talk
entitled <a href="https://hpc.guix.info/events/2023/workshop/video/reconciling-high-performance-computing-with-the-use-of-third-party-libraries-/"><em>Reconciling high-performance computing with the use of
third-party
libraries?</em></a></p><p>Another aspect related to reproducibile HPC research and development
is the environment used to write code, document it, post-process data,
produce scientific reports. Offering researchers and developers a way
to share the <strong>exact same working environment</strong> is one way to facilitate
collaboration. The <a href="https://elementaryx.gitlabpages.inria.fr/">Elementary Emacs configuration coupled with
Guix</a> (ElementaryX) project
is an attempt towards such an elementary yet reproducible environment.</p><h1>Cluster Usage and Deployment</h1><p>The sections below highlight the experience of cluster administration
teams and report on tooling developed around Guix for users and
administrators on HPC clusters.</p><h2>Usage at the German Aerospace Center</h2><p>The Institute of Networked Energy Systems of the German Aerospace Center
(DLR) has <strong>set up a Guix installation in its HPC system</strong> and transitioned
several workflows to Guix, which are related to <a href="https://ads.atmosphere.copernicus.eu/cdsapp#!/dataset/cams-solar-radiation-timeseries?tab=overview">remote sensing and
solar surface
radiation</a>
and feed data into the European Copernicus Atmosphere Monitoring Service
<a href="https://atmosphere.copernicus.eu/">CAMS</a>. Similar to containers, Guix
software stacks are almost independent of the host system. However,
container support in HPC systems is limited and still evolving. Guix
relocation options offer more flexibility, and the software stack has
been successfully deployed in HPC clusters available to the DLR (like
<a href="https://www.dlr.de/en/research-and-transfer/research-infrastructure/hpc-cluster/cara">CARA</a>,
<a href="https://www.dlr.de/en/research-and-transfer/research-infrastructure/hpc-cluster/caro">CARO</a>
and
<a href="https://www.dlr.de/en/latest/news/2023/02/a-new-era-in-geoinformation-with-terrabyte">terrabyte</a>),
thereby enabling easy scaling of the radiation services.</p><h2>Guix System Cluster at GLiCID</h2><p><a href="https://www.glicid.fr/">GLiCID</a> is the HPC center for research in the
French region <em>Pays de la Loire</em>, resulting from the merger of
pre-existing HPC centers in the region.</p><p>The installation of new machines in June 2023 has led to the launch of a
new common system infrastructure—identity management, SLURM services,
databases, etc.—mostly independent from the solutions provided by the
manufacturers. Installed on two remote data centers, the infrastructure
needs to be highly available, and its deployment can be complex. The
team wanted to guarantee simple, predictable redeployment of the
infrastructure in the event of problems.</p><p>Guix, already offered to all cluster users, has a proven track record of
reproducibility, a desirable feature not just for scientific software
but also for the infrastructure itself. That is why the team embarked
on an effort to <strong>build its infrastructure with Guix System</strong>, which led to
the development of Guix System services for HPC—for OpenLDAP, SLURM, and
more. <a href="https://hpc.guix.info/events/2023/workshop/video/reproducible-virtual-machine-management-with-guix/">They
reported</a>
on the impact of these choices at the Workshop in Montpellier, and are
currently making progress to reach a 100% <em>Guixified</em> infrastructure.</p><h2>Pangenome Genetics Research Cluster at UTHSC</h2><p>At UTHSC, Memphis (USA), we are running a 16-node large-memory <a href="http://genenetwork.org/facilities/">Octopus
HPC cluster</a> (438 real CPU cores)
dedicated to pangenome and genetics research. In 2023, the cluster
effectively doubled in size with the addition of 192 CPU cores running at
4 GHz, 144,000 GPU cores, and SSDs, with storage adding up to 200 TB of
LizardFS fiber-optic-connected distributed network storage.</p><p>Notable about this HPC cluster is that it is <em>administered by the users
themselves</em>. Thanks to Guix, <strong>we install, run and manage the cluster as
researchers</strong>—and roll back in case of a mistake. UTHSC IT manages the
infrastructure—i.e., physical placement, electricity, routers and
firewalls—but beyond that there are no demands on IT. Thanks to
out-of-band access, we can completely (re)install machines
remotely. Octopus runs Guix on top of a minimal Debian install and we
are experimenting with pure Guix virtual machines and nodes that can be
run on demand. Almost all deployed software has been packaged in Guix
and can be installed on the head-node by regular users on the cluster
without root access. This same software is shared through NFS on the
nodes. See the
<a href="https://git.genenetwork.org/guix-bioinformatics/">guix-bioinformatics</a>
channel for all deployment configuration.</p><p>At FOSDEM 2023, Arun Isaac presented Tissue, our <a href="https://archive.fosdem.org/2023/schedule/event/tissue/">minimalist Git+plain text issue tracker</a> that allows us to move away from GitHub source code hosting, continuous integration (CI), and issue trackers.
We have also started to use Guix with the <a href="https://hpc.guix.info/blog/2022/01/ccwl-for-concise-and-painless-cwl-workflows/">Concise Common Workflow Language (CCWL)</a> for reproducible pangenome workflows (see above) on our Octopus HPC.</p><h2>Supporting RISC-V</h2><p>RISC-V is making inroads with HPC, e.g. <a href="https://riscv.org/blog/2023/07/risc-v-summit-europe-2023-highlights-from-barcelona/">in Barcelona</a> and with the new Barcelona Supercomputing Center Sargantana chip.</p><p>Christopher Batten (Cornell) and Michael Taylor (University of Washington) are in charge of <strong>creating the NSF-funded RISC-V supercomputer</strong> with 2,000 cores per node and 16 nodes in a rack (NSF PPoSS grant 2118709), targeting Guix driven pangenomic workloads by Erik Garrison, Arun Isaac, Andrea Guarracino, and Pjotr Prins.</p><p>The supercomputer will incorporate Guix and the GNU Mes bootstrap, with input from Arun Isaac, Efraim Flashner and others.
<a href="https://nlnet.nl">NLNet</a> funds work by Efraim Flashner on RISC-V support for the Guix <code>riscv64</code> target, as well as the GNU Mes RISC-V bootstrap project with Ekaitz Zarraga, Andrius Štikonas, and Jan Nieuwenhuizen. The bootstrap is now working from stage0 to tcc-boot0.</p><p>TinyCC compiles for the RISC-V target, but still has some issues to resolve. The next steps include compiling the GNU C library, various versions of GCC, and packages beyond.
GNU Mes 0.25.1 was released with RISC-V support and a <code>bootstrappable-tcc</code> branch. Both are available in Guix, though the RISC-V bootstrap is not yet enabled by default.</p><h1>Outreach and User Support</h1><p>Guix-HPC is in part about “spreading the word” about our approach to
reproducible software environments and how it can help further the goals of
reproducible research and high-performance computing development. This section
summarizes talks and training sessions given this year.</p><h2>Talks</h2><p>Since last year, we gave the following talks:</p><ul><li><a href="https://fosdem.org/2024/schedule/event/fosdem-2024-2651-making-reproducible-and-publishable-large-scale-hpc-experiments/"><em>Making reproducible and publishable large-scale HPC
experiments</em></a>,
HPC & Big Data track, FOSDEM, Feb. 2024 (Philippe Swartvagher)</li><li><a href="https://simon.tournier.info/posts/2023-12-14-seminar-pasteur.html"><em>Toward practical transparent, verifiable and long-term reproducible
research using
Guix</em></a>,
Institut Pasteur, Dec. 2023 (Simon Tournier)</li><li><a href="https://nextcloud.init.mpg.de/index.php/s/RgB7H9L4yart69z"><em>Reproducible software deployment in scientific computing</em></a>,
Event of the Max Planck Society, Sept. 2023 (Ricardo Wurmus)</li><li><em>Guix: Funktionale Paketverwaltung zur wirklichen Reproduzierbarkeit</em>,
Second IT4Science Days, Meeting of the Helmholtz Association and the Max Planck Society, Sept. 2023 (Ricardo Wurmus)</li><li><a href="https://2023.programming-conference.org/track/programming-2023-papers#program"><em>Building a Secure Software Supply Chain with
GNU Guix</em></a>,
Programming Conference, March 2023 (Ludovic Courtès)</li><li><a href="https://simon.tournier.info/posts/2023-02-23-seminar-irill.html"><em>Functional programming paradigm applied to package management: toward
reproducible computational
environment</em></a>,
IRILL, Feb. 2023 (Simon Tournier)</li><li><a href="https://archive.fosdem.org/2023/schedule/event/openresearch_guix/"><em>Guix, toward practical transparent, verifiable and long-term reproducible
research</em></a>,
Open Research Tools and Technology track,
FOSDEM, Feb. 2023 (Simon Tournier)</li><li><a href="https://scienceouverte.unistra.fr/websites/science-ouverte/science_ouverte/fichiers_23/rllr-1.pdf"><em>Vers une étude expérimentale reproductible avec GNU
Guix</em></a>,
<a href="https://scienceouverte.unistra.fr/formations/rencontres-logiciels-libres-de-recherche">Rencontres sur les logiciels libres de
recherche</a>,
Université de Strasbourg, Feb. 2023 (Marek Felšöci)</li><li><a href="https://archive.fosdem.org/2023/schedule/event/cpu_tuning_gnu_guix/"><em>Reproducibility and performance: why
choose?</em></a>,
HPC & Big Data track, FOSDEM, Feb. 2023 (Ludovic Courtès)</li></ul><p>To this list we should add 11 talks given for the First Workshop on
Reproducible Software Environments for Research and High-Performance
Computing, held in November 2023, for which <a href="https://hpc.guix.info/events/2023/workshop/program/">videos are now
on-line</a>.</p><h2>Events</h2><p>As in previous years, Pjotr Prins and Manolis Ragkousis spearheaded the organization of the
<a href="https://archive.fosdem.org/2023/schedule/track/declarative_and_minimalistic_computing/">“Declarative and minimalistic computing”
track</a>
at <strong>FOSDEM 2023</strong>, which was home to several Guix talks, along with the
satellite <strong>Guix Days</strong> where 50 Guix contributors gathered.</p><p>This year, we held a <a href="https://hpc.guix.info/blog/2023/05/reproducible-research-hackathon-let-redo/">second <strong>on-line reproducible research
hackathon</strong></a>. This hackathon was a collaborative effort to
leverage Guix to achieve reproducible software deployment for articles contributed to the online
journal <a href="https://rescience.github.io/">ReScience C</a>. As outlined in our <a href="https://hpc.guix.info/blog/2023/07/reproducible-research-hackathon-experience-report/">write-up on the experience</a>, this served as an excellent opportunity to put into practice our <a href="https://hpc.guix.info/blog/2023/06/a-guide-to-reproducible-research-papers/">guide to reproducible research
papers</a>,
and it helped us identify open issues for long-term and archivable
reproducibility.</p><p><img src="/static/images/workshop-group-photo-2023.jpg" alt="Group picture of the attendees on Friday, November 10th, 2023. By Tess Gobain." /></p><p>This year we organized the <a href="https://hpc.guix.info/events/2023/workshop/"><strong>First Workshop on Reproducible Software
Environments for Research and High-Performance
Computing</strong></a>, which took
place in Montpellier, France, in November 2023. Coming from France
primarily but also from Czechia, Germany, the Netherlands, Slovakia,
Spain, and the United Kingdom, among other countries, 120 people—scientists,
high-performance computing (HPC) practitioners, system administrators,
and enthusiasts alike—came to listen to the talks, attend the tutorials,
and talk to one another.</p><p>Our ambition was to gather people from diverse backgrounds with a shared
interest in improving their research workflows and development
practices. The 11 talks and 8 tutorials, along with the hallway
discussions and group dinner, have allowed us to share skills and
experience. Videos of the talks edited by the video team at Institut
Agro, our host, are available <a href="https://hpc.guix.info/events/2023/workshop/program/">on the event’s web
site</a>.</p><p>Many thanks to our publicly-funded academic sponsors who made this event
possible: ISDM, our primary sponsor for this event, Institut Agro for
hosting the workshop in such a beautiful place, and EuroCC² and Inria
Academy for their financial and logistical support. We look forward to
organizing a second edition!</p><h2>Training Sessions</h2><p>For the French HPC Guix community, we continued the monthly on-line event called <a href="https://hpc.guix.info/events/2022/café-guix/"><strong>Café Guix</strong></a>, originally started in October 2021. Each month, a user or developer informally presents a Guix feature or workflow and answers questions. These sessions are now recorded and are available on the web page, gathering up to 70 people. This is <a href="https://hpc.guix.info/events/2024/café-guix/">continuing in 2024</a>.</p><p>Pierre-Antoine Bouttier and Ludovic Courtès ran a 4-hour Guix training
session as part of the <a href="https://calcul.math.cnrs.fr/2023-06-anf-ust4hpc.html"><strong>User Tools for
HPC</strong></a> (UST4HPC) event
organized by CNRS (<em>action nationale de formation</em>, ANF) in June 2023.
The session targeted an audience of HPC system administrators with no
prior experience with Guix. Material (in French) is <a href="https://gitlab.inria.fr/guix-hpc/ust4hpc-2023">available
on-line</a>.</p><p>Marek Felšöci and Ludovic Courtès ran a 4-hour tutorial as part of the
<a href="https://2023.compas-conference.fr/">Compas</a> HPC conference, in
June 2023. The tutorial showed how to devise reproducible research workflows
combining the literate programming facilities of Org-Mode with Guix.
Supporting material <a href="https://gitlab.inria.fr/tutoriel-guix-compas-2023/">is available
on-line</a>.</p><p>On September 27, Ricardo Wurmus hosted a 3-hour tutorial on the use of
Guix for reproducible science as a session at the second <em>IT4Science
Days</em>, a joint meeting of representatives of the <em>Helmholtz
Association of German Research Centres</em> and the <em>Max Planck Society</em>.
The workshop was attended by system administrators and scientists hailing from research institutes all over Germany.</p><p>The <a href="https://hpc.guix.info/events/2023/workshop/"><strong>workshop on reproducible software
environments</strong></a> that took
place in Montpellier, France, in November 2023 was home to 8 tutorials,
half of which about Guix. Each Guix tutorial had a different target
audience: users-to-be (people with no prior experience with Guix),
novice packagers, experienced packagers, and system administrators.
Supporting material is available on the web page of the event.</p><p>A new <strong>MOOC on Reproducible Research practices</strong> has almost been completed. It will be stress-tested in February 2024 and open to the public on the <a href="https://www.fun-mooc.fr/">platform FUN</a> in spring. One of its three modules is about reproducible computational environments, introducing the various obstacles to reproducibility and presenting practical solutions. One of them is Guix, and in particular Guix containers defined by manifest files and frozen in time through channel files. Exporting such containers to Docker and Singularity is also discussed, because of the importance of these technologies in HPC.</p><h1>Personnel</h1><p>As part of Guix-HPC, participating institutions
have dedicated work hours to the project, which we summarize here.</p><ul><li>Inria: 3.5 person-years (Ludovic Courtès and Romain Garbage;
contributors to the Guix-HPC channel: Emmanuel Agullo, Julien
Castelnau, Luca Cirrottola, Marek Felšöci, Marc Fuentes, Nathalie
Furmento, Gilles Marait, Florent Pruvost, Philippe Swartvagher;
system administrator in charge of Guix on the
PlaFRIM and Grid’5000 clusters: Julien Lelaurain)</li><li>University of Tennessee Health Science Center (UTHSC): 3+ person-years (Efraim Flashner, Bonface Munyoki, Fred Muriithi, Arun Isaac, Andrea Guarracino, Erik Garrison and Pjotr Prins)</li><li>CNRS: 0.2 person-year (Konrad Hinsen)</li><li>CNRS and Université Grenoble-Alpes (GRICAD): 0.2 person-year (Céline Acary-Robert, Pierre-Antoine Bouttier)</li><li>Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC): 2 person-years
(Ricardo Wurmus, Navid Afkhami, and Mădălin Ionel Patrașcu)</li><li>Université Paris Cité: 0.75 person-year (Simon Tournier)</li></ul><p>Guix itself is a collaborative effort, receiving code contributions
from about 100 people every month, along with lots of crucial non-coding
contributions: organizing events, writing documentation, giving
tutorials, and more.</p><h1>Perspectives</h1><p>As the second decade dawns on the GNU Guix project, we shall take the
opportunity not only to look back on past achievements but also to evaluate
our current position with respect to our goals and adjust our
trajectory if necessary. Previous issues of the Activity Report had a
common refrain: the importance of continuous efforts to <strong>connect the
communities</strong> that meet at the intersection of Open Science,
reproducible research, software development, system administration,
and systems design. This issue is no different—the Guix-HPC effort
remains committed to strengthening the ability of these communities to
establish practices that further Open Science and make reproducible
research workflows accessible.</p><p>The <a href="https://hpc.guix.info/events/2023/workshop/"><strong>workshop on reproducible software
environments</strong></a> in
Montpellier may serve as an example of what this may look like in
practice. The presenters in these sessions discussed issues of
reproducible research and showcased the various roles Guix can assume
in a diverse community of research practitioners: whether as the core
of a platform for ad-hoc research environments; as the nexus that
binds medical data, the tools of interpretation, and the scientific
publication; or as the workhorse for reliably deploying entire HPC
sites. As a project whose development prioritizes increasing user
autonomy, Guix has clearly found its niche among enthusiastic Open
Science practitioners in a wide range of scientific fields.</p><p>While these activities are certainly encouraging, we need to
acknowledge the fact that this level of engagement is not
representative of the impact Guix has had on the wider scientific
community. Challenges remain in bringing all the benefits and
guarantees that Guix provides to <strong>where researchers actually do their
computing</strong>, to the systems that system administrators get to build and
maintain, and to the existing platforms and networks that represent
the landscape in which computer-aided research takes place.</p><p>On the technical side, this could mean contributing extensions to
existing workflow systems like Snakemake or Nextflow; developing tools
and implementing adapters for deploying Guix containers and virtual
machine images to platforms like OpenStack; or bridging gaps to
support users of commercial third-party cloud computing platforms
whose moats remain difficult to cross without leaving user autonomy
behind.</p><p>These technical goals are, of course, informed by the needs of members
of the reproducible research community who are currently represented
in the Guix-HPC efforts. In the coming year, we want to continue to
reach out to the wider community by organizing training sessions and
workshops, and to gain better insight into how we can improve Guix to
serve their needs. It is our mission to put the tools we build in the
hands of practitioners at large—and to shape these tools together.
Let’s talk—we’d love to <a href="https://hpc.guix.info/about">hear from
you</a>!</p>HIP and ROCm come to GuixLudovic Courtès, Thomas Gibson, Kjetil Haugen, Florent Pruvostguix-devel@gnu.org2024-01-30T15:30:00Z<p>We have some exciting news to share: AMD has just contributed 100+ Guix
packages adding several versions of the whole HIP and ROCm stack!
<a href="https://github.com/ROCm/ROCm">ROCm</a> is AMD’s <em>Radeon Open Compute
Platform</em>, a set of low-level support tools for general-purpose
computing on graphics processing units (GPGPUs), and
<a href="https://github.com/rocm-developer-tools/hip">HIP</a> is the <em>Heterogeneous
Interface for Portability</em>, a language one can use to write code
(computational kernels) targeting GPUs or CPUs. The whole stack is free
and “open source” software—a breath of fresh air!—and is seeing
increasing adoption in HPC. <em>And</em>, it can now be deployed with Guix!</p><p>In this post, written by AMD engineers and Inria research software
engineers, we look at the packages AMD contributed and how you can use
them, and we discuss the use cases at AMD and relation with the French and
European supercomputing environments.</p><p><img src="/static/images/blog/rocm-logo.png" alt="AMD ROCm logo." /></p><h1>More than 100+ packages</h1><p>The 100+ packages Kjetil Haugen and Thomas Gibson of AMD contributed
to the <a href="https://hpc.guix.info/channel/guix-hpc">Guix-HPC channel</a>
include 5 versions of the entire
<a href="https://hpc.guix.info/package/hipamd">HIP</a>/<a href="https://hpc.guix.info/package/rocm-toolchain">ROCm
toolchain</a>, all the way
down to LLVM and including support in communication libraries
<a href="https://hpc.guix.info/package/ucx">ucx</a> and
<a href="https://hpc.guix.info/package/openmpi">Open MPI</a>. Anyone who’s tried
to package or to build this will understand that this is a major
contribution: the software stack is complex, requiring careful assembly
of the right versions or variants of each component.</p><p>As always with Guix, a key element here is that the package set is
<em>self-contained</em>: these packages as well as those that depend on them do
not and in fact cannot rely on an external ROCm installation, contrary
to what is customary in HPC
environments.
This is what has allowed us to run the exact same software stack both at
AMD and on the French HPC clusters, as we will see below.</p><p>The foci of the initial packaging effort are to create a solid
interface between Guix and ROCm, and to provide the components
needed to start leveraging Guix for developing and deploying ROCm
applications. To that end we provide two primary packages as the
foundation for the AMD ROCm stack:</p><ol><li>The <a href="https://hpc.guix.info/package/rocm-toolchain">ROCm toolchain</a></li><li>The HIP runtime for the AMD platform: <a href="https://hpc.guix.info/package/hipamd">hipamd</a></li></ol><p>Note that all ROCm packages in Guix are considered experimental as
the modest patching required to adapt to the Guix ecosystem
implies that they deviate from the officially released ROCm binaries.
Also note that we may modify the design as we gain experience with
using Guix in our daily work.</p><p>The ROCm toolchain is analogous to
<a href="https://hpc.guix.info/package/clang-toolchain"><code>clang-toolchain</code></a>,
and provides the ROCm variants of core LLVM components, such
as clang, clang runtime, lld, libomp, and associated headers/binaries.
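</p><p>As a quick illustration (a sketch; the exact output depends on the packaged version), one can enter an environment containing just the toolchain and query its compiler:</p><pre><code class="language-bash">guix shell rocm-toolchain -- clang --version</code></pre><p>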
In addition, the ROCm toolchain also provides the necessary
ROCr/HSA runtimes and device libraries required for GPU offloading
support. All <a href="https://rocm.docs.amd.com/en/docs-5.5.1/release/gpu_os_support.html#linux-supported-gpus">supported GPU architectures</a>
can be found via AMD’s official ROCm documentation.</p><p>The implementation of the HIP runtime for AMD GPUs, <a href="https://hpc.guix.info/package/hipamd">hipamd</a>, is an extension of the ROCm
toolchain which provides necessary headers and the compiler wrapper
<code>hipcc</code>. This is the primary user-facing package for developing
or deploying applications using HIP; it provides a basic toolchain
for most GPU kernel development, but does not include math libraries such
as rocBLAS or rocFFT. Math libraries will be provided at a later date.</p><p>Because both hardware and software advance quite
rapidly, we make generous use of generator functions that enable
the installation of multiple versions of ROCm/HIP to ensure that both
existing stable versions and the latest releases can be made easily
available. Having older versions available ensures that projects
relying on a particular release of ROCm/HIP are not disrupted.
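</p><p>To see which versions are available at a given channel revision, one can list the matching packages; the set of versions returned will of course vary over time:</p><pre><code class="language-bash">guix package --list-available=hipamd</code></pre><p>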
This also enables developers to examine performance impacts between
versions to help guide their optimization efforts and track
regressions/improvements.</p><p>As an application developer using Guix, you can use the
<code>guix shell</code> command to create environments (on top of your
system environment or completely isolated) with a fully functional
HIP toolchain for any version you specify. For example:</p><pre><code class="language-bash">guix shell hipamd@5.7.1</code></pre><p>This shell will contain not only the standard ROCm-based Clang toolchain
and its associated compilers/linkers, but will also
provide <code>hipcc</code> and its associated utilities such as <code>hipconfig</code>
(for HIP and Clang versions, include paths, and built-in flags)
and <code>rocminfo</code> (for querying device information).</p><pre><code class="language-shell">[env]$ ls -l `which hipcc`
lrwxrwxrwx 1 root root 66 Dec 31 1969 /gnu/store/2j5hqm1rk7q8h3ivwklpwmiv8nzkq15v-profile/bin/hipcc -> /gnu/store/kcfisihalab9fh75dd15rzwj30mv34bk-hipamd-5.7.1/bin/hipcc
[env]$ hipcc --version
HIP version: 5.7.1
clang version 17.0.0
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /gnu/store/r9zz6hjmgs2c79091s0s9zc43d0zq9vc-rocm-toolchain-5.7.1/bin</code></pre><p>As an illustrative example, we can clone the open-source STREAM project for GPUs,
<a href="https://github.com/UoB-HPC/BabelStream">BabelStream</a>, and directly compile and
run the HIP implementation of the benchmark:</p><pre><code class="language-shell">[env]$ git clone git@github.com:UoB-HPC/BabelStream.git</code></pre><p>Once the repository is cloned, we can build the project using CMake
as shown below:</p><pre><code class="language-shell">[env]$ cd BabelStream/
[env]$ cmake -Bbuild -H. -DMODEL=hip -DCMAKE_CXX_COMPILER=hipcc
[env]$ cmake --build build</code></pre><p>If neither Git nor CMake is available on your system, you can simply add both <code>git</code>
and <code>cmake</code> to your <code>guix shell</code> command to automatically install them into your environment!</p><p>And finally, you can run the executable and immediately observe
the measured streaming performance:</p><pre><code class="language-shell">[env]$ ./build/hip-stream
BabelStream
Version: 5.0
Implementation: HIP
Running kernels 100 times
Precision: double
Array size: 268.4 MB (=0.3 GB)
Total size: 805.3 MB (=0.8 GB)
Using HIP device AMD Radeon RX 6800 XT
Driver: 50731921
Memory: DEFAULT
Init: 0.150206 s (=5361.344563 MBytes/sec)
Read: 0.212430 s (=3790.920912 MBytes/sec)
Function    MBytes/sec  Min (sec)   Max         Average
Copy        520715.707  0.00103     0.00104     0.00103
Mul         450652.522  0.00119     0.00120     0.00119
Add         438387.222  0.00184     0.00186     0.00184
Triad       448402.828  0.00180     0.00180     0.00180
Dot         438838.728  0.00122     0.00123     0.00123</code></pre><p>This example shows how to obtain an interactive development environment
with <code>guix shell</code>, but if all you want is BabelStream, there’s a
ready-to-use <a href="https://hpc.guix.info/package/babelstream-hip">package</a>.</p><h1>Benchmarks</h1><p><img src="/static/images/blog/adastra-banner.jpg" alt="Banner showing of one of the Adastra racks." /></p><p>Adastra, one of the French national supercomputers, builds upon AMD
GPUs. It’s a <a href="https://genci.fr/en/centre-informatique-national-de-lenseignement-superieur-cines">78 PFlop
machine</a>
that was <a href="https://genci.fr/actualites/adastra-near-30-grands-challenges-towards-more-sustainable-science-and-already-first">ranked #3 in the November 2023 edition of
Green500</a>.
ROCm and HIP are available pre-installed on Adastra, but naturally, we
at Inria wanted to ensure that those packages that had been tested at
AMD would also give the expected performance on this machine.
Guix is currently unavailable on Adastra, so
we created a bundle of <a href="https://hpc.guix.info/package/hpcg">hpcg</a>, a
synthetic benchmark that exercises HIP, to ship it over to Adastra:</p><pre><code>guix pack -RR hpcg bash-minimal -S /bin=bin</code></pre><p>After unpacking, the resulting bundle lets us run <code>hpcg</code> on a single
node of Adastra—each node contains 4 AMD MI250X GPUs, each with 2
Graphics Compute Dies (GCDs) for a total of 8 GCDs per node. We’d first
allocate 8 CPUs on one node with SLURM:</p><pre><code>salloc --time=01:00:00 --nodes=1 --ntasks-per-node=8 --cpus-per-task=8 \
--gpus-per-task=1 --threads-per-core=1 --exclusive --account=ces1926 \
--constraint=MI250 --mem=256000
ssh $SLURM_NODELIST</code></pre><p>… and then run our Guix-built <code>hpcg</code> on the compute node, with 8 MPI
processes:</p><pre><code>module purge
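# Beforehand, unpack the relocatable archive produced by ‘guix pack’
# (the tarball name below is illustrative):
#   mkdir -p $HOME/guix/hpcg && tar xf hpcg-pack.tar.gz -C $HOME/guix/hpcg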
GUIX_ROOT=$HOME/guix/hpcg
${GUIX_ROOT}/bin/mpirun -n 8 --map-by L3CACHE \
--launch-agent ${GUIX_ROOT}/bin/orted \
-x GUIX_EXECUTION_ENGINE=performance \
${GUIX_ROOT}/bin/rochpcg 280 280 280 180</code></pre><p>Notice that we’re using the Guix-provided <code>mpirun</code>. We run <code>module purge</code> to avoid interference from environment modules available on the
system. By setting <code>GUIX_EXECUTION_ENGINE</code> to <code>performance</code>, we
instruct the Guix-provided wrapper of <code>hpcg</code> to <a href="https://hpc.guix.info/blog/2020/05/faster-relocatable-packs-with-fakechroot/">select a relocation
mechanism with no
overhead</a>.</p><p>The benchmark prints the kind of output we expected:</p><pre><code>Total Time: 181.62 sec
Setup Time: 0.06 sec
Optimization Time: 0.12 sec
DDOT = 1809.6 GFlop/s (14476.5 GB/s) 226.2 GFlop/s per process ( 1809.6 GB/s per process)
WAXPBY = 804.0 GFlop/s ( 9648.2 GB/s) 100.5 GFlop/s per process ( 1206.0 GB/s per process)
SpMV = 1465.6 GFlop/s ( 9229.1 GB/s) 183.2 GFlop/s per process ( 1153.6 GB/s per process)
MG = 1935.1 GFlop/s (14934.8 GB/s) 241.9 GFlop/s per process ( 1866.9 GB/s per process)
Total = 1795.6 GFlop/s (13616.4 GB/s) 224.4 GFlop/s per process ( 1702.1 GB/s per process)
Final = 1647.8 GFlop/s (12495.8 GB/s) 206.0 GFlop/s per process ( 1562.0 GB/s per process)</code></pre><p>The software stack was packaged once and can now be used on a variety of
machines without spending hours or days in deployment and testing. That
alone is no small feat in a world where <em>ad hoc</em> HPC cluster deployments
remain the norm.</p><h1>Guix at AMD</h1><p><img src="/static/images/blog/amd-lab-notes.png" alt="Logo of “AMD lab notes”." /></p><p>Currently, the use of Guix within AMD is a grassroots effort among members
of the Data Center GPU Software Solutions Group. The team engages in porting
and optimization of HPC applications across a variety of engineering
disciplines, organizes ROCm training and hackathons, provides feedback to
ROCm development teams, and participates in the bring-up process
preceding the release of new hardware. More details about our activities
can be found at <a href="https://gpuopen.com/learn/amd-lab-notes/amd-lab-notes-readme/">AMD lab notes</a>.</p><p>Compared to most engineers, we touch a larger number of applications, across a
larger number of HPC systems, and with a greater variety of software dependencies
and GPU architectures. An immediate consequence is that the overhead of dependency
management can become quite significant. Moreover, the effort is often duplicated
between engineers working on applications with similar dependencies, system
administrators providing environment modules, and deployment engineers preparing
container images and recipes.</p><p>As a functional package manager, Guix promises deduplication and reproducibility.
In other words, <strong>if a package description is created by someone somewhere, it can
be used by anyone anywhere</strong>! Guix is already providing a lot of value for individual
engineers. The primary use case is to allow the use of less contested resources
for development (workstations with gaming cards) and to reserve more contested resources
for performance testing (nodes with emerging GPU architectures). We are currently
considering using Guix to <a href="https://hpc.guix.info/blog/2022/05/back-to-the-future-modules-for-guix-packages/">create environment
modules</a>
and are working on integrating
<a href="https://guix.gnu.org/en/cuirass/">Cuirass</a> into engineering workflows.</p><p>After using Guix extensively to package ROCm, there are two things missing to
better support GPU-based development. First, a mechanism for running unit tests
on the GPU. This is currently impossible because the isolated environments in
which Guix builds packages do not expose the GPU. Second, a mechanism to
specify the target GPU architecture on the fly—e.g., through package transformations.
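</p><p>Were such a transformation available, its usage might resemble Guix’s existing package transformation options; the <code>--with-gpu-arch</code> flag below is purely hypothetical:</p><pre><code class="language-bash">guix shell hipamd --with-gpu-arch=hipamd=gfx90a</code></pre><p>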
The size of many GPU libraries is proportional to the number of GPU architectures
supported and limiting the support only to the GPUs available on the system of
interest is good software hygiene and may significantly reduce compilation time.</p><p>Beyond that, we are mostly happy with the range of functionality Guix offers.
However, we would like a more interactive debugging environment. Keeping the build
directory with <code>guix build -K</code> and subsequently running <code>guix shell --container</code>
on that directory as described in the <a href="https://guix.gnu.org/manual/devel/en/html_node/Debugging-Build-Failures.html">Guix manual</a>
gets us close, but providing a <code>gdb</code>-like user experience where we can set breakpoints,
and list, inspect, step through, modify, and rerun build phases would be helpful.</p><h1>HIP, Guix, and HPC in Europe</h1><p>HPC research teams at Inria develop software ranging from run-time
support libraries such as <a href="https://hpc.guix.info/package/starpu">StarPU</a>
and <a href="https://hpc.guix.info/package/hwloc">hwloc</a>, to linear algebra
solvers such as <a href="https://hpc.guix.info/package/chameleon">Chameleon</a>, to
numerical simulation libraries. Having the HIP/ROCm stack packaged in
Guix allows us to deploy and run those even more complex stacks on
supercomputers and readily take advantage of their processing power
without going through a tedious installation and testing process.</p><p>This makes even more of a difference considering the breadth and depth
of HPC software developed in <a href="https://numpex.org/">NumPEx</a>. NumPEx is
the French national program for exascale HPC, launched in mid-2023 with
a 41 M€ budget for 6 years. Its <a href="https://numpex.org/exadi-development-and-integration/">Development and Integration
project</a> aims to
ensure the dozens of HPC libraries and applications developed by French
researchers can easily be deployed on national and European clusters,
with high quality assurance levels. Guix is one of the deployment tools
used to achieve those goals, and it is well poised to do so; having a
well-tested GPGPU package set makes it an even better fit.</p><p>It remains to be seen whether
<a href="https://eurohpc-ju.europa.eu/jules-verne-consortium-will-host-new-eurohpc-exascale-supercomputer-france-2023-06-20_en">Jules-Verne</a>,
the EuroHPC exascale supercomputer to be hosted in France in 2025, will
provide AMD GPUs. Given that the software stack for these GPUs is free
software, this would send a strong signal in favor of Open Science, in
line with the <a href="https://en.unesco.org/science-sustainable-future/open-science/recommendation">recommendations of
UNESCO</a>
and those of the <a href="https://www.ouvrirlascience.fr/second-national-plan-for-open-science/">French Plan for Open
Science</a>.</p><h1>This is just the beginning</h1><p>All these packages are available from the <a href="https://hpc.guix.info/channel/guix-hpc">Guix-HPC
channel</a>; they are
continuously built on <a href="https://guix.bordeaux.inria.fr/eval/latest/dashboard?spec=guix-hpc">the build farm at
Inria</a>,
providing users with readily usable binaries.</p><p>With the HIP and ROCm foundations in place, there’s a lot on our agenda:
providing rocBLAS, rocFFT, and related math libraries, taking advantage
of these in the linear algebra and numerical simulation packages
developed at Inria and in NumPEx, and working with the broader Guix
community to provide ROCm-enabled variants of major packages like
<a href="https://hpc.guix.info/package/python-pytorch">PyTorch</a>. We plan to
make the ROCm/HIP packages part of the main Guix channel once we have
gained enough experience. The other important benefit we expect from
this collaboration is to better cater to the needs of engineers at AMD.</p><p>Working together in the open has been a fruitful and pleasant experience
and we can already foresee lots of opportunities to keep this going!</p>Videos of the 2023 workshop are on-lineCéline Acary-Robert, Pierre-Antoine Bouttier, Ludovic Courtès, Alexandre Dehne-Garcia, Simon Tournierguix-devel@gnu.org2024-01-29T10:00:00Z<p>Back in November, the <a href="https://hpc.guix.info/events/2023/workshop/">First Workshop on Reproducible Software
Environments for Research and High-Performance
Computing</a> was held in
Montpellier, France. Coming from France primarily but also from
Czechia, Germany, the Netherlands, Slovakia, Spain, and the United
Kingdom to name a few, 120 people—scientists, high-performance computing
(HPC) practitioners, system administrators, and enthusiasts alike—came
to listen to the talks, attend the tutorials, and talk to one another.</p><p><img src="https://hpc.guix.info/static/images/workshop-group-photo-2023.jpg" alt="Group picture of the attendees on Friday, November 10th, 2023. By Tess Gobain." /></p><p>Our ambition was to gather people from diverse backgrounds with a shared
interest in improving their research workflows and development
practices. The 11 talks and 8 tutorials, along with the hallway
discussions and group dinner (very nice!), have allowed us to share
skills and experience.</p><p>Today, we’re publishing <a href="https://hpc.guix.info/events/2023/workshop/program"><strong>videos of the
talks</strong></a> including
short interviews with the
<a href="https://hpc.guix.info/events/2023/workshop/speakers">speakers</a>
(tutorials were not recorded but supporting material is linked from the
<a href="https://hpc.guix.info/events/2023/workshop/program">program</a>).</p><p>Our gratitude goes to the video team at Institut Agro for taking care of
the live stream during the event and for editing those videos—thank you!
Many thanks to our publicly-funded academic sponsors who made this event
possible: <a href="https://isdm.umontpellier.fr/">ISDM</a>, our primary sponsor for
this event, <a href="https://www.institut-agro.fr/en">Institut Agro</a> for hosting
the workshop in such a beautiful place, and
<a href="https://www.eurocc-access.eu/">EuroCC²</a> and <a href="https://www.inria-academy.fr/">Inria
Academy</a> for their financial and
logistical support.</p><p>“When will the <em>second</em> workshop take place?”, participants asked as we were
wrapping up. We don’t know yet, but if you’d like to host the next
edition or to sponsor it, do <a href="mailto:contact-guixhpc-days@services.cnrs.fr">get in touch with
us</a>!</p><p>The <em>bonus video</em> below will give you a feel of what the event in
Montpellier was like…</p><p><img src="/static/videos/workshop-2023/99-aftermovie.webm" alt="Short video giving an overview of the event and the venue." /></p><blockquote><p><em>Video by Institut Agro’s video team, published under
<a href="https://creativecommons.org/licenses/by-nc/3.0/">CC-BY-NC 3.0</a>.
<a href="https://git.savannah.gnu.org/cgit/guix/guix-artwork.git/tree/promotional/guix-hpc-workshop-2023">Guix
artwork</a>
by Luis Felipe published under
<a href="https://creativecommons.org/licenses/by-sa/4.0/">CC-BY-SA 4.0</a>.</em></p></blockquote><p>Enjoy!</p>Announcing the First Workshop on Reproducible Software EnvironmentsSimon Tournier, Ludovic Courtèsguix-devel@gnu.org2023-09-18T14:30:00Z<p>We’re excited to announce the <a href="https://hpc.guix.info/events/2023/workshop/">First Workshop on Reproducible Software
Environments for Research and High-Performance Computing
(HPC)</a>, which will take
place in <strong>Montpellier, France</strong>, on <strong>November 8–10th, 2023</strong>! The
<a href="https://hpc.guix.info/events/2023/workshop/program/">preliminary
program</a> is
on-line, and now’s the time for you to
<a href="https://repro4research.sciencesconf.org/registration">register</a>!</p><p>This event can be seen as a follow-up to the research session of the <a href="https://10years.guix.gnu.org/program/#Friday">Ten
Years of Guix</a> event and
the earlier <a href="https://hpc.guix.info/events/2021/atelier-reproductibilit%C3%A9-environnements/">French-speaking Workshop on Reproducible Software
Environments</a>.</p><p>The <a href="https://hpc.guix.info/events/2023/workshop/program">program</a> features
talks by scientists, engineers, and system administrators from different
backgrounds who will share their experience with Guix, as well as tutorials on
GitLab, Guix, and other tools that support scientific workflows—from bioinfo
analyses to HPC and source code archival.</p><p>The <a href="https://hpc.guix.info/events/2023/workshop/speakers">list of
speakers</a> shows a
variety of positions and scientific disciplines—psychology, linear
algebra, biophysics, medicine, bioinformatics, system administration—which we believe
shows that software environment reproducibility is a cross-cutting
concern that can be tackled whether or not one identifies themself as a
“geek”.</p><p><a href="https://hpc.guix.info/events/2023/workshop/speakers/#Yann-Dupont">Yann
Dupont</a>,
system architect at the GLiCID HPC center in France, writes:</p><blockquote><p>Guix is the perfect Swiss Army knife that every digital plumber
should have in their toolkit. We use it extensively, not only to
enhance the software we offer to researchers, but also to build the
GLiCID infrastructure.</p></blockquote><p>Working with bioinformatics and genomics research teams, <a href="https://hpc.guix.info/events/2023/workshop/speakers/#Ricardo-Wurmus">Ricardo
Wurmus</a>,
also known for his outstanding contributions to Guix, subscribes to this view:</p><blockquote><p>As a software engineer working in large and often changing teams I
depend on Guix to ensure that development environments as well as
complicated production deployments are free from surprises. In my role
to support researchers with complex scientific software environments I
cannot think of a more flexible and reliable foundation for reproducibly
customizable deployments to laptops, HPC systems, and the cloud. I’m
excited to see that the program is full of experience reports and
tutorials by experienced HPC practitioners, and I can’t wait to get a
chance to learn more about how Guix is used in other research
environments.</p></blockquote><p>To <a href="https://hpc.guix.info/events/2023/workshop/speakers/#Nicolas-Vallet">Nicolas
Vallet</a>,
medical doctor and researcher in the Hematology and Cell Therapy department of
the University Hospital of Tours (France), it’s about sharing research results:</p><blockquote><p>As a scientist, I've experienced frustration when attempting to run packages
described in research papers but encountering compatibility issues with my
system. My goal is to ensure that my research will be accessible to a wide
audience, regardless of their location or technical expertise. Guix has
provided me with a solution to achieve it. I'm now proud to share not only
the raw data and analysis pipeline from my projects, but also detailed
instructions on how to recreate the transparent computational environment
used, making my research more accessible to others.</p></blockquote><p>Whether you’re a scientist, a practitioner, a newcomer, or a power user, we’d
love to see you in November.</p><p>Stay tuned for updates!</p>Reproducible research hackathon: experience reportSimon Tournier, Ludovic Courtèsguix-devel@gnu.org2023-07-12T15:20:00Z<p>Two weeks ago, on June 27th, we held an second <a href="https://hpc.guix.info/blog/2023/05/reproducible-research-hackathon-let-redo/">on-line
hackathon</a>
on reproducible research issues. This hackathon was a collaborative effort to
bring GNU Guix to concrete examples inspired by contributions to the online
journal <a href="https://rescience.github.io">ReScience C</a>.</p><p>A small but enthusiastic group of about 5 people connected to the
<code>#guix-hpc</code> IRC channel on Libera.chat and hacked the good
reproducibility hack.
The day was interspersed with three video chats: the first to discuss
interests, backgrounds, and the working plan; the second to report on work in
progress; and the last to review achievements and list future ideas.</p><p>As we have been
<a href="https://hpc.guix.info/blog/2023/06/a-guide-to-reproducible-research-papers/">advocating</a>,
this command line:</p><pre><code>guix time-machine -C channels.scm -- shell -m manifest.scm</code></pre><p>… captures all the requirements for redeploying the same computational
environment. Specifically:</p><ul><li><code>channels.scm</code> pins a specific revision of Guix and potentially <a href="https://hpc.guix.info/channels/">other
channels</a>;</li><li><code>manifest.scm</code> specifies the packages required by the computational
environment.</li></ul><p>The three goals of the hackathon were:</p><ol><li>Pick a <a href="https://github.com/ReScience/submissions/issues?q=is%3Aissue+is%3Adone+">ReScience C
submission</a>
and add these two files: <code>channels.scm</code> and <code>manifest.scm</code>.</li><li>If needed, <a href="https://guix.gnu.org/manual/en/html_node/Defining-Packages.html">define
packages</a>.
These could then go to Guix itself or one of the relevant
dedicated channels:
<a href="https://github.com/guix-science/guix-science">Guix-Science</a>,
<a href="https://gitlab.inria.fr/guix-hpc/guix-past">Guix-Past</a>, etc.</li><li>Identify open issues that hinder reproducibility of software environment
environments.</li></ol><p>Here’s a recap. TLDR, it was a success!</p><h1>Complete “Guixification”</h1><p>These two papers based on Python software were considered:</p><ul><li><a href="https://rescience.github.io/bibliography/Torre-Ortiz_2021.html"><em>[Re] Neural Network Model of Memory
Retrieval</em></a>,
ReScience C 6, 3, #8, 2021.</li><li><a href="https://rescience.github.io/bibliography/Misiek_2022.html"><em>[Re] A general model of hippocampal and dorsal striatal learning and
decision
making</em></a>,
ReScience C 8, 1, #4, 2022.</li></ul><p>Writing the two files, <code>channels.scm</code> and <code>manifest.scm</code>, was rather
straightforward. This led to two pull requests against the original papers:
<a href="https://github.com/c-torre/replication-recanatesi-2015/pull/1">here</a> and
<a href="https://github.com/thomasMisiek/mixed-coordination-models/pull/9">there</a>.
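</p><p>For illustration, here is roughly what such files look like; the commit hash and package names below are placeholders rather than those of the actual papers:</p><pre><code class="language-scheme">;; channels.scm: pin the exact Guix revision used for the paper.
(list (channel
       (name 'guix)
       (url "https://git.savannah.gnu.org/git/guix.git")
       (commit "0123456789abcdef0123456789abcdef01234567")))

;; manifest.scm: the packages the computational environment needs.
(specifications->manifest
 (list "python" "python-numpy" "python-matplotlib"))</code></pre><p>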
Nothing fancy: most of the work consisted in “translating” the
<code>requirements.txt</code> file used by <code>pip</code> to <code>manifest.scm</code>.</p><p>On a side note, would it be possible to take advantage of GitHub’s
continuous integration,
<em>GitHub Actions</em>, to guide the review process? The first idea would be to
let GitHub Actions run part of the numerical processing. However, the
resources offered by GitHub are limited or are not suitable for numerical
experiments. Instead, GitHub Actions can be used to
<a href="https://guix.gnu.org/manual/devel/en/guix.html#Invoking-guix-pack">pack</a> the
software environment and publish the resulting artifact. For instance, Docker
images are popular and Guix can <a href="https://guix.gnu.org/manual/devel/en/guix.html#index-Docker_002c-build-an-image-with-guix-pack">produce
them</a>;
for details about producing Docker images using Guix on top of GitHub
Actions, see <a href="https://github.com/zimoun/mixed-coordination-models/commit/322e17a60e09e2b17af6c6b04ffdcc67d9990dfa">this
example</a>
based on the ReScience article above (8, 1, #4, 2022). In a nutshell, GitHub Actions
runs the following command:</p><pre><code>guix time-machine -C channels.scm \
-- pack -f docker --save-provenance -m manifest.scm</code></pre><p>A reviewer could then load this Docker image artifact produced by Guix. Or
they could directly generate the software environment from the files
<code>channels.scm</code> and <code>manifest.scm</code>. Either way, a reviewer is thus able
to inspect the software environment of the submission. Last, because of the
<code>--save-provenance</code> option, the Docker image carries the Guix provenance
information needed for <a href="https://hpc.guix.info/blog/2021/10/when-docker-images-become-fixed-point/">reproducing
itself</a>.</p><h1>Partial port to Guix</h1><p>Other papers tracked by ReScience were also considered:</p><ul><li><a href="https://rescience.github.io/bibliography/Wallrich_2022.html"><em>[Re] Groups of diverse problem-solvers outperform groups of
highest-ability problem-solvers - most of the
time</em></a>,
8, 1, #6, 2022.</li><li><a href="https://rescience.github.io/bibliography/Boersch-Supan_2021.html"><em>[Re] Modeling Insect Phenology Using Ordinal Regression and Continuation
Ratio</em></a>,
7, 1, #5, 2021.</li><li><a href="https://github.com/ReScience/submissions/issues/69"><em>[Re] A circuit model of auditory
cortex</em></a>,
review still pending.</li><li><a href="https://github.com/ReScience/submissions/issues/43"><em>[Re] Particle Image Velocimetry with Optical
Flow</em></a>; the initial paper dates
from 1998 and the reproduction was submitted to the <a href="http://rescience.github.io/ten-years/">Ten Years
Reproducibility Challenge</a>.</li></ul><p>We did not complete the reproduction of all of these papers using Guix
due to lack of time or computational resources. Progress on the first
paper is visible in this <a href="https://github.com/civodul/diversity_abm_replication/tree/guix-environment">Git
repository</a>.
The main pitfall illustrated by this paper is that not all of the
experiment’s source code was available in the repository; some of it was
stored elsewhere on-line and transparently downloaded and run <em>via</em>
Python’s <a href="https://github.com/operatorequals/httpimport"><code>httpimport</code></a>.
This is problematic for several reasons: that code might simply vanish,
it could be modified between the time the authors submitted the paper
and the time someone else attempts to reproduce it, or it could be
<em>maliciously</em> modified. The solution was to get the current copy of the
relevant code inside the repository and to remove uses of <code>httpimport</code>.
This experiment is computationally very expensive though, and we could
not run it in time on our local cluster.</p><p>For the second paper, the main difficulty was related to time zones: the
variable <code>TZDIR</code> required an adjustment. Fortunately, thanks to the
<a href="https://guix.gnu.org/manual/devel/en/guix.html#Inferiors">inferiors</a> Guix
feature, a custom manifest combining two different Guix revisions made it possible to
generate the software environment, based on the R ecosystem, in which the numerical
experiment of the paper can be run.</p><p>The ReScience reviewer of the third paper took advantage of the hackathon to
resume the review and try Guix for the software environment. The files <code>channels.scm</code>
and <code>manifest.scm</code> were created without any big issue. The paper’s
computational experiment runs in a Jupyter Notebook, and it works
out-of-the-box with the <code>--pure</code> option of <code>guix shell</code>—running it with <code>--container</code>,
for improved isolation, is left as an exercise for the reader. One
drawback was that the paper’s author invokes <code>apt install</code> in the middle of
the notebook. On the Guix side, one difficulty was
<a href="https://guix.gnu.org/manual/devel/en/guix.html#Using-TeX-and-LaTeX">finding the right TeX Live
packages</a>;
another one was the interaction with the Python library <code>matplotlib</code>, which can be
troublesome. The session was a double opportunity: to dive into Guix-specific
details—this hackathon was the right place to share knowledge!—and to advance
this specific review, which started in March and is now almost finished. Win-win!</p><p>The fourth and last paper was a challenge: produce a software environment
where C code from 1998 can run. And that’s a <a href="https://github.com/ReScience/submissions/issues/43#issuecomment-1611213643">positive
result</a>!
The two tables agree with those in the paper. The C code compiles and runs,
although some warnings are raised (they can be silenced with specific
compiler flags), and the Bash shell scripts are not fully portable and required
minor tweaks. The C code has no dependencies, which significantly
simplifies portability and eases reproducibility.</p><h1>Towards long-term and archivable reproducibility</h1><p>Over the years of running Guix daily in a scientific context, we have
already identified many potential roadblocks to achieve long-term
reproducible software environments—from unfixed bugs to
unimplemented features. Verifiable environment deployment
can only be achieved when all the following conditions are met:</p><ul><li>availability of all the source code;</li><li>backward-compatibility of the Linux kernel system call interface;</li><li>some compatibility of the hardware (CPU, etc.);</li><li>no “time bomb”—software whose behavior is a function of the current
time.</li></ul><p>This hackathon was a nice opportunity to check the status of these
conditions and to list what already works and what still remains, all based on a
concrete example:</p><ul><li><a href="http://rescience.github.io/bibliography/Courtes_2020.html"><em>[Re] Storage Tradeoffs in a Collaborative Backup Service for Mobile
Devices</em></a>,
ReScience C 6, 1, #6, 2020.</li></ul><p>This paper runs Guix end-to-end: it uses Guix to compile all the requirements,
run all the experiments and finally generate the final report. Let us check whether two
independent observers are able to verify the same result with three years
between the two observations (2020–2023).</p><p>We know that this paper’s computational experiment is reproducible with
Guix today under “normal circumstances” (<a href="https://gitlab.inria.fr/lcourtes-phd/edcc-2006-redone/-/blob/master/README.md">try
it!</a>),
so we set out to experiment with an <em>extreme</em> worst-case scenario: no
pre-built binaries are available—everything needs to be <em>rebuilt from
source</em>—and none of the source code hosting sites is reachable, with the
exception of the <a href="https://www.softwareheritage.org/">Software Heritage
archive</a>. The ambition of Software
Heritage is to collect, preserve, and share all software that is
publicly available in source code form. Guix <a href="https://guix.gnu.org/en/blog/2019/connecting-reproducible-deployment-to-a-long-term-source-code-archive/">fetches code from
Software
Heritage</a>
as a fallback when source code hosting sites disappear. To our
knowledge, redeploying software under such extreme conditions is
practically impossible, unless of course one is using Guix—or at least
that’s what we wanted to verify.</p><p>In summary, the outcome of this experiment is impressive. Considering
this extreme worst-case setup, it's awesome that it <em>almost</em> works
out-of-the-box. The remaining open issues we identified are:</p><ul><li>Guix user interface annoyances: manual <code>--fallback</code> or <code>--no-substitutes</code>
options and inconsistent error messages.</li><li>Holes in Software Heritage and Disarchive coverage of the source code
we needed.</li><li>Source origin hash mismatches between Guix normalization and Software
Heritage normalization.</li><li>“Time bomb”: the test suite of some packages is failing because it is
time-dependent (<a href="https://issues.guix.gnu.org/56137">example</a>).</li><li>Weaknesses in the <a href="https://guix.gnu.org/en/blog/2023/the-full-source-bootstrap-building-from-source-all-the-way-down/">full-source
bootstrap</a>.</li><li>The archive of all the binary seeds of this bootstrap.</li></ul><p>For the interested reader, take a look at the <a href="https://simon.tournier.info/posts/2023-06-23-hackathon-repro.html">complete
details</a>.
Does this mean we have a roadmap for the next hackathon? If you are interested,
we’d love to <a href="https://hpc.guix.info/about">hear your ideas</a>!</p><p>Last but not least, a one-day on-line get-together is a great opportunity to
tackle longstanding topics while helping each other and welcoming newcomers on
board. Thanks to everyone for joining! It’s been a pleasant and productive
experience, so <a href="https://hpc.guix.info/blog/">stay tuned</a> for other rounds!</p>A guide to reproducible research papersLudovic Courtès, Marek Felšöci, Konrad Hinsen, Philippe Swartvagherguix-devel@gnu.org2023-06-23T12:00:00Z<p>A core tenet of science is the ability to independently <em>verify</em>
research results. When computations are involved, verifiability implies
reproducibility: one should be able to re-run the computations to ensure
they get the same results, at which point they may want to start
experimenting with variants of the computational methods, feeding them
different data sets, and so on. This is the motivation behind our work
on Guix: we want to empower scientists by providing a tool in support of
reproducible computations <em>and</em> experimentation.</p><p>This article is a guide to using Guix for reproducible research work:
producing research articles with enough information so that anyone,
anytime can re-run the computational experiments it describes. Before
showing how to get this done with Guix, let’s look at existing practices
and see where they fall short.</p><h1>On the difficulty of sharing computational processes</h1><p>A citation attributed to Jon Claerbout summarizes the problem:</p><blockquote><p><em>Published documents are merely the advertisement of scholarship
whereas the computer programs, input data, parameter values,
etc. embody the scholarship itself.</em></p></blockquote><p>Authors of research papers often realize that they need to share
not only the data and source code but also the
software environment they used, somehow. There are two common ways to
do that, sometimes used in combination: recording software package names
and versions in the paper (in an appendix), and providing a ready-to-use
application bundle such as a Docker or virtual machine image.</p><p>Recording name/version pairs is appealing. The intuition is that by
communicating the names and version numbers of my dependencies, someone
can recreate the same environment that I used. However, where should I
stop? If it’s an R program, should I only list R packages? What about
R itself? Should I include the linear algebra libraries R depends on?
And if my code is C/C++, should I include the compiler version number?
The C library? Fellow researcher Konrad Hinsen <a href="https://10years.guix.gnu.org/video/guix-as-a-tool-for-computational-science/">gives this
definition</a>:</p><blockquote><p><em>“Code” is the code you care about.
“Environment” is code you don’t care about.</em></p></blockquote><p>The problem is that all of “the environment” influences the results
produced by the code you care about; it’s hard to make a judgment call
to decide that some things should be excluded and others not.</p><p><img src="/static/images/blog/bioinfo-paper-installation-instructions.png" alt="Installation instructions from a bioinfo paper that leave a bit to be desired." /></p><p>So we see articles with software environment descriptions ranging from
“<em>I used Ubuntu 22.04</em>” to long lists of package name/version—in
research domains where R is used a lot, authors often
provide the output of
<a href="https://rdrr.io/r/utils/sessionInfo.html"><code>sessionInfo()</code></a>
as an appendix—some
even including environment variable definitions! One
obvious issue with those package name/version lists is that they are not
actionable: you’re not going to build and install every single package
version by hand, it’s just not practical. So in the end, they act more
as a hint: if the software behaves differently than what’s described in
the paper, it <em>might be</em> because I’m using a slightly different version
of some dependency.
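</p><p>To make this concrete, here is a small sketch in Python (all package
data below is hypothetical, chosen only for illustration): two environments
that report identical name/version pairs may nonetheless correspond to
different builds once build inputs become part of the description, which is
precisely the kind of information Guix records.</p>

```python
import hashlib
import json

def env_fingerprint(packages):
    """Deterministic fingerprint of an environment description."""
    canonical = json.dumps(packages, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]

# Two machines report the very same name/version list...
listing_a = [{"name": "numpy", "version": "1.24.2"}]
listing_b = [{"name": "numpy", "version": "1.24.2"}]
assert env_fingerprint(listing_a) == env_fingerprint(listing_b)

# ...yet the fingerprints diverge as soon as build inputs (linear
# algebra backend, patches, and so on) enter the description:
build_a = [{"name": "numpy", "version": "1.24.2", "blas": "openblas"}]
build_b = [{"name": "numpy", "version": "1.24.2", "blas": "mkl"}]
assert env_fingerprint(build_a) != env_fingerprint(build_b)
```

<p>A name/version list is thus a coarse projection of the real dependency
graph: two environments can agree on it and still differ.</p><p>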
The second problem is that a name/version pair fails to capture the
complexity of a package dependency graph: it doesn’t tell which build
options were used, whether patches were applied, which optional
dependencies were enabled, and so on.</p><p><img src="/static/images/blog/sc-environment-variables.png" alt="An Artifact Evaluation appendix proudly showing its environment variables." /></p><p>To address that, the other option is to <em>ship the bits</em>: provide a Docker
or virtual machine image containing the software of interest. This
is what more and more conference Artifact Evaluation Committees have
come to recommend. It sure lets you run the code in the right software
environment, but the cost is high: you can’t tell what code
you’re running. The image is a big binary blob that was produced by a
complex computational process (<code>apt install</code>, <code>pip install</code>, <code>make</code>,
etc.) but usually one cannot map its contents back to source code.</p><p>You may object that, if you have the <code>Dockerfile</code>, then it’s fine. It’s
not. <code>Dockerfile</code>s describe a process that is usually not reproducible
since it depends on external resources such as the set of binary
packages distributed by, say, Ubuntu at a given point in time. Even if
it were reproducible, the whole process is fundamentally opaque: it
assembles opaque binaries, starting with a full operating system image
and piling binaries fetched by <code>pip</code> or other tools.</p><p>Conversely, Guix is, at its core, about providing <a href="https://reproducible-builds.org/"><em>a verifiable path
from source code to binary</em></a>. Guix
packages are essentially <a href="https://guix.gnu.org/manual/en/html_node/Defining-Packages.html">source
code</a>
that describes how to build software from source.</p><p>Our goal in the remainder of this article is to provide a step-by-step
guide on using Guix to manage the software environment of your research
software.</p><h1>Executable provenance meta-data</h1><p>With Guix as the basis of your computational workflow, you can get
what’s in essence <em>executable provenance meta-data</em>: it’s like that long
list of package name/version pairs, except more precise and immediately
deployable. Let’s see how this can be achieved.</p><h2>Step 1: Setting up the environment</h2><p>The first step will be to identify precisely what packages you need in
your software environment. Assuming you have a Python script that uses
NumPy, you can start by creating an environment that contains these two
packages and <a href="https://guix.gnu.org/manual/devel/en/html_node/Invoking-guix-shell.html">try to run your code in that
environment</a>:</p><pre><code>guix shell -C python python-numpy -- python3 ./myscript.py</code></pre><p>The <code>-C</code> flag here (or <code>--container</code>) instructs <code>guix shell</code> to create
that environment in an isolated container with nothing but the two
packages you asked for. That way, if <code>./myscript.py</code> needs more than these
two packages, it’ll fail to run and you’ll immediately notice. On some
systems <code>--container</code> is not supported; in that case, you can resort to
<code>--pure</code> instead.</p><p>Perhaps you’ll find that you also need Pandas and add it to the
environment:</p><pre><code>guix shell -C python python-numpy python-pandas -- \
python3 ./myscript.py</code></pre><p>If you fail to guess the name of the package (this one was easy!), try
<code>guix search</code>.</p><p>Environments for Python, R, and similar high-level languages are
relatively easy to set up. For C/C++ code, you may find you need many more
packages:</p><pre><code>guix shell -C gcc-toolchain cmake coreutils grep sed make openmpi -- …</code></pre><p>Or perhaps you’ll find that you could just as well provide a
<a href="https://guix.gnu.org/manual/devel/en/html_node/Defining-Packages.html">definition</a>
for your package.</p><p>Eventually, you’ll have a list of packages that satisfies your needs.</p><blockquote><p><strong>What if a package is missing?</strong> Guix and the main scientific and
HPC channels provide about <a href="https://hpc.guix.info/browse">25,000
packages</a> today. Yet, there’s always
the possibility that the one package you need is missing. In that
case, you will need to provide a <a href="https://guix.gnu.org/manual/devel/en/html_node/Defining-Packages.html">package
definition</a>
for it in a <a href="https://guix.gnu.org/manual/devel/en/html_node/Creating-a-Channel.html">dedicated
channel</a>
of yours. For software in Python, R, and other high-level languages,
most of the work can usually be automated by using <a href="https://guix.gnu.org/manual/devel/en/html_node/Invoking-guix-import.html"><code>guix import</code></a>.
<a href="https://guix.gnu.org/contact/">Join the friendly Guix community</a> to
get help!</p></blockquote><h2>Step 2: Recording the environment</h2><p>Now that you have that <code>guix shell</code> command line with a list of
packages, the best course of action is to save it in a <em>manifest</em>
file—essentially a software bill of materials—that Guix can then
ingest. There are <a href="https://guix.gnu.org/manual/devel/en/html_node/Writing-Manifests.html">other ways to do
that</a>
but the easiest way to get started is by “translating” your command line
into a manifest:</p><pre><code>guix shell python python-numpy python-pandas \
--export-manifest > manifest.scm</code></pre><p>Put that manifest under version control! From there anyone can redeploy
the software environment described by the manifest and run code in that
environment:</p><pre><code>guix shell -C -m manifest.scm -- python3 ./myscript.py</code></pre><p>Here’s what <code>manifest.scm</code> reads:</p><pre><code class="language-scheme">;; What follows is a "manifest" equivalent to the command line you gave.
;; You can store it in a file that you may then pass to any 'guix' command
;; that accepts a '--manifest' (or '-m') option.
(specifications->manifest
(list "python" "python-numpy" "python-pandas"))</code></pre><p>It’s a code snippet that lists packages. Notice that there are no version
numbers! Indeed, these version numbers are specified in package definitions,
located in Guix channels. To allow others to reproduce the exact same
environment as the one you’re running, you need to <em>pin Guix itself</em>, by
<a href="https://guix.gnu.org/manual/devel/en/html_node/Replicating-Guix.html">capturing the current Guix channel commits with <code>guix describe</code></a>:</p><pre><code>guix describe -f channels > channels.scm</code></pre><p>This <code>channels.scm</code> file is similar in spirit to “lock files” that some
deployment tools employ to pin package revisions. You should also keep
it under version control in your code, and possibly update it once in a
while when you feel like running your code against newer versions of its
dependencies. With this file, anyone, <em>at any time and on any machine</em>,
can now reproduce the exact same environment by running:</p><pre><code>guix time-machine -C channels.scm -- shell -C -m manifest.scm -- \
python3 ./myscript.py</code></pre><p>In this example we rely solely on the <code>guix</code> channel, which provides the
Python packages we need. Perhaps some of the packages you need live <a href="https://hpc.guix.info/channels">in
other channels</a>—maybe <code>guix-cran</code> if you
use R, maybe <code>guix-science</code>. That’s fine: <code>guix describe</code> also captures
that.</p><p>Of course, do include a <code>README</code> file giving the exact command to run the
code. Not everyone uses Guix so it can be helpful to also provide
minimal non-Guix setup instructions: which package versions are used,
how software is built, etc. As we have seen, such instructions would
likely be inaccurate and inconvenient to follow at best. Yet, it can be
a useful starting point for someone trying to recreate a <em>similar</em>
environment using different tools. It should probably be presented as
such, with the understanding that the only way to get the <em>same</em>
environment is to use Guix.</p><h2>Step 3: Ensuring long-term source code archival</h2><p>We insisted on version control before: for the <code>manifest.scm</code> and
<code>channels.scm</code> files, but of course also for your own code. Our
recommendation is to have these two <code>.scm</code> files in the same repository
as the code they’re about.</p><p><img src="/static/images/blog/software-heritage-logo-title.svg" alt="Logo of Software Heritage" /></p><p>Since the goal is enabling reproducibility, source code
availability is a prime concern. Source code hosting services come and
go and we don’t want our code to vanish on a whim and render our
published research work unverifiable. <a href="https://www.softwareheritage.org/">Software
Heritage</a> (SWH for short) is <em>the</em> solution
for this: SWH archives public source code and provides unique intrinsic
identifiers to refer to
it—<a href="https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html"><em>SWHIDs</em></a>.
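</p><p>As an aside, you can programmatically check whether a given origin is
already in the archive. The sketch below uses SWH’s public web API; the
endpoint shape follows the SWH API documentation, but treat it as an
assumption to double-check, and note that calling it performs a real
network request.</p>

```python
from urllib.error import HTTPError
from urllib.request import urlopen

SWH_API = "https://archive.softwareheritage.org/api/1"

def origin_lookup_url(origin_url):
    # The web API looks up an origin by its full URL, embedded as-is.
    return f"{SWH_API}/origin/{origin_url}/get/"

def is_archived(origin_url):
    """Return True if the archive knows this origin, False on a 404."""
    try:
        with urlopen(origin_lookup_url(origin_url)) as response:
            return response.status == 200
    except HTTPError as err:
        if err.code == 404:
            return False
        raise

# Example (network access required):
# is_archived("https://gitlab.inria.fr/lcourtes-phd/edcc-2006-redone")
```

<p>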
Guix itself is <a href="https://guix.gnu.org/en/blog/2019/connecting-reproducible-deployment-to-a-long-term-source-code-archive/">connected to
SWH</a>
to (1) ensure that the source code of its packages is archived, and
(2) to fall back to downloading from the SWH archive should code vanish
from its original site.</p><p>Once your own code is available in a public version-control repository,
such as a Git repository on your lab’s hosting service, you can ask SWH
to archive it by going to its <a href="https://archive.softwareheritage.org/save/">Save Code
Now</a> interface. SWH will
process the request asynchronously and eventually you’ll find your code
has made it into <a href="https://archive.softwareheritage.org/">the archive</a>.</p><h2>Step 4: Referencing the software environment</h2><p>This brings us to the last step: referring to our code <em>and</em> software
environment in our beloved paper. We already have all our code and Guix
files in the same repository, which is archived on SWH. Thanks to SWH,
we now have a SWHID, which uniquely identifies the relevant revision of
our code.</p><p>Following <a href="https://www.softwareheritage.org/howto-archive-and-reference-your-code/">SWH’s own
guide</a>,
we’ll pick an <code>swh:dir</code> kind of identifier, which refers to the
directory of the relevant revision/commit of our repository, and we’ll
keep <em>contextual info</em> for clarity—that includes the original URL.
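</p><p>A qualified SWHID has a regular shape: a core
<code>swh:1:&lt;type&gt;:&lt;hash&gt;</code> part followed by
<code>;key=value</code> qualifiers. The Python sketch below splits one
apart; its validation is deliberately simplified and is no substitute for
the official SWH tooling.</p>

```python
import re

CORE_SWHID = re.compile(r"^swh:1:(snp|rel|rev|dir|cnt):([0-9a-f]{40})$")

def parse_swhid(swhid):
    """Split a qualified SWHID into its core part and its qualifiers."""
    core, *qualifiers = swhid.split(";")
    match = CORE_SWHID.match(core)
    if match is None:
        raise ValueError(f"not a valid core SWHID: {core!r}")
    object_type, object_id = match.groups()
    return {"type": object_type, "id": object_id,
            **dict(q.split("=", 1) for q in qualifiers)}

parsed = parse_swhid(
    "swh:1:dir:cc8919d7705fbaa31efa677ce00bef7eb374fb80"
    ";origin=https://gitlab.inria.fr/lcourtes-phd/edcc-2006-redone")
assert parsed["type"] == "dir"   # an swh:dir identifier, as chosen above
assert parsed["origin"].startswith("https://gitlab.inria.fr")
```

<p>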
Putting it all together, we’ll conclude our paper with a sentence along
these lines:</p><blockquote><p>The source code used to produce this study, as well as instructions to
run it in the right software environment using GNU Guix, is archived
on Software Heritage as
<a href="https://archive.softwareheritage.org/swh:1:dir:cc8919d7705fbaa31efa677ce00bef7eb374fb80;origin=https://gitlab.inria.fr/lcourtes-phd/edcc-2006-redone;visit=swh:1:snp:71a4d08ef4a2e8455b67ef0c6b82349e82870b46;anchor=swh:1:rev:36fde7e5ba289c4c3e30d9afccebbe0cfe83853a"><code>swh:1:dir:cc8919d7705fbaa31efa677ce00bef7eb374fb80;origin=https://gitlab.inria.fr/lcourtes-phd/edcc-2006-redone;visit=swh:1:snp:71a4d08ef4a2e8455b67ef0c6b82349e82870b46;anchor=swh:1:rev:36fde7e5ba289c4c3e30d9afccebbe0cfe83853a</code></a>.</p></blockquote><p>With this information, the reader can:</p><ul><li>get the source code;</li><li>reproduce its software environment with <code>guix time-machine</code> and run
the code;</li><li>inspect and possibly modify both the code and its environment.</li></ul><p>Mission accomplished!</p><h1>Examples</h1><p>Perhaps you don’t feel adventurous enough to be the first one to follow this
methodology. Worry not: you won’t be the first! Here are examples of
reproducible papers built along the lines of this guide (with some variations),
in several different fields:</p><p><img src="/static/images/blog/paper-swh-link.png" alt="Research paper linking to a repository using its SWHID." /></p><ul><li>Philippe Swartvagher et al., <em>Tracing task-based runtime systems: feedbacks
from the StarPU case</em>. This article studies the impact of tracing complex
HPC applications, especially what are the sources of performance
degradation when an application execution is traced; evaluates the
solutions to reduce the tracing overhead; and explores clock
synchronization issues when distributed applications are traced. The paper
is still under review but its content is available in Philippe's
<a href="https://theses.hal.science/tel-03989856">thesis</a>. Considered applications
are C programs using MPI, launched with Slurm, then Python scripts are used
to process results and generate plots. The <a href="https://gitlab.inria.fr/pswartva/paper-starpu-traces-r13y">companion
repository</a>
contains instructions and scripts to reproduce the whole study.</li><li>Emmanuel Agullo, Marek Felšöci, Guillaume Sylvand, <a href="https://hal.inria.fr/hal-03263603"><em>A comparison of selected
solvers for coupled FEM/BEM linear systems arising from discretization of
aeroacoustic problems</em></a> with the
associated <a href="https://hal.inria.fr/hal-03263620">technical report</a> describing
the experimental environment and providing instructions for reproducing the
experiments. Experiments in this study rely on private
industrial code and can thus be reproduced only by a limited
number of people. However, the publicly available material provides everyone
with a fully documented example of building reproducible experimental
studies within a constrained industrial context thanks to the association of
GNU Guix and <a href="http://www.literateprogramming.com">literate programming</a>
in <a href="https://orgmode.org">Org mode</a>.</li><li>Vic-Fabienne Schumann et al., <a href="https://doi.org/10.1016/j.scitotenv.2022.158931"><em>SARS-CoV-2 infection dynamics
revealed by wastewater sequencing analysis and
deconvolution</em></a>
(<a href="https://www.medrxiv.org/content/10.1101/2021.11.30.21266952v3">preprint</a>).
The pipeline used to compute the results shown in the article is
made with <a href="https://bioinformatics.mdc-berlin.de/pigx/">PIGx</a>, a tool
and collection of genomics pipelines that builds upon Guix. The
“Data/Code Availability” section links to a
<a href="https://archive.softwareheritage.org/browse/origin/directory/?origin_url=https://github.com/BIMSBbioinfo/pigx_sars-cov-2">repository</a>
that contains the manifest and channels files that were used and
instructions to run the analysis.</li><li><p>Three contributions to the <a href="https://rescience.github.io/ten-years/">Ten Years Reproducibility
Challenge</a> organized by the
ReScience C journal. In each article, the link to the code repository is at the
bottom of the first page.</p><ul><li><p>Ludovic Courtès, <em>[Re] Storage Tradeoffs in a Collaborative
Backup Service for Mobile Devices</em>, <a href="http://dx.doi.org/10.5281/zenodo.3886739">ReScience C 6, 1,
#6</a>. This article
reproduces the results of a 10-year-old article. Experiments in
the original article involved a complex software stack and did
not use Guix (it actually predates Guix!). The article shows
how to come up with a similar software environment a decade
later, and how to use Guix to produce a pipeline that goes
<a href="https://hpc.guix.info/blog/2020/06/reproducible-research-articles-from-source-code-to-pdf/"><em>from source code to
PDF</em></a>.</p></li><li><p>Konrad Hinsen, <em>[¬Rp] Stokes drag on conglomerates of spheres</em>,
<a href="http://dx.doi.org/10.5281/zenodo.3889694">ReScience C 6, 1,
#7</a>. Tries to
reproduce a study in computational fluid dynamics, based on
Fortran code published in 1993. Ultimately fails because some of
the code was lost, but the surviving code works nicely in a
reproducible Guix environment.</p></li><li><p>Konrad Hinsen, <em>[Rp] Structural flexibility in proteins — impact
of the crystal environment</em>, <a href="http://dx.doi.org/10.5281/zenodo.3886447">ReScience C 6, 1,
#5</a>. Describes the
reproduction of a computation of the normal modes of protein
crystals, originally done in 2008 using Python scripts that no
longer work with modern Python versions. A Guix environment
based on the channel
<a href="https://gitlab.inria.fr/guix-hpc/guix-past">guix-past</a>
makes it possible to run historical versions of Python and some
of its libraries.</p></li></ul></li></ul><h1>Wrap-up</h1><p>The key takeaways of this guide for reproducible papers are:</p><ul><li>Recording package name/version is often of little help when it comes
to running the code; conversely, providing an opaque image makes it
easy to run the code but prevents verifiability and experimentation.</li><li>Guix lets you record the software environment with two files:
<code>manifest.scm</code>, which lists software packages, and <code>channels.scm</code>,
which pins Guix and its channels to a specific revision.</li><li>A combined command consumes these files and reproduces the exact
same software environment: <code>guix time-machine -C channels.scm -- shell -m manifest.scm</code>.</li><li>With these files and your code under version control and archived on
Software Heritage, it’s enough to share one SWHID in your paper.</li></ul><p>Here are resources to learn more about this whole process:</p><ul><li><a href="https://doi.org/10.1038/s41597-022-01720-9"><em>Toward practical transparent verifiable and long-term reproducible
research using Guix</em></a>,
Nature Scientific Data article (volume 9, Oct. 2022) by N. Vallet
<em>et al.</em></li><li><a href="https://10years.guix.gnu.org/video/guix-as-a-tool-for-computational-science/"><em>Guix as a tool for computational
science</em></a>,
talk by K. Hinsen at the Ten Years of Guix event</li><li><a href="https://10years.guix.gnu.org/video/using-guix-for-scientific-reproducible-and-publishable-experiments/"><em>Using Guix for scientific, reproducible, and publishable
experiments</em></a>,
talk by P. Swartvagher at the same venue</li><li><a href="https://10years.guix.gnu.org/video/archive-reference-describe-and-cite-software-source-code-a-pathway-to-reproducibility/"><em>Archive, reference, describe and cite software source code: a
pathway to
reproducibility</em></a>,
talk by M. Gruenpeter at the same venue</li><li><a href="https://tuto-techno-guix-hpc.gitlabpages.inria.fr/guidelines/">Guix and Org mode, a powerful association for building a reproducible
research
study</a>, a
self-contained tutorial by M. Felšöci.</li></ul><p>If you’re interested, please join our next <a href="https://hpc.guix.info/blog/2023/05/reproducible-research-hackathon-let-redo/">Reproducible Research
Hackathon</a>,
which will take place on-line on June 27th, 2023, come to the <a href="/events/2023/workshop">Workshop
on Reproducible Software Environments</a> in
November 2023, and/or subscribe to the <a href="https://guix.gnu.org/en/contact/"><code>guix-science</code> mailing
list</a>!</p>Reproducible Research Hackathon—let’s redo!Simon Tournierguix-devel@gnu.org2023-05-12T12:00:00Z<p>It's time to run the second Reproducible Research hackathon! The first one
dates back
to... <a href="/blog/2020/07/reproducible-research-hackathon-experience-report/">2020</a>,
already! The date: <strong>Tuesday, June 27th</strong>. Start: 9:30 (CEST). End: 17:30.</p><blockquote><p><strong>Update</strong>: Check out <a href="https://hpc.guix.info/blog/2023/07/reproducible-research-hackathon-experience-report/">this
report</a>
about the hackathon.</p></blockquote><p>We propose to collectively tackle some of the issues about reproducible
research:</p><ul><li>identify stumbling blocks in using Guix to write end-to-end pipelines,</li><li>document how to achieve this,</li><li>feed the <a href="https://gitlab.inria.fr/guix-hpc/guix-past">Guix-Past</a> channel
with more old packages,</li><li>provide a <code>guix.scm</code> for some papers already published.</li></ul><p>Anyone is welcome! Feel free to join if you would like to hack with us.</p><p>We suggest picking articles from the <a href="https://rescience.github.io/">ReScience
C</a> or <a href="https://computo.sfds.asso.fr/">COMPUTO</a> –
they provide a high level of transparency about the materials required for
redoing. The best experiment would be to choose articles from
<a href="https://rescience.github.io/read/#volume-6-2020">2020</a>. As a warm-up, maybe
<a href="https://rescience.github.io/bibliography/Courtes_2020.html">Courtès, L., <em>Storage Tradeoffs in a Collaborative Backup Service for Mobile
Devices</em></a>? Or else
from the <a href="https://rescience.github.io/ten-years/">Ten Years Reproducibility
Challenge</a> which took advantage of
GNU Guix. If you prefer to work on a topic of your own that you would like to
redo, you are welcome.</p><p>We will meet <strong>Tuesday 27th June</strong> at <strong>9:30 CEST</strong> on the <code>#guix-hpc</code> channel
of irc.libera.chat. You can use this <a href="https://web.libera.chat/?nick=PotentialUser-?#guix-hpc">web
client</a> (set the
nickname you wish) to reach us. We will provide a link to a BigBlueButton
instance (video meeting), stay tuned!</p><blockquote><p>▶ Join us on <a href="https://bbb.inria.fr/cou-pmi-g09-ute">BigBlueButton</a> at
9:30 CEST! ◀</p><p>Here’s a <a href="https://codimd.math.cnrs.fr/DZB3kDhZTT-HxwIm-wc0Dw">pad 🗒</a>
for note-taking during the day.</p></blockquote><p>At the end of the day, we would like to sketch an experiment report
summarizing the successes and the roadblocks.</p><hr /><p>There's a lot we can do and we'd love to <a href="https://hpc.guix.info/about">hear your
ideas</a>!</p><p>Drop us an email at <a href="mailto:guix-science@gnu.org"><code>guix-science@gnu.org</code></a>.</p>Continuous integration and continuous delivery for HPCLudovic Courtèsguix-devel@gnu.org2023-03-06T15:00:00Z<p>Will those binaries <em>actually work</em>? This is a central question for HPC
practitioners and one that’s sometimes hard to answer: increasingly
complex software stacks being deployed, and often on a variety of
clusters. Will that program pick the right libraries? Will it perform
well? With each cluster having its own hardware characteristics,
portability is often considered unachievable. As a result, HPC
practitioners rarely take advantage of <a href="https://en.wikipedia.org/wiki/Continuous_integration">continuous integration and
continuous
delivery</a> (CI/CD):
building software locally on the cluster is common, and software
validation is often a costly manual process that has to be repeated on
each cluster.</p><p>We discussed before that use of pre-built binaries is not inherently an
obstacle to performance, be it for
<a href="https://hpc.guix.info/blog/2019/12/optimized-and-portable-open-mpi-packaging/">networking</a>
or for
<a href="https://hpc.guix.info/blog/2018/01/pre-built-binaries-vs-performance/">code</a>—a
property often referred to as <em>performance portability</em>. Thanks to
performance portability, continuous delivery <em>is</em> an option in HPC. In
this article, we show how Guix users and system administrators have
benefited from continuous integration and continuous delivery on HPC
clusters.</p><h1>Hermetic builds</h1><p>But first things first: before we talk about continuous integration, we
need to talk about <em>hermetic</em> or <em>isolated</em> builds. One of the key
insights of the <a href="https://edolstra.github.io/pubs/phd-thesis.pdf">pioneering work of Eelco Dolstra on the Nix package
manager</a> is this: by
building software in isolated environments, we can eliminate
interference with the rest of the system and practically achieve
<a href="https://reproducible-builds.org/docs/definition">reproducible builds</a>.
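</p><p>Checking that property boils down to comparing build outputs bit for
bit. The sketch below is an illustration only, not Guix’s actual mechanism
(the <code>guix challenge</code> command is the real tool for comparing
build results): it reduces a build output tree to a single content hash
that two independent builders can compare.</p>

```python
import hashlib
import os

def tree_hash(root):
    """Reduce a directory tree to one content hash: relative file paths
    and file contents, visited in a deterministic sorted order."""
    digest = hashlib.sha256()
    for dirpath, _subdirs, filenames in sorted(os.walk(root)):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            digest.update(os.path.relpath(path, root).encode("utf-8"))
            digest.update(b"\0")  # separator between path and contents
            with open(path, "rb") as fh:
                digest.update(fh.read())
    return digest.hexdigest()

# Two independent builds of the same package should yield the same
# tree_hash; a differing hash pinpoints a non-reproducible build.
```

<p>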
Simply put, if Alice runs a build process in an isolated environment on
a supercomputer, and Bob runs the same build process in an isolated
environment on their laptop, they’ll get the same output (unless of
course the build process is not deterministic).</p><p>From that perspective, pre-built binaries in Guix (and Nix) are merely
<a href="https://guix.gnu.org/en/manual/en/html_node/Substitutes.html"><em>substitutes for local
builds</em></a>:
you can choose to build things locally, but <em>as an optimization</em> you may
just as well fetch the build result from someone you trust—since it’s
the same as what you’d get anyway.</p><p>A closely related property is full control of the software package
dependency graph. Guix package definitions stand alone: they can only
refer to one another and cannot refer to software that happens to be
available on the machine in <code>/usr/lib64</code>, say—that directory is not even
visible in the isolated build environment! Thus, a package in Guix has
its dependencies fully specified, down to the C library—and even
<a href="https://guix.gnu.org/en/blog/2020/guix-further-reduces-bootstrap-seed-to-25/">further
down</a>.</p><p>Thanks to hermetic builds and standalone dependency graphs, sharing
binaries is safe: by shipping the package and all its dependencies,
without making any assumptions on software already available on the
cluster, you control what you’re going to run.</p><h1>Continuous integration & continuous delivery</h1><p>Guix uses continuous integration to build its more than 22,000 packages
on several architectures: x86_64, i686, AArch64, ARMv7, and POWER9. The
project has two independent build farms. The main one, known as
<a href="https://ci.guix.gnu.org"><code>ci.guix.gnu.org</code></a>, was generously donated by
the <a href="https://www.mdc-berlin.de/">Max Delbrück Center for Molecular Medicine
(MDC)</a> in Germany; it has more than twenty
64-core x86-64/i686 build machines and a dozen build machines for the
remaining architectures.</p><p><img src="/static/images/blog/package-workflow.png" alt="Diagram showing the Guix packaging workflow." /></p><p>The diagram above illustrates the packaging workflow in Guix, which can
be summarized as follows:</p><ol><li>packagers write a <a href="https://guix.gnu.org/manual/en/html_node/Defining-Packages.html">package
definition</a>;</li><li>they test it locally by using <code>guix build</code>;</li><li>eventually someone with commit access pushes the changes to the Git
repository;</li><li>build farms pull from the repository and build the new package.</li></ol><p>Build farms are a quality assurance tool for packagers. For instance,
<a href="https://ci.guix.gnu.org"><code>ci.guix</code></a> runs
<a href="https://guix.gnu.org/en/cuirass">Cuirass</a>. The web interface often
surprises newcomers—it sure looks different from those of Jenkins or
GitLab-CI!—but the key part is that it provides a dashboard that one can
navigate to look for packages that fail to build, fetch build logs, and
so on.</p><p>A big difference from traditional continuous integration tools is that
build results from the build farm are not thrown away: by running <a href="https://guix.gnu.org/manual/en/html_node/Invoking-guix-publish.html"><code>guix publish</code></a>
on the build farm, those binaries are made accessible to Guix users. Any
Guix user may add <code>ci.guix.gnu.org</code> to their <a href="https://guix.gnu.org/manual/en/html_node/Getting-Substitutes-from-Other-Servers.html">list of substitute
URLs</a>
and they will transparently get binaries from that server.</p><p>One can check whether pre-built binaries of specific packages are
available on substitute servers by running <a href="https://guix.gnu.org/manual/en/html_node/Invoking-guix-weather.html"><code>guix weather</code></a>:</p><pre><code>$ guix weather gromacs petsc scotch
computing 3 package derivations for x86_64-linux...
looking for 5 store items on https://ci.guix.gnu.org...
https://ci.guix.gnu.org ☀
100.0% substitutes available (5 out of 5)
at least 41.5 MiB of nars (compressed)
109.6 MiB on disk (uncompressed)
0.112 seconds per request (0.2 seconds in total)
8.9 requests per second
looking for 5 store items on https://bordeaux.guix.gnu.org...
https://bordeaux.guix.gnu.org ☀
100.0% substitutes available (5 out of 5)
at least 30.0 MiB of nars (compressed)
109.6 MiB on disk (uncompressed)
0.051 seconds per request (0.2 seconds in total)
19.7 requests per second</code></pre><p>That way, one can immediately tell whether deployment will be quick or
whether they’ll have to <a href="https://xkcd.com/303/">wait for compilation to
complete</a>…</p><h1>Publishing binaries for third-party channels</h1><p>Our research institutes typically have
<a href="https://hpc.guix.info/channels">channels</a> providing packages for their
own software or software related to their field. How can they benefit
from continuous integration and continuous delivery?</p><p><img src="/static/images/blog/cuirass-evaluation-page.png" alt="Screenshot of Cuirass showing failing and succeeding package builds." /></p><p>At Inria, we set up a <a href="https://guix.bordeaux.inria.fr">build farm</a> that
runs Cuirass and publishes its binaries with <code>guix publish</code>. Cuirass is
configured to build the packages of selected channels such as
<a href="https://gitlab.inria.fr/guix-hpc/guix-hpc"><code>guix-hpc</code></a> and
<a href="https://github.com/guix-science/guix-science"><code>guix-science</code></a> (the Guix
manual
<a href="https://guix.gnu.org/manual/en/html_node/Continuous-Integration.html#index-cuirass_002dservice_002dtype">explains</a>
how to set up Cuirass on Guix System; you can also check out the
<a href="https://gitlab.inria.fr/guix-hpc/sysadmin/-/blob/master/head-node.scm">configuration</a>
of this build farm for details). That way, it complements the official
build farms of the Guix project.</p><p>The HPC clusters that the teams at Inria use, in particular
<a href="https://plafrim.fr">PlaFRIM</a> and <a href="https://grid5000.fr">Grid’5000</a>, are
set up to fetch substitutes from <code>https://guix.bordeaux.inria.fr</code> in
addition to Guix’s default substitute servers. When deploying
packages from our channels on one of these clusters, binaries are
readily available—a significant productivity boost! That also applies
to <a href="https://hpc.guix.info/blog/2022/01/tuning-packages-for-a-cpu-micro-architecture/">binaries tuned for a specific CPU
micro-architecture</a>.</p><p>The Grid’5000 setup takes advantage of this flexibility in interesting
ways. Grid’5000 is a “cluster of clusters” with 8 sites, each of which
has its own Guix installation. To share binaries among sites, each site
runs a <code>guix publish</code> instance, and each site has the other sites in its
list of substitute URLs. That way, if a site has already built, say,
Open MPI, the other sites will transparently fetch Open MPI binaries
from it instead of rebuilding it.</p><p>While Cuirass is a fine continuous integration tool tightly integrated
with Guix, it’s also entirely possible to use one of the mainstream
tools instead. Here are examples of computing infrastructures that
publish pre-built binaries:</p><ul><li><a href="https://www.glicid.fr/">GliCID</a>, the Tier-2 cluster for the region
of Nantes (France), builds packages with Cuirass and <a href="https://guix-substitutes.glicid.fr">publishes
binaries</a>.</li><li><a href="https://leibniz-psychology.org">ZPID</a> <a href="https://substitutes.guix.psychnotebook.org/">publishes
binaries</a> of relevant
packages built with a simple <a href="https://github.com/leibniz-psychology/psychnotebook-deploy/blob/a314d74d1eae432419bd5218bd37cbe49dcef31d/src/zpid/machines/yamunanagar/ci.scm#L16">cron
script</a>.</li><li>GeneNetwork runs continuous integration jobs with
<a href="https://ci.genenetwork.org/">Laminar</a> and
<a href="http://guix.genenetwork.org">publishes</a> the resulting binaries.</li><li>Phil Beadling of Quantile Technologies
<a href="https://www.cloudbees.com/videos/purely-functional-ci-cd-pipeline-using-jenkins-with-guix">explained</a>
how they integrated Guix in their Jenkins CI/CD pipeline.</li></ul><p>As you can see, there’s a whole gamut of possibilities, ranging from the
“low-tech” setup to the fully-featured CI/CD pipeline. In all of these,
<code>guix publish</code> takes care of the publication part. If your focus is on
delivering binaries for a small set of packages, a periodic cron job as
shown above is good enough. If you’re dealing with a large package set
and are also interested in quality assurance, a tool like Cuirass may be
more appropriate.</p><h1>Wrapping up</h1><p>We computer users all too often work in silos. Developers might have
their own build and deployment machinery that they use for continuous
integration (GitLab-CI with some custom Docker image?); system
administrators might deploy software on clusters in their own way
(Singularity image? environment modules?); and users might end up
running yet other binaries (locally built? custom-made?). We got used
to it, but if we take a step back, it looks like this is one and the
same activity with a different cloak depending on who you’re talking to.</p><p>Guix provides a unified approach to software deployment; building,
deploying, publishing binaries, and even <a href="https://guix.gnu.org/manual/en/html_node/Invoking-guix-pack.html">building container
images</a>
all build upon the same fundamental mechanisms. We have seen in this
blog post that this makes it easy to continuously build and publish
package binaries. The productivity boost is twofold: local
recompilation goes away, and site-specific software validation is
reduced to a minimum.</p><p>For HPC practitioners and hardware vendors, this is a game changer.</p><h1>Acknowledgments</h1><p>Thanks to Lars-Dominik Braun, Simon Tournier, and Ricardo Wurmus for
their insightful comments on an earlier draft of this post.</p>Guix-HPC Activity Report, 2022Céline Acary-Robert, Ludovic Courtès, Yann Dupont, Marek Felšöci, Konrad Hinsen, Ontje Lünsdorf, Pjotr Prins, Philippe Swartvagher, Simon Tournier, Ricardo Wurmusguix-devel@gnu.org2023-02-10T15:45:00Z<p><em>This document is also available as
<a href="https://hpc.guix.info/static/doc/activity-report-2022.pdf">PDF</a>
(<a href="https://hpc.guix.info/static/doc/activity-report-2022-booklet.pdf">printable
booklet</a>).</em></p><p>Guix-HPC is a collaborative effort to bring reproducible software
deployment to scientific workflows and high-performance computing (HPC).
Guix-HPC builds upon the <a href="https://guix.gnu.org">GNU Guix</a> software
deployment tools and aims to make them useful for HPC practitioners
and scientists concerned with dependency graph control and customization and, uniquely, reproducible research.</p><p>Guix-HPC was launched in September 2017 as a joint software development
project involving three research institutes:
<a href="https://www.inria.fr/en/">Inria</a>, the <a href="https://www.mdc-berlin.de/">Max Delbrück Center for
Molecular Medicine (MDC)</a>, and the <a href="https://ubc.uu.nl/">Utrecht
Bioinformatics Center (UBC)</a>. GNU Guix for HPC and
reproducible science has received contributions from additional
individuals and organizations, including <a href="https://www.cnrs.fr/en">CNRS</a>,
<a href="https://u-paris.fr/en/">Université Paris Cité</a>, the <a href="https://uthsc.edu/">University of
Tennessee Health Science Center</a> (UTHSC), the
<a href="https://leibniz-psychology.org/">Leibniz Institute for Psychology</a>
(ZPID), <a href="https://www.csl.cornell.edu/">Cornell University</a>, and a
growing number of organizations deploying Guix on their HPC clusters.</p><p>This report highlights key achievements of Guix-HPC between <a href="https://hpc.guix.info/blog/2022/02/guix-hpc-activity-report-2021/">our
previous
report</a>
a year ago and today, February 2023. This year was marked by exciting
developments for HPC and reproducible workflows: the release of
<a href="https://guix.gnu.org/en/blog/2022/gnu-guix-1.4.0-released/">GNU Guix 1.4.0 in
December</a>,
the celebration of ten years of Guix with a <a href="https://10years.guix.gnu.org">three-day
conference</a>, several releases of the Guix
Workflow Language (GWL), more work on supporting RISC-V processors, and
more publications relying on Guix as a foundation for reproducible
computational workflows.</p><h1>Outline</h1><p>Guix-HPC aims to tackle the following high-level objectives:</p><ul><li><em>Reproducible scientific workflows.</em> Improve the GNU Guix tool set
to better support reproducible scientific workflows and to simplify
sharing and publication of software environments.</li><li><em>Cluster usage.</em> Streamline Guix deployment on HPC clusters and
provide interoperability with clusters not running Guix.</li><li><em>Outreach &amp; user support.</em> Reach out to the HPC and scientific
research communities and organize training sessions.</li></ul><p>The following sections detail work that has been carried out in each of
these areas.</p><h1>Reproducible Scientific Workflows</h1><p><img src="https://hpc.guix.info/static/images/blog/lab-book.svg" alt="Lab book." /></p><p>Supporting reproducible research workflows is a major goal for Guix-HPC.</p><h2>Guix Workflow Language</h2><p>The <a href="https://workflows.guix.info">Guix Workflow Language</a> (or GWL) is
a scientific computing extension to GNU Guix's declarative language
for package management. It allows for the declaration of scientific
workflows, which will always run in reproducible environments that GNU
Guix automatically prepares. The general idea with the GWL is a
simple inversion of priorities: put reproducible software deployment
<em>first</em> and <em>extend</em> the deployment infrastructure provided by Guix
with tools to declare and run workflows. As a consequence, the GWL
benefits directly from the continued development of Guix's salient
features pertaining to software reproducibility and reliable,
predictable deployment. Much of the work on the GWL is thus aimed at
recasting these features through the lens of a domain-specific
language for describing workflows as a graph of processes that are
inextricably linked with their associated software stacks.</p><p>The year 2022 saw three releases of the Guix Workflow Language:
version 0.4.0 on January 28, version 0.5.0 on July 21, and version
0.5.1 on November 13, representing the cumulative efforts of four
contributors. The changes include fixes to errors discovered in
active use of the GWL for scientific workflows, adjustments in the
details of how the GWL extends Guix, and laying the groundwork for
improved performance.</p><p><img src="/static/images/blog/gwl-logo-black.png" alt="Logo of the Guix Workflow Language." /></p><p>The German National Research Data Infrastructure—specifically its
engineering sciences branch <a href="https://nfdi4ing.de/">NFDI4Ing</a>—
recognizes workflow management systems as an important tool towards
reproducible and reusable scientific workflows. A special interest
group discussed and compared several workflow management systems,
including GWL, along three different user story perspectives. The
discussion paper entitled “Evaluation of tools for describing,
reproducing and reusing scientific workflows” highlights GWL’s
abilities to easily reproduce compute environments and to provide
precise software provenance tracking as well as its flexible workflow
definition. The special interest group recommends the GWL to
specialists with high requirements on software reproducibility and
integrity. The <a href="https://preprints.inggrid.org/repository/view/5/">preprint of the discussion paper is available
here</a>.</p><h2>Reproducible GNU R Environments</h2><p>The R language is widely used for statistics in general and notably in
bioinformatics. A common practice for
installing R packages, from within the <code>R</code> session, is to run the
<code>install.packages</code> utility: it allows users to download and install packages
from <a href="https://cran.org">CRAN</a> and CRAN-like
repositories such as <a href="https://bioconductor.org/">Bioconductor</a>, or from
local files.</p><p>While convenient, use of <code>install.packages</code> raises the question of the
level of control over the software “supply chain”. Some R packages are not just plain
R scripts: they also contain C, C++, or Fortran parts, mainly for performance,
or require external system-wide dependencies unmanaged by
<code>install.packages</code>, such as linear algebra libraries. Therefore, computational
environments populated with the built-in utility <code>install.packages</code> might not
be reproducible from one machine to another.</p><p>This is where the <code>r-guix-install</code> package comes in. <code>r-guix-install</code>,
<a href="https://CRAN.R-project.org/package=guix.install">which is available on
CRAN</a>, allows users to
install R packages <em>via</em> Guix from within the running <code>R</code> session,
similarly to <code>install.packages</code>, but where the complete supply chain is
controlled by Guix. In addition, if the requested R package does not
exist in Guix at this time, the package and all its missing dependencies
will be imported recursively and the generated package definitions will
be written to <code>~/.Rguix/packages.scm</code>. This record of imported packages
can be used later to reproduce the environment, and to add the packages
in question to a proper Guix channel (or to Guix itself). <code>guix.install()</code>
not only supports installing packages from CRAN, but also from
Bioconductor or even arbitrary Git or Mercurial repositories, replacing
the need for installation <em>via</em> <code>devtools</code>.</p><p>While this approach works well for individual users, Guix installations with a
larger user base, for instance institution-wide, would benefit from the
default availability of the entire CRAN package collection with pre-built
substitutes to speed up installation times. Additionally, reproducing
environments would include fewer steps if the package recipes were available
to anyone by default.</p><p><img src="/static/images/blog/guix-cran.png" alt="Logo of Guix-CRAN." /></p><p>The new <a href="https://github.com/guix-science/guix-cran"><code>guix-cran</code></a>
channel was built to address that issue. It extends the package collection by providing all CRAN
packages missing in Guix proper and has all of the properties mentioned above.</p><p>Creating and updating <code>guix-cran</code> is fully automated and happens without any
human intervention. The channel itself is always in a usable state, because
updates are tested with <code>guix pull</code> before committing and pushing them.
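</p><p>Using the channel only requires declaring it, for instance in <code>~/.config/guix/channels.scm</code>; the following is a sketch (see the channel’s documentation for the authoritative snippet):</p><pre><code>(cons (channel
        (name 'guix-cran)
        (url "https://github.com/guix-science/guix-cran"))
      %default-channels)
</code></pre><p>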
However, some packages may not build or work, because build or runtime
dependencies (usually undeclared in CRAN itself) are missing. Any improvement
to the already very good Guix CRAN importer, like enhanced auto-detection of
these missing dependencies, also improves the channel’s quality. More
details are available in a
<a href="https://hpc.guix.info/blog/2022/12/cran-a-practical-example-for-being-reproducible-at-large-scale-using-gnu-guix/">blog post</a>.</p><h2>Packages</h2><p>As of this writing, Guix comes with more than 22,000 packages, which
makes it one of the ten biggest free software distributions <a href="https://repology.org">according
to Repology</a>. This is the result of more than
15,000 commits made by 343 people since last year—an impressive level of
activity sustained thanks to the Guix tooling and continuous integration
services.</p><p>Many scientific packages have been added or upgraded in Guix. As an
example, Bioconductor, the R suite for bioinformatics, was upgraded to
3.16; OCaml 5 with support for shared memory parallelism and effect
handlers was introduced; the <code>snakemake</code> package in Guix received an
important bug fix, making <code>snakemake</code> usable for parallel execution on
HPC clusters. The most common scientific and HPC packages were updated
and improved: Open MPI and its many dependencies, SLURM, OpenBLAS,
Scotch, SUNDIALS, and GROMACS, to name a few. The Julia package set is
still growing; Julia was upgraded to 1.6.7 and then to 1.8.3, with fixes
for i686 and improvements to the Julia build system.</p><p>In addition to the growing collection of curated packages provided as
part of the main Guix channel, we maintain a number of special-purpose
channels that provide additional packages for scientific and
high-performance computing. An up-to-date list of Guix channels
maintained by members of the Guix HPC effort is <a href="https://hpc.guix.info/channels/">available on the
project page</a>. The <a href="https://hpc.guix.info/browse">on-line package
browser</a> also makes it easier to
navigate channels.</p><p>The <a href="https://github.com/guix-science/guix-science">Guix-Science
channel</a>, initiated in
2021, now provides more than 600 packages, complementing the rich
scientific package collection available in Guix proper. Chief among the
changes it received this year are an update of
<a href="https://hpc.guix.info/package/rstudio">R Studio</a> and improvements to
the Jupyter Lab and Jupyter Hub packaging, and the addition of Integrative
Genomics Viewer (IGV).</p><h2>Ensuring Source Code Availability</h2><p>The <a href="https://10years.guix.gnu.org/">10 Years of Guix</a> event was an
opportunity for developers of Guix and Software Heritage (SWH) to discuss
<em>intrinsic identifiers</em>. An intrinsic identifier depends only on the
data content itself; it requires three ingredients for its
computation: a representation of the <em>structure</em> of this content
(serializer), a cryptographic hash algorithm, and an encoding for the
resulting byte string. While converting from one encoding to another is
trivial—e.g., between base64 and base32—it is, naturally, impossible to
“convert” a cryptographic hash to the hash computed by a different
function. All three parameters can be selected with command-line
options to the <code>guix hash</code> command.</p><p>By default Guix computes a SHA256 hash over the Nar serialization of
source archives and version-control checkouts (“Nar” stands for
<em>normalized archive</em>; it is the serialization format inherited from
Nix). Instead, the SWH archive computes the <a href="https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html">SHA1 hash of a
Git-serialized representation of the
files</a>.
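</p><p>Both flavors can be computed locally with <code>guix hash</code>; here is a sketch, with option names as documented in the Guix manual, run from the top of a checkout:</p><pre><code>$ guix hash --serializer=nar --hash=sha256 --format=nix-base32 .  # Guix’s default
$ guix hash --serializer=git --hash=sha1 --format=base16 .        # SWH-style
</code></pre><p>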
This discrepancy deprives Guix of a simple and reliable way to query the
SWH archive by content hash. This led to a
<a href="https://gitlab.softwareheritage.org/swh/meta/-/issues/4538">discussion</a>
about the possibility for SWH to compute and preserve Nar hashes as
additional information for code it archives—so-called
<code>ExtID</code> identifiers. Doing so could improve archive coverage for the source code
referenced by Guix packages, in particular for Subversion checkouts as
used by most of the TeX Live packages.</p><p><img src="/static/images/blog/swh-guix.png" alt="Medley of the Software Heritage and Guix logos, by Marla Da Silva." /></p><p>As discussed in last year’s report, Guix contributor Timothy Sample was
awarded a grant <a href="https://www.softwareheritage.org/2022/01/13/preserving-source-code-archive-files/">by Software Heritage and the Alfred P. Sloan
Foundation</a>
to further their work on Disarchive.
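</p><p>In a nutshell, Disarchive offers two commands, sketched here with illustrative file names (<code>assemble</code> fetches the actual file contents, for example from the SWH archive):</p><pre><code>$ disarchive disassemble foo-1.0.tar.gz foo-1.0.dis  # extract tarball metadata
$ disarchive assemble foo-1.0.dis foo-1.0.tar.gz     # restore the original tarball
</code></pre><p>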
<a href="https://ngyro.com/software/disarchive.html">Disarchive</a> bridges the gap
between source code archives (<em>tarballs</em>) packages refer to and content
stored in the SWH archive. It does so by providing a command to
<em>extract</em> the metadata of a tarball, and another command to <em>reassemble</em>
metadata and content, thereby restoring the original tarball. This work
is key to improving source code availability for the many packages built
from source code tarballs.</p><p>Last year, the Guix project deployed infrastructure to continuously
build and publish a Disarchive database at
<a href="https://disarchive.guix.gnu.org/"><code>disarchive.guix.gnu.org</code></a>. Guix is able
to combine Disarchive and SWH as a fallback when downloading a tarball from its
original URL fails, significantly improving source code archival coverage.</p><p>This work was <a href="https://gitlab.softwareheritage.org/swh/devel/swh-model/-/issues/2430#note_23708">initiated</a>
a few years back and is still ongoing. A
proposal to integrate Disarchive into the SWH archive is <a href="https://gitlab.softwareheritage.org/swh/devel/swh-model/-/merge_requests/267">being
discussed</a>.
We believe Disarchive integration would be a great step forward, not
just for Guix, but for all the distributions and tools that rely on
source tarball availability.</p><h2>Reproducible Research in Practice</h2><p>This section highlights scientific productions made with GNU Guix.</p><p>Guix was used to ensure the reproducibility of experiments for the study of
memory contention between computations and communications on several different
HPC clusters. A <a href="https://gitlab.inria.fr/pswartva/paper-model-memory-contention-r13y">public
companion</a>
explains how to reproduce the experiments with and without GNU Guix.</p><ul><li>Alexandre Denis et al., <a href="https://hal.inria.fr/hal-03871630v1"><em>Predicting Performance
of Communications and Computations under Memory Contention in Distributed
HPC Systems</em></a></li></ul><p>The <a href="https://gitlab.inria.fr/pswartva/paper-starpu-traces-r13y">reproducible
paper</a> about the
impact of tracing on complex HPC application executions, mentioned in the
previous Guix-HPC Activity Report, is still under review for publication.
However, initial feedback from reviewers called for several complementary
experiments, which were carried out about a year after the
initial experiments presented in the paper. Having a complete workflow based on
GNU Guix really helped to dive back into the experimental context and
configurations used a year before!</p><p>Philippe Swartvagher defended his <a href="https://theses.hal.science/tel-03989856">PhD
thesis</a> on the interactions between HPC
task-based runtime systems and communication libraries. In an appendix of the
manuscript, he explains how he used GNU Guix, packages from the Guix-HPC
channel, and Software Heritage on different HPC clusters to ensure the
reproducibility of his experiments.</p><p><img src="/static/images/blog/guix-repro-swartvagher.png" alt="Screenshot of an article referencing its companion code that includes Guix channel and manifest data." /></p><p>The PhD thesis of Marek Felšöci (to be defended in February
2023), which is part of a collaboration between Inria and Airbus, is set in an industrial
aeroacoustic context and deals with direct methods for solving coupled
sparse/dense linear systems. Within the thesis, the author dedicates a full
chapter to the topic of reproducibility. Throughout this chapter, he addresses
the challenges of ensuring a reproducible research study in computer science in
general and in the context of the thesis in particular. The questions related to
the usage of non-free software are discussed as well. The author then presents
the strategy he adopts to face these challenges including working principles,
software tools and their alternatives. To share the resulting guidelines, he
provides a minimal working example of a reproducible research study on solvers
for coupled sparse/dense systems. Moreover, he introduces and references
examples of actual studies from the thesis following the advocated principles
and techniques for improving reproducibility.</p><p><img src="/static/images/blog/pigx.svg" alt="Logo of PiGx." /></p><p>The latest addition to the <a href="https://bioinformatics.mdc-berlin.de/pigx/">PiGx framework of reproducible scientific
workflows backed by Guix</a>
is PiGx SARS-CoV-2, a pipeline for analysing
data from sequenced wastewater samples and identifying given lineages
of SARS-CoV-2. The output of the PiGx SARS-CoV-2 pipeline is
summarized in a report which provides an intuitive visual overview
about the development of variant abundance over time and location.
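</p><p>Reproducing such a software environment with <code>guix time-machine</code> typically boils down to a single command, sketched here with the file names conventionally shipped alongside a paper:</p><pre><code>$ guix time-machine -C channels.scm -- shell -m manifest.scm
</code></pre><p>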
This is the first of the released PiGx pipelines that comes with
concise yet comprehensive instructions on how to use <code>guix time-machine</code> to reproduce the software environment used for the
analyses presented in the paper:</p><ul><li>Vic-Fabienne Schumann et al., <a href="https://doi.org/10.1016/j.scitotenv.2022.158931"><em>SARS-CoV-2 infection dynamics
revealed by wastewater sequencing analysis and
deconvolution</em></a></li></ul><p>Guix was used as the computational environment manager of
biomedical research on the administration of the azithromycin drug after
allogeneic hematopoietic stem cell transplantation for hematologic
malignancies. Studying 240 samples from patients randomized in this phase 3
controlled clinical trial was a unique opportunity to better understand the
mechanisms underlying relapse, the leading cause of mortality after
transplantation. The various data processing scripts and associated
computational environments using
<code>manifest.scm</code>
and <code>channels.scm</code>
files for use with <a href="https://guix.gnu.org/manual/en/html_node/Invoking-guix-time_002dmachine.html"><code>guix time-machine</code></a>
and <a href="https://guix.gnu.org/manual/en/html_node/Invoking-guix-shell.html"><code>guix shell</code></a>
are available <a href="https://gitlab.com/nivall/azimut-blood">here</a>,
<a href="https://gitlab.com/nivall/azimut-in-vitro">there</a> or
<a href="https://gitlab.com/nivall/azimutscrna">there</a>.</p><ul><li>Nicolas Vallet et al. <a href="https://doi.org/10.1182/blood.2022016926"><em>Azithromycin promotes relapse by disrupting immune
and metabolic networks after allogeneic stem cell
transplantation</em></a></li></ul><h1>Cluster Usage and Deployment</h1><p>As part of our effort to streamline Guix deployment on HPC clusters, we
updated and improved our <a href="https://guix.gnu.org/cookbook/en/html_node/Installing-Guix-on-a-Cluster.html">cluster installation
guide</a>,
which is now part of the Guix Cookbook. The guide describes the steps
needed to get Guix running on a typical HPC cluster where nodes come
with a distribution other than Guix System, such as CentOS or Rocky
Linux.</p><p>The sections below highlight the experience of cluster administration
teams and report on tooling developed around Guix for users and
administrators on HPC clusters.</p><h2>Genetics Research Cluster at UTHSC</h2><p>At the University of Tennessee Health Science Center (UTHSC) in Memphis
(USA), we are running an 11-node large-memory <a href="http://genenetwork.org/facilities/">HPC Octopus
cluster</a> (264 cores) dedicated to
pangenome and genetics research. In 2022, more storage was added. What is notable
about this HPC is that it is <em>administered by the users themselves</em>.
Thanks to GNU Guix we install, run and manage the cluster as researchers
(and roll back in case of a mistake). UTHSC’s information technology
(IT) department manages the infrastructure—i.e., physical placement,
routers and firewalls—but beyond that there are no demands on IT.
Thanks to out-of-band access we can completely (re)install machines
remotely.</p><p>Octopus runs GNU Guix on top of a minimal Debian install and we are
experimenting with Guix System nodes that can be run on demand.
LizardFS is used for distributed network storage. Almost all deployed
software has been packaged in GNU Guix and can be installed by regular
users on the cluster without root access, see the
<a href="https://gitlab.com/genenetwork/guix-bioinformatics">guix-bioinformatics</a>
channel.</p><h2>Tier-2 Research Cluster at GliCID</h2><p><a href="https://www.glicid.fr/">GliCID</a>, a Tier-2 cluster in Nantes (France), will
have a new computing cluster installed in the summer of 2023. To retain
control over the system and avoid proprietary tools specific to this
type of facility, GliCID chose to build an independent cluster
infrastructure into which the newly delivered cluster will be
integrated.</p><p>This infrastructure consists of virtual machines (VMs) generated from
Guix operating system definitions and providing services such as
identity management, databases, monitoring, high availability, login
machines, Slurm servers, and documentation servers—over 20 VMs in total.
The generated images are pushed directly to Ceph RBD volumes and
consumed by KVM hypervisors, which avoids a deployment phase. Now
fully operational, this infrastructure is entering a test phase. The
choice of Guix has proven well suited to controlling the whole
infrastructure and to obtaining redeployable, reproducible, and easily
scalable machines.</p><p>Compute nodes are a mix of virtual compute machines running Guix System,
and physical machines from a previous cluster running another
distribution. Making native and “foreign” Guix installations coexist
while guaranteeing the consistency of the profiles turned out to be
challenging. One specific issue GliCID overcame was managing an
independent <code>/gnu/store</code> shared by all the nodes, as per the <a href="https://guix.gnu.org/cookbook/en/html_node/Installing-Guix-on-a-Cluster.html">standard
cluster setup
instructions</a>,
and merging the <code>/gnu/store</code> directory of native nodes via overlayfs.</p><p>In 2023, GliCID plans to increase the share of infrastructure machines
running Guix System, to consolidate more code and improve the quality of
operating system definitions, packages, and services that have been
<a href="https://gitlab.univ-nantes.fr/glicid-public/guix-glicid">developed
internally</a>,
and to contribute more of these upstream.</p><h2>Packages as Environment Modules</h2><p><img src="/static/images/blog/modules-logo.svg" alt="Environment Modules logo." /></p><p>To support seasoned HPC users and system administrators, we developed
<a href="https://hpc.guix.info/blog/2022/05/back-to-the-future-modules-for-guix-packages/">Guix-Modules</a>,
a tool to generate <a href="http://modules.sourceforge.net/"><em>environment
modules</em></a>. Environment modules are a
venerable tool that lets HPC users “load” the software environment of
their choice <em>via</em> the <code>module load</code> command. This gives a lot of flexibility:
users can use their favorite software packages without interfering with
one another, and they can also manipulate different environments. The
downside of this tool is that modules are all too often handcrafted on each
cluster: an <code>openmpi/4.1.4</code> module might be called differently on
another cluster, or it might be a different version, or it might be
built with different options. In other words, use of modules is usually
specific to one cluster, and users have to “port” their code when
switching to a different cluster, as they cannot expect to find the same
modules.</p><p>Nevertheless, the <code>module</code> command remains widespread, well-known, and
convenient. Guix-Modules generates modules for the chosen Guix
packages, such that users can then run <code>module load</code> to use them,
without having any knowledge of Guix. For system administrators, the
benefit is obvious: instead of having to build and maintain tens of
modules for scientific software, they can instead generate them all at
once and provide users with battle-tested packages found in Guix. For
users, the immediate benefit is a smooth transition to Guix, but also
reproducibility and provenance tracking: the generated modules record
provenance information, which allows users to deploy the exact same
software elsewhere or at a different point in time.</p><p>A similar interoperability layer was previously developed for the Spack
and EasyBuild package managers with similar motivations. In the case of
Guix, we hope this will help users accustomed to <code>module</code> migrate towards
reproducible deployment without having to change their habits overnight.</p><h2>Containers, Singularity, and Docker</h2><p>For HPC environments that do not support running native Guix software
deployment, Guix supports building lightweight, reproducible containers
that only have the software that is really needed. At UTHSC we are
distributing binary deployments as Docker containers that run on
state-of-the-art HPC systems. These containers were developed and
tested first on a separate computer with GNU Guix installed, and
produced with <code>guix pack</code>.</p><p>Research teams at Inria resort to <code>guix pack</code> as well when targeting
supercomputers where Guix is not installed. Scientists can deploy their
software using Guix directly on clusters that support it, such as
Grid’5000, PlaFRIM, and some of the Tier-2 clusters; when they need to
deploy it on Tier-1 supercomputers, they build a Singularity image that
they ship and run there. This is both a productivity boost—no need to
manually rebuild software!—and the guarantee that they <em>are</em> running the
same software.</p><p>Having Guix available on those supercomputers would of course make the
process even smoother; we plan to engage with those cluster
administration teams to make Guix available in the future.</p><h2>Supporting POWER9 and RISC-V CPUs</h2><p>While it is perhaps early days to call RISC-V an HPC platform, there are indicators that this may happen in the near future with investments from the <a href="https://www.tomshardware.com/news/risc-v-cluster-demonstrated">USA</a>, the <a href="https://www.european-processor-initiative.eu/epi-epac1-0-risc-v-test-chip-samples-delivered/">EU</a>, India, and China. RISC-V hardware platforms and vendors will become common in the coming years.</p><p><img src="/static/images/blog/risc-v.png" alt="RISC-V logo." /></p><p>Together with Chris Batten of Cornell and Michael Taylor of the University of Washington, Erik Garrison and Pjotr Prins are UTHSC PIs responsible for leading the NSF-funded <a href="https://news.cornell.edu/stories/2021/11/5m-grant-will-tackle-pangenomics-computing-challenge">RISC-V supercomputer for pangenomics</a>. It will incorporate GNU Guix and the <a href="https://guix.gnu.org/en/blog/2019/guix-reduces-bootstrap-seed-by-50/">GNU Mes bootstrap</a>, with input from Arun Isaac, Efraim Flashner and others. <a href="https://nlnet.nl/project/current.html">NLNet</a> is funding RISC-V support for GNU Guix with Efraim Flashner and the GNU Mes RISC-V bootstrap project with Ekaitz Zarraga and Jan Nieuwenhuizen. We aim to continue adding RISC-V support to GNU Guix at a rapid pace. After the Guix days in Paris, Alexey Abramov was the first to bootstrap GNU Guix for RISC-V on the Polarfire platform.</p><p>Why is the combination of GNU Mes and GNU Guix exciting for RISC-V? First of all, RISC-V is a very modern modular open hardware architecture that provides further guarantees of transparency and security. It extends reproducibility to the transistor level and for that reason generates interest from the Bitcoin community, for example. 
Because there are no licensing fees involved, RISC-V is already a major force in IoT and will increasingly penetrate hardware solutions, such as storage microcontrollers and network devices, going all the way to GPU-style parallel computing and many-core solutions with thousands of cores on a single die. GNU Mes and GNU Guix are particularly suitable for RISC-V because Guix can optimize generated code for different RISC-V targets and is able to parameterize deployed software packages for included/excluded RISC-V modules.</p><h1>Outreach and User Support</h1><p>Guix-HPC is in part about “spreading the word” about our approach to
reproducible software environments and how it can help further the goals
of reproducible research and high-performance computing development.
This section summarizes articles, talks, and training sessions given
this year.</p><h2>Articles</h2><p>The following refereed articles about Guix were published:</p><ul><li>Nicolas Vallet, David Michonneau, and Simon Tournier, <a href="https://doi.org/10.1038/s41597-022-01720-9"><em>Toward
practical transparent verifiable and long-term reproducible research
using Guix</em></a>, Nature
Scientific Data, volume 9 issue 1, October 2022</li><li>Ludovic Courtès, <a href="https://hal.inria.fr/hal-03604971"><em>Reproducibility and Performance: Why
Choose?</em></a>, IEEE CiSE volume 4,
issue 3, June 2022</li><li>Ludovic Courtès, <a href="https://doi.org/10.22152/programming-journal.org/2023/7/1"><em>Building a Secure Software Supply Chain with
GNU Guix</em></a>,
Programming Journal, volume 7 issue 1, June 2022</li></ul><p>The following refereed articles about research that uses Guix were published:</p><ul><li>Alexandre Denis et al., <a href="https://hal.inria.fr/hal-03871630v1"><em>Predicting Performance
of Communications and Computations under Memory Contention in Distributed
HPC Systems</em></a></li><li>Vic-Fabienne Schumann et al., <a href="https://doi.org/10.1016/j.scitotenv.2022.158931"><em>SARS-CoV-2 infection dynamics
revealed by wastewater sequencing analysis and
deconvolution</em></a>,
Science of the Total Environment, volume 853, December 2022</li><li>Nicolas Vallet et al. <a href="https://doi.org/10.1182/blood.2022016926"><em>Azithromycin promotes relapse by disrupting immune
and metabolic networks after allogeneic stem cell
transplantation</em></a></li></ul><p>Over the year we published <a href="https://hpc.guix.info/blog/">six articles on the Guix-HPC
blog</a> touching topics such as environment
modules, reproducible R environments, and reproducibility.</p><h2>Talks</h2><p>Since last year, we gave the following talks at the following venues:</p><ul><li><a href="https://archive.fosdem.org/2022/schedule/event/commonworkflowlang/"><em>Concise Common Workflow Language—Concision and Elegance in a
Workflow Language Using
Lisp</em></a>,
FOSDEM, Feb. 2022 (Arun Isaac)</li><li><a href="https://carrv.github.io/2022/"><em>Using Guix in Computer Architecture Research</em> at both the gem5
users' workshop and the Sixth Workshop on Computer Architecture
Research with RISC-V (CARRV'22) in New York City,
NY</a>, June 2022, Christopher Batten (Cornell University), Pjotr Prins, Efraim Flashner, Arun Isaac (The University of Tennessee Health Science Center), Jan van Nieuwenhuizen (Joy of Source), Ekaitz Zarraga (ElenQ Technologies), Tuan Ta, Austin Rovinski (Cornell University), Erik Garrison (The University of Tennessee Health Science Center)</li><li><a href="https://10years.guix.gnu.org/program/#gnu-guix-and-the-risc-v-future"><em>GNU Guix and the RISC-V
Future</em></a>,
Ten Years of Guix, Sep. 2022, (Pjotr Prins)</li><li><a href="https://communs.numerique.gouv.fr/posts/annonce-programme-journee-bluehats-2022/"><em>GNU Guix, vers la reproductibilité
computationnelle</em></a>,
BlueHats session of <a href="https://www.opensource-experience.com/en/">Open Source
Experience</a>, Nov. 2022
(Simon Tournier)</li><li><a href="https://www.ibens.ens.fr/spip.php?article172&lang=en"><em>Toward practical transparent verifiable and long-term reproducible research
using Guix</em></a>,
bioinfo seminar at <a href="https://www.ibens.ens.fr/?lang=en">Institut de Biologie de l'École Normale Supérieure
(IBENS)</a>, Dec. 2022 (Simon
Tournier)</li></ul><h2>Events</h2><p>As in previous years, Pjotr Prins spearheaded the organization of the
<a href="https://archive.fosdem.org/2022/schedule/track/declarative_and_minimalistic_computing/">“Declarative and minimalistic computing”
track</a>
at FOSDEM 2022, which was home to several Guix talks.</p><p><img src="https://10years.guix.gnu.org/static/images/photos/2022_0916_15334700.small.jpg" alt="Group photo around the birthday cake. By Christopher Baines, CC0." /></p><p>This year was also the tenth year of Guix as a project. Its first lines
of code were written in April 2012, and it has since received code
contributions by more than 800 people at an impressive rate, not to
mention non-coding contributions in many areas—from helping out
newcomers, to designing graphics, to translating documentation.</p><p>To celebrate, we organized <a href="https://10years.guix.gnu.org/"><em>Ten Years of
Guix</em></a>, a three-day event that took place
in Paris, France, in September 2022, with <a href="https://10years.guix.gnu.org/sponsors">support from research and
non-profit organizations</a>. About
50 people came to Paris and the event was also live-streamed.</p><p>This event was one of a kind: it brought together scientists and free
software hackers, two communities that evidently have shared values—as
the <em>open science</em> movement demonstrates—and that benefit from one
another. The program was organized as follows:</p><ul><li>Friday, September 16th, was dedicated to <em>reproducible deployment
for reproducible research</em>. Scientists and practitioners shared
their experience building reproducible research workflows, using
Guix and other tools.</li><li>Saturday focused on development <em>with</em> Guix and <em>on</em> Guix, as well
as community topics.</li><li>Sunday had more in-depth presentations of Guix as well as informal
discussions and skill-sharing sessions.</li></ul><p>A total of 34 talks were given and <a href="https://10years.guix.gnu.org/program/">videos are available
on-line</a>—many thanks to the
Debian video team for making it possible!</p><p><img src="https://10years.guix.gnu.org/static/images/photos/2022_0917_15530400.small.jpg" alt="The cake! Picture by Chrisopher Baines, CC0." /></p><p>Oh and of course, we ate not one but two birthday cakes.</p><h2>Training Sessions</h2><p>For the French HPC Guix community, we continued the monthly on-line
event called <a href="https://hpc.guix.info/events/2022/café-guix/">“Café
Guix”</a>, originally started
in October 2021. Each month, a user or developer informally presents a
Guix feature or workflow and answers questions. These sessions are now recorded
and are available on the webpage.</p><p>A mini-tutorial about Guix was presented by Simon Tournier on May 19,
2022 during the French Higher Education and Research Days on Networking (JRES). The
one-hour <a href="https://replay.jres.org/w/3TuYmocHwKtzs7q1VtL1GB">video</a> and the
<a href="https://conf-ng.jres.org/2021/document_revision_2595.html?download">slides</a>
are available (in French). In June, <a href="https://www.inrae.fr/en">INRAE</a>
(the French institute for research in agriculture, food, and environment)
organized in Montpellier a training session covering tools such as
Kubernetes and OpenStack, and hosted a session dedicated to computational
reproducibility where Simon Tournier
<a href="https://gitlab.com/zimoun/fcc-inrae">presented</a>
how Guix can help.</p><p>On May 30, 2022 the Max Delbrück Center for Molecular Medicine in the
Helmholtz Association (MDC) hosted a Guix workshop as part of the Data
Science Café in Berlin. The workshop was entitled “Managing
reproducible and transparent software environments with GNU Guix” and
was presented by Ricardo Wurmus.</p><p>The Inria research center in Nancy (France)
periodically organizes afternoon technical seminars, referred to as “Tuto
Techno”, about a technology or programming language. On June 14, 2022
the research center hosted Marek Felšöci who gave a <a href="https://tuto-techno-guix-hpc.gitlabpages.inria.fr/slides/tuto-techno-guix-hpc.pdf">presentation</a>
on the use of Guix combined with literate
programming with Org Mode for building reproducible research studies. The
presentation was followed by a hands-on session. Attendees were guided
through the process of constructing a standalone Git repository containing a
research study entirely reproducible thanks to Guix and the literate description
of the experimental environment, source code and methods in Org mode. At the end
of the hands-on session, participants learned how to use
Software Heritage to guarantee the long-term availability of their work.
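Concretely, rebuilding such a study boils down to a single pinned command along these lines (file names illustrative; it assumes the repository provides a <code>channels.scm</code> and a <code>manifest.scm</code>):</p><pre><code class="language-console">$ guix time-machine -C channels.scm -- \
    shell -m manifest.scm -- make</code></pre><p>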
The tutorial is self-contained and <a href="https://tuto-techno-guix-hpc.gitlabpages.inria.fr/guidelines/">publicly
available</a> for
anyone who would like to try it out.</p><p>A training session was given during the <a href="https://osd-uga-2022.sciencesconf.org/">Open Science
Days</a>, which took place in
Grenoble, France, 13–15 December 2022. Entitled <a href="https://archive.softwareheritage.org/swh:1:dir:731f3a71e4676c7ec0ab83c72aa060c9a094630a">“<em>Déploiement
reproductible des logiciels scientifiques avec
GNU Guix</em>”</a>
(“Reproducible scientific software deployment with GNU Guix”) and given
by Ludovic Courtès, Konrad Hinsen, and Simon Tournier, the session
introduced the use of <code>guix shell</code> and <code>guix time-machine</code> as the
building blocks of reproducible workflows. Training material is
<a href="https://gitlab.inria.fr/guix-hpc/open-science-days-tutorial">available
on-line</a>.</p><p>Another training session was organized by SARI (part of the DevLog
knowledge network at CNRS) in Grenoble on the 8th of December 2022. It
aimed to help people use Guix on the GriCAD HPC cluster.</p><p>Work has started on a sequel to the <a href="https://www.fun-mooc.fr/en/courses/reproducible-research-methodological-principles-transparent-scie/">Reproducible Research MOOC</a> by Inria Learning Lab, which will include an introduction to Guix for managing software environments for reproducible research.</p><h1>Personnel</h1><p>GNU Guix is a collaborative effort, receiving contributions from more
than 90 people every month—a 50% increase compared to last year. As
part of Guix-HPC, participating institutions have dedicated work hours
to the project, which we summarize here.</p><ul><li>Inria: 2.5 person-years (Ludovic Courtès; contributors to the Guix-HPC
channel: Emmanuel Agullo, Luca Cirrottola, Marek Felšöci, Marc
Fuentes, Nathalie Furmento, Gilles Marait, Florent Pruvost, Matthieu
Simonin, Philippe Swartvagher, Mathieu Verite; system administrator in
charge of Guix on the PlaFRIM and Grid’5000 clusters: Julien
Lelaurain)</li><li>Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC): 2 person-years
(Ricardo Wurmus and Mădălin Ionel Patrașcu)</li><li>University of Paris Cité: 0.75 person-year (Simon Tournier)</li><li>University of Tennessee Health Science Center (UTHSC): 3+ person-years (Efraim Flashner, Bonface Munyoki, Fred Muriithi, Arun Isaac, Jorge Gomez, Erik Garrison and Pjotr Prins)</li><li>CNRS and UGA (GRICAD): 0.3 person-year (Céline Acary-Robert, Pierre-Antoine Bouttier, Oliver Henriot)</li></ul><h1>Perspectives</h1><p>With <a href="https://en.unesco.org/science-sustainable-future/open-science/recommendation">UNESCO’s Recommendation on Open
Science</a>
and the many Open Science initiatives at the national and institutional
levels, awareness of the Open Science and reproducible research
principles is on the rise. Its implications are also better understood,
in particular when it comes to software: software publication and
licensing, issues of software deployment, provenance tracking, and
reproducibility are becoming central to scientific practices.
Addressing these issues requires commitment of the scientific community
at large: scientists, but also research software engineers (RSEs) and
system administrators.</p><p>The Guix-HPC effort is unique in its ability to connect these
communities. This Activity Report as well as the program of the Ten
Years of Guix event earlier this year are proof that researchers,
engineers, and system administrators all have a stake in what we are
building. Together, we shape tools and practices that further Open
Science and make reproducible research workflows practical.</p><p>Bringing these tools and practices to the scientific community is a key challenge for the project.
While Guix gets more recognition as an enabler for reproducible research,
misconceptions persist: that Guix only caters to <a href="https://hpc.guix.info/blog/2022/07/is-reproducibility-practical/">the needs of
“reproducibility
professionals”</a>,
or that reproducibility is <a href="https://hal.inria.fr/hal-03604971/">antithetical to
performance</a>. In the coming year,
we want to reach out to broader user communities—again scientists,
engineers, and system administrators—and to provide training sessions.
It is our mission to put the tools we build in the hands of
practitioners at large.</p><p>There are technical challenges ahead for the coming year, in line with
what we have been doing: improving the user experience for scientists,
improving the user story when running software on a Guix-less cluster,
bridging the gap with users that do not interact with software <em>via</em> the
command line or Jupyter, bringing Guix System and <code>guix deploy</code> to HPC
cluster administrators, and achieving 100% coverage of package source
code in the Software Heritage archive.</p><p>The GNU Guix project turned ten this year. It started with the
development of a “package manager” and is now providing a complete
<em>deployment toolbox</em>: a package manager, but also a development
environment manager, a container provisioning tool, a standalone
operating system, and a cluster deployment tool. Besides its technical
achievements, it has raised the bar of what one can expect in terms of
software deployment—reproducibility, provenance tracking, and
transparency. We are determined to make more strides in that direction.</p><p>There’s a lot we can do and we’d love to <a href="https://hpc.guix.info/about">hear your ideas</a>!</p>Guix-HPC at FOSDEMLudovic Courtèsguix-devel@gnu.org2023-01-24T14:00:00Z<p>As has been the case <a href="https://guix.gnu.org/en/blog/tags/fosdem/">for 9 years
(!)</a>, Guix will be present at
<a href="https://fosdem.org/2023">FOSDEM</a>, the big annual free software
developer conference in Europe. There will be <a href="https://guix.gnu.org/blog/2023/meet-guix-at-fosdem-2023/">no less than ten
Guix-related
talks</a>, of
which the following are particularly relevant to the HPC and
reproducible research communities:</p><ul><li>In the <a href="https://fosdem.org/2023/schedule/track/open_research_tools_and_technology/">Open Research Tools
track</a>,
<a href="https://fosdem.org/2023/schedule/event/openresearch_guix/"><em>Guix, toward practical transparent, verifiable and long-term
reproducible
research</em></a>
will be an introduction to Guix (by Simon Tournier) for an audience
of scientists interested in coming up with scientific practices that
improve verifiability and transparency.</li><li>In the <a href="https://fosdem.org/2023/schedule/track/risc_v/">RISC-V
track</a>, Efraim
Flashner will talk about the latest breakthroughs in <a href="https://fosdem.org/2023/schedule/event/rv_gnu_guix/"><em>Porting
RISC-V to
GNU Guix</em></a>—and
the other way around.</li><li>In the <a href="https://fosdem.org/2023/schedule/track/hpc_big_data_and_data_science/">HPC
track</a>,
Ludovic Courtès will give a lightning talk about CPU tuning in Guix
entitled <a href="https://fosdem.org/2023/schedule/event/cpu_tuning_gnu_guix/"><em>Reproducibility and performance: why
choose?</em></a>.</li></ul><p>There are lots of exciting talks in each of these tracks; check them out!
Talks will be live-streamed so you can join and chat with us even if
you’re not physically present.</p><p>Prior to FOSDEM, the community will meet in person for the <a href="https://guix.gnu.org/blog/2023/meet-guix-at-fosdem-2023/">Guix
Days</a>, two
days to informally discuss organizational matters, technical issues, and
road maps.</p><p>See you in Brussels!</p>CRAN, a practical example for being reproducible at large scale using GNU GuixLars-Dominik Braunguix-devel@gnu.org2022-12-21T15:50:00Z<p>A recent <a href="https://doi.org/10.1038/s41597-022-01143-6">study published in <em>Nature Scientific Data</em> in February
2022</a> gives empirical insight
into the success rate of reproducing R scripts obtained from Harvard’s
Dataverse:</p><blockquote><p><em>We re-executed R code from each of the replication packages using
three R software versions, R 3.2, R 3.6, and R 4.0, in a clean
environment.</em>
[…]
<em>We find that 74% of R files failed to complete without
error in the initial execution, while 56% failed when code cleaning
was applied, showing that many errors can be prevented with good
coding practices.</em></p></blockquote><p>Given that more than half of the published R files failed to run even when
trying to run it with three different R versions, recording the exact
environment software is supposed to run in could be declared a <em>good
coding practice</em> for scientific publications.</p><p>The R ecosystem itself provides tools to capture and restore R software
environments, including <a href="https://rstudio.github.io/packrat/">Packrat</a>
and its successor <a href="https://rstudio.github.io/renv/">renv</a>
which both originate from within the RStudio project. Two replication
packages in the study above used renv while the others did not record
the environment at all.</p><p>Looking at renv more closely reveals that it is able to
capture the current R version and installed packages in a lockfile
called <code>renv.lock</code>. However, <a href="https://hpc.guix.info/blog/2022/07/is-reproducibility-practical/">as noted
before</a>,
restoring an environment comes with a few
<a href="https://rstudio.github.io/renv/articles/renv.html#caveats">caveats</a>:
First of all, renv does not install a different version of R if the
recorded and current version disagree. This is a manual step and up to
the user. The same is true for packages with external dependencies. Those
libraries, their headers and binaries also need to be installed by the
user in the correct version, which is <em>not</em> recorded in the lockfile.
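For reference, the basic renv workflow is just two calls (a sketch, assuming renv is already installed):</p><pre><code class="language-console">$ R -e 'renv::snapshot()'  # record the R version and package versions in renv.lock
$ R -e 'renv::restore()'   # reinstall the recorded packages (but not R itself)</code></pre><p>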
Furthermore renv supports restoring packages installed from git
repositories, but fails if the user did not install git beforehand.</p><p>None of the guesswork and manual installation steps are required
when using GNU Guix, since software in its repositories is
bit-for-bit reproducible. It also provides scripts (“importers”)
to turn packages from various language-specific repositories like
<a href="https://pypi.org/">PyPi</a> for Python, <a href="https://crates.io/">crates.io</a>
for Rust and <a href="https://cran.r-project.org/">CRAN</a> for R into Guix package
recipes.</p><p>An example workflow for the CRAN package
<a href="https://CRAN.R-project.org/package=zoid">zoid</a>, which is not available
in Guix proper, would look like this:</p><ol><li><p>Import the package into a manifest.</p><pre><code class="language-console">$ guix import cran -r zoid > manifest.scm</code></pre></li><li><p>Edit <code>manifest.scm</code> to import the required modules and return a
usable manifest containing the package and R itself.</p><pre><code class="language-scheme">(use-modules (guix packages)
(guix download)
(guix licenses)
(guix build-system r)
(gnu packages cran)
(gnu packages statistics))
(define-public r-zoid …)
(packages->manifest (list r-zoid r))</code></pre></li><li><p>Run your code.</p><pre><code class="language-console">$ guix shell -m manifest.scm -- R -e 'library(zoid)'</code></pre></li></ol><p>Although Guix displays hints about which modules are missing when trying to
use an incomplete manifest, editing the manifest file to include all of
them can be quite tedious.</p><p>For R specifically, the R package
<a href="https://CRAN.R-project.org/package=guix.install">guix.install</a> provides
a way to automate this import. It also uses <code>guix import</code>, but references
dependencies using package specifications like <code>(specification->package "r-bh")</code>. This way no extra logic to figure out the correct module
imports is required. It then extends the package search path, including
the newly written file at <code>~/.Rguix/packages.scm</code>, installs the package
into the default Guix profile at <code>~/.guix-profile</code> and adds this profile
to R’s search path.</p><p>While this approach works well for individual users, Guix installations
with a larger user-base, for instance institution-wide, would benefit
from the default availability of the entire CRAN package collection with
pre-built substitutes to speed up installation times. Additionally,
reproducing environments would include fewer steps if the package
recipes were available to anyone by default.</p><h2>Introducing guix-cran</h2><p>GNU Guix provides a mechanism called “channels”,
which can extend the package collection in Guix
proper. <a href="https://github.com/guix-science/guix-cran">guix-cran</a> does
exactly that: It provides all CRAN packages missing in Guix proper in
a channel and has all of the properties mentioned above. It can be
installed globally via <code>/etc/guix/channels.scm</code> and packages can be
pre-built on a central server.</p><p>As of commit <code>cc7394098f306550c476316710ccad20a510fa4b</code> there are 17431
packages available in guix-cran. 95% of them are buildable and only 0.5%
of these builds are not reproducible via <code>guix build --check</code>. It is
also possible to use old package versions via <code>guix time-machine</code>, similar
to what <a href="https://mran.microsoft.com/documents/rro/reproducibility">MRAN</a>
offers. However, that time-frame only spans about two months right now.</p><p>Creating and updating guix-cran is <a href="https://github.com/guix-science/guix-cran-scripts">fully
automated</a> and happens
without any human intervention. Improvements to the already very good
CRAN importer also improve the channel’s quality. The channel itself
is always in a usable state, because updates are tested with <code>guix pull</code>
before committing and pushing them. However, some packages may not build
or work, because (usually undeclared) build or runtime dependencies are
missing. This could be improved through better auto-detection in the
CRAN importer.</p><p>Currently building the channel derivation is very slow, most
likely due to Guile performance issues. For this reason packages
are split into files by the first letter of their name. This way they can
still be referenced deterministically by their first letter.
Since the number of loadable modules is <a href="https://www.mail-archive.com/guile-devel@gnu.org/msg16244.html">limited to
8192</a>,
creating one module file per package is not possible and putting them
all into the same file is even slower.</p><p>The channel is not signed, because all changes are automated anyway.</p><h2>Usage</h2><p>Using guix-cran requires the following steps:</p><ol><li><p>Create <code>channels.scm</code>:</p><pre><code class="language-scheme">(cons
(channel
(name 'guix-cran)
(url "https://github.com/guix-science/guix-cran.git"))
%default-channels)</code></pre></li><li><p>Create <code>manifest.scm</code>:</p><pre><code class="language-scheme">(specifications->manifest '("r-zoid" "r"))</code></pre></li><li><p>Run:</p><pre><code class="language-console">$ guix time-machine -C channels.scm -- shell -m manifest.scm -- R -e 'library(zoid)'</code></pre></li></ol><p>For true reproducibility it’s necessary to pin the channels to a
specific commit by running</p><pre><code class="language-console">$ guix time-machine -C channels.scm -- describe -f channels > channels.pinned.scm</code></pre><p>once and using <code>channels.pinned.scm</code> instead of <code>channels.scm</code> from there on.</p><h2>Appendix</h2><p>Ludovic Courtès, Simon Tournier and Ricardo Wurmus provided valuable
feedback to the draft of this post.</p><p>The channel statistics above can be reproduced using the following
manifest (<code>channels.scm</code>):</p><pre><code class="language-scheme">(list
(channel
(name 'guix)
(url "https://git.savannah.gnu.org/git/guix.git")
(branch "master")
(commit
"4781f0458de7419606b71bdf0fe56bca83ace910")
(introduction
(make-channel-introduction
"9edb3f66fd807b096b48283debdcddccfea34bad"
(openpgp-fingerprint
"BBB0 2DDF 2CEA F6A8 0D1D E643 A2A0 6DF2 A33A 54FA"))))
(channel
(name 'guix-cran)
(url "https://github.com/guix-science/guix-cran.git")
(branch "master")
(commit
"cc7394098f306550c476316710ccad20a510fa4b")))</code></pre><p>And the following Scheme code to obtain a list of all packages provided
by guix-cran (<code>list-packages.scm</code>):</p><pre><code class="language-scheme">(use-modules (guix discovery)
(gnu packages)
(guix modules)
(guix utils)
(guix packages))
(let* ((modules (all-modules (%package-module-path)))
(packages (fold-packages
(lambda (p accum)
(let ((mod (file-name->module-name (location-file (package-location p)))))
(if (member (car mod) '(guix-cran))
(cons p accum)
accum)))
'() modules)))
(for-each (lambda (p) (format #t "~a~%" (package-name p))) packages))</code></pre><p>And this Bash script:</p><pre><code class="language-bash">#!/bin/sh
guix pull -p guix-profile -C channels.scm
export GUIX_PROFILE=`pwd`/guix-profile
source guix-profile/etc/profile
guix repl list-packages.scm > packages
cat packages | parallel -j 4 'rm -f builds/{} && guix build --no-grafts --timeout=300 -r builds/{} -q {} 2>&1 && guix build --no-grafts --timeout=300 --check -q {} 2>&1' | tee build.log
echo "total" && wc -l packages
echo "success" && sort -u build.log | grep '^/gnu/store' | wc -l
echo "failure" && sort -u build.log | grep 'failed$' | wc -l
echo "non-reproducible" && sort -u build.log | grep 'differs$' | wc -l</code></pre>Is reproducibility practical?Ludovic Courtèsguix-devel@gnu.org2022-07-21T15:00:00Z<p>Our attention was recently caught by a nice slide deck on the methods
and tools <a href="https://web.archive.org/web/20220620120430/https://umr-astre.pages.mia.inra.fr/presentations/reproducible-research-in-r/#/principles-of-reproducible-research">for reproducible research in
R</a>.
Among those, the talk <a href="https://web.archive.org/web/20220620120430/https://umr-astre.pages.mia.inra.fr/presentations/reproducible-research-in-r/#/guix">mentions
Guix</a>,
stating that it is “<em>for professional, sensitive applications that
require <strong>ultimate reproducibility</strong></em>”, which is “<em>probably a bit
overkill for Reproducible Research</em>”. While we were flattered to see
Guix suggested as a good tool for reproducibility, the very notion that
there’s a kind of “reproducibility” that is “ultimate” and, essentially,
impractical, is something that left us wondering: What kind of
reproducibility do scientists need, if not the “ultimate” kind? <em>Is
“reproducibility” practical at all</em>, or is it more of a horizon?</p><p>In this post, we question the way we Guix people have been discussing
“reproducibility” in the context of software deployment. We identify
sources of confusion and show that reproducibility is a <em>means</em> that can
help achieve different goals. Our conclusion, perhaps unsurprisingly,
is that the kinds of “reproducibilities” offered by a tool like Guix are
not a luxury for a professional elite: they’re a foundation for reliable
software deployment and for verifiable research.</p><h1>Two kinds of reproducibility</h1><p>When we talk about “reproducibility” in the context of Guix, we really
have two related but different goals in mind. The first goal is being
able to <em>redeploy the same software environment</em> on different machines
or at different points in time, with little effort.</p><p>This first goal is very practical: it’s about letting everyone on a team
use the same software, it’s about letting you install the same software
on two different machines, whether it’s a laptop running Guix System, a
virtual machine running Debian, or a supercomputer running CentOS, and
it’s about letting you rerun the computational experiment of a
scientific article months later.</p><p>The second goal is <em>verifiability</em>. Let’s imagine a scenario where you
publish an article and, as accompanying material, you publish source
code together with a Docker image on Zenodo containing the code that was
<em>supposedly</em> used to produce the results in the article and that
<em>supposedly</em> corresponds to that source code.</p><p>I say “supposedly” because you cannot tell for sure unless you verify.
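</p><p>One way to shrink that “supposedly” is to produce the image itself with a
tool whose output can be rebuilt and compared. A sketch with Guix (the
package selection here is merely illustrative):</p><pre><code>guix pack -f docker -S /bin=bin python python-numpy
</code></pre><p>The command prints the <code>/gnu/store</code> file name of the resulting image
tarball; anyone using the same Guix commit can re-run it and compare hashes.</p><p>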
There are two hypotheses one might want to verify:</p><ol><li>That the source code matches the binary in the Docker image;</li><li>That the program produces the output shown in the article.</li></ol><p>Scientific conferences now often have Artifact Evaluation Committees,
which in practice verify that source code is available, and, when things
go well, that the container image can produce the results shown in the
article—the source/binary correspondence is all too often left out as a
technical detail. Reproducible research is about being able to verify
research outcomes though, and executable artifacts are one such outcome.</p><h1>“Professional” vs. “good enough”</h1><p>“<em>I see what you’re headed to</em>”, you note, “<em>but bit-for-bit
reproducibility is overkill, I don’t</em> need <em>it</em>.” Wait, we didn’t even
mention bit-for-bit reproducibility (yet)!</p><p>Let’s get back to the first of our two goals: the ability to deploy the
same software environment, anytime anywhere. Maybe there are “good
enough” approaches, not as “overkill” as what Guix does, yet ones that
achieve that goal?</p><p>Maybe. The slide deck mentioned above is concerned primarily with
<a href="https://www.r-project.org">GNU R</a> software. At almost 30 years, R is
all wisdom and reliability. The language rarely changes, and its developers
pay attention to backward compatibility, minimizing breakage for the
thousands of user-contributed packages available on
<a href="https://cran.r-project.org/">CRAN</a>. If your software environment
consists entirely of R modules, the
<a href="https://rstudio.github.io/packrat/">Packrat</a> tool can do wonders: it
can create snapshots of the package name/version pairs used in your
session and eventually <a href="https://rstudio.github.io/packrat/walkthrough.html">restore those
snapshots</a> by
looking up those name/version pairs. It is “good enough” in the sense
that the restored environment is “likely” to behave “similarly” to
the original environment. It is not “ultimate
reproducibility” because there are many things that could lead to
different behavior: you might be restoring with a different version of
R, or one built or configured differently, with a different set of
dependencies, or it might run on a different operating system.</p><p>This approach falls short for software environments that are not 100% R.
This is not uncommon, if you think about R packages that wrap C/C++
libraries (zlib, Cairo, cURL, Eigen, etc.). Those libraries are beyond
the scope of Packrat; whether Packrat can restore an R package that
depends on C/C++ libraries depends on external factors: whether those
libraries were pre-installed through some other means, whether the
“right” versions are available, whether a C/C++ compiler is available,
and so on. It might succeed, or it might fail at build time (due to the
lack of a suitable compiler or dependencies) or at run time (due to
binary incompatibilities, different dependency versions or build
options, etc.). What’s “good enough” for 100% R projects isn’t good
enough to let you redeploy polyglot environments.</p><p>Other package management tools that have a partial vision of the
dependency graph—from <code>pip</code> and Conda to EasyBuild and Spack—suffer from
that shortcoming. They may or may not be able to redeploy software
packages; those packages might fail to build, because <a href="https://hpc.guix.info/blog/2017/09/reproducibility-and-root-privileges/">their build
environment is not tightly
controlled</a>,
or they might fail at run time <a href="https://hpc.guix.info/blog/2021/09/whats-in-a-package/">due to binary
incompatibilities</a>.
These are very practical problems.</p><h1>Bit for bit</h1><p>This brings us to our second goal: verifiability. For us developers of
package management tools, the question is: how can we enable users to
<em>independently verify</em> the source/binary correspondence? In our
artifact evaluation scenario, we might want to provide reviewers with a
Docker image for convenience, but how can we let them verify that the
binaries in that image correspond to the accompanying source code?</p><p>This is where <em>reproducible builds</em> come in: as a <a href="https://guix.gnu.org/en/blog/2015/reproducible-builds-a-means-to-an-end/">means to allow for
independent verification of the source/binary
correspondence</a>.
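</p><p>Guix ships a tool for precisely this kind of independent verification:
<code>guix challenge</code> compares the store items you have with what
independent build farms publish, hash by hash. A quick sketch:</p><pre><code>guix challenge openblas \
  --substitute-urls="https://ci.guix.gnu.org https://bordeaux.guix.gnu.org"
</code></pre><p>Any store item whose contents differ across builders is reported—a sign of
a non-reproducible, or possibly compromised, build.</p><p>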
The definition that many in the field agree on
<a href="https://reproducible-builds.org/docs/definition/">states</a>:</p><blockquote><p>A build is reproducible if given the same source code, build
environment and build instructions, any party can recreate bit-by-bit
identical copies of all specified artifacts.</p></blockquote><p>“Bit-by-bit identical copies”. That phrase suggests perfection.
Perfection doesn’t exist though, and it’s not unusual for scientists and
practitioners to stop reading at “bit-by-bit”, saying: “<em>nah—this is
nice in theory but just impractical and overkill</em>”.</p><p>Think about it though: how hard can it be to make a software build
process reproducible bit-for-bit? Fortunately, compilers behave in a
deterministic fashion: given the same input, they produce the same
output. Experience with software distributions as large as Debian, Arch
Linux, NixOS, and Guix has shown that there’s a core of <a href="https://reproducible-builds.org/docs/">well-identified
sources of non-reproducibility</a>.
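</p><p>To make those sources concrete, here is a self-contained illustration,
outside of Guix, using GNU tar: archive metadata—not file contents—breaks
bit-for-bit equality, and normalizing it restores it:</p><pre><code>dir=$(mktemp -d); cd "$dir"
mkdir demo && echo hello > demo/file

tar -cf a.tar demo
sleep 1 && touch demo/file        # bump the timestamp; contents unchanged
tar -cf b.tar demo
cmp -s a.tar b.tar || echo "archives differ"

# Normalize file ordering, ownership, and timestamps:
tar --sort=name --owner=root --group=root --mtime='@0' -cf c.tar demo
tar --sort=name --owner=root --group=root --mtime='@0' -cf d.tar demo
cmp -s c.tar d.tar && echo "archives identical"
</code></pre><p>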
Addressing them takes some effort but is not insurmountable: <a href="https://isdebianreproducibleyet.com/">more than 90% of Debian
packages</a> and <a href="https://data.guix.gnu.org/repository/1/branch/master/latest-processed-revision/package-reproducibility">at least 75% of
Guix
packages</a>
are indeed reproducible bit-for-bit. Guix provides users
<a href="https://guix.gnu.org/manual/devel/en/html_node/On-Trusting-Binaries.html">with</a>
<a href="https://guix.gnu.org/manual/devel/en/html_node/Invoking-guix-challenge.html">tools</a>
that, we hope, are accessible to those who are not professionals in the
field of bit-for-bit reproducibility.</p><p>The same goes at a higher level. Earlier we wrote that a tool like
Packrat can let you restore an environment “likely to behave similarly”
compared to the original one. How would one define “similarly” though?
If the computation produces different output, what conclusion can you
draw? Will you incriminate the method, when you know your software
environment doesn’t faithfully mirror the one that was originally used?
No, you’ll have at best a lot of guesswork to do before you can draw any
conclusion. Conversely, if you know you deployed the same software,
bit-for-bit, then you’ve significantly reduced the search space in case
the computation produces different output. Bit-for-bit reproducibility
might <em>sound</em> overkill, but it’s the only practical way to determine
whether a computational process is reproducible.</p><h1>Practicality</h1><p>This blog post was ignited by a slide deck. Perhaps what the author
alluded to when they mentioned “<em>ultimate reproducibility</em>” and Guix
being “<em>overkill</em>” is that Guix as a project is on a quixotic quest for
reproducibility; but perhaps what they suggested by framing it as
“<em>professional</em>” is that using it is difficult.</p><p>The answer is that if you liked <code>pip install</code> or <code>apt install</code>, you’ll
love <code>guix install</code>. Over ten years of development, we’ve worked hard
on the user interface and documentation to make it easier to <a href="https://guix.gnu.org/manual/devel/en/html_node/Getting-Started.html">get
started</a>.
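</p><p>In day-to-day use, that means commands in the spirit of the sketch
below—per-user, no root privileges required, and transactional:</p><pre><code>guix install python python-numpy   # install into your own profile
guix package --list-installed      # see what you have
guix package --roll-back           # undo the last transaction
</code></pre><p>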
That doesn’t mean everything’s perfect—one of the talks at the upcoming
Ten Years of Guix event is about <a href="https://10years.guix.gnu.org/program/#how-to-make-gnu-guix-irresistible-in-2022-and-beyond">making Guix more
approachable</a>
and we’re always eager to get feedback from newcomers—but at least the
basics should be accessible to anyone who has used the command line
before, or even just
<a href="https://hpc.guix.info/blog/2019/10/towards-reproducible-jupyter-notebooks/">Jupyter</a>.</p><p>Our message is that it <em>is</em> possible to achieve these two types of
“reproducibility”: the ability to deploy the same environment anywhere
anytime, and the ability to verify the source/binary correspondence of
an existing deployment. “Good enough” solutions are good enough in
narrow cases only. We can and must demand more of our deployment tools.</p><h1>Beyond reproducibility</h1><p>This post focuses on reproducibility, but we should keep in mind that the
scientific process does not consist in merely reproducing experiments
as-is—it’s about experimenting, fiddling with the computation to evaluate
the impact of a parameter on the output, changing parts of the
code, and so forth. In a thoughtful article, Hinsen identifies <a href="https://blog.khinsen.net/posts/2020/11/20/the-four-possibilities-of-reproducible-scientific-computations/">four
“essential
possibilities”</a>
for reproducible computations:</p><blockquote><ol><li><p>The possibility to inspect all the input data and all the source
code that can possibly have an impact on the results.</p></li><li><p>The possibility to run the code on a suitable computer of one’s own
choice in order to verify that it indeed produces the claimed
results.</p></li><li><p>The possibility to explore the behavior of the code, by inspecting
intermediate results, by running the code with small modifications,
or by subjecting it to code analysis tools.</p></li><li><p>The possibility to verify that published executable versions of the
computation, proposed as binary files or as services, do indeed
correspond to the available source code.</p></li></ol></blockquote><p>These four items might look consensual but their practical implications
are wide-ranging. The first item is unlocked by publishing scientific
software under a free license—as UNESCO
<a href="https://www.unesco.org/en/natural-sciences/open-science">recommends</a>—and
the two kinds of reproducibilities discussed in this article support #2
and #4. To <em>explore</em> the behavior of the code, we need more. Guix
eases exploration with <a href="https://guix.gnu.org/manual/devel/en/html_node/Package-Transformation-Options.html">“package transformation
options”</a>,
which let users deploy variants of the software environment, for example
by applying a patch somewhere in the software stack or swapping one
dependency for another. A “frozen” application bundle such as a Docker
image does not provide this lever.</p><p>That most scientific processes now involve software should be an
opportunity to <em>improve</em> reproducibility and provenance tracking and to
facilitate experimentation, not the other way around.</p><h1>Acknowledgments</h1><p>Many thanks to Ricardo Wurmus who provided valuable feedback on an
earlier draft of this post.</p>Celebrating 10 years of Guix in Paris, 16–18 SeptemberLudovic Courtès, Tanguy Le Carrour, Simon Tournierguix-devel@gnu.org2022-06-13T15:00:00Z<p>It’s been <a href="https://guix.gnu.org/en/blog/2022/10-years-of-stories-behind-guix/">ten years of
GNU Guix</a>! To
celebrate, and to share knowledge and enthusiasm, a <a href="https://10years.guix.gnu.org">birthday
event</a> will take place on <strong>September
16–18th, 2022</strong>, in Paris, France. The program is being finalized, but
you can <a href="https://10years.guix.gnu.org">already register</a>!</p><blockquote><p><strong>Update</strong> (2022-07-12): <a href="https://10years.guix.gnu.org/program">Preliminary
program</a> published!</p></blockquote><p><img src="/static/images/blog/10-years-of-guix_colorful-10.gif" alt="10 year anniversary artwork" /></p><p>This is a community event with several twists to it:</p><ul><li>Friday, September 16th, is dedicated to <strong>reproducible research
workflows and high-performance computing</strong> (HPC)—the focuses of the
<a href="https://hpc.guix.info">Guix-HPC</a> effort. It will consist of talks
and experience reports by scientists and practitioners.</li><li>Saturday targets <strong>Guix and free software enthusiasts</strong>, users and
developers alike. We will reflect on ten years of Guix, show what
it has to offer, and present on-going developments and future
directions.</li><li>On Sunday, users, developers, developers-to-be, and other
contributors will <strong>discuss technical and community topics</strong> and
join forces for hacking sessions, <a href="https://en.wikipedia.org/wiki/Unconference">unconference
style</a>.</li></ul><p><a href="https://10years.guix.gnu.org">Check out the web site</a> and consider
registering as soon as possible so we can better estimate the size of
the birthday cake!</p><p>If you’re interested in presenting a topic, in facilitating a session,
or in organizing a hackathon, please get in touch with the organizers at
<code>guix-birthday-event@gnu.org</code> and we’ll be happy to make room for you.
We’re also looking for people to help with logistics, in particular
during the event; please let us know if you can give a hand.</p><p>Whether you’re a scientist, an enthusiast, or a power user, we’d love to
see you in September. Stay tuned for updates!</p><blockquote><p><em>Originally published <a href="https://guix.gnu.org/en/blog/2022/celebrating-10-years-of-guix-in-paris/">on the Guix
blog</a>.</em></p></blockquote>Back to the future: modules for Guix packagesLudovic Courtèsguix-devel@gnu.org2022-05-06T14:45:00Z<p>Some things in our software world are timeless. The venerable
<a href="http://modules.sourceforge.net/">Environment Modules</a> are one of these.
If you’ve ever used a high-performance cluster in the last three
decades, chances are you’re already familiar with it. Modules is about
managing software environments, just like Guix is—or, perhaps more
accurately, <a href="https://guix.gnu.org/manual/devel/en/html_node/Invoking-guix-shell.html"><code>guix shell</code></a>.</p><p>You will be delighted, or surprised, to learn that Guix
now has a <a href="https://gitlab.inria.fr/guix-hpc/guix-modules">compatibility layer with
Modules</a>.</p><p><img src="/static/images/blog/modules-logo.svg" alt="Environment Modules logo." /></p><h1>The legacy of Modules</h1><p>As Furlani’s <a href="http://modules.sourceforge.net/docs/Modules-Paper.pdf">1991 introductory paper
explains</a>,
Modules were—and still are—a key enabler for Unix users, especially in
high-performance computing (HPC). The <code>module</code> command lets users
manipulate their software environment in terms of packages, without
having to be Unix or shell experts; they let them <em>compose</em> packages and
build the software environment of their choice, without interfering with
other users; they give a level of flexibility that Unix alone wouldn’t
provide. The command-line interface is easily understood:</p><pre><code>module load gcc/11.2</code></pre><p>“loads” GCC 11.2 in your shell. You can
“load” and “unload” software components at will:</p><pre><code>module load python/3.8
module unload gcc</code></pre><p>As an <em>interface</em>, Modules are easy to use and understand.
However, they leave it up to sysadmins (sometimes users) to
actually <em>deploy</em> the software. The common approach has been for
sysadmins to build and install, <em>by themselves</em>, the software that
Modules refer to. The end result is that modules vary from machine to
machine. For example, the <code>gcc</code> module shown above might refer to
GCC 11.2 on one cluster and GCC 8 on another; it might have an entirely
different name on a third cluster. Likewise, the <code>python/3.8</code> module
above might refer to different patch-level versions of Python 3.8, or
it might refer to a variant of Python
built with different dependencies or different build flags.</p><p>These issues have been largely mitigated by package managers such as
<a href="https://easybuild.io/">EasyBuild</a> and <a href="https://spack.io/">Spack</a>: both
automate package builds, and both can generate <a href="https://modules.readthedocs.io/en/stable/modulefile.html"><em>module
files</em></a>—Tcl
snippets that define environment variables to set when “loading” a
module. With EasyBuild and Spack, it becomes possible to not only
automate deployment and module file generation, but also to deploy
<em>similar</em> software on different machines.</p><p>“Similar”, though, does not mean “the same”. Software built with Spack
or EasyBuild depends on software already available on the host system:
it is built <em>on top</em> of a GNU/Linux distribution, which could be
CentOS 7.4 (released in 2017), or Ubuntu 22.04, or really anything else.
Thus, software installed with these tools depends on software provided
by the underlying distribution, at build time and at run time.</p><p>This “hidden dependency” makes it hard to redeploy the exact same
environment on a different machine or at a different point in time: the
same build process
<a href="https://github.com/easybuilders/easybuild-easyconfigs/issues/10666">might</a>
<a href="https://github.com/spack/spack/issues/16780">fail</a>, or it might succeed
but the resulting software might <a href="https://github.com/easybuilders/easybuild-easyconfigs/issues/3408">behave
differently</a>.
<a href="https://hal.inria.fr/hal-01161771/en">Our approach in Guix</a> is to <em>not</em>
have that “hidden dependency”. Instead, the package dependency graph
that Guix manipulates is <em>self-contained</em>: it includes package
definitions for <em>all</em> the user-land software one may use.</p><h1>From Guix to Modules</h1><p>The news today is the release of
<a href="https://gitlab.inria.fr/guix-hpc/guix-modules">Guix-Modules</a>, a new tool to
generate module files from
Guix packages. The primary goal, as with the module file generation
tools in EasyBuild and Spack, is to make it easy for HPC cluster
sysadmins to provide a set of modules for their users—more on that
below. Guix-Modules is an extension of Guix. To use it, you need to
install it and to set the <code>GUIX_EXTENSIONS_PATH</code> environment variable,
like so:</p><pre><code>guix install guix-modules
export GUIX_EXTENSIONS_PATH="$HOME/.guix-profile/share/guix/extensions"</code></pre><p>That gives you a new <code>guix module</code> sub-command.</p><p>Let’s say you want to generate modules to <code>/opt/modules</code> for selected
packages; you can do so by running:</p><pre><code>guix module create -o /opt/modules \
coreutils gcc-toolchain python python-numpy</code></pre><p>As with all Guix commands, it will build or download the packages if they’re not
around already and populate <code>/opt/modules</code> with a bunch of module files.
If <code>/opt/modules</code> already existed, its previous contents are backed up under
<code>/var/guix/profiles</code>, which lets you roll back to the previous modules
should you regret your changes.</p><p>As an admin, you can periodically update the set of modules by running:</p><pre><code>guix pull
guix module create -o /opt/modules …</code></pre><p>The good thing is that users can still access the previous module set,
until you explicitly remove it, under <code>/var/guix/profiles</code>.</p><p>Instead of having those long <code>guix module create</code> command lines, you can
opt for listing the packages of interest in a <a href="https://guix.gnu.org/manual/devel/en/html_node/Invoking-guix-package.html#index-profile-manifest"><em>manifest
file</em></a>,
which you can keep under version control. As with most other <code>guix</code>
commands, you can pass the manifest with:</p><pre><code>guix module create -m my-modules.scm -o /opt/modules</code></pre><p>Once the modules have been generated, you can happily load and unload
them using the familiar <code>module</code> sub-commands:</p><pre><code>unset MODULEPATH
module use /opt/modules
module load gcc-toolchain/11.2.0
module load python/3.9.9</code></pre><p>Voilà! If you’re a sysadmin, here’s a new way to offer scientific
software to your users without asking them to change their habits. The
generated module files work equally well with <a href="http://modules.sourceforge.net/">the “original” Module
implementation</a> and with
<a href="https://lmod.readthedocs.io/">Lmod</a>.</p><h1>Provenance tracking</h1><p>Since we, Guix developers, pride ourselves on providing a deployment
tool with good support for provenance tracking, we couldn’t just let
that <code>guix module</code> command generate module files of unclear provenance.
Users—we think—ought to be able to determine the provenance of the
modules they use. We want to avoid the scenario many HPC practitioners
are familiar with whereby, six months after publishing an article, you
can no longer reproduce the computational results it contains because
the relevant modules have been upgraded or removed from under your feet
and you just don’t know how to reproduce them.</p><p>Thus, <code>guix module create</code> records provenance data in the module files
it generates. You can view that info by running <code>module help</code>:</p><pre><code>$ module help openblas
----------- Module Specific Help for 'openblas/0.3.18' ------------
This module was generated from a GNU Guix package.
Provenance data (channels):
(list (channel
(url "https://git.savannah.gnu.org/git/guix.git")
(branch "master")
(commit
"4ba35ccd18f90314caa76ea1833ffc383559401c")
(name 'guix)
(introduction
(make-channel-introduction
"9edb3f66fd807b096b48283debdcddccfea34bad"
(openpgp-fingerprint
"BBB0 2DDF 2CEA F6A8 0D1D E643 A2A0 6DF2 A33A 54FA")))))
</code></pre><p>What <code>module help</code> shows is the list of
<a href="https://guix.gnu.org/manual/en/html_node/Channels.html"><em>channels</em></a>
from which this particular package was built. The information is in a
format that <code>guix time-machine</code> can readily consume. Assuming you
store the <code>(list (channel …))</code> snippet in file <code>channels.scm</code>, you can
go to another machine, at a later point in time, and deploy <em>the exact
same software</em> with this command:</p><pre><code>guix time-machine -C channels.scm -- \
shell gcc-toolchain openblas</code></pre><p>For users, it makes a big difference: modules are no
longer ephemeral—they’re now a reproducible artifact <em>that
you can redeploy with Guix anywhere, anytime</em>.</p><h1>Customization</h1><p>HPC users are often demanding when it comes to customizing
software build processes. Guix supports this need with a gamut of
<a href="https://guix.gnu.org/manual/en/html_node/Package-Transformation-Options.html">package transformation
options</a>
available from the command line as well as through <a href="https://guix.gnu.org/manual/en/html_node/Defining-Package-Variants.html">programming
interfaces</a>.
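</p><p>For instance (package and file names below are merely illustrative), one
can rewrite the dependency graph straight from the command line:</p><pre><code># Swap one dependency for another throughout the stack:
guix build python-numpy --with-input=openblas=blis

# Apply a local patch to a package before building:
guix build python-numpy --with-patch=python-numpy=./local-fix.patch
</code></pre><p>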
Good news: <code>guix module create</code> honors package transformation options.</p><p>Among those, the <code>--tune</code> option, which instructs Guix to <a href="https://hpc.guix.info/blog/2022/01/tuning-packages-for-a-cpu-micro-architecture/">optimize
relevant packages for the host
micro-architecture</a>,
may come in handy. If you know your cluster contains only Skylake CPUs,
you’d rather make sure relevant packages are optimized for Skylake. To
do that, you would run, say:</p><pre><code>guix module create --tune=skylake \
gcc-toolchain openblas gsl</code></pre><p>In this particular case, <a href="https://hpc.guix.info/package/gsl">GSL</a> gets
built for Skylake, using GCC’s <code>-march=skylake</code> option (OpenBLAS itself
<a href="https://hpc.guix.info/blog/2018/01/pre-built-binaries-vs-performance/">chooses optimized routines at run
time</a>
so it is unaffected).</p><p>“But what about reproducibility?”, you ask. The chosen package
transformation option(s)—<code>--tune</code> in this case—are <em>also</em> recorded as
part of the provenance data. This is what <code>module help</code> reports:</p><pre><code>$ module help gsl
----------- Module Specific Help for 'gsl/2.7' --------------------
This module was generated from a GNU Guix package.
Provenance data (channels):
(list (channel
(url "https://git.savannah.gnu.org/git/guix.git")
(branch "master")
(commit
"4ba35ccd18f90314caa76ea1833ffc383559401c")
(name 'guix)
(introduction
(make-channel-introduction
"9edb3f66fd807b096b48283debdcddccfea34bad"
(openpgp-fingerprint
"BBB0 2DDF 2CEA F6A8 0D1D E643 A2A0 6DF2 A33A 54FA")))))
Package transformations:
((tune . "skylake"))
</code></pre><p>The “Package transformations” bit is self-explanatory; it can be
passed as-is to
<a href="https://guix.gnu.org/manual/en/html_node/Defining-Package-Variants.html#index-options_002d_003etransformation"><code>options->transformation</code></a>
in a manifest.</p><p>We strongly believe one <a href="https://hal.inria.fr/hal-03604971">shouldn’t have to choose between performance
and reproducibility</a> and this is what
this feature set supports.</p><h1>Why all the fuss?</h1><p>Guix is <a href="https://guix.gnu.org/en/blog/2022/10-years-of-stories-behind-guix/">ten years
old</a>,
Guix-HPC itself is <a href="https://hpc.guix.info/blog/2017/09/guix-hpc-debut/">turning five this
year</a>, so you might
wonder why after all these years we’re adding a Modules compatibility layer. After
all, <a href="https://guix.gnu.org/manual/devel/en/html_node/Invoking-guix-shell.html"><code>guix shell</code></a>
can set up software environments on-the-fly in a way that is comparable to
<code>module load</code>. For instance, to start a shell to use GCC and Python as
in the example above, you would type:</p><pre><code>guix shell gcc-toolchain@11 python@3.8</code></pre><p>More generally, Guix puts users in control: it lets them upgrade when
they want to and allows them to <a href="https://guix.gnu.org/manual/en/html_node/Invoking-guix-time_002dmachine.html">travel in
time</a>;
it lets them <a href="https://guix.gnu.org/manual/en/html_node/Package-Transformation-Options.html">customize
packages</a>,
and it lets them <a href="https://guix.gnu.org/manual/en/html_node/Replicating-Guix.html">replicate the same
environment</a>
elsewhere or at a different point in time.</p><p>Using Guix directly remains the most empowering approach for users, but
module files created from Guix packages can satisfy a number of user
needs:</p><ol><li>Matching user habits. For some communities, not having to learn a
new command—even if it’s not all that different, even if it has
more to offer—is a big plus. It’s not uncommon for cluster admins
to offer Modules <em>in addition</em> to Guix or other tools for that
reason.</li><li>Supporting incremental software environment construction. With
<code>module</code>, you can “load” and “unload” modules until you obtain the
desired environment, whereas <code>guix shell</code> currently expects a list
of packages upfront. While exploring a problem space, the
incremental mode might be more convenient—and indeed, <a href="https://issues.guix.gnu.org/54375">patches have
recently been discussed</a> to
support an incremental mode in <code>guix shell</code>.</li><li>Supporting simple Guixy cluster setups. The <a href="https://hpc.guix.info/blog/2017/11/installing-guix-on-a-cluster/">typical Guix cluster
setup</a>
requires running the build daemon, ensuring it can access the
network to download source or binaries, making it accessible to
front nodes and (optionally) build nodes, and setting up a couple
of NFS exports. Sysadmins who’d rather not do that can instead use
<code>guix module create</code> and offer those modules to users. The
<code>/gnu/store</code> directory still needs to be exported over NFS, but
that’s a read-only export, and it’s all that’s needed—a simpler
setup.</li></ol><p>If you’re an HPC cluster user or system administrator, we’d love to hear
your thoughts <a href="https://hpc.guix.info/about/">on the <code>guix-science</code> mailing list or <code>#guix-hpc</code> channel
on Libera.chat</a>!</p>Guix-HPC Activity Report, 2021Pierre-Antoine Bouttier, Ludovic Courtès, Yann Dupont, Marek Felšöci, Felix Gruber, Konrad Hinsen, Arun Isaac, Pjotr Prins, Philippe Swartvagher, Simon Tournier, Ricardo Wurmusguix-devel@gnu.org2022-02-03T14:00:00Z<p><em>This document is also available as
<a href="https://hpc.guix.info/static/doc/activity-report-2021.pdf">PDF</a>
(<a href="https://hpc.guix.info/static/doc/activity-report-2021-booklet.pdf">printable
booklet</a>).</em></p><p>Guix-HPC is a collaborative effort to bring reproducible software
deployment to scientific workflows and high-performance computing (HPC).
Guix-HPC builds upon the <a href="https://guix.gnu.org">GNU Guix</a> software
deployment tools and aims to make them useful for HPC practitioners
and scientists concerned with dependency-graph control and customization and, uniquely, with reproducible research.</p><p>Guix-HPC was launched in September 2017 as a joint software development
project involving three research institutes:
<a href="https://www.inria.fr/en/">Inria</a>, the <a href="https://www.mdc-berlin.de/">Max Delbrück Center for
Molecular Medicine (MDC)</a>, and the <a href="https://ubc.uu.nl/">Utrecht
Bioinformatics Center (UBC)</a>. GNU Guix for HPC and
reproducible science has received contributions from additional
individuals and organizations, including <a href="https://www.cnrs.fr/en">CNRS</a>,
the <a href="https://u-paris.fr/en/">University of Paris (Diderot)</a>,
the <a href="https://uthsc.edu/">University of Tennessee Health Science Center</a>
(UTHSC), the <a href="https://leibniz-psychology.org/">Leibniz Institute for
Psychology</a> (ZPID),
<a href="https://www.cray.com">Cray, Inc.</a> (now HPE), and <a href="http://tourbillion-technology.com/">Tourbillion
Technology</a>.</p><p>This report highlights key achievements of Guix-HPC between <a href="https://hpc.guix.info/blog/2021/02/guix-hpc-activity-report-2020/">our
previous
report</a>
a year ago and today, February 2022. This year was marked by exciting
developments for HPC and reproducible workflows: the release of
<a href="https://guix.gnu.org/en/blog/2021/gnu-guix-1.3.0-released/">GNU Guix 1.3.0 in
May</a>, the
ability to tune packages for a CPU micro-architecture with the <code>--tune</code>
option, improved Software Heritage support, new releases of Guix-Jupyter
and the Guix Workflow Language (GWL), support for POWER9 CPUs and
on-going work porting to RISC-V, and more.</p><h1>Outline</h1><p>Guix-HPC aims to tackle the following high-level objectives:</p><ul><li><em>Reproducible scientific workflows.</em> Improve the GNU Guix tool set
to better support reproducible scientific workflows and to simplify
sharing and publication of software environments.</li><li><em>Cluster usage.</em> Streamlining Guix deployment on HPC clusters, and
providing interoperability with clusters not running Guix.</li><li><em>Outreach & user support.</em> Reaching out to the HPC and scientific
research communities and organizing training sessions.</li></ul><p>The following sections detail work that has been carried out in each of
these areas.</p><h1>Reproducible Scientific Workflows</h1><p><img src="https://hpc.guix.info/static/images/blog/lab-book.svg" alt="Lab book." /></p><p>Supporting reproducible research workflows is a major goal for Guix-HPC.
The ability to <em>reproduce</em> and <em>inspect</em> computational
experiments—today’s lab notebooks—is key to establishing a rigorous
scientific method. <a href="https://en.unesco.org/science-sustainable-future/open-science/recommendation">UNESCO’s Recommendation on Open
Science</a>,
published in November 2021, recognizes the importance of free software
in research and further notes (§7d):</p><blockquote><p>In the context of open science, when open source code is a component
of a research process, enabling reuse and replication generally
requires that it be accompanied with open data and open specifications
of the environment required to compile and run it.</p></blockquote><p>This key point is often overlooked: the ability to reproduce and inspect
the software environments of experiments <em>is a prerequisite</em> for
transparent and reproducible research workflows.</p><p>To that end, we work not only on deployment issues, but also <em>upstream</em>—ensuring
source code is archived at Software Heritage—and
<em>downstream</em>—devising
tools and workflows for scientists to use. The sections below summarize
the progress made on these fronts and include experience reports by
two PhD candidates showing in concrete terms how Guix fits in
reproducible HPC workflows.</p><h2>Workflow Languages</h2><p>The <a href="https://workflows.guix.info">Guix Workflow Language</a> (or GWL) is
a scientific computing extension to GNU Guix's declarative language
for package management. It allows for the declaration of scientific
workflows, which will always run in reproducible environments that GNU
Guix automatically prepares. In the past year the GWL has received
several bug fixes and infrastructure for detailed logging; it also
gained a DRMAA process engine to submit generated jobs to any HPC
scheduler with an implementation of DRMAA, such as Slurm and Grid
Engine. This was made possible through the newly released <a href="https://lists.gnu.org/archive/html/guile-user/2021-04/msg00081.html">high-level
Guile bindings to DRMAA version
1</a>.
We <a href="https://lists.gnu.org/archive/html/gwl-devel/2022-01/msg00000.html">released version 0.4.0 of the
GWL</a>
on January 29.</p><p>Earlier in January, we announced <a href="https://ccwl.systemreboot.net/">ccwl</a>, the Concise Common
Workflow Language. ccwl is a workflow language with a concise syntax
compiling to the <a href="https://www.commonwl.org/">Common Workflow
Language</a> (CWL). While GWL offers a novel
workflow language with integrated deployment <em>via</em> Guix, ccwl instead
aims to leverage tooling around the popular Common Workflow Language
while addressing some of its limitations.
We published a <a href="https://hpc.guix.info/blog/2022/01/ccwl-for-concise-and-painless-cwl-workflows/">detailed
article introducing
ccwl</a>
and expounding its merits. ccwl significantly cuts short on the
verbosity of CWL, thus removing one of the barriers to its wider
adoption. ccwl is implemented as a domain specific language embedded
in GNU Guile, and interoperates with GNU Guix to provide
reproducibility. ccwl also aims to minimize frustration for users by
providing strong compile-time error checking and high-quality error
messages. We also plan to pre-package commonly used command-line
scientific tools into ready-made ccwl workflows. Work on these
exciting new features is already underway.</p><h2>Reproducible Software Deployment for Jupyter</h2><p>We <a href="https://hpc.guix.info/blog/2019/10/towards-reproducible-jupyter-notebooks/">announced
Guix-Jupyter</a>
two years ago, with two goals: making notebooks <em>self-contained</em> or
“deployment-aware” so that they automatically deploy the software (and
data!) that they need, and making said deployment <em>bit-reproducible</em>.
Earlier this year, we published version 0.2.2 as a bug-fix release.</p><p><img src="/static/images/blog/guix-jupyter/guix-jupyter.png" alt="Guix-Jupyter logo." /></p><p>Guix-Jupyter is implemented as a Jupyter <em>kernel</em>: it acts as a proxy
between the notebook and the programming language notebook cells are
written in. It interprets annotations found in the notebook to deploy
precisely the right software packages needed to run the notebook. We
believe this is a robust approach to address the Achilles’ heel that
software deployment represents for reproducible computations with
Jupyter.</p><p>Yet, because <a href="https://mybinder.org/">Binder</a> and its associated services
and tools are a popular way to deploy Jupyter notebooks, we wanted to
offer an alternative solution integrated with Binder. Under the hood,
Binder builds upon
<a href="https://repo2docker.readthedocs.io/en/latest/">repo2docker</a>, a tool to
build Docker images straight from source code repositories. Repo2docker
has a number of back-ends called <em>buildpacks</em> to handle packaging
metadata in a variety of formats: when a <code>setup.py</code> file is available,
software is deployed using standard Python tools, the presence of an
<code>install.R</code> file leads to deployment using GNU R, an <code>apt.txt</code> file
instructs it to install software using Debian’s package manager, and so
on.</p><p>As part of a three-month internship at Inria, Hugo Lecomte implemented a
Guix buildpack for repo2docker. If a <code>guix.scm</code> or a <code>manifest.scm</code>
file is found in the source repository, repo2docker uses it to populate
the Docker image being built. Additionally—and this is a significant
difference compared to other buildpacks—software deployed with Guix
can be <em>pinned</em> at a specific revision: if a
<code>channels.scm</code> file is
found, the buildpack passes it to <code>guix time-machine</code>; this ensures that
software is deployed from the exact Guix revision specified in
<code>channels.scm</code>.</p><p>This Guix buildpack for repo2docker has been <a href="https://github.com/jupyterhub/repo2docker/pull/1048">submitted upstream and
reviewed</a>, but as
of this writing it has yet to be merged. We believe it provides another
convenient way for Jupyter Notebook users to ensure their code runs in
the right software environment.</p><h2>Ensuring Source Code Availability</h2><p>Guix lets users re-deploy software environments, for instance <em>via</em>
<a href="https://guix.gnu.org/manual/en/html_node/Invoking-guix-time_002dmachine.html"><code>guix time-machine</code></a>.
This is possible because Guix can rebuild software, which, in turn, is
only possible if source code is permanently available. <a href="https://www.softwareheritage.org/2019/04/18/software-heritage-and-gnu-guix-join-forces-to-enable-long-term-reproducibility/">Since
2019</a>
Guix developers have collaborated with Software Heritage (SWH) to make that a
reality. A lot has been achieved since then, but challenges remain before we
can be sure that SWH archives every piece of source code Guix
packages refer to.</p><p>One of the main roadblocks <a href="https://hpc.guix.info/blog/2019/03/connecting-reproducible-deployment-to-a-long-term-source-code-archive/">we identified early
on</a>
is source code archives—
<code>tar.gz</code> and similar files, colloquially known
as “tarballs”. SWH, quite sensibly, stores the <em>contents</em> of these archives,
but it does not store the archives themselves. Yet, most Guix package
definitions refer to tarballs; Guix expects to be able to download those
tarballs and to verify that they match.
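To make this concrete, here is roughly what a package's source looks like in a Guix package definition (a hypothetical sketch with a made-up URL and a placeholder hash, not a real package):

```scheme
;; Hypothetical excerpt of a package definition: Guix downloads the
;; tarball from this URL and checks it against the expected SHA256 hash.
(origin
  (method url-fetch)
  (uri "https://example.org/releases/foo-1.0.tar.gz")    ;made-up URL
  (sha256
   (base32 "0000000000000000000000000000000000000000000000000000")))
```

Note that the hash covers the tarball itself, not its extracted contents.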
How do we deal with this impedance mismatch?</p><p><img src="/static/images/blog/disarchive-swh-diagram.png" alt="Diagram showing Disarchive and Software Heritage." /></p><p>Last year, Guix developer Timothy Sample <a href="https://hpc.guix.info/blog/2021/02/guix-hpc-activity-report-2020/">had just started work to
address
this</a>.
Timothy developed a tool called
<a href="https://ngyro.com/software/disarchive.html">Disarchive</a> that supports
two operations: “disassembling” and “reassembling” tarballs. In the
former case, it extracts tar and compression metadata along with an
identifier (SWHID) pointing to contents available at SWH; in the latter
case, Disarchive assembles content and metadata to <em>recreate</em> the
tarball as it initially existed. From there we create a <em>Disarchive
database</em> that maps cryptographic hashes of tarballs to their metadata.</p><p>This year we deployed, on the Guix build farm, infrastructure to
<a href="https://ci.guix.gnu.org/jobset/disarchive">continuously build the
database</a> and to publish it
at <a href="https://disarchive.guix.gnu.org"><code>disarchive.guix.gnu.org</code></a>. We
added support in Guix so that it can use Disarchive + SWH as a fallback
when downloading a tarball from its original URL fails, significantly
improving source code archival coverage.</p><p>Beyond Guix, this work is crucial for all the deployment tools that rely
on the availability of tarballs—Brew, Gentoo, Nix, Spack, and other
package managers, but also scientific workflow tools such as
<a href="https://maneage.org/">Maneage</a> and individual <code>Dockerfile</code>s and
scripts. This led SWH and the Sloan Foundation to <a href="https://www.softwareheritage.org/2022/01/13/preserving-source-code-archive-files/">allocate a
grant</a>
so that Timothy Sample could address some of the remaining challenges.</p><p>Among those, Timothy has already been able to expand Disarchive
compression support beyond gzip—version 0.4.0 adds support for xz, the
second most popular compression format for tarballs. To have a clear
vision of the progress being made, Timothy has been publishing periodic
<em>Preservation of Guix Reports</em>. The <a href="https://ngyro.com/pog-reports/2022-01-16/">latest
one</a> shows that archival
coverage for all the Guix revisions since version 1.0.0 is at 72%; the
breakdown by revision shows that coverage reaches 86% for recent
commits. Simon Tournier has been carefully monitoring coverage and
discussing with other Guix developers and with the SWH team to identify
reasons why specific pieces of source code would not be archived.
Ludovic Courtès had the pleasure of joining the <a href="https://www.softwareheritage.org/news/events/swh5years/">SWH Fifth Anniversary
event</a>, on
behalf of the Guix team, to show all the progress made and to discuss
the road ahead.</p><h2>Tuning Packages for a CPU</h2><p>GNU Guix is now well known for supporting “reproducibility”, which is
really twofold: it is first the ability to re-deploy the same software
stack on another machine or at a different point in time, and second the
ability to <em>verify</em> that binaries being run match the source code—the
latter is what <a href="https://reproducible-builds.org/docs/definition/">reproducible
builds</a> are concerned
with.</p><p><img src="/static/images/blog/cpu-tuning-poster.png" alt="Illustration of CPU tuning." /></p><p>However, in HPC circles there is the entrenched perception that
reproducibility is antithetical to performance. Practitioners are
especially concerned with the performance of the Message Passing
Interface (MPI) implementations on high-speed network devices, and with
the ability of code to use single-instruction/multiple-data (SIMD)
extensions of the latest CPUs—such as AVX-512 on x86_64, or NEON on ARMv8. We
showed that these concerns are largely unfounded in a <a href="https://hpc.guix.info/blog/2018/01/pre-built-binaries-vs-performance/">2018 article on
achieving performance with portable
binaries</a>
and in a <a href="https://hpc.guix.info/blog/2019/12/optimized-and-portable-open-mpi-packaging/">2019 article on
Open MPI</a>.</p><p>The former article showed how performance-sensitive C code is already
taking advantage of <em>function multi-versioning</em> (FMV). There remain
cases, though, where this technique is not applicable. As a result,
GNU/Linux distributions—from Guix to Debian and CentOS—that distribute
binaries built for the <em>baseline</em> x86_64 architecture miss out on SIMD
optimizations. A notorious example of packages that do not support FMV
is C++ header-only libraries, such as the Eigen linear algebra library.</p><p>To address this, <a href="https://hpc.guix.info/blog/2022/01/tuning-packages-for-a-cpu-micro-architecture/">we introduced what we call <em>package
multi-versioning</em></a>:
with the new <code>--tune</code> package transformation option, Guix users can
obtain a package variant specifically tailored for the host CPU. Yet,
users can avoid time-consuming local builds if a pre-built
binary for the same CPU variant is available on-line.</p><p>While building a package with <code>-march=native</code> (instructing the compiler
to optimize for the CPU of the build machine) leaves no trace, the use of
Guix’s <code>--tune</code> is properly recorded in metadata. For example, a Docker
image built with <code>guix pack --tune --save-provenance</code> contains, in its
metadata, the CPU type for which it was tuned, allowing for independent
verification of its binaries. This is to our knowledge the first
implementation of CPU tuning that does not sacrifice reproducibility.</p><h2>Packaging</h2><p>The package collection that comes with Guix keeps growing. It now contains
more than 20,000 curated packages, including many scientific packages ranging from
run-time support software such as implementations of the Message Passing
Interface (MPI), to linear algebra software, to statistics and bioinformatics
modules for R.</p><p>The Julia programming language has been gaining traction in the
scientific community and efforts in Guix reflect that momentum. At the time of the previous report,
February 2021, Guix included a dozen Julia packages. Today, January 2022,
it includes more than 260 Julia packages, from bioinformatics software
such as BioSequences.jl to
machine learning software like Zygote.jl. Under the hood, the Julia build
system in Guix has been
improved; in particular, it now supports both parallel builds and
parallel tests, providing a significant speedup. It also
allows the built-in Julia package manager <code>Pkg</code> to find packages
already installed by Guix.</p><p>In 2021, we added the popular PyTorch machine learning framework to our
package collection. While it had long been available <em>via</em> <code>pip</code>, the
Python package manager, we highlighted <a href="https://hpc.guix.info/blog/2021/09/whats-in-a-package/">in a blog
post</a> things that we as users do not
notice about packages: what is <em>inside</em> of them, and the work behind it.
We showed that the requirements for Guix packages to build software from
source and to avoid bundling external dependencies are key to
transparency, auditability, and provenance tracking—all of which are
ultimately the foundations of reproducible research.</p><p>Many scientific packages were upgraded: the Dune finite element
libraries have been updated to 2.7.1, the <a href="https://hpc.guix.info/package/python-pygmsh">Python bindings to
Gmsh</a> were updated to
7.1.11, <a href="https://hpc.guix.info/packages/petsc">PETSc</a> and related
packages were updated to 3.16.1, to name a few. Run-time support
packages such as MPI libraries also received a number of updates.</p><p>Statistical and bioinformatics packages for the R programming language
have seen regular comprehensive upgrades, closely following updates to
the popular CRAN and Bioconductor repositories. At the time of this
writing Guix provides a collection of more than 1900 reproducibly
built R packages, making R one of the best supported programming
environments in Guix.</p><p>Core packages have seen important changes; in particular, packages are
now built with GCC 10.3 by default (instead of 7.5), using the GNU C
Library version 2.33. The style of package inputs has been
<a href="https://guix.gnu.org/en/blog/2021/the-big-change/">considerably
simplified</a>; together
with the introduction of <a href="https://guix.gnu.org/manual/devel/en/html_node/Invoking-guix-style.html"><code>guix style</code></a>
for automatic formatting, we hope it will make it easier to get started
writing new packages.</p><h2>Supporting POWER9 and RISC-V CPUs</h2><p>In April 2021, Guix <a href="https://guix.gnu.org/en/blog/2021/new-supported-platform-powerpc64le-linux/">gained support for POWER9
CPUs</a>,
a platform that some HPC clusters build upon. While support in Guix—and
in the broader free software stack—is not yet on par with that of
x86_64, it is gradually improving. The project’s build farm now has two
beefy POWER9 build machines.</p><p>While it is perhaps early days to call RISC-V an HPC platform, there are indicators that this may happen in the near future with investments from the <a href="https://www.tomshardware.com/news/risc-v-cluster-demonstrated">USA</a>, the <a href="https://www.european-processor-initiative.eu/epi-epac1-0-risc-v-test-chip-samples-delivered/">EU</a>, India, and China.</p><p>Together with Chris Batten of Cornell and Michael Taylor of the University of Washington, Erik Garrison and Pjotr Prins are UTHSC PIs responsible for creating a new NSF-funded <a href="https://news.cornell.edu/stories/2021/11/5m-grant-will-tackle-pangenomics-computing-challenge">RISC-V supercomputer for pangenomics</a>. It will incorporate GNU Guix and the <a href="https://guix.gnu.org/en/blog/2019/guix-reduces-bootstrap-seed-by-50/">GNU Mes bootstrap</a>, with input from Arun Isaac, Efraim Flashner and others. <a href="https://nlnet.nl/project/current.html">NLNet</a> is also funding the GNU Mes RISC-V bootstrap project with Ekaitz Zarraga and Jan Nieuwenhuizen. We aim to continue adding RISC-V support to GNU Guix at a rapid pace.</p><p>Why is the combination of GNU Mes and GNU Guix exciting for RISC-V? First of all, RISC-V is a very modern modular open hardware architecture that provides further guarantees of transparency and security. It extends reproducibility to the transistor level and for that reason generates interest from the Bitcoin community, for example. Because there are no licensing fees involved, RISC-V is already a major force in IoT and will increasingly penetrate hardware solutions, such as storage microcontrollers and network devices, going all the way to GPU-style parallel computing and many-core solutions with thousands of cores on a single die. 
GNU Mes and GNU Guix are particularly suitable for RISC-V because Guix can optimize generated code for different RISC-V targets and is able to parameterize deployed software packages for included/excluded RISC-V modules.</p><h2>On the way to a reproducible PhD thesis</h2><p>GNU Guix and <a href="https://www.orgmode.org">Org mode</a> form a powerful association
when it comes to setting up a PhD thesis workflow. On one hand, GNU Guix allows
us to ensure an experimental software environment is reproducible across various
high-performance testbeds. On the other hand, we can take advantage of the
literate programming paradigm using Org mode to describe the experimental
environment as well as the experiments themselves, then post-process and reuse
the results in final scientific publications.</p><p>The
<a href="https://mfelsoci.gitlabpages.inria.fr/thesis/">ongoing work of Marek Felšöci</a>
at Inria is an attempt at a <strong>reproducible PhD thesis</strong> relying on the
conjunction of GNU Guix and Org mode. The thesis project resides in a Git
repository where a dedicated Org file describes and explains all of the source
code and procedures involved in the construction of the experimental software
environment, the execution of experiments as well as the gathering and the
post-processing of the results. This includes a Guix channel file, scripts for
running the experiments, parsing the output logs, producing figures and so on.</p><p>Other Org documents of the repository may then build on these results and
produce the final publications, such as research reports, articles and
slideshows, in various formats. As an existing publication example we can cite
the <a href="https://hal.inria.fr/hal-03263603">research report #9412</a> and the
associated <a href="https://hal.inria.fr/hal-03263620">technical report #0513</a> providing
a literate description of the environment and of the experiments that the study
presented in the research report relies on.</p><p><img src="/static/images/blog/reproducible-thesis-structure.svg" alt="Partial structure of the repository containing the thesis" /></p><p>In the end, the entire process of setting up the software environment, running
experiments, post-processing results and publishing documents is automated
using continuous integration.</p><p><img src="/static/images/blog/reproducible-thesis-ci.svg" alt="Simplified continuous integration scheme" /></p><p>The result of the continuous integration is
<a href="https://mfelsoci.gitlabpages.inria.fr/thesis/">publicly available</a> as a
collection of web pages and PDF documents hosted using
<a href="https://docs.gitlab.com/ce/user/project/pages/">GitLab Pages</a>.</p><p>The initiative does not stop here. There is an effort to transform this monolithic
setup into independent modules with the aim of sharing and reusing portions of the
setup in other projects within the research team.</p><h2>Feedback from using Guix to ensure reproducible HPC experiments</h2><p>Philippe Swartvagher (Inria) took the opportunity of writing an article on
the impact of execution tracing on complex HPC
applications to discover how GNU Guix could help perform reproducible
experiments. The article studies the impact of tracing on application
performance, evaluates solutions to reduce this impact, and explores clock
synchronization issues when distributed applications are traced. The paper is
still under review.</p><p>The software stack considered in the article is made of several libraries
(<a href="https://starpu.gitlabpages.inria.fr/">StarPU</a>,
<a href="https://solverstack.gitlabpages.inria.fr/chameleon/">Chameleon</a>,
<a href="https://pm2.gitlabpages.inria.fr/">PM2</a> and
<a href="https://savannah.nongnu.org/projects/fkt">FxT</a>), all of them being already
packaged in GNU Guix, in the Guix-HPC channel. Manually installing this
software stack can be painful, the set of compilation options is wide and
desired options can change from an experience to another, to see their impact.
Correctly compiling the software stack before each experiment and
tracking its current state can be pretty tedious.</p><p>This source of headaches disappears with GNU Guix, especially with the help of
<a href="https://guix.gnu.org/en/manual/en/html_node/Package-Transformation-Options.html">package
transformations</a>.
For instance, <code>--with-input</code> allowed us to use PM2 instead of Open MPI
as the communication engine, <code>--with-commit</code> was handy to select a
specific commit of a library (for instance to compare performance before
and after a specific change), and <code>--with-patch</code> was
convenient to apply code modifications for specific experiments (for
instance modifications not suited to be included upstream, but required for the
experiment). These package transformations, used with <code>guix environment</code> (the
predecessor of <a href="https://guix.gnu.org/en/manual/devel/en/html_node/Invoking-guix-shell.html"><code>guix shell</code></a>),
remove the burden of compiling the correct version of each piece of software before
each experiment.</p><p>This intense use of package transformations exercised some corner cases of
GNU Guix and <a href="https://issues.guix.gnu.org/49697">raised</a>
<a href="https://issues.guix.gnu.org/49696">several</a>
<a href="https://issues.guix.gnu.org/50335">issues</a>.</p><p>To ensure reproducibility of experiments made with GNU Guix, software versions
have to be pinned and saved along with scripts to launch the experiments.
<a href="https://guix.gnu.org/en/manual/en/html_node/Invoking-guix-describe.html"><code>guix describe</code></a>
and <a href="https://guix.gnu.org/en/manual/en/html_node/Invoking-guix-time_002dmachine.html"><code>guix time-machine</code></a>
are the two Guix commands to pin revisions and execute
applications built from these precise revisions. Making the experimental
scripts <a href="https://gitlab.inria.fr/pswartva/paper-starpu-traces-r13y">publicly
available</a> is
another step toward a reproducible article. It requires us to clearly
organize experiments, describe their goals and workings, and ensure maximum
independence from cluster specificities (or document which changes are
necessary to launch the experiments on another cluster). When the repository
describing the experiments is complete, archiving it on Software Heritage and
providing the obtained ID in the paper to easily retrieve the scripts is
effortless.</p><p>This first paper with GNU Guix was a great opportunity to discover the help
provided by GNU Guix, its ecosystem and support. It also showed areas where
documentation can be improved regarding the workflow to ensure reproducibility
of the experiments—from using <code>guix describe</code> to pin versions, to obtaining
an ID to easily cite the scripts in a paper. Moreover, there are still pending
questions about the best way to generalize experimentation scripts and make them
independent from the clusters being used—e.g., how to deal with different
job schedulers and file systems, and how to provide
instructions to replicate experiments even <em>without</em> Guix.</p><h1>Cluster Usage and Deployment</h1><p>At UTHSC, Memphis (USA), we are running an 11-node large-memory <a href="http://genenetwork.org/facilities/">HPC Octopus cluster</a> (264 cores) dedicated to pangenome and genetics research. In 2021 more SSDs and RAM were added. Notably, this cluster is <em>administered by the users themselves</em>. Thanks to GNU Guix we install, run and manage the cluster as researchers (and roll back in case of a mistake). UTHSC IT manages the infrastructure, i.e., physical placement, routers and firewalls, but beyond that there are no demands on IT. Thanks to out-of-band access we can completely (re)install machines remotely. Octopus runs GNU Guix on top of a minimal Debian install and we are experimenting with pure GNU Guix nodes that can be run on demand. LizardFS is used for distributed network storage. Almost all deployed software has been packaged in GNU Guix and can be installed by regular users on the cluster without root access.</p><p>At GLiCID (Nantes, France) we are in the process of merging two existing
HPC clusters (10,000+ cores). The first cluster (based on Slurm +
CentOS) has offered the <code>guix</code> command to our users, as well as some
specific software from our own Guix channel, for a few years now. This
merger involves a lot of change, including identity management. We
wanted to take advantage of this profound change to be more ambitious
and explore automated generation of part of the core infrastructure,
using virtual machines generated by <code>guix system</code>, deployed on
KVM+Ceph. We aim to eventually replace as many of these deployed machines
as possible, adjusting Guix system services and implementing new ones
as we go, benefiting the wider community.</p><h1>Outreach and User Support</h1><h2>Articles</h2><p>The following articles were published in the <a href="https://www.societe-informatique-de-france.fr/bulletin/1024-numero-18/">November 2021 edition of
<em>1024</em></a>,
the magazine of the <em>Société Informatique de France</em> (SIF), the French
computer science society:</p><ul><li>Konrad Hinsen, <a href="https://doi.org/10.48556/SIF.1024.18.11"><em>La Reproductibilité des calculs
coûteux</em></a></li><li>Ludovic Courtès, <a href="https://dx.doi.org/10.48556/SIF.1024.18.15"><em>Reproduire les environnements logiciels : un
maillon incontournable de la recherche
reproductible</em></a></li></ul><p>This article appeared in the <a href="https://connect.ed-diamond.com/GNU-Linux-Magazine/GLMFHS-113">March 2021 special edition of
French-speaking <em>GNU/Linux Magazine
France</em></a>:</p><ul><li>Ludovic Courtès, <a href="https://hal.inria.fr/hal-03418210"><em>Déploiements reproductibles dans le temps avec
GNU Guix</em></a></li></ul><p>The following article introducing the most recent addition to the PiGx
framework of reproducible workflows backed by Guix is awaiting
peer-review and has been submitted to the medRxiv preprint server:</p><ul><li>Vic-Fabienne Schumann et al., <a href="https://doi.org/10.1101/2021.11.30.21266952"><em>COVID-19 infection dynamics
revealed by SARS-CoV-2 wastewater sequencing analysis and
deconvolution</em></a></li></ul><h2>Talks</h2><p>Since last year, we gave the following talks at the following venues:</p><ul><li><a href="https://jcad2021.sciencesconf.org/resource/page/id/8">JCAD conference,
Dec. 2021</a>
(Ludovic Courtès)</li><li><a href="https://events.unesco.org/event?id=1423818652&lang=1033">Software Heritage Fifth Anniversary, joint event with
UNESCO, Nov. 2021</a>
(Ludovic Courtès)</li><li><a href="https://trex-coe.eu/events/trex-build-system-hackathon-8-12-nov-2021">TREX Build System Hackathon,
Nov. 2021</a>
(Ludovic Courtès)</li><li><a href="https://packaging-con.org/">PackagingCon, Nov. 2021</a> (Ludovic Courtès)</li><li><a href="https://reproducibility.gricad-pages.univ-grenoble-alpes.fr/web/medias_251121.html#medias_251121">“<em>Pour une recherche reproductible</em>”, MAiMoSIne, SARI, GRICAD,
Nov. 2021</a>
(P.-A. Bouttier)</li><li><a href="https://www.rd-alliance.org/plenaries/rda-18th-plenary-meeting-virtual/software-source-code-and-reproducibility">RDA 18th plenary, Software Source Code, Nov. 2021</a> (P.-A. Bouttier)</li><li><a href="https://datascience.cancer.gov/news-events/events/reproducible-fair-workflows-and-ccwl">Reproducible FAIR+ Workflows and the CCWL, at the US NIH National
Cancer Institute in
Oct. 2021</a>
(Pjotr Prins, Arun Isaac)</li><li><a href="https://www.societe-informatique-de-france.fr/journee-reproductibilite/">special event on reproducibility of the <em>Société Informatique de
France</em> (French computer science society), May
2021</a>
(Konrad Hinsen, Ludovic Courtès)</li></ul><p>We also organised the following events:</p><ul><li>the first <a href="https://hpc.guix.info/events/2021/atelier-reproductibilité-environnements/">on-line workshop on the reproducibility of software
environments</a>
for French-speaking scientists, engineers, and system
administrators, on May 17–18th, 2021 with up to 80 participants.</li><li><a href="https://archive.fosdem.org/2021/schedule/track/declarative_and_minimalistic_computing/">“Declarative and minimalistic computing”
track</a>
at FOSDEM</li></ul><h2>Training Sessions</h2><p>A training session on computational reproducibility for high-energy
physics took place at the Centre de Physique des Particules de
Marseille in April/May 2021. It included a hands-on session about Guix.</p><p>For the French HPC Guix community, we have set up a monthly on-line
event called <a href="https://hpc.guix.info/events/2021/café-guix/">“Café
Guix”</a>, started in
October 2021. Each month, a user or developer informally presents a
Guix feature or workflow and answers questions.</p><h1>Personnel</h1><p>GNU Guix is a collaborative effort, receiving contributions from more
than 90 people every month—a 50% increase compared to last year. As
part of Guix-HPC, participating institutions have dedicated work hours
to the project, which we summarize here.</p><ul><li>Inria: 2 person-years (Ludovic Courtès and the contributors to the
Guix-HPC channel: Emmanuel Agullo, Marek Felšöci, Nathalie Furmento,
Hugo Lecomte, Gilles Marait, Florent Pruvost, Matthieu Simonin,
Philippe Swartvagher)</li><li>Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC): 2 person-years
(Ricardo Wurmus and Mădălin Ionel Patrașcu)</li><li>University of Tennessee Health Science Center (UTHSC): 3+ person-years (Efraim Flashner, Bonface Munyoki, Fred Muriithi, Arun Isaac, Jorge Gomez, Erik Garrison and Pjotr Prins)</li><li>Utrecht Bioinformatics Center (UBC): 0.1 person-year (Roel Janssen)</li><li>University of Paris (Diderot): 0.5 person-year (Simon Tournier)</li></ul><h1>Perspectives</h1><p>Guix availability on scientific computing clusters remains a top priority.
More HPC practitioners—researchers, engineers, and system administrators—are
adopting Guix and showing interest, from reproducible research to flexible
deployment of virtual machines. We expect to continue to work on these two
complementary fronts: streamlining the use of reproducible packs, and reaching
out to system administrators and cluster users, notably through training
sessions.</p><p>Upstream, we will continue to work with Software Heritage with the goal of
achieving complete archive coverage of the source code Guix refers to. We have
identified challenges related to source code availability; this will probably
be one of the main efforts in this area for the coming year.</p><p>Downstream, a lot of work has happened in the area of reproducible research
tools. Our package collection has grown to include more and more
scientific tools. Tools like the Guix Workflow Language and Guix-Jupyter have
matured; along with the <a href="https://www.psychnotebook.org/">PsychNotebook
service</a>, they bridge the gap between
reproducible software deployment and reproducible scientific tools and
workflows. We also showed how to achieve high performance while
preserving provenance tracking, which we hope dispels the entrenched perception in HPC circles
that reproducibility and performance are antithetical.</p><p>Our work happens in a context of growing awareness of the importance of
software and software environments in research workflows. <a href="https://en.unesco.org/science-sustainable-future/open-science/recommendation">UNESCO’s
Recommendation on Open
Science</a>
and, for example, the <a href="https://www.ouvrirlascience.fr/second-national-plan-for-open-science/">Second French Plan for Open
Science</a>
are two illustrations of that.</p><p>We gave demonstrations of what Guix brings to scientific workflows
and we expect to continue to show that reproducible
scientific workflows are <em>indeed</em> a possibility.
Working on the tools and workflows directly in the hands of scientists will be
a major focus of the coming year. We want to contribute to raising the bar of
what scientists come to expect in terms of reproducible workflows.</p><p>There’s a lot we can do and we’d love to <a href="https://hpc.guix.info/about">hear your ideas</a>!</p><h1>ccwl for concise and painless CWL workflows</h1><p><em>Arun Isaac, guix-devel@gnu.org, 2022-01-10</em></p><p>In modern science, analysis is required to process data. When the data-flow
is linear, such a process is easily represented by tools such as the standard
<a href="https://en.wikipedia.org/wiki/Pipeline_(Unix)">Unix pipeline</a>. However, this
data-flow is often modeled by a <a href="https://en.wikipedia.org/wiki/Directed_graph">directed
graph</a>: each processing node may
have one or more inputs and the outputs may be directed to different
processing nodes. This directed graph, used in many fields, including
bioinformatics, medical imaging, and astronomy, is called a
<a href="https://en.wikipedia.org/wiki/Bioinformatics_workflow_management_system"><em>workflow</em></a>.</p><p>The <a href="https://www.commonwl.org/">Common Workflow Language (CWL)</a> is a
specification to describe computational workflows that makes them easy to
reproduce and to port to different hardware and software environments. But, why
do we need workflow languages such as CWL? Why will a simple shell script or
a Makefile not suffice?</p><h1>Why not shell scripts?</h1><h2>Housekeeping tasks</h2><p>With shell scripts, you need to not only code the actual command invocations
but also add a lot of boilerplate to perform housekeeping tasks such as
managing intermediate inputs/outputs. This makes the script hard to read and
the logic of the pipeline less obvious. Even with a Makefile, the programmer
needs to explicitly handle cleanup tasks, typically with a <code>clean</code> target.</p><p>Workflow languages allow the programmer to focus only on the actual command
invocations—the essence—of the workflow and let the workflow language deal
with the housekeeping tasks. For instance, CWL automatically deals with input
and output files produced by a command, and ensures that only the necessary
intermediate files are exposed to the next command.</p><p>When there is an error in a step, shell scripts usually leave the user with
arcane error messages, or worse, mindlessly march on as though nothing went
wrong. But workflow languages can clearly indicate which step failed.</p><h2>Portability to different software and hardware environments</h2><p>Workflows often need to be deployed to different software and hardware
environments—to a cluster, to containers in the cloud, etc. When a shell
script workflow needs to be deployed in a new environment, it will most likely
need to be tweaked a little. Even Makefiles invoke commands using a shell,
and thus suffer from the same portability issues. Workflow languages, on the
other hand, aim to handle this transparently. This leads to higher confidence
in the workflow, and allows a wide community to reproduce and deploy the
workflow easily.</p><h2>Data types, type conversion and static type checking</h2><p>For better or for worse, due to historical reasons, shells (and by extension,
Makefiles) revolve around only a single data type—the string. For instance,
all command line arguments passed into a shell script, or indeed any other
command, are strings. Some of these strings really are just text, but
often they represent numbers, names of files, etc. It is up to the
programmer to convert these string arguments to suitable types, and deal with
any errors that may arise in that conversion.</p><p>Workflow languages can handle this type conversion automatically. For
example, they can ensure arguments representing numbers indeed contain only
digits, or that there indeed exist files whose names are mentioned in the
arguments. And some workflow languages such as CWL,
<a href="https://github.com/tweag/funflow">funflow</a> and
<a href="https://github.com/pveber/bistro">bistro</a> even have static typing so that
typing errors can be detected at compile-time, instead of at run-time.</p><h2>Human-readable and machine-readable</h2><p>And finally, workflow languages need to be easy not just for a human to read
and write, but also for machines to inspect. For instance, it should be
tractable for a computer to read a workflow and generate a graphical
visualization of the steps to be executed and the dependencies between those
steps. This is where CWL stands out. Another way to understand this is that
it is possible to automatically convert a CWL workflow into a shell script,
but not the other way around. In this regard, Makefiles are a little better
than shell scripts. But, with their many complex features to ease
human-writability, Makefiles sacrifice machine-readability.</p><h1>So, what's wrong with CWL?</h1><p>So, CWL has all these nice properties. Why do we need anything else?</p><h2>Limitations of YAML</h2><p>CWL is, in effect, a special purpose programming language built into YAML
syntax. CWL is fundamentally limited by this constraint, and often has
verbose constructs to express relatively simple ideas. For example, there are
at least three different fields that together build up the command to be
executed!</p><h2>Too many files</h2><p>Even simple workflows have to be spread out over multiple files. Each command
or step in the workflow needs its own CWL file. And all these individual
commands need to be wired up together in another CWL file that specifies the
overall workflow. Human short-term memory is limited, and if one has to
juggle several files and associated tabs/buffers, the overhead is often
too much.</p><h1>Why ccwl?</h1><p>What if instead of manually writing a CWL workflow, we could treat CWL as a
compilation target and auto-generate it? We would then be free to use a more
human-friendly frontend language without losing any of the machine-readability
of CWL. This is exactly what <a href="https://ccwl.systemreboot.net/">ccwl, the Concise Common Workflow
Language,</a> does.</p><p>ccwl is a domain-specific language embedded into <a href="https://www.gnu.org/software/guile/">GNU
Guile</a>, a Scheme implementation. Lisp
dialects such as Scheme are programmable programming languages and among the
few that allow you to directly hack the compiler. As such, Scheme is extremely
well suited to embedding domain-specific languages.</p><p>To the uninitiated, writing in a lisp may seem less <em>human-friendly</em> than
writing in YAML. But, if you try it, you might like it so much that you'll
never want to write anything else! And, if you're not convinced, there's
always <a href="https://www.draketo.de/software/wisp">wisp</a>, a Python-like
whitespace-significant syntax for GNU Guile. In fact, this is what the <a href="https://guixwl.org/">Guix
Workflow Language (GWL)</a>, another excellent workflow
language written in GNU Guile, favors.</p><h2>Human-readable and writable</h2><p>For the user, ccwl aims to be as easy to write as a shell script, or at least
a Makefile. But, by compiling to CWL, ccwl preserves all the benefits of CWL.</p><h2>Compile-time error checking</h2><p>Detecting errors as early as possible, preferably at compile time,
significantly improves the user experience. There is nothing more frustrating
than running a long workflow for several hours, only to have it error out
midway and be forced to restart all over again without knowing for sure if
it will succeed this time. ccwl, by virtue of the very hackable Scheme
compiler that it is built on, aims to provide excellent compile-time error
checking along with source references. ccwl isn't quite there yet, but
hopefully will be in the coming releases.</p><h2>Interface with external CWL workflows</h2><p>Not everybody might convert to ccwl. And often, it will be necessary to reuse
CWL workflows written by others. ccwl is pragmatic and allows calling
external CWL workflows as part of a larger ccwl workflow. If CWL grows to
become a common compilation target for many different workflow languages, this
feature could enable seamless collaboration between communities.</p><h2>Pre-packaged commands</h2><p>In the future, ccwl might also provide pre-packaged ccwl commands for
commonly used tools in bioinformatics, astronomy, etc. so that the
user is freed from having to write these wrappers and can instead
focus on writing only the workflow.</p><h2>Reproducibility with GNU Guix</h2><p>ccwl leaves all the hard work of reproducibility in Guix's capable hands. CWL
(and, by consequence, ccwl) is agnostic to deployment. As long as a tool can
be found in PATH, it does not care how that tool got there. This
means we can offload all reproducibility responsibilities to Guix. We could
simply fire up a Guix shell with the required packages in the environment, and
run our workflow from within that environment. If we pin the Guix commit
we are running from, we can perfectly reproduce our workflow.</p><pre><code>$ guix shell ccwl cwltool package1 package2 ...
[env]$ ccwl compile workflow.scm > workflow.cwl
[env]$ cwltool workflow.cwl</code></pre><p>In contrast, the <a href="https://guixwl.org/">Guix Workflow Language (GWL)</a> uses Guix
internally to prepare a reproducible environment. It is thus deployment-aware
and tied to Guix.</p><h2>A taste of ccwl</h2><p>This article is not a ccwl tutorial. So, we will stop short of describing how
to write your own ccwl workflows. But, just to provide a taste for the
syntax, here is an example spell check workflow from the ccwl manual, followed
by a graphical visualization of it.</p><pre><code class="language-scheme">(define split-words
(command #:inputs text
#:run "tr" "--complement" "--squeeze-repeats" "A-Za-z" "\\n"
#:stdin text
#:outputs (words #:type stdout)))
(define downcase
(command #:inputs words
#:run "tr" "A-Z" "a-z"
#:stdin words
#:outputs (downcased-words #:type stdout)))
(define sort
(command #:inputs words
#:run "sort" "--unique"
#:stdin words
#:outputs (sorted #:type stdout)))
(define find-misspellings
(command #:inputs words dictionary
#:run "comm" "-23" words dictionary
#:outputs (misspellings #:type stdout)))
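;; The `workflow' form below wires the commands above into a graph:
;; `pipe' chains steps sequentially, `tee' runs branches in parallel,
;; and `rename' gives an intermediate output a new name for later
;; steps.  (sort-words) and (sort-dictionary) label two separate
;; instances of the same `sort' command.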
(workflow (text-file dictionary)
(pipe (tee (pipe (split-words #:text text-file)
(downcase #:words words)
(sort (sort-words) #:words downcased-words)
(rename #:sorted-words sorted))
(pipe (sort (sort-dictionary) #:words dictionary)
(rename #:sorted-dictionary sorted)))
(find-misspellings #:words sorted-words
#:dictionary sorted-dictionary)))</code></pre><p><img src="/static/images/blog/spell-check.svg" alt="Spell-check workflow visualized as a graph" /></p><h1>Contact</h1><p>ccwl development happens <a href="https://github.com/arunisaac/ccwl">on GitHub</a>.
Please do drop by to raise issues and offer suggestions. You may also peruse
the <a href="https://ccwl.systemreboot.net/manual/dev/en/">ccwl manual</a> for a detailed
introduction to ccwl. Thank you!</p><h1>Tuning packages for a CPU micro-architecture</h1><p><em>Ludovic Courtès, guix-devel@gnu.org, 2022-01-06</em></p><p>It should come as no surprise that the execution speed of programs is a
primary concern in high-performance computing (HPC). Many HPC
practitioners would tell you that among their top concerns are the
performance of high-speed networks used by the Message Passing Interface
(MPI) and the use of the latest vectorization extensions of modern CPUs.</p><p>This post focuses on the latter: tuning code for specific CPU
micro-architectures, to reap the benefits of modern CPUs, with the
introduction of a new tuning option in Guix. But first, let us consider
this central question in the HPC and scientific community: can
“reproducibility” be achieved <em>without</em> sacrificing performance? Our
answer is a resounding “yes”, but that deserves clarifications.</p><h1>Reproducibility & high performance</h1><p>The author remembers advice heard at the beginning of their
career in HPC—advice still given today:
that to get optimal MPI performance, you would have
to use the vendor-provided MPI library; that to get your code to perform
well on this new cluster, you would have to recompile the complete software
stack locally; that using generic, pre-built binaries from a GNU/Linux
distribution just won’t give you good performance.</p><p>From a software engineering viewpoint, this looks like a sad situation
and an inefficient approach, dismissing the benefits of automated
software deployment as pioneered by Debian, Red Hat, and others in the
90’s or, more recently, as popularized with container images. It also means doing away
with reproducibility, where “reproducibility” is to be understood in two
different ways: first as the ability to re-deploy the same software
stack on another machine or at a different point in time, and second as
the ability to <em>verify</em> that binaries being run match the source
code—the latter is what <a href="https://reproducible-builds.org/docs/definition/">reproducible
builds</a> are concerned
with.</p><p>But does it really have to be this way? Engineering efforts to support
<em>performance portability</em> suggest otherwise. We saw earlier that an MPI
implementation like Open MPI, today, <a href="https://hpc.guix.info/blog/2019/12/optimized-and-portable-open-mpi-packaging/">does achieve performance
portability</a>—that
it takes advantage of the high-speed networking hardware at run-time
without requiring recompilation.</p><p>Likewise, <a href="https://hpc.guix.info/blog/2018/01/pre-built-binaries-vs-performance/">in a 2018
article</a>,
we looked at how generic, pre-built binaries can and indeed often do
take advantage of modern CPUs by selecting at run-time the most
efficient implementation of performance-sensitive routines for the host
CPU. The article also highlighted cases where this is <em>not</em> the case;
these are those we will focus on here.</p><h1>The jungle of SIMD extensions</h1><p>While major CPU architectures such as x86_64, AArch64, and POWER9 were
defined years ago, CPU vendors regularly extend them. Extensions that
matter most in HPC are vector extensions: <a href="https://en.wikipedia.org/wiki/SIMD">single instruction/multiple
data</a> instructions and registers.
In this area, a <em>lot</em> has happened on x86_64 CPUs since the baseline
instruction set architecture (ISA) was defined. As shown in the diagram
below, Intel and AMD have been tacking ever more powerful SIMD
extensions to their CPUs over the years, from
<a href="https://en.wikipedia.org/wiki/SSE3">SSE3</a> to
<a href="https://en.wikipedia.org/wiki/AVX-512">AVX-512</a>, leading to a wealth of
CPU “micro-architectures”.</p><p><img src="/static/images/blog/cpu-simd-extensions.png" alt="Overview of x86_64 SIMD extensions" /></p><p>This gives a high-level view, but just looking at generations of Intel
processors by their code name shows an already more complicated story:</p><p><img src="/static/images/blog/cpu-intel-families.png" alt="Overview of Intel CPU families." /></p><p>Linear algebra routines that scientific software relies on greatly
benefit from SIMD extensions. For example, on a modest Intel Core i7
processor (of the Skylake generation, which supports AVX2), the
AVX2-optimized version of the dense matrix multiplication routines of
<a href="https://eigen.tuxfamily.org">Eigen</a>, built with GCC 10.3, peaks at
≅40 Gflops/s, compared to ≅11 Gflops/s for its baseline x86_64
version—four times faster!</p><h1>When function multi-versioning isn’t enough</h1><p>In our <a href="https://hpc.guix.info/blog/2018/01/pre-built-binaries-vs-performance/">2018
post</a>,
we contemplated <em>function multi-versioning</em> (FMV) as the solution to
performance portability: the implementation provides multiple versions
of “hot” routines, one for each relevant CPU micro-architecture, and
picks the best one for the host CPU at run time. Many pieces of
performance-critical software already use this technique; software that
doesn’t do that yet can easily do so thanks to compiler toolchain
support.</p><p>To make the case for FMV, we wanted to see what it would take us to
actually add FMV support to code that would benefit from it. In the
spirit of the <a href="https://github.com/clearlinux/make-fmv-patch">Clear Linux automatic FMV patch
generator</a>, we wrote an
<a href="https://gitlab.inria.fr/guix-hpc/function-multi-versioning">automatic FMV tool for
Guix</a>: you
would give it a package name, and it would:</p><ol><li><p>Build the package with the <code>-fopt-info-vec</code> compiler flag to gather
information about vectorization opportunities and their source code
location.</p></li><li><p>Generate a patch that, for each C function with vectorization
opportunities, adds the <a href="https://gcc.gnu.org/onlinedocs/gcc-11.2.0/gcc/Common-Function-Attributes.html#index-target_005fclones-function-attribute"><code>target_clones</code>
attribute</a>
to generate a couple of vectorized versions—generic, AVX2, and
AVX-512.</p></li><li><p>Build the package with this FMV patch.</p></li></ol><p>The tool can successfully FMV-patch a variety of packages written in C,
such as the <a href="https://www.gnu.org/software/gsl">GNU Scientific Library</a>,
which contains plain sequential implementations of a variety of math
routines. It was an exciting engineering experiment… but we found it to
be all too often inapplicable, for two reasons: performance-critical
software already does FMV, or it’s not written in C.</p><p>We realized there’s a common pattern where FMV isn’t applicable, or at
least isn’t applied: C++ header-only libraries. There’s no shortage of
C++ header-only math libraries providing hand-optimized SIMD versions of
their routines or otherwise supporting SIMD programming:
<a href="https://eigen.tuxfamily.org">Eigen</a>,
<a href="https://github.com/aff3ct/MIPP">MIPP</a>,
<a href="https://github.com/QuantStack/xsimd">xsimd</a> and
<a href="https://xtensor.readthedocs.io/en/latest/">xtensor</a>, <a href="https://github.com/simd-everywhere/simde">SIMD Everywhere
(SIMDe)</a>,
<a href="https://github.com/google/highway">Highway</a>, and many more (C++
meta-programming for SIMD appears to be an attractive engineering
effort). All these, except Highway, have in common that they do <em>not</em>
support FMV and run-time implementation selection. Since they “just”
provide headers, it is up to <em>each</em> package using them to figure out
what to do in terms of performance portability.</p><p>In practice though, software using these C++ header-only libraries
rarely makes provisions for performance portability. Thus, when compiling those
packages for the baseline ISA, one misses out on all the vectorized
implementations <a href="https://gitlab.com/libeigen/eigen/-/tree/master/Eigen/src/Core/arch">that libraries like Eigen
provide</a>.
This is a known issue <a href="https://gitlab.com/libeigen/eigen/-/issues/2344">in search of a
solution</a>. It is a bit
of a problem considering, for instance, the sheer number of packages
depending on Eigen:</p><p><img src="/static/images/blog/eigen-dependents.svg" alt="Graph showing packages that directly depend on Eigen." /></p><p>Fundamentally, run-time dispatch is at odds with the all-compile-time
approach that header-only C++ template libraries are about.
Furthermore, Eigen, for example, supports fine-grain vectorization; it
may be used to operate on small matrices, as is common in computer
graphics, and in that case inlining matrix operations is key to good
performance—run-time dispatch would have to be done at a higher level.</p><h1>Package multi-versioning</h1><p>With our packaging hammer, one could envision a solution to these
problems: if we cannot do function multi-versioning, what about
implementing <em>package</em> multi-versioning? Guix makes it easy to <a href="https://guix.gnu.org/manual/devel/en/html_node/Defining-Package-Variants.html">define
package
variants</a>,
so we can define package variants optimized for a specific CPU—compiled
with
<a href="https://gcc.gnu.org/onlinedocs/gcc-11.2.0/gcc/x86-Options.html"><code>-march=skylake</code></a>,
for instance. What we need is to define those variants “on the fly”.</p><p>The new <a href="https://guix.gnu.org/manual/devel/en/html_node/Package-Transformation-Options.html"><code>--tune</code> package transformation
option</a>,
which landed in Guix <code>master</code> a week ago, works along those lines.
Users can pass <code>--tune</code> to any of the command-line tools (<code>guix install</code>, <code>guix shell</code>, etc.) and that causes “tunable” packages to be
optimized for the host CPU. For example, here is how you would run
Eigen’s <a href="https://gitlab.com/libeigen/eigen/-/blob/4716040703be1ee906439385d20475dcddad5ce3/bench/benchBlasGemm.cpp">matrix multiplication
benchmark</a>
from the
<a href="https://hpc.guix.info/package/eigen-benchmarks"><code>eigen-benchmarks</code></a>
package, both with and without micro-architecture tuning:</p><pre><code>$ guix shell eigen-benchmarks -- \
benchBlasGemm 240 240 240
240 x 240 x 240
cblas: 0.239963 (13.826 GFlops/s)
eigen : 0.267135 (12.419 GFlops/s)
l1: 32768
l2: 262144
$ guix shell --tune eigen-benchmarks -- \
benchBlasGemm 240 240 240
guix shell: tuning eigen-benchmarks@3.3.8 for CPU skylake
240 x 240 x 240
cblas: 0.208547 (15.908 GFlops/s)
eigen : 0.0720303 (46.06 GFlops/s)
l1: 32768
l2: 262144</code></pre><p>There are several things happening behind the scenes. First, <code>--tune</code>
determines the name of the host CPU as recognized by GCC’s (and Clang’s)
<code>-march</code> option; it does that using
<a href="https://git.savannah.gnu.org/cgit/guix.git/tree/guix/cpu.scm?id=92faad0adb93b8349bfd7c67911d3d95f0505eb2#n83">code</a>
inspired by that <a href="https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/i386/driver-i386.c;h=f844a168ddb6c064f51a559745bda39a56d2657e;hb=7ca388565af176bd4efd4f8db1e5e9e11e98ef45#l372">used by GCC’s
<code>-march=native</code></a>,
though it’s currently limited to x86_64.</p><p>Users can also override auto-detection by passing a CPU name—e.g.,
<code>--tune=skylake-avx512</code>. However, the set of recognized CPU names varies
between GCC 11 and GCC 10, between GCC and Clang, and so on; passing the
wrong name to <code>-march</code> could result in obscure compilation errors. To
handle that gracefully, we instead <a href="https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages/gcc.scm#n528?id=92faad0adb93b8349bfd7c67911d3d95f0505eb2">add metadata to the compiler
packages</a>
in Guix that lists the CPU names they know. This allows <code>--tune</code> to
emit a meaningful error when a CPU name unknown to the compiler is
given:</p><pre><code>$ guix install eigen-benchmarks --tune=x86-64-v4
guix install: tuning eigen-benchmarks@3.3.8 for CPU x86-64-v4
The following package will be installed:
eigen-benchmarks 3.3.8
guix install: error: compiler gcc@10.3.0 does not support micro-architecture x86-64-v4</code></pre><p>As mentioned earlier, we made the conscious choice of letting <code>--tune</code>
operate solely on packages explicitly marked as “tunable”, which
packagers can do along these lines:</p><pre><code class="language-scheme">(define-public eigen-benchmarks
(package
(name "eigen-benchmarks")
;; …
(properties '((tunable? . #true)))))</code></pre><p>This is to ensure Guix does not end up rebuilding packages that could
not possibly benefit from micro-architecture-specific optimizations,
which would be a waste of resources.
(For the same reason, we rejected the idea of defining separate system
types for the various x86_64 CPU micro-architectures <a href="https://discourse.nixos.org/t/nix-2-4-released/15822#other-features-2">the way Nix 2.4
did</a>.)</p><p>In the spirit of avoiding needless package rebuilds, <code>--tune</code> leverages
the <a href="https://guix.gnu.org/manual/en/html_node/Security-Updates.html">“graft”
mechanism</a>:
package variants are <em>grafted</em> to the dependency graph, such that
dependents of a tuned package do not need to be rebuilt. To illustrate
that, consider the figure below:</p><p><img src="/static/images/blog/cpu-tuning-graft.png" alt="Dependency graph of OpenCV, where the tuned variant of VTK is grafted." /></p><p>OpenCV depends on VTK, which depends on Eigen, as shown by the dotted
arrows. VTK is marked as tunable so it can benefit from SIMD
optimizations in Eigen. When <code>--tune</code> is passed, the optimized variant
of VTK built with <code>-march=skylake</code> is generated and grafted onto the
dependency graph, such that OpenCV itself does not need to be recompiled
and instead is relinked against the optimized VTK variant.</p><p>Importantly, this implementation of package multi-versioning does
not sacrifice reproducibility. When <code>--tune</code> is used, from Guix’s
viewpoint, it is just an alternate, but well-defined dependency graph
that gets built. Guix records package transformation options that were
used so it can “replay” them, for example by exporting a faithful
manifest:</p><pre><code>$ guix shell eigen-benchmarks --tune
guix shell: tuning eigen-benchmarks@3.3.8 for CPU skylake
[env]$ guix package --export-manifest -p $GUIX_ENVIRONMENT
;; This "manifest" file can be passed to 'guix package -m' to reproduce
;; the content of your profile. This is "symbolic": it only specifies
;; package names. To reproduce the exact same profile, you also need to
;; capture the channels being used, as returned by "guix describe".
;; See the "Replicating Guix" section in the manual.
(use-modules (guix transformations))
(define transform1
(options->transformation '((tune . "skylake"))))
(packages->manifest
(list (transform1
(specification->package "eigen-benchmarks"))))</code></pre><p>The dependency graph resulting from tuning is recorded and can be
replayed—much unlike stealthily passing <code>-march=native</code> during a build.
Like other transformation options, <code>--tune</code> is accepted by all the
commands, so you could just as well build a Singularity image tuned for
a particular CPU:</p><pre><code>guix pack -f squashfs -S /bin=bin \
eigen-benchmarks bash --tune</code></pre><p>This comes in handy if you want to prepare an image to run on another
cluster, where you know you can rely on a given CPU extension.</p><p>The Guix build farm is set up <a href="https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/ci.scm?id=92faad0adb93b8349bfd7c67911d3d95f0505eb2#n414">to build a few optimized package
variants</a>.
That way, users of <code>--tune</code> are likely to get substitutes (pre-built
binaries) even for the optimized variants, making deployment just as
fast as with non-tuned packages. To achieve this, <code>--tune</code> skips
running test suites when building packages: we cannot be sure that build
machines implement the CPU micro-architecture at hand.</p><h1>Conclusion and outlook</h1><p>We implemented what we call “package multi-versioning” for C/C++ software that
lacks function multi-versioning and run-time dispatch, a notable example
of which is optimized C++ header-only libraries. The new <code>--tune</code>
option is just one <code>guix pull</code> away; users and packagers can already
take advantage of it. It is another way to ensure that users do not
have to trade reproducibility for performance.</p><p>The scientific programming landscape has been evolving over the last few
years. It is encouraging to see that <a href="https://julialang.org">Julia</a>
<a href="https://docs.julialang.org/en/v1/devdocs/sysimg/">offers function multi-versioning for its “system
image”</a>, and that,
similarly, Rust supports it <a href="https://docs.rs/multiversion/0.6.1/multiversion/">with annotations similar to GCC’s
<code>target_clones</code></a>.
Hopefully these new development environments will support performance
portability well enough that users and packagers will not need to
worry about it.</p><blockquote><p><em>Illustrations were taken from a
<a href="https://git.savannah.gnu.org/cgit/guix/maintenance.git/plain/talks/jcad-2021/talk.20211214.pdf">talk</a>
given at <a href="https://jcad2021.sciencesconf.org/resource/page/id/8">JCAD
2021</a>.</em></p></blockquote><h1>Acknowledgments</h1><p>Thanks to Ricardo Wurmus for insightful comments and suggestions on an
earlier draft of this article.</p><h1>When Docker images become fixed-point</h1><p><em>Simon Tournier, guix-devel@gnu.org, 2021-10-22</em></p><p>We like to say that Docker images are like
<a href="https://git.savannah.gnu.org/cgit/guix/maintenance.git/plain/talks/in2p3-2019/images/smoothie.pdf">smoothies</a>:
you can immediately tell whether it’s to your liking, but you can hardly
guess what the ingredients are. Although containers are an efficient way to <em>ship</em>
things, the core question is how these things are produced.</p><p>The aim of this post is to demonstrate that the issue is not Docker
images by themselves. Instead, the concrete question when talking about
reproducibility is: where do binaries come from, and using which tool?</p><p>The scenario below illustrates how one can ship reproducible <em>and
verifiable</em> Docker images built by <code>guix pack</code>. It had initially been
written as a comment while reviewing
<a href="http://issues.guix.gnu.org/45919#10">patch #45919</a>.</p><h2>Alice generates a Docker image</h2><p>Alice is working on a standard scientific stack using Python.
She stores alongside her project the files <code>manifest.scm</code> containing the
package set and <code>channels.scm</code> containing the state of Guix (in other words,
its revision). With these two files, one can redeploy using
<a href="https://guix.gnu.org/manual/devel/en/guix.html#Invoking-guix-time_002dmachine"><code>guix time-machine</code></a>
the exact same computational environment.</p><p>Concretely, <code>manifest.scm</code> reads:</p><pre><code class="language-scheme">(specifications->manifest
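  ;; Each entry is a package specification; a version can be pinned
  ;; as in "python@3.8" (an illustrative example, not part of the
  ;; original manifest).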
(list
"python"
"python-numpy"))</code></pre><p>Alice produces the <code>channels.scm</code> file by running <a href="https://guix.gnu.org/manual/devel/en/guix.html#Invoking-guix-describe"><code>guix describe -f channels</code></a>,
which returns this:</p><pre><code class="language-scheme">(list (channel
(name 'guix)
(url "https://git.savannah.gnu.org/git/guix.git")
(commit
"fb32a38db1d3a6d9bc970e14df5be95e59a8ab02")
(introduction
(make-channel-introduction
"9edb3f66fd807b096b48283debdcddccfea34bad"
(openpgp-fingerprint
"BBB0 2DDF 2CEA F6A8 0D1D E643 A2A0 6DF2 A33A 54FA")))))</code></pre><p>So far, so good. Because Alice needs to run this stack on some infrastructure
not running Guix but instead running Docker, she just
<a href="https://guix.gnu.org/manual/devel/en/guix.html#Invoking-guix-pack">packs</a> her
scientific stack with this command:</p><pre><code>guix pack -f docker --save-provenance -m manifest.scm</code></pre><p>For the next step, one option is to locally load the generated
tarball using Docker tools, like so:</p><pre><code>$ docker load < /gnu/store/6rga6pz60di21mn37y5v3lvrwxfvzcz9-python-python-numpy-docker-pack.tar.gz
Loaded image: python-python-numpy:latest
$ docker images
REPOSITORY            TAG      IMAGE ID       CREATED        SIZE
python-python-numpy   latest   ea2d5e62b2d2   51 years ago   431MB</code></pre><p>… then running <code>docker push</code> to upload the image to a registry. (The “51 years ago” creation date is deliberate: Guix resets all timestamps to the Unix epoch so that image builds are reproducible.)</p><p>The second option is to transfer the image to the target computer, and
to run over there the Docker commands shown above. Once the image has
been loaded on the target machine, running Python from that image <em>just
works</em>:</p><pre><code>$ docker run -ti python-python-numpy:latest python3
Python 3.8.2 (default, Jan 1 1970, 00:00:01)
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
import numpy as np
>>> A = np.array([[1,0,1],[0,1,0],[0,0,1]])
A = np.array([[1,0,1],[0,1,0],[0,0,1]])
>>> _, s, _ = np.linalg.svd(A); s; abs(s[0] - 1./s[2])
_, s, _ = np.linalg.svd(A); s; abs(s[0] - 1./s[2])
array([1.61803399, 1. , 0.61803399])
0.0
>>> quit()</code></pre><p>Neat!</p><p>On a side note, the Docker image is produced directly by Guix. That is,
Guix manages everything, from the binary packages and all the requirements to
the Docker image itself — no <code>Dockerfile</code> involved. To <code>guix pack</code>,
Docker images are one container format among others; for instance <code>guix pack -f squashfs --save-provenance -m manifest.scm</code> generates a
<a href="https://singularity.hpcng.org/">Singularity</a> image (another container format)
with the exact same binaries inside.</p><h2>Bob retrieves and runs code from Alice’s image</h2><p>Bob works with Alice's Docker image. He needs to run these exact same
versions on another machine using plain relocatable tarballs, for
example. Or he needs to scrutinize how all the binaries in this stack are
produced, because maybe he found a bug and wants to know if all the results
obtained with this Docker image are correct or not. Or maybe he wants to study
a specific aspect to better understand a specific result. Bob is doing
science and thus Bob needs transparency.</p><p>The files <code>manifest.scm</code> and <code>channels.scm</code> sadly disappeared a long time ago,
probably at the end of Alice's postdoc. Had the Docker image been
produced with a <code>Dockerfile</code>, the game would most likely be over:
running <code>docker build</code> on that <code>Dockerfile</code> would probably give a
different result than back then (for instance because it starts by
running <code>apt-get update</code>), or it may simply fail because some of
the resources it refers to have vanished from the Internet. There are
ways to mitigate it, for instance by resorting to
<a href="https://snapshot.debian.org/">Debian’s snapshot service</a> and/or using
<a href="https://github.com/debuerreotype/debuerreotype">debuerreotype</a> to
recreate the image, assuming everything in the image was taken from
Debian. But overall, it’s safe to assume that a regular <code>Dockerfile</code>
does <em>not</em> describe a reproducible build process.</p><p>Fortunately, Bob remembers this Docker image had been produced with Guix
(<code>pack --save-provenance</code>). Let’s recover the recipe of this smoothie.</p><p>First, let’s start the container, which makes it easier to export as a
plain tarball. Second, let’s extract the embedded <a href="https://guix.gnu.org/manual/en/html_node/Getting-Started.html#index-profile">Guix
profile</a>:</p><pre><code>$ docker run -d python-python-numpy:latest python3
e1775ff836915dc55195eafd1710eec07106bd1677bde153e5842a0ded43395d
$ docker export -o /tmp/re-pack.tar $(docker ps -a --format "{{.ID}}"| head -n1)
$ tar -xf /tmp/re-pack.tar $(tar -tf /tmp/re-pack.tar | grep 'profile/manifest')
$ tree gnu
gnu
└── store
└── ia1sxr3qf3w9dj7y48rwvwyx289vpfgi-profile
└── manifest
2 directories, 1 file</code></pre><p>Wow! Is it really a regular profile? Yes, it is! Because that profile
contains <em>provenance metadata</em> (thanks to <code>--save-provenance</code>), we can ask
Guix to export that metadata in the form of a list of channels and a
manifest:</p><pre><code>$ guix package -p gnu/store/ia1sxr3qf3w9dj7y48rwvwyx289vpfgi-profile --export-channels
;; This channel file can be passed to 'guix pull -C' or to
;; 'guix time-machine -C' to obtain the Guix revision that was
;; used to populate this profile.
(list
(channel
(name 'guix)
(url "https://git.savannah.gnu.org/git/guix.git")
(commit
"fb32a38db1d3a6d9bc970e14df5be95e59a8ab02")
(introduction
(make-channel-introduction
"9edb3f66fd807b096b48283debdcddccfea34bad"
(openpgp-fingerprint
"BBB0 2DDF 2CEA F6A8 0D1D E643 A2A0 6DF2 A33A 54FA"))))
)
$ guix package -p gnu/store/ia1sxr3qf3w9dj7y48rwvwyx289vpfgi-profile --export-manifest
;; This "manifest" file can be passed to 'guix package -m' to reproduce
;; the content of your profile. This is "symbolic": it only specifies
;; package names. To reproduce the exact same profile, you also need to
;; capture the channels being used, as returned by "guix describe".
;; See the "Replicating Guix" section in the manual.
(specifications->manifest
(list "python" "python-numpy"))</code></pre><p>Awesome, isn't it? These last two outputs are equivalent to Alice's
<code>manifest.scm</code> and <code>channels.scm</code> files. At this stage, Bob’s a happy
person: he can now take these two files anywhere and rebuild the exact
same image at any time:</p><pre><code>guix time-machine -C new-channels.scm \
-- pack -f docker --save-provenance -m new-manifest.scm</code></pre><p>The command should produce the exact same <code>docker-pack.tar</code> that Alice
provided,
<a href="https://reproducible-builds.org/docs/definition/">bit for bit</a>. If it
does not, then either the original image had been tampered with, or one
of the package build processes involved is non-deterministic — something
we would invite you to <a href="https://guix.gnu.org/en/contribute/">report as a
bug</a>!</p><p>Join the fun, join <a href="https://hpc.guix.info/about/">us</a>!</p>What’s in a packageLudovic Courtèsguix-devel@gnu.org2021-09-20T14:00:00Z<p>There is no shortage of package managers. Each tool makes its own set
of tradeoffs regarding speed, ease of use, customizability, and
reproducibility. <a href="https://guix.gnu.org">Guix</a> occupies a sweet spot,
providing reproducibility <em>by design</em> as pioneered by
<a href="https://nixos.org">Nix</a>, package customization à la
<a href="https://github.com/spack/spack">Spack</a> from the command line, the
ability to <a href="https://hpc.guix.info/blog/2017/10/using-guix-without-being-root/">create container
images</a>
without hassle, and more.</p><p>Beyond the “feature matrix” of the tools themselves, a topic that is
often overlooked is packages—or rather, what’s inside of them. Chances
are that a given package may be installed <a href="https://xkcd.com/1654/">using any of the many tools
at your disposal</a>. But are you really getting
the same thing regardless of the tool you are using? The answer is
“no”, contrary to what one might think. The author realized this very
acutely while fearlessly attempting to package the
<a href="https://github.com/pytorch/pytorch/">PyTorch</a> machine learning
framework for Guix.</p><p>This post is about the journey packaging PyTorch <em>the Guix way</em>, the
rationale, a glimpse at what other PyTorch packages out there look like,
and conclusions we can draw for high-performance computing and
scientific workflows.</p><h1>Getting PyTorch in Guix</h1><p>One can install PyTorch in literally seconds with <code>pip</code>:</p><pre><code>$ time pip install torch
Collecting torch
Downloading https://files.pythonhosted.org/packages/69/f2/2c0114a3ba44445de3e6a45c4a2bf33c7f6711774adece8627746380780c/torch-1.9.0-cp38-cp38-manylinux1_x86_64.whl (831.4MB)
|████████████████████████████████| 831.4MB 91kB/s
Collecting typing-extensions (from torch)
Downloading https://files.pythonhosted.org/packages/74/60/18783336cc7fcdd95dae91d73477830aa53f5d3181ae4fe20491d7fc3199/typing_extensions-3.10.0.2-py3-none-any.whl
Installing collected packages: typing-extensions, torch
real 0m24.502s
user 0m19.711s
sys 0m3.811s</code></pre><p>Since it’s on <a href="https://pypi.org/">PyPI</a>, the Python Package Index, one
might think it’s a simple Python package that can be imported in Guix
<a href="https://guix.gnu.org/manual/en/html_node/Invoking-guix-import.html">the easy
way</a>.
That’s unfortunately not the case:</p><pre><code>$ guix import pypi torch
guix import: error: no source release for pypi package torch 1.9.0</code></pre><p>The reason <code>guix import</code> bails out is that the only thing PyPI provides
is a binary-only <a href="https://www.python.org/dev/peps/pep-0427/">“wheels” package</a>: the
<code>.whl</code> file downloaded above contains pre-built binaries only, not
source.</p><p>In Guix we insist on building software from source: it’s a matter of
transparency, auditability, and provenance tracking. We want to make
sure our users can see the source code that corresponds to the code they
run; we want to make sure they can build it locally, should they choose
not to trust <a href="https://guix.gnu.org/manual/en/html_node/Official-Substitute-Server.html">the project’s pre-built
binaries</a>;
or, when they do use pre-built binaries, we want to make sure they can
<a href="https://guix.gnu.org/manual/en/html_node/Invoking-guix-challenge.html"><em>verify</em></a>
that those binaries correspond to the source code they claim to match.</p><p>Transparency, provenance tracking, verifiability: it’s about extending
the scientific method <em>to the whole computational experiment</em>, including
software that powers it.</p><h1>Bundling</h1><p>The first surprise when starting packaging PyTorch is that, despite
being on PyPI, PyTorch is <a href="https://github.com/pytorch/pytorch/">first and
foremost</a> a large C++ code base.
It does have a
<a href="https://github.com/pytorch/pytorch/blob/master/setup.py"><code>setup.py</code></a> as
commonly found in pure Python packages, but that file delegates the bulk
of the work to
<a href="https://github.com/pytorch/pytorch/blob/master/CMakeLists.txt">CMake</a>.</p><p>The second surprise is that PyTorch bundles (or “vendors”, as some would
say) source code for <a href="https://github.com/pytorch/pytorch/tree/master/third_party">no less than 41
dependencies</a>,
ranging from small Python and C++ helper libraries to large C++ neural
network tools. Like other distributions <a href="https://www.debian.org/doc/debian-policy/ch-source.html#embedded-code-copies">such as
Debian</a>,
Guix avoids bundling: we would rather have one Guix package for each of
these dependencies. The rationale is manifold, but it <a href="https://www.debian.org/doc/debian-policy/ch-source.html#id18">boils down
to</a>
keeping things auditable, reducing resource usage, and making security
updates practical.</p><p>Long story short: “unbundling” is often tedious, all the more so in
this case. We ended up packaging about ten dependencies that were not
already available or were otherwise outdated or incomplete, including
big C++ libraries like the
<a href="https://hpc.guix.info/package/xnnpack">XNNPACK</a> and
<a href="https://hpc.guix.info/package/onnx">onnx</a> neural network helper
libraries. Each of these typically bundles code for yet another bunch
of dependencies. Often, the CMake-based build system of these packages
would need patching so we could use our own copies of the dependencies.
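</p><p>To make this concrete, here is a minimal sketch of what such an “unbundling” patch amounts to. Everything here is hypothetical (the <code>example</code> project and the <code>foo</code> library are made up), and real patches are of course more involved:</p>

```shell
# Hypothetical sketch of "unbundling" a CMake-based project: delete the
# vendored copy of a dependency, then patch the build system so it looks
# up an externally provided package instead.
mkdir -p example/third_party/foo
printf 'add_subdirectory(third_party/foo)\n' > example/CMakeLists.txt

# Drop the bundled copy...
rm -rf example/third_party/foo

# ...and rewrite the build rule to use the system package.
sed -i 's|add_subdirectory(third_party/foo)|find_package(foo REQUIRED)|' \
    example/CMakeLists.txt

cat example/CMakeLists.txt
```

<p>In a Guix package definition, this kind of deletion and substitution is typically done in a build phase right after unpacking the source.</p><p>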
Curious readers can take a look at the commits <a href="https://git.savannah.gnu.org/cgit/guix.git/log?id=b402a3ec86ebac4df4eed6a4030923bc62683d1d">leading to
XNNPACK</a>
and those <a href="https://git.savannah.gnu.org/cgit/guix.git/log?id=630c39d8df7557b6a0941c1d5ee879e487de0f5e">leading to
onnx</a>.
Another interesting thing is the use of derivatives: PyTorch depends on
both <a href="https://github.com/pytorch/qnnpack">QNNPACK</a> and
<a href="https://github.com/google/XNNPACK">XNNPACK</a>, even though the latter is
a derivative of the former, and of course, it bundles both.</p><p>Icing on the cake: most of these machine learning software packages do
not have proper releases—no Git tag, nothing—so we were left to pick the
commit <em>du jour</em> or the one explicitly referred to by Git submodules.</p><p>Most PyTorch dependencies were unbundled. The end result is a <a href="https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages/machine-learning.scm?id=a537ef5e0ceca76de0073541c98999bb206052b3#n2591">PyTorch
package in its full
glory</a>,
actually built from source. Phew! Its dependency graph looks like this
(only showing dependencies at distance 2 or less):</p><p><img src="/static/images/blog/pytorch-dependency-graph.svg" alt="Excerpt from the PyTorch package dependency graph." /></p><p>With this many dependencies bundled,
these projects resemble the <a href="https://dustycloud.org/blog/javascript-packaging-dystopia/">JavaScript
dystopia</a>
Christine Lemmer-Webber described. Anyway, PyTorch is now <em>also</em>
<a href="https://hpc.guix.info/package/python-pytorch">installable with Guix</a> in
seconds when enabling pre-built binaries:</p><pre><code>$ time guix install python-pytorch
The following package will be installed:
python-pytorch 1.9.0
52.3 MB will be downloaded
python-pytorch-1.9.0 49.9MiB 6.2MiB/s 00:08 [##################] 100.0%
The following derivation will be built:
/gnu/store/yvygv6nlichbzyynvg4w04xa7xarx3rp-profile.drv
applying 16 grafts for /gnu/store/6qgcb3a7x1wg4havsryjh6zsy3za7h3b-python-pytorch-1.9.0.drv ...
building profile with 2 packages...
real 0m20.697s
user 0m3.604s
sys 0m0.118s</code></pre><p>This time though, one can view the self-contained package definition by
running <code>guix edit python-pytorch</code> and, say, rebuild it locally to
<em>verify</em> the source/binary correspondence:</p><pre><code>guix build python-pytorch --no-grafts --check</code></pre><p>… or at least it will be possible once NNPACK’s build system <a href="https://issues.guix.gnu.org/50672">generates
code in a deterministic order</a>.</p><h1>pip & CONDA</h1><p>Having done all this work, the author entered a soul-searching phase:
sure, the rationale is well documented, but <em>is it worth it</em>? It looks
as though <em>everyone</em> (everyone?) is installing PyTorch using <code>pip</code>
anyway and considering it good enough. Also, why was it so much work to
package PyTorch for Guix? Could it be that we’re missing packaging
tricks that make it so easy for others to provide PyTorch & co.?</p><p>To answer these questions, let’s first take a look at what <code>pip</code>
provides. The <code>pip install</code> command above completed after less than
thirty seconds, and most of that time went into downloading an 831 MiB
archive—no less. What’s in there? Those <code>.whl</code> files are actually zip
archives, which one can easily inspect:</p><pre><code>$ wget -qO /tmp/pytorch.zip https://files.pythonhosted.org/packages/69/f2/2c0114a3ba44445de3e6a45c4a2bf33c7f6711774adece8627746380780c/torch-1.9.0-cp38-cp38-manylinux1_x86_64.whl
$ unzip -l /tmp/pytorch.zip | grep '\.so'
29832 06-12-2021 00:37 torch/_dl.cpython-38-x86_64-linux-gnu.so
29296 06-12-2021 00:37 torch/_C.cpython-38-x86_64-linux-gnu.so
372539384 06-12-2021 00:37 torch/lib/libtorch_cpu.so
43520 06-12-2021 00:37 torch/lib/libnvToolsExt-3965bdd0.so.1
28964064 06-12-2021 00:37 torch/lib/libtorch_python.so
46351784 06-12-2021 00:37 torch/lib/libcaffe2_detectron_ops_gpu.so
1159370040 06-12-2021 00:37 torch/lib/libtorch_cuda.so
4862944 06-12-2021 00:37 torch/lib/libnvrtc-builtins.so
168720 06-12-2021 00:37 torch/lib/libgomp-a34b3233.so.1
116240 06-12-2021 00:37 torch/lib/libtorch.so
523816 06-12-2021 00:37 torch/lib/libcudart-80664282.so.10.2
222224 06-12-2021 00:37 torch/lib/libc10_cuda.so
36360 06-12-2021 00:37 torch/lib/libshm.so
47944 06-12-2021 00:37 torch/lib/libcaffe2_module_test_dynamic.so
22045456 06-12-2021 00:37 torch/lib/libnvrtc-08c4863f.so.10.2
12616 06-12-2021 00:37 torch/lib/libtorch_global_deps.so
21352 06-12-2021 00:37 torch/lib/libcaffe2_nvrtc.so
842376 06-12-2021 00:37 torch/lib/libc10.so
552808 06-12-2021 00:37 torch/lib/libcaffe2_observers.so
46651272 06-12-2021 00:37 caffe2/python/caffe2_pybind11_state.cpython-38-x86_64-linux-gnu.so
47391432 06-12-2021 00:37 caffe2/python/caffe2_pybind11_state_gpu.cpython-38-x86_64-linux-gnu.so
$ unzip -l /tmp/pytorch.zip | grep '\.so' | wc -l
21</code></pre><p>Twenty-one pre-compiled shared libraries in there! Most are part of
PyTorch, but some are external dependencies. First there’s libgomp,
GCC’s <a href="https://gcc.gnu.org/onlinedocs/libgomp/">OpenMP and OpenACC run-time support
library</a>; we can guess it’s
shipped to avoid incompatibilities with the user-installed libgomp, but
it could also be a fork of the official libgomp—hard to tell. Then
there’s <code>libcudart</code> and <code>libnvToolsExt</code>, both of which are proprietary
NVIDIA GPU support libraries—a bit of a surprise, and a bad one, as
nothing indicated that <code>pip</code> fetched proprietary software alongside
PyTorch. What’s also interesting is dependencies that are <em>not</em> there,
such as onnx and XNNPACK; we can only guess that they’re statically
linked within <code>libtorch.so</code>.</p><p>Will these binaries work? On my system, they won’t work without
tweaks, such as setting <code>LD_LIBRARY_PATH</code> so that these libraries can find
those they depend on. Using <a href="https://linux.die.net/man/1/ldd"><code>ldd</code></a> shows
the “system libraries” that are assumed to be available; this includes
GNU libstdc++ and GCC’s run-time support library:</p><pre><code>$ ldd torch/lib/libtorch_cpu.so
linux-vdso.so.1 (0x00007ffca6d31000)
libgomp-a34b3233.so.1 => /tmp/pt/torch/lib/libgomp-a34b3233.so.1 (0x00007ff435723000)
…
libstdc++.so.6 => not found
libgcc_s.so.1 => not found</code></pre><p>Not providing those libraries, or providing a variant that is not
binary-compatible with what <code>libtorch_cpu.so</code> expects, is the end of the
game. Fortunately these two libraries rarely change, so the assumption
made here is that “most” users will have them. It’s interesting that
the authors deemed it necessary to ship <code>libgomp.so</code> and not
<code>libstdc++.so</code>—maybe a mixture of insider knowledge and dice roll.</p><p>How were these binaries built in the first place? Essentially, by
running <code>python setup.py bdist_wheel</code> “on some system” which, as we saw,
invokes <code>cmake</code> to build PyTorch and all its bundled dependencies. But
the PyTorch project does <a href="https://github.com/pytorch/pytorch/tree/7dc3858deb98f85a2353e4ea377b370b3d5c8e95/.circleci/README.md"><em>a little bit
more</em></a>
than this to build and publish binaries for pip and CONDA. The entry
point for both is
<a href="https://github.com/pytorch/pytorch/tree/7dc3858deb98f85a2353e4ea377b370b3d5c8e95/.circleci/scripts/binary_linux_build.sh"><code>binary_linux_build.sh</code></a>,
which in turn delegates to scripts living in another repo,
<a href="https://github.com/pytorch/builder/blob/d371104fb25cf57f3de9e8b168f9172c700962ee/conda/build_pytorch.sh"><code>build_pytorch.sh</code></a>
for CONDA or one of <a href="https://github.com/pytorch/builder/tree/d371104fb25cf57f3de9e8b168f9172c700962ee/manywheel">the wheels
scripts</a>;
it’s one of these scripts that’s <a href="https://github.com/pytorch/builder/blob/d371104fb25cf57f3de9e8b168f9172c700962ee/manywheel/build.sh#L110-L290">in charge of embedding <code>libgomp.so</code>,
<code>libcudart.so</code>, and other libraries present on the
system</a>.</p><p>And where do these libraries come from? They come from the GNU/Linux
distribution underlying the build environment which, going back to the initial repository,
<a href="https://github.com/pytorch/pytorch/tree/7dc3858deb98f85a2353e4ea377b370b3d5c8e95/.circleci/docker">may typically be some version of Ubuntu or
CentOS</a>
running on the machines of CircleCI or Microsoft Azure.</p><p>At the end of the process is a bunch of wheel or CONDA archives ready to
be uploaded as-is <a href="https://github.com/pytorch/builder/blob/d371104fb25cf57f3de9e8b168f9172c700962ee/conda/publish_conda.sh">to
Anaconda</a>
or <a href="https://github.com/pytorch/builder/blob/d371104fb25cf57f3de9e8b168f9172c700962ee/wheel/upload_wheels_to_pypi.sh">to
PyPI</a>.</p><p>Looking at these scripts gives useful hints. But going back to the code
pip and CONDA users are actually running: is <code>libgomp-a34b3233.so.1</code>
<em>the</em> libgomp, or is it a modified version? Is <code>libtorch_cpu.so</code>
<em>really</em> obtained by building <a href="https://github.com/pytorch/pytorch/releases/tag/v1.9.0">source from the <code>1.9.0</code> Git
tag</a>?</p><p>Let’s make it clear: verifying the source/binary correspondence for all
the bits in the pip and CONDA packages is <em>practically infeasible</em>. Merely
rebuilding them locally is hard. Reasoning about the build process is
hard because of all the layers involved and because of the ball of
spaghetti that these scripts are. Such a setup rightfully raises red
flags for any security-minded person—we’ll get to that below—or
freedom-conscious user: it’s also about <a href="https://gnu.tools/en/documents/free-software/">user
freedom</a>. Is PyPI
conveying the Corresponding Source of libgomp, as per <a href="https://www.gnu.org/licenses/gpl-3.0.html#section6">Section 6 of its
license</a>? Probably
not. PyTorch’s own license doesn’t have this requirement, but there’s
certainly a tacit agreement that <code>pip install torch</code> provides <em>the</em>
PyTorch, and it’s unpleasant at best that this claim is unverifiable in
practice. <em>This</em> should be a red flag for anyone doing reproducible
science—in other words, science.</p><h1>Source-based distros</h1><p>CONDA and pip (at least the “wheels” part of it) are essentially “binary
distros”: they focus on distributing pre-built binaries without concern
for how they were built, nor for whether they can actually be built from
source. Without a conscious effort to require <a href="https://reproducible-builds.org/">reproducible
builds</a> so that anyone can
independently verify binaries, these tools are doomed to be not only
unsafe but also opaque—and there are to date no signs of CONDA and
PyPI/pip moving in that direction.</p><blockquote><p>Update (2021-09-21): Bovy on Twitter
<a href="https://nitter.net/benbovy/status/1440027976364552199#m">mentions</a>
<a href="https://conda-forge.org">conda-forge</a> as a possible answer. Public
build recipes (here’s <a href="https://github.com/conda-forge/pytorch-cpu-feedstock/tree/master/recipe">that of
PyTorch</a>)
and automated builds improve transparency compared to binaries
uploaded straight from developer machines, but build reproducibility
remains to be addressed.</p></blockquote><p>Like Guix, Spack and Nix are source-based: their primary job is to build
software from source, and the use of pre-built binaries is “an optimization”.
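</p><p>Building from source only enables independent verification when builds are reproducible, and reproducibility mostly comes down to hunting non-determinism. As a small, self-contained illustration of the principle (file names made up; GNU tar assumed), normalizing archive metadata makes two independent runs produce bit-identical output:</p>

```shell
# Two tar runs normally differ (timestamps, ordering, ownership).
# Normalizing that metadata, a standard reproducible-builds technique,
# makes independent runs bit-identical.
mkdir -p demo && printf 'hello\n' > demo/file

tar --sort=name --mtime='@0' --owner=0 --group=0 --numeric-owner \
    -cf run1.tar demo
tar --sort=name --mtime='@0' --owner=0 --group=0 --numeric-owner \
    -cf run2.tar demo

cmp run1.tar run2.tar && echo "bit-for-bit identical"
```

<p>The same principle is why Guix-produced Docker images carry Unix-epoch timestamps, which Docker displays as “created 51 years ago”.</p><p>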
The <a href="https://github.com/spack/spack/blob/730720d50a8ef2afb3087d69fb44cd9ec93801e1/var/spack/repos/builtin/packages/py-torch/package.py">Spack
package</a>
and the <a href="https://github.com/NixOS/nixpkgs/blob/f8420fd6df9b70b10b88a66d4bfd085863e2f9d4/pkgs/development/python-modules/pytorch/default.nix">Nixpkgs
package</a>
are all about building it all <em>from source</em>. The Spack package avoids
using some of the bundled dependencies, though it does use large ones:
XNNPACK and onnx; the Nixpkgs package makes no such effort and builds it
all as-is.</p><p>Unlike Nix or Guix, Spack assumes core packages—for some definition of
“core”, but that includes at least a C/C++ compiler, a C library, and a
Python interpreter—are already available. Thus, by definition, the
Spack package is not self-contained and may fail to build, plain and
simple, if some of the implicit assumptions are not met. When fetching
pre-built binaries from a <a href="https://spack.readthedocs.io/en/latest/binary_caches.html">“binary
cache”</a>, the
problems are similar to those of CONDA and pip: binaries might not work
if assumptions about system libraries are not met (though Spack
mitigates this risk by tying binaries to the underlying GNU/Linux
distro), and it may be hard to verify them through rebuilding, again
because these implicit assumptions have an impact on the bits in the
resulting binaries.</p><h1>On convenience, security, and reproducible science</h1><p>The convenience and ease of use of pip and CONDA have undeniable appeal.
That one can, in a matter of minutes, install the tool <em>and</em> use it to
deploy a complex software stack like that of PyTorch has
certainly contributed to their success. Our view though, as Guix
packagers, is that we should take a step back and open the package—look
at what’s inside and the impact it has.</p><p>What we see when we look inside PyPI wheels and CONDA packages is
<em>opaque binaries</em> built on a developer’s machine and later uploaded to
the central repository. They are opaque because, lacking reproducible
build methodology and tooling, one cannot independently verify that they
correspond to the presumed source code. They may also be deceptive: you
get not just PyTorch but also the binary of a proprietary piece of
software.</p><p>In their <a href="https://dl.acm.org/doi/10.1145/3468264.3468592">ESEC/FSE 2021 paper on
LastPyMile</a>, Duc-Ly Vu
<em>et al.</em> empirically show that “<em>the last mile from source to package</em>”
on PyPI is indeed the weakest link in the software supply chain, and
that actual differences between packaged source code and upstream source
code <em>are</em> observed in the wild. And this is only source code—for
binaries as found in the <code>torch</code> wheel, there is just no practical way
to verify that they genuinely correspond to that source code.</p><p>Machine-learning software is fast-moving. The desire to be fast already
shows in upstream development practices: lack of releases for important
dependencies, careless dependency bundling. Coupled with the user’s
legitimate demand for “easy installation”, this turned PyPI, in the
footsteps of CONDA, into a huge software supply chain vulnerability
waiting to be exploited. It’s a step several years back in time, to when
Debian hadn’t yet put an end to its <a href="https://archive.fosdem.org/2015/schedule/event/distributions_boring_solved_problem/">“dirtiest
secret”</a>—that
Debian packages would be non-reproducible, built on developer machines,
and uploaded to the servers. <a href="https://reproducible-builds.org">Reproducible
builds</a> should be the norm; <a href="https://guix.gnu.org/en/blog/tags/bootstrapping/">building
from source</a>, too,
should be the norm.</p><p>It is surprising that such a blatant weakness goes unnoticed, especially
on high-performance computing clusters that are usually subject to
strict security policies. Even more so at a time when <a href="https://www.computer.org/csdl/magazine/sp/2021/02/09382367/1saZVPHhZew">awareness about
software supply chain security
grows</a>,
and when the US Government’s <a href="https://www.whitehouse.gov/briefing-room/presidential-actions/2021/05/12/executive-order-on-improving-the-nations-cybersecurity/">Executive Order on
cybersecurity</a>,
for example, explicitly calls for work on subjects as concrete as
“<em>using administratively separate build environments</em>” and “<em>employing
automated tools (…) to maintain trusted source code supply chains</em>”.</p><p>Beyond security, what are the implications for scientific workflows?
Can we build reproducible computational workflows using software that is
itself non-reproducible, non-verifiable? The answer is “yes”, one can
do that. However, just like one wouldn’t build a house on a quagmire,
building scientific workflows on shaky foundations is inadvisable. Far
from being an abstract principle, it has concrete implications:
scientists and their peers need to be able to reproduce the software
environment, <em>all of it</em>; they need the ability to customize it
and experiment with it, as opposed to merely running code from an “inert”
binary.</p><p>It is time to stop running opaque binaries and to value transparency and
verifiability for our foundational software, as much as we value
transparency and verifiability for scientific work.</p><h1>Acknowledgments</h1><p>The author thanks Ricardo Wurmus and Simon Tournier for insightful
feedback and suggestions on an earlier draft of this post.</p>