Asfaload mirror of public checksums available

Raphaël 2024-11-27

Asfaload develops a solution to help secure and authenticate downloads from the internet, aiming to check the integrity and authenticity of downloaded files. We want to make it easy to validate a downloaded file’s authenticity.

As another step in our journey, we announce our mirror of public checksums, usable by everyone, and made easy with our tool asfald. Remember to let us know via a Github issue or our contact form if you have any suggestions regarding our project!

With the first version of our download tool https://github.com/asfaload/asfald it was already possible to validate the integrity of a file by using a checksums file published alongside the file to be downloaded. To increase security it was possible to host the checksums file on a separate server, but the general feedback we got was that this was cumbersome. Downloading the checksums file from the same location as the file itself, on the other hand, didn’t increase security, making the feature useless for most.

Building on this feedback, we have decided to publish a mirror of public checksums files. Github being the most popular code hosting solution, we decided to focus on Github projects publishing checksums files in their releases. The mirror is a public Github repository at https://github.com/asfaload/checksums, and is now used by our download tool https://github.com/asfaload/asfald.

In this blog post we will look at how this mirror can help you increase your security, as well as its limitations, which justify our next step.

Why a mirror?

Downloading the checksums file from the same location as the main asset you want to download only helps you ensure that the asset was not corrupted during download. It does not bring any additional security benefit, because an attacker able to replace the main asset can also modify the checksums file. Hosting the checksums file at a third-party location, however, makes it significantly harder for attackers to make you download a tampered file: not only must the publishing party be compromised, the checksums host must be compromised too. By compromised we mean that the attacker manages to make a user download from that host a file that differs from the one originally published, having been replaced by a file generated by the attacker (be it through a man-in-the-middle attack or through unauthorized server access).

An additional reason to use a mirror is that there is no generally accepted convention for checksums files. Some projects publish sha256 checksums, some sha512, some publish both, and md5 and sha1 files can also be encountered sporadically. Some publish one checksums file for all assets in a release, with no general naming convention for that file. Others publish one checksums file per release asset, often named after the release asset with an additional suffix identifying the checksum algorithm used (e.g. my_file.tgz.sha256, but again with numerous variations).

The first version of our download tool https://github.com/asfaload/asfald didn’t use a mirror, but had to guess the name of the checksums file for the downloaded asset. This was far from optimal:

  • wasteful and/or time consuming: multiple requests for nonexistent resources were issued. Those could either be sent concurrently, wasting resources, or sequentially, possibly issuing fewer requests but probably taking more time.
  • unreliable: the checksums file might use an unsupported naming convention and not be found by asfald.
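To make the cost of guessing concrete, here is a minimal sketch of what such a guessing client has to enumerate. The function name and the list of file names are illustrative assumptions, not the list asfald actually used.

```shell
# List the URLs a client might probe when it has to guess where the
# checksums for an asset live. The file names below are illustrative
# assumptions, not the list asfald actually used.
candidate_checksum_urls () {
  local asset_url="${1?Pass the url of the release asset as argument}"
  local release_dir="${asset_url%/*}"   # URL without the asset's file name
  local suffix name
  # One checksums file per asset: asset name plus an algorithm suffix.
  for suffix in sha256 sha512 sha256sum; do
    printf '%s.%s\n' "$asset_url" "$suffix"
  done
  # One checksums file for the whole release.
  for name in checksums.txt SHA256SUMS sha256sums.txt; do
    printf '%s/%s\n' "$release_dir" "$name"
  done
}
```

Even this short list means up to six requests per download, most of them for files that do not exist, and a project using any other name is still missed.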

Using a mirror allows us to augment it with a file named asfaload.index.json, which aggregates the information from the checksums files present in the release. Using this file, asfald can retrieve the checksums of the downloaded asset with only one request, making checksum validation faster while sparing resources.

How is the mirror structured?

Although we currently only work with Github releases, we take care to be able to expand our mirror to other publishing platforms, including self hosted files. That’s why the path to a file on the mirror includes the host and path it was downloaded from. For example for asfald, we have the checksums file for release v0.3.0 published at https://github.com/asfaload/asfald/releases/download/v0.3.0/checksums.txt and its path in the git repository is /github.com/asfaload/asfald/releases/download/v0.3.0/checksums.txt.
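The mapping from an original URL to its path in the mirror repository can be sketched in a few lines of bash. The function name mirror_repo_path is hypothetical, and the sketch assumes the original URL uses the https scheme.

```shell
# Print the path of a file in the mirror's git repository, given the url
# it was originally published at. Assumes an https:// url.
mirror_repo_path () {
  local url="${1?Pass the original url of the file as argument}"
  # Drop the scheme; what remains is "<host>/<path>", prefixed with "/".
  printf '/%s\n' "${url#https://}"
}
```

Because the host is part of the path, files mirrored from another platform would simply end up under a different top-level directory.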

As for the asfaload.index.json file present inside each release directory, it is structured like this:

{
  "mirroredOn": "2024-11-08T15:50:17.5034034+00:00",
  "publishedOn": "2024-11-08T14:01:08+00:00",
  "version": 1,
  "publishedFiles": [
    {
      "fileName": "asfald-aarch64-apple-darwin",
      "algo": "Sha256",
      "source": "checksums.txt",
      "hash": "b2ad8f03807b15335dd2af367b55d6318ffe46d32462e514c272272c9aeba130"
    },
    {
      "fileName": "asfald-aarch64-apple-darwin.tar.gz",
      "algo": "Sha256",
      "source": "checksums.txt",
      "hash": "6c1cba9e7da41f9c047bd7ee58f2015fe7efc3b45c3b57c67f19ebf69629d5d1"
    },
    ...
  ]
}

publishedOn is the time at which the release was published, and mirroredOn is the time at which the mirror was taken. publishedFiles is the list of assets present in the release, giving for each its file name (fileName), its checksum (hash), the hashing algorithm used (algo), and the checksums file from which the checksum was extracted (source). Note that this asfaload.index.json file is present in addition to the mirrored copy of the checksums file published in the release.
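As an illustration of how the index can be consumed, here is a minimal sketch that looks up one asset in a local copy of asfaload.index.json. It assumes the jq command-line tool is installed; the function name lookup_in_index is hypothetical and this is not how asfald itself reads the index.

```shell
# Print "<algo> <hash>" for one file listed in an asfaload.index.json.
# Assumes jq is installed and the index follows the structure shown above.
lookup_in_index () {
  local index_file="${1?Pass the path to asfaload.index.json as argument}"
  local file_name="${2?Pass the name of the release asset as argument}"
  jq -r --arg name "$file_name" \
    '.publishedFiles[] | select(.fileName == $name) | "\(.algo) \(.hash)"' \
    "$index_file"
}
```

If a release publishes several checksums for the same asset, the function prints one line per entry, and it prints nothing when the asset is not listed.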

Fetching the files directly from the git repository is suboptimal, so the repository is also published with Github Pages at https://gh.checksums.asfaload.com. The path to a file under that hostname is identical to the path of the file in the git repository. Continuing with our example file, located in the git repository at /github.com/asfaload/asfald/releases/download/v0.3.0/checksums.txt, it can be downloaded with the url https://gh.checksums.asfaload.com/github.com/asfaload/asfald/releases/download/v0.3.0/checksums.txt.

Using this information, it is easy for anyone to check the validity of the data in the mirror, as downloading the original checksums file from the release is trivial, its url being the path of its mirrored file in the git repository.

For example, here is a bash function to check that the file on the mirror is the same as in the published release:

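validate_from_mirror () {
  local url="${1?Pass the url on gh.checksums.asfaload.com as argument}"
  # The release url is the mirror url without the mirror's host name.
  local release_url
  release_url="$(echo "$url" | sed -e 's+gh\.checksums\.asfaload\.com/++')"
  # No output means both files are identical.
  diff <(curl -L -s "$url") <(curl -L -s "$release_url")
}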

A similar function taking the release url as argument is equally easy to write:

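validate_from_release () {
  local url="${1?Pass the url of the released checksum file as argument}"
  # The mirror url is the release url prefixed with the mirror's host name.
  local mirror_url
  mirror_url="$(echo "$url" | sed -e 's+^https://+https://gh.checksums.asfaload.com/+')"
  # No output means both files are identical.
  diff <(curl -L -s "$url") <(curl -L -s "$mirror_url")
}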

And all this is also done by asfald, as we’ll see in the next section.

How is the mirror used?

https://github.com/asfaload/asfald is our CLI tool for increasing the security of downloading files from the internet, and as stated above we start by covering files published in Github Releases. Let’s see what happens when a file is downloaded with asfald. We will use asfald to download asfald version v0.3.0:

$ asfald https://github.com/asfaload/asfald/releases/download/v0.3.0/asfald-aarch64-apple-darwin
INFO ℹ️ Using asfaload index on mirror
INFO ℹ️ Same checksum found in release
INFO 🗑️ Create temporary file...
INFO 🚚 Downloading file...
  [00:00:00] [#################################################################] 3.19 MiB/3.19 MiB (00:00:00)
INFO ✅ File's checksum is valid !

Several steps take place and are reported to the user:

First, asfald reports that it will look for and use the file asfaload.index.json on asfaload’s mirror. To locate the index file, it takes the URL of the file to be downloaded, replaces the leading https:// with the base URL of the mirror, in our case https://gh.checksums.asfaload.com/, and replaces the file name with asfaload.index.json. In this case it will download the index from https://gh.checksums.asfaload.com/github.com/asfaload/asfald/releases/download/v0.3.0/asfaload.index.json
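The URL rewriting described in this step can be sketched as a small bash function. The name index_url_for is hypothetical, and the sketch assumes an https URL; it illustrates the rule and is not asfald’s own code.

```shell
# Print the url of the asfaload.index.json covering a release asset.
# Assumes the asset url uses the https scheme.
index_url_for () {
  local asset_url="${1?Pass the url of the file to download as argument}"
  local mirror="https://gh.checksums.asfaload.com"
  local release_dir="${asset_url%/*}"   # strip the asset's file name
  # Prefix "<host>/<path>" with the mirror and point at the index file.
  printf '%s/%s/asfaload.index.json\n' "$mirror" "${release_dir#https://}"
}
```

Called with the asset URL from the example above, it prints the index URL given in this step.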

Secondly, it reports that it found the same hash for the file in the release. Remember that each entry in the index file names the checksums file in the release that holds the checksum for the file we want to download. In our case, this leads to the file at https://github.com/asfaload/asfald/releases/download/v0.3.0/checksums.txt. asfald reports that the checksums found in the index file and in the release’s checksums file are the same. For releases publishing both sha256 and sha512 hashes, the latter will be preferred.
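Here is a minimal sketch of this cross-check, run against a local copy of the release’s checksums file. It assumes the file uses the "<hash>  <name>" layout produced by sha256sum and that file names contain no spaces; both function names are hypothetical.

```shell
# Print the hash recorded for a file name in a checksums file written in
# the "<hash>  <name>" layout produced by sha256sum (assumed layout).
hash_from_checksums_file () {
  local checksums_file="$1" name="$2"
  # A leading "*" on the name marks binary mode in sha256sum output.
  awk -v name="$name" \
    '{ f = $2; sub(/^\*/, "", f); if (f == name) { print $1; exit } }' \
    "$checksums_file"
}

# Succeed only if the hash from the index matches the one in the release.
same_hash_in_both () {
  local index_hash="$1" checksums_file="$2" name="$3"
  local release_hash
  release_hash="$(hash_from_checksums_file "$checksums_file" "$name")"
  [ -n "$release_hash" ] && [ "$index_hash" = "$release_hash" ]
}
```

A missing entry is treated as a failure, so a file that silently disappears from the release’s checksums file does not pass the check.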

Once these checks are done, the download of the file to a temporary file starts. During the download, asfald incrementally computes the file’s checksum. When the download is finished, it compares the checksum of the downloaded file to the expected value and, if they match, saves the file under the expected name, in our case asfald-aarch64-apple-darwin.
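The final "verify, then save" step can be approximated in bash as follows. Unlike asfald, this sketch hashes the temporary file after the download has completed rather than incrementally. It assumes sha256sum from GNU coreutils is available, and the function name save_if_valid is hypothetical.

```shell
# Compare a downloaded file's SHA-256 with the expected value and only
# then move it to its final name. Assumes sha256sum (GNU coreutils).
save_if_valid () {
  local tmp_file="$1" expected="$2" final_name="$3"
  local actual
  actual="$(sha256sum "$tmp_file" | cut -d ' ' -f 1)"
  if [ "$actual" = "$expected" ]; then
    mv -- "$tmp_file" "$final_name"
  else
    echo "Checksum mismatch for $final_name" >&2
    rm -f -- "$tmp_file"   # never leave an unverified file behind
    return 1
  fi
}
```

Writing to a temporary file first means an invalid download never appears under the name the user expects.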

The mirror increases security because, once the mirror is taken, an attacker has to compromise two separate hosts. If only the publisher’s release or only the mirror is compromised, asfald will report an error. Although this is not a perfect solution, it definitely is a step in the right direction and a big improvement compared to downloads without validity checks.

What can still be improved?

An attacker who compromises the publisher’s Github account can still publish malicious artifacts that would be considered valid by asfald. Although this is something we will tackle and propose a solution for in the coming months, for the moment we can only urge Github users to activate 2FA and act cautiously.

In addition, as hinted at the end of the previous section, there’s still a small window of opportunity for an attacker between the time a release is published and the time the mirror is taken. To close this window, we have published the Github Action named asfaload/notify-release-action. Calling this action on the "release published" event of your repository will notify Asfaload’s servers of your release, which will trigger the immediate mirroring of your release’s checksums files. Using it is very easy: put the following YAML content in the file .github/workflows/notify-asfaload.yml:

name: Notify Asfaload of release
on:
  release:
    types: [published]
jobs:
  notify-asfaload:
    runs-on: ubuntu-latest
    permissions:
      # Required to get an OIDC token. Doesn't grant any write access.
      id-token: write
    steps:
      - name: Notify new release to Asfaload
        uses: asfaload/notify-release-action@v0.1.0

You’ll notice there’s a write permission given for the id-token, but know that this is a wrongly-named permission as it does not give any write access, as confirmed by Github documentation.

Even without a signature scheme, which is our next planned feature, we think this mirror setup already improves the situation significantly. We are working on an easy way to sign checksums files to make it even better, but this will be the subject of a future post.