agile sysadmin

by Ferenc Erki

Host Gentoo dependency tarballs as GitHub releases

People who create Gentoo ebuilds for software written in Go have probably noticed the deprecation notices and the discussion about EGO_SUM, as well as the proposal to undeprecate EGO_SUM.

While the mailing lists and IRC channels provide plenty of opportunity to discuss how to supply dependencies for Go software, here I share a way to use GitHub releases to host dependency tarballs as an external Gentoo contributor, such as a proxied maintainer, GURU contributor, or overlay maintainer.

Update: see also Packaging Go dependencies for Gentoo as a follow-up post expanding on this topic.

The challenge

Official Gentoo developers have a personal space on dev.gentoo.org, and mirroring provided by the Gentoo Infrastructure project. They may decide to host distfiles there if necessary, including dependency tarballs for Go software in case upstream does not provide one.

In contrast, external contributors striving to provide warning-free ebuilds to the community need to find a hosting solution that fits various criteria, like:

  • has enough storage space
  • can handle enough traffic to serve downloads
  • stores and serves data securely
  • stays highly available over the long term
  • conforms to applicable privacy regulations, if any
  • fits community usage purposes
  • has no cost or at least an affordable price

Discarded alternatives

On my business domain, hosting tarballs would cost extra money, and the traffic limit might already become a problem at around 350 downloads, given the total tarball size of 180 MB for 3 ebuilds. Mixing business and community usage there also requires more planning, partly because of having to deal with the intricacies of providing a public service.

The same goes for my personal domain, except that its storage and traffic limits are even tighter.

Others recommended committing tarballs directly into a GitHub repository. While storing blobs in git like that could work, it also has various drawbacks. Git Large File Storage (LFS) or git-annex would be a better fit for this approach. However, GitHub limits free LFS storage to 1 GB and traffic to 1 GB/month, while git-annex would require an external store to host the blobs.
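A back-of-the-envelope check of the figures above, as a minimal sketch. The helper names `traffic_mb` and `full_downloads` are made up for illustration, 1 GB is treated as 1024 MB, and each download is assumed to fetch all 180 MB of tarballs:

```shell
# traffic_mb DOWNLOADS SIZE_MB
# Print the total traffic in MB caused by DOWNLOADS downloads of SIZE_MB each.
traffic_mb() {
    echo $(( $1 * $2 ))
}

# full_downloads LIMIT_MB SIZE_MB
# Print how many complete downloads of SIZE_MB fit into LIMIT_MB of traffic.
full_downloads() {
    echo $(( $1 / $2 ))
}

# 350 downloads of the 180 MB tarball set
traffic_mb 350 180

# 1 GB/month of Git LFS traffic versus the same 180 MB
full_downloads 1024 180
```

The first call prints 63000, which is about 63 GB of traffic for 350 downloads. The second prints 5, so the free LFS traffic quota would cover only five complete downloads per month.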

The idea

I’m always interested in finding the simplest solution that could work for the given situation, and then iterating from there gradually.

All the Go software I package for Gentoo has its upstream repository on GitHub, where releases may host several assets, and I already have the relevant tools and workflows in place to efficiently interact with them.

Based on that, I decided to try the following idea:

  • fork and clone the upstream project
  • check out the upstream release tag
  • generate a dependency tarball
  • create a GitHub release in my fork which includes the tarball

An example

I have GitHub CLI installed and configured, so I will use that as an example below along with the app-text/vale ebuild I worked on last time. Similar tools could work as well, like hub, or even a mix of browser interactions and plain git commands.

Prepare the repository

$ gh repo fork errata-ai/vale
$ cd vale
$ git checkout v2.24.4

Generate tarball

The go-module eclass documentation uses the following commands to download the dependencies into the go-mod directory, and create a tarball of it:

$ GOMODCACHE="${PWD}"/go-mod go mod download -modcacherw
$ XZ_OPT='-T0 -9' tar -acf vale-2.24.4-gentoo-deps.tar.xz go-mod

I chose the tarball name vale-2.24.4-gentoo-deps.tar.xz because it makes the relation to Gentoo dependencies clear, and its prefix matches the value of ${P} when included in SRC_URI later in the ebuild.
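Before uploading, it may be worth checking that everything in the tarball sits under the top-level go-mod directory, since the eclass documentation requires go-mod as the base directory of GOMODCACHE. A minimal sketch, where the function name `check_deps_layout` is hypothetical and not part of any Gentoo tooling:

```shell
# check_deps_layout TARBALL
# Succeed only if TARBALL is non-empty and every entry lives under go-mod/.
check_deps_layout() {
    # List the archive members; fail early if tar cannot read the file.
    entries=$(tar -tf "$1") || return 1

    if [ -z "$entries" ]; then
        echo "empty tarball: $1" >&2
        return 1
    fi

    # grep -v selects members that are NOT go-mod itself or below it.
    if printf '%s\n' "$entries" | grep -Eqv '^go-mod(/|$)'; then
        echo "entries outside go-mod/ in $1" >&2
        return 1
    fi

    echo "layout ok: $1"
}

# Example call, assuming the tarball from above exists in the current directory:
#   check_deps_layout vale-2.24.4-gentoo-deps.tar.xz
```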

Update: Since 2024-05-15 Gentoo ships a script in dev-go/go-dep-tarball to streamline creating such tarballs, as pointed out by Holger Hoffstätte via Mastodon.

Create GitHub release

First, mark the forked repository as default for this directory. Then create a GitHub release including our tarball, with clear references to Gentoo usage and the upstream version:

$ gh repo set-default ferki/vale
$ gh release create --notes 'dependency tarball for Gentoo v2.24.4' v2.24.4-gentoo-deps vale-2.24.4-gentoo-deps.tar.xz

The release tag v2.24.4-gentoo-deps refers to both the upstream tagged version and the purpose. The notes make the purpose of the release clear for humans to read.
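For repeated use, the three steps above could be wrapped into one function. This is only a sketch under a few assumptions: the names `run`, `release_deps` and `DRY_RUN` are made up for illustration, `--clone` is added to `gh repo fork` so it clones without prompting, the environment variables are passed through `env`, and the upstream tag is assumed to follow the `v${PV}` pattern as it does for vale.

```shell
#!/bin/sh

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# release_deps UPSTREAM FORK PN PV
# Fork and clone UPSTREAM, build the dependency tarball for version PV,
# and publish it as a release in FORK. Variables are global (POSIX sh).
release_deps() {
    upstream=$1 fork=$2 pn=$3 pv=$4
    tarball="${pn}-${pv}-gentoo-deps.tar.xz"
    tag="v${pv}-gentoo-deps"

    # Each step only runs if the previous one succeeded.
    run gh repo fork "$upstream" --clone &&
    run cd "${upstream##*/}" &&
    run git checkout "v${pv}" &&
    run env GOMODCACHE="${PWD}/go-mod" go mod download -modcacherw &&
    run env XZ_OPT='-T0 -9' tar -acf "$tarball" go-mod &&
    run gh repo set-default "$fork" &&
    run gh release create --notes "dependency tarball for Gentoo v${pv}" \
        "$tag" "$tarball"
}

# Preview the commands without touching GitHub or the network.
DRY_RUN=1 release_deps errata-ai/vale ferki/vale vale 2.24.4
```

The dry run prints arguments without their quotes, so it serves as a preview rather than something to paste back into a shell.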

Use it in an ebuild

With the GitHub release in place in the fork, add the tarball as an extra distfile to the SRC_URI in the ebuild:

SRC_URI+=" https://github.com/ferki/vale/releases/download/v${PV}-gentoo-deps/${P}-gentoo-deps.tar.xz"
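To see what that line resolves to for a given version, the variable expansion can be imitated in plain shell. This is a sketch only: `deps_src_uri` is a hypothetical helper, it assumes the fork's repository name equals the package name, and it relies on P being the package name and version joined by a hyphen.

```shell
# deps_src_uri OWNER PN PV
# Expand the SRC_URI entry for the fork owned by OWNER, package PN, version PV.
deps_src_uri() {
    owner=$1 PN=$2 PV=$3
    # Portage sets P to the package name and version, without any revision.
    P="${PN}-${PV}"
    echo "https://github.com/${owner}/${PN}/releases/download/v${PV}-gentoo-deps/${P}-gentoo-deps.tar.xz"
}

deps_src_uri ferki vale 2.24.4
```

For the example release this prints https://github.com/ferki/vale/releases/download/v2.24.4-gentoo-deps/vale-2.24.4-gentoo-deps.tar.xz, which matches the tag and asset name used when creating the release.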

Conclusion

GitHub releases can host dependency tarballs when packaging Go software for Gentoo as an external contributor.

For the 3 ebuilds I maintain in my overlay, the above approach allowed me to remove around 900 lines from ebuild and Manifest files, reducing their size by 173 kB. The process also feels more straightforward than populating EGO_SUM content by extracting go.sum files.

As a trade-off, I need to host about 180 MB worth of data, and I wonder how well it would scale with the number of packages and versions.