SourceHut: builds.sr.ht Jobs on Ephemeral DigitalOcean Droplets
Let’s optimize the cost and energy consumption of SourceHut-based Git infrastructure by integrating SourceHut’s CI service, builds.sr.ht, with a cloud infrastructure provider, namely DigitalOcean.
UPDATE: I’m not supporting this effort anymore. SourceHut is awful to maintain and its founder, Drew DeVault, is unnecessarily complicated to deal with. If you don’t care about pseudo-elitism and instead seek a resilient and welcoming project, look elsewhere.
One of the main reasons for choosing SaaS over home-brewed setups is cost. While there are plenty of great open source services that can be self-hosted, the infrastructure they require is usually more expensive than a cheap or even free Software-as-a-Service offering. One area where this applies is Git infrastructure. With GitHub, GitLab and Atlassian offering free-of-charge solutions to safely store and publish code, hosting GitLab CE, Gitea and others becomes less attractive. But with growing concerns over SaaS offerings in general, or strict non-disclosure agreements covering code, self-hosted infrastructure may be necessary nevertheless. For my needs, this infrastructure is built on Drew DeVault’s open source (AGPLv3) platform SourceHut.
SourceHut is well designed, both in architecture and in code, and it is completely modular. Individual SourceHut services can be enabled or disabled as required, each service uses its own database, and all services communicate with each other through Redis. Here is an architecture overview:
╔═══════════════════════════════╗             ╔═══════════════════════════════╗
║meta.sr.ht                     ║░            ║git.sr.ht                      ║░
║┌─────────────────────────────┐║░            ║┌─────────────────────────────┐║░
║│           service           │║░            ║│           service           │║░
║└─────────────────────────────┘║░            ║└─────────────────────────────┘║░
║┌─────────────┐ ┌─────────────┐║░            ║┌─────────────┐ ┌─────────────┐║░
║│     api     │ │  webhooks   │║◀─────┬─────▶║│     api     │ │  webhooks   │║░
║└─────────────┘ └─────────────┘║░     │      ║└─────────────┘ └─────────────┘║░
║┌────── ─────── ─────── ──────┐║░     │      ║┌────── ─────── ─────── ──────┐║░
║           database            ║░     │      ║           database            ║░
║└────── ─────── ─────── ──────┘║░     │      ║└────── ─────── ─────── ──────┘║░
╚═══════════════════════════════╝░     │      ╚═══════════════════════════════╝░
 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░     │       ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
                                       │
╔═══════════════════════════════╗      │      ╔═══════════════════════════════╗
║lists.sr.ht                    ║░     │      ║todo.sr.ht                     ║░
║┌─────────────┐ ┌─────────────┐║░     │      ║┌─────────────┐ ┌─────────────┐║░
║│   service   │ │    lmtp     │║░     │      ║│   service   │ │    lmtp     │║░
║└─────────────┘ └─────────────┘║░     │      ║└─────────────┘ └─────────────┘║░
║┌─────────────┐ ┌─────────────┐║░     │      ║┌─────────────┐ ┌─────────────┐║░
║│   process   │ │  webhooks   │║◀─────┼─────▶║│     api     │ │  webhooks   │║░
║└─────────────┘ └─────────────┘║░     │      ║└─────────────┘ └─────────────┘║░
║┌────── ─────── ─────── ──────┐║░     │      ║┌────── ─────── ─────── ──────┐║░
║           database            ║░     │      ║           database            ║░
║└────── ─────── ─────── ──────┘║░     │      ║└────── ─────── ─────── ──────┘║░
╚═══════════════════════════════╝░     │      ╚═══════════════════════════════╝░
 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░     │       ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
                                       │
╔═══════════════════════════════╗      │      ╔═══════════════════════════════╗
║hub.sr.ht                      ║░     │      ║builds.sr.ht                   ║░
║┌─────────────────────────────┐║░     │      ║┌─────────────────────────────┐║░
║│           service           │║░     │      ║│           service           │║░
║└─────────────────────────────┘║◀─────┼─────▶║└─────────────────────────────┘║░
║┌────── ─────── ─────── ──────┐║░     │      ║┌────── ─────── ─────── ──────┐║░
║           database            ║░     │      ║           database            ║░
║└────── ─────── ─────── ──────┘║░     │      ║└────── ─────── ─────── ──────┘║░
╚═══════════════════════════════╝░     │      ╚═══════════════════════════════╝░
 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░     │       ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
                                       │
╔═══════════════════════════════╗      │      ╔═══════════════════════════════╗
║dispatch.sr.ht                 ║░     │      ║paste.sr.ht                    ║░
║┌─────────────────────────────┐║░     │      ║┌─────────────────────────────┐║░
║│           service           │║░     │      ║│           service           │║░
║└─────────────────────────────┘║◀─────┤      ║└─────────────────────────────┘║░
║┌────── ─────── ─────── ──────┐║░     │      ║┌────── ─────── ─────── ──────┐║░
║           database            ║░     │      ║           database            ║░
║└────── ─────── ─────── ──────┘║░     │      ║└────── ─────── ─────── ──────┘║░
╚═══════════════════════════════╝░     │      ╚═══════════════════════════════╝░
 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░     │       ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
                                       │
╔═══════════════════════════════╗      │
║builds.sr.ht-worker            ║░     │
║┌─────────────────────────────┐║░     │
║│           service           │║◀─────┤
║└─────────────────────────────┘║░     │
╚═══════════════════════════════╝░     │
 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░     │
                                       ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                                                                             │░
│                                    Redis                                    │░
│                                                                             │░
└─────────────────────────────────────────────────────────────────────────────┘░
 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
Running a full-fledged SourceHut environment including builds.sr.ht requires at least two servers/VPSes: one for hosting everything but the builds.sr.ht-worker, and one solely for the worker. When a new manifest is submitted to the service, it adds a job to its queue, which in turn gets picked up by a worker. The worker then spins up a virtual environment for each individual build job.
While this setup can make sense for installations that handle a lot of traffic, it is overkill for smaller setups that run no more than several hundred build jobs per month.
In order to optimize the infrastructural requirements, I began digging into the builds.sr.ht code and found out a couple of things:
- It’s possible to consolidate the master and the worker into one single unit by replacing shell=/usr/bin/master-shell with shell=/usr/bin/runner-shell in the config.ini
- Provisioning of virtual environments is done through a simple script, which is configurable through the controlcmd=./images/control option in the config.ini
- Each image has its own functions script that contains platform-dependent commands for adding repositories or installing packages
- SSH connections are not only initiated through the controlcmd but also triggered from within the worker code, with little possibility to configure their parameters
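To make the two config.ini options from the list above a bit more tangible, here is a minimal Python sketch that parses them. The option names and values come from the findings above; the section names ([builds.sr.ht] and [builds.sr.ht::worker]) and the helper load_worker_settings are my assumptions for illustration, so check them against your own config.ini.

```python
import configparser

# Example fragment. The option names and values are the ones mentioned above;
# the section names are an assumption and may differ in your installation.
EXAMPLE_CONFIG = """
[builds.sr.ht]
# Use the runner shell instead of /usr/bin/master-shell so that
# master and worker can live on the same machine.
shell=/usr/bin/runner-shell

[builds.sr.ht::worker]
# Script that provisions the virtual environment for a build job.
controlcmd=./images/control
"""


def load_worker_settings(text):
    """Return the two options discussed above (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "shell": parser.get("builds.sr.ht", "shell"),
        "controlcmd": parser.get("builds.sr.ht::worker", "controlcmd"),
    }


if __name__ == "__main__":
    for key, value in load_worker_settings(EXAMPLE_CONFIG).items():
        print(f"{key} = {value}")
```

Swapping the controlcmd value for a different script is the hook that the rest of this post relies on.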
I came to the conclusion that by altering the code here and there, it should be possible to remove the need for an individual worker instance, as well as KVM/Docker altogether. By connecting a cloud service provider, builds.sr.ht would be able to provision virtual environments on individual VPSes on a per-build-job basis. Ultimately, this would allow me to spare one instance (the worker) that would otherwise be running 24/7 even when it has no work to do. It would also save me from cramming all sr.ht services onto a single, likely more powerful and higher-priced instance that can run KVM/Docker and still has enough resources left for all other components.
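The cost argument can be put into numbers with a small back-of-the-envelope sketch. Everything in it is an assumption for illustration: the hourly rate, the job count, the job duration and the billing increment are made-up placeholders, not DigitalOcean’s actual prices or billing rules, so plug in your own values.

```python
import math


def always_on_cost(hourly_rate, hours_per_month=730):
    """Monthly cost of a worker instance that runs 24/7."""
    return hourly_rate * hours_per_month


def per_job_cost(hourly_rate, jobs_per_month, minutes_per_job,
                 billing_increment_minutes=60):
    """Monthly cost when one instance is created per build job.

    Each job is rounded up to the provider's billing increment
    (assumed here to be a full hour; adjust to your provider).
    """
    increments = math.ceil(minutes_per_job / billing_increment_minutes)
    billed_hours = increments * billing_increment_minutes / 60
    return hourly_rate * billed_hours * jobs_per_month


if __name__ == "__main__":
    rate = 0.01  # hypothetical price per hour, not a real quote
    print(f"always-on worker: {always_on_cost(rate):.2f} per month")
    print(f"per-job instances: {per_job_cost(rate, 300, 10):.2f} per month")
```

With these placeholder numbers the per-job variant stays cheaper until the number of billed job hours approaches the number of hours in a month, which is the point where a permanent worker starts to make sense again.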
To cut a long story short, I changed Drew’s worker code, altered the controlcmd script and implemented my own image for a DigitalOcean Debian 10 droplet. With these modifications in place, I was able to get builds.sr.ht to spin up a new droplet for every build job, run the manifest and shut the droplet down again afterwards. My setup currently uses the smallest (and cheapest) instance type on DigitalOcean. I will make this and a couple of other settings configurable in the future, though.
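The droplet-per-job lifecycle described above could look roughly like the following Python sketch. It is not the code from my patches, only an illustration against the public DigitalOcean API v2 droplet endpoints. The helper names (api_request, send, ephemeral_droplet) are hypothetical, and the region, size and image slugs are assumed example values. The sketch destroys the droplet instead of merely powering it off, on the assumption that a powered-off droplet would still be billed.

```python
import contextlib
import json
import urllib.request

API = "https://api.digitalocean.com/v2"


def api_request(token, method, path, body=None):
    """Build an authenticated request for the DigitalOcean API."""
    data = json.dumps(body).encode() if body is not None else None
    return urllib.request.Request(
        API + path,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def send(request):
    """Send a request and return the decoded JSON body, or None if empty."""
    with urllib.request.urlopen(request) as response:
        raw = response.read()
        return json.loads(raw) if raw else None


@contextlib.contextmanager
def ephemeral_droplet(token, job_id, send=send):
    """Create a droplet for one build job and destroy it afterwards."""
    spec = {
        "name": f"build-{job_id}",
        "region": "fra1",          # assumed region slug
        "size": "s-1vcpu-1gb",     # assumed smallest size slug
        "image": "debian-10-x64",  # assumed Debian 10 image slug
    }
    droplet = send(api_request(token, "POST", "/droplets", spec))["droplet"]
    try:
        yield droplet
    finally:
        # Always clean up, even if the build job raised an exception.
        send(api_request(token, "DELETE", f"/droplets/{droplet['id']}"))
```

A build job would then run inside `with ephemeral_droplet(token, job_id) as droplet:`. A real integration additionally has to wait until the droplet is active and reachable over SSH before running the manifest, which is left out here.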
It’s possible to add other cloud providers using the same approach. The code for this can be found on my SourceHut instance or on sr.ht. Right now it’s a proof of concept and hence a little hacky, mainly because SourceHut lacks clean interfaces, but I’m working on integrating it more nicely. Drew has already stated that this approach is not supported (by him), so I will try to build it more like an extension for SourceHut, which ideally wouldn’t require monkey-patching any of Drew’s code.
Write to me if you’re interested in testing this extension yourself or if you’d like to contribute.
Happy building!