• A whirlwind tour of systemd-nspawn containers

    In the last yearly update, I talked about isolating my self-hosted LLMs running in Ollama, as well as Open WebUI, in systemd-nspawn containers and promised a blog post about it. However, while I was writing that post, a footnote on why I use systemd-nspawn instead of Docker accidentally turned into a full blog post of its own. Here’s the actual post on systemd-nspawn.

    Fundamentally, systemd-nspawn is a lightweight Linux namespaces-based container technology, not dissimilar to Docker. The difference is mostly in image management—whereas Docker images are built from Dockerfiles and distributed as prebuilt, read-only images of ready-to-run software, systemd-nspawn is typically used with a writable root filesystem, functioning more like a virtual machine. For those of you who remember using chroot to run software from a different Linux distro, it can also be described as chroot on steroids.

    I find systemd-nspawn especially useful in the following scenarios:

    1. When you want to run some software with some degree of isolation on a VPS, where you can’t create a full virtual machine due to nested virtualization not being available1;
    2. When you need to share access to hardware, such as a GPU (which is why I run LLMs in systemd-nspawn);
    3. When you don’t want the overhead of virtualization;
    4. When you want to directly access some files on the host system without resorting to virtiofs; and
    5. When you would normally use Docker but can’t or don’t want to. For the reasons why, see the footnote-turned-blog post.

    In this post, I’ll describe the process of setting up systemd-nspawn containers and how to use them in some common scenarios.
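
    As a taste of what’s involved, bootstrapping and booting a minimal Debian container looks roughly like this (a sketch with a placeholder machine name, not the exact commands from the post):

        # Bootstrap a minimal Debian root filesystem into the standard machine
        # directory ("mymachine" is just a placeholder name):
        debootstrap stable /var/lib/machines/mymachine

        # Boot it as a full container with its own init and journal:
        systemd-nspawn -D /var/lib/machines/mymachine -b

        # Or manage it like any other systemd-managed machine:
        machinectl start mymachine
        machinectl shell mymachine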

    (Read more...)
  • Docker considered harmful

    In the last yearly update, I talked about isolating my self-hosted LLMs running in Ollama, as well as Open WebUI, in systemd-nspawn containers. However, as I contemplated writing a blog post about that setup, I realized the inevitable question would be: why not run it in Docker?

    After all, Docker is super popular in self-hosting circles for its “convenience” and “security.” There’s a vast repository of images for almost any software you might want. You can run almost anything with a simple docker run, and it’ll run securely in a container. What’s not to like?

    This is probably going to be one of my most controversial blog posts, but the truth is that over the past decade, I’ve run into so many issues with Docker that I’ve simply had enough of it. I now avoid Docker like the plague. In fact, if some software is only available as a Docker container—or worse, requires Docker Compose—I sigh and create a full VM to lock away the madness.

    This may seem extreme, but fundamentally, it boils down to several things:

    1. The Docker daemon’s complete overreach;
    2. Docker’s lack of UID isolation by default;
    3. Docker’s lack of init by default; and
    4. The quality of Docker images.

    Let’s dive into this.
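
    To make points 2 and 3 concrete before diving in, here’s a quick experiment you can try on a stock Docker installation (assuming no user namespace remapping has been configured):

        # Point 2: by default, uid 0 inside a container is uid 0 on the host.
        docker run -d --rm --name uid-demo alpine sleep 300
        ps -o user,pid,cmd -C sleep   # the container's sleep shows up as root on the host
        docker rm -f uid-demo

        # Point 3: there is no init inside the container unless you opt in,
        # so PID 1 is whatever you ran, and nothing reaps orphaned zombies.
        docker run --rm alpine ps -o pid,comm          # PID 1 is ps itself
        docker run --rm --init alpine ps -o pid,comm   # PID 1 is docker-init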

    (Read more...)
  • On ECC RAM on AMD Ryzen

    Last time, I talked about how a bad stick of RAM drove me into buying ECC RAM for my Ryzen 9 3900X home server build—mostly because ECC would have been able to detect that something was wrong with the RAM and correct single-bit errors, which would have saved me a ton of headaches.

    Now that I’ve received the RAM and run it for a while, I’ll write about the entire experience of getting it working and my attempts to cause errors to verify the ECC functionality.

    Spoilers: Injecting faults was way harder than it appeared from online research.
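
    For context, once the kernel’s EDAC driver recognizes the memory controller, the reported error counts are easy to check (a sketch assuming a single memory controller and the rasdaemon tools; paths may differ on your system):

        # Corrected and uncorrected error counts from the EDAC subsystem:
        cat /sys/devices/system/edac/mc/mc0/ce_count
        cat /sys/devices/system/edac/mc/mc0/ue_count

        # rasdaemon keeps a persistent log of individual error events:
        ras-mc-ctl --status
        ras-mc-ctl --error-count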

    (Read more...)
  • 2024: Year in Review

    For the past two years, I’ve been writing year-end reviews to look back on the year gone by and reflect on what happened. I thought I might as well continue the tradition this year.

    However, I’ll try a new format—instead of grouping by month, I’ll group by area. I’ll focus on the following areas:

    1. BGP and operating my own autonomous system;
    2. My homebrew CDN for this blog;
    3. My home server;
    4. My new mechanical keyboard;
    5. My travel router project; and
    6. My music hobby.

    Without further ado, let’s begin.

    (Read more...)
  • On btrfs and memory corruption

    As you may have heard, I have a home server, which hosts mirror.quantum5.ca and doubles as my home NAS. To protect my data, I am running btrfs, which has data checksums so that bit rot can be detected, and I am using the raid1 mode so that btrfs can recover from such events and restore the correct data. In this mode, btrfs ensures that there are two copies of every file, each on a distinct drive. In theory, if one of the copies is damaged by bit rot, or even if an entire drive is lost, all the data can still be recovered.
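
    For reference, an array of this kind is created and checked with commands roughly like these (a generic sketch with placeholder device names and mount point, not my exact setup):

        # Keep two copies of both data and metadata, each on a different drive:
        mkfs.btrfs -d raid1 -m raid1 /dev/sdX /dev/sdY
        mount /dev/sdX /mnt/nas

        # Read back every block, verify checksums, and repair from the good copy:
        btrfs scrub start -B /mnt/nas

        # Per-device counters for read, write, and corruption errors:
        btrfs device stats /mnt/nas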

    For years, this setup has worked perfectly. I originally created this btrfs array on my old Atomic Pi, with drives inside a USB HDD dock, and the same array is still running on my current Ryzen home server—five years later—even after a bunch of drive changes and capacity upgrades.

    In the past week, however, my NAS has experienced some terrible data corruption issues. Most annoyingly, it damaged a backup that I needed to restore, forcing me to perform some horrific sorcery to recover most of the data. After a whole day of troubleshooting, I was eventually able to track the problem down to a bad stick of RAM. Removing it enabled my NAS to function again, albeit with less available RAM than before.

    I will now explain my setup and detail the entire event for posterity, including my thoughts on how btrfs fared against such memory corruption, how I managed to mostly recover the broken backup, and what might be done to prevent this in the future.

    (Read more...)
  • Implementing ASPA validation in the bird2 filter language

    When we looked at route authorization, we discussed how Resource Public Key Infrastructure (RPKI)—or more specifically, route origin authorizations—could prevent some types of BGP hijacking, but not all of them. We also mentioned that Autonomous System Provider Authorization (ASPA), a draft standard that extends RPKI to also authenticate the AS path, could prevent unauthorized networks from acting as upstreams. (For more information about upstreams, see my post on autonomous systems.)

    Essentially, an ASPA is a type of resource certificate in RPKI, just like Route Origin Authorizations (ROAs), which describe which ASNs are allowed to announce a certain IP prefix. ASPAs, however, describe which networks are allowed to act as upstreams for a given AS.

    There are two parts to deploying ASPA:

    1. Creating an ASPA resource certificate for your network and publishing it, so that everyone knows who your upstreams are; and
    2. Checking routes you receive from other networks, rejecting the ones that are invalid according to ASPA.

    The first part is fairly straightforward, with RPKI software like Krill offering support out of the box. One simply has to set up delegated RPKI with the RIR that issued the ASN. I’ll give a quick overview of the process, but it’s not the main focus today.

    Unfortunately, the second part is less than trivial, since ASPA is just a draft standard and not widely supported by router software. Only OpenBGPD, which I don’t use, has implemented experimental support. However, that doesn’t mean we can’t use ASPA today—we simply need to implement it ourselves. Thus, I embarked on this journey to implement ASPA filtering in the bird2 filter language.

    (Read more...)
  • Installing Debian (and Proxmox) by hand from a rescue environment

    Normally, installing Debian is a straightforward process: you boot from the installer CD image and follow the menu options in debian-installer. Simple, right? Or even easier, just use the Debian image provided by your server vendor, since Debian is quite popular and an image is bound to be available. Given this simplicity, you might have idly wondered: what’s actually going on behind debian-installer’s pretty menus? Well, you’re about to find out.

    You see, I recently got this cheap headless dedicated server without IPMI1—really, just an Intel N100 mini PC. To cut costs, there is no video feed, as that would require separate hardware to capture and stream the screen. Instead, there’s only the ability to power cycle the machine and boot it from PXE, which is used for a variety of tasks, such as booting rescue CDs or performing automated operating system installations. This shouldn’t have been a problem for my use case, since there was a Proxmox 8 image right there, and I just set it to install automatically.

    Of course that didn’t work, because I wouldn’t be writing about it if it had! As it turns out, the Proxmox 8 image (and the Debian 12 image, for that matter) didn’t include the firmware for the mini PC’s Realtek NICs, which prevented the network from working. I thought I just needed to install the firmware package, but when I booted into the included Finnix rescue system, it appeared that Debian wasn’t installed at all! Clearly, the PXE installer failed to start due to the missing firmware.

    What now? Well, I’ve done some pretty sketchy Debian installs in the past, so I thought I might as well go all out and install a full Debian system from the rescue environment. Unlike last time, though, I’ll do a completely clean install instead of keeping the existing partition scheme.
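
    The core of such an install is surprisingly small. It boils down to something like the following (a heavily simplified sketch with placeholder device names, glossing over partitioning, fstab, networking, non-free-firmware sources, and the BIOS/UEFI differences covered in the post):

        # Format and mount the target root filesystem:
        mkfs.ext4 /dev/sdX2
        mount /dev/sdX2 /mnt

        # Bootstrap a minimal Debian 12 into it:
        debootstrap bookworm /mnt http://deb.debian.org/debian

        # Bind the virtual filesystems and chroot in to install a kernel,
        # the missing Realtek firmware, and a bootloader:
        for fs in dev proc sys; do mount --rbind /$fs /mnt/$fs; done
        chroot /mnt apt install -y linux-image-amd64 firmware-realtek grub-pc
        chroot /mnt grub-install /dev/sdX
        chroot /mnt update-grub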

    (Read more...)
  • Custom mechanical keyboard: OS-specific custom RGB lighting with QMK

    My old Corsair keyboard has been struggling recently. It has some weird issues, either in hardware or firmware, that cause it to sometimes go crazy and randomly “press” the wrong keys, forcing me to pull out my backup keyboard until the lunacy1 passes. On top of that, managing it requires Corsair’s bloated, Windows-only iCUE software or a reverse-engineered alternative like ckb-next, which isn’t fun for a Linux user like me. Even with ckb-next, the customization is limited.

    So I figured I’d get a new keyboard. I have a few simple requirements:

    1. It should be a 100% keyboard because I use the numpad quite a bit for number entry, e.g. to manage my personal finances;
    2. It should have a backlight since I often use my computer at night in relative darkness, and while I can touch type just fine, being able to see the keyboard is nice;
    3. It should have tactile mechanical switches, but not the obnoxious clicky ones. For reference, my old keyboard has Cherry MX browns, which I liked; and
    4. It should have properly programmable and customizable firmware. QMK is the popular option, so I searched for keyboards supporting it or, failing that, at least keyboards with proper first-party Linux support.

    As it turned out, I couldn’t find any prebuilt mechanical keyboards that ticked all the boxes and were in stock, so I figured I might as well get into the custom mechanical keyboard scene and build my own. Thus began a journey of immense frustration and nerd-sniping…

    (Read more...)
  • On the Inter-RIR transfer of AS200351 from RIPE NCC to ARIN

    As you might know already, on May 24, 2024, at the RIPE NCC General Meeting, model C of the 2025 charging scheme was adopted. I will not go into the details here, such as the lack of an option to preserve the status quo1, but model C involved adding an annual fee of 50 EUR per ASN, billed to the sponsoring LIR. This meant that the sponsoring LIR for AS200351 would be forced to bill me at least 50 EUR annually for the ASN, plus some administrative overhead and fees for payment processing2.

    To protest against this fee and save myself some money, I decided to transfer AS200351 to ARIN, which charges nothing extra for me to hold an additional ASN, given that my current service category at ARIN allows up to 3 ASNs and I already had only one ASN there: AS54148.

    And so, on June 2nd, I decided to initiate the process to transfer AS200351, which was in active use, to ARIN. As it turned out, this became an ordeal, especially on the RIPE NCC end. Since I’ve been asked many times about the process, I am writing this post to share my experience, so that you know what to expect.

    (Read more...)
  • Cloning Proxmox with LVM Thin Pools

    During Black Friday last year, I got tempted by a super good offer for a dedicated server in Kansas City, with the option of connecting it to the Kansas City Internet Exchange (KCIX). Here are the specs:

    • Intel Xeon E5-2620 v4 (8 cores, 16 threads)
    • 64 GB DDR4 RAM
    • 500 GB SSD
    • 1 Gbps unmetered bandwidth

    It was the perfect thing for AS200351 (if a bit overkill), so I just had to take it. I set it up during the winter holidays, having decided to install Proxmox to run a bunch of virtual machines, and all was well. Except for one thing—the disk.

    You see, the server came with a fancy SAN: exactly 500 GiB of storage mounted over iSCSI via 10 Gbps Ethernet, backed by a highly reliable ZFS volume (zvol). While this all sounds good on paper, in practice I was barely able to exceed 200 MB/s of I/O, even at large block sizes. Nothing I did seemed to help, so I asked the provider to switch it to physical drives.

    Having configured Proxmox just the way I wanted it, I opted against reinstalling it from scratch, instead choosing to clone the disk. The provider suggested using Clonezilla, which should be able to do this sort of disk cloning very quickly. So we found an agreeable time, took the server down, and booted Clonezilla over PXE. All should be good, right?

    As it turns out, this ended up being a super painful experience.

    Editorial note: This story is based on my memory and incomplete console output. While the broad story is correct, the commands provided may not be.
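
    For a sense of what cloning LVM thin pools by hand involves, here’s a generic sketch (placeholder device names, volume names, and sizes; this is not the procedure from the post, which used Clonezilla): with both disks visible, rename the old volume group out of the way, rebuild the layout on the new disk, and copy each thin volume across.

        # Rename the old VG so the new disk can take over the "pve" name:
        vgrename pve pve-old

        # Recreate the volume group and thin pool on the new disk:
        pvcreate /dev/sdY3
        vgcreate pve /dev/sdY3
        lvcreate -L 400G --thinpool data pve

        # For each thin volume, create a matching one and copy the contents
        # (sizes taken from `lvs pve-old`):
        lvcreate -V 32G --thin -n vm-100-disk-0 pve/data
        dd if=/dev/pve-old/vm-100-disk-0 of=/dev/pve/vm-100-disk-0 bs=4M status=progress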

    (Read more...)