Posts from 2024

  • 2024: Year in Review

    For the past two years, I’ve been writing year-end reviews to look back upon the year that had gone by and reflect on what had happened. I thought I might as well continue the tradition this year.

    However, I’ll try a new format: instead of grouping by month, I’ll group by area. I’ll focus on the following areas:

    1. BGP and operating my own autonomous system;
    2. My homebrew CDN for this blog;
    3. My home server;
    4. My new mechanical keyboard;
    5. My travel router project; and
    6. My music hobby.

    Without further ado, let’s begin.

    (Read more...)
  • On btrfs and memory corruption

    As you may have heard, I have a home server, which hosts mirror.quantum5.ca and doubles as my home NAS. To ensure my data is protected, I am running btrfs, which has data checksums to ensure that bit rot can be detected, and I am using the raid1 mode to enable btrfs to recover from such events and restore the correct data. In this mode, btrfs ensures that there are two copies of every file, each on a distinct drive. In theory, if one of the copies is damaged due to bit rot, or even if an entire drive is lost, all of your data can still be recovered.
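    To illustrate the idea of checksummed, mirrored storage described above, here is a toy sketch in Python. It is not how btrfs is implemented: the `MirroredBlock` class is hypothetical, and `zlib.crc32` stands in for whatever checksum the filesystem actually uses. It only shows why one good copy plus a stored checksum is enough to detect a damaged copy and repair it.

```python
import zlib


class MirroredBlock:
    """Toy model: one block stored as two copies plus a checksum (hypothetical)."""

    def __init__(self, data: bytes):
        # Checksum is computed once, at write time.
        self.checksum = zlib.crc32(data)
        # Two independent copies, standing in for two distinct drives.
        self.copies = [bytearray(data), bytearray(data)]

    def read(self) -> bytes:
        # Return the first copy whose checksum matches, repairing the others.
        for i, copy in enumerate(self.copies):
            if zlib.crc32(copy) == self.checksum:
                for j in range(len(self.copies)):
                    if j != i:
                        self.copies[j] = bytearray(copy)
                return bytes(copy)
        raise IOError("all copies failed checksum verification")


block = MirroredBlock(b"important data")
block.copies[0][0] ^= 0xFF      # simulate bit rot on the first "drive"
recovered = block.read()        # falls back to the second copy and repairs the first
```

    Note that the sketch trusts the checksum computed at write time, so it only models corruption that happens after the data has been stored.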

    For years, this setup has worked perfectly. I originally created this btrfs array on my old Atomic Pi, with drives inside a USB HDD dock, and the same array is still running on my current Ryzen home server—five years later—even after a bunch of drive changes and capacity upgrades.

    In the past week, however, my NAS has experienced some terrible data corruption issues. Most annoyingly, it damaged a backup that I needed to restore, forcing me to perform some horrific sorcery to recover most of the data. After a whole day of troubleshooting, I was eventually able to track the problem down to a bad stick of RAM. Removing it enabled my NAS to function again, albeit with less available RAM than before.

    I will now explain my setup and detail the entire event for posterity, including my thoughts on how btrfs fared against such memory corruption, how I managed to mostly recover the broken backup, and what might be done to prevent this in the future.

    (Read more...)
  • Implementing ASPA validation in the bird2 filter language

    When we looked at route authorization, we discussed how Resource Public Key Infrastructure (RPKI), or more specifically, route origin authorizations, could prevent some types of BGP hijacking, but not all of them. We also mentioned that Autonomous System Provider Authorization (ASPA), a draft standard that extends RPKI to also authenticate the AS path, could prevent unauthorized networks from acting as upstreams. (For more information about upstreams, see my post on autonomous systems.)

    Essentially, an ASPA is a type of resource certificate in RPKI, just like a Route Origin Authorization (ROA). Whereas ROAs describe which ASNs are allowed to announce a certain IP prefix, ASPAs describe which networks are allowed to act as upstreams for any given AS.
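    As a rough illustration of what an ASPA expresses, the sketch below checks a single customer-to-provider hop against a table of ASPAs. This is only a simplified hop check in Python, not the full AS path verification procedure from the draft and not the bird2 implementation from the post. The table contents, the `check_hop` function, and the `Hop` names are all assumptions made for the example, and the ASNs come from the range reserved for documentation.

```python
from enum import Enum


class Hop(Enum):
    PROVIDER = "provider"              # ASPA exists and lists this upstream
    NOT_PROVIDER = "not provider"      # ASPA exists but does not list it
    NO_ATTESTATION = "no attestation"  # the customer AS published no ASPA


# Hypothetical ASPA data: customer ASN -> set of authorized provider ASNs.
# These are documentation ASNs, not real networks.
ASPA_TABLE = {
    64500: {64510, 64511},
    64501: {64510},
}


def check_hop(customer: int, provider: int, table=ASPA_TABLE) -> Hop:
    """Classify whether `provider` may act as an upstream of `customer`."""
    providers = table.get(customer)
    if providers is None:
        return Hop.NO_ATTESTATION
    return Hop.PROVIDER if provider in providers else Hop.NOT_PROVIDER


authorized = check_hop(64500, 64510)    # listed upstream
unauthorized = check_hop(64501, 64511)  # ASPA exists, upstream not listed
unknown = check_hop(64502, 64510)       # no ASPA published at all
```

    A real validator would apply a check like this along the whole AS path and treat the "no attestation" case separately from an outright failure, since most networks have not published ASPAs yet.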

    There are two parts to deploying ASPA:

    1. Creating an ASPA resource certificate for your network and publishing it, so that everyone knows who your upstreams are; and
    2. Checking routes you receive from other networks, rejecting the ones that are invalid according to ASPA.

    The first part is fairly straightforward, with RPKI software like Krill offering support out of the box. One simply has to set up delegated RPKI with the RIR that issued the ASN. I’ll give a quick overview of the process, but it’s not the main focus today.

    Unfortunately, the second part is less than trivial, since ASPA is just a draft standard and is not widely supported by router software. Only OpenBGPD, which I don’t use, has implemented experimental support. However, that doesn’t mean we can’t use ASPA today; we simply need to implement it ourselves. Thus, I embarked on this journey to implement ASPA filtering in the bird2 filter language.

    (Read more...)
  • Installing Debian (and Proxmox) by hand from a rescue environment

    Normally, installing Debian is a simple process: you boot from the installer CD image and follow the menu options in debian-installer. Simple, right? Or even easier, just use the Debian image provided by your server vendor, since Debian is quite popular and an image is bound to be available. Given the simplicity of this, you might have idly wondered: what’s actually going on behind debian-installer’s pretty menus? Well, you are about to find out.

    You see, recently, I got this cheap headless dedicated server without IPMI (really, just an Intel N100 mini PC). To cut costs, there was no video feed, as that would require separate hardware to receive and stream the screen. Instead, there’s only the ability to power cycle and boot from PXE, which is used to perform a variety of tasks, such as booting rescue CDs or performing automated installation of operating systems. This shouldn’t be a problem for my use case, since there is a Proxmox 8 image right there, and I just set it to install automatically.

    Of course that didn’t work, because I wouldn’t be writing about it if it did! As it turns out, the Proxmox 8 image (and also the Debian 12 image) didn’t have the firmware for the Realtek NICs on the mini PC, which prevented them from working. I thought that I just needed to install the firmware package, but when I booted into the included Finnix rescue system, it appeared that Debian wasn’t installed at all! Clearly, the PXE installer failed to start due to the missing firmware.

    What now? Well, I’ve already done some pretty sketchy Debian installs in the past, so I thought I might as well just go all out and install a full Debian system through the rescue system. Unlike last time though, I’ll do a complete clean install, instead of keeping the partition scheme.

    (Read more...)
  • Custom mechanical keyboard: OS-specific custom RGB lighting with QMK

    My old Corsair keyboard has been struggling recently. It has some weird issues, either in hardware or firmware, that cause it to sometimes go crazy and randomly “press” the wrong keys, forcing me to pull out my backup keyboard until the lunacy passes. On top of that, managing it requires Corsair’s bloated, Windows-only iCUE software or a reverse-engineered alternative like ckb-next, which isn’t fun for a Linux user like me, and even with ckb-next, the customization is limited.

    So I figured I’d get a new keyboard. I have a few simple requirements:

    1. It should be a 100% keyboard because I use the numpad quite a bit for number entry, e.g. to manage my personal finances;
    2. It should have a backlight since I often use my computer at night in relative darkness, and while I can touch type just fine, being able to see the keyboard is nice;
    3. It should have tactile mechanical switches, but not the obnoxious clicky ones. For reference, my old keyboard has Cherry MX browns, which I liked; and
    4. It should have properly programmable and customizable firmware. QMK is the popular option, so I searched for keyboards supporting that, and failing that, at least keyboards with proper first-party Linux support.

    As it turned out, I couldn’t find any prebuilt mechanical keyboards that ticked all the boxes and were in stock, so I figured I might just get into the custom mechanical keyboard scene and build my own. Thus began a journey that saw immense frustration and nerd-sniping…

    (Read more...)
  • On the Inter-RIR transfer of AS200351 from RIPE NCC to ARIN

    As you might know already, on May 24, 2024, at the RIPE NCC General Meeting, model C for the 2025 charging scheme was adopted. I will not go into the details here, such as the lack of an option to preserve the status quo, but model C involved adding an annual fee of 50 EUR per ASN, billed to the sponsoring LIR. This meant that the sponsoring LIR for AS200351 would be forced to bill me annually for at least 50 EUR for the ASN, plus some administrative overhead and fees for payment processing.

    To protest against this fee and save myself some money, I decided to transfer AS200351 to ARIN, which charges no extra for me to hold an additional ASN, given that my current service category at ARIN allows up to 3 ASNs, and I only had one ASN already with ARIN: AS54148.

    And so, on June 2nd, I decided to initiate the process to transfer AS200351, which was in active use, to ARIN. As it turned out, this became an ordeal, especially on the RIPE NCC end. Since I’ve been asked many times about the process, I am writing this post to share my experience, so that you know what to expect.

    (Read more...)
  • Cloning Proxmox with LVM Thin Pools

    During Black Friday last year, I got tempted by a super good offer of a dedicated server in Kansas City with the option of connecting it to the Kansas City Internet Exchange (KCIX). Here are the specs:

    • Intel Xeon E5-2620 v4 (8 cores, 16 threads)
    • 64 GB DDR4 RAM
    • 500 GB SSD
    • 1 Gbps unmetered bandwidth

    It was such a perfect fit for AS200351 (if a bit overkill), so I just had to take it. I set it up during the winter holidays, having decided to install Proxmox to run a bunch of virtual machines, and all was well. Except for one thing: the disk.

    You see, the server came with a fancy SAN, with exactly 500 GiB of storage mounted over iSCSI via 10 Gbps Ethernet, backed by a highly reliable ZFS volume (zvol). While this all sounds good on paper, in practice I was barely able to exceed 200 MB/s when doing I/O, even at large block sizes. Nothing I did seemed to help, so I asked the provider to switch it to physical drives.

    Having configured Proxmox just the way I wanted it, I opted against reinstalling it from scratch, instead choosing to clone the disk. The provider suggested using Clonezilla, which should be able to do this sort of disk cloning very quickly. So we found an agreeable time, took the server down, and booted Clonezilla over PXE. All should be good, right?

    As it turns out, this ended up being a super painful experience.

    Editorial note: This story is based on my memory and incomplete console output. While the broad story is correct, the commands provided may not be correct.

    (Read more...)