  • 2024: Year in Review

    For the past two years, I’ve been writing year-end reviews to look back upon the year that had gone by and reflect on what had happened. I thought I might as well continue the tradition this year.

    However, I’ll try a new format: instead of grouping by month, I’ll group it by area. I’ll focus on the following areas:

    1. BGP and operating my own autonomous system;
    2. My homebrew CDN for this blog;
    3. My home server;
    4. My new mechanical keyboard;
    5. My travel router project; and
    6. My music hobby.

    Without further ado, let’s begin.

    (Read more...)
  • On btrfs and memory corruption

    As you may have heard, I have a home server, which hosts mirror.quantum5.ca and doubles as my home NAS. To ensure my data is protected, I am running btrfs, which has data checksums to ensure that bit rot can be detected, and I am using the raid1 mode to enable btrfs to recover from such events and restore the correct data. In this mode, btrfs ensures that there are two copies of every file, each on a distinct drive. In theory, if one of the copies is damaged due to bit rot, or even if an entire drive is lost, all of your data can still be recovered.
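
    The recovery idea described above can be sketched in a few lines. This is only an illustrative model, not how btrfs is implemented: the names `checksum` and `read_block` are hypothetical, and `zlib.crc32` stands in for whichever checksum algorithm the filesystem is actually configured to use.

```python
import zlib


def checksum(data: bytes) -> int:
    # Stand-in checksum for illustration only (assumption: any checksum
    # that detects corruption serves the purpose of this sketch).
    return zlib.crc32(data)


def read_block(copies: list[bytes], expected: int) -> bytes:
    """Return the first copy whose checksum matches the stored value.

    `copies` models the two raid1 copies of one block, each on a
    different drive. If every copy fails verification, the data is
    unrecoverable and an error is raised.
    """
    for data in copies:
        if checksum(data) == expected:
            return data
    raise IOError("all copies failed checksum verification")


# Hypothetical example: one copy has suffered bit rot on disk.
good = b"backup data"
stored_sum = checksum(good)
rotted = b"backup dat\x00"
recovered = read_block([rotted, good], stored_sum)
```

    Note that this sketch only models corruption that happens after the checksum was computed. It does not model data that was already damaged in memory before being checksummed and written.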

    For years, this setup has worked perfectly. I originally created this btrfs array on my old Atomic Pi, with drives inside a USB HDD dock, and the same array is still running on my current Ryzen home server—five years later—even after a bunch of drive changes and capacity upgrades.

    In the past week, however, my NAS has experienced some terrible data corruption issues. Most annoyingly, it damaged a backup that I needed to restore, forcing me to perform some horrific sorcery to recover most of the data. After a whole day of troubleshooting, I was eventually able to track the problem down to a bad stick of RAM. Removing it enabled my NAS to function again, albeit with less available RAM than before.

    I will now explain my setup and detail the entire event for posterity, including my thoughts on how btrfs fared against such memory corruption, how I managed to mostly recover the broken backup, and what might be done to prevent this in the future.

    (Read more...)
  • Implementing ASPA validation in the bird2 filter language

    When we looked at route authorization, we discussed how Resource Public Key Infrastructure (RPKI), or more specifically, route origin authorizations, could prevent some types of BGP hijacking, but not all of them. We also mentioned that Autonomous System Provider Authorization (ASPA), a draft standard that extends RPKI to also authenticate the AS path, could prevent unauthorized networks from acting as upstreams. (For more information about upstreams, see my post on autonomous systems.)

    Essentially, an ASPA is a type of resource certificate in RPKI, just like Route Origin Authorizations (ROAs), which describe which ASNs are allowed to announce a certain IP prefix. ASPAs, however, describe which networks are allowed to act as upstreams for any given AS.
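
    To make the idea concrete, here is a minimal sketch of the per-hop question an ASPA answers: "is this network an authorized upstream of that AS?" It is a simplification, not the full path verification procedure from the draft and not the bird filter code from the post. The `ASPA_RECORDS` table, the `hop_check` name, and the ASNs (taken from the range reserved for documentation) are all hypothetical.

```python
from typing import Dict, Optional, Set

# Hypothetical ASPA data: customer ASN -> set of authorized provider ASNs.
ASPA_RECORDS: Dict[int, Set[int]] = {
    64500: {64501, 64502},
}


def hop_check(customer: int, provider: int,
              records: Optional[Dict[int, Set[int]]] = None) -> str:
    """Classify a single (customer, provider) pair.

    Returns "no-attestation" when the customer has published no ASPA,
    "provider" when the claimed upstream is listed, and "not-provider"
    when an ASPA exists but does not list the claimed upstream.
    """
    if records is None:
        records = ASPA_RECORDS
    providers = records.get(customer)
    if providers is None:
        return "no-attestation"
    return "provider" if provider in providers else "not-provider"


print(hop_check(64500, 64501))  # authorized upstream
print(hop_check(64500, 64510))  # ASPA exists, upstream not listed
print(hop_check(64511, 64501))  # customer has no ASPA at all
```

    A real validator would apply a check like this to each adjacent pair of ASNs in the AS path and combine the results, which is where most of the complexity lies.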

    There are two parts to deploying ASPA:

    1. Creating an ASPA resource certificate for your network and publishing it, so that everyone knows who your upstreams are; and
    2. Checking routes you receive from other networks, rejecting the ones that are invalid according to ASPA.

    The first part is fairly straightforward, with RPKI software like Krill offering support out of the box. One simply has to set up delegated RPKI with the RIR that issued the ASN. I’ll give a quick overview of the process, but it’s not the main focus today.

    Unfortunately, the second part is less than trivial, since ASPA is just a draft standard, not widely supported by router software. Only OpenBGPd, which I don’t use, has implemented experimental support. However, that doesn’t mean we can’t use ASPA today—we simply need to implement it ourselves. Thus, I embarked on this journey to implement ASPA filtering in the bird 2 filter language.

    (Read more...)
  • Installing Debian (and Proxmox) by hand from a rescue environment

    Normally, installing Debian is a simple process: you boot from the installer CD image and follow the menu options in debian-installer. Simple, right? Or even easier, just use the Debian image provided by your server vendor, since Debian is quite popular and an image is bound to be available. Given the simplicity of this, you might have idly wondered: what’s actually going on behind debian-installer’s pretty menus? Well, you are about to find out.

    You see, recently, I got this cheap headless dedicated server without IPMI (really, just an Intel N100 mini PC). To cut costs, there was no video feed, as that would require separate hardware to receive and stream the screen. Instead, there’s only the ability to power cycle and boot from PXE, which is used to perform a variety of tasks, such as booting rescue CDs or performing automated installation of operating systems. This shouldn’t be a problem for my use case, since there is a Proxmox 8 image right there, and I just set it to install automatically.

    Of course that didn’t work, because I wouldn’t be writing about it if it did! As it turns out, the Proxmox 8 image (and also the Debian 12 image) didn’t have the firmware for the Realtek NICs on the mini PC, which prevented them from working. I thought that I just needed to install the firmware package, but when I booted into the included Finnix rescue system, it appeared that Debian wasn’t installed at all! Clearly, the PXE installer failed to start due to the missing firmware.

    What now? Well, I’ve already done some pretty sketchy Debian installs in the past, so I thought I might as well just go all out and install a full Debian system through the rescue system. Unlike last time though, I’ll do a complete clean install, instead of keeping the partition scheme.

    (Read more...)
  • Custom mechanical keyboard: OS-specific custom RGB lighting with QMK

    My old Corsair keyboard has been struggling recently. It has some weird issues, either in hardware or firmware, that cause it to sometimes go crazy and randomly “press” the wrong keys, forcing me to pull out my backup keyboard until the lunacy passes. On top of that, managing it requires Corsair’s bloated, Windows-only iCUE software or a reverse-engineered alternative like ckb-next, which isn’t fun for a Linux user like me, and even with ckb-next, the customization is limited.

    So I figured I’d get a new keyboard. I have a few simple requirements:

    1. It should be a 100% keyboard because I use the numpad quite a bit for number entry, e.g. to manage my personal finances;
    2. It should have a backlight since I often use my computer at night in relative darkness, and while I can touch type just fine, being able to see the keyboard is nice;
    3. It should have tactile mechanical switches, but not the obnoxious clicky ones. For reference, my old keyboard has Cherry MX browns, which I liked; and
    4. It should have properly programmable and customizable firmware. QMK is the popular option, so I searched for keyboards supporting that, and failing that, at least keyboards with proper first-party Linux support.

    As it turned out, I couldn’t find any prebuilt mechanical keyboards that ticked all the options and were in stock, so I figured I might just get into the custom mechanical keyboard scene and build my own. Thus began a journey that saw immense frustration and nerd-sniping…

    (Read more...)
  • On the Inter-RIR transfer of AS200351 from RIPE NCC to ARIN

    As you might know already, on May 24, 2024, at the RIPE NCC General Meeting, model C for the 2025 charging scheme was adopted. I will not go into the details here, such as the lack of an option to preserve the status quo, but model C involved adding an annual fee of 50 EUR per ASN, billed to the sponsoring LIR. This meant that the sponsoring LIR for AS200351 would be forced to bill me annually for at least 50 EUR for the ASN, plus some administrative overhead and fees for payment processing.

    To protest against this fee and save myself some money, I decided to transfer AS200351 to ARIN, which charges no extra for me to hold an additional ASN, given that my current service category at ARIN allows up to 3 ASNs, and I only had one ASN already with ARIN: AS54148.

    And so, on June 2nd, I decided to initiate the process to transfer AS200351, which was in active use, to ARIN. As it turned out, this became an ordeal, especially on the RIPE NCC end. Since I’ve been asked many times about the process, I am writing this post to share my experience, so that you know what to expect.

    (Read more...)
  • Cloning Proxmox with LVM Thin Pools

    During Black Friday last year, I got tempted by a super good offer of a dedicated server in Kansas City with the option of connecting it to the Kansas City Internet Exchange (KCIX). Here are the specs:

    • Intel Xeon E5-2620 v4 (8 cores, 16 threads)
    • 64 GB DDR4 RAM
    • 500 GB SSD
    • 1 Gbps unmetered bandwidth

    It was just the perfect thing for AS200351 (if a bit overkill), so I had to take it. I set it up during the winter holidays, having decided to install Proxmox to run a bunch of virtual machines, and all was well. Except for one thing: the disk.

    You see, the server came with a fancy SAN, with exactly 500 GiB of storage mounted over iSCSI via 10 Gbps Ethernet, backed by a highly reliable ZFS volume (zvol). While this all sounds good on paper, in practice I was barely able to hit over 200 MB/s when doing I/O, even at large block sizes. Nothing I did seemed to help, so I asked the provider to switch it to physical drives.

    Having configured Proxmox just the way I wanted it, I opted against reinstalling it from scratch, instead choosing to clone the disk. The provider suggested using Clonezilla, which should be able to do this sort of disk cloning very quickly. So we found an agreeable time, took the server down, and booted Clonezilla over PXE. All should be good, right?

    As it turns out, this ended up being a super painful experience.

    Editorial note: This story is based on my memory and incomplete console output. While the broad story is correct, the commands provided may not be correct.

    (Read more...)
  • 2023: Year in Review

    Last year, I started writing year-end reviews to look back upon the past year and reflect on what has happened. I thought I might as well continue to do the same this year.

    In January, I decided to finally create that stratum 1 NTP server that I had wanted ever since I heard about people doing it with Raspberry Pis. Instead of using Pis though, I ended up taking the ancient (but superior) approach of using a serial port. Along the way, I ran into various issues, but that tale is told in its own blog post.

    (Read more...)
  • On BGP Route Selection and High Availability via Anycast

    Earlier, we discussed how IP addresses and route authorizations work, before we took a break to talk about how the RIRs issue ASNs. As promised, I’ll now cover BGP route selection, how it enables anycasting, and how we can use it to achieve low latency and high availability. We’ll also cover some of the pitfalls of this approach and how it led to an infamous outage.

    For those not familiar with the concept, anycasting means the same IP address is shared by devices in multiple locations, with routers sending packets to the “nearest” location. This can result in lower latency than the speed of light would allow from any single location. However, as you will see later, the routers’ concept of “nearest” may not necessarily be what we expect.

    Now, if one location stops announcing the IP address via BGP, then routers will select the next best location, enabling high availability as long as there is one location still available. Somewhat morbidly, I’ve claimed that this website will stay up even if Yellowstone erupts, which is theoretically true since my servers in Europe would still be able to serve traffic to the rest of the world even if every server in North America is down, but I’ve not tested this (and hope it will never be tested).
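
    The failover behaviour can be illustrated with a toy model. This is a deliberately simplified sketch: real BGP route selection considers several attributes in order, and here only AS path length is compared. The `Route` class, the `best_route` function, and the site names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Route:
    site: str            # location announcing the anycast prefix
    as_path_len: int     # number of ASes a packet would traverse
    announcing: bool = True


def best_route(routes: List[Route]) -> Optional[Route]:
    """Pick the announced route with the shortest AS path.

    Routes from locations that have withdrawn the prefix are ignored,
    so traffic shifts to the next best location automatically.
    """
    candidates = [r for r in routes if r.announcing]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.as_path_len)


routes = [
    Route("north-america", as_path_len=2),
    Route("europe", as_path_len=4),
]
first_choice = best_route(routes)

# Simulate the North American location going offline.
routes[0].announcing = False
fallback = best_route(routes)
```

    In this model, `first_choice` is the North American site and `fallback` is the European one. Note that "shortest AS path" is not the same as "geographically closest", which is why a router's idea of "nearest" can be surprising.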

    Side note: AS200351 turns one year old today! 🎂

    (Read more...)
  • What I wish I knew when I got my ASN

    As you may know, I am currently writing a series on BGP and how the Internet works, from my perspective as the operator of a small autonomous system, AS200351. While we haven’t really exhausted the theoretical material, I think I’ve covered enough to enable readers to set up their own basic autonomous system. Rather than forcing you to do your own research based on outdated and potentially incorrect information on the Internet, or allowing you to fall victim to scams, I think it would be wise to talk about the process of getting your own ASN.

    For readers who haven’t read the previous parts of the series and are unfamiliar with why one might want an ASN, here’s a brief explanation:

    An autonomous system (AS) is a constituent part of the Internet that can define its own routing to the remainder of the Internet, and ASes exchange routes with each other over Border Gateway Protocol (BGP) to form the Internet itself. By receiving a globally unique identifier, an AS number (ASN), which in my case is 200351, I can exchange routes over BGP with other ASes, announce my own IP addresses to the Internet, and control how traffic flows in and out of my network, as opposed to simply exchanging traffic with a default gateway to reach the Internet using an IP address assigned by my ISP. This comes with several advantages, such as being able to switch upstream ISPs at will (or when such an ISP fails) without changing my IP addresses or breaking a single connection, or being able to advertise the same IP addresses from multiple locations (anycasting), which lets users reach my services with lower latency than the speed of light would permit from a single location, with automatic failover.

    I will now share what I wish I knew when I impulsively decided to apply for an ASN at 3 a.m. on a cold December night last year, now that I’ve been doing this for a while. I’ll walk through the process as objectively and thoroughly as possible, demystifying the role of any player in this space. I would like you to go into this with full knowledge of the risks and a full understanding of where your money is going. In the end, I will offer some subjective suggestions on providers, but those can be ignored if you would rather do your own research.

    (Read more...)