Posts in Devops (20 found)

Self-hosting and surviving the front page of Hacker News

Intro

Last October, my post “Why I Ditched Disqus for My Blog” unexpectedly reached the #1 spot on Hacker News - and stayed on the front page for more than 11 hours. Overnight, a blog I usually run quietly from an old laptop-turned-server at home was under the kind of scrutiny I never prepared for. For context, a couple of years ago I migrated from Ghost to Hugo and started self-hosting in a truly DIY way - a setup that could reliably handle my modest regular traffic.


BlogLog April 10 2026

I added a new page to my blog in the header showing all the specifications of my homelab and self-hosted services. It will be updated as I continue to update my services or infrastructure. Fixed misspellings in the Overview of My Homelab post.

Evan Hahn 6 days ago

In defense of GitHub's poor uptime

In short: GitHub’s downtime is bad, but uptime numbers can be misleading. It’s not as bad as it looks; more like a D than an F.

99.99% uptime, or “four nines”, is a common industry standard. Four nines of uptime is equivalent to 1.008 minutes of downtime per week. GitHub is not meeting that, and it’s frustrating. Even though they’re owned by Microsoft, one of the richest companies on earth, they aren’t clearing this bar. Here are some things people are saying:

- “GitHub appears to be struggling with measly three nines availability”
- “World’s First Enterprise Solution With Zero Nines Uptime”
- “Sure, they may have made the uptime worse, but remember what we got in exchange – when it’s up, the UI is slower and buggier.”

According to “The Missing GitHub Status Page”, which reports historical uptime better than GitHub’s official source, they’ve had 89.43% uptime over the last 90 days. That’s zero nines of uptime. That implies more than 2.5 hours of downtime every day!

I dislike GitHub and Microsoft, so I shouldn’t be coming to their defense, but I think this characterization is unfair. I’m no mathematician, but let’s do a little math. Let’s say your enterprise has two services: Service A and Service B. Over the last 10 days:

- Service A had one day of downtime. That means it has 90% uptime.
- Service B had two days of downtime, on different days. That means it has 80% uptime.

3 of the last 10 days had outages. That’s 70% uptime total. (That’s how the Missing GitHub Status Page calculates it.) GitHub’s status page lists ten services: core Git operations, webhooks, Issues, and more. Sometimes they’re down simultaneously, but usually not. If all ten of those services have 99% uptime and outages don’t overlap, it’d look like GitHub had 90% uptime, because some part of GitHub is out 10% of the time. That aggregate looks much worse than any individual service!

The numbers look better if outages happen at the same time. For example, if Service A and Service B go down on Saturday and Sunday, you’d have 80% uptime overall instead of 70%. Compared to the previous scenario, Service A is down twice as long, but the uptime number looks better. A downstream effect of this calculation is that your uptime numbers look worse if your services are well-isolated. I think it’s good that Service A doesn’t take down Service B! I think it’s good that a GitHub Packages outage doesn’t take down GitHub Issues! But if all you see is one aggregate uptime number, you might miss that.

Things look rosier when you look at features individually. Over the last 90 days, core Git operations have had 98.98% uptime, or about 22 hours where things were broken. That’s still bad, but not as bad as some people are saying. D tier, not F tier. Also, an incident doesn’t mean everything is broken. For example, GitHub recently had an issue where things were slow for users on the west coast of the United States. Not good, but not “everything is broken for all users”. Again, the number doesn’t tell the whole story.

I still think GitHub’s uptime is unacceptably low, especially because they’re owned by Microsoft, but I don’t think we’re being honest when we say that GitHub has “zero nines” of availability. To me, it’s more like: they have a bunch of unstable services which cumulatively have horrible uptime, but individually have not-very-good uptime. There are better reasons to dislike these companies.
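The aggregation effect described in the post is easy to check with a few lines of arithmetic. This is a sketch; the specific calendar days are made up to match the Service A/Service B example:

```python
# Per-service vs. aggregated uptime over a 10-day window.
DAYS = 10

def uptime(down_days):
    """Fraction of the window with no outage for the given set of bad days."""
    return 1 - len(down_days) / DAYS

a_down = {3}        # Service A: one day of downtime      -> 90% uptime
b_down = {5, 8}     # Service B: two days, non-overlapping -> 80% uptime

# The "Missing GitHub Status Page" style aggregate: a day counts as down
# if *any* service was down that day.
aggregate = uptime(a_down | b_down)       # 3 distinct bad days -> 70%

# If instead both services go down on the same weekend (days 6 and 7),
# Service A is down twice as long, yet the aggregate reads 80%, not 70%.
overlapping = uptime({6, 7} | {6, 7})     # 2 distinct bad days -> 80%

# Four nines allows roughly a minute of downtime per week.
minutes_per_week = (1 - 0.9999) * 7 * 24 * 60   # ~1.008

print(uptime(a_down), uptime(b_down), aggregate, overlapping, minutes_per_week)
```

Better isolation (fewer overlapping outages) makes the aggregate number look worse, which is exactly the distortion the post describes.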

Jim Nielsen 6 days ago

Fewer Computers, Fewer Problems: Going Local With Builds & Deployments

Me, in 2025, on Mastodon: I love tools like Netlify and deploying my small personal sites with But i'm not gonna lie, 2025 might be the year I go back to just doing builds locally and pushing the deploys from my computer. I'm sick of devops'ing stupid stuff because builds work on my machine and I have to spend that extra bit of time to ensure they also work on remote linux computers. Not sure I need the infrastructure of giant teams working together for making a small personal website.

It’s 2026 now, but I finally took my first steps towards this. One of the ideas I really love around the “local-first” movement is this notion that everything canonical is done locally, then remote “sync” is an enhancement. For my personal website, I want builds and deployments to work that way. All data, build tooling, deployment, etc., happens first and foremost on my machine. From there, having another server somewhere else do it is purely a “progressive enhancement”. If it were to fail, fine. I can resort back to doing it locally very easily because all the tooling is optimized for local build and deployment first (rather than being dependent on fixing some remote server to get builds and deployments working).

It’s amazing how many of my problems come from the struggle to get one thing to work identically across multiple computers. I want to explore a solution that removes the cause of my problem, rather than trying to stabilize it with more time and code. “The first rule of distributed computing is don’t distribute your computing unless you absolutely have to” — especially if you’re just building personal websites.

So I un-did stuff I previously did (that’s right, my current predicament is self-inflicted — imagine that). My notes site used to work like this:

- Content lives in Dropbox
- Code is on GitHub
- Netlify’s servers pull both, then run a build and deploy the site

It worked, but sporadically. Sometimes it would fail, then start working again, all without me changing anything. And when it did work, it often would take a long time — like five, six minutes to run a build/deployment. I never could figure out the issue. Some combination of Netlify’s servers (which I don’t control and don’t have full visibility into) talking to Dropbox’s servers (which I also don’t control and don’t have full visibility into). I got sick of trying to make a simple (but distributed) build process work across multiple computers when 99% of the time, I really only need it to work on one computer. So I turned off builds in Netlify, and made it so my primary, local computer does all the work. Here are the trade-offs:

- What I lose: I can no longer make edits to notes, then build/deploy the site from my phone or tablet.
- What I gain: I don’t have to troubleshoot build issues on machines I don’t own or control. Now, if it “works on my machine”, it works period.

The change was pretty simple. First, I turned off builds in Netlify. Now when I Netlify does nothing. Next, I changed my build process to stop pulling markdown notes from the Dropbox API and instead pull them from a local folder on my computer. Simple, fast. And lastly, as a measure to protect myself from myself, I cloned the codebase for my notes to a second location on my computer. This way I have a “working copy” version of my site where I do local development, and I have a clean “production copy” of my site which is where I build/deploy from. This helps ensure I don’t accidentally build and deploy my “working copy”, which I often leave in a weird, half-finished state.

In my I have a command that looks like this: That’s what I run from my “clean” copy. It pulls down any new changes, makes sure I have the latest deps, builds the site, then lets Netlify’s CLI deploy it. As extra credit, I created a macOS shortcut so I can do , type “Deploy notes.jim-nielsen.com” to trigger a build, then watch the little shortcut run to completion in my Mac’s menubar.

I’ve been living with this setup for a few weeks now and it has worked beautifully. Best part is: I’ve never had to open up Netlify’s website to check the status of a build or troubleshoot a deployment. That’s an enhancement I can have later — if I want to.
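The post elides its actual deploy command, but the steps it describes (pull changes, refresh dependencies, build, deploy via Netlify's CLI) could be scripted along these lines. This is a sketch: the `npm` and `netlify` invocations are assumptions for illustration, not the author's actual setup.

```python
import subprocess

# Hypothetical local build-and-deploy pipeline, run from the clean
# "production copy" of the site. Command names are assumptions.
STEPS = [
    ["git", "pull"],                  # pull down any new changes
    ["npm", "install"],               # make sure the latest deps are present
    ["npm", "run", "build"],          # build the site locally
    ["netlify", "deploy", "--prod"],  # let Netlify's CLI upload the result
]

def deploy(runner=subprocess.run):
    """Run each step in order, stopping on the first failure."""
    for step in STEPS:
        runner(step, check=True)
```

Injecting `runner` keeps the sequence testable without touching the network; in real use, `deploy()` shells out to each tool in turn.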

Abhinav Sarkar 1 week ago

Running NixOS Micro VMs on macOS

microvm.nix is a framework to run NixOS-based micro VMs on various platforms. In particular, it can use vfkit to run micro VMs on macOS that use the macOS virtualization framework to provide a more performant VM than QEMU. microvm.nix works well but the documentation is a bit lacking. I had to figure out some gotchas while setting this up on my MacBook Pro M4, so I decided to write this note.

This tutorial requires Nix and Nix Darwin to be installed on the macOS machine. To build a micro VM, we need a NixOS builder machine running AArch64 Linux. Thankfully, it is really easy to set one up with Nix Darwin. Assuming we have Nix Darwin set up with a Nix flake like: First, we add the Nix Linux builder config: Now, we switch the system config to build and start the Linux builder: We should verify that the builder is working: It may take up to a minute for the builder to start. Once SSH works, we can proceed.

We create a file with the micro VM configuration: This configures a micro VM with 4 vCPUs, 8 GB RAM and a 40 GB disk. The disk image is used to store the Nix packages downloaded within the VM. It is mounted at . The host’s Nix store is mounted read-only at . The option combines these two with overlays to create the VM’s Nix store at . We can share additional directories from the host and mount them in the VM, as we do here for the directory from the macOS host.

The next couple of lines set up networking in the VM. The vfkit hypervisor supports only NAT networking. This means:

- The VM can make outgoing connections to the host/internet.
- The host cannot initiate connections to the VM.

There are ways to work around this using gvisor-tap-vsock and vmnet-helper, but we are not going into that here. We can uncomment the line if we want a graphical NixOS VM.

Finally, the workaround for the big gotcha! By default Nix does builds in a sandbox, and the sandbox is created (and deleted) on the root filesystem. However, microvm.nix uses a temporary filesystem residing in RAM for the root filesystem. This means that Nix builds may cause the root FS and RAM to fill up, causing out-of-memory or out-of-disk-space errors. To prevent that, we disable the sandbox and set the build directory to be at on the disk image we mounted.

Next, we integrate the VM config with the Nix Darwin flake: Let’s go over the tricky bits. The wrapper script rebinds Ctrl + ] to send the interrupt, suspend and quit signals instead of the usual Ctrl + C, so that we can use Ctrl + C inside the VM without it causing the VM to shut down. We add the script to our system packages. Lastly, the defines the actual micro VM using the file.

Finally, we build and install the micro VM: And now we can run it from any directory: Note that the disk image file will be created in the directory in which we run the above command. After this, we can remove the Linux builder config and switch again to stop and delete it.

Now we have a performant micro VM running NixOS to play around with on our macOS machine. That’s all I had for this note. I hope this helps. If you have any questions or comments, please leave a comment below. If you liked this post, please share it. Thanks for reading!

This post was originally published on abhinavsarkar.net.
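Pieced together from the description above, the VM definition might look roughly like this. This is a hedged sketch based on microvm.nix's documented option names (`vcpu`, `mem`, `volumes`, `shares`); the exact attributes, paths, and sizes should be checked against the microvm.nix manual:

```nix
# Sketch of a NixOS module for a vfkit-backed micro VM: 4 vCPUs, 8 GB RAM,
# a 40 GB persistent volume, and the host's Nix store shared read-only.
{ ... }: {
  microvm = {
    hypervisor = "vfkit";
    vcpu = 4;
    mem = 8192;                       # MB
    volumes = [{
      image = "persist.img";
      mountPoint = "/persist";        # holds packages downloaded in the VM
      size = 40 * 1024;               # MB
    }];
    shares = [{
      proto = "virtiofs";
      tag = "ro-store";
      source = "/nix/store";          # host store, mounted read-only in the VM
      mountPoint = "/nix/.ro-store";
    }];
  };
}
```

The overlay combining the read-only host store with the writable volume is what gives the VM a normal-looking `/nix/store` without duplicating packages.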

Nelson Figueroa 1 week ago

Proxying GoatCounter Requests for a Hugo Blog on CloudFront to bypass Ad Blockers

I’ve been running GoatCounter on my site using the script . The problem is that ad blockers like uBlock Origin block it (understandably). To get around this, I set up proxying so that the GoatCounter requests go to an endpoint under my domain, and from there CloudFront handles it and sends it to GoatCounter. Most ad blockers work based on domain, and GoatCounter is on the blocklists. Since the browser is now sending requests to the same domain as my site, it shouldn’t trigger any ad blockers. This post explains how I did it in case it’s useful for anyone else.

It’s possible to self-host GoatCounter, but my approach was easier to do and less infrastructure to maintain. Perhaps in the future. I know there are concerns around analytics being privacy-invasive. GoatCounter is privacy-respecting. I care about privacy, and I am of the belief that GoatCounter is harmless. I just like to keep track of the visitors on my site. Read the GoatCounter developer’s take if you want another opinion: Analytics on personal websites.

Clicking through the AWS console to configure CloudFront distributions is a pain in the ass. I took the time to finally get the infrastructure for my blog managed as infrastructure-as-code with Pulumi and Python. So while you can click around the console and do all of this, I will be showing how to configure everything with Pulumi. If you don’t want to use IaC, you can still find all of these options/settings in AWS itself.

To set up GoatCounter proxying via CloudFront, we’ll need to:

- Create a new CloudFront function resource
- Add a second origin to the distribution
- Add an ordered cache behavior to the distribution (which references the CloudFront function using its ARN)
- Update the GoatCounter script to point to this new endpoint

CloudFront functions are JavaScript scripts that run before a request reaches a CloudFront distribution’s origin. In this case, the function strips the from . We need to strip it for two reasons:

- I chose to proxy requests that hit the endpoint on my site to make sure there’s no collision with post titles/slugs. I’ll never use the path for posts.
- GoatCounter accepts requests under , not .

Here is the code for the function: And here is the CloudFront function resource defined in Pulumi (using Python) that includes the JavaScript from above. This is a new resource defined in the same Python file where my existing distribution already exists: Here is my existing CloudFront distribution being updated with a new origin and cache behavior in Pulumi code.

At the time of writing, CloudFront only allows to be a list of HTTP methods in specific combinations. The value must be one of these: Since the GoatCounter JavaScript sends a request, and the third option is the only one that includes , we’re forced to use all HTTP verbs. It should be harmless though.

Now that my Pulumi code has both the CloudFront function defined and the CloudFront distribution updated, I ran to apply the changes. Finally, I updated goatcounter.js to use the new endpoint. So instead of I changed it to my own domain at the very top of the snippet: After this, I built my site with Hugo and deployed it on S3/CloudFront by updating the freshly built HTML/CSS/JS in my S3 bucket and then invalidating the existing CloudFront cache.

Now GoatCounter should no longer be blocked by uBlock Origin. I tested by loading my site in an incognito browser window and checked that uBlock Origin was no longer blocking anything on my domain. Everything looks good! If you’re using GoatCounter, you should consider sponsoring the developer. It’s a great project.

References:

- https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/cloudfront-functions.html
- https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/DownloadDistS3AndCustomOrigins.html
- https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/DownloadDistValuesCacheBehavior.html
- https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Invalidation.html
- https://www.goatcounter.com/help/js
- https://www.goatcounter.com/help/backend
- https://www.goatcounter.com/help/countjs-host
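The rewrite such a CloudFront function performs can be sketched like this. It is an illustration only: `/stats` is an assumed path prefix (the post does not show its actual prefix or code), and CloudFront Functions run in an ES5-style JavaScript runtime, hence `var`:

```javascript
// Sketch of a CloudFront function that strips an assumed "/stats" prefix
// from the request URI before CloudFront forwards the request to the
// GoatCounter origin, which expects paths like "/count".
function handler(event) {
    var request = event.request;
    request.uri = request.uri.replace(/^\/stats/, '');
    return request;
}
```

For example, a browser request to `/stats/count` on the blog's domain would reach the GoatCounter origin as `/count`.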

./techtipsy 1 week ago

You can fake SSD-like disk speeds in any Linux VM, but it's unsafe (literally)

Do you have a need for speed really fast disk performance? Are you unwilling or unable to buy/rent a machine with fast, NVMe-based storage? Are you OK with potential data corruption? Then is the solution to all your problems!

We had an interesting conundrum at work recently. Our platform does not use a lot of resources, but there are bursts of activity that require a lot of CPU and performant disk IO from our infrastructure. This was previously handled by manually starting some expensive AWS EC2 instances to cope with the load, but this manual process was error-prone due to the human factor (which did end up causing an actual production outage once), and AWS is stupidly expensive for what you get in return. Around this time I also learned about a Proxmox server that we were underutilizing. My goal was to investigate the resources that we had available and to ensure that we didn’t have to think about taking any manual actions, while at the same time not relying on AWS and its expensive resources.

I set up a few VMs on the Proxmox machine and did some testing. CPU, memory, that was all fine, but the IO-bound workloads that we had to run during those bursty periods would still be relatively slow. Not much slower than the main infrastructure provider that we were using, but slow enough for a beefy machine to not be able to handle more than a few parallel IO-heavy workloads running at the same time. We exhausted a few other wild-ass ideas during the investigation:

- Docker on a RAM-backed storage drive: online resources did not inspire confidence in this working well, so we didn’t try it.
- Optimizing the workload to not be IO-heavy: unsuccessful after spending a few hours on it; the high IO was a consequence of making an intentional trade-off to reduce CPU load, and the IO requirement was much more manageable.
- Putting certain folders in the container itself on RAM-backed storage: highly container-specific, and it did not yield the desired results.

Then one day I was browsing around Proxmox and noticed an interesting option on the virtual storage drives: setting the cache mode to . With this one trick, your VM will see really fast disk speeds up to a certain point, and it’s invisible from the perspective of your workloads, no customization needed. In a way, this is like one of the RAM-backed storage options, but for the whole VM. The major trade-off is that an unexpected shutdown of the VM or the VM host will likely result in data corruption. This is because you’re writing everything to memory first, and the writes only eventually end up on persistent storage, whenever the disks catch up with you. If something happens while the changes are only in memory, they are lost.

In our case, the data corruption risk is completely OK, as the workloads are ephemeral, the results of the work are sent to another machine immediately after completion, and the configuration of the machine is largely automated with Ansible. One instance of our workload would usually result in writing 50 MB to disk, and we observed about 300-500 IOPS of performance from HDD-backed storage. The disks were not able to handle more than one workload at a time if we cared about execution time. With the trick, and on some relatively old hardware (assume DDR3 memory), we saw numbers as high as 15K IOPS and disk throughput of 500+ MB/s. This was more than enough to handle peak loads, and the resources were always on and available on a rented server with a stable price that compared extremely well to AWS.

Cloud service providers have their benefits, sure, but when all you need is raw speed and configurability to make it happen, then owning a physical Linux server (or a few of them for redundancy) is a no-brainer, slam-dunk decision, as long as you have someone on your team who knows how to manage one. Since you’re already working with Linux VMs in the cloud, you already have that person on your team, don’t you? :)
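The speed difference has a simple illustration: buffered writes land in the page cache (RAM) and return immediately, while forcing each write to stable storage with fsync is roughly the cost a safe cache mode imposes on every flush. A sketch (numbers will vary wildly by hardware and filesystem; on tmpfs the gap nearly disappears):

```python
import os
import tempfile
import time

def write_chunks(path, count=200, chunk=b"x" * 4096, sync=False):
    """Write `count` chunks to `path`; with sync=True, fsync after every
    write, forcing data to the device instead of leaving it in RAM."""
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(count):
            f.write(chunk)
            f.flush()
            if sync:
                os.fsync(f.fileno())
    return time.perf_counter() - start

with tempfile.TemporaryDirectory() as tmp:
    buffered = write_chunks(os.path.join(tmp, "buffered"))
    synced = write_chunks(os.path.join(tmp, "synced"), sync=True)
    print(f"buffered: {buffered:.4f}s  fsync-per-write: {synced:.4f}s")
```

The unsafe cache mode effectively gives every write in the VM the buffered path's latency, and the power-loss risk that comes with it.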


Stamp It! All Programs Must Report Their Version

Recently, during a production incident response, I guessed the root cause of an outage correctly within less than an hour (cool!) and submitted a fix just to rule it out, only to then spend many hours fumbling in the dark because we lacked visibility into version numbers and rollouts… 😞 This experience made me think about software versioning again, or more specifically about build info (build versioning, version stamping, however you want to call it) and version reporting. I realized that for the i3 window manager, I had solved this problem well over a decade ago, so it was really unexpected that the problem was decidedly not solved at work. In this article, I’ll explain how 3 simple steps (Stamp it! Plumb it! Report it!) are sufficient to save you hours of delays and stress during incident response. Every household appliance has incredibly detailed versioning! Consider this dishwasher: (Thank you Feuermurmel for sending me this lovely example!) I observed a couple household appliance repairs and am under the impression that if a repair person cannot identify the appliance, they would most likely refuse to even touch it. So why are our standards so low in computers, in comparison? Sure, consumer products are typically versioned somehow and that’s typically good enough (except for, say, USB 3.2 Gen 1×2!). But recently, I have encountered too many developer builds that were not adequately versioned! Unlike a physical household appliance with a stamped metal plate, software is constantly updated and runs in places and structures we often cannot even see. Let’s dig into what we need to increase our versioning standard! Usually, software has a name and some version number of varying granularity: All of these identify the Chrome browser on my computer, but each at different granularity. All are correct and useful, depending on the context. 
Here’s an example for each: After creating the i3 window manager , I quickly learned that for user support, it is very valuable for programs to clearly identify themselves. Let me illustrate with the following case study. When running , you will see output like this: Each word was carefully deliberated and placed. Let me dissect: When doing user support, there are a couple of questions that are conceptually easy to ask the affected user and produce very valuable answers for the developer: Based on my experiences with asking these questions many times, I noticed a few patterns in how these debugging sessions went. In response, I introduced another way for i3 to report its version in i3 v4.3 (released in September 2012): a flag! Now I could ask users a small variation of the first question: What is the output of ? Note how this also transfers well over spoken word, for example at a computer meetup: Michael: Which version are you using? User: How can I check? Michael: Run this command: User: It says 4.24. Michael: Good, that is recent enough to include the bug fix. Now, we need more version info! Run please and tell me what you see. When you run , it does not just report the version of the i3 program you called, it also connects to the running i3 window manager process in your X11 session using its IPC (interprocess communication) interface and reports the running i3 process’s version, alongside other key details that are helpful to show the user, like which configuration file is loaded and when it was last changed: This might look like a lot of detail on first glance, but let me spell out why this output is such a valuable debugging tool: Connecting to i3 via the IPC interface is an interesting test in and of itself. If a user sees output, that implies they will also be able to run debugging commands like (for example) to capture the full layout state. 
During a debugging session, running is an easy check to see if the version you just built is actually effective (see the line). Showing the full path to the loaded config file will make it obvious if the user has been editing the wrong file. If the path alone is not sufficient, the modification time (displayed both absolute and relative) will flag editing the wrong file. I use NixOS, BTW, so I automatically get a stable identifier ( ) for the specific build of i3. To see the build recipe (“derivation” in Nix terminology) which produced this Nix store output ( ), I can run : Unfortunately, I am not aware of a way to go from the derivation to the source, but at least one can check that a certain source results in an identical derivation. The versioning I have described so far is sufficient for most users, who will not be interested in tracking intermediate versions of software, but only the released versions. But what about developers, or any kind of user who needs more precision? When building i3 from git, it reports the git revision it was built from, using : A modified working copy gets represented by a after the revision: Reporting the git revision (or VCS revision, generally speaking) is the most useful choice. This way, we catch the following common mistakes: As we have seen above, the single most useful piece of version information is the VCS revision. We can fetch all other details (version numbers, dates, authors, …) from the VCS repository. Now, let’s demonstrate the best case scenario by looking at how Go does it! Go has become my favorite programming language over the years, in big part because of the good taste and style of the Go developers, and of course also because of the high-quality tooling: I strive to respect everybody’s personal preferences, so I usually steer clear of debates about which is the best programming language, text editor or operating system. 
However, recently I was asked a couple of times why I like and use a lot of Go, so here is a coherent article to fill in the blanks of my ad-hoc in-person ramblings :-). Read more → Therefore, I am pleased to say that Go implements the gold standard with regard to software versioning: it stamps VCS buildinfo by default! 🥳 This was introduced in Go 1.18 (March 2022) : Additionally, the go command embeds information about the build, including build and tool tags (set with -tags), compiler, assembler, and linker flags (like -gcflags), whether cgo was enabled, and if it was, the values of the cgo environment variables (like CGO_CFLAGS). Both VCS and build information may be read together with module information using or runtime/debug.ReadBuildInfo (for the currently running binary) or the new debug/buildinfo package. Note: Before Go 1.18, the standard approach was to use or similar explicit injection. This setup works (and can still be seen in many places) but requires making changes to the application code, whereas the Go 1.18+ stamping requires no extra steps. What does this mean in practice? Here is a diagram for the common case: building from git: This covers most of my hobby projects! Many tools I just , or if I want to easily copy them around to other computers. Although, I am managing more and more of my software in NixOS. When I find a program that is not yet fully managed, I can use and the tool to identify it: It’s very cool that Go does the right thing by default! Systems that consist of 100% Go software (like my gokrazy Go appliance platform ) are fully stamped! For example, the gokrazy web interface shows me exactly which version and dependencies went into the build on my scan2drive appliance . 
Despite being fully stamped, note that gokrazy only shows the module versions, and no VCS buildinfo, because it currently suffers from the same gap as Nix: For the gokrazy packer, which follows a rolling release model (no version numbers), I ended up with a few lines of Go code (see below) to display a git revision, no matter if you installed the packer from a Go module or from a git working copy. The code either displays (the easy case; built from git) or extracts the revision from the Go module version of the main module ( ): What are the other cases? These examples illustrate the scenarios I usually deal with: This is what it looks like in practice: But a version built from git has the full revision available (→ you can tell them apart): When packaging Go software with Nix, it’s easy to lose Go VCS revision stamping: So the fundamental tension here is between reproducibility and VCS stamping. Luckily, there is a solution that works for both: I created the Nix overlay module that you can import to get working Go VCS revision stamping by default for your Nix expressions! Tip: If you are not a Nix user, feel free to skip over this section. I included it in this article so that you have a full example of making VCS stamping work in the most complicated environments. Packaging Go software in Nix is pleasantly straightforward. For example, the Go Protobuf generator plugin is packaged in Nix with <30 lines: official nixpkgs package.nix . You call , supply as the result from and add a few lines of metadata. But getting developer builds fully stamped is not straightforward at all! When packaging my own software, I want to package individual revisions (developer builds), not just released versions. I use the same , or if I need the latest Go version. Instead of using , I provide my sources using Flakes, usually also from GitHub or from another Git repository. 
For example, I package like so: The comes from my : Go stamps all builds, but it does not have much to stamp here: Here’s a full example of gokrazy/bull: To fix VCS stamping, add my overlay to your : (If you are using , like I am, you need to apply the overlay in both places.) After rebuilding, your Go binaries should newly be stamped with buildinfo: Nice! 🥳 But… how does it work? When does it apply? How do you know how to fix your config? I’ll show you the full diagram first, and then explain how to read it: There are 3 relevant parts of the Nix stack that you can end up in, depending on what you write into your files: For the purpose of VCS revision stamping, you should: Hence, we will stick to the left-most column: fetchers. Unfortunately, by default, with fetchers, the VCS revision information, which is stored in a Nix attrset (in-memory, during the build process), does not make it into the Nix store, hence, when the Nix derivation is evaluated and Go compiles the source code, Go does not see any VCS revision. My Nix overlay module fixes this, and enabling the overlay is how you end up in the left-most lane of the above diagram: the happy path, where your Go binaries are now stamped! How does the overlay work? It functions as an adapter between Nix and Go: So the overlay implements 3 steps to get Go to stamp the correct info: For the full source, see . See Go issue #77020 and Go issue #64162 for a cleaner approach to fixing this gap: allowing package managers to invoke the Go tool with the correct VCS information injected. This would allow Nix (or also gokrazy) to pass along buildinfo cleanly, without the need for workarounds like my adapter . At the time of writing, issue #77020 does not seem to have much traction and is still open. My argument is simple: Stamping the VCS revision is conceptually easy, but very important! 
For example, if the production system from the incident I mentioned had reported its version, we would have saved multiple hours of mitigation time! Unfortunately, many environments only identify the build output (useful, but orthogonal), but do not plumb the VCS revision (much more useful!), or at least not by default. Your action plan to fix it is just 3 simple steps: Implementing “version observability” throughout your system is a one-day high-ROI project. With my Nix example, you saw how the VCS revision is available throughout the stack, but can get lost in the middle. Hopefully my resources help you quickly fix your stack(s), too: Now go stamp your programs and data transfers! 🚀 Chrome 146.0.7680.80 Chrome f08938029c887ea624da7a1717059788ed95034d-refs/branch-heads/7680_65@{#34} “This works in Chrome for me, did you test in Firefox?” “Chrome 146 contains broken middle-click-to-paste-and-navigate” “I run Chrome 146.0.7680.80 and cannot reproduce your issue” “Apply this patch on top of Chrome f08938029c887ea624da7a1717059788ed95034d-refs/branch-heads/7680_65@{#34} and follow these steps to reproduce: […]” : I could have shortened this to or maybe , but I figured it would be helpful to be explicit because is such a short name. Users might mumble aloud “What’s an i-3-4-2-4?”, but when putting “version” in there, the implication is that i3 is some computer thing (→ a computer program) that exists in version 4.24. is the release date so that you can immediately tell if “ ” is recent. signals when the project was started and who is the main person behind it. gives credit to the many people who helped. i3 was never a one-person project; it was always a group effort. Question: “Which version of i3 are you using?” Since i3 is not a typical program that runs in a window (but a window manager / desktop environment), there is no Help → About menu option. Instead, we started asking: What is the output of ? Question: “ Are you reporting a new issue or a preexisting issue? 
To confirm, can you try going back to the version of i3 you used previously?”. The technical terms for “going back” are downgrade, rollback or revert. Depending on the Linux distribution, this is either trivial or a nightmare. With NixOS, it’s trivial: you just boot into an older system “generation” by selecting that version in the bootloader. Or you revert in git, if your configs are version-controlled. With imperative Linux distributions like Debian or Arch Linux, if you did not take a file system-level snapshot, there is no easy and reliable way to go back after upgrading your system. If you are lucky, you can just the older version of i3. But you might run into dependency conflicts (“version hell”). I know that it is possible to run older versions of Debian using snapshot.debian.org , but it is just not very practical, at least when I last tried. Can you check if the issue is still present in the latest i3 development version? Of course, I could also try reproducing the user issue with the latest release version, and then one additional time on the latest development version. But this way, the verification step moves to the affected user, which is good because it filters for highly-motivated bug reporters (higher chance the bug report actually results in a fix!) and it makes the user reproduce the bug twice , figuring out whether it’s a flaky, hard-to-reproduce issue, whether the reproduction instructions are correct, etc. A natural follow-up question: “Does this code change make the issue go away?” This is easy to test for the affected user who now has a development environment. Connecting to i3 via the IPC interface is an interesting test in and of itself. If a user sees output, that implies they will also be able to run debugging commands like (for example) to capture the full layout state. During a debugging session, running is an easy check to see if the version you just built is actually effective (see the line).
Note that this is the same check that is relevant during production incidents: verifying that the version actually running matches the version that is supposed to be running. Showing the full path to the loaded config file will make it obvious if the user has been editing the wrong file. If the path alone is not sufficient, the modification time (displayed both absolute and relative) will flag editing the wrong file. People build from the wrong revision. People build, but forget to install. People install, but their session does not pick it up (wrong location?).

Nix fetchers like are implemented by fetching an archive ( ) file from GitHub — the full repository is not transferred, which is more efficient. Even if a repository is present, Nix usually intentionally removes it for reproducibility: directories contain packed objects that change across runs (for example), which would break reproducible builds (different hash for the same source). We build from a directory, not a Go module, so the module version is . The stamped buildinfo does not contain any information.

Fetchers. These are what Flakes use, but also non-Flake use-cases. Fixed-output derivations (FOD). This is how is implemented, but the constant hash churn (updating the line) inherent to FODs is annoying. Copiers. These just copy files into the Nix store and are not git-aware. Avoid the Copiers! If you use Flakes: ❌ do not use as a Flake input ✅ use instead for git awareness

I avoid the fixed-output derivation (FOD) as well. Fetching the git repository at build time is slow and inefficient. Enabling , which is needed for VCS revision stamping with this approach, is even more inefficient because a new Git repository must be constructed deterministically to keep the FOD reproducible.

Nix tracks the VCS revision in the in-memory attrset. Go expects to find the VCS revision in a repository, accessed via file access and commands. It synthesizes a file so that Go’s detects a git repository.
It injects a command into the that implements exactly the two commands used by Go and fails loudly on anything else (in case Go updates its implementation). It sets in the environment variable. Stamp it! Include the source VCS revision in your programs. This is not a new idea: i3 builds include their revision since 2012! Plumb it! When building / packaging, ensure the VCS revision does not get lost. My “VCS rev with NixOS” case study section above illustrates several reasons why the VCS rev could get lost, which paths can work and how to fix the missing plumbing. Report it! Make your software print its VCS revision on every relevant surface, for example: Executable programs: Report the VCS revision when run with For Go programs, you can always use Services and batch jobs: Include the VCS revision in the startup logs. Outgoing HTTP requests: Include the VCS revision in the HTTP responses: Include the VCS revision in a header (internally) Remote Procedure Calls (RPCs): Include the revision in RPC metadata User Interfaces: Expose the revision somewhere visible for debugging. My overlay for Nix / NixOS My repository is a community resource to collect examples (as markdown content) and includes a Go module with a few helpers to make version reporting trivial.

0 views
./techtipsy 1 week ago

The most unstable computer in my fleet is now the most critical one

Remember that failed experiment where I ran Jellyfin off of a LattePanda V1? Do you recall all the parts where I said what this single board computer cannot do? Yeah, I remember. Then I took it and put two of my most critical services on it: the blog you’re reading right now, and my Wireguard setup. Trust me, it makes more sense with some context. The board is incapable of doing anything other than serving content from the eMMC module, and it has a functioning network port. It doesn’t seem to crash in these scenarios. When I try anything else with this board, especially things that involve USB connectivity, things break. This makes the board ideal for a light workload that needs to be up 24/7. The biggest threat to my uptime is not internet connectivity or loss of power (although that did happen for the first time in a year recently), it’s me getting new ideas to try out on my setup, which results in downtime. This board is so unreliable for trying those ideas out that it removes any and all temptation to do that, resulting in a computer that has the highest chance of actually being up and running for a very long time. To play things safe, I used an IKEA SJÖSS 20W USB-C power adapter that I got for 3 EUR, with a cheap USB-C to USB-A adapter thrown into the mix. It looks janky, but the adapter outputs 5V 3A, which makes it the beefiest power adapter that I have in my fleet for plain USB-A powered devices. I then hit the board with some commands, including hitting the 2 GB of memory. It ran really well for days, no issues at all. I also improved the cooling situation. I am now a proud owner of an assortment of M2, M2.5 and M3 screws and bits, and equipped with a Makita cordless drill, I made some mounting holes in an old aluminium server heat sink. The drilling was a complete hack job, everything was misaligned, but it was good enough. Certainly better than holding the board and heat sink together with thin velcro strips.
The cooling performance is completely adequate: the board hits a maximum of 65°C with the heat sink facing down, which is well below the point at which the board starts to throttle its CPU. The theoretical maximum Wireguard throughput on this board is about 340 Mbps, measured using the fantastic wg-bench solution. Remember the part about the USB ports being flaky? Yeah. That didn’t stop me from getting a USB Gigabit Ethernet adapter to remove one of the main limitations of the LattePanda V1. Based off of vibe-recommendations by Claude, I got a TP-Link UE300 for its alleged low power usage and its availability at a local computer store in Estonia. It seems to work well enough: you can push gigabit speeds through it measured by , and the actual Wireguard performance that I could push through it with an actual workload was about 420 Mbit/s, higher than indicated by the benchmark, and plenty fast for most workloads, especially in external networks that are usually slower than that. A few hours after making that change, an HN post put some mild load on the LattePanda V1 - what good timing. As of publishing this post, the blog has been running mostly off of the LattePanda V1 for over a month now, the one gap being when I was contemplating getting that USB Ethernet adapter and temporarily ran the blog and Wireguard off of another mini PC. Did you notice?

0 views
iDiallo 1 week ago

Zipbombs are not as effective as they used to be

Last year, I wrote about my server setup and how I use zipbombs to mitigate attacks from rogue bots. It was an effective method that helped my blog survive for 10 years. I usually hesitate to write these types of articles, especially since it means revealing the inner workings of my own servers. This blog runs on a basic DigitalOcean droplet, a modest setup that can handle the usual traffic spikes without breaking a sweat. But lately, things have started to change. My zipbomb strategy doesn't seem to be as effective as it used to be. TLDR; What I learned... and won't tell you. Here is the code I shared last year : I deliberately didn't reveal what a function like does in the background. But that wasn't really the secret sauce bots needed to know to avoid my trap. In fact, I mentioned it casually: One more thing, a zip bomb is not foolproof. It can be easily detected and circumvented. You could partially read the content after all. But for unsophisticated bots that are blindly crawling the web disrupting servers, this is a good enough tool for protecting your server. One way to test whether my zipbomb was working was to place an abusive IP address in my blacklist and serve it a bomb. Those bots would typically access hundreds of URLs per second. But the moment they hit my trap, all requests from that IP would cease immediately. They don't wave a white flag or signal that they'll stop the abuse. They simply disappear on my end, and I imagine they crash on theirs. For a lean server like mine, serving 10 MB per request at a rate of a couple per second is manageable. But serving 10 MB per request at a rate of hundreds per second takes a serious toll. Serving large static files had already been a pain through Apache2, which is why I moved static files to a separate nginx server to reduce the load . Now, bots that ingest my bombs, detect them, and continue requesting without ever crashing, have turned my defense into a double-edged sword.
Whenever there's an attack, my server becomes unresponsive, requests are dropped, and my monthly bandwidth gets eaten up. Worst of all, I'm left with a database full of spam. Thousands of fake emails in my newsletter and an overwhelmed comment section. After combing through the logs, I found a pattern and fixed the issue. AI-driven bots, or simply bots that do more than scrape or spam, are far more sophisticated than their dumber counterparts. When a request fails, they keep trying. And in doing so, I serve multiple zipbombs, and end up effectively DDoS-ing my own server. Looking at my web server settings: I run 2 instances of Apache, each with a minimum of 25 workers and a maximum of 75. Each worker consumes around 2 MB for a regular request, so I can technically handle 150 concurrent requests before the next one is queued. That's 300 MB of memory on my 1 GB RAM server, which should be plenty. The problem is that Apache is not efficient at serving large files, especially when they pass through a PHP instance. Instead of consuming just 2 MB per worker, serving a 10 MB zipbomb pushes usage to around 1.5 GB of RAM to handle those requests. In the worst case, this sends the server into a panic and triggers an automatic restart. Meaning that during a bot swarm, my server becomes completely unresponsive. And yet, here I am complaining, while you're reading this without experiencing any hiccups. So what did I do? For one, I turned off the zipbomb defense entirely. As for spam, I've found another way to deal with it. I still get the occasional hit when individuals try to game my system manually, but for my broader defense mechanism, I'm keeping my mouth shut. I've learned my lesson. I've spent countless evenings reading through spam and bot patterns to arrive at a solution. I wish I could share it, but I don't want to go back to the drawing board. Until the world collectively arrives at a reliable way to handle LLM-driven bots, my secret stays with me.
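The author keeps his PHP implementation private, but the general mechanism he describes (a ~10 MB compressed response that inflates to something enormous) is public knowledge. Here is a hedged sketch of the idea in Go, not the author's code: runs of zeros compress at roughly 1000:1 with gzip, so the bomb is compressed once at startup and served as an ordinary gzip-encoded response.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"net/http"
)

// makeBomb gzip-compresses (at least) n bytes of zeros once at startup.
// Zeros compress at roughly 1000:1, so a multi-gigabyte payload fits in
// a few megabytes on the wire. Writes happen in whole 1 MiB chunks, so
// the payload is rounded up to a chunk boundary.
func makeBomb(n int) []byte {
	var buf bytes.Buffer
	zw, _ := gzip.NewWriterLevel(&buf, gzip.BestCompression)
	zeros := make([]byte, 1<<20)
	for written := 0; written < n; written += len(zeros) {
		zw.Write(zeros)
	}
	zw.Close()
	return buf.Bytes()
}

func main() {
	bomb := makeBomb(100 << 20) // 100 MiB of zeros, ~100 KiB compressed
	http.HandleFunc("/trap", func(w http.ResponseWriter, r *http.Request) {
		// Claim the body is ordinary gzip-encoded HTML; a naive bot
		// inflates the whole thing into memory and falls over.
		w.Header().Set("Content-Encoding", "gzip")
		w.Header().Set("Content-Type", "text/html")
		w.Write(bomb)
	})
	fmt.Println("compressed bomb size:", len(bomb))
	// http.ListenAndServe(":8080", nil) // not started in this sketch
}
```

Note the trade-off the post describes: the server only pays for the compressed bytes, but against clients that detect the bomb and keep retrying, even those bytes add up.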

0 views
Kaushik Gopal 1 week ago

We are becoming Harness Engineers

The role of a software engineer is shifting. Not toward writing more code but toward building the environment that makes agents reliable. Think about what you actually do with Claude Code or Codex today: you configure AGENTS.md files, set up MCP servers, write skills and hooks, build feedback loops and tune sub-agents. You’re not writing as much of the software anymore. You’re engineering the harness around the thing that writes the software. Mitchell Hashimoto first coined the term harness engineering — the work of shaping the environment around an agent so it can act reliably. What the model sees, what tools it has, how it gets feedback, when humans step in. We keep hearing that agents will replace engineers. That shouldn’t be the focus of the change we’re seeing. What’s actually happening is product people shipping features directly. A well-harnessed agent lets someone with product instinct but little engineering background make meaningful changes — safely . The harness engineer makes that possible. Guardrails, design choices, blast radius controls, feedback loops. The scaffolding that turns “just prompt it” into something a team can trust. I say this from first-hand experience. If you want to go deeper, listen to the episode where my cohost and I dug into it. We landed on five pillars: agent legibility, closed feedback loops, persistent memory, entropy control, and blast radius controls. Honestly one of the most important episodes we’ve recorded.

0 views
Giles's blog 1 week ago

Automating starting Lambda Labs instances

I've been trying to get an 8x A100 instance on Lambda Labs to do a training run for my LLM from scratch series , but they're really busy at the moment, and it's rare to see anything. Thanks to the wonders of agentic coding, I spent an hour today getting something up and running to help, which I've called lambda-manager . It has three commands: one that prints which kinds of instances are available; one that prints out all of the possible instance types (available or not) with both their "friendly" names -- what you'd see on the website -- and the instance type names that the API uses; and one that polls the API until it sees a specified type of instance, at which point it starts one and sends a Telegram message. Let's see if that helps -- though it's been running for six hours now, with no luck...

0 views
iDiallo 2 weeks ago

13th Year of Blogging

Of all the days to start a blog, I chose April Fools' Day. It wasn't intentional, maybe more of a reflection of my mindset. When I decide to do something, I shut off my brain and just do it. This was a commitment I made without thinking about the long-term effects. I knew writing was hard, but I didn't know how hard. I knew that maintaining a server was hard, but I didn't know the stress it would cause. Especially that first time I went viral. Seeing traffic pour in, reading back the article, and realizing it was littered with errors. I was scrambling to fix those errors while users hammered my server. I tried restarting it to relieve the load and update the content, but to no avail. It was a stressful experience. One I wouldn't trade for anything in the world. 13 years later, it feels like the longest debugging session I've ever run. Random people message me pointing out bugs. Some of it is complete nonsense. But others... well, I actually sent payment to a user who sent me a proof of concept showing how to compromise the entire server. I thought he'd done some serious hacking, but when I responded, he pointed me to one of my own articles where I had accidentally revealed a vulnerability in my framework. The amount you learn from running your own blog can't be replicated by any other means. Unlike other side projects that come and go, the blog has to remain. Part of its value is its longevity. No matter what, I need to make sure it stays online. In the age of AI, it feels like anyone can spin up a blog and fill it with LLM-generated content to rival any established one. But there's something no LLM can replicate: longevity. No matter what technology we come up with, no tool can create a 50-year-old oak tree. The only way to have one is to plant a seed and give it the time it needs to grow. Your very first blog post may not be entirely relevant years later, but it's that seed. Over time, you develop a voice, a process, a personality. 
Even when your blog has an audience of one, it becomes a reflection of every hurdle you cleared. For me, it's the friction in my career, the lessons I learned, the friends I made along the way. And luckily, it's also the audience that keeps me honest and stops me from spewing nonsense. Nothing brings a barrage of emails faster than being wrong. Maybe that's why I subconsciously published it on April Fools' Day. Maybe that's the joke. I'm going to keep adding rings to my tree, audience or no audience, I'm building longevity. Thank you for being part of this journey. Extra : Some articles I wrote on April Fools day. So you've been blogging for 2 years Quietly waiting for Overnight Success Happy 5th Anniversary Count the number of words with MySQL How to self-publish a book in 7 years The Art of Absurd Commitment Happy 12th Birthday Blog What is Copilot exactly?

0 views
Taranis 2 weeks ago

Go has some tricks up its logging sleeve

Since it's more or less TDOV (IYKYK...), I'm going to talk about logging instead. Logging isn't exactly the most shiny or in-your-face thing that coders tend to think about, but it really can make or break large systems. Throwing in a few print statements (or fmt.Printf, or whatever) only scratches the surface. I'm mostly talking about my own logging library here. If there's interest, I'd consider releasing it as open source, but it's currently a bit of a moving target. Feel free to comment if you think you'd find it useful, and I'll try to find the time to split it out from the Euravox codebase and put it on GitHub. The Go programming language ships with logging capabilities in the standard library, found in the log package. If you don't have any better alternatives, using that package rather than raw fmt.Printf is far preferable. My own logging package is a bit nicer. It's not my first – one of my first jobs working in financial markets data systems back in the 90s was the logging subsystem for the Reuter Workstation, and there is some influence from that 30-odd years later in my library.

One of the first things I always recommend is breaking out log messages by log level. I currently define the following:

SPM -- Spam messages. Very verbose logging, not something you'd normally use, but the kind of thing that makes all the difference doing detailed debugging.

INF -- Information messages. These are intended to be low volume, used to help trace what systems are doing, but not actually representing an error (i.e., they are explicitly used to log normal behaviour).

WRN -- Warning messages. What it says on the tin. Something is possibly wonky, but not bad enough to be an actual error. Real production systems should have as close to zero of these as possible -- something should either be normal (INF) or an actual error (ERR).

ERR -- Error messages. These represent recoverable errors. Something bad happened, but the code can keep running without risk.

FTL -- Fatal errors. These errors show that something very bad has happened, and that the code must abort immediately. There are two cases where this is appropriate. One is when something catastrophic has happened -- the system has run out of handles, the process is OOMing, etc. The second is where a serious logic bug has been detected. Though in some cases ERR can be OK for this, aborting makes it easier to spot that processes in production are badly broken (e.g., after a bad push), and need to be rolled back.

It's possible to set a configuration parameter that limits logging at a particular level. This makes it possible to crank logging all the way up for tests, but dial it down for production without changing the code or having to introduce if/then guards around the logging. It was a finding back in the 90s that systems would sometimes break when you took the logging out – this isn't something that's normally a problem with Go, because idiomatic code doesn't tend to have too many side-effects, but it was quite noticeable with C++. Of course, the library doesn't do the string formatting if the level is disabled, but any parameters are still evaluated, which tends to be a less risky approach.

It's common to send log messages to stdout or stderr. There's nothing fundamentally wrong with this, but I find it useful to have deeper capabilities than this. My own library has three options, which can be used together (and with different log levels):

stdout. Nothing special here, but I do have the option to send colour control codes for terminals that support it, which makes logs much more readable.

Files. This is similar to piping the process through the tee command, but has the advantage that things like log rotation can be built in. I need to get around to supporting log rotation, but file output works now.

Circular buffer. This is the one you don't see often. The idea here is you maintain an in-RAM circular buffer of N lines (say about 5000), which can be exposed via code. I use this to provide an HTTP/HTML interface that makes it possible to watch log output on a process via a web browser. This is a godsend when you have a large number of processes running across multiple VMs and/or physical machines.

Any good logging solution should be able to include file name and line number information in log output. Using an IDE like vscode, this allows control/command-clicking a log entry and immediately seeing the code that generated it. C and C++ support this via some fancy #define stunts. Go lacks this kind of preprocessor, but actually has something far better: the runtime.Caller() library function. This makes it possible to pull back the file name and line number (and program counter if you care) anywhere up the call stack. This code fragment comes from my logging function. The argument to Caller is typically 2, because this code is called from one of many convenience functions for syntactic sugar. Typical log commands look something like this: The logging library will automatically pick up the file paths and line numbers where the log commands are located. However, this isn't always useful, and sometimes can be a complete nightmare. Here's a small example: In this case, the file name and line number that will be logged will be where the command is located. This can be absolutely maddening if has many call sites, because they will look exactly the same in the log. My logging library has a small tweak that I've not seen elsewhere – I'm not claiming invention or ownership, because it's so obviously useful that I'd be shocked if nobody else has ever done it. It's just I've not personally seen it. Anyway, here goes: In this case, works similarly to , but it takes an extra parameter at the start, which represents how many extra stack frames to look through to find the filename and line number. The parameter returns the filename and line number of the immediate caller, so the thing that makes its way into the log is the location of the calls, not the logging calls themselves.

This might seem to be a subtle difference, but the practical consequences are huge – get this right, and logs become useful traces of activity that make it possible to look backwards in time to see when particular data items have been acted upon, and exactly by what code. Almost as good as single-stepping with a debugger, but can be done after the fact.

Anyway, in conclusion, trans women are women, trans men are men, nonbinary and all other variant identities are valid. And fuck fascism.
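The runtime.Caller skip trick described in the post can be sketched like this. These names (`Logf`, `LogfUp`, `callerInfo`) are my own illustrations, not the author's library, but the frame arithmetic is the general pattern: skip 2 frames for a direct convenience function, and 2 plus the caller-supplied extra for the variant that attributes log lines to a helper's call sites.

```go
package main

import (
	"fmt"
	"path/filepath"
	"runtime"
)

// callerInfo returns "file.go:123" for the stack frame `skip` levels
// above runtime.Caller: skip=1 is callerInfo's caller, skip=2 is the
// caller's caller, and so on.
func callerInfo(skip int) string {
	_, file, line, ok := runtime.Caller(skip)
	if !ok {
		return "???:0"
	}
	return fmt.Sprintf("%s:%d", filepath.Base(file), line)
}

// Logf logs with the location of whoever called Logf.
func Logf(format string, args ...any) {
	fmt.Printf("%s INF %s\n", callerInfo(2), fmt.Sprintf(format, args...))
}

// LogfUp is the tweak from the post: extra counts additional stack
// frames to skip, so shared helpers can log their callers' locations.
func LogfUp(extra int, format string, args ...any) {
	fmt.Printf("%s INF %s\n", callerInfo(2+extra), fmt.Sprintf(format, args...))
}

// helper has many call sites; with LogfUp(1, ...) the log line points
// at whoever called helper, not at the line inside helper itself.
func helper(item string) {
	LogfUp(1, "processed %s", item)
}

func main() {
	Logf("starting up") // logged location: this line in main
	helper("job-1")     // logged location: this line, not inside helper
}
```

Run both calls and compare the printed file:line prefixes; without the extra skip, every `helper` call site would be attributed to the same line inside `helper`.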
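The circular-buffer output option can be sketched in a few lines. This is a minimal illustration of the idea, not the author's implementation; a real version would hang `Lines()` off an HTTP handler so recent log output is viewable in a browser.

```go
package main

import (
	"fmt"
	"sync"
)

// RingLog keeps the most recent max log lines in memory.
type RingLog struct {
	mu    sync.Mutex
	lines []string
	max   int
}

func NewRingLog(max int) *RingLog {
	return &RingLog{max: max}
}

// Append adds a line, dropping the oldest once the buffer is full.
// (A true circular index avoids the slice copy; the idea is the same.)
func (r *RingLog) Append(line string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.lines = append(r.lines, line)
	if len(r.lines) > r.max {
		r.lines = r.lines[1:]
	}
}

// Lines returns a snapshot of the buffered lines, oldest first, which
// an HTTP handler can render for browser-based log watching.
func (r *RingLog) Lines() []string {
	r.mu.Lock()
	defer r.mu.Unlock()
	out := make([]string, len(r.lines))
	copy(out, r.lines)
	return out
}

func main() {
	rl := NewRingLog(3)
	for i := 1; i <= 5; i++ {
		rl.Append(fmt.Sprintf("line %d", i))
	}
	fmt.Println(rl.Lines()) // prints [line 3 line 4 line 5]
}
```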

0 views
Martin Fowler 2 weeks ago

Encoding Team Standards

AI coding assistants respond to whoever is prompting, and the quality of what they produce depends on how well the prompter articulates team standards. Rahul Garg proposes treating the instructions that govern AI interactions (generation, refactoring, security, review) as infrastructure: versioned, reviewed, and shared artifacts that encode tacit team knowledge into executable instructions, making quality consistent regardless of who is at the keyboard.

0 views
Jim Nielsen 2 weeks ago

Continuous, Continuous, Continuous

Jason Gorman writes about the word “continuous” and its place in making software. We think of making software in stages (and we often assign roles to ourselves and other people based on these stages): the design phase, the coding phase, the testing phase, the integration phase, the release phase, and so on. However this approach to building and distributing software isn’t necessarily well-suited to an age where everything moves at breakneck speed and changes constantly. The moment we start writing code, we see how the design needs to change. The moment we start testing, we see how the code needs to change. The moment we integrate our changes, we see how ours or other people’s code needs to change. The moment we release working software into the world, we learn how the software needs to change. Making software is a continuous cycle of these interconnected stages: designing, coding, testing, integrating, releasing. But the lines between these stages are very blurry, and therefore the responsibilities of people on our teams will be too. The question is: are our cycles for these stages — and the collaborative work of the people involved in them — measured in hours or weeks? Do we complete each of these stages multiple times a day, or once every few weeks? if we work backwards from the goal of having working software that can be shipped at any time, we inevitably arrive at the need for continuous integration, and that doesn’t work without continuous testing, and that doesn’t work if we try to design and write all the code before we do any testing. Instead, we work in micro feedback loops, progressing one small step at a time, gathering feedback throughout so we can iterate towards a good result. Feedback on the process through the process must be evolutionary. You can’t save it all up for a post-mortem or a 1-on-1. It has to happen at the moment, evolving our understanding one piece of feedback at a time (see: Gall’s law , a complex system evolves from a simpler one). 
if code craft could be crystallised in one word, that word would be “continuous”. Your advantage in software will be your ability to evolve and change as your customers’ expectations evolve and change (because the world evolves and changes), which means you must be prepared to respond to, address, and deliver on changes in expectations at any given moment in time.

0 views
W. Jason Gilmore 2 weeks ago

Resolving Dependabot Issues with Claude Code

I created a Claude skill, creatively called dependabot, which once installed you can invoke like this: It will use the GitHub CLI to retrieve open Dependabot alerts and upgrade the relevant dependencies. If you have multiple GitHub accounts logged in via the CLI, it will ask which one it should use if it can't figure it out based on how the skill was invoked or based on the repository settings. You can find the skill here: https://github.com/wjgilmore/dependabot-skill To install it globally, open a terminal and go to your home directory, then into and clone there. Then restart Claude Code and you should be able to invoke it like any other skill. Here is some example output of it running on one of my projects:

0 views
Carlos Becker 2 weeks ago

Announcing GoReleaser v2.15

This version is a big one for Linux packaging - Flatpak bundles and Source RPMs land in the same release, alongside a rebuilt documentation website and better Go build defaults.

0 views
blog.philz.dev 2 weeks ago

computing 2+2: so many sandboxes

Sandboxes are so in right now. If you're doing agentic stuff, you've no doubt thought about what Simon Willison calls the lethal trifecta : private data, untrusted content, and external communication. If you work in a VM, for example, you can avoid putting a secret on that VM, and then that secret--that's not there!--can't be exfiltrated. If you want to deal with untrusted data, you can also cut off external communication. You can still use an agent, but you need to either limit its network access or limit its tools. So, today's task is to run five ways. Cloud Hypervisor is a Virtual Machine Monitor which runs on top of the Linux Kernel KVM (Kernel-based Virtual Machine) which runs on top of CPUs that support virtualization. A cloud-hypervisor VM sorta looks like a process on the host (and can be managed with cgroups, for example), but it's running a full Linux kernel. With the appropriate kernel options, you can run Docker containers, do tricky networking things, nested virtualization, and so on. Lineage-wise, it's in the same family as Firecracker and crosvm . It avoids implementing floppy devices and tries to be pretty small. Traditionally, people tell you to unpack a file system and maybe make a vinyl out of it using an iso image or some such. A trick is to instead start with a container image for your userspace, and then you get all the niceties (and all the warts) of Docker. Takes about 2 seconds. gVisor implements a large chunk of the Linux syscall interface in a Go process. Think of it as a userland kernel. It came out of Google's AppEngine work. It can use systrap/seccomp, ptrace, and KVM tricks to do the interception. The downside of gVisor is that you can't do some things inside of it. For example, you can't run vanilla Docker inside of gVisor because it doesn't support Docker's networking tricks. Again, let's use Docker to get ourselves a userland. No need for a kernel image. stands for "run secure container."
Monty is a Python interpreter written in Rust. It doesn't expose the host, but can call functions that are explicitly exposed. This one's super fast. Pyodide is CPython compiled to WebAssembly. Deno is a JS runtime with permission-based security. Deno happens to run wasm code fine, so we're using it as a wasm runtime. There are other choices. Chromium is probably the world's most popular sandbox. This is pretty much the same as Deno: it's the V8 interpreter under the hood. Lots of ways to drive Chromium. Puppeteer, headless , etc. Let's try rodney : Run pyodide inside Deno inside gVisor inside cloud-hypervisor. Setting up the networking and the file system/disk sharing for these things is usually not trivial, especially if you don't want to accidentally expose the VMs to each other, and so forth. I want to compare two possible agents: a coding agent and a logs agent. A coding agent needs a full Linux, because, at the end of the day, it needs to edit files and run tests and operate git. Your sandboxing options are going to end up being a VM or a container of some sort. A logs agent needs access to your logs (say, the ability to run readonly queries on Clickhouse) and it needs to be able to send you its output. In the minimal case, it doesn't need any sandboxing at all, since it doesn't have access to anything. If you want it to be able to produce a graph, however, it will need to write out a file. At a minimum, it will need to take the results of its queries and pair them with an HTML file that has some JS that renders them with Vegalite. You might also want to mix and match the results of multiple queries, and do some data munging outside of SQL. This is where a setup like Monty or Pyodide comes in handy. Giving the agent access to some Python expands considerably how much the agent can do, and you can do it cheaply and safely with these sandboxes.
In this vein, if you use DSPy for RLM, its implementation gives the LLM the Deno/pyodide solution to let the LLM have "infinite" context. Browser-based agents are a thing too. Itsy-Bitsy is a bookmarklet-based agent. It runs in the context of the web page it's operating on. Let me know what other systems I missed!

0 views
blog.philz.dev 2 weeks ago

What is Buildkite?

If you're starting a new project, just skip the misery of GitHub Actions and move on. Buildkite mostly gets it. The core Buildkite noun is a Pipeline, and, as traditional for an enterprise software company, their docs don't really tell you what's what. The point is that your pipeline should be: Pipelines can add steps to themselves . So, you can write a script to generate your pipeline (or just store it in your repo), and cat it into the command, and that's how the rest of your steps are discovered. Pipeline steps are each executed in their own clean checkout of what you're building. So, if you want to run the playwright tests in parallel with the backend tests (or whatever), you just declare that as two different steps, but they're part of the same thing. Pipelines have a dependency graph between steps that's conceptually similar to . (Perhaps was the Make replacement that I first heard of that did "generate the ninja graph"?) The agents seem pretty good at manipulating Buildkite once you give them an API key. They also seem not to inline shell scripts into the YAML, which is Obviously Good. The way to speed up a build is always the same: cache and parallelize. A 16-core machine for 1 minute costs the same as a 2-core machine for 8 minutes, and I know which one I'd rather wait for! Buildkite makes parallelism pretty easy. Anyway, it's pretty good. Thanks, Buildkite.
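The "pipelines add steps to themselves" pattern can be sketched like this, a minimal hypothetical setup (the file path and script name are illustrative, not from the post):

```yaml
# .buildkite/pipeline.yml: the only step Buildkite needs to know about
# up front; it generates the rest of the pipeline and uploads it.
steps:
  - label: ":pipeline: generate"
    command: ./scripts/generate-pipeline.sh | buildkite-agent pipeline upload
```

Here `generate-pipeline.sh` is a hypothetical script that prints more steps as YAML, say one step for the backend tests and one for the playwright tests, and `buildkite-agent pipeline upload` appends them to the running build, each executing in its own clean checkout.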

0 views