Posts in Devops (20 found)
Robin Moffatt Yesterday

Alternatives to MinIO for single-node local S3

In late 2025 the company behind MinIO decided to abandon it to pursue other commercial interests. As well as upsetting a bunch of folk, it also put the cat amongst the pigeons of many software demos that relied on MinIO to emulate S3 storage locally, not to mention build pipelines that used it for validating S3 compatibility. In this blog post I’m going to look at some alternatives to MinIO. Whilst MinIO is a lot more than 'just' a glorified tool for emulating S3 when building demos, my focus here is going to be on what is the simplest replacement. In practice that means the following:

Must have a Docker image. So many demos are shipped as Docker Compose, and no-one likes brewing their own Docker images unless really necessary.
Must provide S3 compatibility. The whole point of MinIO in these demos is to stand in for writing to actual S3.
Must be free to use, with a strong preference for an Open Source (per OSI definition) licence, e.g. Apache 2.0.
Should be simple to use for a single-node deployment.
Should have a clear and active community and/or commercial backer. Any fule can vibe-code some abandon-ware slop, or fork a project in a fit of enthusiasm—but MinIO stood the test of time until now and we don’t want to be repeating this exercise in six months' time.
Bonus points for excellent developer experience (DX), smooth configuration, good docs, etc.

What I’m not looking at is, for example, multi-node deployments, distributed storage, production support costs, GUI capabilities, and so on. That is, this blog post is not aimed at folk who were using MinIO as self-managed S3 in production. Feel free to leave a comment below though if you have useful things to add in this respect :)

My starting point for this is a very simple Docker Compose stack: DuckDB to read and write Iceberg data that’s stored on S3, provided by MinIO to start with. You can find the code here. The Docker Compose is pretty straightforward: DuckDB, obviously, along with an Iceberg REST Catalog, MinIO (S3 local storage), and mc, the MinIO CLI, used to automagically create a bucket for the data. When I insert data into DuckDB, it ends up in Iceberg format on S3 (here, in MinIO). In each of the samples I’ve built you can run a check to verify it.

Let’s now explore the different alternatives to MinIO, and how easy it is to switch MinIO out for each of them. I’ve taken the above project and tried to implement it with as few changes as possible to use the replacement for MinIO. I’ve left the MinIO S3 client in place, since that’s no big deal to replace if you want to rip out MinIO completely (s3cmd, the AWS CLI, etc etc).

S3Proxy
💾 Example Docker Compose. Version tested: ✅ Docker image (5M+ pulls) ✅ Licence: Apache 2.0 ✅ S3 compatibility. Ease of config: 👍👍 Very easy to implement, and seems like a nice lightweight option.

RustFS
💾 Example Docker Compose. Version tested: ✅ Docker image (100k+ pulls) ✅ Licence: Apache 2.0 ✅ S3 compatibility. Ease of config: ✅✅ RustFS also includes a GUI.

SeaweedFS
💾 Example Docker Compose. Version tested: ✅ Docker image (5M+ pulls) ✅ Licence: Apache 2.0 ✅ S3 compatibility. Ease of config: 👍 This quickstart is useful for getting bare-minimum S3 functionality working. (That said, I still just got Claude to do the implementation…). Overall there’s not too much to change here; a fairly straightforward switchout of Docker images, but the auth does need its own config file (which, as with Garage, I inlined in the Docker Compose).
SeaweedFS comes with its own basic UI which is handy. The SeaweedFS website is surprisingly sparse and at a glance you’d be forgiven for missing that it’s an OSS project, since there’s a "pricing" option and the title of the front page is "SeaweedFS Enterprise" (and no GitHub link that I could find!). But an OSS project it is, and a long-established one: SeaweedFS has been around with S3 support since its 0.91 release in 2018. You can also learn more about SeaweedFS from these slides, including a comparison chart with MinIO.

CloudServer
💾 Example Docker Compose. Version tested: ✅ Docker image (also outdated ones on Docker Hub with 5M+ pulls) ✅ Licence: Apache 2.0 ✅ S3 compatibility. Ease of config: 👍 Formerly known as S3 Server, CloudServer is part of a toolset called Zenko, published by Scality. It drops in to replace MinIO pretty easily, but I did find it slightly tricky at first to disentangle the set of names (cloudserver/zenko/scality) and what the actual software I needed to run was. There’s also a slightly odd feel that the docs link to an outdated Docker image.

Garage
💾 Example Docker Compose. Version tested: ✅ Docker image (1M+ pulls) ✅ Licence: AGPL ✅ S3 compatibility. Ease of config: 😵 I had to get a friend to help me with this one. As well as the container, I needed another to do the initial configuration, as well as a TOML config file which I’ve inlined in the Docker Compose to keep things concise. Could I have sat down and RTFM’d to figure it out myself? Yes. Do I have better things to do with my time? Also, yes. So, Garage does work, but gosh…it is not just a drop-in replacement in terms of code changes. It requires different plumbing for initialisation, and it’s not simple at that either. Excellent for production hygiene…overkill for local demos, and in fact somewhat of a hindrance TBH.

Apache Ozone
💾 Example Docker Compose. Version tested: ✅ Docker images (1M+ pulls) ✅ Licence: Apache 2.0 ✅ S3 compatibility. Ease of config: 😵 Ozone was spun out of Apache Hadoop (remember that?) in 2020, having been initially created as part of the HDFS project back in 2015. It does work as a replacement for MinIO, but it is not a lightweight alternative; neither I nor Claude could figure out how to deploy it with any fewer than four nodes. It gives heavy Hadoop vibes, and I wouldn’t be rushing to adopt it for my use case here.

I took one look at the installation instructions and noped right out of this one! Ozone (above) is heavyweight enough; I’m sure both are great at what they do, but they are not a lightweight container to slot into my Docker Compose stack for local demos.

Everyone loves a bake-off chart, right?
gaul/s3proxy (Git repo): Single contributor (Andrew Gaul)
RustFS (Git repo): Fancy website but not much detail about the company
SeaweedFS (Git repo): Single contributor (Chris Lu), Enterprise option available
Zenko CloudServer (Git repo): Scality (commercial company), 5M+ Docker pulls (outdated version)
Garage (Git repo): NGI/NLnet grants
Apache Ozone (Git repo): Apache Software Foundation
(1) Docker pulls is a useful signal but not an absolute one given that a small number of downstream projects using the image in a frequently-run CI/CD pipeline could easily distort this figure.

I got side-tracked into writing this blog because I wanted to update a demo in which MinIO was currently used. So, having tried them out, which of the options will I actually use? SeaweedFS - yes. S3Proxy - yes. RustFS - maybe, but very new project & alpha release. CloudServer - yes, maybe?
Honestly, put off by it being part of a suite and worrying I’d need to understand other bits of it to use it—probably unfounded though. Garage - no, config too complex for what I need. Apache Ozone - lol no. I mean to cast no shade on those options against which I’ve not recorded a yes; they’re probably excellent projects, but just not focussed on my primary use case (simple & easy to configure single-node local S3).

A few parting considerations to bear in mind when choosing a replacement for MinIO: Governance. Whilst all the projects are OSS, only Ozone is owned by a foundation (ASF). All the others could, in theory, change their licence at the drop of a hat (just like MinIO did). Community health. What’s the "bus factor"? A couple of the projects above have a very long and healthy history—but from a single contributor. If they were to abandon the project, would someone in the community fork and continue to actively develop it?
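For reference, the MinIO-plus-mc pattern that these demos start from (and that each alternative has to slot into) looks roughly like the following. This is a generic sketch with placeholder credentials and bucket name, not the post's actual Compose file.

```yaml
# Sketch only: the usual "MinIO for local S3, plus mc to pre-create a bucket" pattern.
services:
  minio:
    image: minio/minio
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: admin          # placeholder credentials
      MINIO_ROOT_PASSWORD: password
    ports:
      - "9000:9000"   # S3 API
      - "9001:9001"   # web console

  mc:
    image: minio/mc
    depends_on:
      - minio
    # one-shot init container: point mc at the MinIO service and create a bucket
    entrypoint: >
      /bin/sh -c "
      mc alias set local http://minio:9000 admin password &&
      mc mb --ignore-existing local/warehouse
      "
```

Swapping MinIO for one of the alternatives is then mostly a case of replacing the minio service (image, command, and credential environment variables) while keeping the bucket-creation step pointed at the new endpoint.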

0 views
neilzone 2 days ago

Enabling a user's processes to continue after the user disconnects their ssh session, using loginctl enable-linger

I set up Immich over the weekend, using rootless podman. An annoyance was that, when I disconnected from ssh, podman stopped running too. On an interim basis, I fudged it by opening a new session, and running podman within that. The “correct” solution, as far as I can tell, is to use loginctl enable-linger for that user: Having done this, I can now disconnect from ssh, and the podman containers continue to run.
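A minimal sketch of the commands involved (run on the server; enabling linger for a different user needs root):

```sh
# allow this user's processes (e.g. rootless podman containers) to keep
# running after the ssh session ends
loginctl enable-linger "$USER"

# confirm it took effect; should print Linger=yes
loginctl show-user "$USER" --property=Linger
```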

0 views
neilzone 3 days ago

My initial thoughts about Immich, a self-hosted photo gallery

I am looking to move away from Nextcloud (another blogpost at another time), and one of the things that I wanted to replace was a tool for automatically uploading photos from my phone (Graphene OS) and Sandra’s phone (an iPhone). At the moment, they are uploaded automatically to Nextcloud, although I’m not sure how well that works on Sandra’s iPhone. I also take a backup, using rsync to copy photos from the phones to an external drive , when I remember, which frankly is not often enough. But I did one of these backups ahead of installing Immich, which was useful. Lastly, I have a load of photos from the last 20 years, including our wedding photos, saved onto two different drives, and I wanted to add those to the hosted backup. I had heard a lot about Immich , with happy users, so I decided to give it a go. I did not really look around for alternatives. I already had a machine suitable to run Immich, so that was straightforward. Immich is, to my slight annoyance, only available via docker. I’m not a fan of docker, as I don’t understand it well enough. I can just about use it, but if there is a problem, I don’t have many skills beyond “bring it down then bring it up again”. I certainly haven’t mastered docker’s networking. But anyway, it is available via docker. So, naturally, I used and . I followed the official instructions , substituting “podman” for “docker”, and it worked fine. Since I am using an external drive to store the photo library, I configured that in the file. And there we go. Immich was running. For TLS termination, I use nginx, which I already had running on that machine anyway. The official suggested nginx configuration worked for http, but for whatever reason, I could not get it to work with to sort out a TLS certificate. I didn’t spend too much time debugging this, as I was not convinced I’d be able to get it to work. Instead, I started from the default nginx config, then used certbot, then put bits back in from the Immich-suggested config. I tested it with , to make sure I hadn’t somehow messed it up, and it was working, so fingers crossed. I set up an admin user, and then two normal users, for Sandra and me. I didn’t do any other specific configuration. Immich does not offer multi-factor authentication, which I find surprising, but ho hum, since it is an internal-only service, it is not a big deal for me. Perhaps I need to make Authelia or similar a future project. I set up the Android app, via F-Droid , and the iPhone app from the App Store. I decided to upload my Camera library without getting into albums, so there was no particular configuration other than choosing the Camera directory. Uploading ~3000 photos from my phone took a while, but it worked fine. For Sandra’s iPhone, I set it up, and set up the directory to sync, but I didn’t wait for it to upload. Instead - and for the other photos which I wanted to upload to Immich - I used the backup copy I had on an external drive. I used the command line tool immich-go for this. After creating both a user API key and an admin API key, I just left it to do its thing. I had to restart it a few times (and switch user API key for Sandra’s photos, to upload them to her account), but after a few runs, I had uploaded everything. I managed to upload all of my photos before I went to bed last night and, this morning, the server had created all the thumbnails, and done the facial recognition (not sure how I feel about this). I expect Sandra’s to be ready tomorrow. 
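For reference, the nginx reverse-proxy piece of a setup like this boils down to roughly the following sketch. It assumes Immich's default port 2283 and a placeholder hostname; the official Immich docs have the fuller recommended config, and the certificate paths shown are just certbot's usual defaults.

```nginx
server {
    listen 443 ssl;
    server_name immich.example.internal;   # placeholder hostname

    # certificate paths as typically written by certbot
    ssl_certificate     /etc/letsencrypt/live/immich.example.internal/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/immich.example.internal/privkey.pem;

    # allow large photo/video uploads
    client_max_body_size 50000M;

    location / {
        proxy_pass http://127.0.0.1:2283;   # Immich's default port
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        # websocket support, used by the Immich clients
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```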
I have quite a few photos without a location, because they were taken many years ago on an old-fashioned digital camera. Batch replacement of location is reasonably straightforward, and I have not gone for the exact specific location, but at least the right town / city. I have fixed the date on some photos, but I need to go through more of them. 1 January is a very popular date. As far as I know, there is no way to fix incorrect orientation right now , but I think that that is coming pretty soon. So far, I am impressed. Despite docker, it was pretty straightforward to get running, and I am willing to chalk my problems with nginx down to me. I will probably use the web interface rather than the app, and that seems pretty good. Sandra will probably use the iOS app and, again, that seems fine. As long as it uploads photos automatically - which is something to test / keep an eye on - I suspect that we will be happy with it.

2 views
Abhinav Sarkar 5 days ago

How I use Jujutsu

About three months ago I started using Jujutsu (JJ), a new Version Control System, for my personal projects. It took me a while to get used to it after more than a decade of using Git, but now I’m quite comfortable with it. Working with Jujutsu requires a shift from the mental model of Git. However, it is not as daunting as it may seem on the first day. This post was originally published on abhinavsarkar.net. Looking back, I don’t actually use all the fancy things JJ provides, and you may not need to either. In this post I list my most used JJ commands and how I use them. It is not meant to be a tutorial, or even a comprehensive list, but it should be enough to get you started. This post assumes that the reader knows how to use Git. JJ uses Git as a backend. This means that you can still use Git commands in your repo, and push them to Git remotes. Your coworkers can keep using Git with shared repos without ever being aware that you use JJ.

jj git init initializes a new Jujutsu repository. You do this once, and you’re ready to start working. I usually run it with the --colocate option, which allows me to use Git commands as well in the same repo. If you want to work in an existing Git repo, you should run it with --colocate in the repo directory, to make JJ aware of it. Afterward, you don’t need to use Git commands. jj git clone clones a Git repo and initializes it as a JJ repo. You can supply the --colocate option if you want. jj config configures user settings. You can edit the user-level JJ config file by running jj config edit --user. You can also override settings at repo level. For example, to set a different user email for a repo, run jj config set --repo user.email with the address you want. You can also run jj config list to list the current config in effect.

This is an area where JJ differs a lot from Git. JJ has no staging area, which means that every change you make is automatically and continuously staged. This came as a big surprise to me when I was getting started. If you are planning to use JJ with an existing Git repo, get rid of the untracked files either by committing them, or deleting them, or adding them to .gitignore. There is literally no concept of untracked files in JJ; a file is either committed or tracked or ignored.

JJ has the concept of commits, same as Git. However, the workflow is different. Since there is no staging area, you start with creating a commit. That’s right! The first thing you do is create a commit, and then fill it by changing your files. Once you are done, you finalize the commit, and move on to a new fresh commit. JJ prefers to call them “changes” instead of commits to distinguish them from Git commits.

jj new creates a new change. If you know what your change is about, you can start with a commit message using jj new -m, but JJ does not mandate it. You can start making changes without worrying about the message. One useful variation that I use a lot is jj new --insert-after, giving it a change ID. This creates a new change after the given change but before all the change’s descendants, effectively inserting a new change in the commit tree while simultaneously rebasing all descendant changes. Once you are done, you can add a commit message to the current change by running jj describe. You can also provide the message inline with jj describe -m. As I mentioned, you don’t need to add a message to start working on a change, but you do need it before you push the change to a Git remote. You can run it any number of times to change the current change’s message. Alternatively, you can run jj commit to describe the current commit and start a new one. It is equivalent to running jj describe followed by jj new. I use a mix of jj new, jj describe and jj commit, depending on the situation. Like the git status command, jj status tells you the state your current change is in.
It lists the changed files and their individual statuses (added, modified, etc).

This is where JJ really shines compared to Git. Moving commits around or editing them is a massive pain in Git. However, JJ makes it so easy, I do it many times a day.

jj edit switches you over to the given change so you can modify it further. You use this when you’ve already committed a change but you need to tweak it. By default, you can edit only the changes that haven’t been pushed to the main branch of your repo’s remote. After you edit files, all the descendant changes are automatically rebased if there are no conflicts. jj squash simply combines the current change with its parent. It is useful when you commit something, and realize that you forgot to make some small changes. Another use for it is to resolve conflicts: create a new change after the conflicted change, fix the conflict, and squash it to resolve. jj split is the opposite of jj squash: you use it to interactively split the current change into two or more changes. Often when I’m working on a feature and I find some unrelated things to fix, such as linter warnings, I go ahead and fix them in the same change. After I’m done with all the work for the feature, I use jj split to split the change into unrelated changes so that the project history stays clean. jj restore restores files to how they were before the change, pretty much the same as git restore. You can run it in interactive mode by adding the -i option. You can also restore the files to how they were in a different change by specifying a change ID with the --from option.

jj rebase moves changes from anywhere to anywhere. You can use it to move individual changes between branches, or rearrange them in the same branch. When you move single changes like this, their descendant changes become invalid, but you can move them also in the same way. Or you can move an entire branch of changes. It mostly works without any issues, but if there are conflicts, you’ll need to resolve them. I actually use rebase all the time. When I’m working on multiple features, and I find something that is more suited to be done on a different feature branch than the one I’m currently on, I finish working on the change, and then just move it to the different branch. Another use is to rebase feature branches on the main branch every day, using the shorthand change IDs of the roots of the various feature branches as the rebase sources. You can also use rebase to splice changes/branches in the middle of other branches using the --insert-after and --insert-before options, but I rarely do this. jj duplicate is like jj rebase except the changes are not moved but copied to the destination. It’s somewhat like git cherry-pick. jj abandon discards a change and rebases all its descendants onto the discarded change’s parent. I use it to get rid of failed experiments or prototypes. jj absorb is supposed to automatically break the current change and integrate parts of it into ancestor changes at the right places, but I haven’t managed to make it work yet. I need to look more deeply into this.

jj log shows the change graph. JJ has a concept of revsets (sets of changes) that has an entire language to specify change IDs. jj log takes a -r argument that uses the revsets language to choose which changes to show. The revset language is rich and revsets can be used with many JJ commands. You can also create your own aliases for it, as we’ll see in a later section. jj diff shows differences between two changes, or in general between two revsets. jj show shows the details of the current change. You can also pass it a change ID to inspect another change without switching to it.
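A rough sketch of how these commands chain together in a day-to-day session (the change IDs, bookmark names, and messages here are placeholders, not taken from the post):

```sh
# Sketch of the workflow described above; "xyz" and "main" are placeholders.
jj new -m "add feature X"        # start a new change
# ...edit files; everything is snapshotted automatically, no staging step...
jj status                        # see what the current change touches
jj describe -m "add feature X"   # (re)word the message at any point
jj new                           # finalize by starting the next change

jj squash                        # fold the current change into its parent
jj split                         # or interactively split it into pieces
jj rebase -r xyz -d main         # move a single change onto another branch
jj log -r 'trunk()..@'           # show changes between trunk and the working copy
```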
I’ve been mentioning branches, but actually JJ does not have branches like Git does. Instead, it has bookmarks, which are named pointers to changes. You can bookmark any change by giving it a name, and you can manipulate the bookmarks. Then to have branches, all you need to do is to point a bookmark to the required tip of the change graph.

jj bookmark create creates a new bookmark pointing to the current change with the given name. You can use bookmarks to mark the root or the tip of a feature branch, or to mark a milestone you want to return to later. When you rebase a change that a bookmark points to, the bookmark moves with it automatically. To list all existing bookmarks, run jj bookmark list. To delete a bookmark you no longer need, run jj bookmark delete. If the deleted bookmark is tracked as a remote Git branch, the deletion is propagated to the remote as well. Alternatively, you can delete a bookmark only locally by running jj bookmark forget. You can also move, rename, and set bookmarks, as well as associate/disassociate them with Git remote branches. If you push a change with a bookmark to a Git remote, JJ creates a Git branch with the same name on the remote, but locally it remains a JJ bookmark.

JJ tracks each operation in the repository in an immutable log, and provides commands to work with this log. jj op log shows a history of all operations performed on the repository. Each operation is assigned a unique ID, and you can see what changed with each operation. You can use the op IDs to restore the whole repo to an earlier state by running jj op restore. jj undo undoes the last operation performed on the repository. Unlike Git’s history-rewriting commands, which modify history, jj undo works on the Jujutsu operations themselves. This means it doesn’t lose any information; it just moves you back one step in the operation history. You can run this repeatedly to move backward in the operation history one step at a time. The redo command is the opposite of undo, that is, it moves you forward in the operation history by one step. It can also be run repeatedly. The operation log along with the undo and redo commands provide a safety net that makes it much easier to experiment with JJ without the fear of losing work.

JJ uses Git as its backend, and provides commands to interact with remote Git repos. We already learned about jj git init and jj git clone. We can also push and fetch. jj git push pushes your JJ changes to a Git remote. By default, it pushes all tracked bookmarks that have new changes. If you want to push a specific bookmark, you can specify it with --bookmark. You can also push all or tracked branches with the --all and --tracked options respectively. When you push, JJ converts the changes into Git commits and creates or updates remote Git branches accordingly. One thing to note is that JJ refuses to push changes that have conflicts or are missing commit messages. jj git fetch fetches changes from a Git remote and updates your local repository. It’s equivalent to git fetch. After fetching, you can see the remote changes in your change graph, and you can rebase your local changes on top of them if needed. You can fetch from a specific remote with the --remote option, and fetch a particular branch with the --branch option. jj git remote manages your Git remotes. You can add a new remote with jj git remote add, or list existing remotes with jj git remote list. This is similar to git remote but integrated with JJ; it does not update the remotes of your underlying Git repo.

jj revert creates a new change that undoes the effects of the specified change, pretty much like git revert. The reverted change remains in the history of the repo. jj resolve marks conflicts as resolved during a merge.
When JJ can’t automatically merge changes (for example, when two changes modified the same lines), it creates a conflicted state in your working directory. After you manually fix the conflicts in your files, you run jj resolve to tell JJ that the conflicts are resolved and the merge can proceed. JJ then automatically rebases any descendant changes.

JJ is highly customizable through its configuration files. You can define custom aliases for commonly used commands and revsets, which can significantly ease up your workflow. These are stored in your JJ config file at the user and/or repo level (a rough sketch of what such a config can look like appears at the end of this post). You can compose revsets to create new revsets. These are the ones I use: one that finds nonempty leaf changes that are mutable, have descriptions, and can be pushed to a remote; one that finds changes from the default branch or the repository root to the current change, plus ancestors of visible leaf changes from the last 5 days, which gives me a good overview of the state of my repo; one that finds all changes from the last month; and one that shows the recent changes from the default branch to the present, combining the previous two. I use these revsets to create some custom commands, including one that moves the bookmark in the current branch to the closest pushable commit. I have the default command set to my log alias, so running a bare jj only shows me the recent log. My usual workflow is to create a new commit, work on it, describe it, split/squash/rebase as needed, then push it with jj git push.

Three months in, JJ has become my primary version control tool. The learning curve was steep, but it was worth it. The ability to freely rearrange changes and experiment without fear has fundamentally changed how I work. I spend less time wrestling with Git and more time actually coding. JJ has plenty of other useful features such as workspaces and the ability to manipulate multiple changes at once that I haven’t explored deeply. There’s a lot more to discover as I continue using it. If you use Git for personal projects and find yourself frustrated with rebasing or commit management, JJ might be worth a try. For further learning, I recommend the Jujutsu for Everyone tutorial, Steve Klabnik’s tutorial and Justin Pombrio’s cheat sheet, and of course, the official documentation. If you have any questions or comments, please leave a comment below. If you liked this post, please share it. Thanks for reading!
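As referenced above, here is a sketch of what such a config can look like. The alias names and revset expressions are illustrative guesses only, not the author's actual configuration, which isn't reproduced here.

```toml
# ~/.config/jj/config.toml — illustrative sketch; names and revsets are assumptions.
[revset-aliases]
# nonempty, mutable head changes that have a description (i.e. pushable work)
'pushable()' = 'heads(mutable() & ~empty() & ~description(exact:""))'
# everything from the default branch up to the working copy
'recent()' = 'trunk()..@'

[aliases]
# short log of recent work: `jj l`
l = ["log", "-r", "recent()"]

[ui]
# running a bare `jj` shows the recent log
default-command = "l"
```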

0 views
Simon Willison 5 days ago

Fly's new Sprites.dev addresses both developer sandboxes and API sandboxes at the same time

New from Fly.io today: Sprites.dev . Here's their blog post and YouTube demo . It's an interesting new product that's quite difficult to explain - Fly call it "Stateful sandbox environments with checkpoint & restore" but I see it as hitting two of my current favorite problems: a safe development environment for running coding agents and an API for running untrusted code in a secure sandbox. Disclosure: Fly sponsor some of my work. They did not ask me to write about Sprites and I didn't get preview access prior to the launch. My enthusiasm here is genuine. I predicted earlier this week that "we’re due a Challenger disaster with respect to coding agent security" due to the terrifying way most of us are using coding agents like Claude Code and Codex CLI. Running them in mode (aka YOLO mode, where the agent acts without constantly seeking approval first) unlocks so much more power, but also means that a mistake or a malicious prompt injection can cause all sorts of damage to your system and data. The safe way to run YOLO mode is in a robust sandbox, where the worst thing that can happen is the sandbox gets messed up and you have to throw it away and get another one. That's the first problem Sprites solves: That's all it takes to get SSH connected to a fresh environment, running in an ~8GB RAM, 8 CPU server. And... Claude Code and Codex and Gemini CLI and Python 3.13 and Node.js 22.20 and a bunch of other tools are already installed. The first time you run it neatly signs you in to your existing account with Anthropic. The Sprites VM is persistent so future runs of will get you back to where you were before. ... and it automatically sets up port forwarding, so you can run a localhost server on your Sprite and access it from on your machine. There's also a command you can run to assign a public URL to your Sprite, so anyone else can access it if they know the secret URL. In the blog post Kurt Mackey argues that ephemeral, disposable sandboxes are not the best fit for coding agents: The state of the art in agent isolation is a read-only sandbox. At Fly.io, we’ve been selling that story for years, and we’re calling it: ephemeral sandboxes are obsolete. Stop killing your sandboxes every time you use them. [...] If you force an agent to, it’ll work around containerization and do work . But you’re not helping the agent in any way by doing that. They don’t want containers. They don’t want “sandboxes”. They want computers. [...] with an actual computer, Claude doesn’t have to rebuild my entire development environment every time I pick up a PR. Each Sprite gets a proper filesystem which persists in between sessions, even while the Sprite itself shuts down after inactivity. It sounds like they're doing some clever filesystem tricks here, I'm looking forward to learning more about those in the future. There are some clues on the homepage : You read and write to fast, directly attached NVMe storage. Your data then gets written to durable, external object storage. [...] You don't pay for allocated filesystem space, just the blocks you write. And it's all TRIM friendly, so your bill goes down when you delete things. The really clever feature is checkpoints. You (or your coding agent) can trigger a checkpoint which takes around 300ms. This captures the entire disk state and can then be rolled back to later. 
For more on how that works, run this in a Sprite: Here's the relevant section: Or run this to see the for the command used to manage them: Which looks like this: I'm a big fan of Skills , the mechanism whereby Claude Code (and increasingly other agents too) can be given additional capabilities by describing them in Markdown files in a specific directory structure. In a smart piece of design, Sprites uses pre-installed skills to teach Claude how Sprites itself works. This means you can ask Claude on the machine how to do things like open up ports and it will talk you through the process. There's all sorts of interesting stuff in the folder on that machine - digging in there is a great way to learn more about how Sprites works. Also from my predictions post earlier this week: "We’re finally going to solve sandboxing" . I am obsessed with this problem: I want to be able to run untrusted code safely, both on my personal devices and in the context of web services I'm building for other people to use. I have so many things I want to build that depend on being able to take untrusted code - from users or from LLMs or from LLMs-driven-by-users - and run that code in a sandbox where I can be confident that the blast radius if something goes wrong is tightly contained. Sprites offers a clean JSON API for doing exactly that, plus client libraries in Go and TypeScript and coming-soon Python and Elixir . From their quick start: You can also checkpoint and rollback via the API, so you can get your environment exactly how you like it, checkpoint it, run a bunch of untrusted code, then roll back to the clean checkpoint when you're done. Managing network access is an important part of maintaining a good sandbox. The Sprites API lets you configure network access policies using a DNS-based allow/deny list like this: Sprites have scale-to-zero baked into the architecture. They go to sleep after 30 seconds of inactivity, wake up quickly when needed and bill you for just the CPU hours, RAM hours and GB-hours of storage you use while the Sprite is awake. Fly estimate a 4 hour intensive coding session as costing around 46 cents, and a low traffic web app with 30 hours of wake time per month at ~$4. (I calculate that a web app that consumes all 8 CPUs and all 8GBs of RAM 24/7 for a month would cost ((7 cents * 8 * 24 * 30) + (4.375 cents * 8 * 24 * 30)) / 100 = $655.2 per month, so don't necessarily use these as your primary web hosting solution for an app that soaks up all available CPU and RAM!) I was hopeful that Fly would enter the developer-friendly sandbox API market, especially given other entrants from companies like Cloudflare and Modal and E2B . I did not expect that they'd tackle the developer sandbox problem at the same time, and with the same product! My one concern here is that it makes the product itself a little harder to explain. I'm already spinning up some prototypes of sandbox-adjacent things I've always wanted to build, and early signs are very promising. I'll write more about these as they turn into useful projects. Update : Here's some additional colour from Thomas Ptacek on Hacker News: This has been in the works for quite awhile here. We put a long bet on "slow create fast start/stop" --- which is a really interesting and useful shape for execution environments --- but it didn't make sense to sandboxers, so "fast create" has been the White Whale at Fly.io for over a year. You are only seeing the long-form articles from my blog. 
Subscribe to /atom/everything/ to get all of my posts, or take a look at my other subscription options.

1 view
Jeff Geerling 5 days ago

Local Email Debugging with Mailpit

For the past decade, I've used Mailhog for local email debugging. Besides working on web applications that deal with email, I've long used email as the primary notification system for comments on the blog. I built an Ansible role for Mailhog, and it was one of the main features of Drupal VM, a popular local development environment for Drupal I sunset 3 years ago. Unfortunately, barring any future updates from the maintainers, it seems like Mailhog has not been maintained for four years now. It still works, but something as complex as an email debugging environment needs ongoing maintenance to stay relevant.
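Mailpit, the tool named in the title, is an actively maintained drop-in for this role. A minimal sketch of running it locally (these are Mailpit's default ports, and the container name is arbitrary):

```sh
# SMTP capture on 1025, web UI on 8025; point your app's SMTP settings at localhost:1025
docker run -d --name=mailpit -p 1025:1025 -p 8025:8025 axllent/mailpit
```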

0 views
Brain Baking 1 weeks ago

Thinking about email workflows

This Emacs thing is getting out of hand and eating away all my free time. Now I know what they mean with the saying “diving into a rabbit hole” (and never seeing the bottom of it). We’re at 1k lines of Elisp code and I still add items to the list that don’t work well enough on a daily basis. For some weird reason, I decided to try my hand at using Emacs as an email client as well. Anyway, we can save those boring technical details for another post you can safely skip then, but for now, let’s stick with the philosophical implications of messing with my email schedule and/or habits. I’ve had some dirty habits that I thought kicked the bucket way back in 2021 when I threw out everything Google-related . Except that I didn’t throw out much—I just started doing something else. My Google & GMail account still lives but now primarily serves as yet another spam address. But I forgot to clean up and process the archive! I had another account lying around ( ) that I stopped using in 2013-ish but I forgot to clean up and process those archives as well! The Google Takeout as backup I saved, but the original ones I didn’t delete, meaning my data was still out there. Whoops. The question is: what to do with a bunch of very old emails? Do you save them all? Locally or centrally? Which ones? I had to think about this because the Emacs package I use—excellent Dutch software called mu4e —works with IMAP. I still rocked POP3 so I moved to IMAP. But in IMAP, you synchronize between client and server, meaning most stuff stays on the server which I don’t like. Why keep an IMAP folder in there just accumulating junk to wire up and down? And should I dump my GMail archive in there as well? Since moving from GMail ( and then Protonmail ), my preferred mail client has been Apple Mail. I want a proper application for working with email, not a webapp, and I don’t want any email near my smartphone (so I don’t really care about syncing that much, which is why I stayed with POP). Nothing is stopping you from creating a folder “On My Mac” and moving stuff in there instead of pressing the Archive button—in that way, the email disappears from the server. But then it ends up in a proprietary database format. Now, it’s all just flat text files syncing with and auto-backed up with various stuff. But perhaps you still want a semi-permanent archive folder to sync just in case? I’m a zero inbox kind of guy: once the mail has been dealt with, it needs to go: That means my folders look like this: Why isn’t inside the folder? Because that’s outside IMAP sync zone. is there in case I need something synced, but it’s rarely used and I plan to delete it in the coming months. will serve as the semi-saved “ongoing thing but don’t need to deal with right now but can’t get rid of just yet” folder. But what about ? That’s simple: I set up rules that automatically move emails to that folder to only occasionally glance at. For example, our daughter’s preschool loves to send at least four days a week titled “NEW MESSAGE IN PARENT PLATFORM!!!!!ONE!!11”. Ah, and yes, that Limited Run Games mailing list? *Cough*. Yeah, that one that I shouldn’t be looking at. In it goes: at least it’s not staring at me in . Now about that (local) archive. Why keep emails around? Several reasons: That being said, I am an opponent of blindfully preserving everything “just in case”. You don’t need that email invoice if you have the invoice stored. You don’t need that project mail if the project was done and buried five years ago. 
You don’t need those superficial “sure I’ll be there” appointment emails once the event is over. I hate it when people say Just Save Everything, Dude, It’s GMail! . To me, that sounds like I’m Too Lazy To Filter, Dude! Where’s My Stuff? —although that’s also a perfectly valid strategy. But then again, that might just be me. How do you deal with your emails? What’s your grand archival plan? Send me a mail and let me know! If it’s interesting enough I’ll promise to keep it indefinitely. Related topics: / email / emacs / By Wouter Groeneveld on 7 January 2026.  Reply via email . Is it spam? Move to junk & have your filter learn from it. Is it a short thing that you can answer (if needed) and forget about? Delete. Is it informational/an invite/whatever that you can move to a calendar? Do it & delete. Is it an invoice/whatever where you can save the attach into the DEVONThink inbox? Do it & delete. Is it a receipt without attach? Print as PDF and treat as above. Is it an email from family full of photos of last Saturday’s party? Save them all to your NAS where Photoprism can find them & delete. Is it from an ongoing project that you still need to keep as evidence just in case? Move to the “projects” folder. Is it an exciting email from friends, co-bloggers, et al.? Answer & archive to save. I can’t say goodbye to them. Several conversations with my late father-in-law and other deceased where I honestly don’t have the courage to trash them permanently. They were meaningful to me. Same as above, I guess, except for these people are still alive? I like keeping emails from lovely folks around. They might still have a practical use. Since all mails are indexed by , I can quickly whip up a search and find stuff not stored elsewhere. It should, though.

0 views
André Arko 1 weeks ago

Announcing rv clean-install

Originally posted on the Spinel blog. As part of our quest to build a fast Ruby project tool, we’ve been hard at work on the next step of project management: installing gems. As we’ve learned over the last 15 years of working on Bundler and RubyGems, package managers are really complicated! It’s too much to try to copy all of rbenv, and ruby-build, and RubyGems, and Bundler, all at the same time. Since we can’t ship everything at once, we spent some time discussing the first project management feature we should add after Ruby versions. Inspired by similar commands in other ecosystems, we decided to build rv clean-install. Today, we’re releasing the rv clean-install command as part of version 0.4. So, what is a clean install? In this case, clean means “from a clean slate”. You can use rv clean-install to install the packages your project needs after a fresh checkout, or before running your tests in CI. It’s useful by itself, and it’s also a concrete step towards managing a project and its dependencies. Even better, it lays a lot of the groundwork for future gem management functionality, including downloading, caching, and unpacking gems, compiling native gem extensions, and providing libraries that can be loaded by Bundler at runtime. While we don’t (yet!) handle adding, removing, or updating gem versions, we’re extremely proud of the progress that we’ve made, and we’re looking forward to improving based on your feedback. Try running rv clean-install today, and see how it goes. Is it fast? Slow? Are there errors? What do you want to see next? Let us know what you think.

0 views
neilzone 1 weeks ago

Dealing with apt's warning 'Policy will reject signature within a year, see --audit for details'

I’ve noticed an increasing number of apt update runs result in a warning that: Running apt update with --audit (as suggested) results in something like: My understanding is that - as the last line suggests - there has been a change in key-handling policy by apt, and that keys which were previously acceptable are (or, rather, will be) no longer acceptable by default. The “correct” way of solving this is for the repository provider to update their signing key to something which is compliant. However, I have no control over what a repository provider does, or when they will do it. For instance, the warning message above suggests to me that I will have a problem on 1 February 2026, so under a month away. I can suppress this warning - and tell apt to accept the key - by adding an option to the apt command line: Or, to avoid having to add that each time, I can add a slightly-tweaked version of it to an apt config file in /etc/apt/apt.conf.d/. For instance, I can put this into a file there: Hopefully though, repository providers will update their keys (which will then need re-importing).

0 views
neilzone 1 weeks ago

Removing .m4v files from my media server when an equivalent .mp4 file exists

For some reason, some of the directories on my media server have both .mp4 and .m4v versions of the same thing. I blame Past Neil for this. To save space, I wanted to delete the duplicate .m4v files. “Duplicate” here just means that the files have the same name but different extensions. This does not do anything clever/safe in terms of checking the .mp4 file is valid, is the right length etc. I was willing to take this risk. It might be possible to do all of this with a one-liner, of course. But this worked for me, and gave me a chance to eyeball the list of files at each point. I saved about 100GB of space :) One with .mp4 files, one with .m4v files, in each of the directories in : Because, to compare the lists, these need to be removed, to make the strings identical i.e. the files which exist as both .mp4 and .m4v. So that you have a list of the .m4v files, which are to be deleted.
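One way to script the idea described above is sketched below. This is not the author's exact sequence of commands (those aren't shown here); the media path is a placeholder, and it prints what it would delete so you can eyeball the list before switching the echo to an actual rm.

```bash
#!/usr/bin/env bash
# For every .m4v file under MEDIA_ROOT, delete it only if an .mp4 with the
# same basename exists alongside it. Dry-run by default.
MEDIA_ROOT="/path/to/media"   # placeholder path

find "$MEDIA_ROOT" -type f -name '*.m4v' | while IFS= read -r m4v; do
    mp4="${m4v%.m4v}.mp4"          # same name, .mp4 extension
    if [ -f "$mp4" ]; then
        echo "would delete: $m4v"  # replace echo with: rm -v "$m4v"
    fi
done
```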

0 views
Danny McClelland 1 weeks ago

Using Proton Pass CLI to Keep Linux Scripts Secure

If you manage dotfiles in a public Git repository, you’ve probably faced the dilemma of how to handle secrets. API keys, passwords, and tokens need to live somewhere, but committing them to version control is a security risk. Proton has recently released a CLI tool for Proton Pass that solves this elegantly. Instead of storing secrets in files, you fetch them at runtime from your encrypted Proton Pass vault. The CLI is currently in beta. Install it with: This installs to . Then authenticate: This opens a browser for Proton authentication. Once complete, you’re ready to use the CLI. List your vaults: View an item: Fetch a specific field: Get JSON output (useful for parsing multiple fields): I have several tools that need API credentials. Rather than storing these in config files, I created wrapper scripts that fetch credentials from Proton Pass at runtime. Here’s a wrapper for a TUI application that needs API credentials: The key insight: fetching JSON once and parsing with is faster than making separate API calls for each field. The Proton Pass API call takes a few seconds. For frequently-used tools, this adds noticeable latency. The solution is to cache credentials in the Linux kernel keyring: With caching: The cache expires after one hour, or when you log out. Clear it manually with: The CLI also has built-in commands for secret injection. The command passes secrets as environment variables: The command processes template files: These use a URI syntax: to reference secrets. For applications that read credentials from config files (like WeeChat’s ), the wrapper can update the file before launching: The CLI can also act as an SSH agent, loading keys stored in Proton Pass: This is useful if you store SSH private keys in your vault. This approach keeps secrets out of your dotfiles repository entirely. The wrapper scripts reference Proton Pass item names, not actual credentials. Your secrets remain encrypted in Proton’s infrastructure and are only decrypted locally when needed. The kernel keyring cache is per-user and lives only in memory. It’s cleared on logout or reboot, and the TTL ensures credentials don’t persist indefinitely. For public dotfiles repositories, this is a clean solution: commit your wrapper scripts freely, keep your secrets in Proton Pass. First run: ~5-6 seconds (fetches from Proton Pass) Subsequent runs: ~0.01 seconds (from kernel keyring)
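The keyring-caching idea described above can be sketched roughly as follows. The fetch_secret_from_proton_pass function is a placeholder for whatever Proton Pass CLI invocation returns the credential (the exact CLI syntax isn't reproduced here), and the key description and TTL are assumptions.

```bash
#!/usr/bin/env bash
# Cache a secret in the Linux kernel keyring with a one-hour expiry,
# falling back to the (slow) Proton Pass fetch on a cache miss.
KEY_DESC="myapp-api-token"   # placeholder name
TTL=3600                     # one hour, matching the post

get_cached_secret() {
    local key_id secret
    if key_id=$(keyctl search @u user "$KEY_DESC" 2>/dev/null); then
        keyctl pipe "$key_id"                       # cache hit: near-instant
    else
        secret=$(fetch_secret_from_proton_pass)     # placeholder slow path
        key_id=$(keyctl add user "$KEY_DESC" "$secret" @u)
        keyctl timeout "$key_id" "$TTL"             # expire after an hour
        printf '%s\n' "$secret"
    fi
}
```

The cache lives in the per-user keyring (@u), so it disappears on logout or reboot, which matches the behaviour described in the post.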

0 views
Jason Scheirer 1 weeks ago

Learn To Live With The Defaults

Every deviation from default is slowing you down from getting started and making it harder to help others. I use about 5 distinct laptops/desktops on an average day, not to mention the VMs within them and other machines I shell into. Having a consistent experience is useful, but equally important is my ability to roll with the punches and get productive on a new computer without much ceremony. One thing I do to cope with this is a dotfiles repo with a dead-simple installation method, but also note how conservative it is. No huge vim plugin setup. Very minimal tmux config (which is still bad, and I’ll explain why later). Not a lot going on. Moving from the defaults to a custom setup might make you more effective in the immediate term, but it makes it harder long-term. You have additional complexity in terms of packages installed, keymaps, etc. that you need to reproduce regularly on every system you use. As I complained about in Framework Syndrome, flexible software just moves the problem along, it does not solve the problem. Having a tool that’s flexible enough to get out of the way so that you can solve the problem yourself is double-edged: it does not provide the solution you want, it provides an environment to implement your solution. This seems to mean that everyone new to the software will not find it as useful as you do, right? To them it’s a blank slate, and is only useful with significant customization. This also affects teachability! With your hyper-customized setup you can’t be as effective a mentor or guide. One thing that makes it harder for me to advocate tmux to new devs is that I use one thing slightly idiosyncratically: coming from the older tool screen means I remap Ctl-B to Ctl-A for consistency. This has bitten me many a time! One example: Once I had set up a shared VM at work and had long-running tasks in tmux that my teammates could check in on. The entire setup was stymied by the fact that nobody but me could use tmux due to that one customization I had set up. Learn to lean in and be as functional as possible with the default setup. A kitted-out vim is great but learn the basics as muscle memory. Prefer tools with good defaults over good enough tools with the flexibility to make them as good as the ones with good defaults.
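For the record, the remap in question amounts to just a few lines of tmux config, which is exactly why it is so easy to forget that nobody else has it:

```sh
# ~/.tmux.conf — the non-default prefix described above
unbind C-b
set -g prefix C-a
bind C-a send-prefix
```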

0 views
Danny McClelland 1 weeks ago

Scheduled Deploys for Future Posts

One of the small joys of running a static blog is scheduling posts in advance. Write a few pieces when inspiration strikes, set future dates, and let them publish themselves while you’re busy with other things. There’s just one problem: static sites don’t work that way out of the box. With a dynamic CMS like WordPress, scheduling is built in. The server checks the current time, compares it to your post’s publish date, and serves it up when the moment arrives. Simple. Static site generators like Hugo work differently. When you build the site, Hugo looks at all your content, checks which posts have dates in the past, and generates HTML for those. Future-dated posts get skipped entirely. They don’t exist in the built output. This means if you write a post today with tomorrow’s date, it won’t appear until you rebuild the site tomorrow. And if you’re using Netlify’s automatic deploys from Git, that rebuild only happens when you push a commit. No commit, no deploy, no post. I could set a reminder to push an empty commit every morning. But that defeats the purpose of scheduling posts in the first place.

The fix is straightforward: trigger a Netlify build automatically every day, whether or not there’s new code to deploy. Netlify provides build hooks for exactly this purpose. A build hook is a unique URL that triggers a new deploy when you send a POST request to it. All you need is something to call that URL on a schedule. GitHub Actions handles the scheduling side. A simple workflow with a cron trigger runs every day at midnight UK time and pings the build hook. Netlify does the rest.

First, create a build hook in Netlify:
Go to your site’s dashboard
Navigate to Site settings → Build & deploy → Build hooks
Click Add build hook, give it a name, and select your production branch
Copy the generated URL

Next, add that URL as a secret in your GitHub repository:
Go to Settings → Secrets and variables → Actions
Create a new repository secret for the hook URL
Paste the build hook URL as the value

Finally, create a workflow file in .github/workflows (a sketch of the workflow appears below).

The dual cron schedule handles UK daylight saving time. During winter (GMT), the first schedule fires at midnight. During summer (BST), the second one does. There’s a brief overlap during the DST transitions where both might run, but an extra deploy is harmless. The workflow_dispatch trigger is optional but handy. It adds a “Run workflow” button in the GitHub Actions UI, letting you trigger a deploy manually without pushing a commit. Now every morning at 00:01, GitHub Actions wakes up, pokes the Netlify build hook, and a fresh deploy rolls out. Any posts with today’s date appear automatically. No manual intervention required. It’s a small piece of automation, but it removes just enough friction to make scheduling posts actually practical. Write when you want, publish when you planned.
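A sketch of the workflow described above; the secret name NETLIFY_BUILD_HOOK and the filename are assumptions, not necessarily what the author used.

```yaml
# .github/workflows/scheduled-deploy.yml (hypothetical filename)
name: Scheduled Netlify deploy

on:
  schedule:
    - cron: '1 0 * * *'    # 00:01 UTC — midnight UK during GMT
    - cron: '1 23 * * *'   # 23:01 UTC — midnight UK during BST
  workflow_dispatch:        # manual "Run workflow" button

jobs:
  trigger-deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Ping Netlify build hook
        run: curl -fsS -X POST "${{ secrets.NETLIFY_BUILD_HOOK }}"
```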

0 views
neilzone 1 weeks ago

From proxmox to Raspberry Pi 4s again

I’ve been experimenting with proxmox for a few months now. And it was going pretty well, until Something Happened (as yet undiagnosed), which means that I cannot access the proxmox interface or ssh in. The containers on it are still running though. While I’d like to fix it - if only to understand why it crashed - I decided to move the things I cared about back onto Raspberry Pis (because that’s the hardware that I had to hand) for the time being. Restoring the services was easy, thanks to restic backups. This blog is now back onto a Pi, so if you can see this blogpost, it is working. I have one more service to move, and then I can start exploring the proxmox issue. Annoyingly, I have most of the hardware to run up a second instance of proxmox, with a plan for some basic level of failover, but I had not got around to setting it up.

0 views
Nelson Figueroa 2 weeks ago

Setting up WireGuard on Synology DSM 7 using Docker and Gluetun

At the time of writing Synology DiskStation Manager (DSM) v7.2.2-72806 is running on Linux v4.4.4 which doesn’t support WireGuard. It doesn’t look like Synology is interested in adding WireGuard support the way OpenVPN is supported. So if you want certain services on your Synology NAS to connect through WireGuard, you’ll need a workaround. One workaround is to establish a WireGuard connection using Gluetun in Docker. Then have containerized services do their networking through this Gluetun container. The caveat is that whatever services you want to go through a WireGuard tunnel will need to be containerized. This guide is intended for those comfortable with the command line , SSH, and Docker. You’ll need Container Manager installed, which is basically just Synology’s wrapper around Docker. Install it via the Web UI, then you’ll be able to use commands via SSH. Installing Container Manager is straightforward. Log into the Synology DSM Web UI -> open Package Center -> search for “Container Manager” -> click “Install”. You’ll also need a WireGuard configuration file. For this guide I’ll be using a configuration file from Mullvad VPN . A bit of background as to why I’m using Gluetun. There’s a linuxserver/wireguard docker image we can use, but that image expects the underlying kernel to have WireGuard support. Since Synology DSM runs on 4.4.4 at this time that means it doesn’t support WireGuard, which means the linuxserver/wireguard image won’t work. I tried to get it working myself but kept running into errors. Unlike linuxserver/wireguard, Gluetun works on any kernel by using something called userspace WireGuard implementation. Basically it runs at the user level rather than at the kernel level. This is beyond my knowledge though, so I encourage you to do some of your own research if you want to learn more. First, let’s create a directory where the Gluetun container will store a configuration file once it’s running. SSH into your Synology device with an admin user: Once you’re in, get root access to make this process easier: If you can run and get as the output then you’re good to go. Now we can create the directory that Gluetun will need. In my case, I only have one volume and it’s called , so your path may be a little different: That should be it! Stay as going forward to keep things simple. Next we can create a file where we’ll tell Docker to run a Gluetun container. This file can also be easily extended with additional containers that should connect to Gluetun to have WireGuard access. More on that later though. First, make sure Docker is actually installed as it’s a prerequisite I mentioned at the beginning of this post: Then create a file. I chose to create it in because it seemed logical but you can place this just about anywhere you’d like. Now we can fill in . Here’s the starting point you’ll need for Gluetun: Note that if it’s easier you can create locally on your device and then drag it over to a directory of your choosing through the Web UI. We’ll need to fill in and in . These can be retrieved from a WireGuard configuration file. It depends on your provider but for Mullvad VPN you go to https://mullvad.net/en/account/wireguard-config and download the Linux version of the WireGuard configuration file. The file itself should look something like this regardless of your VPN provider: Copy the field and paste it as the value for in . Then copy and paste it as the value for . Note: at the time of writing Gluetun only supports IPv4 addresses. 
Note that if it’s easier you can create the compose file locally on your own device and then drag it over to a directory of your choosing through the Web UI.

We’ll need to fill in the WireGuard private key and addresses in the compose file. These can be retrieved from a WireGuard configuration file. It depends on your provider, but for Mullvad VPN you go to https://mullvad.net/en/account/wireguard-config and download the Linux version of the WireGuard configuration file. Regardless of your VPN provider, the file contains an [Interface] section with your private key and addresses. Copy the private key and paste it as the key value in the compose file, then copy the address and paste it as the address value.

Note: at the time of writing Gluetun only supports IPv4 addresses. So if your address value contains an IPv6 range, it will not work and you’ll get an error. The address value should contain only the IPv4 address, not the IPv6 range. With those two values filled in, the compose file is ready.

Now we can start up the Gluetun container and verify that it works. In the same directory as the compose file, spin up the Gluetun container with Docker Compose. You’ll see some log output as Gluetun starts up and connects. For Mullvad VPN specifically there’s a way to verify that a connection is going through their servers, and we can run a command against the Gluetun container to confirm. Regardless of VPN provider, you can check that the command returns a different IP address from the IP address your internet provider has assigned to you. Get your normal IP address first by running the check outside of Docker, then run the same command against Gluetun to verify that you get a different IP address. If the IP addresses are different, you should be good to go.

Now we can start creating containers that use the WireGuard connection through Gluetun. I’ll be using a qBittorrent container as an example, as that is a common use case with WireGuard. People love their Linux ISOs. Adding containers is easy, as we just need to append to the existing compose file. First, create some directories that qBittorrent will need for configuration and downloads, then update the compose file to add the qBittorrent service and bring the stack up again. You’ll see output similar to before.

qBittorrent has a web interface. Open up a web browser, go to your NAS’s address on the port you exposed, and see if the web UI shows up. If it does, qBittorrent is running successfully and all of its network traffic will run through Gluetun and WireGuard! We can do one final check with the qBittorrent container to make sure it has the same IP address as the Gluetun container. Both IP addresses are the same, which means qBittorrent is running through Gluetun and through a WireGuard connection. Everything works!

Sources: https://github.com/qdm12/gluetun , https://github.com/qdm12/gluetun-wiki/blob/main/setup/providers/mullvad.md , https://docs.linuxserver.io/images/docker-qbittorrent/ , and a lot of trial and error.
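To recap the qBittorrent part, the extended compose file ends up looking roughly like the sketch below. The image, PUID/PGID, timezone, paths, and the 8080 web UI port are placeholders for whatever you actually use; note that because qBittorrent runs inside Gluetun’s network namespace, the web UI port is published on the gluetun service rather than on qbittorrent.

```yaml
services:
  gluetun:
    # ...same as the starting point above, plus the qBittorrent web UI port,
    # which must be published here because qBittorrent shares this container's network:
    ports:
      - 8080:8080
  qbittorrent:
    image: lscr.io/linuxserver/qbittorrent
    container_name: qbittorrent
    network_mode: "service:gluetun"   # route all of qBittorrent's traffic through Gluetun
    environment:
      - PUID=1000                     # adjust to your NAS user and group
      - PGID=1000
      - TZ=Etc/UTC
      - WEBUI_PORT=8080
    volumes:
      - /volume1/docker/qbittorrent/config:/config        # directories created earlier
      - /volume1/docker/qbittorrent/downloads:/downloads
    depends_on:
      - gluetun
    restart: unless-stopped
```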

alikhil 2 weeks ago

Kubernetes In-Place Pod Resize

About six years ago, while operating a large Java-based platform in Kubernetes, I noticed a recurring problem: our services required significantly higher CPU and memory during application startup. Heavy use of Spring Beans and AutoConfiguration forced us to set inflated resource requests and limits just to survive bootstrap, even though those resources were mostly unused afterwards. This workaround never felt right. As an engineer, I wanted a solution that reflected the actual lifecycle of an application rather than its worst moment.

I opened an issue in the Kubernetes repository describing the problem and proposing an approach to adjust pod resources dynamically without restarts. The issue received little discussion but quietly accumulated interest over time (13 👍 reactions). Every few months, an automation bot attempted to mark it as stale, and every time, I removed the label. This went on for nearly six years… until the release of Kubernetes 1.35, where the In-Place Pod Resize feature was marked as stable.

In-Place Pod Resize allows Kubernetes to update CPU and memory requests and limits without restarting pods, whenever it is safe to do so. This significantly reduces unnecessary restarts caused by resource changes, leading to fewer disruptions and more reliable workloads. For applications whose resource needs evolve over time, especially after startup, this feature provides a long-missing building block.

The new field is configured at the pod spec level. While it is technically possible to change pod resources manually, doing so does not scale. In practice, this feature should be driven by a workload controller. At the moment, the only controller that supports in-place pod resize is the Vertical Pod Autoscaler (VPA). There are two enhancement proposals that enable this behavior:

AEP-4016: Support for in-place updates in VPA, which introduces the InPlaceOrRecreate update mode.
AEP-7862: CPU Startup Boost, which is about temporarily boosting a pod by giving it more CPU during startup. This is conceptually similar to the approach proposed in my original issue.

Here is an example of a Deployment and VPA using both AEP features:
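The sketch below shows the rough shape; all names and numbers are placeholders. The container-level resizePolicy and the InPlaceOrRecreate update mode come from the stable in-place resize API and AEP-4016 respectively, while the startupBoost stanza only follows the shape proposed in AEP-7862, so its exact field names may differ from whatever finally ships:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spring-service                # placeholder name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: spring-service
  template:
    metadata:
      labels:
        app: spring-service
    spec:
      containers:
        - name: app
          image: registry.example.com/spring-service:latest   # placeholder image
          resizePolicy:                     # pod-spec-level knob for in-place resize
            - resourceName: cpu
              restartPolicy: NotRequired    # resize CPU without restarting the container
          resources:
            requests:
              cpu: 500m
              memory: 1Gi
            limits:
              cpu: "1"
              memory: 1Gi
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: spring-service
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: spring-service
  updatePolicy:
    updateMode: "InPlaceOrRecreate"   # AEP-4016: apply recommendations in place when possible
  resourcePolicy:
    containerPolicies:
      - containerName: app
        # AEP-7862 (CPU Startup Boost): hypothetical field names sketched from the proposal;
        # the intent is to double CPU while the pod is starting up.
        startupBoost:
          cpu:
            factor: 2
```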
With such a configuration, the pod will have doubled CPU requests and limits during startup. During the boost period no resizing will happen. Once the pod reaches the Ready state, the VPA controller scales CPU down to the currently recommended value. After that, VPA continues operating normally, with the key difference that resource updates are applied in place whenever possible.

Does this feature fully solve the problem described above? Only partially. First, most application runtimes still impose fundamental constraints: Java and Python runtimes do not currently support resizing memory limits without a restart. This limitation exists outside of Kubernetes itself and is tracked in the OpenJDK project via an open ticket. Second, Kubernetes does not yet support decreasing memory limits, even with In-Place Pod Resize enabled. This is a known limitation documented in the enhancement proposal for memory limit decreases. As a result, while In-Place Pod Resize effectively addresses CPU-related startup spikes, memory resizing remains an open problem.

In-Place Pod Resize gives a foundation for cool new features like CPU Startup Boost and makes the use of VPA more reliable. While important gaps remain, such as memory decrease support and a scheduling race condition, this change represents a meaningful step forward. For workloads with distinct startup and steady-state phases, Kubernetes is finally beginning to model reality more closely.

Justin Duke 2 weeks ago

mise hooks

I've written before about having a standardized set of commands across all my projects. One of the things that helps with that is mise hooks.

I've actually been migrating from justfiles to mise tasks over the past few months. mise already manages my tool versions, so consolidating task running into the same tool means one less thing to install and one less config file per project. The syntax is nearly identical (you're just writing shell commands with some metadata), and mise tasks have a few nice extras like file watching and automatic parallelization.

mise has a handful of lifecycle hooks that you can use to run arbitrary scripts at various points:

| Hook | When it runs |
| --- | --- |
| cd | Every time you change directories |
| enter | When you first enter a project directory |
| leave | When you leave a project directory |
| preinstall | Before a tool is installed |
| postinstall | After a tool is installed |

I am not a fan of injecting scripts into things that you want to be as fast as humanly possible, but here's one exception: I use the enter hook to print a little welcome message when I cd into a project. This is a small thing, but it's a nice reminder of what commands are available when I'm jumping between projects. I'm always context-switching and I never remember what the right incantation is for any given project. This way, I get a little cheat sheet every time I enter a directory.

Blog System/5 2 weeks ago

ssh-agent broken in tmux? I've got you!

A little over two years ago, I wrote an article titled SSH agent forwarding and tmux done right . In it, I described how SSH agent forwarding works (a feature that lets a remote machine use the credentials stored in your local ssh-agent instance) and how using a console multiplexer like tmux or screen often breaks it.

In that article, I presented ssh-agent-switcher: a program I put together in a few hours to fix this problem. In short, ssh-agent-switcher exposes an agent socket at a stable default location and proxies all incoming credential requests to the transient socket that the sshd server creates on a per-connection basis.

In this article, I want to formalize this project by presenting its first actual release, 1.0.0, and explain what has changed to warrant this release number. I put effort into creating this formal release because ssh-agent-switcher has organically gained more interest than I imagined, as it solves a real problem that various people have.

Some background

When I first wrote ssh-agent-switcher, I did so to fix a problem I was having at work: we were moving from local developer workstations to remote VMs, we required SSH to work on the remote VMs for GitHub access, and I kept hitting problems with the ssh-agent forwarding feature breaking because I’m an avid user of tmux. To explain the problem to my peers, I wrote the aforementioned article and prototyped ssh-agent-switcher after-hours to demonstrate a solution. At the end of the day, the team took a different route for our remote machines, but I kept using this little program on my personal machines.

Because of work constraints, I had originally written ssh-agent-switcher in Go and I had used Bazel as its build system. I also used my own shtk library to quickly write a bunch of integration tests and, because of the Bazel requirement, I even wrote my first ruleset, rules_shtk , to make it possible. The program worked, but due to the apparent lack of interest, I considered it “done”, and what you found on GitHub was a code dump of a little project I wrote in a couple of free evenings.

New OpenSSH naming scheme

Recently, however, ssh-agent-switcher stopped working on a Debian testing machine I run and I had to fix it. Luckily, someone had sent a bug report describing what the problem was: OpenSSH 10.1 had changed the location where sshd creates the forwarding sockets and even changed their naming scheme, so ssh-agent-switcher had to adapt. Fixing this issue was straightforward, but doing so made me “touch” the ssh-agent-switcher codebase again, and it got me interested in tweaking it further.

My energy to work on side-projects like this one and to write about them comes from your support. Subscribe now to motivate future content!

The Rust rewrite

As I wanted to modernize this program, one thing kept rubbing me the wrong way: I had originally forced myself to use Go because of potential work constraints. As these requirements never became relevant and I “needed to write some code” to quench some stress, I decided to rewrite the program in Rust. Why, you ask? Just because I wanted to. It’s my code and I wanted to have fun with it, so I did the rewrite.

Which took me into a detour. You see: while command line parsing in Rust CLI apps is a solved problem, I had been using the ancient getopts crate in other projects of mine out of inertia. Using either library requires replicating some boilerplate across apps, which I don’t like, so… I ended up cleaning up that “common code” as well and putting it into a new crate aptly-but-oddly-named getoptsargs .
Take a look and see if you find it interesting… I might write a separate article on it.

Doing this rewrite also made me question the decision to use Bazel (again imposed by constraints that never materialized) for this simple tool: as much as I like the concepts behind this build system and think it’s the right choice for large codebases, it was just too heavy for a trivial program like ssh-agent-switcher. So… I just dropped Bazel and wrote a Makefile, which you’d think isn’t necessary for a pure Rust project, but remember that this codebase includes shell tests too.

Daemonization support

With the Rust rewrite done, I was now on a path to making ssh-agent-switcher a “real project”, so the first thing I wanted to fix was the ugly setup instructions from the original code dump. Here is what the project README used to tell you to write into your shell startup scripts: Yikes. You needed shell-specific logic to detach the program from the controlling session so that it didn’t stop running when you logged out, as that would have made ssh-agent-switcher suffer from the exact same problems as regular sshd socket handling.

The solution to this was to make ssh-agent-switcher become a daemon on its own, with proper logging and “singleton” checking via PID file locking. So now you can reliably start it with a single command from any shell. I suppose you could make systemd start and manage ssh-agent-switcher automatically with a per-user socket trigger without needing the daemonization support in the binary per se… but I do care about more than just Linux, so assuming the presence of systemd is not an option.

Going async

With that done, I felt compelled to fix a zero-day TODO that kept causing trouble for people: a fixed-size buffer used to proxy requests between the SSH client and the forwarded agent. This limitation caused connections to stall if the response from the ssh-agent contained more keys than fit in the buffer. The workaround had been to make the fixed-size buffer “big enough”, but that was still insufficient for some outlier cases and came with the assumption that the messages sent over the socket would fit in the OS internal buffers in one go as well. No bueno.

Fixing this properly required one of the following: adding threads to handle reads and writes over two sockets in any order, dealing with the annoying select/poll family of system calls, or using an async runtime and library (tokio) to deal with the event-like nature of proxying data between two network connections. People dislike async Rust for some good reasons, but async is the way to get to the real fearless concurrency promise. I did not fancy managing threads by hand, and I did not want to deal with manual event handling… so async it was.

And you know what? Switching to async had two immediate benefits. First, handling termination signals with proper cleanup became straightforward: the previous code had to install a signal handler and deal with potential races in the face of blocking system calls by doing manual polling of incoming requests, which isn’t good if you like power efficiency, while tokio made this trivial and in a way that I more easily trust is correct. Second, I could easily implement the connection proxying using an event-driven loop, without having to reason about threads and their terminating conditions.
Funnily enough, after a couple of hours of hacking, I felt proud of the proxying algorithm and the comprehensive unit tests I had written, so I asked Gemini for feedback, and… while it told me my code was correct, it also said I could replace it all with a single call to an existing primitive! Fun times. I still don’t trust AI to write much code for me, but I do like it a lot for performing code reviews.

Even with tokio in the picture and all of the recent new features and fixes, the Rust binary of ssh-agent-switcher is still smaller (by 100KB or so) than the equivalent Go one, and I trust its implementation more.

Knowing that various people had found this project useful over the last two years, I decided to conclude this sprint by creating an actual “formal release” of ssh-agent-switcher. Formal releases require:

Documentation, which made me write a manual page .
A proper installation process, which made me write a traditional install script because the build tooling doesn’t support installing supporting documents.
A tag and release number, which many people forget about these days but which are critical if you want the code to be packaged in upstream OSes.

And with that, ssh-agent-switcher 1.0.0 went live on Christmas day of 2025. pkgsrc already has a package for it ; what is your OS waiting for? 😉

The Tymscar Blog 2 weeks ago

Automating What Backblaze Lifecycle Rules Don't Do Instantly

I recently moved from Synology to TrueNAS and set up cloud backups to Backblaze B2. I have two buckets: one for important files like documents, and one for homelab services. The services bucket backs up things like qcow2 disk images for my VMs, some of which are hundreds of gigabytes. When I created the buckets, I set the lifecycle rule to “Keep only the last version of the file.” I assumed this meant Backblaze would automatically replace old versions when new ones arrived. It doesn’t work that way.

Sean Goedecke 3 weeks ago

Nobody knows how large software products work

Large, rapidly-moving tech companies are constantly operating in the “fog of war” about their own systems. Simple questions like “can users of type Y access feature X?”, “what happens when you perform action Z in this situation?”, or even “how many different plans do we offer” often can only be answered by a handful of people in the organization. Sometimes there are zero people at the organization who can answer them, and somebody has to be tasked with digging in like a researcher to figure it out.

How can this be? Shouldn’t the engineers who built the software know what it does? Aren’t these answers documented internally? Better yet, aren’t these questions trivially answerable by looking at the public-facing documentation for end users? Tech companies are full of well-paid people who know what they’re doing 1 . Why aren’t those people able to get clear on what their own product does?

Large software products are prohibitively complicated. I wrote a lot more about this in Wicked Features , but the short version is you can capture a lot of value by adding complicated features. The classic examples are features that make the core product available to more users. For instance, the ability to self-host the software, or to trial it for free, or to use it as a large organization with centralized policy controls, or to use it localized in different languages, or to use it in countries with strict laws around how software can operate, or for highly-regulated customers like governments to use the software, and so on. These features are (hopefully) transparent to most users, but they cannot be transparent to the tech company itself.

Why are these features complicated? Because they affect every single other feature you build. If you add organizations and policy controls, you must build a policy control for every new feature you add. If you localize your product, you must include translations for every new feature. And so on. Eventually you’re in a position where you’re trying to figure out whether a self-hosted enterprise customer in the EU is entitled to access a particular feature, and nobody knows - you have to go and read through the code or do some experimenting to figure it out.

Couldn’t you just not build these features in the first place? Sure, but it leaves a lot of money on the table 2 . In fact, maybe the biggest difference between a small tech company and a big one is that the big tech company is set up to capture a lot more value by pursuing all of these fiddly, awkward features.

Why can’t you just document the interactions once when you’re building each new feature? I think this could work in theory, with a lot of effort and top-down support, but in practice it’s just really hard. The core problem is that the system is rapidly changing as you try to document it. Even a single person can document a complex static system, given enough time, because they can just slowly work their way through it. But once the system starts changing, the people trying to document it now need to work faster than the rate of change in the system. It may be literally impossible to document it without implausible amounts of manpower. Worse, many behaviors of the system don’t necessarily have a lot of conscious intent behind them (or any). They just emerge from the way the system is set up, as interactions of a series of “default” choices. So the people working on the documentation are not just writing down choices made by engineers, they’re discovering how the system works for the first time.
The only reliable way to answer many of these questions is to look at the codebase. I think that’s actually the structural cause of why engineers have institutional power at large tech companies. Of course, engineers are the ones who write software, but it’s almost more important that they’re the ones who can answer questions about software. In fact, the ability to answer questions about software is one of the core functions of an engineering team.

The best understanding of a piece of software usually lives in the heads of the engineers who are working with it every day. If a codebase is owned by a healthy engineering team, you often don’t need anybody to go and investigate - you can simply ask the team as a whole, and at least one engineer will know the answer off the top of their head, because they’re already familiar with that part of the code. When tech companies reorg teams, they often destroy this tacit knowledge.

If there’s no team with experience in a piece of software, questions have to be answered by investigation: some engineer has to go and find out. Typically this happens by some combination of interacting with the product (maybe in a dev environment where it’s easy to set up particular scenarios), reading through the codebase, or even performing “exploratory surgery” to see what happens when you change bits of code or force certain checks to always return true. This is a separate technical skill from writing code (though of course the two skills are related). In my experience, most engineers can write software, but few can reliably answer questions about it.

I don’t know why this should be so. Don’t you need to answer questions about software in order to write new software? Nevertheless, it’s true. My best theory is that it’s a confidence thing. Many engineers would rather be on the hook for their code (which at least works on their machine) than their answers (which could be completely wrong). I wrote about this in How I provide technical clarity to non-technical leaders . The core difficulty is that you’re always going out on a limb. You have to be comfortable with the possibility that you’re dead wrong, which is a different mindset to writing code (where you can often prove that your work is correct). You’re also able to be as verbose as you like when writing code - certainly when writing tests - but when you’re answering questions you have to boil things down to a summary. Many software engineers hate leaving out details.

Non-technical people - at least, ones without a lot of experience working with software products - often believe that software systems are well-understood by the engineers who build them. The idea here is that the system should be understandable because it’s built line-by-line from (largely) deterministic components. However, while this may be true of small pieces of software, this is almost never true of large software systems. Large software systems are very poorly understood, even by the people most in a position to understand them. Even really basic questions about what the software does often require research to answer. And once you do have a solid answer, it may not be solid for long - each change to a codebase can introduce nuances and exceptions, so you’ve often got to go research the same question multiple times. Because of all this, the ability to accurately answer questions about large software systems is extremely valuable.
1. I’m not being sarcastic here, I think this is literally true and if you disagree you’re being misled by your own cynicism. ↩
2. I first read this point from Dan Luu . ↩
