Latest Posts (20 found)

A Brief History of App Icons From Apple’s Creator Studio

I recently updated my collection of macOS icons to include Apple’s new “Creator Studio” family of icons. Doing this — in tandem with seeing funny things like this post on Mastodon — got me thinking about the history of these icons. I built a feature on my icon gallery sites that’s useful for comparing icons over time. For example, here’s Keynote:

(Unfortunately, the newest Keynote isn’t part of that collection because I have them linked in my data by their App Store ID, and it’s not the same ID anymore for the Creator Studio app — I’m going to have to look at addressing that somehow so they all show up together in my collection.)

That’s one useful way of looking at these icons. But I wanted to see them side-by-side, so I dug them all up. Now, my collection of macOS icons isn’t complete. It doesn’t show every variant since the beginning of time, but it’s still interesting to see what’s changed within my own collection. So, without further ado, I present the variants in my collection. The years labeled in the screenshots represent the year in which I added the icons to my collection (not necessarily the year that Apple changed them). For convenience, I’ve included a link to the screenshot of icons as they exist in my collection (how I made that page, if you’re interested).

Final Cut Pro:

Compressor:

Pixelmator Pro:

(Granted, Pixelmator wasn’t one of Apple’s own apps until recently, but its changes follow the same pattern, showing how Apple sets the tone for itself as well as the ecosystem.)

One last non-visual thing I noticed while looking through these icons in my archive. Apple used to call their own apps in the App Store by their name, e.g. “Keynote”. But now Apple seems to have latched on to what the ecosystem does by attaching a description to the name of the app, e.g. “Keynote: Design Presentations”.

Keynote -> Keynote: Design Presentations
Pages -> Pages: Create Documents
Numbers -> Numbers: Make Spreadsheets
Final Cut Pro -> Final Cut Pro: Create Video
Compressor -> Compressor: Encode Media
Logic Pro -> Logic Pro: Make Music
MainStage -> MainStage: Perform Live
Pixelmator Pro -> Pixelmator Pro: Edit Images

Reply via: Email · Mastodon · Bluesky

0 views

Big Design, Bold Ideas

I’ve only gone and done it again! I redesigned my website. This is the eleventh major version. I dare say it’s my best attempt yet. There are similarities to what came before and plenty of fresh CSS paint to modernise the style. You can visit my time machine to see the ten previous designs that have graced my homepage. Almost two decades of work. What a journey!

I’ve been comfortable and coasting for years. This year feels different. I’ve made a career building for the open web. That is now under attack. Both my career, and the web. A rising sea of slop is drowning out all common sense. I’m seeing peers struggle to find work, others succumb to the chatbot psychosis. There is no good reason for such drastic change. Yet change is being forced by the AI industrial complex on its relentless path of destruction. I’m not shy about my stance on AI. No thanks! My new homepage doubles down. I won’t be forced to use AI but I can’t ignore it. Can’t ignore the harm. Also I just felt like a new look was due.

Last time I mocked up a concept in Adobe XD. Adobe is now unfashionable and Figma, although swank, has that Silicon Valley stench. Penpot is where the cool kids paint pretty pictures of websites. I’m somewhat of an artist myself so I gave Penpot a go. My current brand began in 2016 and evolved in 2018. I loved the old design but the rigid layout didn’t afford much room to play with content. I spent a day pushing pixels and was quite chuffed with the results. I designed my bandit game in Penpot too (below). That gave me the confidence to move into real code.

I’m continuing with Atkinson Hyperlegible Next for body copy. I now license Ahkio for headings. I used Komika Title before but the all-caps was unwieldy. I’m too lazy to dig through backups to find my logotype source. If you know what font “David” is, please tell me!

I worked with Axia Create on brand strategy. On that front, we’ll have more exciting news to share later in the year! For now what I realised is that my audience here is technical. The days of small business owners seeking me are long gone. That market is served by Squarespace or Wix. It’s senior tech leads who are entrusted to find and recruit me, and peers within the industry who recommend me. This understanding gave me focus.

To illustrate why AI is lame I made an interactive mini-game! The slot machine metaphor should be self-explanatory. I figured a bit of comedy would drive home my AI policy. In the current economy if you don’t have a sparkle emoji is it even a website? The game is built with HTML canvas, web components, and synchronised events I over-complicated to ensure a unique set of prizes. The secret to high performance motion blur is to cheat with pre-rendered PNGs. In hindsight I could have cheated more with a video.

I commissioned Declan Chidlow to create a bespoke icon set. Declan delivered! The icons look so much better than the random assortment of placeholders I found. I’m glad I got a proper job done. I have neither the time nor skill for icons. Declan read my mind because I received an 88×31 web badge bonus gift. I had mocked up a few badges myself in Penpot. Scroll down to see them in the footer. Declan’s badge is first and my attempts follow. I haven’t quite nailed the pixel look yet.

My new menu is built with invoker commands and view transitions for a JavaScript-free experience. Modern web standards are so cool when they work together! I do have a tiny JS event listener to polyfill old browsers.
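As a rough illustration of what that kind of fallback could look like (a sketch of my own, not the site’s actual code, assuming the menu is a popover element targeted by buttons carrying the new command/commandfor attributes):

```typescript
// Minimal fallback for browsers that don't support invoker commands yet.
// Assumption: the menu is a popover and the buttons carry command/commandfor.
if (!("command" in HTMLButtonElement.prototype)) {
  document.addEventListener("click", (event) => {
    const el = event.target instanceof Element ? event.target : null;
    const button = el?.closest("button[commandfor]");
    if (!(button instanceof HTMLButtonElement)) return;
    const target = document.getElementById(button.getAttribute("commandfor") ?? "");
    if (!target) return;
    // A menu only needs the popover commands, so that's all this handles.
    switch (button.getAttribute("command")) {
      case "show-popover":
        target.showPopover();
        break;
      case "hide-popover":
        target.hidePopover();
        break;
      case "toggle-popover":
        target.togglePopover();
        break;
    }
  });
}
```

In browsers that already understand invoker commands the branch never runs, so the declarative HTML keeps doing all the work.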
The pixellated footer gradient is done with a WebGL shader. I had big plans but after several hours and too many Stack Overflow tabs, I moved on to more important things. This may turn into something later, but I doubt I’ll make much progress trying to learn WebGL. Past features like my Wasm static search and speech synthesis remain on the relevant blog pages. I suspect I’ll be finding random one-off features I forgot to restyle.

My homepage ends with another strong message. The internet is dominated by US-based big tech. Before backing powers across the Atlantic, consider UK and EU alternatives. The web begins at home. I remain open to working with clients and collaborators worldwide. I use some ‘big tech’ but I’m making an effort to push for European alternatives. US-based tech does not automatically mean “bad” but the absolute worst is certainly thriving there! Yeah I’m English, far from the smartest kind of European, but I try my best.

I’ve been fortunate to find work despite the AI threat. I’m optimistic and I refuse to back down from calling out slop for what it is! I strongly believe others still care about a job well done. I very much doubt the touted “10x productivity” is resulting in 10x profits. The way I see it, I’m cheaper, better, and more ethical than subsidised slop. Let me know on the socials if you love or hate my new design :)

P.S. I published this Sunday because Heisenbugs only appear in production.

Thanks for reading! Follow me on Mastodon and Bluesky. Subscribe to my Blog and Notes or Combined feeds.

0 views

are you out of touch?

In Mina Le's latest video, she quotes Adam Aleksic about quitting or severely reducing social media and phone use: "For one, it's the equivalent of sticking your head in the sand and pretending like the algorithm doesn't exist. Whether you like it or not, our culture is still being shaped by these platforms, and they won't go away by themselves. All of our music and fashion aesthetics are either defined by or against the algorithm, which means that even the "countercultural" tastes of the No Phone People are necessarily influenced by it. Engaging with algorithmic media - in a limited, deliberate manner - is thus important to understanding your experience in society as a whole. Not engaging, meanwhile, makes you vulnerable to being blindsided by sudden social or political shifts. Each Reddit argument and YouTube comment war is an epistemic basis for understanding the current state of cultural discourse. If you ignore those, you lose touch with reality as most people experience it."

I can see why he'd think that, and maybe to a small extent I can understand. We feel out of control about our screen behavior at times, and we expect drastic changes from drastic measures, when a bit more nuance could be more helpful. But in my view, the importance of social media in staying culturally in touch is completely overstated.

People still go outside! People go to work, to university, to school, to their clubs and other responsibilities or hobby spaces. They talk to their friends, family, superiors and acquaintances and they see what people vote for locally. They see the banners, flags, posters and stickers in their area. They witness what the strangers on the sidewalk, in cafes, restaurants, public transport and other spaces talk about. The quote, on the other hand, acts as if people's only connection to others or the outside world in general is through their phone, which is nuts. No one is blindsided by a cultural shift for not having social media unless they also do not interact with anyone outside of their home. Not everyone in your real life is part of "your bubble". Plenty of us have family members, peers or coworkers with wildly different views that we still interact with.

Yes, these are mass platforms where tons of content gets created, and music snippets, memes and viral moments have shaped our time and memories of specific years, don't get me wrong - but this ignores that a lot of the accounts are simply lurkers who do not contribute at all. Many have a very weak output that has no impact at all (or no lasting one), or they create on a private, locked down profile for people they approved. For every area, country, and even globally, there are a few hundred creators who truly shape culture, but they do so in a way that either transcends the online, or stays only making a local impact no one else outside is missing out on.

The view also doesn't take into account how sturdy algorithmic bubbles now seem to be. What some see as a huge trend online is actually something small in the grand scheme of things, and it's something their friend hasn't even seen, despite otherwise living in the same area and having similar tastes. You can be on social media and still "miss out" on whatever Adam means; you can also be off of social media and your friends will send you (or screen record for you) funny posts and short-form videos from Tumblr, Tiktok, X and more anyway.
News outlets and publications like 404media pick up internet drama and memes as well, and commentary/video essay YouTubers like Hannah Alonzo, Kiki Chanel, Brooke Sharks, Becauseimmissy and more show and break down viral videos and creators and give more insight into what's going on socially and culturally in 40-90 minute long videos. This is far more valuable to me (and the attention span, I guess!) than just seeing the original video on a feed. It contextualizes a lot of videos under a shared topic, identifies a pattern, and tends to be published a few weeks later, only giving time to things that truly lasted a while or were blowing up. It's an amazing filter, and you do not need to have any accounts or spend hours of time on a feed that makes you sad and harvests your data if you don't want to. You don't even need a phone to consume all that - you can do it on a cheap laptop, if you want to.

I disagree with the notion that it is culturally important to be very aware of what goes on in comment sections. They are notoriously filled with inflammatory trash because it is easier to fire off a comment than to write an email or write a long-form blog post about it. People comment on things without opening the link or fully reading the post, and just read the title, rushing to be the first ones to comment and get more engagement. Comment sections also suffer from the usual review bias, where people usually only feel the need to comment if they feel strongly about something (usually negatively). That means the impression you'll get from these will be very skewed towards the loud, often abrasive minority and their upvoters. As things that make you feel strongly get more engagement, feeds get distorted and comments asking for the most extreme consequences or showing the most extreme view get catapulted to the top visually. Because the websites and many of the commenters focus on US culture and issues, the discussion also skews towards an American lens on things. If you really want to be in touch with culture (especially if you do not live in the US), you cannot base your cultural understanding on these!

In a way, this quote reads to me like an addict justifying why they should stay; like a smoker who says they need the breaks to rest and socialize, or the alcoholic who says they need the bar to socialize and the drinks to loosen up, as "social lubricant". Lots of culture and tradition in my country involves alcohol, yet I don't drink, and the disadvantages of that have yet to show.

It's important to note that social media is Adam Aleksic's job. He gets his success from his short-form content on TikTok. It will never be in the interest of people in that industry for others to log off or stop consuming. His job necessitates that he posts frequently, stays up to date, consumes the feed and jumps on any trend he can, even if it's just the latest slang word explained through an etymologist's lens. Content creators also have to, at times, overstate their importance and impact to justify it all - the sums of money, the dark patterns, money off of unethical platforms, or spending so much time in front of a screen, some even essentially living a lie for content. It's all supposed to be worth something, to be for the common good, be done for the people, and immortalize... something, I guess.

In my view, not everyone needs to experience everything firsthand or be directly knowledgeable about everything. It's better that way, even.
You can always rely on articles, long-form video essays accessible without accounts, and podcasts from different sources, or simple conversations with others to keep you updated on stuff that's not on your radar. If it's important enough it will make its way to you, filtered and curated in a way that makes sense to you and focuses on what is truly important to you. If you want to know more, you are free to research and dive deeper. But it will always be impossible for you to be aware of everything. I do not need to know about the latest looksmaxxing trend that will vanish in a month, but I do care about how influencers consistently normalize overconsumption and how it is done. Others seeing it for me and sparking a conversation about it is how I was still able to write this without having an account on any of the big platforms.

I know it can be scary to suddenly feel like you do not understand internet culture or memes anymore, but being less in touch with youth culture is a normal part of getting older, and the speed at which we go through trends and viral content has increased massively. Most things you do not understand right now that make you question whether it was the right choice to leave some socials behind are things you will never hear about again. You'll see what stands the test of time and what doesn't.

The full piece is here, if you are interested in the quote's context.

Reply via email Published 09 Feb, 2026

0 views

SteamOS on a ThinkPad P14s gen 4 (AMD) is quite nice

In April 2024, I wrote about the Lenovo ThinkPad P14s gen 4 and how it does not suck under Linux. That is still true. It’s been fantastic, and a very reliable laptop during all that time. The P14s gen 4 comes with a CPU that is still solid today, the AMD Ryzen 7 PRO 7840U, and that comes with impressive integrated graphics in the form of an AMD Radeon 780M.

I’ve had a Steam Deck. I’ve also accidentally built a Steam Machine. I had to put SteamOS on this laptop to see how well it does. I did a quick Bazzite test the last time around, but after being impressed with how well the stock SteamOS image runs on a random machine with an AMD GPU, I had to test that, too.

The normal way to install SteamOS on a machine is to take the Steam Deck recovery image and to install it on your own machine that has one NVMe SSD. I didn’t want to do exactly that, I wanted to run it off of a USB SATA SSD, which the recovery image does not support, as it hard-codes the target drive for the SteamOS installation. There’s a handy project out there that customizes the recovery script to allow you to install SteamOS to any target device, but I learned about that after the fact. I went a slightly different route: I imaged the SteamOS installation from my DIY Steam Machine build, wrote it to the 4TB USB SSD that I had available for testing, and after that I resized the partition to take up the full disk. Bam, clean SteamOS on a USB SSD!

Oh, and before I did that, I did the same process but to a 128 GB Samsung FIT USB 3.0 thumb drive. The game library images did load a bit slowly, but it was a great demonstration of how low you can go with the hardware requirements. I wouldn’t recommend actually installing games on such a setup as that would likely kill the USB thumb drive very quickly.

I ran the SteamOS setup on this laptop over a USB-C dock that only supports running at up to 4K at 30Hz, so I did my testing with a 1080p 60Hz setup. You’re unlikely to want to run this setup at 4K anyway, unless you’re a fan of light, easy-to-run games like Katamari or Donut County.

In most games, the experience was enjoyable. 1080p resolution, maybe change the settings to medium or low in some cases, and you’ll likely have a solid gaming experience. Forza Horizon 4? No problem, 1080p high settings and a solid, consistent experience. Need for Speed Hot Pursuit Remastered was an equally enjoyable experience, and I did not have to turn the settings down from high/ultra. God of War Ragnarök was pushing the setup to the limits. With 1080p, low/medium settings you can expect 30+ FPS. If you include AMD FSR settings in the mix and also enable FSR frame generation, you can have a perfectly enjoyable 50-60 FPS experience. Some UI hints were a bit “laggy” with frame generation, but I’m genuinely surprised how well that rendering trick worked. I’ll admit it, my eyesight is not the best, but given the choice of a crisp but laggy picture, and a slightly blurrier but smoother experience, I’d pick the latter. After a pint of Winter Stout, you won’t even notice the difference. 1 Wreckfest was also heaps of fun. It did push the limits of the GPU at times, but running it at 1080p and medium/high settings is perfectly enjoyable.

The observed power usage throughout the heaviest games, measured via the SteamOS performance metrics, was around 30-40 W, with the GPU using up most of that budget. In most games, the CPU was less heavily loaded, and in the games that required good single thread performance, it could provide it. I like SteamOS.
It’s intentionally locked down in some aspects (but you can unlock it with one command), and the Flatpak-only approach to software installation will make some people mad, but I like this balance. It almost feels like a proper console-type experience, almost. Valve does not officially support running SteamOS on random devices, but they haven’t explicitly prevented it either. I love that.

Take any computer from AMD that has been manufactured in the last 5 years, slap SteamOS on it, and there is a very high chance that you’ll have a lovely gaming experience, with the level of detail and resolution varying depending on what hardware you pick. A top-of-the-line APU from AMD seems to do the job well enough for most casual gamers like myself, and if the AMD Strix Halo based systems were more affordable, I would definitely recommend getting one if you want a small but efficient SteamOS machine.

Last year, we saw the proliferation of gaming-oriented Linux distros. The Steam Machine is shipping this year. DankPods is covering gaming on Linux. 2026 has to be the year of the Linux (gaming) desktop.

1. that’s the tipsy part in techtipsy   ↩︎

0 views

A Language For Agents

Last year I first started thinking about what the future of programming languages might look like now that agentic engineering is a growing thing. Initially I felt that the enormous corpus of pre-existing code would cement existing languages in place but now I’m starting to think the opposite is true. Here I want to outline my thinking on why we are going to see more new programming languages and why there is quite a bit of space for interesting innovation. And just in case someone wants to start building one, here are some of my thoughts on what we should aim for!

Does an agent perform dramatically better on a language that it has in its weights? Obviously yes. But there are less obvious factors that affect how good an agent is at programming in a language: how good the tooling around it is and how much churn there is. Zig seems underrepresented in the weights (at least in the models I’ve used) and also changing quickly. That combination is not optimal, but it’s still passable: you can program even in the upcoming Zig version if you point the agent at the right documentation. But it’s not great. On the other hand, some languages are well represented in the weights but agents still don’t succeed as much because of tooling choices. Swift is a good example: in my experience the tooling around building a Mac or iOS application can be so painful that agents struggle to navigate it. Also not great. So, just because it exists doesn’t mean the agent succeeds and just because it’s new also doesn’t mean that the agent is going to struggle. I’m convinced that you can build yourself up to a new language if you don’t want to depart everywhere all at once.

The biggest reason new languages might work is that the cost of coding is going down dramatically. The result is that the breadth of an ecosystem matters less. I’m now routinely reaching for JavaScript in places where I would have used Python. Not because I love it or the ecosystem is better, but because the agent does much better with TypeScript. The way to think about this: if important functionality is missing in my language of choice, I just point the agent at a library from a different language and have it build a port. As a concrete example, I recently built an Ethernet driver in JavaScript to implement the host controller for our sandbox. Implementations exist in Rust, C, and Go, but I wanted something pluggable and customizable in JavaScript. It was easier to have the agent reimplement it than to make the build system and distribution work against a native binding.

New languages will work if their value proposition is strong enough and they evolve with knowledge of how LLMs train. People will adopt them despite being underrepresented in the weights. And if they are designed to work well with agents, then they might be designed around familiar syntax that is already known to work well.

So why would we want a new language at all? The reason this is interesting to think about is that many of today’s languages were designed with the assumption that punching keys is laborious, so we traded certain things for brevity. As an example, many languages — particularly modern ones — lean heavily on type inference so that you don’t have to write out types. The downside is that you now need an LSP or the resulting compiler error messages to figure out what the type of an expression is. Agents struggle with this too, and it’s also frustrating in pull request review where complex operations can make it very hard to figure out what the types actually are.
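To make that concrete, here is a small TypeScript illustration of my own (not an example from the post): the code is fine, but nothing on the signature tells a reviewer, or an agent reading the raw file without a language server, what callers actually receive.

```typescript
interface Order {
  id: string;
  customer: string;
  total: number;
  paid: boolean;
}

// Every type below is inferred. The intermediate shapes (Order[], then
// Map<string, number>, then an array of { customer, owed } objects) never
// appear on the signature, so whoever reads the diff has to replay the
// whole chain in their head to know what this returns.
export function topDebtors(orders: Order[]) {
  const totals = orders
    .filter((o) => !o.paid)
    .reduce(
      (acc, o) => acc.set(o.customer, (acc.get(o.customer) ?? 0) + o.total),
      new Map<string, number>(),
    );
  return [...totals.entries()]
    .map(([customer, owed]) => ({ customer, owed }))
    .sort((a, b) => b.owed - a.owed)
    .slice(0, 3);
}
```

Writing the return type out, e.g. Array<{ customer: string; owed: number }>, costs one annotation but means the contract survives in a plain diff, which is exactly the situation a reviewer or an agent is usually in.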
Fully dynamic languages are even worse in that regard. The cost of writing code is going down, but because we are also producing more of it, understanding what the code does is becoming more important. We might actually want more code to be written if it means there is less ambiguity when we perform a review.

I also want to point out that we are heading towards a world where some code is never seen by a human and is only consumed by machines. Even in that case, we still want to give an indication to a user, who is potentially a non-programmer, about what is going on. We want to be able to explain to a user what the code will do without going into the details of how. So the case for a new language comes down to: given the fundamental changes in who is programming and what the cost of code is, we should at least consider one.

It’s tricky to say what an agent wants because agents will lie to you and they are influenced by all the code they’ve seen. But one way to estimate how they are doing is to look at how many changes they have to perform on files and how many iterations they need for common tasks. There are some things I’ve found that I think will be true for a while.

The language server protocol lets an IDE infer information about what’s under the cursor or what should be autocompleted based on semantic knowledge of the codebase. It’s a great system, but it comes at one specific cost that is tricky for agents: the LSP has to be running. There are situations when an agent just won’t run the LSP — not because of technical limitations, but because it’s also lazy and will skip that step if it doesn’t have to. If you give it an example from documentation, there is no easy way to run the LSP because it’s a snippet that might not even be complete. If you point it at a GitHub repository and it pulls down individual files, it will just look at the code. It won’t set up an LSP for type information. A language that doesn’t split into two separate experiences (with-LSP and without-LSP) will be beneficial to agents because it gives them one unified way of working across many more situations.

It pains me as a Python developer to say this, but whitespace-based indentation is a problem. The underlying token efficiency of getting whitespace right is tricky, and a language with significant whitespace is harder for an LLM to work with. This is particularly noticeable if you try to make an LLM do surgical changes without an assisted tool. Quite often they will intentionally disregard whitespace, add markers to enable or disable code and then rely on a code formatter to clean up indentation later. On the other hand, braces that are not separated by whitespace can cause issues too. Depending on the tokenizer, runs of closing parentheses can end up split into tokens in surprising ways (a bit like the “strawberry” counting problem), and it’s easy for an LLM to get Lisp or Scheme wrong because it loses track of how many closing parentheses it has already emitted or is looking at. Fixable with future LLMs? Sure, but also something that was hard for humans to get right too without tooling.

Readers of this blog might know that I’m a huge believer in async locals and flow execution context — basically the ability to carry data through every invocation that might only be needed many layers down the call chain. Working at an observability company has really driven home the importance of this for me. The challenge is that anything that flows implicitly might not be configured. Take for instance the current time.
You might want to implicitly pass a timer to all functions. But what if a timer is not configured and all of a sudden a new dependency appears? Passing all of it explicitly is tedious for both humans and agents and bad shortcuts will be made. One thing I’ve experimented with is having effect markers on functions that are added through a code formatting step. A function can declare that it needs the current time or the database, but if it doesn’t mark this explicitly, it’s essentially a linting warning that auto-formatting fixes. The LLM can start using something like the current time in a function and any existing caller gets the warning; formatting propagates the annotation. This is nice because when the LLM builds a test, it can precisely mock out these side effects — it understands from the error messages what it has to supply. (A rough sketch of what this could look like follows at the end of this section.)

Agents struggle with exceptions; they are afraid of them. I’m not sure to what degree this is solvable with RL (Reinforcement Learning), but right now agents will try to catch everything they can, log it, and do a pretty poor recovery. Given how little information is actually available about error paths, that makes sense. Checked exceptions are one approach, but they propagate all the way up the call chain and don’t dramatically improve things. Even if they end up as hints where a linter tracks which errors can fly by, there are still many call sites that need adjusting. And like the auto-propagation proposed for context data, it might not be the right solution. Maybe the right approach is to go more in on typed results, but that’s still tricky for composability without a type and object system that supports it.

The general approach agents use today to read files into memory is line-based, which means they often pick chunks that span multi-line strings. One easy way to see this fall apart: have an agent work on a 2000-line file that also contains long embedded code strings — basically a code generator. The agent will sometimes edit within a multi-line string assuming it’s the real code when it’s actually just embedded code in a multi-line string. For multi-line strings, the only language I’m aware of with a good solution is Zig, but its prefix-based syntax is pretty foreign to most people. Reformatting also often causes constructs to move to different lines. In many languages, trailing commas in lists are either not supported (JSON) or not customary. If you want diff stability, you’d aim for a syntax that requires less reformatting and mostly avoids multi-line constructs.

What’s really nice about Go is that you mostly cannot import symbols from another package into scope without every use being prefixed with the package name. Eg: a call reads as something like fmt.Println(...) rather than a bare Println(...). There are escape hatches (import aliases and dot-imports), but they’re relatively rare and usually frowned upon. That dramatically helps an agent understand what it’s looking at. In general, making code findable through the most basic tools is great — it works with external files that aren’t indexed, and it means fewer false positives for large-scale automation driven by code generated on the fly (eg: scripted search-and-replace invocations).

Much of what I’ve said boils down to: agents really like local reasoning. They want it to work in parts because they often work with just a few loaded files in context and don’t have much spatial awareness of the codebase. They rely on external tooling like grep to find things, and anything that’s hard to grep or that hides information elsewhere is tricky.
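To make the effect-marker idea above a little more concrete, here is a rough sketch of my own in TypeScript. It is an approximation rather than the original snippet: the Effects bag and fixedClock helper are hypothetical stand-ins, and in the imagined language the marker would be added by the formatter instead of written by hand.

```typescript
// Hypothetical sketch: approximating auto-added effect markers with an
// explicit, greppable parameter.
interface Clock {
  now(): Date;
}

interface Effects {
  clock: Clock;
}

// The marker (here, the `fx` parameter) declares that this function needs the
// current time. In the imagined language the annotation would be inserted by
// formatting once the body starts using the clock, and callers that don't
// supply it would get a lint warning rather than a hidden implicit dependency.
function invoiceDueDate(fx: Effects, issuedOn: Date, termDays: number): Date {
  const due = new Date(issuedOn);
  due.setUTCDate(due.getUTCDate() + termDays);
  // The current time is only consulted to flag invoices that are already late.
  if (fx.clock.now().getTime() > due.getTime()) {
    console.warn("invoice is already overdue");
  }
  return due;
}

// In a test the effect is supplied precisely instead of mocking a global clock.
const fixedClock: Clock = { now: () => new Date("2026-03-01T00:00:00Z") };
const due = invoiceDueDate({ clock: fixedClock }, new Date("2026-01-15"), 30);
console.log(due.toISOString()); // 2026-02-14T00:00:00.000Z
```

The payoff is the one described above: the side effect is visible, greppable, and trivially mockable.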
What makes agents fail or succeed in many languages is just how good the build tools are. Many languages make it very hard to determine what actually needs to rebuild or be retested because there are too many cross-references. Go is really good here: it forbids circular dependencies between packages (import cycles), packages have a clear layout, and test results are cached.

Agents often struggle with macros. It was already pretty clear that humans struggle with macros too, but the argument for them was mostly that code generation was a good way to have less code to write. Since that is less of a concern now, we should aim for languages with less dependence on macros. There’s a separate question about generics and comptime. I think they fare somewhat better because they mostly generate the same structure with different placeholders and it’s much easier for an agent to understand that.

Related to greppability: agents often struggle to understand barrel files and they don’t like them. Not being able to quickly figure out where a class or function comes from leads to imports from the wrong place, or missing things entirely and wasting context by reading too many files. A one-to-one mapping from where something is declared to where it’s imported from is great. And it does not have to be overly strict either. Go kind of goes this way, but not too extreme. Any file within a directory can define a function, which isn’t optimal, but it’s quick enough to find and you don’t need to search too far. It works because packages are forced to be small enough to find everything with grep. The worst case is free re-exports all over the place that completely decouple the implementation from any trivially reconstructable location on disk. Or worse: aliasing. Agents often hate it when aliases are involved. In fact, you can get them to even complain about it in thinking blocks if you let them refactor something that uses lots of aliases. Ideally a language encourages good naming and discourages aliasing at import time as a result.

Nobody likes flaky tests, but agents even less so. Ironic given how particularly good agents are at creating flaky tests in the first place. That’s because agents currently love to mock and most languages do not support mocking well. So many tests end up accidentally not being concurrency safe or depend on development environment state that then diverges in CI or production. Most programming languages and frameworks make it much easier to write flaky tests than non-flaky ones. That’s because they encourage indeterminism everywhere.

In an ideal world the agent has one command that lints and compiles and tells it whether everything worked out fine. Maybe another command to run all tests that need running. In practice most environments don’t work like this. For instance in TypeScript you can often run the code even though it fails type checks. That can gaslight the agent. Likewise different bundler setups can cause one thing to succeed just for a slightly different setup in CI to fail later. The more uniform the tooling the better. Ideally it either runs or doesn’t and there is mechanical fixing for as many linting failures as possible so that the agent does not have to do it by hand.

So, will we actually see new languages? I think we will. We are writing more software now than we ever have — more websites, more open source projects, more of everything. Even if the ratio of new languages stays the same, the absolute number will go up.
But I also truly believe that many more people will be willing to rethink the foundations of software engineering and the languages we work with. That’s because while for some years it has felt like you need to build a lot of infrastructure for a language to take off, now you can target a rather narrow use case: make sure the agent is happy and extend from there to the human.

I just hope we see two things. First, some outsider art: people who haven’t built languages before trying their hand at it and showing us new things. Second, a much more deliberate effort to document what works and what doesn’t from first principles. We have actually learned a lot about what makes good languages and how to scale software engineering to large teams. Yet it is very hard to find that knowledge written down as a consumable overview of good and bad language design. Too much of it has been shaped by opinion on rather pointless things instead of hard facts. Now though, we are slowly getting to the point where facts matter more, because you can actually measure what works by seeing how well agents perform with it. No human wants to be subject to surveys, but agents don’t care. We can see how successful they are and where they are struggling.

0 views

Fast by Default

After 25 years building sites for global brands, I kept seeing the same pattern appear. A team ships new features, users quietly begin to struggle, and only later do the bug reports start trickling in. Someone finally checks the metrics, panic spreads, and feature development is put on hold so the team can patch problems already affecting thousands of people. The fixes help for a while, but a month later another slowdown appears and the cycle begins again. The team spends much of its time firefighting instead of building.

I call this repeating sequence of ship, complain, panic, patch the Performance Decay Cycle. Sadly, it’s the default state for many teams and it drains morale fast. There has to be a better way.

When I stepped into tech-lead roles, I started experimenting. What if performance was something we protected from the start rather than something we cleaned up afterward? What if the entire team shared responsibility instead of relying on a single performance-minded engineer to swoop in and fix things? And what if the system itself made performance visible early, long before issues hit production? Across several teams and many iterations, a different pattern began to emerge. I now call it Fast by Default.

Fast by Default is the practice of embedding performance into every stage of development so speed becomes the natural outcome, not a late rescue mission. It involves everyone in the team, not just engineers.

Most organizations treat performance as something to address when it hurts, or they schedule a bug-fix sprint every few months. Both approaches are expensive, unreliable, and almost always too late. By the time a slowdown is noticeable, the causes are already baked into the rendering strategy, the data-fetching sequence, and the component boundaries. These decisions define a ceiling on how fast your system can ever be. You can tune within that ceiling, but without a rebuild, you can’t break through it.

Meanwhile, the baseline slowly drifts. Slow builds and sluggish interactions become expected. What felt unacceptable in week 1 feels normal by month 6. And once a feature ships, the attention shifts. Performance work competes with new ideas and roadmap pressure. Most teams never return to clean things up.

Performance regressions rarely announce themselves through one dramatic failure. They accumulate quietly, through dozens of reasonable decisions. A feature adds a little more JavaScript, a new dependency brings a hidden transitive load, and a design tweak introduces layout movement. A single page load still feels fine, but interactions begin to feel heavier. More features are added, more code ships, and slowly the slow path becomes the normal path. It shows up most clearly at the dependency level: each import made sense in isolation and passed through code review. No single decision broke the experience; the combination did.

This is why prevention always beats the cure. If you want to avoid returning to a culture of whack-a-mole fixes, you need to change the incentives so fast outcomes happen naturally. The core idea is simple: make the fast path easier than the slow path. Once you do that, performance stops depending on vigilance or heroics. You create systems and workflows that quietly pull the team toward fast decisions without friction.

Here’s what this looks like day-to-day: If your starting point is a client-rendered SPA, you’re already fighting uphill.
Server-first rendering with selective hydration (often called the Islands Architecture) gives you a performance margin that doesn’t require constant micro-optimization to maintain. It also helps clarify how much of your SPA truly needs to be a SPA.

When dependency size appears directly in your IDE, bundle size and budget checks run automatically in CI, and hydration warnings surface in local development, developers see the cost of their changes immediately and fix issues while the context is still fresh.

Reaching for utility-first libraries, choosing smaller dependencies, and cultivating a culture where the first question is "do we need this?" rather than "why not?" keeps complexity from compounding.

When reviewers consistently ask how a change affects render time or memory pressure, the entire team levels up. The question becomes part of the craft rather than an afterthought, and eventually it appears in every pull request.

Teams that stay fast don’t succeed because they have more performance experts; they succeed because they distribute ownership. Designers think about layout stability, product managers scope work with speed in mind, and engineers treat performance budgets as part of correctness rather than a separate concern. Everyone understands that shipping fast code is as important as shipping correct code.

For this to work, regressions need to surface early. That requires continuous measurement, clear ownership, and tooling that highlights problems before users do. Once the system pulls in the right direction with minimal resistance, performance becomes self-sustaining.

A team with fast defaults ships fast software in month 1, and they’re still shipping fast software in month 12 and month 36 because small advantages accumulate in their favor. A team living in the Performance Decay Cycle may start with acceptable performance, but by month 12 they find themselves planning a dedicated performance sprint, and by month 36 they’re discussing a rewrite. The difference isn’t expertise or effort; it’s the approach they started from.

Speed is leverage because it builds trust, sharpens design, and accelerates development. Once you lose it, you lose more than seconds: you lose users, revenue, and confidence in your own system. Fast by Default is how teams break this cycle and build systems that stay fast as they grow.

For more on this model, see https://fastbydefault.com.

This article was first published on 4 December 2025 at https://calendar.perfplanet.com/2025/fast-by-default/

0 views

Step Aside, Phone!

I read this post on Manu's blog and it immediately resonated. I've been spending more time than I'd like to admit staring at my phone recently, and most of that consists of a stupid game, or YouTube shorts. If you also want to cut down on some of your phone usage, feel free to join in; I’ll be happy to include links to your posts. As a benchmark, my screen time this week averaged around 2.5 hours per day on my phone and 1.5 hours per day on my tablet. That's bloody embarrassing - 28 hours in one week sat staring at (mostly) pointless shite on a fucking screen. I think my phone usage is more harmful as it's stupid stuff, whereas my tablet is more reading posts in my RSS reader, and "proper" YouTube (whatever that is). I think reducing both and picking up my Kindle more - or just being bored - will be far more healthy though. So count me in, Manu. Thanks for reading this post via RSS. RSS is great, and you're great for using it. ❤️ You can reply to this post by email , or leave a comment .

1 views

How To Quiet A Ugreen 4800 Plus Without Sacrificing Drive Temps

I recently got a Ugreen 4800 Plus NAS, and it is basically perfect for what I wanted. Four bays, enough CPU, enough RAM, nice build quality, and it does not look like a sci-fi router from 2012. The first thing I did was wipe the OS it shipped with and install TrueNAS. That part was also great. The not so great part was the noise. I expected it to be louder than my old Synology, mostly because I moved from “HDDs in a plastic box” to “a more PC-like NAS with more airflow”. Still, it was louder than I thought it would be, and it had this annoying behavior where the fan would randomly ramp up. Which is exactly the kind of thing you notice at night.

0 views
ava's blog Yesterday

privacy professionals: working at a messaging/social media platform

Welcome to a little series I'm starting, where I ask people working in the privacy field 7 questions about their work! This includes Data Protection Officers, Managers and Consultants, and other members of Privacy & Compliance teams. I find career advice and more specific information about the field to be lacking online, so I want to change that and host it myself :)

First up is an employee from the privacy team at a social media/messaging platform! I messaged them via their support platform asking the questions and asking for consent to publish the answers, and received this response from one of the employees. Note: An earlier version of this post mentioned the company name; they have since asked me to anonymize it.

1. Can you describe your career path and what led you to become a Data Protection Officer (or similar role)?

I started as a lawyer and then transitioned into the corporate world, leveraging my law degree in a major corporation in their emerging privacy program. Another one of our teammates actually spent 25 years in teaching and took her CIPP/US and transitioned careers. In privacy specifically you will see many backgrounds and stories of people "falling into" this career. Our DPO has experience across multiple companies and years of experience to make it to where he is now as a leader in the company.

2. What drew you specifically to data protection law and privacy as a profession?

I loved the legal aspect of it and the ability to leverage my law degree. Fascinating intersection where humanity meets privacy.

3. What does a typical day in your role look like?

Our team works with customer facing requests, internal team meetings discussing ways we can continue to serve our customers and also lead with excellence in compliance and communication. Compliance, legal regulations, new laws etc are all things we spend time working on, studying, and implementing within our platform.

4. What aspects of your work do you find most rewarding or challenging?

Every day comes with a new opportunity. With the ever changing privacy landscape the team is always learning, growing, and adapting. It's a very dynamic atmosphere. Love the challenge!

5. Which skills, qualities, or experiences do you consider essential for someone in such a role?

Being a good listener as number one! Background in privacy law and certifications such as CIPP/US, AI, etc. A well rounded approach to both the legal aspects and the human impact, which can come through experience, reading and working in the industry.

6. How do you keep up with the rapidly changing landscape of data protection regulations?

Reading, conferences, webinars, IAPP, and associations. Once you immerse yourself in understanding privacy you will find it touches virtually every part of our human existence in the marketplace, health, education, housing, finance etc. It is truly a fascinating industry.

7. If you could give advice to someone aspiring to enter this role, what would it be?

It's a great career with growing impact across all industries. I would say consume content that makes you better. Books, podcasts, articles. Check out the IAPP website that has lots of resources. Stay up to date on different laws and regulations being passed. Finally, keep reaching out to industry leaders, think about how you want to show up either through certification, law school etc. It is always a bonus to get internships or equivalent.
In the end though, I would say, no matter what you do work on your character through the decisions that you make in your day to day life now. Integrity, honesty, work ethic, humility, and curiosity will take you far in whatever you do! Thank you to this employee for the reply! I'm still reaching out to other companies, but if you know some who would be interested or know of people working in the privacy field that would like to answer these, please shoot me a message! :) Reply via email Published 08 Feb, 2026

0 views
Kev Quirk Yesterday

I've Moved to Pure Blog!

In my last post I introduced Pure Blog and ended the post by saying: I'm going to take a little break from coding for a few days, then come back and start migrating this site over to Pure Blog. Dogfoodin' yo!

Yeah, I didn't take a break. Instead I've pretty much spent my entire weekend at the computer migrating this site from Jekyll to Pure Blog, and trying to make sure everything works ok. Along the way there were features that I wanted to add into Pure Blog to make my life easier, which I've now done. These include:

Hooks so I can automatically purge Bunny CDN cache when posts are published/updated.
Implementing data files so I can generate things like my Blogroll and Projects pages from YML lists.
Adding shortcodes so I can have a site wide email setting and things like my Reply by email button work at the bottom of every post.
Post layout partial so I can add custom content below my posts without moving away from Pure Blog's upstream code.

As well as all this, I've also changed the way Pure Blog is formatted so that it's easier for people to update their Pure Blog version. While I was there, I also added a simple little update page in settings so people can see if they're running the latest version or not.

Finally, I decided to give the site a new lick of paint. Which was by far the easiest part of this whole thing - just some custom CSS in the CMS and I ended up with this nice (albeit brutal) new design. The way I've architected Pure Blog should allow me to very easily change the design going forward, which is just fantastic for a perpetual fiddler, such as myself.

OK, that's enough for one weekend. I hope publishing this post doesn't bring any other issues to the surface, but we shall see. Now I really am going to take a break from coding. This has been so much fun, and I continue to learn a lot. For now though, my brain needs a rest.

Oh, if you're using Pure Blog, please do let me know - I'd love to hear your feedback. The reply button below should be working fine. 🙃

Thanks for reading this post via RSS. RSS is great, and you're great for using it. ❤️ You can reply to this post by email, or leave a comment.

0 views

SmartPoi Accelerometer Controller

Connects to your Poi
Gets a list of images available
Every time it stops spinning, sends a “Change Image” signal to the poi*

*only works for the newer SmartPoi firmware with Single Image selection.

Code is on GitHub: https://github.com/tomjuggler/SmartPoi_Accelerometer_Controller – includes all install instructions needed (ESP32 C3 only – PlatformIO firmware).

Extra: Battery, charger and switch, for one you can clip onto poi.

Parts used:
ESP32 with C3 chip: recommended: https://www.aliexpress.com/item/1005008593933324.html (just choose the correct one with antenna). I used the C3 SuperMini which also works (WiFi not the best though); my better ones are still in the post.
MPU-6050 Accelerometer: https://s.click.aliexpress.com/e/_c40exNFh

The post SmartPoi Accelerometer Controller appeared first on Circus Scientist.

0 views
Ankur Sethi Yesterday

The only correct recipe for making chai

All my friends have their own personal recipes for making chai. I love my friends, so it hurts me to say that they’re wrong. My friends are, unfortunately, wrong about chai. I’m still coming to terms with this upsetting fact, but I’ll live. What follows is the only correct recipe for making chai. The only correct choice of tea leaves is Tata Tea Gold. Keep it in an airtight jar. Shake it up a bit so there’s an even mix of smaller grains and whole tea leaves. The smaller grains make for a stronger chai and they tend to settle at the bottom, so take that into account when measuring. You need full-cream milk for this recipe. Amul Gold is a good choice. I buy the tetrapacks because they survive in the fridge for longer, but the plastic bags work as well. According to the pack, Amul Gold has 6% fat. If you can’t find Amul Gold, try to find an equivalent milk. For a basic chai, you only need tea leaves, water, sugar, and milk. But we don’t want to make a basic chai, do we? No. So we’re going to add some elaichi (green cardamom) and saunf (fennel). Try to find fresh spices, if you can. I don’t have recommendations for specific brands here because most of them are fine. I learned the hard way that you get two kinds of saunf in the supermarket: green and brown. Green saunf tastes sweet and fresh, almost like a dessert. The brown saunf has a stronger flavor but is also bitter. We want the green saunf. Sometimes you find old elaichi at the store that’s gone a bit brown. Don’t buy that. Your elaichi should be green in color, just like the saunf. This recipe makes three cups of chai. Why three? Because that’s how much chai I drink every day. You can adjust this recipe to make more or fewer cups, as long as you keep all the ratios the same. Dig out your mortar and pestle from the drawer it has been languishing in. Add six pods of elaichi—two for each cup. Add half a tablespoon of saunf. You can use a bit more of both these spices if you want a more flavorful chai. Grind the spices into a semi-powdery mix. You don’t have to turn it into a fine powder, just grind them enough so that the flavors come through. Put two cups of water in a saucepan and add the spice mix. Put it on a high flame until boiling. When the water is boiling, reduce the flame to medium. Add three dessert spoons full of tea leaves to the boiling water. A dessert spoon is slightly smaller than a tablespoon. If all you have is a tablespoon, try about 3/4 tablespoons of tea leaves for each cup. Then add the same amount of sugar. You can adjust the amount of sugar based on how sweet you want your chai, but if you don’t add enough sugar the flavors won’t come through. Allow the mixture to boil on the stove for about 3-4 minutes. Then add a cup of milk. At this stage you should add a tiny bit of extra milk to account for the water evaporating, otherwise you won’t have three full cups of chai. About 1/5 of a cup should be enough, but I’ve been known to add a bit more to make the chai richer. Stir the mixture a bit to ensure everything is properly mixed together, then allow it to sit on the stove until the milk boils over. This next step is crucial. It will make or break your chai. I swear it’s not superstition. When the milk boils over, turn the stove to simmer. Allow it to settle back down into the pan. Then turn it up to medium heat again until it boils over once more. Repeat one more time. The milk should boil over and settle down three times total. Your chai is ready! Use a strainer to strain it into cups and enjoy. 
Should you eat a Parle-G with your chai? Maybe a Rusk? I have strong opinions on this matter but I’m running out of time, so I’ll leave that decision up to you.

0 views
Phil Eaton Yesterday

Paths of MySQL, vector search edition

This is an external post of mine. Click here if you are not redirected.

0 views
Sean Goedecke Yesterday

Large tech companies don't need heroes

Large tech companies operate via systems. What that means is that the main outcomes - up to and including the overall success or failure of the company - are driven by a complex network of processes and incentives. These systems are outside the control of any particular person. Like the parts of a large codebase, they have accumulated and co-evolved over time, instead of being designed from scratch. Some of these processes and incentives are “legible”, like OKRs or promotion criteria. Others are “illegible”, like the backchannel conversations that usually precede a formal consensus on decisions 1 . But either way, it is these processes and incentives that determine what happens, not any individual heroics.

This state of affairs is not efficient at producing good software. In large tech companies, good software often seems like it is produced by accident, as a by-product of individual people responding to their incentives. However, that’s just the way it has to be. A shared belief in the mission can cause a small group of people to prioritize good software over their individual benefit, for a little while. But thousands of engineers can’t do that for decades. Past a certain point of scale 2 , companies must depend on the strength of their systems.

Individual engineers often react to this fact with horror. After all, they want to produce high-quality software. Why is everyone around them just cynically 3 focused on their own careers? On top of that, many software engineers got into the industry because they are internally compelled 4 to make systems more efficient. For these people, it is viscerally uncomfortable being employed in an inefficient company. They are thus prepared to do whatever it takes to patch up their system’s local inefficiencies.

Of course, making your team more effective does not always require heroics. Some amount of fixing inefficiencies - improving process, writing tests, cleaning up old code - is just part of the job, and will get engineers rewarded and promoted just like any other kind of engineering work. But there’s a line. Past a certain point, working on efficiency-related stuff instead of your actual projects will get you punished, not rewarded. To go over that line requires someone willing to sacrifice their own career progression in the name of good engineering. In other words, it requires a hero.

You can sacrifice your promotions and bonuses to make one tiny corner of the company hum along nicely for a while. However, like I said above, the overall trajectory of the company is almost never determined by one person. It doesn’t really matter how efficient you made some corner of the Google Wave team if the whole product was doomed. And even poorly-run software teams can often win, so long as they’re targeting some niche that the company is set up to support (think about the quality of most profitable enterprise software).

On top of that, heroism makes it difficult for real change to happen. If a company is set up to reward bad work and punish good work, having some hero step up to do good work anyway and be punished will only insulate the company from the consequences of its own systems. Far better to let the company be punished for its failings, so it can (slowly, slowly) adjust, or be replaced by companies that operate better.

Large tech companies don’t benefit long-term from heroes, but there’s still a role for heroes. That role is to be exploited. There is no shortage of predators who will happily recruit a hero for some short-term advantage.
Some product managers keep a mental list of engineers in other teams who are “easy targets”: who can be convinced to do extra work on projects that benefit the product manager (but not that engineer). During high-intensity periods, such as the lead-up to a major launch, there is sometimes a kind of cold war between different product organizations, as they try to extract behind-the-scenes help from the engineers in each other’s camps while jealously guarding their own engineering resources.

Likewise, some managers have no problem letting one of their engineers spend all their time on glue work. Much of that work would otherwise be the manager’s responsibility, so it makes the manager’s job easier. Of course, when it comes time for promotions, the engineer will be punished for not doing their real work.

This is why it’s important for engineers to pay attention to their actual rewards. Promotions, bonuses and raises are the hard currency of software companies. Giving those out shows what the company really values. Predators don’t control those things (if they did, they wouldn’t be predators). As a substitute, they attempt to appeal to a hero’s internal compulsion to be useful or to clean up inefficiencies.

- Large tech companies are structurally set up to encourage software engineers to engage in heroics
- This is largely accidental, and doesn’t really benefit those tech companies in the long term, since large tech companies are just too large to be meaningfully moved by individual heroics
- However, individual managers and product managers inside these tech companies have learned to exploit this surplus heroism for their individual ends
- As a software engineer, you should resist the urge to heroically patch some obvious inefficiency you see in the organization
- Unless that work is explicitly rewarded by the company, all your efforts will do is delay the point at which the company has to change its processes
- A background level of inefficiency is just part of the landscape of large tech companies
- It’s the price they pay to be so large (and in return reap the benefits of scale and legibility)
- The more you can learn to live with it, the more you’ll be able to use your energy tactically for your own benefit

1. I write about this point at length in Seeing like a software company.
2. Why do companies need to scale, if it means they become less efficient? The best piece on this is Dan Luu’s I could build that in a weekend!: in short, because the value of marginal features in a successful software product is surprisingly high, and you need a lot of developers to capture all the marginal features.
3. For a post on why this is not actually that cynical, see my Software engineers should be a little bit cynical.
4. I write about these internal compulsions in I’m addicted to being useful.

0 views
Ruslan Osipov 2 days ago

Starting daycare is rough

Picture this: it’s 2 am. My kiddo is mouth breathing, loudly, whining as she tries to fall asleep. Poor kid is running a fever. She’s drooling and scratching her face because she’s teething. No one in this household has slept well for weeks.

Everyone warned me that starting daycare would be rough. Everyone said: oh hey, you’ll be sick all the time, your kid will be sick all the time, you’ll be miserable. How bad could it be, right? Well, it’s bad. I don’t have a thesis for this post, I just need to vent. And yeah, a sick kiddo is why I’m almost a week behind my (self-imposed) writing schedule.

Over the past month that this child was supposed to be in daycare (which isn’t cheap, mind you), she’s been home at least 50% of the time. And oh how I wish I could just blame daycare and say they don’t want to deal with yet another whiny and snotty kid, but I also empathize with the overworked daycare employees who want to send her home. Being a daycare worker isn’t easy, and I’m sure constant crying doesn’t help.

When we were touring daycares, we noticed something interesting: every place posts pictures, names, and mini-resumes for their teachers - and what stood out to me is that many have 1-2 years of experience. Not just at the daycare we picked, but at the majority of places we toured. Turns out daycare workers have significantly above-average turnover - a press release from the Federal Reserve Bank of Cleveland indicates that “turnover among childcare workers was 65% higher than turnover of median occupation”. The wages are low, the hyper-vigilance needed to keep infants and toddlers alive takes a toll on the nervous system, and the job is mostly sedentary - lots of sitting on the floor and in baby chairs watching the little demons crawl around.

Where was I? Oh, yeah, I don’t know what daycare workers are going through, but I empathize. But I also empathize with myself (d’oh), working half-days and taking unexpected time off as my clingy, cranky, annoyed toddler demands some kind of attention. The kiddo’s sick and wants to be held 24/7. But you know what else? She gets bored, so she wants to play. But it’s hard to play when you’re being held. So crying tends to be a good solution.

And all of that is on top of the fact that this disease-ridden potato has gotten me sick, 4 times and counting in the past 3 months. She and mom get pretty sick, but - probably because mom’s body is working for two - they do mostly fine. Sick, but manageable. I, on the other hand, just feel like I’m barely able to survive some days. Everything hurts, and nothing helps.

I used to like being sick, in the same way I love rainy days. You get an excuse to veg out - yeah, it’s unpleasant, but you get to binge your favorite shows or play some sick-friendly games. You order in or your partner cooks for you. You drink tea and such. It’s cozy. And most importantly for someone who struggles to sit still, I don’t feel any guilt for doing nothing. It’s nice.

But being sick with a kid - hell no. Gone is the guilt-free experience. Kid’s sick, wife’s sick, I’m sick. We’re all rotating through our chores, we all have our roles to play. One of us soothes the baby, one of us cooks and cleans, one of us cries and leaves a trail of snot on the floor.

So yeah, here I am, on my 4th sickness, taking a breather to write up this note while mom took the kiddo to get some fresh air. Send help. No, really - shoot me an email to tell me I’m not alone and you’ve survived this. Or maybe tell me why you also enjoy how being sick gives you permission to be lazy. Someone please normalize my experience!

0 views
Manuel Moreale 2 days ago

Step aside, phone

I was chatting with Kevin earlier today, and since he’s unhappy with his mindless phone usage, I proposed a challenge to him: for the next 4 weeks, each Sunday, we’re gonna publish screenshots of our screen time as well as some reflections and notes on how the week went. If you also want to cut down on some of your phone usage, feel free to join in; I’ll be happy to include links to your posts.

I experimented with phone usage in the past and I know that I can push screen time very low, but it’s always nice to do these types of challenges, especially when done to help someone else. Like Kevin, I’m also trying to read more. I read 35 books last year; the goal for 2026 is to read 36 (currently more than halfway through book number 5), and so I’m gonna attempt to spend more time reading on paper and less on screen. It’s gonna be fun; curious to see how low I can push my daily averages this time around.

Thank you for keeping RSS alive. You're awesome. Email me :: Sign my guestbook :: Support for 1$/month :: See my generous supporters :: Subscribe to People and Blogs

0 views
Simon Willison 2 days ago

How StrongDM's AI team build serious software without even looking at the code

Last week I hinted at a demo I had seen from a team implementing what Dan Shapiro called the Dark Factory level of AI adoption, where no human even looks at the code the coding agents are producing. That team was part of StrongDM, and they've just shared the first public description of how they are working in Software Factories and the Agentic Moment:

We built a Software Factory: non-interactive development where specs + scenarios drive agents that write code, run harnesses, and converge without human review. [...]

In kōan or mantra form: Why am I doing this? (implied: the model should be doing this instead)

In rule form:
- Code must not be written by humans
- Code must not be reviewed by humans

Finally, in practical form: If you haven't spent at least $1,000 on tokens today per human engineer, your software factory has room for improvement.

I think the most interesting of these, without a doubt, is "Code must not be reviewed by humans". How could that possibly be a sensible strategy when we all know how prone LLMs are to making inhuman mistakes?

I've seen many developers recently acknowledge the November 2025 inflection point, where Claude Opus 4.5 and GPT 5.2 appeared to turn the corner on how reliably a coding agent could follow instructions and take on complex coding tasks. StrongDM's AI team was founded in July 2025 based on an earlier inflection point relating to Claude Sonnet 3.5:

The catalyst was a transition observed in late 2024: with the second revision of Claude 3.5 (October 2024), long-horizon agentic coding workflows began to compound correctness rather than error. By December of 2024, the model's long-horizon coding performance was unmistakable via Cursor's YOLO mode.

Their new team started with the rule "no hand-coded software" - radical for July 2025, but something I'm seeing significant numbers of experienced developers start to adopt as of January 2026. They quickly ran into the obvious problem: if you're not writing anything by hand, how do you ensure that the code actually works? Having the agents write tests only helps if they don't cheat.

This feels like the most consequential question in software development right now: how can you prove that software you are producing works if both the implementation and the tests are being written for you by coding agents? StrongDM's answer was inspired by Scenario testing (Cem Kaner, 2003). As StrongDM describe it:

We repurposed the word scenario to represent an end-to-end "user story", often stored outside the codebase (similar to a "holdout" set in model training), which could be intuitively understood and flexibly validated by an LLM. Because much of the software we grow itself has an agentic component, we transitioned from boolean definitions of success ("the test suite is green") to a probabilistic and empirical one. We use the term satisfaction to quantify this validation: of all the observed trajectories through all the scenarios, what fraction of them likely satisfy the user?

That idea of treating scenarios as holdout sets - used to evaluate the software but not stored where the coding agents can see them - is fascinating. It imitates aggressive testing by an external QA team - an expensive but highly effective way of ensuring quality in traditional software.
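StrongDM haven't published how satisfaction is computed, but the definition in that quote - the fraction of observed trajectories, across all scenarios, that likely satisfy the user - is easy to sketch. The snippet below is purely illustrative; the Trajectory type and the judge callable are names I've invented for the example, not anything from StrongDM's tooling, and a real judge would be an LLM reading the transcript against the scenario's intent.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Trajectory:
    scenario: str    # which end-to-end "user story" this run exercised
    transcript: str  # what the system actually did during the run

def satisfaction(
    trajectories: list[Trajectory],
    judge: Callable[[Trajectory], bool],
) -> float:
    """Fraction of observed trajectories, across all scenarios, that the
    judge considers likely to satisfy the user."""
    if not trajectories:
        return 0.0
    return sum(judge(t) for t in trajectories) / len(trajectories)

# Toy usage: a keyword check standing in for an LLM judge.
runs = [
    Trajectory("grant-okta-access", "...access granted, user notified..."),
    Trajectory("grant-okta-access", "...request timed out..."),
]
print(satisfaction(runs, judge=lambda t: "granted" in t.transcript))  # 0.5
```

The point is only that success becomes a measured fraction rather than a boolean "the test suite is green".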
Which leads us to StrongDM's concept of a Digital Twin Universe - the part of the demo I saw that made the strongest impression on me. The software they were building helped manage user permissions across a suite of connected services. This in itself was notable - security software is the last thing you would expect to be built using unreviewed LLM code!

[The Digital Twin Universe is] behavioral clones of the third-party services our software depends on. We built twins of Okta, Jira, Slack, Google Docs, Google Drive, and Google Sheets, replicating their APIs, edge cases, and observable behaviors. With the DTU, we can validate at volumes and rates far exceeding production limits. We can test failure modes that would be dangerous or impossible against live services. We can run thousands of scenarios per hour without hitting rate limits, triggering abuse detection, or accumulating API costs.

How do you clone the important parts of Okta, Jira, Slack and more? With coding agents! As I understood it, the trick was effectively to dump the full public API documentation of one of those services into their agent harness and have it build an imitation of that API as a self-contained Go binary. They could then have it build a simplified UI over the top to help complete the simulation.

With their own, independent clones of those services - free from rate limits or usage quotas - their army of simulated testers could go wild. Their scenario tests became scripts for agents to constantly execute against the new systems as they were being built. A screenshot of their Slack twin also helps illustrate how the testing process works, showing a stream of simulated Okta users who are about to need access to different simulated systems.

This ability to quickly spin up a useful clone of a subset of Slack helps demonstrate how disruptive this new generation of coding agent tools can be:

Creating a high fidelity clone of a significant SaaS application was always possible, but never economically feasible. Generations of engineers may have wanted a full in-memory replica of their CRM to test against, but self-censored the proposal to build it.

The techniques page is worth a look too. In addition to the Digital Twin Universe they introduce terms like Gene Transfusion for having agents extract patterns from existing systems and reuse them elsewhere, Semports for directly porting code from one language to another, and Pyramid Summaries for providing multiple levels of summary such that an agent can enumerate the short ones quickly and zoom in on more detailed information as it is needed.

StrongDM AI also released some software - in an appropriately unconventional manner. github.com/strongdm/attractor is Attractor, the non-interactive coding agent at the heart of their software factory. Except the repo itself contains no code at all - just three markdown files describing the spec for the software in meticulous detail, and a note in the README that you should feed those specs into your coding agent of choice!

github.com/strongdm/cxdb is a more traditional release, with 16,000 lines of Rust, 9,500 of Go and 6,700 of TypeScript. This is their "AI Context Store" - a system for storing conversation histories and tool outputs in an immutable DAG. It's similar to my LLM tool's SQLite logging mechanism but a whole lot more sophisticated. I may have to gene transfuse some ideas out of this one!

I visited the StrongDM AI team back in October as part of a small group of invited guests. The three person team of Justin McCarthy, Jay Taylor and Navan Chauhan had formed just three months earlier, and they already had working demos of their coding agent harness, their Digital Twin Universe clones of half a dozen services, and a swarm of simulated test agents running through scenarios. And this was prior to the Opus 4.5/GPT 5.2 releases that made agentic coding significantly more reliable a month after those demos.
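StrongDM describe their twins as self-contained Go binaries that agents generate from each service's public API documentation; that code hasn't been published, and the sketch below is not it. It is only a minimal Python illustration of the underlying idea - a local stand-in for a third-party API that scenario-running agents can hammer without rate limits, abuse detection, or API costs. The endpoint and fields are invented for the example rather than copied from any real Slack or Okta API.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# In-memory state for the fake service: messages posted per channel.
MESSAGES: dict[str, list[str]] = {}

class FakeChatAPI(BaseHTTPRequestHandler):
    def do_POST(self):
        # A deliberately tiny imitation of a "post message" endpoint.
        if self.path != "/api/post-message":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        channel = payload.get("channel", "general")
        MESSAGES.setdefault(channel, []).append(payload.get("text", ""))
        body = json.dumps({"ok": True, "channel": channel}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # Scenario-running agents point at http://localhost:8080 instead of the
    # real service, so they can run thousands of trajectories per hour.
    HTTPServer(("localhost", 8080), FakeChatAPI).serve_forever()
```

The real twins reportedly go much further, replicating edge cases and observable behaviors with a simplified UI on top - but the economic argument is the same: a coding agent can produce this kind of stand-in from API docs far more cheaply than a human team could ever have justified.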
It felt like a glimpse of one potential future of software development, where software engineers move from building the code to building and then semi-monitoring the systems that build the code. The Dark Factory.

I glossed over one detail in my first published version of this post - the sheer token spend implied by that "practical form" rule - but it deserves some serious attention. If these patterns really do add $20,000/month per engineer to your budget they're far less interesting to me. At that point this becomes more of a business model exercise: can you create a profitable enough line of products that you can afford the enormous overhead of developing software in this way? Building sustainable software businesses also looks very different when any competitor can potentially clone your newest features with a few hours of coding agent work.

I hope these patterns can be put into play with a much lower spend. I've personally found the $200/month Claude Max plan gives me plenty of space to experiment with different agent patterns, but I'm also not running a swarm of QA testers 24/7! I think there's a lot to learn from StrongDM even for teams and individuals who aren't going to burn thousands of dollars on token costs. I'm particularly invested in the question of what it takes to have agents prove that their code works without needing to review every line of code they produce.

You are only seeing the long-form articles from my blog. Subscribe to /atom/everything/ to get all of my posts, or take a look at my other subscription options.

0 views
Jeff Geerling 2 days ago

Exploring a Modern SMPTE 2110 Broadcast Truck With My Dad

In October, my Dad and I got to go behind the scenes at two St. Louis Blues (NHL hockey) games, and observe the massive team effort involved in putting together a modern digital sports broadcast. I wanted to explore the timing and digital side of a modern SMPTE 2110 mobile unit, and my Dad has been involved in studio and live broadcast for decades, so he enjoyed the experience as the engineer not on duty!

0 views
Brain Baking 2 days ago

Creating Buttons To Remember Things

My wife recently bought a device to scratch her creative crafting itch: a button press. At first, I dismissed it as yet another thing requiring space in her increasingly messy atelier. I don’t know how we manage to do it, but we seem to be experts in gathering things that gather things themselves: dust. But now that she finally started doing something with it, I was secretly becoming interested in what it could mean for our scrapbook making.

The button press in question is a “We R Makers Button Press Bundle All-In-One Kit” that comes with the press, a few add-on peripherals that allow you to modify how it cuts and presses, and of course the buttons themselves.

The button press in action, about to create a 'little monster'.

Since handling the lever requires a bit of pressure to correctly cut, and a second time to fit the cut circle inside the button, I yelled TSJAKKA every time she would press it, to the great joy of our daughter. She now calls it the Tsjakka. “Daddy, can we make another little monster with Tsjakka?” Because my first instinct, after thinking about what kind of buttons I wanted, was to print a variant of the Alien Lisp Mascot - a green monster with five eyes.

Fellow nerds reading this might have covered their entire laptop back with cool looking stickers: a Docker container sticker, an IDEA logo one, the GitHub Octocat, and god knows what else you managed to nab from a conference table. While I always found those laptops to be just cute, I never wanted to soil mine with a sticker of some technology stack that I would grow to hate a few years later. Thanks to a random takeover by Microsoft sharks, for instance. *cough* Give Up Github *cough*.

So why not a programming language mascot? Java’s The Duke? No way, I’m not that big of a Java fan. The Gopher perhaps? Better, but no. If I was to wear a badge, smack on a sticker somewhere prominent, it would have to be of something that makes me happy. Go is cool but boring. Java brings in a lot of money but smells like enterprise mud. So far, I haven’t encountered a single programming language that truly makes me happy. But Lisp is coming very close. The Lisp Alien it is, then:

The result: three buttons pinned to the inside of my bike bag.

One of the other two buttons is self-explanatory: the Brain Baking logo. The first one on the upper left is part of my late father-in-law’s master’s thesis; an electronic schematic with resistors.

The embossed logo on the button press, below the We R name, reads: Memory Keepers. Which is exactly what that button is for. They market it as a way to permanently record precious memories - and wear them on your sleeve. I think it’s brilliant. We don’t have an endless supply of metal clips and plastic caps to press that memory in, so we have to be mindful: which ones do we really want to create? Sure, you can buy more and it’s not expensive, but that’s not the point. The point is that there won’t be a Duke on my bag, but there will be a Brain Baking logo. And, apparently, a warning.

Most folks pin these buttons onto the obvious visible part of their bag. But I don’t want to come across as a button lunatic (at least not at first sight). A more convincing argument, then: the bag I pinned it on is a simple detachable laptop cycle bag. The exterior gets wet now and then, and I highly doubt that the button is water resistant. The third but slightly less convincing argument is that the buttons rattle quite a bit, as the needle on the back used to pin it onto something sits quite loose in its metal socket. Perhaps that varies from product to product.

As you might have guessed, our daughter is now dead set on pinning a little monster on the bag she uses to carry her lunch to school. We’ll first have to ask Tsjakka to get back to work.

Related topics: / crafting / By Wouter Groeneveld on 7 February 2026. Reply via email.

0 views
Karboosx 2 days ago

Tech documentation is pointless (mostly)

Do you really trust documentation for your evolving codebase? Probably not fully! So why do we even write documentation, or constantly complain about the lack of it? Let's talk about that :D

0 views