Latest Posts (20 found)

Career Update - Life After Stepping Down

Last October I made the decision to step down from my role as a global executive in the cybersecurity industry. The last 8 months have gone by in a flash, so I wanted to write a little update on how it's going. Do I regret the decision? Was it the right thing to do? Or have I screwed up the career I've spent 25 years building? The TL;DR is that I'm far happier now than I was back in October. I've battled with whether this was a failure, and I decided that I didn't fail; it was all just my ego. I've gone from working a ~60-70 hour week to working a ~40-45 hour week on average (I'm contracted to 35 hours). When I sign off, I know my phone isn't going to ring. I'm doing work that I enjoy doing again, with a team I enjoy doing it with. More importantly though, I'm less stressed and I get to spend quality time with my family. That right there, that, is the most important thing. For example, since all my team are in the States, I tend to have late meetings. So on a Wednesday I shift my hours and sign on at around lunchtime. My wife is also off on a Wednesday, so every week we now take the boys to school, then go out for breakfast together. After that we take the dogs for a walk. It's just lovely. Working weekends just isn't a thing for me now either. So I can spend every weekend with the kids, or out on my motorbikes with friends, or in the garage working on them, or on various projects around the smallholding. Sure, I don't have a corner office, or an assistant, or the fancy executive job title. But I'm happy I traded those things in for all the above. If you're thinking about a step down to focus more on family instead of your career, my advice is to go for it. I have zero regrets. Thanks for reading this post via RSS. RSS is ace, and so are you. ❤️ You can reply to this post by email, or leave a comment.

0 views

An Interview with Ben Bajarin About Apple, AI, and Compute

An interview with Ben Bajarin about WWDC and the status of the AI compute industry.

0 views

So you want to write a GUI framework

Link: https://www.cmyr.net/blog/gui-framework-ingredients.html

There are a handful of technical blog posts in my bookmarks that made me go "oh, I never thought of it that way" when I first read them. I'm talking about posts like Parse, don't validate; Text Editing Hates You Too; Choose Boring Technology; or Making illegal states unrepresentable. I consider these required reading for any working programmer. To me, Colin Rofls' So you want to write a GUI framework falls into the same category.

Before reading this post, I'd never considered how much work goes into building a GUI framework. There's a reason even trillion-dollar megacorporations use web technologies to build their apps, ship buggy frameworks year after year, or drop support for platforms with no concern for their users. Building a brand-new GUI framework in 2026 is a long slog, and you don't get to reap the fruits of your labor until you've solved every single problem on Colin's list.

Colin writes:

"Regardless of the specifics, there is one major dividing line to recognize, and this is whether or not a framework is expected to integrate closely into an existing platform or environment. On one side of this line, then, are tools for building games, embedded applications, and (to a lesser degree) web apps. In this world, you are responsible for providing almost everything your applications will need, and you will be interacting closely with the underlying hardware: accepting raw input events, and outputting your UI to some sort of buffer or surface. (The web is different; here the browser vendors have done that integration work for you.) On the other side of this line are tools for building traditional desktop applications. In this world, you must integrate tightly into a large number of existing platform APIs, design patterns, and conventions, and it is this integration that is the source of most of your design complexity.

In general, a game or an embedded application is a self-contained world; there is a single ‘window’, and the application is responsible for drawing everything in it. The application doesn’t need to worry about menus or sub-windows; it doesn’t need to worry about the compositor, or integrating with the platform’s IME system. Although they maybe should, they often don’t support complex scripts. They can ignore rich text editing. They likely don’t need to support font enumeration or fallback. They often ignore accessibility."

He goes on to enumerate all the integration points a GUI framework has with its host platform, including windowing, menus, 2D graphics, text rendering, accessibility, user input, and a bunch more. Each of these problems is hard on its own, but to build a GUI framework that people will want to use, you must solve all of these problems simultaneously.

A few surprising things that stood out to me from the post:

- We don't have too many viable cross-platform GUI frameworks today, especially if you want to target desktop computers. It takes too much time, money, and specialized expertise to build one. If I was starting a desktop app business today, there are only two frameworks I'd feel comfortable relying on: Electron and Qt. Nothing else is mature enough.
- Dropdowns and select menus are actually tiny windows. If they weren't, they would be constrained to live inside your app's main window. You can see this in action when a web application cobbles together a custom select box using a bunch of <div>s. Those custom selects can never overflow the boundaries of your browser.
- Building an abstraction that supports all the different 2D drawing APIs across platforms (CoreGraphics on Mac, Direct2D on Windows, Cairo on Linux, etc.) is difficult. To get around this, many cross-platform apps bundle Skia, which adds ~17MB to the application's binary. The article is from 2021, so that footprint is probably larger now.
- GPUs are built to render 3D scenes, which makes them worse at rendering 2D scenes. Rendering 2D scenes on GPUs is an area of active research.
- If you only ever write English, you've probably never thought about IMEs. I write Hindustani and Punjabi, and broken support for the macOS IMEs for those languages immediately tells me that an app is built using a non-native GUI framework.
- Replicating the native behavior and conventions of a platform is difficult but possible. Replicating the native appearance of a platform (down to the animation curves, gradients, and border radii) is a fool's errand. In my opinion, if you're building a cross-platform app, it's better to have it look completely alien than to try to mimic the platform's native widgets. But respecting the platform's conventions for things like drag and drop, scroll acceleration, etc. is nonnegotiable.

0 views

Note #736

Went to Cali for a fun party in the woods. Was able to hang with friends and talk about the future. Highly recommended. Thank you for using RSS. I appreciate you. Email me

0 views
Unsung Today

“This is my favorite news from all of WWDC this week.”

John Gruber on Daring Fireball: Perhaps the worst UI crime in MacOS 26 Tahoe was the inexplicable decision to add inscrutable, distracting icons next to every item in the menu bar. You will recall Jim Nielsen writing about it, rightly describing it as exactly the sort of thing that Mac users look down upon in platforms like Google Docs and Windows. You will also recall Nikita “Tonsky” Prokopov writing about it, illustrating that the bad idea wasn’t even implemented well, with different Apple apps using entirely different icons for the same menu items. […] Top third-party developers rightly rejected the design, adopting open source code from Brent Simmons to disable the default “icons in all standard menu items” behavior. […] Wonderful news in MacOS 27 Golden Gate: the icons are gone. It’s like Tahoe’s menu item icons never happened. Kudos to Nielsen and Prokopov for pushing on this and explaining the problem so well. This wasn’t about ugly icons. This was about improper use and misunderstanding of iconography. (Also may I try to manifest something:) Looking forward to reading the oral history of macOS 26 Tahoe and Liquid Glass some time in the 2030s! #apple #iconography #mac os

0 views

Home Brew Presents: Last Week

What’s going on, Internet? Last night I headed out by myself to catch the Home Brew Crew perform their first project Last Week at the Auckland Town Hall. I was supposed to go with my wife, but last minute plans saw her and the kids head to Waiheke for the long weekend. Solo gig? No problem! I started my night solo by hitting up Low Brow on K Road for a burger and beer before making my way down Queen Street to the venue. Sitting at the bar with my beer waiting for my burger, it wasn’t long until a group of people out for the night approached the bar and hovered waiting for a table. I started talking with one of the group and quickly discovered that they were out for a work social club event and also heading to see Home Brew. I ended up chatting with them and a couple beers later I headed off with them for the walk down Queen Street to the show. We arrived just before 7pm as the doors opened, and there was a line all the way down the street. We headed across the road to an Irish Pub for another drink and wait for the line to clear out. In the pub I was introduced to more people and had some more chats with other concert goers. It was great to have a chat with others about Tom Scott, Home Brew and the other music associated with him and Young Gifted & Broke (YGB). As we finished up our drinks and headed across to the venue, it was time to part ways as I was heading up stairs to my seats while the group was heading inside to the floor. I said my goodbyes, exchanged phone numbers with a quick txt message and made my way to find my seats. Once orientated I headed back downstairs to check out the merch stand. I was hoping for a copy of the vinyl which has only been available during its first pressing in 2010 and a hoody. I made it to the front of the line and was able to get a copy of a recently repressed vinyl in pink , a black “LISTEN TO HOME BREW” t-shirt, and a Run It Back lyric book. No hoody though. 
I grabbed a couple drinks, a bottle of water balanced in my hands with my merch and made my way back to my seats. I had seats up in the circle which proved to be popular as many people hovered nearby, some asking if they could sit in the vacant seat that I had also purchased for the night. I kept the seat occupied with my jacket and merch haul. The place was heaving with people. The stage design was fantastic. Set up to resemble the Sandringham flat where Tom lived at the time when they created and recorded the EP complete with fridge full of beer. They also had the egg cartons on the wall as mentioned in the closing track on Run It Back, “Run It Back Again”: Remember when we first started this shit? (Yeah run it back) Studio with the egg cartons and shit (ha ha yeah) The stage soon filled up with what Tom describes as his favourite people in the world. I saw Team Dynamite, Brandan Shiraz, Mellowdownz (I think?) up there on stage. It wasn’t too long before Tom burst onto the stage and started spitting the words to the hit, Monday. The crowd went wild, the floor was heaving under a cloud of smoke (not ciggis). He continued through the EP setlist including all the hits, Tuesday, Wednesday, Thursday, Friday, Saturday, and Sunday. Once Last Week was finished he dived into the rest of the Home Brew hits, Drinking In The Morning, Alcoholic, Datura/White Flowers and many more! I had a super fun time, even though I was solo, there were enough people around who were happy to talk so I never felt alone. I’m super happy I got to see Home Brew perform live finally as with them all off in their own musical directions we might not get to see them perform together again, especially not Last Week, front-to-back. Listen to Home Brew, laterz. Hey, thanks for reading this post in your feed reader! Want to chat? Reply by email or add me on XMPP , or send a webmention . Check out the posts archive on the website.

0 views
iDiallo Today

Please, use a link!

This is a rant. It didn't start today, but I think I've reached the end of the line. The straw that broke the camel's back, so to say. I used an internal tool for the first time. I logged in and navigated through the web app, making some updates here and there. All was well. But then I made the mistake of wanting to go back to the initial dashboard. I clicked the back button, and instead of returning to the previous page, I saw Chrome's default tab page staring right back at me. How is it possible? I had navigated through at least a dozen pages, yet one back button click and the web app was completely gone.

If you've ever experienced something similar, it's probably because you were using a single-page app. Nothing wrong with single-page apps, of course, but over the years I've concluded that people who only know how to build single-page apps don't know what a link is. So let's start with examples of what a link isn't.

A div with a click handler is not a link. It's a div with an event handler. You can style it all you want, but it's not a link.

A button with a click handler may be a button, but it is not a link. With the advent of React, this has become so common. Because it's called a button, learners naturally gravitate toward it to link different pages.

But there is worse: an anchor tag with a click handler and no href. This almost feels intentional. As if the developer is teasing me. Why would you use an anchor tag but then omit its most important attribute?

Here is what a link is supposed to look like: an anchor tag with an href attribute pointing at the destination. That's it. Simple. You don't have to add any configuration for the browser to support it. You don't even have to style it. All user agents have sensible default styling for the different states of a link: unvisited, visited, and active. It works well with browser history. On desktop, when you hover over it, you get a preview of the destination URL in the bottom-left corner of your screen. On mobile, you can press and hold to get several options on how to open it. You don't even have to worry about accessibility. It just works.
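To make the distinction concrete, here is a small sketch in Python. The four markup snippets are hypothetical reconstructions of the kinds of elements described above, not the post's original examples, and `goHome()` is a made-up handler name. The check uses only the standard library's `html.parser` and treats an element as a real link only if it is an anchor with a non-empty `href`.

```python
from html.parser import HTMLParser

# Hypothetical snippets written for this sketch (not the post's original markup).
SNIPPETS = {
    "div with a click handler": '<div onclick="goHome()">Home</div>',
    "button with a click handler": '<button onclick="goHome()">Home</button>',
    "anchor without href": '<a onclick="goHome()">Home</a>',
    "anchor with href": '<a href="/home">Home</a>',
}


class LinkFinder(HTMLParser):
    """Collects the href of every <a> element that actually has one."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        href = dict(attrs).get("href")
        if tag == "a" and href:
            self.hrefs.append(href)


def is_real_link(markup: str) -> bool:
    finder = LinkFinder()
    finder.feed(markup)
    finder.close()
    return bool(finder.hrefs)


results = {name: is_real_link(markup) for name, markup in SNIPPETS.items()}
for name, ok in results.items():
    print(f"{name}: {'link' if ok else 'not a link'}")
```

Only the last snippet passes. The first three can be made to navigate with JavaScript, but the browser has no destination URL to show on hover, open in a new tab, or record in history.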
But when a developer is deep in their React app thinking about functionality, they might say, "When you click this button, go to the home page." They will naturally think of the click as an event. And since it's a single-page app, they're thinking about state, not a page. They might write a button whose click handler calls a navigation function. This is already bad enough. But depending on how that function is implemented, it can make or break the entire browser history. In the internal tool I was using, it was essentially replacing the current URL with the new one instead of adding a new history entry. You can avoid all of these issues by just using an anchor tag. If you need it to play nicely with your React app, React Router has a Link component. Please, just use a native link and you won't have to worry about anything else.
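The back-button surprise described in this post can be reproduced with a toy model of a tab's session history. This is a simplified sketch in Python, not browser code: the `History` class, the page paths, and the `new-tab-page` label are all assumptions made for illustration. Following a normal link adds an entry; a replace-style navigation overwrites the current one.

```python
class History:
    """Toy model of a tab's session history: a list of URLs plus a cursor."""

    def __init__(self, start_url):
        self.entries = [start_url]
        self.index = 0

    def push(self, url):
        # Normal link navigation: drop any forward entries, then add a new one.
        self.entries = self.entries[: self.index + 1] + [url]
        self.index += 1

    def replace(self, url):
        # Overwrite the current entry in place; nothing new to go back to.
        self.entries[self.index] = url

    def back(self):
        if self.index > 0:
            self.index -= 1
        return self.entries[self.index]


def visit(pages, navigate_with):
    """Open the app from the new tab page, click through, then press back."""
    history = History("new-tab-page")
    history.push(pages[0])  # arriving at the app is an ordinary navigation
    for page in pages[1:]:
        getattr(history, navigate_with)(page)
    return history.back()


pages = ["/login", "/dashboard", "/settings", "/reports"]
with_links = visit(pages, "push")
with_replace = visit(pages, "replace")
print(with_links)    # the page visited just before the last one
print(with_replace)  # the page that was open before the app
```

With real links, back lands on the previous page inside the app. With replace-style navigation, every in-app page shares a single history entry, so one press of back leaves the app entirely.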

0 views

📝 2026-06-10 21:05

I'm quite liking #Vivaldi, especially now I have dark/light mode switching working in #Ubuntu. If it sticks, I'll play with their email and RSS integration next.

0 views

Being “Good” at Things

Golf content on social media is my online junk food and the other day I came across a video interviewing professional golfers that asks: “What does an amateur golfer have to shoot to be considered good?” It’s a leading question because the phrasing implicitly frames a number as the answer for a qualitative measurement , but I digress. All the pros give their answers. Some say you gotta shoot a number in 90’s. Others say the 80’s. Some even say the 70’s. Then along comes Collin Morikawa: I don’t think there’s a number, but I think you have to be able to finish out every hole without, like, picking up a two-footer. Love it! I don’t want to go too deep on a social media golf interview clip, but… I love how he breaks out of the question’s implicit framing and really strikes at the heart of the qualitative question: “What does it mean to be good at golf?” Being “good”, in his eyes, is not shooting a specific number. Numbers are standardized proxies for measurement across a wide variety of players, skill levels, and — to be quite frank — degrees of honesty. Anyone who has played golf knows that scores can be easily manipulated. On a casual outing amongst friends, my “82” may be very different than the “82” of the players in front of me — or even the players in my own group. It all depends on how you play the game. So saying “if you can shoot number ___” is a very lossy picture of what it means to be “good” at golf — at least for amateurs. That’s why I love Morikawa’s answer: if you finish every hole and don’t get a double bogey, you’re “good” at golf. Because guess what? Finishing is the hard part. The consistency. Showing up to every hole, finishing out based on the actual rules of the game, not taking mulligans, not picking up a two-footer and saying “That’s good.” (Or even missing a two-footer and re-putting and giving yourself the make.) 
Relieving yourself of the exacting burden of the reality of the game is the easy way to play, but it doesn’t make you a better golfer. I think that’s true of so many things we do as humans: programming, design, writing, etc. If you want to be “good” at what you do, do the hard, little things that others gloss over. Do them consistently and well, with discipline and perseverance. If you do, then I’d say you’re “good” at what you do because “good” isn’t a number. It’s quality. A disposition. A way of being. Reply via: Email · Mastodon · Bluesky

0 views
Unsung Today

The trouble with font previews

A reader sent me this screenshot from PowerPoint, with one of the menus looking the best it’s ever looked, and the other one showing up to work with what we could charitably call “a UI hangover”:

[screenshot]

It’s obviously bad craft and crossing over to the “embarrassing” territory, but I thought it’s an interesting question: what happened? The main piece of the puzzle is that the first menu shows the name of the font in San Francisco, but the second asks to render the font name in itself, serving as a font preview.

[screenshot]

Font previews are fascinating because they are the perfect showcase of how tricky fonts can be at scale. Some time ago, I wrote an essay called Typography is impossible. TL;DR: It’s actually impossible to left align or center text. Ever. Not just because each font does whatever it wants – font size is a number that doesn’t really give you anything to hang a hat on, and the font can place itself in its box however it desires, too – and not just because fonts often lie (via bad metrics) about what they store inside, but also because aligning and centering are really in the eye of the license holder, and have more than one definition. So, every time you align text to anything, in whatever way, it’s only an approximation. Most of the time that’s good enough. Here it is not.

I worked on font previews at Figma, and wanted to show you three screenshots of what we did. This first one shows the default attempt: we ask the fonts to render themselves in the same size (16px), vertically centered in a box that’s always 28px tall… and they oblige on paper, but it really doesn’t feel like they are:

[screenshot]

The second take shows what happens if you nudge the fonts up and down so they’re aligned to their baselines. This at least creates vertical rhythm; in effect, we need to make the fonts uneven to make them feel even.

[screenshot]

And this is the final result, with extra adjustments:

[screenshot]

What do we do in the final version? Too many small things to mention, but in essence:

- We literally measure the fonts (programmatically) by rendering them and looking at them, and make adjustments.
- We blow them up (but not too much) if they’re optically too small, or reduce them (but not too much) if they’re too big.
- We have a multiplier for scripty fonts and monospace fonts, where the traditional measurements are likely to be off.
- We even special-case specific fonts by name. That feels like bad practice, but fonts are so varied and all over the place, that I think it’s perfectly fine to make exceptions for particular individual fonts that are popular or otherwise very important to your users.

These adjustments are all in the same category: getting off math balance to get to optical balance. Here, you can compare before (the naïve version) with after (the final version):

[screenshot]

If it feels subtle, imagine it applied to a much wilder menagerie of very thin, very huge, or very strange fonts. (The go-to example? Open a Mac and try typing in Zapfino.) I’m not showing this to brag about my work – okay, fine, to some extent I am, we’re all human – and I truly believe this could be so much better, still. There are icon fonts, color fonts, and non-Western fonts so rich in variety and tradition that this category itself is basically a fractal.

Mostly, I wanted to share this lesson: dealing with fonts is hard, and dealing with fonts as a system even more so. Whether it’s the printing press, paper, or Illustrator, it takes people years or even decades to fully learn the craft of setting type, and to believe their eyes instead of only relying on math. But here, what’s needed is manufactured craft: we have to teach the machine to trust its eyes (which it doesn’t have) over math (which it can’t escape).

Now if you’re wondering why font previews look bad in so many apps, I believe it’s because people working on those did not allocate enough time to deal with all that. But I’ve used the word “embarrassing” as there’s one more thing that the original did poorly, and something the reader identified immediately. The makers of PowerPoint allowed the font to escape its containment:

[screenshot]

This is another big lesson: fonts will ignore their bounds at every single opportunity. That infamous CSS IS AWESOME graphic? That’s CSS underestimating text. That naked URL or code snippet pushing the mobile site past the viewport and making it scroll? That’s the creators of the site not building up enough imagination of what fonts can do when they’re not watching. Zalgo text? A joke, but based in reality. Fonts are so much more feral than you think. Are you ready for it?

Thank you to Giovanni Lanzani for sending in the original PowerPoint screenshots. #details #typography
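The adjustments listed in this post (measure, scale within limits, apply a category multiplier, special-case by name) can be sketched roughly as follows. This is only an illustration of the idea in Python, not Figma's implementation. Every number and name except the 16px base size is an assumption: the target height, the clamp limits, the multipliers, and the per-font override are all made up, and the measured heights would in practice come from actually rendering each font.

```python
from dataclasses import dataclass

BASE_SIZE_PX = 16.0               # base preview size mentioned in the post
TARGET_VISIBLE_HEIGHT_PX = 11.0   # hypothetical "looks right" glyph height
MIN_SCALE, MAX_SCALE = 0.8, 1.25  # hypothetical "but not too much" limits
CATEGORY_MULTIPLIER = {"script": 1.15, "monospace": 0.95}  # hypothetical
NAME_OVERRIDES = {"Zapfino": 0.7}  # hypothetical per-font special case


@dataclass
class MeasuredFont:
    name: str
    visible_height_px: float  # measured by rendering at BASE_SIZE_PX
    category: str = "sans"


def preview_size(font: MeasuredFont) -> float:
    """Return the font size to use so the preview looks optically even."""
    if font.name in NAME_OVERRIDES:
        scale = NAME_OVERRIDES[font.name]
    else:
        scale = TARGET_VISIBLE_HEIGHT_PX / font.visible_height_px
        scale *= CATEGORY_MULTIPLIER.get(font.category, 1.0)
        scale = max(MIN_SCALE, min(MAX_SCALE, scale))  # clamp the correction
    return round(BASE_SIZE_PX * scale, 2)


fonts = [
    MeasuredFont("Tiny Sans", 5.5),
    MeasuredFont("Huge Display", 22.0),
    MeasuredFont("Plain Sans", 11.0),
    MeasuredFont("Loopy Script", 11.0, "script"),
    MeasuredFont("Zapfino", 9.0, "script"),
]
sizes = {font.name: preview_size(font) for font in fonts}
print(sizes)
```

The clamp is what keeps an extreme font from taking over the menu: a font that measures half the target height is enlarged, but only up to the limit.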

0 views
Unsung Today

Got your back, pt. 6

A nice little old-school moment from Flickr: When you accidentally type the name of the new photo album in the search field instead of the “new album name” field, Flickr just passes on that value to save you from having to retype it: (I bet you witness a version of this all the time when dealing with “I forgot my password” flows which should pass on your login from one field to another, but they don’t.) I’m not saying this dialog is beyond reproach; one way to reduce this problem would be to make those two treatments different enough visually to reduce the chance of confusion. But it doesn’t matter, because the truth is that often there is no dividing line between the problems and the symptoms, and both overlap to a scary degree. As a designer, I think it’s sometimes good to simply face this truth: no matter what I do, the user might type something into a wrong field. Anything I can do to help then? #got your back #interface design

0 views

Kafka Share Groups and Parallelizing Consumption - Part 3: Client-local parallelism

All tests were executed against Kafka 4.3.0 using Dimster.

In the last post, Broker-Visible vs Client-Local Parallelism, we looked at two ways of scaling Kafka consumption. The final unit of parallelism can be visible to the broker, as consumers, or it can be local to the client, as threads, virtual threads, async tasks, or some other execution mechanism hidden behind a smaller number of consumers.

Broker-visible parallelism is simple to reason about: if each consumer processes records serially, we add more consumers to increase parallelism. But each consumer adds overhead to the brokers: broker-side protocol state, TCP connections, group membership, fetch state, and participation in the consumer or share group protocol. With long processing times and/or high throughput, the required number of parallel workers can easily exceed what is practical to model as broker-visible consumers.

That is where client-local parallelism becomes important. Instead of scaling by adding more consumers, each consumer application can poll records and process them concurrently inside the client. This allows a smaller number of Kafka consumers to drive a much larger amount of parallel work.

In this post, we’ll compare client-local parallelism with consumer groups and share groups using the Apache Kafka clients, by way of Dimster, the benchmarking tool used throughout this series. Dimster uses the official Apache Kafka clients under the hood. The main comparison is between two styles of client-local parallelism: blocking and continuous.

At the API level, applications obtain records by calling poll(), then later record their progress by committing offsets or acknowledging records. Under the covers, the client sends fetch requests and commit/acknowledgment requests to the brokers. There is some indirection between API calls and network requests, but every parallel processing style has to fit into this general poll/commit cycle.
Consumer-group consumers commit offsets (one offset per partition), whereas share-group consumers commit a set of per-record acknowledgments. We can classify parallel processing within this fetch/commit cycle into two main styles: blocking and continuous.

Blocking: Poll -> kick off parallel processing -> block on completion of all -> commit -> (repeat)

Continuous: There are many implementation options for decoupling polling, processing and commits, but the general pattern can be classified into two main mechanisms:

- Poll -> Dispatch loop: each record is submitted for background processing. Keep polling independently of processing (though implement backpressure by limiting the number of inflight records, i.e. stop polling when your processing buffer is full).
- Accumulate -> Commit loop: accumulate completed records to commit opportunistically.

If you are rolling your own logic using the Apache Kafka clients (rather than choosing a parallel processing library), blocking is by far the simplest to code, but it has the worst performance profile.

The Kafka clients support both blocking and continuous styles with consumer groups, but only blocking with share groups. Why don’t share groups support continuous mode with the Apache Kafka clients? It’s simple. You can only acknowledge a record from the current poll batch. If you try to acknowledge a record from the previous poll, it throws an exception. This may or may not change in the future, but it’s worth knowing if you were planning on implementing a continuous parallel processing style with the AK clients and share groups.

Dimster supports parallel processing simulation with both blocking and continuous mode, but share groups only support blocking mode.
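The two styles described above can be sketched with a thread pool and a stand-in for the consumer. This is a minimal illustration in Python, not Kafka client code and not how Dimster is implemented: `fake_poll`, `process`, and the list-based "commits" are all hypothetical placeholders. A real consumer-group implementation would also need to commit only contiguous completed offsets per partition, which this sketch ignores.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def fake_poll(source, max_records):
    """Stand-in for a consumer poll: take up to max_records from a list."""
    batch = source[:max_records]
    del source[:max_records]
    return batch


def process(record):
    time.sleep(0.001)  # simulated processing time
    return record


def blocking_loop(source, pool, max_poll_records):
    """Poll -> process the batch in parallel -> block on all -> commit."""
    commits = []
    while source:
        batch = fake_poll(source, max_poll_records)
        futures = [pool.submit(process, r) for r in batch]
        finished = [f.result() for f in futures]  # waits for the slowest record
        commits.append(finished)  # one "commit" per poll batch
    return commits


def continuous_loop(source, pool, max_inflight, poll_size):
    """Keep polling while there is spare capacity; commit as work completes."""
    inflight, committed = set(), []
    while source or inflight:
        room = max_inflight - len(inflight)
        if source and room > 0:  # backpressure: stop polling when full
            for record in fake_poll(source, min(poll_size, room)):
                inflight.add(pool.submit(process, record))
        done, inflight = wait(inflight, timeout=0.01, return_when=FIRST_COMPLETED)
        committed.extend(f.result() for f in done)  # opportunistic "commit"
    return committed


with ThreadPoolExecutor(max_workers=8) as pool:
    blocking_commits = blocking_loop(list(range(20)), pool, max_poll_records=5)
    continuous_commits = continuous_loop(
        list(range(20)), pool, max_inflight=8, poll_size=4
    )

print([len(batch) for batch in blocking_commits])
print(sorted(continuous_commits) == list(range(20)))
```

In the blocking loop, nothing new is polled until the slowest record of the batch finishes. In the continuous loop, the in-flight set is topped up as soon as any record completes, which is the property that keeps end-to-end latency low.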
Dimster doesn’t actually process records (except for recording metrics); instead, it simulates processing time by calculating how long each record would take, based on randomized processing times between the min/max.

In Blocking mode, it figures out how long a poll batch would take to process (based on the level of parallelism requested in the workload file) and performs one sleep per poll-loop iteration for the aggregate processing time.

In Continuous mode, it feeds each record, along with its randomized processing time, into an in-memory delay queue (accounting for how much parallel processing is requested). Separately it polls, drains completed records from the delay queue and commits continuously.

Let’s run some benchmarks with Dimster in blocking vs continuous modes. We’ll use an example workload with:

- A long processing time of 1-5 seconds (3 second average).
- A moderate rate of 1,000 records a second.
- Each consumer application is capable of processing around 300 records concurrently.

The aggregate parallelism is 3,000 (1,000 records/s × 3 s average processing time), which puts us in the territory where serial consumers are not a great choice. Firstly, share groups only allow groups of up to 1000 members, and regardless, 3000 consumers would create more than 9000 TCP connections (in a three broker cluster), which is excessive for one use case of this size. We need to process in parallel inside the clients.

We’ll run 3 tests:

- Consumer group blocking style
- Consumer group continuous style
- Share group blocking style

The workload file defines a single scenario with three test points. In the test analysis we’ll cover the configurations in this workload.

All three tests resulted in the same 1000 records/s throughput. But end-to-end latency differed a lot, with consumer group continuous style easily winning.

[chart: the latency distributions]

[chart: the latency distribution of only cg-continous]

Continuous is the clear winner here. You can download the Dimster result tarball here.
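The sizing of the workload above follows from a simple relationship: the number of records in flight equals the arrival rate multiplied by the average time each record spends being processed. A short Python check is sketched below. The variable names are illustrative, and the connection count assumes at least one TCP connection per consumer per broker, which is a simplification rather than an exact figure.

```python
import math

records_per_second = 1_000
avg_processing_seconds = 3.0   # average of the 1-5 second range
parallelism_per_app = 300      # records one consumer application handles at once
brokers = 3

# Records being processed at any moment = arrival rate x average processing time.
required_parallelism = records_per_second * avg_processing_seconds

# Serial consumers: one record at a time each, so one consumer per in-flight record.
serial_consumers = math.ceil(required_parallelism)
min_connections = serial_consumers * brokers  # assumed lower bound

# Client-local parallelism: far fewer broker-visible consumers are needed.
parallel_apps = math.ceil(required_parallelism / parallelism_per_app)

print(required_parallelism, serial_consumers, min_connections, parallel_apps)
```

Under these assumptions, 3,000 serial consumers shrink to about 10 consumer applications once each one processes 300 records concurrently.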
Let’s dig into the results and why the workload was configured the way it was.

With blocking mode, consumption is a factor of the poll rate and the number of records per poll:

The poll rate is determined by how long the application blocks waiting for all records to complete.
The number of records per poll is bounded by (though it is a soft cap).

When estimating the poll rate in blocking mode, the average processing time is the wrong choice as we’ll really block for the longest processing time of a poll batch (not the average). With 100s of records per poll, we’ll likely hit or get close to the upper bound of 5 seconds (assuming a uniform distribution). More likely in the real world, we’d see a non-uniform distribution where, for example, p95 might be 500ms, but p99.9 might be significantly higher at, say, 5 seconds.

The workload we’re using has a rate of 1,000 records/s. Each consumer is capable of processing around 300 records concurrently so we set . With a poll rate of 0.2 (one every five seconds), the consumption throughput per consumer is 60 records/s. To reach 1000 records per second we need at least 17 consumers (and partitions), so I configured 18.

The effective workload of test point 1 :

The consumers managed the 1000 records/s but for some reason, the max end-to-end latency (processing-start-timestamp - publish-timestamp) was double the worst processing time. It turns out that this is a natural effect of Blocking mode with consumer groups. The highest e2e latency will be at least the blocking time of the previous poll iteration (as records kept arriving in the partition throughout the blocking time).

However, you may note that the above e2e latency numbers show a p50 of 7.5s and a max of 10s. This can occur in blocking mode due to the way polls return buffered records and trigger an asynchronous fetch (pre-fetch) to fill the buffer before the next poll.
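The blocking-mode sizing above can be checked with a few lines of arithmetic. This is only a sketch: the function names are mine, and the expected-maximum formula assumes processing times are independent and uniformly distributed between 1 and 5 seconds, as in the benchmark workload.

```python
import math


def expected_max_uniform(low, high, n):
    # Expected maximum of n independent Uniform(low, high) samples.
    return low + (high - low) * n / (n + 1)


def blocking_consumers_needed(target_rate, records_per_poll, block_seconds):
    # Each consumer completes one poll batch per blocking period.
    per_consumer_rate = records_per_poll / block_seconds
    return math.ceil(target_rate / per_consumer_rate)


# With 300 records per poll, the slowest record is expected to take just
# under the 5 second upper bound, so we size for a 5 second blocking period.
block = expected_max_uniform(1.0, 5.0, 300)
consumers = blocking_consumers_needed(1000, 300, 5.0)
print(round(block, 2), consumers)
```

The result matches the text: each consumer manages 60 records/s, so at least 17 consumers are needed for 1,000 records/s.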
Think of the consumer as having a two-step delivery path: first Kafka records are fetched asynchronously into the consumer’s internal buffer, then poll() returns buffered records to the application. In the above diagram, we see the application spending 5 seconds processing a batch it just received, but that batch had already spent 5s in the buffer as it was filled by a fetch triggered by the previous poll (5s before this one).

This kind of e2e latency might not be a big deal, considering the long processing times. If we want to lower the e2e latency significantly, then we need the continuous style.

With continuous style, we have decoupled polling from processing and we can use the average processing time of 3 seconds to calculate the consumption rate per consumer (we are not constrained by the max processing time). Parallelism is not defined by the number of records per poll but by the total inflight capacity of parallel work (threads, virtual threads, async tasks). We can feed that capacity with a constant stream of small polls and stop polling once that capacity has been reached (polling again once capacity frees up). Because the application can poll at a high frequency, buffered records remain in the buffer for only a few milliseconds before being submitted for parallel processing.

In continuous mode, worker throughput is approximately:

In Dimster the inflight capacity is set by the workload field . With a capacity of 300 and average processing time of 3s, each consumer can process 100 records/s. To reach 1000 records per second, we need 10 consumers and partitions. I set the capacity to 400 to add some wiggle room.

The effective workload of test point 2 :

This time we see that e2e latency remains very low, as we don’t block on the longest processing time.

Again with the Blocking style, so:

The per-consumer poll rate is determined by the highest processing time per batch.
The aggregate parallelism is 5000 ( )

Share groups introduce a new constraint, the per-partition inflight budget (aka ). The aggregate inflight budget must exceed the aggregate parallelism of 5000. The default is 2000 per partition and so just three partitions gives us an aggregate inflight budget of 6000. Another difference is that we can have multiple share consumers per partition. If we use then each partition needs parallelism of 1666 ( ). With , we need 6 consumers per partition to cover it.

The effective workload of test point 3 :

You might expect end-to-end latency to be lower than the blocking consumer group test and you’d be right! Each partition has six consumers so the time period between fetches is lower (records spend less time in the partition before being fetched). We are also using so there is no pre-fetching which inflated the e2e latency in the consumer group test.

But it’s still higher than you might expect. Per-partition, we have 6 consumers with a fetch/poll rate of 1.2 per second and 333 records arrive per second. We might expect the worst e2e latency to be 277 ms (333 / 1.2). So what’s going on?

The above calculation assumes each fetch arrives evenly spread over time. But fetches cluster to a greater or lesser degree, since there is no coordination between the consumers. If a long period passes with no fetches, then the first fetch that arrives can drain the accumulated lag, and subsequent fetches just return the handful of records that arrived since the prior fetch a few milliseconds before.

The only way for the consumers in this workload to reach the 1000 records/s is if each fetch returns around 277 records on average. With fetch request clustering, the only way fetches can be filled to this extent is if lag has built up. If 6 consumers attempt to fetch 300 records at exactly the same time, only if lag has reached 1800 on that partition will all those fetches return full.
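The sizing arithmetic for test points 2 and 3 above can be reproduced in a short script. The variable names are mine, and two readings of the text are assumptions: that the aggregate parallelism of 5000 is the record rate multiplied by the 5 second worst-case blocking time, and that each share consumer handles up to 300 records per poll, as in the workload description.

```python
import math

RATE = 1000          # records per second for the whole workload
AVG_PROC = 3.0       # average processing time in seconds
MAX_PROC = 5.0       # worst-case processing time in seconds
PER_CONSUMER = 300   # records one consumer can process concurrently

# Test point 2: consumer group, continuous style.
continuous_rate = PER_CONSUMER / AVG_PROC                 # 100 records/s
continuous_consumers = math.ceil(RATE / continuous_rate)  # 10 consumers

# Test point 3: share group, blocking style.
aggregate_parallelism = RATE * MAX_PROC                   # 5000 in flight
partitions = 3
inflight_budget = partitions * 2000                       # default 2000 each
per_partition = aggregate_parallelism / partitions        # about 1666
consumers_per_partition = math.ceil(per_partition / PER_CONSUMER)

# Per-partition fetch behaviour with blocking share consumers.
fetch_rate = consumers_per_partition / MAX_PROC           # 1.2 fetches/s
arrival_rate = RATE / partitions                          # about 333 records/s
records_per_fetch = arrival_rate / fetch_rate             # about 277 records
lag_for_full_fetches = consumers_per_partition * PER_CONSUMER

print(continuous_consumers, consumers_per_partition,
      int(records_per_fetch), lag_for_full_fetches)
```

The numbers line up with the text: 10 consumers for continuous mode, 6 share consumers per partition, roughly 277 records needed per fetch, and a lag of 1800 before six simultaneous fetches of 300 can all return full.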
So the consumers settle into a stable amount of lag that is high enough such that fetches return with enough records to keep up. If consumers catch up and lag goes to 0, consumption throughput will naturally drop until lag builds up enough to allow for the full consumption rate.

Client-local parallelism is often the only practical way to handle long record processing times. But how that client-local parallelism is implemented in the Kafka fetch/commit cycle has a big impact on latency.

A blocking poll → process → commit loop is simple, but it couples consumption progress to the slowest record in each batch, which lowers poll frequency and can inflate e2e latency even when there is plenty of processing capacity.

Continuous polling decouples polling, processing, and committing, allowing the client to keep records flowing into a processing pool while applying backpressure through an in-flight limit. For consumer groups, this provides much better latency and usually requires many fewer consumers and partitions for the same workload.

Share groups improve the broker-visible side of the problem by allowing multiple consumers per partition, but the current Apache Kafka clients still constrain client-local parallelism to the blocking style. If your goal is highly parallel, low-latency processing, consumer groups remain the better fit. Removing the same-batch acknowledgment constraint from the Kafka share consumer would make that style possible with share groups as well.

In the next post, I’ll look at some pathological share-group workloads with some gotchas to watch out for.


Who Runs the Ransomware Group ‘The Gentlemen?’

A cybercrime group known as The Gentlemen has emerged as the second most active ransomware gang by victim count, rapidly attracting a talented pool of hackers through an aggressive recruitment strategy that promises affiliates 90 percent of any ransom paid by victims. This post examines clues pointing to a real life identity for the administrator of The Gentlemen ransomware group. A graphic created and shared by The Gentlemen ransomware group administrator Hastalamuerte on Breachforums in May 2026. Credit: ke-la.com. Experts at the security firm Check Point Software have been closely covering exploits of The Gentlemen, a so-called “ransomware-as-a-service” (RaaS) offering that pays affiliates handsomely to help spread the group’s malware. “A 90/10 affiliate revenue split — compared to the industry standard 80/20 — is accelerating the group’s growth by attracting experienced operators from competing programs,” the researchers wrote in April. Check Point found The Gentlemen are the second most active ransomware group by victim count so far this year, claiming at least 332 published victims since the group’s inception in mid-2025 and more than 240 in 2026 alone. According to Check Point, the group targets Internet-facing devices (VPNs, firewalls) as their entry point, and once inside moves quickly to encrypt entire networks within hours. Check Point says the administrator and primary operator of the ransomware group uses the nickname Zeta88 on the Russian-language cybercrime forums, and that this individual was previously known under the moniker Hastalamuerte . Check Point noted that a breach of the group’s backend infrastructure made it clear that Hastalamuerte/Zeta88 is the person who assembles the locker and RaaS panel, manages payments, and is essentially the administrator of the entire program who receives 10 percent of all ransoms. 
The cyber intelligence firm Intel 471 shows that the user Hastalamuerte is a Russian and English speaking person who registered on almost a dozen cybercrime forums between 2019 and the present day, including Exploit, Breachforums, Ramp_V2, BHF, Raidforums , and Nulled . Intel 471 reveals that Hastalamuerte registered on Breachforums in January 2025 from an Internet address in Izhevsk , the capital city of Russia’s Udmurt Republic. Likewise, the user Zeta88 signed up at the English-language cybercrime forum Breached in August 2022 from a different Internet address in Izhevsk. Intel 471 finds Hastalamuerte registered on Raidforums in 2020 using the email address [email protected] (1488 is a common combination of two numeric symbols associated with white supremacy ). A lookup on this address at the open source intelligence service Epieos shows it is connected to an account at Apple and to a phone number ending in 04 . Epieos says that Protonmail address is also linked to a GitHub account under the username SantaMuerte . That account is marked private, but a history of this user’s activity shows they are watching and developing a number of malware tools and exploits. In April 2020, Hastalamuerte said on the crime forum Nulled that they could be contacted at the Telegram instant messenger name @hastalamuerte18 , and the threat intelligence company Flashpoint finds this username is assigned the unique Telegram ID number 30907522 [full disclosure: Flashpoint is an advertiser on this blog]. The breach tracking service Constella Intelligence reports that Hastalamuerte’s Telegram ID is connected to another username — “ bu4vs ” — and to the Russian phone number 79127650004 . Pivoting on this phone number in Constella fetches multiple records from hacked Russian government databases showing it is assigned to one Alexander Andreevich Yapaev , a 36-year-old from Izhevsk. 
Constella reveals that phone number was used to create an account at the Russian social media platform Pikabu under the name “ 4apai18 ,” and shows Mr. Yapaev has signed up at a number of websites using the common surname Ivanov , or else “Chapaev” (the numeral 4 is often used as shorthand for a “ch” sound in Russian). A search in Intel 471 for cybercrime forum members with the nickname SantaMeurte unearths an account by the same name created in 2020 on the Russian hacking forum Codeby. Intel 471 shows this user originally registered on Codeby with the not-so-subtle nickname Alexandr 4apaev . Constella finds Mr. Yapaev regularly used the email address [email protected] . Meanwhile, Epieos shows this address is connected to a LinkedIn account for Alexander Yapaev, who lists himself as the head of B2B marketing at the company Uralenergo Udmurtia , one of Russia’s largest suppliers of electrotechnical and lighting products. Mr. Yapaev did not respond to multiple requests for comment. Nearly every time we publish one of these Breadcrumbs stories , readers are curious to know why it seems like so many cybercriminals from Russia apparently do little to hide their real life identities. The truth is that — Russian or not — most didn’t exactly set out to be arch criminals, but instead got drawn into the scene gradually over several years as their skills broadened and sharpened. Another important dynamic is that the Russian government generally either co-opts or ignores cybercriminal activity within its border so long as the hackers do not steal from or attack Russian businesses and citizens. As a result, successful cybercriminals in Russia are usually insulated from prosecution and arrest by foreign law enforcement agencies provided they occasionally pay off the right people and do not travel abroad. And cybercriminals who intend to strictly adhere to those unwritten rules may (at least initially) be less concerned about covering their tracks online. 
But the simplest explanation is that cybercriminals of all nationalities tend to make a number of basic operational security mistakes early in their careers, when they are less savvy and have far less to lose by their carelessness. A review of Hastalamuerte’s early posts on the crime forums (circa 2019-2020) shows a relatively unsophisticated and low-skilled hacker still trying to learn the ropes and earn a positive reputation in these communities. For example, in June 2020 Hastalamuerte’s Telegram account joined a multi-month training program (@pntst) to learn how to use popular penetration testing tools, and their candid posts to this hacker training camp show Hastalamuerte struggling to use these tools effectively. A Google-translated record of Hastalamuerte’s posts to @pntst is here.

ava's blog Yesterday

our workplace LLM mass delusion

I can't help but wonder whether we will look back on this AI hype in the workplace with confusion and embarrassment. If we indeed progress into a future where the bubble will burst, models will further close up, become too expensive for the average user, enshittified, or really specialized for specific fields and most promises end up not fulfilled, how will employers everywhere play this off? How will employees recover from witnessing this cultish environment suddenly dropping off as if nothing happened? My employer, for example, struggles with funding. Open positions are not to be filled and will just fall away; employee bonuses for great work have been cancelled 2 years ago due to the tense financial situation; necessities fell away with a message to just " find a way to deal with it ". Several departments are completely overworked with no help in sight, and are just asked to cut corners. Important licenses and databases are just dropped to save money. This is the backdrop to our AI adoption in the workplace. Still, somehow, there is enough money to hire consultants that advise to go all in on AI for a possible future where money can be saved, and enough money to pay external companies for LLM workshops and seminars for employees for years, and enough money to pay for licenses of both ChatGPT and Copilot. That means: The employee bonuses that should go to all the hardworking employees, and the money to further support our work, is going to grifters, security risks, bad workshops that are not teaching anything remotely usable for our work, and technofascists. Not only that! We have recurring house-wide meetings where groups are asked to show off their LLM projects. They register them, try them out for a couple months, and then come back presenting their results. I have attended all of these meetings so far, and there was not a single one that actually worked out . 
All projects ended with the conclusion that this isn't workable, that this isn't saving time, or that it over-complicates things. Hundreds of people, different teams, people enthusiastic about AI, all kinds of projects, and there wasn't a single success . All kinds of workshops, "prompt-engineering", custom GPTs and skills, pre-prepared documents and templates could not make something truly effective and reproducible in our field of work (not anything coding related!). It was a messy gamble every time . It took a significant amount of time to fine-tune everything, to repeat the task, to verify the output, and correct mistakes before continuing with the rest of the workflow. Not considering this or that document, hallucinations, inability to fill in documents correctly or edit them were the biggest complaints. Even on an Enterprise license, the restrictions were too great. But wait, there's more! We also have house-wide meetings where employees show off how ChatGPT can be used regardless of specific projects; just general use cases for the workday. Let me tell you what great things were shown off. For one , it was shown that you can ask the bot how it feels today. That wasn't presented as a joke, or being sarcastic; no, it was shown very seriously, I guess under the guise of how cool and futuristic and human it is. I'm getting really upset here at the point of writing this, because I have to fight hard to get the funding for the database my team needs for my work and have to justify it every year, and I know that in any other contexts, or just 5 years ago, they would have laughed in your face if you suggested to get a subscription in the thousands to enable employees to have a pointless conversation with a bot. Hello, we have shit to do over here, departments are drowning in work, and you wanna have software that talks to you? That would have been the response, and it is the correct response still!! 
People like that need to be treated like the fools they are, and we need to challenge them more! Next up was the great use case of downloading the cafeteria menu (which is a 1 page nicely designed Excel sheet, like a timetable, showing the different options for each day) from the intranet, giving it to ChatGPT, and asking it what's for lunch on Wednesday. I wish I was joking . I WISH! The bot spat out a longer answer than reading the entire sheet would be. Downloading and uploading and writing the prompt took longer than just reading the sheet. You can see what's for lunch on Wednesday with one glance already. No bot needed! The other general use case presented to us (by our head of IT, no less) was that if we are not sure whether something is a spam mail, phishing attempt, a mail with a suspicious attachment, whatever... we should save it to our Desktop, upload it to ChatGPT and ask it. Good god. I am still in disbelief. I'm sorry, but I don't want the less technically inclined employees among us to save anything shady onto their work laptop. Come on now. What is happening? Have we lost our minds? Intentional or not, AI is seemingly great at amplifying the Dunning-Kruger-Effect in people, making everything they attempt with it seem smarter and justified to them, packaging every fart in a nice bow that makes it seem deep initially. People can pretend they’re now doing something really important and groundbreaking while using the tool for completely mundane and worthless tasks that are better handled differently. Defenders of the tech can feel like they’re part of something big and revolutionary and fantasize about the day they will be proven right and all the critics will shut up or apologize (like my conspiracy theorist dad, who still clings to the same prophecies after over a decade, hoping to be ahead of the curve and right for once in his disappointing life). 
It’s sad, because it feels like a completely out-of-control delusion; you see smart and capable people with lots of responsibility at work suddenly turn into a shill for these AI companies without any rhyme or reason. A highly qualified person, suddenly reduced to the same presence as a door-to-door salesman lying about how well the cleaning product really works, making up use cases that are neither useful nor working right. How is a person like you suddenly reduced to praising the option to ask a bot to summarize a damn 1 page lunch table and present it as a good use case in a company-wide meeting? What have you done to arrive at this point? It’s pure hype, eerily much so, and these people cannot possibly admit that. We have no specific problems it can solve in this workplace, at least 90% of the employees do not have work that would profit off of what Copilot etc can do; yet we attempt it anyway, each attempt worse than the prior one, inventing possible uses and creating problems where there are none, just to be able to burn tokens and justify a subscription, to cosplay the people in Sci-Fi media and have something to show upper management (" At least we tried "). We fall behind on our daily work to train an LLM, beg and plead with it and dance for it like court jesters, and poke around in the shit it spits out. If you ask around at my workplace, any use is good because it is a use and we are exploring and playing . It completely minimizes the time waste, the money sink, the effect of each use, and the powerful institutions behind these tools. And I just wonder: How did this befall us so quickly? There is never money for anything, but this unreliable tech with a huge upfront cost got through immediately? New tech usually passes the public sector by, but this one got all the attention? 
It takes years or even a decade to implement any sort of change or new ideas into this beast, yet we had the infrastructure and organizational bandwidth to deal with AI within the blink of an eye? It is creepy to realize how capable a place really is, and how easily things can be implemented - if the leadership just wants to. It's a complete mask-off moment, underlining that it is never impossible or slow-going by default; it is intentional, by design, and could be improved any time. This is a completely trust-shattering moment for any employee. This is why I asked at the start of this post how we are supposed to move forward from this at some point. How are we all collectively supposed to forget and move past experiencing a point when the respectable elders in an institution have completely and totally embarrassed themselves in the name of "progress"? When all the gates and wallets have been opened for this utter disappointment, showing that the obstacles for implementing anything thought to be inherent and unavoidable in the organization are just a fluke, a lie, an arbitrary thing? How it all created a culture of feeling repeatedly gaslit over months about this whole assessment, as if you must be the one that is insane? I cannot forget this at all. This is my second Covid.

As a final note: If none of that is happening in your workplace or life in general - genuinely good for you. Should be like that everywhere, hope that happens to me too eventually. I applaud you for the skilled and competent people in your life that choose AI use wisely, make the most of it, and offer good solutions. Happy for you if you work in an industry where its use makes sense and produces good output. But unfortunately, places and situations like the above exist, so let people commiserate about the insanity in them without attempting to deny what we’re experiencing.

Published 10 Jun, 2026

Stratechery Yesterday

Fable 5, Anthropic Alignment, AI Tiers

Fable 5 is the public version of Mythos, and while it is very capable it sets some troubling new precedents.


Scan any iOS or Android App for SDKs and API Calls for Free with AppGoblin, no login

AppGoblin lets any user, from anywhere, request to scan any mobile app for the SDKs, trackers and API calls the app makes. AppGoblin has a 100% free Android app that lets you select apps from your device. This lets you select groups or multiple apps at a time and see which trackers each app has. AppGoblin uses the data to show overviews of which companies each app has integrations with, as well as detailed views showing exactly which SDK parts are imported in the app. Feel free to reach out to me or AppGoblin and new companies/sdks/trackers can be added with ease.

Search the app you’re interested in on AppGoblin
Go to the main App page and press the “Scan SDKs & APIs” button
AppGoblin goes and fetches the latest version of the app.
AppGoblin analyzes the Android APK or iOS IPA file for known trackers, ad networks and other business tools that scrape user data.
For Android apps, AppGoblin runs the app in an emulator to track what data leaves the app in the first 60 seconds
The results are prepared and put on the public app page within 24 hrs


A human in control

There seems to be a fair number of people at either extreme in the current AI landscape. On one side we see the “vibe coders” who use agents and allow them to merge code without any person even looking at the source, while on the other side of the field there are people who are against everything and anything even remotely associated with AI. My personal stance is somewhere in between, which I suppose shouldn’t be too surprising to readers of this blog. The core team behind curl, and that is more people than just me, consists of individuals to whom code quality and source code excellence are important. We do software development because it is a craft we love and we are proud of what we have accomplished thus far. We do not hand over our responsibilities to any machines. We stand for every bit of code we merge – as humans. Blindly accepting code written by AI means that you merge a certain amount of errors, but this is certainly true for human written code as well, so this is not in itself special. Some data suggests that AI generated code might even contain more mistakes than the human versions. We invented test cases and code review a long time ago as a means to help us combat mistakes and reduce how many get merged. The particular way code was written does not take away the benefits of code review and getting additional checks and eyes on pending changes. A good code review helps spot mistakes, omissions or slip-ups. It also helps reinforce the architecture and established design choices. This is true however the code was created. Thus far, code reviews done by automatic AI bots and the like have not yet managed to replace the humans. They are simply not good enough. Human reviews are much better. They catch other things and they help make sure proposed changes stay on track. Not to mention that I want to know how curl works; even if I don’t keep 100% intimate knowledge of every single angle and corner, I know most of it.
I think it helps me make better decisions, debug better, help users better and keep the architecture sound. Getting the initial code written is not the big deal. For curl, maintaining and polishing the landed code through decades is the real task. Everything we merge in curl is determined fine and fitting by humans. In all living software projects we get bugs reported and we fix them. We do new releases and continue to iterate. We have done this since software was invented and we still do, as humans are quite fallible and easily make mistakes. We try to reduce the error density and frequency by adding tests and by adding more human eyes on the code before we green-light it. It helps, but is not perfect. To help us do better code we invent, introduce and enforce a wide variety of different tools. With tools that look at code and identify problems in the early stages, they help avoid landing bad code in the first place. They make us do better code. They reduce the bug frequency. Some of the best tools for detecting coding mistakes today use AI. These tools might work on existing source code in a git repository or they might look at proposed changes in pull-requests. Above I mentioned that human code reviews are better; but the opposite is also true. In a somewhat complicated change request, it is now common that after the humans can’t spot any more problems, the AI PR review bots can still find an issue or two to remark on. Sure, sometimes they are wrong and then the comment is easily dismissed, but more often than not the findings they point out are actually something worth addressing before merge. curl is developed and driven by humans, assisted by tools. Open Source is about sharing code and is a development model where we do things in the open. The communication part of this model is key. Share your ideas, your visions, your problems or maybe just your ideas for what to do this afternoon. 
Express what you want or what the problem is, and the team can respond and we can work together on fixing and improving whatever needs to be done. Effective communication, a condition for good Open Source, implies human-to-human interaction. Inserting a large, tone-deaf, AI generated wall-of-text into such a flow can still work, but only in the same way humans can learn to work with difficult individuals as well. It is not ideal and it is not a smooth way of working. It introduces sand in the machine. Don’t do that. It is rude. Effective Open Source work means we communicate as humans, even if parts of the work and the code are made with the help of AI. Humans and machines excel at different things. We can complement each other in software development. Everyone is free to act according to their own will, but in the curl project we don’t hand over responsibility to machines. We stand for our product. We make it as good as we possibly can, using all the tools that are available to us. I claim that in order to do this, humans need to remain in control.

Brain Baking Yesterday

Mechanical Buttons, Not Touchscreens (a Design Mistake)

Or rather, a lack thereof. Why is it that mechanical buttons are being replaced by touchscreens? With every car model refresh, washing machine re-iteration or even new pressure washer model, a button disappears and a small touchscreen-enabled panel appears or grows in size. That’s what I’d call a big design mistake. Nicolas Magand beat me to it with his critique of car dashboard and interior design. Especially in cars, the last two iterations of the new designs aggressively pushed the implementation of touchscreens. Many car brands such as Tesla or Cupra from SEAT seem to go all-in and simply jam an iPad-like screen into the cockpit and call it a day. The ugly screen that protrudes in a weird way from the dashboard not only looks out of place (that’s a polite way of saying bad) but also poses a threat to the safety of the driver and others as you take your eyes off the road to try and “press” something instead of controlling the AC with the help of haptic feedback. I leased a second generation Peugeot 308 in 2016 that came with a “brand new touchscreen system interface” controlling most of the car’s features such as the air conditioning. I hated it: it was slow (a “press” took a second to react) and devoid of feedback (there wasn’t anything to “feel”). A button has a certain springiness; that’s why people who work with computers all day love mechanical keyboards. Feedback. Pressing a virtual button on a touchscreen? No feedback. Besides the terrible software that usually controls the screen, introducing latency when speed is important, the replacement of a true button by a virtual one introduces another problem. While the financial department will hail the adoption rate of cheap touchscreens as a major achievement, I hope the UI/UX department is a bit more reserved. Yet in practice, since “anything is possible” on a blank screen, I’m sure their enthusiasm to use the latest and greatest combined with a pressing deadline yields a lesser product.
The terrible interfaces are easy to spot and everywhere. I don’t know why we as consumers stomach them. Hopefully, our ten-year-old Volvo manages to keep it together for another ten years. We bought the second-hand end-of-lease car in 2020. To avoid analysis paralysis when faced with the many car manufacturing choices, I just went with the one I once drove as a company car that (1) felt very comfortable to sit in and (2) radiated peace in the cockpit. Subjectivity of both points aside, the centre console (installed slightly diagonally towards the driver) features… mechanical buttons. The cockpit of our 2016 Volvo. The centre console features... buttons? There are four big knobs that are very satisfying to turn with clear feedback: the volume and menu controls on top and AC controls on the bottom. Between the knobs, there are a few more buttons to navigate the interface and a classic numeric keypad reminiscent of nineties landline phones. The buttons are sturdy and I can press them whilst keeping my eyes on the road if I need to. Not that I often need to: the most-used controls are the big knobs to turn the volume and AC up or down. The volume is also adjustable with buttons installed on the steering wheel itself. A friend recently remarked “oh wow that’s a small screen!”. Yes it is, it’s a car from 2016, remember? But the screen is non-invasive: it doesn’t get in my way. It’s fully integrated, not a touchscreen, and displays the map just fine when I need a GPS. Unfortunately, the second-generation V60 got rid of most of the buttons in favour of… a touchscreen. Nicolas is right: we need something better than touchscreens in cars. The problem isn’t limited to cars. Our washing machine broke down last month, refusing to do anything at all. After two repair jobs, it was apparently time to be replaced. Again, not wanting to sink in hours and hours of time researching potential replacement candidates, we just went with the same AEG brand. 
Guess what: the boring white front panel with mechanical buttons has been replaced by a sleek black design: touchscreen, of course. The lack of feedback is not the biggest problem. As a short-sighted person, I usually “feel” the buttons and know by heart where to push to just get on with it. Turning the washing machine on was one of those things. I can no longer do that as the front panel feels the same regardless of where you put your finger. Sure, remembering the exact position works, but it doesn’t prevent an accidental wrong press. Luckily, the giant knob that selects the washing program is still there. That’s of course part of the core branding/design of any washing machine. We bloggers love to shout Build Websites That Last but perhaps we should generalise instead: Build Stuff That Lasts. But why should the automotive industry do something like that when they’re motivated by company leases to renew their models roughly every four years? Especially in Belgium, more than half of the cars on the road are company leases. I also expected the washing machine to last 20 years. It gave out at 14, including two repairs. I’ve read reports claiming the average front-loader lasts between 7 and 8 years (others say 10 to 12) so I guess we were lucky. At least they’re not shoving AI crap in our cars and washers. Related topics: cars, design mistake. By Wouter Groeneveld on 10 June 2026.

Den Odell Yesterday

The Great Agent Skills Land Grab

Like many developers, I now lean on AI as a big part of my day-to-day tooling. With appropriate steering, I can have any number of coding agents working with me to accelerate my engineering work. For tasks where more specialized knowledge is needed, I can load in an agent skill to do something the AI model can’t reliably do on its own. That’s the theory, anyway. I installed a popular collection recently. A handful of skills about good engineering practice, well-organized, published by a developer I respect. I loaded them into Claude Code and ran through some tasks I’d normally do by hand. The output was decent. Then, as an experiment, I removed all the skills and ran the same tasks again. The output was identical. That’s when I started looking more carefully at what was actually inside some of these repos. What I found there made me rethink why some of them even exist. My conclusion: creating agent skills collections has become a land grab. A land grab is when new territory opens up, everyone rushes to claim as much of it as possible, and nobody stops to check whether what they’re claiming is actually worth anything to anybody. An agent skill, as Anthropic designed the format, is a folder containing a file with instructions, optional bundled scripts the agent can execute, and optional reference material loaded only when needed. It’s a sound idea, and it gives an agent enough information to understand when to apply a skill without having to load the entire skill into context. What’s appeared on GitHub in the past few months since the format was defined is the land grab in action. Thousands of skills, most of them instructions only, with no bundled code and no platform-specific reference material. Well-known developers and organizations are publishing collections of 50, 100, 200+ skills. Platform companies are shipping marketplaces and leaderboards. 
An "awesome-agent-skills" list is tracking the whole thing, covering everything from Solana development to AI fitness coaching. The developers with the biggest followings are staking the biggest claims, and their followers are installing whatever they publish. But look at some of the code in these repos and you notice something: many of them look to have been written by the same models they’re supposed to be teaching. And, of course, that’s what makes the land grab possible. Nobody can realistically publish 200 skills in a weekend by writing them by hand. Which leads to a simple test for any instructions-only agent skill: could an LLM have written it? If yes, the skill is almost certainly useless, because the knowledge it contains is already in the model’s training data. Loading it into context just spends tokens telling the model something it already knows. That test disqualifies a lot of what’s being published right now. Open any popular general-purpose skills repo and try the test. You’ll find advice like "write tests before code," "use semantic HTML," "measure before optimizing," and "code-split your bundles." This is good engineering guidance. It’s also information that LLMs have already absorbed from thousands of blog posts and books. A skill repeating this knowledge adds nothing, and it takes up space that could hold information the model actually needs. If you can generate a skill with a prompt, you don’t need the skill. The tells are right there in the first few commits: the same structure across dozens of files and a kind of polished completeness with no rough edges. When an LLM writes a skill that gets loaded into another LLM’s context, nothing new enters the system. It’s a closed loop, and handing the output back as context is a no-op. One developer made the generation loop visible by accident. 
The first commit of a large and widely starred skills repository contains a file holding a set of instructions telling Claude Code to generate eight categories of skills across a twelve-week roadmap, complete with projected ROI numbers. The file’s own instructions say to exclude itself from version control, but the commit went through before that exclusion was in place; it was added in the second commit. The generation prompt was supposed to be hidden. It wasn’t. I’ll leave the repo unnamed here, since the point isn’t to name-and-shame anyone. The pattern is common enough now that, in the rush to claim ground, the cracks occasionally show. Another collection of 50+ skills is simply the content from a popular software architecture website, reformatted as skill files. That website is the kind of public, heavily linked resource that major models have near-certainly ingested. The model read the book; these skills hand it the same book back and call it a new capability. Other collections chase size. One developer ships over 200 skills spanning engineering, marketing, product management, and "C-level advisory," which when you look inside means skills telling your coding agent how to prep a board meeting or run a competitive teardown. Another packages what it advertises as over 1,000 skills into a library installable with a one-liner, though the actual skills directory holds closer to a tenth of that number, padded out with aggregated collections from other repos. There is research that backs up the pattern too. A recent analysis published on the Hugging Face blog by Shanshan Zhong, summarizing research by Bosch and Carnegie Mellon, looked at 40,000+ publicly listed skills from the skills.sh marketplace. They found an ecosystem shaped by what’s easy to produce: skill publication happens in short bursts that track shifts in community attention, content is heavily concentrated in software engineering workflows, and there’s a pronounced supply-demand imbalance across categories. 
The overall picture is of a marketplace racing to generate content rather than solve problems. The skills that genuinely improve agent output share one thing in common: they contain information the model couldn’t possibly have, or they pair instructions with code that does work the model can’t do on its own. Jonathan Fulton, a Staff Engineer at Datadog, has written about using skills built around a CLI for querying Datadog metrics, tailing logs, and managing dashboards. The specific flags, query syntax, and output format for that tool aren’t something a model can reliably guess. With the skill loaded, Fulton can say "show me error rates for the experiments service over the last hour" and the agent gets it right. Without the skill, it has to search online for the docs and example code every time, or just guess. That’s a useful skill, because it fills a gap in what the model knows. Matteo Collina publishes skills encoding his conventions for Node.js and Fastify. He co-created Fastify and has maintained it for years. His opinions on how it should be used, the patterns he prefers, the edge cases he’s run into: that’s information a model can’t work out from just inspecting the public docs. Benchmarks on his skill show measurable improvement in agent output across different models. The low baselines confirm what you’d expect: Fastify-specific patterns (for config, for backpressure, and for graceful shutdown) aren’t things models reliably know without guidance. None of that lives in the documentation, so the skill earns its place. Anthropic’s own pre-built skills work because of what they bundle with the instructions. One of them pairs its instructions with openpyxl code that Claude runs. The markdown file exposes the functionality to the agent, and the value is in the executable code that sits alongside. All of these succeed because the information or capability came from somewhere the model couldn’t reach on its own. 
So why do so many generic skills collections keep getting published? Because the whole ecosystem is set up to reward volume over usefulness. For individual publishers, a repo with 50 skills gets more GitHub stars than a repo with 3, regardless of whether those 50 skills tell the model anything it doesn’t already know. Stars, awesome-list placements, and social media reach all favor big collections over smaller, sharper ones. It’s how a collection ends up advertising 1,000+ skills it doesn’t actually contain. Now, I don’t think the people publishing these collections are acting in bad faith. The instinct to share engineering knowledge is a good one, and many of the developers involved have genuinely shaped how the industry builds software. But that’s what the format incentivizes, and it makes a land grab all but inevitable. For platform companies, a larger ecosystem justifies the investment in those marketplaces and leaderboards. Install counts and "works with 18+ agents" headlines need big numbers. There’s no reason for anyone to turn away a submitted skill just because it repeats what the model already knows. And developers keep installing because it feels like the right thing to do. Loading a curated collection of engineering skills gives you the same sense of preparation as installing a recommended set of linter rules or editor extensions. It feels like due diligence. The difference is that a linter rule actually enforces something the editor wouldn’t do on its own, while a skill that says "write clear commit messages" tells the model something it could already tell you. But the instinct is the same: if a respected engineer published it, it must be worth having. The comfort of a well-stocked toolbox is hard to argue with, even if many of the tools duplicate capabilities the machine already has. There’s one honest counter-argument to make, however. 
A skill can sometimes change behavior even when its content is already known to the model, by reshaping attention at the right moment. The model knows TDD exists, for example, but it doesn’t always default to writing tests first. A skill that triggers on coding tasks and explicitly says "follow red-green-refactor" might make the model more likely to actually do it. We could call this activation and recall: the skill surfaces knowledge the model already has, at the exact moment it needs it. The better-designed instructions-only skills include anti-rationalization sections that counter the specific ways models skip important steps, and in principle that’s a pattern the model might not produce on its own. The problem with this defense is that almost nobody publishing these skills is measuring whether they work. The handful of people who have run proper benchmarks, like the teams behind and Tessl’s evaluation tools, have mostly found strong improvements for skills that encode specialist knowledge and weak or absent improvements for skills that repeat general engineering advice. The largest lift in Tessl’s 880-eval benchmark came from a skill for a niche CLI of Collina’s, which took the agent from around 52% to 88%. A generic TDD reminder might change behavior a little. It also costs tokens and competes for attention with other instructions. Worse, it can push the model toward workflows that don’t fit the task. "It might help a bit, sometimes" is not a strong case for loading dozens of them. At minimum, a publisher claiming activation-and-recall benefits should be running evals and showing the numbers. Almost none do. If you’re picking skills to install, use the test. Does this skill contain information your model couldn’t already know, or does it ship code that does work the model can’t do on its own? If the answer to both is no, skip it. You’re paying tokens for nothing. A lot of what’s out there fails the test. 
The ones that do pass work because their authors wrote down what only they knew. For your codebase, that author is you. Don’t bother with the general best practices; the model already has those. Write about your world. Describe your project’s file structure, the module boundaries, the patterns your team has settled on and the ones you’ve ruled out. Your internal CLIs, custom build plugins, private APIs, and deployment steps are all places where an agent burns tokens when it guesses wrong, so write those down too. Capture the reasoning behind your decisions: why you picked this state management approach over the alternatives, or what pushed you toward the current architecture. An agent that understands the thinking behind your codebase can extend it well. One that only knows generic best practices will keep suggesting things that cut across choices you’ve already made. Where the model keeps getting deterministic work wrong, write scripts and let a skill teach the agent when to run them. Each of these skills will be short, specific, and often useless to anyone outside your team. That’s how you know they’re working. Don’t expect your skills to be appearing on a leaderboard or getting thousands of GitHub stars. Don’t get sucked into the land grab. The rush to publish and install generic skills will slow down as developers notice that loading them doesn’t make their agent noticeably better. What remains after the rush is the slower, quieter practice of teams writing down their own knowledge so their agents can use it. That’s the type of skill that actually changes what your agent can do, and it’s the one that will have the most impact in your day-to-day work.
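One half of the install test above, whether a skill ships any code at all, is mechanical enough to script as a first pass. What follows is a minimal Python sketch, not part of any skills tooling: the function names and the list of "instruction" file suffixes are hypothetical, and it assumes a collection is laid out as one folder per skill. It cannot judge the other half of the test (whether a markdown file holds knowledge the model lacks), so treat its output as a reading list for manual review rather than a verdict.

```python
from pathlib import Path

# Assumption: instruction text lives in markdown or plain-text files, and
# anything else in a skill folder counts as a bundled script or resource.
# This suffix list is illustrative, not part of any published format.
INSTRUCTION_SUFFIXES = {".md", ".markdown", ".txt"}


def bundled_files(skill_dir: Path) -> list[Path]:
    """Return files in a skill folder that are not plain instruction text."""
    return sorted(
        p.relative_to(skill_dir)
        for p in skill_dir.rglob("*")
        if p.is_file() and p.suffix.lower() not in INSTRUCTION_SUFFIXES
    )


def is_instructions_only(skill_dir: Path) -> bool:
    """True when the folder contains nothing but instruction text."""
    return not bundled_files(skill_dir)


def review_queue(collection_dir: Path) -> list[str]:
    """Names of skill folders that ship no code and need a manual read."""
    return sorted(
        d.name
        for d in collection_dir.iterdir()
        if d.is_dir() and is_instructions_only(d)
    )
```

Calling `review_queue(Path("path/to/collection"))` lists the skills worth opening by hand and asking the harder question: could an LLM have written this? A skill that lands outside the queue is not automatically useful either; it only means there is something besides prose to inspect.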


An interactive dive into Memo Pad

While most of my posts work great in RSS readers, this post contains elements that do not work so well! Please view the post on my site here: https://thatalexguy.dev/posts/interactive-dive-into-memos/ Join me on an interactive look at the user experience of "Memos" on Palm OS! Thanks for reading on RSS, you're awesome! If you want to be notified of new posts even faster, I have a newsletter as well; you can sign up here!
