Latest Posts (20 found)
Xe Iaso Today

Giving your Go apps Tigris superpowers

Tigris is S3-compatible, which means you can point the AWS SDK at it and most things just work. The catch is that the Tigris-exclusive features—bucket forking, snapshots, object renaming, and the like—need verbose workarounds because the AWS SDK doesn't know they exist. So we wrote a Go SDK that does. It comes in two flavors: the package is a drop-in replacement for the standard S3 client with first-class methods for the Tigris-specific operations, and is a higher-level client for the common single-bucket case that infers its configuration from the environment so you stop passing the same parameters over and over. You can adopt the Tigris features incrementally without refactoring your existing S3 code, and the simpler API still works against other S3-compatible providers. I wrote up how it works and why we built it over on the Tigris blog.

Unsung Today

“I am skeptical it achieves what Apple intends.”

Nick Heer’s analysis of Apple’s Pages interface over time is a nice counterpart to the recent post about Sinofsky doing the same for the early years of Microsoft Office. Here is the key comparison, 2011–2025:
[Four screenshots comparing the Pages interface, 2011–2025]
I’ll let you read the whole excellent analysis and Heer poking at the notion of “content over chrome” which feels dogmatically attractive, but needs deeper thinking which usually doesn’t follow. The interesting thing to me about that last screenshot above is that the team didn’t want a toolbar separated from content – and yet, they walked themselves into recreating a de facto toolbar anyway, just uglier and with more problems. (Just like designers who use all-white for complex surfaces, and arrive at visual hierarchy challenges that now require more work.)
We’re a few hours away from WWDC. I don’t imagine we will see any direct response to the criticism of Liquid Glass as Apple doesn’t work that way, but it will be interesting to spot any indirect signs of reactions or course corrections. #apple #nick heer


📝 2026-06-08 14:22

So I've been listening to #Spotify most of the day while working. Instead of playing my liked songs, I've just let it play whatever. This is the way! It's been banger after banger, and lots of great new tracks. Thanks for reading this post via RSS. RSS is ace, and so are you. ❤️ You can reply to this post by email, or leave a comment.

Unsung Today

“Each one of these buttons has four distinct purposes.”

A nice blog post by Nathan Manceaux-Panot on Pending Design about the subtle design of the tabs underneath the search results in the programming editor Nova:
“Through buttons right below its text field, the bar also lets you filter results: only show files, only show symbols, or only show symbols in current tabs. Here’s the thing, though: each one of these buttons has four distinct purposes. They’re not just for clicking.”
The tabs are clickable as they normally are, but they’re also a treasure map (to tell you something is possible), a cheat sheet (to remind you how to do it again), and an onramp for faster keyboard navigation. I’d add two more things to the celebration:
I myself often forget onboarding is not just about the first run, but also about reinforcement. Here, this UI does a lot of reinforcing over time, helping you build the habit. Pressing the key highlights the tab. Clicking on the tab adds a key as if you pressed it, and so does using an advanced shortcut (e.g. ⌃⌘O instead of ⇧⌘O). Even slash as a symbol comes from path names, so you might naturally associate it with files.
The search pop-up always has a nice contrasty appearance: dark when the background is light, or vice versa. Many modern interfaces go for white background for every UI element and surface. This seems like solely an aesthetic choice, but has more consequences when it comes to visibility of things, and even hierarchy. I am personally always excited when I see a duochrome app these days, because it feels like the team knows what they’re doing and isn’t just chasing visual trends. (Below is an example from Bear.)
[Image: example from Bear]
#bear #coding #keyboard #onboarding #search


Why I Didn’t Buy a New MacBook (Yet)

I currently use three laptops: a work-issued Windows machine that I use every day, a personal 17-inch Windows laptop that I used at my previous job (and subsequently bought from my employer when I left the company more than five years ago), and my trusty MacBook Air. My Mac is a 2017 MacBook Air - Intel i7, 8GB of RAM, and a 500GB SSD. It’s almost nine years old now. The battery gives me a few hours at best, it’s occasionally sluggish, and it doesn’t support multiple external monitors the way newer Apple Silicon Macs do. That’s particularly annoying because both of my Windows laptops connect to my home setup through a single dock. I have two home desks with almost identical setups because I work from home a lot, and my son and husband also use the docking stations to connect their laptops - it’s literally a case of plugging in one cable and you’re away. It’s seamless. My Mac doesn’t do that, but throughout its entire life with me I’ve mostly used it as a literal laptop on my lap - for writing, browsing, and general personal use. And of all the laptops I own, my old MacBook Air still feels the nicest to use. It’s light, comfortable, quiet, and just plain elegant. Despite its age and limitations, I still reach for it when I want to sit on the couch and write. So recently I started thinking that maybe it was time to update that experience with a new MacBook. The laptop I’ve been looking at is a 13-inch MacBook Air M5 with 16GB of RAM. On paper, it’s a massive upgrade. Better performance, all-day battery life, support for multiple monitors through a single dock, and a machine I could probably keep well into the mid-2030s. It would fit seamlessly into my existing home setup (apparently). The problem is that my current one still works. Pretty well. It’s not frustrating. The battery isn’t great, but I mostly use it at home or in a coffee shop for an hour. The more I sat with the decision, the more I got annoyed with myself. This has happened before. 
A few years ago I bought an iPad Air and an Apple Pencil. I spent a huge amount of time on that decision - comparing models, watching reviews, mapping out use cases. I bought it. Then a few months (of agonizing over it) later I decided it would probably be even more useful with a keyboard, so I bought that too. I barely use either of them now. The problem was that I’d been buying a future version of myself - someone who would take notes in meetings or while reading, journal on an iPad, and create things. And for a while, I did. But it was never as seamless or friction-free as I’d imagined. At least not like it was in my head during those hours and hours of research. Or my new Kindle Paperwhite. I bought it to replace my very old Paperwhite, and yet I still use the old one most nights because it performs better in bed, in the dark, which is when I use it most. It’s smaller, the light is gentler in the dark, and somehow it just does the job better. What I’ve come to understand - and never really articulated to myself before - is that research creates its own momentum. The more time I invest in a decision, the more I start to feel like I should buy something, if only to justify all the effort I’ve already put into researching it. The MacBook Air replacement conversation was starting to feel very familiar. And exhausting. I don’t need a new MacBook. Not yet. I want one. I can afford one. I could even frame it as a reward for all my hard work or whatever. But should I? Round and round it goes until eventually I buy the thing and then feel disappointed because the purchase never solved the problem I thought it would. My MacBook is old, but it’s still good enough. Whatever I’d gain from a new one doesn’t feel like enough to justify the cost (and effort). If the battery or performance eventually gets bad enough that it genuinely gets in the way, I’ll replace it without guilt. Maybe I’ll replace the battery now and get another year out of it. It still feels great to use. 
I’m typing this on it now. What I keep coming back to is the difference between wanting a device and wanting the life you imagine you’ll have with it. In my case, I wasn’t really buying a MacBook. I was buying the fantasy version of myself who would finally finish that novel. The person who would take advantage of 18 hours of battery life to write for hours in cafés, on planes, or on top of some mountain somewhere. Never mind that my actual life doesn’t really allow for 18-hour writing sessions in the first place. Maybe I get an hour. Maybe two if I’m lucky. But would I use them for writing anyway? If I don’t on one of the three laptops I already own, why would I on a new MacBook Air?


I Love F1

by Gordon McLean
Gordon has been hooked on F1 for 45 years, from childhood memories of Murray Walker to booking a trip to the Madrid GP. It’s a great look at how much the sport has changed and why he's still obsessed. Read post ➡
I discovered Gordon's blog after he commented on my recent note about F1. I always check people's sites when they comment, as it's a great way to discover new blogs. Gordon's blog didn't disappoint - lots of great content on there, including this gem all about his love of the sport of F1. I've been following F1 for around 25 years now, and have similar memories to Gordon's when it comes to thinking about some of the great drivers from the past. His post has me thinking about booking some tickets to next year's practice sessions for my wife (who's also a big F1 fan) and me. Anyway, fun read. Thanks, Gordon.


A solid tradition // Week 23 — 2026

A week in which some things happen and some things do not happen, much like other weeks.
Current situation: The pain of having children who become driving teenagers is the exorbitant cost of auto insurance. The joy is getting to stare out the window as the midwestern landscape moves by and think about nothing and everything for hours at a time.
Monday 01 June: Dentist in the morning, then work. I make all the kids’ appointments for the same day, which was a logistical requirement for years and now is just a solid tradition. We all troop in and the receptionist says, “Ahhh, the Muellers are here.” I guess at some point my adult-ish children will start doing their own dentist thing but we haven’t crossed that line yet. I’m not rushing it.
Tuesday 02 June: My Mom died 19 years ago. I miss her every single day.
Wednesday 03 June: Certainly some things happened on this day. But memorable things did not happen, I guess.
How often we made our worst fears come true, by behaving as though they already were. — Louise Penny, A Better Man
Thursday 04 June: Early work meeting, then met a friend for a walk. Work, gym, then I spent a lot of money on Amazon [1]. Now that we’re down to two kids at home (🥺) each with their own! separate! bedroom!, they no longer have to tolerate twin-size creaky metal loft beds. So: I ordered bed frames and mattresses. Which means next week we’ll have to put together the bed frames….
Friday 05 June: Took a long leisurely walk at the end of the day. Refused to cook dinner due to the volume of leftovers in the fridge. Saw some pretty peonies. Snuggled up on the couch and read Louise Penny while Lily watched Wednesday again.
Saturday 06 June: Hospital day. Floor wasn’t full and then we had multiple discharges so I ended up with only 5 patients. I actually sat down for probably 2-3 hours total. Home. Shower. Balcony time. Stayed up too late watching Slow Horses.
Sunday 07 June: An absolute luxury of a morning. No alarms. Slept until I woke up, then coffee and slow soft waking time. Then unhurried gym time: Weights, run, sauna. I’m not sure when I became a person who finds a 2 hour gym session luxurious but apparently that’s who I am now. At this precise moment Zeke and I are about 60 miles from Bentonville, Arkansas, where my sister lives. Tomorrow we’ll do a college tour at U of A in Fayetteville, then drive home.
📚 Read A Better Man by Louise Penny. Also FINALLY returned the stack of horrifically overdue library books. SORRY I’M SO SORRY.
📺 Started rewatching season 4 of Slow Horses because I realized I never watched season 5! But then I didn’t remember what happened in season 4 so I’m watching it again before I watch season 5.
🩻 I’m starting Anatomy & Physiology 1 class tomorrow so that’s probably all I’m gonna be able to think or talk about for the next 8 weeks. BONES! MUSCLES! ORGANS! ALSO TISSUES!
💪 3x weight training. I moved up to 40lb dumbbells on bulgarian split squats. I like to start every workout with some split squats because then you know you’ve already done the worst possible thing.
👟 2 runs and hit over 12k steps every day.
🐈‍⬛ 1x Goobie took over my journal.
[1] I know Amazon is the devil. Don’t @ me.


📝 2026-06-07 22:11

Why does everyone in #F1 love #Monaco? It's by far the most BORING race on the calendar.


Coding Is Designing

Code isn’t just a way to implement a design, it’s a way to find one. With an interface, you have to use it, feel it, interact with it, and poke at it to see the relationships between things. Change X, see Y react. If it doesn’t feel right, tweak it. Change X again, now Y reacts differently. Keep tweaking — this here, that there, until the relationships of all the disparate elements fall into place as a single whole. Design is “how it works” and code is the tool to specify how it works. Reply via: Email · Mastodon · Bluesky

Unsung Yesterday

Book review: Steve Jobs in Exile

★★★★☆ (as a book) ★★★☆☆ (for the purposes of this blog)
There are as many books about Steve Jobs as there were Quadra models, but they focus mostly on two phases:
1955–1985 – Steve Jobs befriending Wozniak, the early days of Apple, Lisa, and the Mac
1997–2011 – Steve Jobs’s “second act” at Apple, and the creation of the iMac, iPod, iPhone, and so on
Steve Jobs in Exile by Geoffrey Cain is a just-released, rare volume that focuses on the “in-between years” – starting with Steve Jobs founding NeXT and Pixar after his Apple ouster, and ending with him coming back to Apple under the absolutely strangest of circumstances. It’s a doubly interesting phase, both because we see Jobs maturing as a leader and actually learning from his many mistakes, and because the early technical NeXT decisions eventually became underpinnings for modern macOS and iOS.
I do not see this as a book of new immense insight, technical depth, or design details, but that doesn’t mean it doesn’t go beyond surface level. What I appreciated most was Cain not shying away from pointing at some of Steve Jobs’s mistakes: hiring wrong people he happened to like, almost driving the company to the ground through obstinance, inability to focus on things he considered uninteresting, and a profound dose of duplicity coming into the NeXT/Apple merger. Other things that stood out: focus on people around Jobs, spotlight on Jobs’s disappointing moral flexibility around working with government (or befriending Larry Ellison, for that matter), and a really fun pizza ordering story that serves as a prelude to the Starbucks call during the iPhone 2007 keynote.
Some learnings:
Craft and taste alone are not enough; you can spend your talents and energy on things that don’t “matter” given some definition of the word. That could be okay if that’s a choice you make – “impact” is ill-defined and often overrated, anyway – but you need to approach it clear-eyed, which Jobs didn’t initially know how to do.
Confidence, like everything, needs to be practiced, and focused, and influenced back by feedback and reactions. (Witness the negotiating acumen of a certain Jean-Louis Gassée!)
It’s really hard to create a culture of hard and honest and deep conversations that’s also not a culture of abuse and toxicity.
The one thing I didn’t like about the book was that the few photos inside are only perfunctory; there’s a lot of chatter about a beautiful, symbolic NeXT lobby staircase, top-of-the-landline phones, and expensive chairs, but we never get to see them. Many of the photos are by Doug Menuez – which you can also see online – but the problem is that those photos are generally not that interesting. That aside, it’s still a breezy and entertaining read that filled in some gaps and provoked new thoughts.
#apple #book #book review #culture #next #review

Lalit Maganti Yesterday

17 bugs in 10 weeks from AI security scanning

Over the last several weeks, I’ve been receiving more security bug reports for Perfetto’s trace processor than I ever have before, all of them found by AI. And I’m very happy about it! These are bugs that would almost certainly not have been found a year ago and it feels good to close these loopholes even though trace processor is by no means security critical. For years, security researchers concentrated their time on the highest-stakes targets: kernels, cryptography libraries, password managers. But there’s a lot of code out there which is security-relevant but not truly security-critical. In my experience, these sorts of projects didn’t draw much attention. Now systems in the long tail can get that attention which they wouldn’t have before. Trace processor is a project which sits squarely in that long tail. It’s a C++ library (yes, Rust would be the obvious choice today but it’s not practical to rewrite, see footnote 1) for processing recorded traces of various formats. These are typically traces you collected yourself or in your test infra and process offline so “untrusted input” isn’t much of a concern.

Kev Quirk Yesterday

First Impressions of the Fitbit Air

A little over a week ago I took delivery of my new Fitbit Air, so I thought I'd jot down some of my thoughts after using it every day to track my health. I recently started running again, which I track with my Suunto Run. I've had it for a little while now and it tracks all my walks and runs. It's pretty good, but I wanted something that I could wear alongside a proper watch, so it needed to have no screen and just silently track my health, as I'm not interested in replacing a proper watch with a wrist phone. The Whoop band was an obvious contender, but the £200+ per year subscription that leaves me with a brick if I ever cancel is a deal-breaker, and there was nothing else that I could find on the market... that was until the Fitbit Air came along.
[Image: Fitbit Air on one wrist, watch on the other]
The Fitbit Air costs £85 (~$100) and unlike the Whoop, is a one-off purchase. You also get 3 months of Fitbit Premium, which basically adds Gemini to the app to help provide context, motivation, and workout schedules. After the 3 month freebie it's $10/month, but crucially the device and app work fine without Premium. You just don't get the "AI Coach" which is probably a positive for lots of people. 🙃 I already have a Gemini subscription that gives me access to Fitbit Premium, so I get it with no extra cost anyway. Although the "Coach" has made basic mistakes a few times - like referring to my Suunto watch as a set of smart scales, or incorrectly stating I'd done a 10km run instead of a 5km one - generally speaking I've found the extra context and advice it gives to be very useful. It has helped me to tweak some of my strength sessions and improve my form while running. My hope is that the basic mistakes the AI is making are down to teething problems. If so, I'd like to think they will improve with time. Like anything AI generated though, it's important to not take the feedback and advice it gives as gospel. 
Whenever it's made mistakes and I've called it out, it always responds with the correct data and context afterward. Most of the time I don't even notice I have the Air on. It's so small and light - it just chugs away in the background, doing its thing. It's also about half the width of the Whoop. I bought the rubber strap for mine too, which is more comfortable while running, and less absorbent than the standard canvas strap, so hopefully no sweat will sink into it. The OEM straps are super expensive though, so I'm looking forward to aftermarket ones becoming available. Google advertise the Air as having a 1 week battery life. I can attest to that - I'd easily get a week out of this. It's also super quick to charge. Earlier in the week I was down to around 30% battery, so I chucked it on charge while I jumped in the shower. 20 mins later when I put it back on, it was nearly fully charged. This is great news as I'll be able to keep it topped up when I shower, then pop it back on when I go to bed so it can track my sleep. This is the major downside to all this - Fitbit are owned by Google, so they're likely to use the data in all kinds of unscrupulous ways. But the way I'm looking at it is that the gamification, the data, and the motivation that this little thing provides is helping me to get out and exercise. That's because I love data, so being able to review it all after my workouts, and see progress is hugely motivating. So if it helps me to get fit, and stay fit, it's a price I think I'm willing to pay. I'm kind of at the point in my life now where I just want things to work for me. If there's tradeoffs, so be it. Anything for an easy (and healthier) life. This post was a little all over the place. But overall, I really like the Fitbit Air. The data is keeping me motivated, and although the AI Coach makes mistakes, it is helping me navigate the data and improve my training, so I'll take that as a win. For me, it's an easy decision between this and the Whoop. 
The Fitbit wins out.


A "patchwork" of AI laws is a feature, not a bug

Consider a few of the big AI risks: violating privacy, taking jobs, and destroying the world. Here’s what Sam Altman (CEO, OpenAI) and Dario Amodei (CEO, Anthropic) have to say about each, representing the two largest AI companies.
Sam: If you go talk to ChatGPT about your most sensitive stuff, and then there's a lawsuit or whatever, we could be required to produce that.
Dario: AI seems likely to enable much better propaganda and surveillance, both major tools in the autocrat's toolkit.
Sam: AI is going to eliminate a lot of current jobs, and there will be classes of jobs that totally go away.
Dario: AI could wipe out half of all entry-level white-collar jobs—and spike unemployment to 10-20% in the next one to five years.
Sam: And the bad case—and I think this is important to say—is like lights out for all of us.
Dario: My chance that something goes really quite catastrophically wrong on the scale of human civilization might be somewhere between 10 to 25%.
So, the leaders of AI think there is a decent chance it will destroy the world, are pretty sure it is going to wipe out lots of jobs, and are absolutely sure it is violating our privacy and has the potential to enable much more mass surveillance. We need government to help manage these risks, let alone all the other concerns surrounding AI like data centers, child safety, etc. There is currently a debate whether Congress should prevent states from regulating certain AI risks. For example, the recently introduced Great American AI Act preempts (stops) states for three years from “regulating the development of any artificial intelligence model” (from § 121 in the full text and explanatory text). It would be one thing if Congress was super effective, but it’s just not, especially with regard to tech regulation. Case in point, Congress has failed to pass a general privacy law for the entire life of the Internet! 
In fact, aside from narrow, reactive measures like the TAKE IT DOWN Act, Congress has failed to pass any comprehensive tech regulation for a generation. Do we really think that it is going to start now and sufficiently deal with the myriad of AI risks? NO. Their lack of progress to date is more than enough evidence. Meanwhile, states have already been moving to regulate these risks. I also don’t buy the criticism leveled against these state laws that they are somehow slowing down innovation via costly compliance, particularly in our AI race against China. The state laws getting the most pushback, like California's SB 53, apply to a small number of frontier firms, now some of the most well-capitalized companies in the world. Yes, monitoring rules across states carries real costs, but these compliance costs are a rounding error to these companies. And, the broader state AI laws that reach smaller AI companies mostly require things like transparency and consumer notices, exactly the kinds of rules states already impose in many other regulated industries without killing progress. I've seen no systematic evidence that state AI laws have materially slowed innovation. The case for state participation doesn’t even depend on Congress’s track record on tech regulation (or lack thereof). Even if Congress could somehow become much more effective overnight, AI is still too big and too fast-moving for one governing body to manage its regulation. In this case, a so-called “patchwork of laws” is a feature, not a bug. Parallel beats serial, and independent bodies produce independent ideas, covering ground a single regulator never would. It’s worth noting that Dario himself made this case in an op-ed last year, opposing a proposed 10-year federal moratorium on state AI laws, calling it “far too blunt an instrument.” I agree. 
Congress should set a federal floor on the risks it's uniquely positioned to handle, like existential safety, national security, and cross-border harms, letting states go further where they see fit. While the preemption clause in the Great American AI Act is narrower than previously proposed (for example three years instead of ten years), the provisions of the act itself further make the point that zero years is warranted. On jobs, the act just funds a study to examine the problem rather than meeting the moment, say by levying a direct tax on AI consumption to fund displaced workers. If it’s too weak on jobs, arguably the most politically salient aspect of AI regulation needed, you can bet it will also be too weak on any area it is preventing states from regulating. So no preemption please. Let states protect their citizens as they see fit. More rules for AI are needed than Congress is realistically positioned to enact.

Unsung Yesterday

“Artifacts from a strange moment”

Welcome to another Super Mario Sunday! This is an 11-minute video from gruz talking about the fascinating world of South Korean bootleg Marios, such as Super Boy, Super Bros World, and Super Bio Man – existing solely because of Korea’s subpar copyright law of that era:
[Embedded video]
In short: The code was copyrighted, but the IP was not, so many companies rebuilt Mario for the dominant game console of the region, in the process stripping it of all of the original game’s actual craft – with “levels feeling assembled rather than built” and “getting the [visuals] right and missing almost everything underneath” – and as such the games become interesting as a reflection of the details that actually made Mario great. However, as time moves on, some of the bootleg games actually get better and better, and come into their own. It’s interesting to compare this to Nintendo’s own “clone” I mentioned before. What I wouldn’t give for some oral history of what looks like an absolutely fascinating time and place for software. #craft #details #games #history #super mario bros #youtube

0 views
ava's blog Yesterday

what i read this week - week 23 2026

I've read less this week than the prior one. I was catching up on a lot of recorded talks from re:publica 26, which I couldn't attend (those talks will get their own post!). I was also a tad busy with meeting people and doing the ring class, and also unfortunately struggling with depression. I didn't manage to read a lot of the stuff I meant to read. Oh well.

Character.AI enshittification - good example of what will arrive at other models and use cases soon, inevitably, especially since this stuff is so expensive.
EU's Digitalisation Push: Surveillance, control, exclusion - Made me learn about the concept of the increasingly "digital welfare state" and how governments worldwide are using tech to automate, predict, and "optimise the delivery of social protections". This means the most vulnerable people, who need state safety nets the most, are also most at risk of having their privacy violated, their data collected and sold, and their behavior automatically analyzed. It's getting easier and easier for the EU to grab control over social security aspects instead of leaving it to the member states, due to classifying the digital exchange of data between governments, digitalization etc. as an EU issue. I have some issue with the presentation of the facts because they make it seem as if your data is only now just coming online, when in fact, it has been on government agencies' servers for over a decade already, and exchanged between them; it was just not accessible online to you. It was a hassle for you to get anything digitally in return. So yes, while the risks are there, and there are new projects and developments, a lot of it is just finally making sure you get easier access to your own data as well, and hopefully have to submit less paperwork. I can say that as an employee in the public sector who was partially involved in the internal changes we made to comply with Germany's new Onlinezugangsgesetz (OZG; "online access act"). Our servers and databases stayed the same; now you just no longer need to send out a pigeon with a note and hope to hear something back within the next 5 years when you want something from [my employer]. They also underestimate how much need-to-know basis there is in granting access; no one will have a complete 360-view access to a digital life file. I don't like how it's just presented as inevitable that your doctor can see info about your housing situation, as if this would be in any way complying with any privacy regulation, or the reality of the software state of any doctor's office (it is bad). Aside from the digital aspect, the risk of the government just completely shutting off your lifeline has not increased, since they could already have all the info and decide to do so when your life was a folder in an office.
Linked in, and related to, the prior one: UN Digital Welfare Dystopia - from 2019, exploring how digital welfare is often used to cut welfare spending, set up intrusive government surveillance and generate profits for private companies as these supply the systems.
EU €200 Million Temu Fine - Temu breached the Digital Services Act as it failed to properly assess the systemic risks posed by illegal and unsafe products being sold in the EU.
The attack on competence - amazing read about how uses of LLMs get differently scrutinized at work, and how real competence (which is usually about constraints, structure, patience, testing, knowing your craft etc.) is outshone by this sort of fake competence companies value nowadays, in which the vibecoder who cares about none of these things is pushing out a lot of stuff and is seen as more impressive or working harder. They are protected and encouraged by the investor class (usually the management) to continue to use LLMs in this sloppy way under the guise of "moving fast and breaking things", because they only care about the superficial numbers as it helps them build legitimacy by posting that stuff on LinkedIn or when they are speaking about technology at conferences and meetings. Management can begin to feel quite threatened by real competence, as the competent people are not yes-men like the vibecoders. AI promised this investor class to finally eliminate the need for competence and not have to consider or respect any feedback other than their own (which they generate with sycophantic AI). I'm... actually fighting a somewhat similar fight at work right now, so this hits home.
Debate over UK Migrant Age Assessment - AI wrongly classifies minors as adults, which poses dangers.
European operators get more satellite spectrum - European companies get prioritized in the satellite spectrum.
Orbans Payroll - The Lajos Batthyány Foundation and its Danube Institute spent a ton of money to buy influence on the Western Right, particularly spending it on public figures of the Western right-wing media and think-tanks.
Meta Silently Added Face-Recognition Code for Its Smart Glasses to Millions of Phones - (Archive) - an AI companion app by Meta that is needed to use a key feature of the glasses has code for NameTag included now. NameTag will use biometric features of the people you're looking at to check if they are already registered on your phone, and for info/notifications regarding them. I assume this is for when you meet people you know and it shows you facts around them, a summary of your last exchange, etc.? I understand why people might think this is useful in a vacuum/ideal world, but damn, not like this, in this climate...
Why the AI Policy Debate Should Focus More on the Harness and Protocol Layers - loooong transcribed interview with the chief technology officer of Mozilla about AI, especially cybersecurity. I find his takes about the dual use nature a bit naive and evasive of the real issues, as well as how many issues AI causes for open source, but it was an interesting read. Shocking though that he spends $1,500 a month on his AI agent, and he can't even properly explain what he does with it, aside from... checking his calendar? But I am giving him grace in that sometimes you just aren't that great with explaining stuff spontaneously in a podcast. What I also don't like in general, in many formats and interviews like this one, is how people can just say "We could build this very privacy-invasive tech in a way that is privacy-respecting, but we choose not to." which at first is right, depending on the tech, but then it's about Ring cameras, car cameras and other surveillance in public, and it's like - how? Either you get recorded and it alters your behavior, or you aren't. There is no middle way for some of this, but they always get away with saying that without offering specifics.
FAQ zur Bezahlkarte - German FAQ about the payment cards for refugees. It's a problem that every area handles them differently, support is patchy, they cannot be used to pay online, teens cannot get one, and the cards allow a very low amount to be withdrawn and spent.
Gewerkschaftsgründung bei Wikimedia - German article about how Wikimedia (mainly their US branch) is attempting to unionize!
Gematik plant mehr Zentralisierung und Macht für sich - Organization responsible for our digital health stuff.
IRINI dient nurnoch der Migrationsabwehr
CDU, SPD und BSW wollen Überwachung in Sachsen ausweiten
NRW will Haftung der Online-Plattformen erweitern - calls for platform owners to be held liable for deepfakes etc. under an amendment of the DSA. I think that makes sense, considering the decision impact of the Russmedia case.
Recovery and Resilience Facility - official explanation of what the EU's Recovery and Resilience Facility (RRF) is. Member states are getting money for hitting specific goals in their plan, comprised of at least 37% going to green measures and 20% to digital measures. They must meet all milestones and targets by 31 August 2026, and the Commission must make the final payments by 31 December 2026.
Once Only - official explanation of the Once Only Principle -> federated IT architecture in Cross-border e-Services for Business Mobility, Updating Connected Company Data and Online Ship and Crew Certificates.
Single Digital Gateway - official explanation for the Single Digital Gateway -> facilitates online access to information, administrative procedures, and assistance services that EU citizens and businesses may need in another EU country.
Improving AI Labels - YouTube's new AI labels
Positionspapier Verbraucherzentrale zur Datenschutzaufsichtsreform - opinion of the German consumer rights organization on the coming changes to our data protection authorities. They make clear they want unambiguous responsibilities, same decision-making processes and results across the board (right now, each DPA is kinda giving their own, sometimes wildly varying positions), fast help and decisions for citizens, no centralization in the non-public sphere, a possible One-Stop-Shop on the national level, and more resources for our Datenschutzkonferenz (DSK).
Gesetzesentwurf zur Durchführung der Verordnung (EU) 2024/2847 (Cyberresilienz-Verordnung)
BfDI fordert Bundestransparenzgesetz

This one about how a parent using retail video surveillance footage in child custody proceedings is a legitimate interest under Article 6(1)(f) GDPR. I summarized it for GDPRhub.

None this time, couldn't make it. I have several open tabs of some though, especially around the new tech sovereignty package...

I am also currently reading Arne Semsrott's new book; I am about 40 pages in (from a small ~180).

In total, that is roughly 125 pages, if we count an article as two pages on average.

Published 07 Jun, 2026

0 views
<antirez> Yesterday

A new era for software testing

Automatic programming dramatically speeds up writing software in certain use cases and in the right hands. In my experience the output does not reach the structural quality and economy of complexity of the best hand-written software. However, not all software is stellar, and my feeling is that automatic programming surpasses, most of the time (and if well managed), the quality of decently developed hand-written code. Yet there is a tradeoff between quality and time when writing new software with AI. In certain projects I developed this tradeoff can be brutal, that is, projects that may take many months get completed in a few weeks.

However, there are domains where LLMs simply open new, strictly more powerful ways to automate processes, without any compromise on quality. One of those domains is software QA and testing.

Traditionally software is tested using test suites that are composed of locally-scoped tests and integration tests (think of Redis: one thing is testing if SET foo 10 will be matched by GET foo => 10, another thing is testing if replication works in this case), and then by QA passes that are usually manually executed, and that can capture holes in the runnable test suite. It is a known fact that covering all the lines of the code does not mean covering all the possible states. Moreover, integration testing is structurally hard: there are a number of timing issues, setups, and certain quality outputs that can only be visually inspected and not automatically checked, which leave a lot of testing opportunities not really exploited because of time or logistic constraints.

LLMs offer a new way to do QA on top of the existing testing methodologies. The idea is to create a markdown file where an AI agent is asked to work as a QA engineer, performing a number of manual tests on the new release. For instance, in the case of DwarfStar (an inference engine for open weights LLMs) I use the following approach.

In the markdown file, the agent is asked to check which commits are new on top of the already released version of the software project. Then the model is given a list of things that should be performed, like:

- Check that distributed inference works across MacBook A and MacBook B, making sure the output is coherent, the inference works with all the GGUF files we have in both the machines, ...
- Make sure this release does not contain any speed regression.

And so forth.

Notably, in the speed regression part, I don't have to tell the agent what the previous expected speed was, as this is a moving target that changes with new releases and new optimizations. Similarly, the integration test for distributed inference does not require many instructions: at the start of the file there are just SSH endpoints and the key to use, the paths, and so forth. The agent is asked to check the long list of QA activities *especially* in light of the added commits, starting with an inspection of the changes and with the identification of what could be affected, so that the QA pass specializes in trying to find specific regressions.

In the case of Redis Arrays, I used a similar methodology, asking the agent to build a large array-based Redis application, to set up a production environment with replication and persistence, and to simulate the usage of the application for days and with many users, checking if something was odd.

Testing that uses these approaches may also move toward the more psychological side of software quality, asking the agent to identify all the new features that may look surprising, not documented enough, or generally sloppy from the POV of the user. All things that needed to be executed manually before, and that most of the time were skipped.
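As a rough illustration of the approach described above, the QA file could be assembled by a small script. This is only a sketch under stated assumptions, not the setup actually used for DwarfStar or Redis: the function names `commits_since` and `build_qa_prompt`, the section titles, the host names and the checklist wording are all hypothetical. The only external command assumed is `git log --oneline <tag>..HEAD`, which lists the commits added on top of a tag.

```python
import subprocess
from typing import Dict, List


def commits_since(tag: str, repo: str = ".") -> List[str]:
    """One-line summaries of the commits added on top of `tag` (hypothetical helper)."""
    result = subprocess.run(
        ["git", "-C", repo, "log", "--oneline", f"{tag}..HEAD"],
        check=True, capture_output=True, text=True,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]


def build_qa_prompt(commits: List[str], endpoints: Dict[str, str], checks: List[str]) -> str:
    """Assemble the markdown text handed to the agent acting as QA engineer."""
    lines = ["# QA pass for the upcoming release", ""]
    lines.append("You are working as a QA engineer on this project.")
    lines.append("")
    # Environment details go first, so each check can stay short.
    lines.append("## Environment")
    lines.extend(f"- {name}: ssh {target}" for name, target in endpoints.items())
    lines.append("")
    # The commit list lets the agent focus the pass on what actually changed.
    lines.append("## Commits since the last release")
    lines.extend(f"- {commit}" for commit in commits)
    lines.append("")
    lines.append("First inspect these changes and identify what they could affect.")
    lines.append("Then run the checks below, focusing on likely regressions.")
    lines.append("")
    # No baseline numbers are embedded here: the agent is expected to
    # establish them itself (e.g. by measuring the previous release).
    lines.append("## Checks")
    lines.extend(f"- [ ] {check}" for check in checks)
    return "\n".join(lines) + "\n"
```

In a real repository the commit list would come from something like `commits_since("v1.0")`, where `v1.0` stands in for whatever tag marks the last release. The output is plain markdown, so it can be reviewed and edited by hand before being given to the agent.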
I have the feeling that the introduction of automatic QA may raise the bar of quality for new releases of software, and maybe partially compensate for the lower quality of the code produced at high speed with the use of automatic programming.

0 views

Software will never feel the same

When I visit a restaurant, I care a lot about the food I’m being served, and I want everything I order to be delicious. But arguably, my favourite part is understanding and appreciating the decisions behind each item on the menu, as well as the menu design, the choice of cutlery, the lighting, everything. For me, going to a restaurant isn’t only about eating, it isn’t only about having a good time in a nice place: it’s also about appreciating how harmonised everything is, and how the professionals working there have organised their respective skills to make the restaurant as good as it can be.

If I learn that a restaurant uses ready meals, and that the people working there simply reheat something they didn’t make and label it as their own because they’ve added some of their own seasoning, it would change my opinion of that place, even if the food served tastes as good as before.

But what if I never know? What if I can’t tell if a restaurant I like cheats its way to success? If I’m just tasting what’s on the plate, how can I know? It’s hard to tell. Sure, there are signs. For instance, if the food is served only two or three minutes after I order, or if the dish served is exactly the same as a dish served in another place, these sorts of things. Regardless of the taste, regardless of the food quality itself, I would grow suspicious, disappointed. It would eventually diminish my appreciation of that place, having the feeling that they aren’t really cooking anything themselves. I would simply not go to that restaurant again. If this general feeling of doubt occurs in most places I visit, whether justified or not, it will certainly taint my overall enthusiasm for restaurants.

Yes, you’ve guessed it, this is a post about A.I. and software development.

As much as I love good software, especially on the Mac, and trying out applications that may end up part of my digital routine, I think what I love the most is appreciating what is seemingly called “the craft”.
When I use an app for the first time, the exploration part is my favourite: to see if this app has a chance to stay on my Mac. I try to understand the design and the decisions made building the app the way it is. I actually love spending time digging through the different options and settings.

“What are your hobbies, Nicolas?
– Well, spending time in preference panels on macOS surely is in the top 3.
– OK, weirdo.”

When I manage to grasp the extent of the app’s features, how they can be operated, and how they are laid out in different menus and toolbars, I like to imagine the debates that went on within the teams. I like to see how shortcuts are implemented, I like to find out what changes when I press the Option key. I like to think “Oh, this is clever” when something unexpected is available and yet makes perfect sense.

If the app is useful to me and fits into my “workflow”, fantastic: I get to appreciate both the craft and the app itself. If the app isn’t really for me, I can still recommend it and appreciate the efforts, the design, the features and the intentions of the team. It’s a sort of catch-and-release approach to software fishing.

At least this is what I enjoyed doing until recently; alas, the fishing part now has a strong, inescapable stink; too often with brand new apps something smells… …suspicious. *1

To be clear, there’s nothing wrong with using A.I. in app development, but just like there are good apps and bad apps, there is good and bad use of A.I. I know how A.I. can help some teams to reach their roadmap goals much faster, to fix bugs that were never a priority before, or to build features that couldn’t be afforded in the first place. This is great, and I’m sure this will now be the norm. Objectively, this is a good thing for the Mac.
As John Gruber writes on Daring Fireball, it can serve the platform by increasing the number of truly native apps:

The Mac has never faced a decline in popularity, but truly native Mac application development (and the skills) did. Now it’s turning around. Mac users are thirsty for Mac apps, and with A.I., they can quench their own thirst and tell the dullards promulgating Electron bundles to pound sand.

But not all uses of A.I. are ideally implemented or even well-intentioned. Some apps can appear to be “authentic”, but they are not. Some apps can appear to be native, but they are not. They can appear to be indie apps, with a small dedicated and passionate team, but they are not. Instead, most of them are imagined, created, updated and distributed in a matter of weeks by a single person and their favourite A.I. tool, without any consideration for best practices, security, good UX, or transparency.

It’s a free market, and users can decide what’s best for them, sure. Although it’s increasingly hard to tell the difference without any honesty from the developers, and this is where the deception begins.

Nowadays, I get a feeling of unease more and more often. I look at a new app’s website, and I feel like I’m being lied to. I feel like I’m in an artisan shop and I see hundreds of miniature Eiffel Towers carved in oak that are way too perfect to be handcrafted by a single person. I’m sad that from now on I may never be as trusting and curious regarding apps as I was before. The good smell of craft is now covered by the stink of doubt and suspicion.

When it comes to software, the Olympic Games have been replaced by the Enhanced Games. The sports are the same, and not all athletes take drugs, but it just doesn’t feel the same, does it? Today, not only can we not watch a cute kittie video without wondering if it’s real or not, we now can’t even use an app without wondering if it’s vibe-coded or not.

I’ve had this feeling again this week with an app called Aphera.
This RAW editing app looks great, native, fast, efficient: exactly my cup of tea. But I have an itch. Something feels too good to be true. Maybe I’m wrong, maybe I’m just paranoid, but when an app seems to come out of nowhere, made by a very small team in a short amount of time, I’m now sceptical. I’ll stick with RAW Power, thank you.

Nothing to do with the app Aphera itself. The app could be “legit” all the way, and I may be a fool for not giving it a go. Two years ago, I would have downloaded it already. Today, I have too many doubts; I’ve even grown suspicious of an app I genuinely like: Helium Browser.

Some of these A.I.-enhanced apps are fine, I guess, and well-intentioned. Apps like Tolaria are prime examples. Not for me, but I can see the value for some people (and it’s free, so it’s hard to complain here). But most of the time, something feels off.

The other day, I received an email from a reader pointing out that the link I used in a post for the great Mac app called MarkEdit was not the one to be expected. The link pointed to the domain where the MarkEdit app I love lived at the time of the post’s publication : . The app developer eventually let that domain expire, and the app has been living only on GitHub since. But when I went to check that link again, the website was indeed for an app called MarkEdit. But it was another app, not the MarkEdit I praise regularly on this blog.

This new MarkEdit has since changed its name to AnnexEdit. Not sure if the original naming was a pure coincidence or a sneaky way to capitalise on an existing brand (MarkEdit is a catchy name after all). Maybe anxiety about copyright infringement finally pushed the developer to do the right thing after a few weeks, or maybe it was a simple mistake.

This new app, AnnexEdit, is, I believe, another example of AI-generated software.
Something feels off: only a few months had passed between the first alpha version and the final product, the app is built using the Rust language (nowadays it’s pretty much a tell), and overall, it lacks personality, it lacks intent, it lacks a human touch. It doesn’t lack a pricing page though, nor a “pro” version. *2

Of course, no mention of A.I.-driven development anywhere on the website. If you’re a serious developer using A.I. to speed things up, I guess you’d be transparent and honest about it, just like a good restaurant would tell its patrons that all the desserts listed on the menu come from a specific patisserie shop. When it’s not even mentioned, yet the app offers some MCP features from the get-go, colour me suspicious.

When I look at the feature set of AnnexEdit, it should excite me: it’s a text editor after all. I should be curious to download the app, and give it a go. Who knows, it could probably end up listed in one of my future blog posts about cool apps or whatever. But my suspicions and unease, whether they are deserved or not, are preventing me from doing this. The features are there, but I can’t detect the craft.

It seems like software will never feel the same. Now, I wonder: will I ever be able to give a chance to a new app again? I fear that the magic and my software innocence are gone.

*1 Stole a joke from Stephen Colbert here.
*2 Question: if AI-generated content cannot be copyrighted, does it mean that an AI-generated app cannot be sold?

0 views

Thoughts on starting new projects with LLM agents

A few months ago I wrote about using LLM agents to help restructure one of my Python projects. It's worth beginning by saying that the rewrite has been successful by all reasonable measures; I've been able to continue maintaining that project since then without an issue.

In this post, I want to discuss another project I've recently completed with significant help from agents: watgo. In this project many things are different; most notably, it's a from-scratch project rather than a rewrite, and it uses a different programming language (Go). This post describes my experience working on the project, and some lessons learned along the way.

This is a new project, so it required extensive design. I began by iterating on the design with the agent, with a sketch of the API. For this purpose, I recommend using a Markdown file committed into the repository for future reference.

After that, I started asking the agent to write CLs [1] in a logical order that made sense to me, keeping them small and reviewable (more on this in the next section). Sometimes it's not easy to have a small CL, and multiple rounds of revision may confuse the agent; in this case, I commit the CL and then go back and ask the agent to modify or refactor the code, as much as needed, with separate CLs. In the worst case, the whole sequence can be reverted if I feel we've taken the wrong direction (branches could also be helpful here for more complicated scenarios).

This point is worth reiterating: sometimes a single CL is a huge step forward, but requires lots of review, cleanup and refactoring to be viable. I've had multiple instances where an agent produced several days of work in a single CL, but I then spent hours instructing it to clean up and refactor. Overall, it's still a productivity gain, just not as much as some pundits would like us to believe.
Given the current state of agent capabilities, I think it's worth splitting projects into two categories: The watgo projects is a clear example of (2): I certainly intend to maintain this project in the long term, so I insist on code that I understand. With very few exceptions, no code gets in without full review and often multiple rounds of revisions. Even if the cost for writing code went down, maintaining a project is so much more than that. It's triaging and fixing bugs, it's thinking through what needs to be done rather than how to do it, it's keeping the code healthy over time, and so on. As Brian Kernighan said : Maybe at some point agents will become good enough that projects in category (2) can be implemented and maintained completely autonomously. Maybe. But we're certainly not there yet. My hunch is that getting there will require crossing the AGI line [2] , after which little in our world remains certain. If you're using an agent to send an actual PR and only review that , it's difficult to be disciplined enough to actually perform a thorough review. I find the following method to be more reliable: I use a CLI agent running locally in my repository, and ask it to update the code there. In parallel, I have a VSCode window open in the same project, where I can: Once I'm pleased with the change, I manually create a commit. As mentioned above, it's imperative to keep making progress in small chunks, with small enough CLs that a human can fully understand in a single review. It's very tempting to sprint ahead submitting thousands of lines of code every day, but this temptation has to be avoided. Coding with an agent is like speed-reading; yes, you're making more progress, but comprehension suffers the faster you go. Particularly for refactoring, agents still take the shortest route to destination. 
It's important to guide them to think about the "big picture" at all times, find all instances where X is better done as Y, not just a single place noticed during a review. This is why it's sometimes OK to have a CL submitted before you fully agree with everything, and go back to it later for several refactoring rounds. Source control works amazingly well when pair-coding with agents. It's a key point discussed in every "how to succeed with AI" article, but still critical enough to reiterate here: a solid testing strategy is absolutely crucial for success. Agents produce - by far - the best results when they have a solid test suite to test their code against. With the pycparser rewrite, I had a large existing test suite. For watgo , the very first thing I did was think through how to adapt the test suites of the WASM spec and of the wabt project for my needs. If your project doesn't have such tests to rely on, this should be your first order of business - finding one, or building one from scratch. Beware of self-reinforcing loops though; it's dangerous to trust agents for both the tests and the implementations tested against them. Go is a fantastic language for agents to write, because it's designed to be very readable by humans. The biggest strengths of Go are exactly what makes the experience of reviewing agent code so positive: Since most of the time spent by humans when using agents is reading rather than writing code, these effects compound and produce a great experience. Recall the discussion of how some languages are optimized for writability (Perl) while others are optimized for readability (Go)? Well, when working on a project with an agent we live in a world of 99% reading vs. 1% writing, so this really matters. I find this aspect really crucial in light of the earlier points made in this post - namely, keeping the human in the loop by understanding and reviewing all of the agent's design choices and code. 
If you're working on a subject that's completely new to you, I would strongly recommend against the approach described in this post. To really learn something, you have to work through it from scratch, yourself, reading, designing, writing the code. Agents don't change this basic fact; even before agents, if you wanted to learn X, copying it from Stack Overflow or some other project clearly wasn't the right way to go. Similarly, while agents can be used as a prop for learning, they cannot learn for you . As a corollary, junior engineers should exercise extreme caution when relying on LLMs. There's no replacement to hard-won experience and the sweat and tears of learning new, challenging topics. Learning is supposed to be hard; if it's too easy, you're probably not learning. For senior engineers, agents are a boon; it's a great tool to increase productivity, avoid the boring stuff, and get unstuck from procrastination; but only when used judiciously. Low importance / prototype / throw away projects where deep code understanding is unnecessary. These can be "vibe-coded" (submitting agent code without even reviewing it). High importance projects that I actually want to maintain; here, vibe-coding is ill advised and I insist on reviewing and guiding all code the agent writes before it's submitted (or shortly after, as discussed above). Review the agent's changes using VSCode's diff view Make my own tweaks and code changes if needed Go changes very infrequently, so you don't have to wonder "are we using the most modern / idiomatic approach" or "what the hell is this construct" as often as with other languages (looking at you, Python and TypeScript). There are relatively few ways to accomplish the same thing in Go, further lowering the mental burden. The standard library is rich and there's much less need to keep abreast of the package-everyone-uses du jour. 
In general, Go is designed for readability, with a mild-but-still-strong type system, uniform formatting, explicit error propagation and opinionated choices already made for you.

0 views
Jeff Quast Yesterday

Perfecting Terminal Character Width Using Correction Tables

Measuring terminal Unicode width has always been a difficult problem. Per-terminal correction tables make near-perfect accuracy possible. In past years, I have published a specification of Python wcwidth, and Kovid Goyal has published "The algorithm for splitting text into cells" as part of the Kitty Text Sizing protocol. Terminal emulator authors have agreed to disagree on many broad interpretations of Unicode standards, such as whether emoji should be supported at all, down to the very fine details of individual codepoints, categories, and grapheme widths.
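
As a rough illustration of the correction-table idea, the Go sketch below layers per-terminal overrides on top of a generic width function. Everything in it is an assumption made for the example: `baseWidth` is a deliberately crude stand-in for a real wcwidth implementation, the terminal name `exampleterm` and its table entry are made up, and the sketch measures individual codepoints rather than grapheme clusters.

```go
package main

import (
	"fmt"
	"unicode"
)

// baseWidth is a crude placeholder for a real wcwidth implementation:
// nonspacing marks take 0 cells, a few known-wide ranges take 2 cells,
// and everything else is assumed to take 1 cell.
func baseWidth(r rune) int {
	switch {
	case unicode.Is(unicode.Mn, r):
		return 0
	case r >= 0x1100 && r <= 0x115F, // Hangul Jamo initial consonants
		r >= 0x4E00 && r <= 0x9FFF, // CJK Unified Ideographs
		r >= 0x1F600 && r <= 0x1F64F: // Emoticons block
		return 2
	default:
		return 1
	}
}

// corrections maps a terminal name to the codepoints whose rendered width
// differs from baseWidth on that terminal. The entry here is hypothetical.
var corrections = map[string]map[rune]int{
	"exampleterm": {0x1F600: 1},
}

// runeWidth consults the terminal's correction table first and falls back
// to the generic width when no override exists.
func runeWidth(term string, r rune) int {
	if w, ok := corrections[term][r]; ok {
		return w
	}
	return baseWidth(r)
}

// stringWidth sums per-codepoint widths; it ignores grapheme clustering.
func stringWidth(term, s string) int {
	total := 0
	for _, r := range s {
		total += runeWidth(term, r)
	}
	return total
}

func main() {
	s := "a\U0001F600"
	fmt.Println(stringWidth("unknown", s), stringWidth("exampleterm", s))
}
```

The design choice worth noting is that the table only stores deviations, so an unknown terminal silently falls back to the generic measurement.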

0 views