
A few words on DS4

I didn’t expect DwarfStar 4 (https://github.com/antirez/ds4) to become so popular so fast. It is clear that there was a need for a single-model, integration-focused local AI experience, and that a few things happened together: the release of a quasi-frontier model that is large and fast enough to change the game of local inference, and the fact that it works extremely well with an extremely asymmetric quants recipe of 2/8 bit, so that 96 or 128GB of RAM are enough to run it. And, of course: all the experience produced by the local AI movement in recent years, which can be leveraged more promptly because of GPT 5.5 (otherwise you can’t build DS4 in one week, and even with all this help you need to know how to gently talk to LLMs).

The last week was fun but also tiring: I worked 14 hours per day on average. My normal average has been 4 to 6 since early Redis times, but the first few months of Redis were like that.

So, what’s next? Is this a project that starts and ends with DeepSeek v4 Flash? Nope: the model can change over time. The space will be occupied, in my vision, by the best current open weights model that is *practically fast* on a high end Mac or “GPU in a box” gear (like the DGX Spark and other similar setups). I bet that the next contender is DeepSeek v4 Flash itself, in the new checkpoint that will be released and, hopefully, in a version specifically tuned for coding, and, who knows, maybe other expert variants (not in the sense of MoE experts). For local inference, having ds4-coding, ds4-legal, ds4-medical models makes a lot of sense, after all: you just load what you need depending on the question.

It is the first time since I started playing with local inference (and I have played with it since the start) that I find myself using a local model for serious stuff that I would normally ask Claude / GPT. This, I think, is really a big thing. It is also the first time, using vector steering, that I can enjoy an experience where the LLM can be used with more freedom.
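To see why such an asymmetric recipe fits in 96 or 128GB of RAM, a back-of-envelope sketch helps. Everything below is an illustrative assumption, not DS4's actual parameter count or split:

```python
def quant_footprint_gb(params_billions, frac_2bit=0.9, frac_8bit=0.1,
                       overhead_gb=8):
    """Rough RAM needed for a mixed 2-bit/8-bit weight quantization.

    The 90/10 split and the fixed overhead (KV cache, activations,
    runtime) are illustrative guesses, not DS4's real recipe."""
    bits_per_param = frac_2bit * 2 + frac_8bit * 8    # blended bits per weight
    weight_gb = params_billions * bits_per_param / 8  # GB just for the weights
    return weight_gb + overhead_gb

# A hypothetical 300B-parameter model quantized this way:
print(quant_footprint_gb(300))  # about 105 GB, comfortably inside 128GB
```

Run the same arithmetic at 16 bits per weight and the same hypothetical model needs roughly 600GB, which is the whole point of the aggressive low-bit recipe.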
DeepSeek v4 Flash is really an impressive model, no doubt about that. If you imagine the good small local model experience as A, and the frontier model you use online as B, DS4 is a lot more B than A. I can’t wait for the new releases, honestly (btw, thank you DeepSeek).

So, after those chaotic first days, I hope the project will focus on: quality benchmarks, potentially a coding agent that is also part of the project, a hardware setup here in my home that can run the CI tests in order to ensure long term quality, more ports, and finally, but as a very important point, distributed inference (both serial and parallel).

For now, thank you for all the support: it was really appreciated :) AI is too critical to be just a provided service.


Fragments: May 14

Last week I spent a day at a retreat that brought together several people working in software development to talk about the profession’s future with the rise of agentic programming. The event was held under the Chatham House Rule , so I can’t attribute the comments and stories I heard. (If anyone recognizes themselves, and would like attribution, let me know.) Here are a few tidbits that made it into my notebook.

❄                ❄

One group developed a behavioral clone of the GNU Cobol compiler in Rust. The result is 70K lines of Rust and was built in 3 days. This is yet another sign of the ability of LLMs to do a good job of porting existing code to a new platform. Good regression tests are extremely valuable here (and I don’t know how good GNU Cobol’s are). There’s also the possibility of building a test suite if you have access to an existing implementation.

❄                ❄

Large spec documents can be complex for a human to review. One attendee shared the idea of getting the LLM to interview a human expert, asking the human questions to verify the correctness of the specification, a form of Interrogatory LLM .

❄                ❄

Not specifically about AI - but I liked how one attendee commented that the first thing they do when consulting with an organization is to read the guidelines for its change-control board. This is the scar tissue of what’s gone wrong in the past. I’ve often said that to understand why a thing is the way it is, you need to understand the history of how it got there. This seems like an excellent way to tap into important parts of that history.

❄                ❄

My colleagues who work on modernizing legacy systems have long been rather sniffy about “Lift and Shift” - porting a legacy system to a new platform while retaining Feature Parity . We see this pattern as a huge missed opportunity.
Often the old systems have bloated over time, with many features unused by users (50% according to a 2014 Standish Group report) and business processes that have evolved over time. Replacing these features is a waste. Instead, try to muster the energy to take a step back and understand what users currently need, and prioritize these needs against business outcomes and metrics. But this point of view was developed before LLMs’ ability to port code appeared. One attendee who does a lot of work in this field said they believed that lifting and shifting to a new platform should now always be the first step in a legacy migration. The cost is no longer as formidable as it used to be, and a better environment makes further evolution much cheaper. Just don’t stop there.

❄                ❄

Several attendees were from the financial industry, and thus were immersed in the problems of complex legacy environments coupled with regulatory controls and significant risk should software do something wrong with money. One of their issues is the complexity we run into when a financial product is offered in multiple jurisdictions, each with its own regulations to satisfy. There’s a lot of software complexity in deciding which jurisdiction applies, and picking the right set of rules at the right point of the workflow. The question here is whether the rapidity of agentic programming means that we can build individual, simpler systems for each jurisdiction. We would then use LLMs to ensure consistency between them, so that as the product rules change, each system reflects that in its own environment. A large part of software design is about identifying what is the same and what differs between various business contexts. Where things are the same, and need to be the same, we are rightly wary of duplicating code, since this increases the cost of updates and the danger of inconsistency. The interesting question is what role LLMs can play to give us new tools to tackle this.
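One way to picture the per-jurisdiction idea is a small sketch where the jurisdiction-specific rules sit behind a plain registry, so each rule set stays small and easy to compare. Every name, rate, and cap here is hypothetical, purely for illustration:

```python
from dataclasses import dataclass

@dataclass
class RuleSet:
    """One jurisdiction's regulatory parameters (hypothetical example)."""
    name: str
    max_fee_pct: float  # each jurisdiction caps fees differently

# The registry makes "which rules apply" an explicit, testable lookup
# instead of logic scattered through the workflow.
RULES = {
    "UK": RuleSet("UK", max_fee_pct=1.5),
    "DE": RuleSet("DE", max_fee_pct=2.0),
}

def quote_fee(jurisdiction: str, amount: float, fee_pct: float) -> float:
    """Apply a fee, clamped to the local regulatory cap."""
    rules = RULES[jurisdiction]  # picking the right rule set at the right step
    return amount * min(fee_pct, rules.max_fee_pct) / 100

print(quote_fee("UK", 1000, 2.0))  # capped at 1.5% -> 15.0
```

Each jurisdiction's `RuleSet` could then live in its own simple system, with an LLM checking that the parallel implementations stay consistent as product rules evolve.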
❄                ❄

As is usually the case in gatherings like this, folks were concerned about junior developers. When we work with The Genie, our value comes from good judgment - how do we teach that? This group did have one common tool - Pair Programming . One of the key benefits of pairing has always been skills transfer, and here an experienced agentic programmer can pass on their judgment for software design and how to use the genie to get there. And the junior will often have a trick or two to share too - that fresh pair of eyes is particularly valuable in the shift to our agentic future.

❄                ❄

Historically, we use computer systems to bring order to chaotic human processes. Is AI reversing that?

❄                ❄

So much software is involved in data transformation. Those records over there need to be consumed by these APIs over here, but there are differences in how the data is structured, often due to being in different Bounded Contexts , so we have to do some conversion. Agents are particularly adept at writing this kind of transformation code, which is often more tedious than we’d like.

❄                ❄

Chaos Engineering has become a valuable technique to improve resiliency, made famous by Netflix’s Chaos Monkey that randomly breaks live services to see how well the ecosystem reacts and recovers. What would a Chaos Monkey for AI look like? Would it deliberately introduce hallucinations into a pipeline to see if sensors were able to catch them?

❄                ❄                ❄                ❄                ❄

Back at my desk

There have been a bunch of questions about the article on Structured-Prompt-Driven Development (SPDD) that the authors answered in a Q&A section . One in particular caught my eye:

Have you considered having an agent do the prompt/spec review itself — not a human reviewing the Canvas, but an agent that reads the REASONS Canvas alongside the code diff and verifies alignment?
The reply talks about how there is an available command to do this, but there are downsides. In particular, one reason not to do this automatically is:

Letting humans learn. Review is also where humans learn from the AI’s choices — patterns, trade-offs, options they had not thought of. Cutting humans out speeds things up, but it blocks the long-term skill growth that SPDD is designed to protect. […] Once enough decision rules build up to give us real confidence, we may shift more of the review to the agent step by step — but the part where humans learn from the AI is something we plan to keep.

One of the ways we should judge the value of an AI tool is how much it helps us humans learn more about the world we inhabit and build.

❄                ❄                ❄                ❄                ❄

In some strange way I injured my elbow last week. No idea how; there was no event where I said “oh shit”. It just gradually started hurting and swelling. My life-long strategy to avoid sports injuries 1 had failed me. I applied ice and ibuprofen, and the swelling went down, but my range of motion got worse. I’m glad I learned to use a knife and fork in an English childhood, so I normally eat with my left hand. I noticed that the loss of range of motion occurred after I got home, when I started spending all day at the computer again. I might not use my elbow directly, but my right hand does a lot of typing and mousing. My desk setup is pretty ergonomic, with a good keyboard , a wrist rest for the mouse, and arm rests on my chair. But even so, did my computer use make my elbow worse once I got home? I can’t imagine not using the computer; for me writing has become an unstoppable habit. But maybe I should use this opportunity to explore voice input - after all, most people can speak faster than they can type. I tried this many years ago, when a colleague told me how good voice recognition was once it was trained to you.
I tried it, and indeed the voice recognition, even in those pre-AI days, was very good. But it didn’t work for me. When I’m writing I rapidly type words into Emacs, but almost immediately I go back to edit them. Write two sentences, edit them, write another, re-edit the paragraph. The back-and-forth between seeing my words and thinking about them is tight - I can’t just dictate my words. That made me reflect further. I only started using a computer for my writing in my 20s. At school I had to write longhand, and at university I typed on a typewriter. But those media don’t support the constant rewriting that I do now. Would I even have become a writer had the text editor not been invented?

❄                ❄                ❄                ❄                ❄

James Pritchard thinks that many developers are over-using agents at run-time in their products, when LLMs are better used as functions .

The problem with agents isn’t that they don’t work. It’s that they work unpredictably. You trade a known execution path for “autonomy” that mostly means “I don’t know what it’s going to do.” When an agent-powered feature breaks in production, you’re debugging a conversation transcript, not a stack trace. Most “agent” use cases are actually workflows, a known sequence of steps where one or two of those steps happen to involve an LLM. You don’t need autonomy for that. You need a function call.

He points out that functions compose predictably, so if you know the workflow, then composing it in program text is better than having agents figure out how to coordinate themselves. It’s faster, and uses fewer tokens. It’s usually easier to deal with failures, since the scope of the interaction is smaller.

❄                ❄                ❄                ❄                ❄

Pritchard also thinks that people use skills far more than they should .
He thinks people accumulate folders of markdown skill files, but LLMs use them inconsistently, often missing them when they’re needed, or bloating context when they’re not. Many things that people put in skills should instead be other parts of a harness , preferably computational. Skills should only be used for deliberate, infrequent workflows.

The skills obsession is a symptom of a deeper pattern: people reaching for configuration when they should be reaching for architecture. “The LLM doesn’t write good tests.” Don’t write a testing skill. Are your existing tests inconsistent? Is the test setup complex? Fix those things and it’ll write good tests without being told how. Point it at a test file you’re proud of. Code is clearer than English. The best setup is one where you barely need to configure the LLM at all. A clean codebase with clear patterns, a short project config for the non-obvious stuff, hooks for automation, and maybe one or two skills for specific workflows you run intentionally. That’s it.

❄                ❄                ❄                ❄                ❄

An oft-stated point about the rise of agentic programming is that we have to start dealing with non-determinism in our work. Of course that’s somewhat of a simplification, because some aspects of software development have long had to face non-determinism. A notable example of this is distributed systems, and a notable figure in helping us probe the truly uncomfortable waters of distributed systems is Kyle Kingsbury (Aphyr). Last month he dropped a long article (the pdf is 32 pages) on how he sees our LLM-enabled future. The title “ The Future of Everything is Lies, I Guess ” betrays his lack of enthusiasm for this future.

Some readers are undoubtedly upset that I have not devoted more space to the wonders of machine learning—how amazing LLMs are at code generation, how incredible it is that Suno can turn hummed melodies into polished songs.
But this is not an article about how fast or convenient it is to drive a car. We all know cars are fast. I am trying to ask what will happen to the shape of cities.

It’s worth the long read, even if it isn’t terribly cheerful. Kingsbury brings up many worries about AI’s growth from the perspective of someone who is clearly well-informed about its capabilities. His view is that the best response to all this is that we should stop. He wants to avoid using AIs for his writing, software, or personal life. He thinks those working for the AI companies should quit. And yet he also knows that these tools are useful, and wants to use them.

I’m both a hoper and a doomer when it comes to our AI future. Fundamentally I see any powerful technology as a big bus: we are either on it, or get run over by it. I’m onboard the bus because I don’t think putting up some barriers would stop me being crushed by its wheels. Maybe if I’m on the bus I can join some people to influence the driver a bit. I’m also very reluctant to speculate on the future outcomes of anything, let alone something as powerful as this. Did the early industrialists in the late eighteenth century have any clue what the industrial revolution they unleashed would do? While it created many harms, it also created a massive rise in the living standards of millions of people, at least those whose countries were on the bus. AI may create benefits that I can’t really dream of, although I can glimpse it when it helps a friend stave off Parkinson’s disease. Those hopes are there, but Kingsbury’s article shines a light on the darker elements of the here-and-now, asking serious questions of responsibility:

a part of my work as a moderator of a Mastodon instance is to respond to user reports, and occasionally those reports are for CSAM, and I am legally obligated to review and submit that content to the NCMEC. I do not want to see these images, and I really wish I could unsee them.
On dark mornings, when I sit down at my computer and find a moderation report for AI-generated images of sexual assault, I sometimes wish that the engineers working at OpenAI etc. had to see these images too. Perhaps it would make them reflect on the technology they are ushering into the world, and how “alignment” is working out in practice.

Don’t do sports  ↩


Do You Take Ethics Into Account When Buying Video Games?

Something terrible almost happened. I almost bought a ModRetro Chromatic retro handheld device, until someone pointed me towards Natalie’s Don’t Buy from ModRetro post, which outlined how you’re indirectly supporting a war:

[…] I am against children dying […] Also, there is evidence that this was done by the US, most likely through bad information from AI.

We’ve also seen this month that AI companies like OpenAI and Anthropic are largely entangled in the US empire’s war machine. You know another company that exclusively mixes AI and weapons? Anduril. Anduril’s co-founder is Palmer Luckey, who is the creator of ModRetro. ModRetro has released a Game Boy Color FPGA emulator, the Chromatic. More toxic and very questionable evidence is provided by Natalie.

But how could I have known all this? I accidentally discovered the existence of the Chromatic through a YouTube video on modern retro-inspired handhelds (that included my own Analogue Pocket). I was shocked to discover this—not only because of Luckey’s actions, but also because of how easy it is to willingly or unwillingly ignore all this and buy a Chromatic anyway.

This raises the question: should we be actively researching the ethics behind every product we want to buy? I think the answer to that is yes in theory . In practice, as Roy Baumeister taught us about the workings of willpower, pouring energy into this means having less energy in reserve for other more pressing matters happening in your immediate vicinity, such as your family. In practice, thoroughly researching something—especially ethics, which isn’t as easy to find as technical features—requires willpower I don’t always have available in abundance. This is where the government should step up by providing regulations to prevent such shady products from entering the market in the first place.
We all know how that turned out… Do you ever wonder why the ethically sound chocolate bars are put on lower shelves (or in entirely different aisles) while the cheap and established brands promote their bars all over the place? Whoops, slave labour still exists; did you know you’re supporting it through the purchase of a stupid chocolate bar? Why are the organic locally-grown apples put somewhere else besides next to the other apples? No wait, why are the other apples there in the first place?

Sadly enough, there are ample publicly leaked examples of ethically questionable behaviour by video game developers, some of which I only found out about after playing their games. A few examples then:

The most obvious example without a doubt is Activision Blizzard’s many abuses of their employees. They lost nearly a billion in market value thanks to a discrimination lawsuit . More lawsuits two years later were “settled” (read: bribed). The stock prices tanked and Microsoft bought them, resulting in a huge payday for the exact executives that were under fire. That’s irony for you.

Some developers are very vocal on social media about their extreme-right, transphobic, and/or homophobic beliefs. I don’t know what goes on in their stupid heads, as this obviously damages your reputation and game sales. At least, you’d think. Apparently, it doesn’t damage them enough? Voidpoint, the makers of Ion Fury , are one example of this . It’s so sad to read, as I really enjoyed that game and feel very conflicted about it now.

The lead developer of Pizza Tower apparently left an offensive joke in some private Discord channel that was of course screenshotted and much later discovered by (or explicitly sent to?) the angry Reddit mob. He later apologised, but I wonder: is this a case of extremism on the defensive side?
Is this a recurring theme in the indie development scene because the teams are small, and their edgy jokes that mean no harm, which would otherwise be filtered out by a huge HR department, are easily misinterpreted? Or not? There are more examples to be found, but you get the gist.

The problem is not limited to video games. I was recently shopping around for a new terminal emulator after growing tired of iTerm2’s blatant genAI feature adoption. Apparently, the developer of Kitty adopts a toxic stance, telling some of their users to “go soak your head” if he disagrees with their statement. I do understand that it’s tiresome to reject silly feature request after request, but that doesn’t mean you have to resort to an aggressive stance. But again, how would you know? I didn’t, until I found out about it in some random blog post. Should I uninstall Kitty now? Or what about JK Rowling’s crazy public transphobic outings? What if you read about that in the news after you read all the Harry Potter books and loved them? Would you burn them and vow to never read or watch related material? Or just shrug?

Another question might be this: does the maker’s preference for vices instead of virtues affect my opinion on the made product? I love Pizza Tower —it’s in my Top 25 Games of All Time, although that might be recency bias talking. I’m typing this on a MacBook instead of a Framework laptop. God knows how the materials of this Apple laptop are mined (and would that differ for another one?). We buy lots of stuff that carries the label “made in PRC” that might or might not be ethically bad. It’s all just one big question mark. Why are so many companies opaque about their ethics? (I think the answer begins with the letters C-A-P-I…) There should be a community-based filter for this. And there is, it’s called “asking around”, but that method is far from perfect.
I wish companies would be more open about their ethics—and not in a meaningless code of conduct letter written by the legal department. Perhaps then the honesty and peer pressure around it might push them to behave. Related topics: / ethics / By Wouter Groeneveld on 14 May 2026.  Reply via email .


Freedom from unreal loyalties

In the work against war, Woolf notes that women—unlike many of their brothers—have four great but perhaps misunderstood teachers:

And those teachers, biography indicates, obliquely, and indirectly, but emphatically and indisputably none the less, were poverty, chastity, derision, and—but what word covers “lack of rights and privileges?” Shall we press that old word “freedom” once more into service? “Freedom from unreal loyalties,” then, was the fourth of their teachers; that freedom from loyalty to old schools, old colleges, old churches, old ceremonies, old countries which all these women enjoyed, and which, to a great extent, we still enjoy by the law and custom of England. We have no time to coin new words, greatly though the language is in need of them. Let “freedom from unreal loyalties” then stand as the fourth great teacher of the daughters of educated men.

Woolf, Three Guineas , page 267

These are strange teachers. We may be forgiven for not seeing them as such when they’ve visited us. Woolf continues:

By poverty is meant enough to live upon: That is, you must earn enough to be independent of any other human being and to buy that modicum of health, leisure, knowledge and so on that is needed for the full development of body and mind. But no more. Not a penny more. By chastity is meant that when you have made enough to live on by your profession you must refuse to sell your brain for the sake of money. That is you must cease to practice your profession, or practice it for the sake of research and experiment; or, if you are an artist, for the sake of the art; or give the knowledge acquired professionally to those who need it for nothing. By derision—a bad word, but once again, the English language is much in need of new words—is meant that you must refuse all methods of advertising merit, and hold that ridicule, obscurity, and censure are preferable, for psychological reasons, to fame and praise.
Directly badges, orders, or degrees are offered, fling them back in the giver’s face. By freedom from unreal loyalties is meant that you must rid yourself of pride and nationality in the first place; also, of religious pride, college pride, school pride, family pride, sex pride, and those unreal loyalties that spring from them. Directly the seducers come with their seductions to bribe you into captivity, tear up the parchments; refuse to fill up the forms.

Woolf, Three Guineas , page 270

Woolf is echoing what we already know of wealth, fame, and loyalty—namely, that they encourage possessiveness and defensiveness, that they drive us to the violent defense of prestige and power, and that on that road lies war . We see this possessiveness and defensiveness in the whingeing insecurity of the leaders declaiming DEI; in the boss who insists his workers flatter his every decision, however foolish and arbitrary; in the patriarch who demands obedience from his wife and children; in the man who beats his partner when she tries to leave. (The most dangerous time for a woman in an abusive relationship is always when she is trying to leave.) Woolf, again: “the public and the private worlds are inseparably connected…the tyrannies and servilities of the one are the tyrannies and servilities of the other.” 1 If we are to prevent war in our public worlds, then we must also root it out in the private. And we must root it out among ourselves. For we are no more immune to the appeal of tyranny than anyone else:

And the facts which we have just extracted from biography seem to prove that the professions have a certain undeniable effect upon the professors. They make the people who practice them possessive, jealous of any infringement on their rights, and highly combative if anyone dares dispute them. Are we not right then in thinking that if we enter the same professions we shall acquire the same qualities? And do not such qualities lead to war?
Woolf, Three Guineas , page 249

In naming these teachers, Woolf transforms a proscription into a refusal. The lack of wealth becomes the refusal of it; the lack of fame, of prestige, of authority becomes the rejection of all those ugly and pernicious forces. (The one benefit of living in an era in which we are bombarded with the lives of the super wealthy is we cannot even for one moment forget that they are deranged.) By claiming that lack as a refusal, we release ourselves from longing for that which we can never have; we end a ravenous hunger that could never be sated. For had we great rank and great wealth and all the rest, we would be as eager for war as the warmongers, as miserable and unhappy as the billionaires. Without, we can see war for the horror it is; we can use our time and attention to imagine other worlds, and other roads to get there.

I think these teachers go by other names—frugality, integrity, humility, and solidarity, to name a few. Like the best teachers, they ask a lot of us. Perhaps too much on some days; we may not always be able to hear them, especially through the din of the war drums and the noise of the platforms and the very real fear of precarity that screams ever so loudly in our ears. But I think perhaps that if we make an effort to listen, we will find that they still have much to teach us, that we still have much to learn.

Woolf, Three Guineas , page 364  ↩︎

View this post on the web , subscribe to the newsletter , or reply via email .


Pedro the Vast

Pedro is vast, but he is also hidden and mysterious, tucked behind locked doors and a colloquy of priests and doctors. A eucalyptus farm worker, he and several of his fellows fall suddenly ill from a strange fungal disease. None of the others survive, but Pedro slips into a coma and then, miraculously, awakes. His survival brings the attention of a foreign mycologist and an enterprising priest who reckons him a prophet, while his children are left to fend for themselves. Pedro, meanwhile, continues to lurk and rant, his words making little sense, his body succumbing to decay. His story haunts the lives of everyone else trying to survive amid the ruins, waiting—expecting—something to change.


Bliki: Interrogatory LLM

When we need an LLM to perform a complex task, we often need to feed it a lot of context. Coming up with a design for a new feature requires descriptions of how we want the feature to appear to the user, guidelines on how it should be implemented, information on external systems to consult, and so on. All this can be several pages of markdown. The obvious way to do this is for a human to write this context, but an alternative is to use an LLM to write this context after interviewing a human.

The way I do this is to prompt the LLM to interrogate me. It should ask me all the questions it needs to create the appropriate context. I can feed it much of the information it needs, and tell it other sources to consult if it can't figure those out itself. Once it's done, it can then create the context report for another session (perhaps with another model) to carry out the next step. I first saw a decent description of this approach in Harper Reed's blog . A striking element of his approach is insisting that the LLM ask only one question at a time. (When I tried it, I found it needed to be frequently reminded of this.)

Another way to use an interrogatory LLM is to give it a document, such as a software specification, that captures knowledge about a domain - and then ask the LLM to interview a human expert to determine if the document is accurate. This is an alternative to getting the human expert to read the document to review it. People often find reviewing hard, so a conversation with an LLM might be more fruitful, particularly if the document isn't well-written. Naturally we can use both of these, using one interrogatory LLM to build a document, then using other interrogatory LLMs to review it with other experts.

The above is getting an LLM to create or assess context for a particular use of an LLM. But the technique is more broadly applicable. I'm a natural writer, someone who finds the process of writing an essential part of thinking.
To really understand something, I need to write about it. But different people are different. Many folks find writing hard, often very hard. This can be a real problem when we need to get information out of someone's head into a form that other humans can consume. Maybe such people would find it easier to ask an LLM to interview them than to write a document themselves. Certainly the result will have that tang of AI-writing that folks like me shudder at - but that's better than not having the information at all, whether due to rushed writing or no writing.
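The interview loop described earlier can be sketched as a small harness. This is a toy illustration, not Harper Reed's actual setup: `ask_model` stands in for any real LLM call, and the scripted "model" and "human" below exist only to show the one-question-at-a-time shape.

```python
def interview(ask_model, answer_source, max_questions=20):
    """Drive an interrogatory-LLM loop: the model asks one question at a
    time, the human answers, and the transcript becomes context for a
    later session.

    ask_model(transcript) -> next question, or "" when it has enough.
    answer_source(question) -> the human's answer.
    Any real LLM client can be plugged in as ask_model; this harness
    itself is just an illustrative sketch."""
    system = ("Interview me to build a context document. "
              "Ask exactly ONE question at a time, then wait for my answer.")
    transcript = [("system", system)]
    for _ in range(max_questions):
        question = ask_model(transcript)
        if not question:  # model signals it has gathered enough context
            break
        answer = answer_source(question)
        transcript.append(("assistant", question))
        transcript.append(("user", answer))
    return transcript

# Toy run with a scripted "model" and "human":
questions = iter(["What is the feature?", "Which systems does it touch?", ""])
fake_model = lambda transcript: next(questions)
fake_human = lambda q: {"What is the feature?": "CSV export",
                        "Which systems does it touch?": "billing"}.get(q, "")
log = interview(fake_model, fake_human)
print(len(log))  # -> 5: the system turn plus two question/answer pairs
```

In practice the frequent-reminder problem shows up here too: a real harness might re-append the "one question at a time" instruction to the transcript on every turn.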


Note #735

Night in Shimbashi. Taken with an M11 and a 50mm Summilux. We will be back in a few weeks. HMU. Thank you for using RSS. I appreciate you. Email me

Stratechery Yesterday

An Interview with Ben Thompson at the MoffettNathanson Media, Internet & Communications Conference

An interview with me about the implications of the compute shortage on Aggregation Theory, consumer AI, and more.

iDiallo Yesterday

It's funny because it's true

I made a joke online. Based on Internet upvote points, it was pretty funny. OK, I didn't come up with the joke, but it was a perfectly timed reference.

A few days back, Cliff Stoll, of the Klein bottles, submitted a post on Hacker News titled: Rumors of my death are slightly exaggerated . First of all, I was surprised that Cliff frequents Hacker News. Second, I didn't know that he was supposed to be dead. And he wasn't, since he was writing to say that he isn't dead. Apparently, there was a post on Facebook that told his story and claimed that he had died in May of 2024. I was shocked to hear it. I've always admired Cliff and his work. But then again, he was the one who posted it on HN.

Large language models learn facts by scraping the web, including Facebook. So an LLM had ingested this information and regurgitated it as fact. Wikipedia also used the Facebook post as a reference for his death. AI hallucinations are getting ambitious.

A couple people recently emailed, asking whether the Klein bottle business was still operating after my death. “Huh?” I thought. “I ain’t dead yet.” After some digging, I discovered the source: an AI-generated review of The Cuckoo’s Egg circulating on Facebook. Alongside the usual synthetic praise and fabricated details, it confidently announced that I had died in May 2024. Apparently AI has now advanced to the point where it can kill people off before they notice. Mark Twain once wrote, “Reports of my death are greatly exaggerated.” I never expected to field-test the quote personally. source: [redacted to stop the spread] Cheers, -Cliff

It was a funny story. It reminded me of the story of Doc Daneeka in the book Catch-22. Doc Daneeka is an army doctor who is afraid of flying. He bribes other soldiers to add his name to the flight manifest so as to appear as if he had performed his mandatory flight time. One day his name is entered in the manifest, the plane takes off, then crashes.
The army checks the logs and sees that his name is in the logs. The army sends a generic message to his wife. I decided to respond in kind to Cliff’s post. Oh we already mailed the letter: "Dear Mrs., Mr., Miss, or Mr. And Mrs. Stoll Words cannot express the deep personal grief I experienced when your husband, son, father or brother was killed, wounded or reported missing in action" It was funny. I got many Internet points. But the last thing I expected was a response from Mr. Stoll himself: Oh my, but you know more than you can guess. About a year ago, my wife passed on. While deep in grief, I began receiving letters from financial institutions and banks that began, "Dear Mr/Ms Stoll, we offer our sincere condolences ..." How can a corporation have "sincere condolences"? They're the last place I'd go for comfort or sincerity. Everyday Catch-22 seems to become less and less absurd. 1984 is becoming a reality. Brave New World is the world we live in. These authors have become oracles. Somehow, the joke wasn't funny anymore. Because it was true.


Managed agents are the new Lambda

Managed agents (cloud-hosted agents) are the next big push from the frontier labs. They're genuinely incredible. They're also going to be the AWS Lambda of this cycle - powerful, sticky, and an absolute nightmare to migrate off once you're in deep. While the exact definition is up for debate, in my mind a managed agent is an agent harness (like Claude Code) running in the cloud, not on your local machine. This has a few major advantages. The most obvious is that you don't need a machine running locally - the agent can do its work 24/7, in the background. Another is that running in the cloud means it can be notified of changes and act on them. Imagine, for example, agents responding to incoming emails or webhooks and doing some activity based on them (this is very possible locally - but easier with the agent running on the server). A third advantage is security - probably the key part of the "managed" agent. Much like PaaS (platform-as-a-service) products such as Heroku, AWS ECS/App Runner/Lambda and Azure App Service/Functions, the provider manages not just the underlying physical infrastructure for you, but also patches the operating system and related server software on your behalf. Sandboxing is a related benefit: managed agents only get access to what you give them - no risk of an agent wandering into files it shouldn't. If you're already running Claude Code/Codex/OpenCode in Docker on a server, you've basically built one yourself. The frontier labs are just productising the pattern. Anthropic has really been pushing their managed agents product hard lately. This makes a lot of sense - cloud-hosted agents are genuinely incredible in what they can do - but I'd urge real caution on locking yourself into a vendor, at least at this point. Fundamentally, agents are not particularly difficult to swap out.
While there are important differences and nuances in how they work and operate, switching from Claude Code to Codex (or OpenCode, or Pi, or one of the many other agent harnesses) is a fairly simple process. Fundamentally the pattern is the same - run a harness with a prompt, context and tools, and capture output and logs. All agent harnesses have the same primitives. And at least having the ability to swap the agent harness and model out is really important. Clearly pricing is one important dimension, but equally so is being able to use new models from different labs. The competition is absolutely cutthroat and shows no sign of slowing down. Once you start using a managed agent product from a frontier lab, this gets far more difficult. A lot of your data and workflows are embedded in their cloud. While Anthropic have gone to lengths to say it is your data and it can be exported, in my many years of experience with vendor lock-in, such assurances drift, and it gets harder and harder to migrate to another provider. As many people found out with AWS, moving Docker container workloads between hyperscaler clouds is fairly easy. Moving AWS Lambda [1] functions is far, far more difficult - I've seen organisations spend months upon months unpicking Lambda code and assumptions when they realise it isn't a good fit after all the hype dies down. Yesterday Anthropic announced huge changes to their pricing model which underline this point. If you run Claude Code non-interactively (which includes nearly all cloud-hosted agent usage - and many others [2]), these sessions are no longer eligible for your subscription token allowances and will instead use a new credit allowance. After that allowance is exhausted, it's very expensive API tokens ahead. It's fair to say that if you were using a lot of "non-interactive" Claude Code, you are looking at a 5-20x price increase with these changes.
It's clearly Anthropic's prerogative to do this - and (I think) it points to their compute shortages more than anything - but it has given OpenAI a real opening for users to switch to Codex: OpenAI (currently, at least) have been very explicit that you can use your included plan allowances with any tool and however you like. Expect to see a lot more talk around Codex (which has already been gaining significant traction over the past few months) and other providers in the future - developers are often remarkably price-sensitive around things like this, especially for personal 'side projects' - which often end up informing enormous purchasing decisions in the companies they work in months and years down the line. [3] Now it's easy to say don't use a frontier lab's managed agent product, but what are the alternatives? I think there are two main ways you can solve this in your organisation. Firstly, roll your own managed infra. This is a good option for developers and tech-adjacent teams - they will have the expertise to do this. Essentially, it's just running a Docker container, which they do all day every day. Using something like OpenCode as a harness allows you to use any model provider and switch between them in minutes. Secondly, there's a flood of startups and other companies that allow you to run managed agents with any model or provider you want. I haven't (yet) evaluated them in detail - the market landscape is shifting too fast to give any real thoughts on quality - but providers include Cloudflare Agents, Vercel and the hyperscaler options (AWS AgentCore, Azure AI Foundry and GCP Vertex AI Agent Engine). My personal view is that until this shakes out a bit more, stick to self-hosting. It's not difficult, allows you to secure agents inside your current infrastructure, and builds organisational competence around agent primitives. Outsourcing this knowledge at this point is a path to serious organisational knowledge gaps.
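To illustrate how thin the harness layer can be, here is a minimal sketch of the "run a harness with a prompt, capture output" pattern. This is an illustration only: "echo-demo" is a stand-in command so the example runs anywhere, and the commented-out entries are hypothetical invocations, not any tool's real flags.

```python
import subprocess

# Map of harness name -> command template. Swapping vendors is a one-line
# change of the `harness` argument; the orchestration code never changes.
HARNESSES = {
    "echo-demo": ["echo"],              # stand-in so this sketch is runnable
    # "claude-code": ["claude", "-p"],  # hypothetical invocation
    # "opencode":    ["opencode", "run"],  # hypothetical invocation
}

def run_agent(harness: str, prompt: str) -> str:
    """Run a prompt through the chosen harness and capture its stdout."""
    result = subprocess.run(
        HARNESSES[harness] + [prompt],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout

print(run_agent("echo-demo", "summarise the open tickets"))
```

In a real setup you would also pass context files and tool configuration, but the shape stays the same: a subprocess you can point at any vendor's CLI.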
However, expect this to change as the platforms introduce more capabilities that become more and more difficult to replicate. One to keep an eye on. The one fly in the ointment is that I have a strong gut feeling the frontier labs are going to start introducing new models and capabilities that are only available on their managed agent platforms. This is where the pendulum (maybe) starts swinging towards having to use managed agents - but again, maybe not.

[1] Lambda is a way of running applications "serverless", which in theory allows much easier deployment and scaling - more of the primitives of application hosting are abstracted. However, it means you start really having to lean into AWS-specific code, techniques and patterns that can be really difficult to revert. ↩︎
[2] It also includes alternative frontends to Claude Code, like the excellent Conductor Mac app, despite this really being the definition of interactive usage. ↩︎
[3] This is why I really hope that Anthropic rethinks this at some point. ↩︎

Justin Duke Yesterday

Sweet Smell of Success

At its core, Sweet Smell of Success is about two men. At the beginning of the film, you think — while similar — one is decent, just desperate, and the other is beyond saving. By the end, you understand that both men are evil; the only thing separating them is the amount of power they wield. These two performances by Burt Lancaster and Tony Curtis are flatly terrific. There is little I can say, because I've concerned myself much more with the 60s and 70s than the 50s, and so I can't say much about how these roles are in conversation with the actors' prior oeuvre. But it is plainly clear that the screen bursts alive whenever either of them is talking. The rest of the film is a push-pull: a fairly standard and at times cartoonish melodrama — filled with an evil that feels more cartoonish than banal as each act progresses — rescued by the best window dressing in the world, and a whiplash script that finds entertainment and grace in its brief moments of joy. The director wrings a lot of tension out of how lovely every individual scene feels at the outset. Beautiful jazz soundtrack. Beautiful Manhattan nightclubs. Filmed and captured with just the right amount of realism. And then, the decrepit material disgust they're all wading through. I don't really go for morality tale movies at this point. While there's a certain world-weariness and hardscrabble wisdom to the proceedings here that might have been more winning with contemporary audiences, it's not exactly breaking news to me that owners of media corporations can be childish, petty, and controlling. Perhaps the fundamental flaw in my viewing of the film is that I think it hinges on a dwindling confidence that our protagonist is going to, at some point, snap out of it and do the right thing — even though it's so aggressively telegraphed that he won't. It seems odd to spend so much time criticizing a movie I thought was very good, so let me end with this: it is a smart, beautiful, honest movie that does not pull any punches.

Justin Duke Yesterday

Just aim the cannon correctly

James Shore has a post I found myself nodding along to until the very last step, where he loses me. The thesis is clean: your AI coding agent, the one you use to write code, needs to reduce your maintenance costs. Not by a little bit, either. Productivity over the long haul, he argues, isn't bounded by how fast you can produce code — it's bounded by maintenance cost, which compounds. Any coding agent that accelerates production without taming maintenance is, definitionally, a debt-laundering operation: it lets you skip the bill today and pay it forever afterward. Each individual claim is, I think, correct. Code is debt; maintenance compounds; an agent that bolts on features faster than your team can absorb them is, given a long enough horizon, an anti-productivity tool. The conclusion Shore stops just short of stating — that current LLM tooling is, on net, bad — is where I get off the train of thought.

A useful frame that I like across a variety of contexts is that of a difficulty score. The idea is straightforward: every recurring operation in your organization has some friction associated with it, and you can roughly approximate that friction with a back-of-the-envelope cost function. For something like opening a pull request, my version goes:

- +1 point per ten seconds of wall-clock time spent waiting on tests or CI
- +5 points per tool you have to context-switch into (docs, dashboards, terminals)
- +1 point per click
- +10 points per manual check you run before merging

For support, it might be:

- +5 points per click required to triage a ticket
- +10 points every time the answer isn't simply a link to documentation we already have
- +25 points every time you have to log into the user's account because the relevant data isn't surfaced to support
- ×1.25 per follow-up response from the user

The specific weights don't really matter. The point is that once you have a function, you can take a derivative — you identify the things that are Bad and then you start taking discrete steps towards reducing them, with the overall goal of getting the score down as low as possible. And LLMs are, in my experience, fantastic at bringing these scores down. The bulk of what I've spent my own LLM-augmented time on at Buttondown in 2026 has been on this axis — squeezing the inner loop, trimming dependencies, handing diagnostic and scoping work to agents in the background — and the return has been outsized.

Conversely, where I see LLMs deployed in the most deleterious manner — and where I think Shore's argument probably finds the most purchase — is when the relationship between the tool and the codebase is purely additive. LLMs are very, very good at adding features. They are also, more insidiously, very good at telling you that adding a feature is a great idea. (I have, on multiple occasions, tried to talk a coding agent out of building something. It is genuinely harder than the opposite, which I find both funny and faintly horrifying.) You can ask any coding agent whether it should build the thing you just described to it, and the answer will essentially always be yes, with concomitant action plans and bullet points that it will litter throughout the codebase.

The failure case I see in organizations that rhymes with Shore's point is along those lines. If scaffolding a feature took a week, you thought hard about whether to scaffold it. If it takes an afternoon, the answer skews toward yes, and yes, and yes, until you wake up one morning with sixteen new endpoints and no real idea why any of them exist, nor any graceful seams across them. But you can simply not use the tool that way, in much the same way you can simply not use Playwright as a full substitute for a testing suite. Here's a rough playbook:

1. Define the difficulty scores for the operations that matter.
2. Write them down somewhere your team can bikeshed (non-derogatory, of course) and flesh them out.
3. Triage the obvious low-hanging fruit (which is always much more than one assumes) against all the other work to do, biasing heavily towards this stuff because it's inner-loop.
4. Ship the improvements.

None of this is LLM-specific. (In general, a useful framing device for reading essays about LLM-assisted engineering is: how much of this rings differently if they're just talking about all engineering writ large?) The failure mode of letting maintenance debt accumulate without ever asking what specifically is making my life hard predates LLMs and will outlast them. What LLMs do is give people (and organizations) a fast-forward button: they let the organizations that were already losing to maintenance debt lose faster, and they let the organizations that have their score-keeping in order pull away further.

All of which is to say: I agree with Shore on the diagnosis. I just don't think the cure is to abandon the tools — it's to point them at the right operations, with eyes open about which ones, and to remember that adding code is the most expensive thing a coding tool can ever do for you.
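As a toy illustration of such a back-of-the-envelope cost function, here is a sketch in Python using the weights from the post. The function names and parameters are my own invention, purely for illustration.

```python
def pr_difficulty(wait_seconds=0, tools=0, clicks=0, manual_checks=0):
    """Friction score for opening a pull request, per the post's weights."""
    return (
        wait_seconds // 10        # +1 per ten seconds waiting on tests/CI
        + 5 * tools               # +5 per tool you context-switch into
        + clicks                  # +1 per click
        + 10 * manual_checks      # +10 per manual pre-merge check
    )

def support_difficulty(triage_clicks=0, non_doc_answers=0,
                       account_logins=0, followups=0):
    """Friction score for a support ticket, per the post's weights."""
    base = 5 * triage_clicks + 10 * non_doc_answers + 25 * account_logins
    return base * (1.25 ** followups)  # x1.25 per follow-up from the user

# Three minutes of CI, two dashboards, six clicks, one manual check:
print(pr_difficulty(wait_seconds=180, tools=2, clicks=6, manual_checks=1))  # 44
```

"Taking the derivative" then just means asking which term dominates the score and attacking that term first.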

Michael Lynch Yesterday

Refactoring English: Month 17

Hi, I’m Michael. I’m a software developer and founder of small, indie tech businesses. I’m currently working on a book called Refactoring English: Effective Writing for Software Developers. Every month, I publish a retrospective like this one to share how things are going with my book and my professional life overall. At the start of each month, I declare what I’d like to accomplish. Here’s how I did against those goals: I keep feeling like I’m close to done, but then I spend more time than I intend to on bug bounties. Revenue dropped for the book, as I haven’t done any marketing since March. Instead, I’ve been getting distracted by bug bounty hunting. I’m glad I’ve been able to skate by on past effort, but I see the numbers trending toward zero if I neglect marketing. For the past three months, I’ve been spending a lot of time using AI to find security vulnerabilities. I haven’t talked about it publicly because I didn’t want to attract competition to the limited supply of bug bounty programs. I wasn’t sure if other people realized just how effective AI is at security research, but I think the cat is out of the bag. If you haven’t been following along with AI and security research, Firefox is an astonishing case study. Throughout 2025 (before AI was any good at security research), Mozilla and external researchers collectively found 10-20 security vulnerabilities in Firefox each month. In February 2026, Anthropic used Claude Opus to find 22 Firefox vulnerabilities. In other words, that month, Anthropic alone found more than everyone else combined in any of the previous 13 months. Two months later, Anthropic used Claude Mythos to find a whopping 271 more vulnerabilities in Firefox. I sort of spotted this early, but I got it slightly wrong. Back in January, I thought that AI might be able to revolutionize cybersecurity research, but I thought the value was in creating security tools.
I was using AI to write fuzz testing tools and was amazed at how much faster I could perform fuzz testing than when I did it by hand. Despite the fact that I could write fuzzers 10-20x faster, it turned out that my strategy was way more work than was necessary. Instead of asking AI to create a fuzz testing tool and evaluate its output, you can just ask AI, “Hey, look at the source code and tell me all the vulnerabilities.” After I saw how good AI was at directly auditing source code, I stopped fuzzing and focused on source auditing. I’ve now reported 50+ bugs to five different bug bounty programs and earned about $10k in bug bounties. While I’ve successfully used AI to find security vulnerabilities, I’ve been less successful at finding companies willing to pay me for my findings. Here are my results so far (detailed vendor-by-vendor at the end of this post). The $10k from vendor 4 only took two weeks of part-time work. That would be a great return on investment had I not also spent 6+ weeks on bounty programs that paid nothing. It would also be great if I could find more vendors like vendor 4, but I don’t know how to do that. I’m now torn on how to allocate my time between the book and bug bounties. Here’s my thinking (the full pros and cons are listed at the end of this post): rationally, I have a hard time justifying why I should continue chasing bug bounties, but I do want to keep at it a little bit, maybe like a 70/30 split between the book and bug bounties. A third possibility is that instead of chasing bug bounties, I teach people what I’ve learned in the last few months about using AI to find security vulnerabilities. I’m thinking about offering a small, cohort-based course where we find bugs in open-source projects. We’ll pick projects with no bug bounty attached so that students can internally share findings without worrying about someone running off with their reward. The format will be some combination of live or recorded screencasts + a private group chat for 2-4 weeks. The course is not going to be about making money from bug bounties.
Maybe I’ll cover that some, but that won’t be the focus because that’s not what I’ve learned most about in the last three months. The course will be about using AI to find security vulnerabilities in large codebases. I’ll show the techniques I’ve learned for getting AI tools to focus on likely areas of bugs and avoid wasting time and tokens on bad leads. You can apply these lessons internally with your own closed-source code or on open-source projects you want to help secure. If you’re interested, sign up for my interest list below: A few weeks ago, I saw a question on reddit from someone who wanted to delete their Facebook account but capture an archive of their data in a usable format. That reminded me of a project I’d seen on Hacker News but never explored much, called Timelinize. Timelinize lets you import data you exported from Facebook, Google, Twitter, and similar services, and the app creates a unified timeline to explore your data. The creator is Matt Holt, who also created Caddy, the popular web server and reverse proxy. Timelinize still feels pretty alpha-stage, and I had to add a bunch of local patches to make it usable, but I like where it’s going. I plan to upstream more of my patches as I use it more. Whenever I find a local, offline solution for something that previously required a cloud service, it feels oddly refreshing. When I switched from streaming services to Jellyfin, I was surprised at how different it felt to just watch what I’m watching without a company watching me back for ways to squeeze money from me. The weird thing was, when I watched Netflix or HBO, I never consciously thought, “Oh no! I’m being monitored.” But when I started exclusively watching TV and movies locally, it was as if I had spent so much time in an office cubicle that I had forgotten there was an outside at all. Then, I went outside and enjoyed fresh air and sunshine. Metaphorically, I mean. Literally, I was still sitting inside watching TV on my computer.
But it was so much faster and freer than before! I had a similar “breathing fresh air” experience with Timelinize. Timelinize’s interface is user-oriented, which makes me realize just how user-hostile the interfaces of cloud platforms are. Facebook and Twitter don’t want you to just scroll through your old messages because that doesn’t make money for them. To discourage you from reading your old messages, they make the experience subtly uncomfortable: they squeeze the conversation into a tiny box, they force you to stop and wait for new messages to load every few seconds, and they constantly show you distracting notifications to lead you back to the new content they can monetize. With Timelinize, the reading experience is designed to let me just read my archive. There’s nothing trying to steal my focus and lure me into checking out what’s new, because Timelinize shows a historical snapshot. I find it fun to jump to a date 10 years ago and read what my conversations were at the time. The Timelinize interface lets you read your conversations without trying to steal your focus with notifications. I didn’t follow React2Shell at the time, but it was a critical vulnerability in React.js that allowed an attacker to gain code execution on many React.js and Next.js apps. Last week, the two researchers who discovered React2Shell wrote about what happened behind the scenes: Lachlan’s post got more attention, but I found Sylvie’s more interesting, especially given that she was a 20-year-old college student at the time. Lachlan and Sylvie both realized they’d found a “nuclear bomb” that affected hundreds or thousands of major websites. After reporting the bug to Meta (who maintains React) and Vercel (who maintains Next.js), they wanted to identify other bug bounty programs that would pay them for their work on this massive bug. The researchers couldn’t disclose the bug to the other vendors until Meta publicly announced the security advisory.
The problem was that once React2Shell was public, Lachlan and Sylvie would lose their edge over everyone else rushing to claim the same bounties. To get a head start, Sylvie scouted bug bounty programs during the blackout period and checked whether those vendors’ sites were vulnerable to React2Shell. That way, as soon as Meta announced the vulnerability, Sylvie and Lachlan could claim these third-party bounties. A further problem was that before React2Shell became public, Vercel created a filter for the bug in their web application firewall (WAF) that would protect Vercel customers even if the customer sites were running vulnerable versions of React or Next.js. Meta and Vercel also worked with Cloudflare and similar WAF platforms to teach them how to filter React2Shell attacks. So, after Meta announced React2Shell, Sylvie tried reproducing the bug against the sites she scouted ahead of time, but the bug didn’t trigger. Almost all of the sites with bug bounties were on Cloudflare or Vercel, so the WAFs blocked Sylvie’s exploit. Now Lachlan and Sylvie had to figure out how to sneak their exploit past Cloudflare’s and Vercel’s WAFs to trigger React2Shell, but bypassing enterprise WAFs is a massive research project in itself. Fortunately, Sylvie found a bypass in Cloudflare’s WAF and five distinct bypasses in Vercel’s. Interestingly, the “vast majority” of what Sylvie earned came not from React2Shell itself but from the WAF bypasses, as Vercel paid $50k per reported bypass.

If you’re interested in learning about using AI to find security vulnerabilities in your team’s code, sign up for my interest list. If there’s enough interest, I’ll put together a course.

In summary: I’m torn between focusing on my book and pursuing security bug bounties, and I’m considering a course to teach what I’ve learned about using AI to find security vulnerabilities.

Goal result: I’ve still got about 1-2 weeks of writing left.

Bug bounty results by vendor:

- Vendor 1 (Meta): I submitted eight reports, including one remote code execution bug. I received no response for several weeks. I found email addresses for developers that worked on the product and pinged them, and they escalated my reports to get them past triage, but there’s been no movement since then (two weeks and counting).
- Vendor 2: I submitted one report. The vendor triaged it in one business day but said it would be several weeks before they could investigate thoroughly. I haven’t heard anything in over 30 days.
- Vendor 3: I submitted one report. The vendor claimed it was a duplicate, so no bounty.
- Vendor 4: I’ve submitted 40ish reports. Eight were paid after two weeks for a total of $9,700. Two were rejected as duplicates. The remaining are all awaiting triage, though the most valuable ones were in the first eight that received payouts.
- Vendor 5 (Firedancer, a crypto project): I found a few medium-severity issues. When I started the bounty reporting process, I realized that they require researchers to upload their passport to a service I’ve never heard of, so I stopped there. Their program rules are also sketchy in that they seem to contradict the rules of the bounty platform they’re using.

Focus on my book:

- Pro: The book is nearly done, so if I focus on finishing, it will be complete and more valuable than a partially-finished book.
- Pro: The book is something only I can create, whereas lots of people can participate in bug bounties.
- Pro: I’m already late on delivering the book, so finishing it makes me feel less guilty about making readers wait.
- Pro: I can talk publicly about my book, and not only does it help me think out loud, it helps new readers discover the book.
- Con: The expected value of the book feels lower than bounty hunting, at least in the short-term. In theory, I could find a $100k bug next week, whereas it’s unlikely I could do anything that would drive $100k in book sales by next week.

Focus on bug bounties:

- Pro: I made more in two weeks of bug bounties than I did in all of 2025 on my book.
- Pro: There’s still a massive amount of undiscovered, bounty-paying bugs that AI tools can find.
- Pro: If I pause for a few months, the value of the remaining bugs will be significantly lower, as many other researchers will have claimed the easy-to-find bugs.
- Con: Participating in bug bounties is frustrating, as you have no leverage. The vendor can completely lowball or shaft you, and you have no recourse or negotiating power unless you sell the exploit to buyers who want to use it for nefarious purposes.
- Con: Bug bounty hunting is addictive like gambling in that there are variable rewards that appear semi-randomly.
- Con: Bug bounties push me back into bad AI usage habits. If I have an AI agent searching for bugs in the background, I constantly want to check on its progress and redirect it based on early results.
- Con: I’m much more limited in what I can share publicly about my work, both because bounty programs often require it and because I don’t want to attract competition to the same places where I’m focusing effort.

Further reading:

- “The React2Shell Story” by Lachlan Davidson, the lead researcher on finding the vulnerability.
- “The React2Shell Story and What Happened Next.js” by Sylvie Mayer, who assisted Lachlan in exploring the vulnerability, notifying vendors, and identifying bug bounty programs that would pay for the vulnerability.

What I accomplished this month:

- Published new chapters: “Improve Your Writing with AI” and “Meet the Reader Where They Are”
- Held a live session with readers about using AI to improve writing
- Reported a bunch of security bugs through bug bounty programs

Lessons learned:

- It’s better for me long-term to focus on my book than on security bug bounties. The hard part is that bug bounties are so short-term rewarding, whereas the rewards for the book typically lag my investments by at least a month.
- It’s unexpectedly satisfying to migrate from a cloud service to something you run locally.

Goals for next month:

- Get Refactoring English to “content complete.”
- Create a tool that allows Refactoring English readers to give feedback as they read the book.

Early interest list - Using AI to find security vulnerabilities


Flat/Non-higher-order construction of ANF via mutable holes

This post will describe an algorithm for constructing an ANF-like IR from a functional-like base language, using no closures and incurring no related excess space requirements. This algorithm is, as far as I know, not new, and has apparently been known in imperative compilation for a while. However, I admittedly have no source for this, nor have I seen it described for functional-like languages. Thanks to Ryan Brewer and rpjohnst for their inspiration on this. If you are only interested in code, please skip to Implementation. First, let us describe our input and output languages. Our input language will look like this: This is meant to echo some simple functional programming language, and contains all the elements needed to demonstrate the algorithm. Our output language will look like a lexically-scoped flat ANF-style construction, with all lets at the toplevel. The goal is to convert from our input to this language. The notable addition is the constructor in , the purpose of which will be described now. Imagine we are trying to convert the term When we arrive at the body of , we have a conundrum. We need to put the conversion of before , so that using is well-scoped. Hence, we must convert before we can continue. How do we use this conversion, though? 's content needs to be put in the field of 's , so it's not clear how to both convert before and to use 's finished conversion in . One "traditional" method of solving this is via continuations, where is passed a closure representing "what to do next", after it converts its body. We can then put 's content in that closure, which will be invoked later. This has two main flaws (listed at the end of this post). The method described here resolves both. Recall that we left a field in our type. I will denote a hole like this: . The central idea is that our conversion of any given fragment will return three things (also listed at the end of this post). Consider this simplified example: When we convert , this signals for the conversion of .
There are two steps to converting a - we must first convert the "head" ( ), and then the "body" ( ). Converting the will result in something of form Note that the hole is left to be filled later. This way, the conversion of 's head need know nothing about its environment. We represent this in code via the constructor, which contains a reference to a fragment that either may be empty ( ) or filled ( ). The conversion of the body can then proceed by first generating the expected addition, binding it to so it may be referenced later: We now need to get this fragment into the proper lexical scope. We can do this by plugging it into the hole left by the conversion of the head: Then, returning this fragment, the name bound ( ), and the reference to the inner hole, we can continue converting ; First, convert the head, using the name returned by the conversion of : Then, the body: And finally plug it into the hole returned by . We are left with a final hole we can plug with to be explicit about the end of a control flow path. One disadvantage of this strategy is that it does generate many unneeded names. Most of these can be avoided by carefully special casing certain , and the rest can be eliminated via a simple folding pass. Below is a sample implementation of this algorithm, in OCaml. I recommend toying with the conversion yourself, using or similar. In the following, I will "clean up" the output slightly, as to not make it unreadable. If we use our first example term: We can then plug the returned hole, as described. Via some inspection, we can verify this is indeed correct, and conforms to our ANF-like form, as expected. As mentioned earlier, the large amount of duplicate names can be mitigated through strategies such as special casing and folding passes. Clearly, the algorithm does not use closures, or any indirect jumps. It does not allocate any more than needed for the representation of the output program; there is no excessive closure allocation. 
I do not currently have a good way to test performance on real-world large inputs, but I have no reason to believe it would not perform better than an equivalent higher-order version. I believe this method should be extendable to CPS as well, although I have not tried as of the moment. The "usual" higher-order CPS conversion described in e.g. "Compiling with continuations, continued" looks quite similar to the higher-order presentation of ANF conversion, so I would be interested in whether it can be similarly massaged into a mutable form.

The two flaws of the continuation-based method, referenced earlier:
- It uses excess space. A conversion of a term will allocate ~2x the memory "needed"; it must allocate for the ANF construction itself, and for the closures needed during the construction.
- It is slower, due to the aforementioned closure allocation and application.

What the conversion of any given fragment returns, as referenced earlier:
- The fragment itself.
- A name we can use to refer to what was bound in the fragment (in the conversion of addition/similar with nested content, names must be conjured).
- The reference created by the hole, so that it may be filled.
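To make the hole-plugging mechanics concrete, here is a small runnable sketch in Python. The post's implementation is in OCaml; this is only a translation of the idea, and every name below (Num, Add, Let, ALet, convert) is mine, not the post's:

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional

# Input language (invented for illustration): numbers, variables,
# addition, and let-bindings.
@dataclass
class Num:
    n: int

@dataclass
class Var:
    x: str

@dataclass
class Add:
    l: object
    r: object

@dataclass
class Let:
    x: str
    head: object
    body: object

# Output: a chain of ANF lets. `body` starts as None -- that is the hole.
@dataclass
class ALet:
    name: str
    rhs: tuple                     # ("num", n) | ("var", x) | ("add", a, b)
    body: Optional["ALet"] = None  # the mutable hole

fresh = (f"t{i}" for i in count())

def convert(e, env):
    """Return (root, name, hole): the converted fragment, the name its
    result is bound to, and the ALet whose body is still unfilled,
    i.e. where the *next* fragment must be plugged."""
    if isinstance(e, Num):
        node = ALet(next(fresh), ("num", e.n))
        return node, node.name, node
    if isinstance(e, Var):
        node = ALet(next(fresh), ("var", env[e.x]))
        return node, node.name, node
    if isinstance(e, Add):
        lroot, lname, lhole = convert(e.l, env)
        rroot, rname, rhole = convert(e.r, env)
        lhole.body = rroot            # plug right operand after left
        node = ALet(next(fresh), ("add", lname, rname))
        rhole.body = node             # plug the add itself last
        return lroot, node.name, node
    if isinstance(e, Let):
        hroot, hname, hhole = convert(e.head, env)
        broot, bname, bhole = convert(e.body, {**env, e.x: hname})
        hhole.body = broot            # plug the body into the head's hole
        return hroot, bname, bhole
    raise TypeError(e)

def bindings(root):
    """Walk the plugged chain, collecting (name, rhs) pairs."""
    out = []
    while root is not None:
        out.append((root.name, root.rhs))
        root = root.body
    return out
```

Converting a term like `Let("x", Add(Num(1), Num(2)), Add(Var("x"), Var("x")))` yields a flat chain of lets with one final unfilled hole, matching the post's description: no closures, and no allocation beyond the output program itself.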

0 views
Den Odell Yesterday

Browsers Treat Big Sites Differently

Some browsers ship code that checks which domain you’re visiting and changes how the page renders based on it. Yup, you read that right. If site == X, do Y. TikTok gets special treatment. So does Netflix. So does Instagram. And so does SeatGuru. Safari and Firefox both do this. Chrome doesn’t. That tells us something interesting. The source code is right there if you want to look. These are literal domain checks baked into browser rendering engines that say things like "if the user is on this domain, render this differently" or "if they’re on that domain, handle that API call differently." It’s not a bug. It’s a feature, and it ships to billions of devices. If you open Firefox and type in the address bar, you’ll see a list of site-specific interventions complete with toggle switches. Each one is a targeted fix for a specific website, and you can turn them off and watch sites break. Firefox’s WebCompat system injects custom CSS and JavaScript into specific domains, changes user agent strings for sites that sniff browsers incorrectly, and papers over bugs that would otherwise make the web feel broken. The interventions are tracked in Mozilla’s Bugzilla, complete with bug reports and sometimes failed outreach attempts to the sites in question. Safari’s WebKit engine calls them "quirks," and the file is publicly available on GitHub. Reading through it is an education in how the web actually works. Here’s one comment from the code: So the browser detects when you’re on facebook.com, x.com, or reddit.com and changes how it handles Picture-in-Picture video. These companies wrote broken video code, and rather than wait for them to fix it, the browser shipped a workaround to every user. Here’s another comment: Someone added domain-specific rendering code for SeatGuru, and the comment implies outreach was attempted. SeatGuru didn’t fix their site, so the browser fixed it for them. The commit history is a fascinating read.
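The shape of the mechanism is simple enough to sketch. This is a hypothetical illustration in Python, not WebKit's or Firefox's actual code (which lives in C++ and JavaScript); the domains and quirk names are invented:

```python
from urllib.parse import urlparse

# A hypothetical domain-keyed quirks table. Everything here is invented
# to illustrate the shape of the mechanism, not copied from any engine.
QUIRKS = {
    "www.example-streaming.test": {"spoof_user_agent": "Chrome/120.0.0.0"},
    "www.example-shop.test":      {"simulate_mouse_events": True},
}

def quirks_for(url):
    """At page load: if the host has an entry, apply its workarounds."""
    host = (urlparse(url).hostname or "").lower()
    return QUIRKS.get(host, {})
```

The real tables are far larger, but the core really is this: a lookup keyed on the domain, consulted before rendering.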
In just the last few months: Zillow’s floorplan images weren’t centering, TikTok was showing "please upgrade your browser" messages, Instagram Reels were resizing erratically during playback, Netflix’s "Episodes and Info" button was dismissing popovers incorrectly, Twitch was pausing PiP videos when you switched tabs, and Amazon Prime Video wasn’t letting Safari users watch at all. Each one got a domain-specific fix shipped to every single user. The quirks files aren’t just fixing broken sites; they’re often compensating for Chrome’s control over what "working" means in the first place. The pattern goes like this: Chrome ships a feature, developers use it because Chrome dominates the market, and other browsers scramble to either implement the feature or add site-specific quirks to paper over the difference. By the time Safari or Firefox catches up, the quirk has already shipped to millions of users. WebKit’s source code includes user agent overrides that make Safari pretend to be Chrome for specific sites like Amazon’s video pages and various streaming services. These sites sniff for Chrome and serve degraded experiences to everyone else, so rather than let Safari users suffer, WebKit lies about what browser it is. From the current Quirks.cpp source: Safari literally ships with a fake Chrome user agent string, ready to deploy when sites refuse to work otherwise. Firefox does the same thing, and many of its interventions are user agent spoofs telling sites "yes, I’m Chrome" because those sites actively block or break on non-Chrome browsers. The Mozilla wiki explains that some sites "block access completely, display a different design, or provide different functionality" based on browser detection. So Firefox ships workarounds. This creates a feedback loop. Developers build for Chrome because Chrome dominates. Their sites work best in Chrome. Users who hit bugs elsewhere blame the browser, not the site, so they switch to Chrome, reinforcing its dominance. 
These aren’t just cosmetic tweaks. Browsers change fundamental behavior based on your domain, including scrolling behavior, touch event handling, viewport calculations, and image MIME type handling. The list in WebKit alone runs to thousands of lines. Here’s one about simulated mouse events: The browser checks if you’re on Amazon and changes how touch-to-mouse event translation works for their product zoom feature. Amazon’s site assumes certain event behavior that Safari doesn’t provide by default, so Safari provides it anyway, but only for Amazon. There are quirks for storage access, scrollbar rendering, autocorrection behavior, and zoom handling. Each one is behind a domain check, and each one is compiled into the browser executable. You might have noticed something. I’ve shown you Firefox’s interventions list and WebKit’s quirks file, but where’s Chrome’s equivalent? Chrome doesn’t really need one, and not necessarily because Chrome is better engineered. The web is already built for Chrome. When over 80% of users browse with Chromium-based browsers, developers build for Chrome first. If a site works in Chrome, it ships. If it breaks in Safari or Firefox, they decide, knowingly or otherwise, that it’s less of a problem. Chrome doesn’t add quirks; it sets the agenda. When Chrome changes how something works, sites update to match, and other browsers follow or break. This is the asymmetry that runs through the modern web. When a site breaks in Safari, WebKit engineers add a quirk. When Chrome wants to change how the web works, Chrome just changes it and everyone else adapts. Chrome doesn’t need quirks because Chrome’s interpretation of web standards is the version that everyone else works to. This isn’t done maliciously and it isn’t entirely Google’s fault; really it’s the natural consequence of market dominance. Browser engineers will tell you the specs themselves are actually well-defined now.
The HTML5 "living specification" approach solved the chaos of the IE/Netscape era by making specs match reality. The problem is that developers rely on unspecified implementation details, then blame non-compliant browsers when those details differ. While that may be true, it doesn’t change the outcome. When Chrome is the implementation everyone targets, Chrome’s unspecified details become the de facto spec. The same thing happened with Internet Explorer in the 2000s. When developers built for IE, sites broke elsewhere, and standards compliance became secondary to just making it "work in IE." We spent years digging out of that hole. A decade ago, the hope was that browser quirks would eventually disappear as the web became more standards-compliant. You could argue they did, but not for the reason anyone expected: the quirks didn’t go away, they just moved to browsers that aren’t Chrome. You might wonder why browser vendors don’t just contact the offending sites and ask them to fix their code. Sometimes they do, and there’s even a field in source code comments linking to outreach efforts, but consider the economics of that. A browser vendor’s job is to make the web work for users, and if a popular site is broken in their main browser yet works in Chrome, users blame the browser. Filing a bug with a third party and waiting weeks or months for a fix that may never come is a losing proposition when you can ship a five-line workaround tomorrow. There’s also the question of who you’d even contact. The developer who wrote the broken code might have left the company years ago, the team that owns that endpoint might not know it’s their responsibility, and the site might be in maintenance mode receiving security patches but nothing else. From the browser’s perspective, the choice is simple: fix it now, invisibly, and save everyone the trouble. A WebKit engineer wrote a blog post about removing a quirk for FlightAware. 
The site was comparing CSS transform matrix strings, but the CSS spec had changed how browsers should serialize the values. When the browser became compliant, FlightAware broke, and engineers added domain-specific code to fix it. Outreach eventually worked, FlightAware fixed their code, and the quirk was removed. But for months, Safari users had a working experience only because someone wrote a conditional in the browser checking for the site’s domain. Your site might be getting special rendering treatment and you might not be aware of it. That quirk you’re benefiting from doesn’t show up in your error logs, and there’s no console warning that says "this browser is working around your mistakes." The fix is invisible by design. If you test mostly in Chrome, you’re especially exposed. Your site might work perfectly not because you wrote good code but because Chrome’s behavior aligns with your assumptions. Other browsers will have to choose between letting your site break for their users or adding you to their quirks file. Open your site in Firefox and Safari. Not occasionally, not before a big launch, regularly . The quirks files exist because developers didn’t do that. If you find your domain in one, consider auditing whatever it was they worked around. Not because you have to (after all, the web kept working without your intervention) but because somewhere an engineer at a browser you don’t use solved a problem you didn’t know you had. The specs are the map, but the quirks lists are the messy terrain. Standards were supposed to eliminate browser-specific code. We dug ourselves out of the IE era, celebrated, and then built exactly the same hole again around a different browser. Only now the browser-specific code lives in the browsers that aren’t dominant, patching over a web built for the one that is. Sites I’ve worked on are in these files. Yours might be too. And the lists are getting longer.
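The FlightAware bug above is a self-contained lesson: comparing serialized CSS transform strings breaks the moment an engine changes its (spec-compliant) serialization. A hedged sketch of the robust alternative, parsing and comparing numerically (the function name is invented, not FlightAware's code):

```python
def parse_css_matrix(s):
    """Parse 'matrix(a, b, c, d, e, f)' into a list of floats.
    Comparing these numbers survives serialization changes;
    comparing the raw strings does not."""
    inner = s.strip().removeprefix("matrix(").removesuffix(")")
    return [float(part) for part in inner.split(",")]
```

Two engines can emit "matrix(1, 0, 0, 1, 10, 10)" and "matrix(1,0,0,1,10,10)" for the same transform; the strings differ, the parsed values do not.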

0 views
Susam Pal Yesterday

Commenting Guidelines

When commenting on this website, please keep the following points in mind: You may include HTML or Markdown in your comment. Comments are converted to HTML and sanitised before they are published on this website. All submitted comments are held for review. Whether a comment is published or not is at the discretion of the author of this website. Typically, only the following types of comments are published: Comments that add new information or insight to the topic discussed in an article. Comments that provide a neutral, supporting or opposing viewpoint. Comments that report typos, errors or bugs on the website. Comments that contain good humour. Comments that express appreciation. Generally, rants are not published, even when the post you are commenting on is itself a rant. This website is the author's place to rant. It is not your place to rant. If you really need to rant, please do so on your own website. This guideline exists to maintain a high signal-to-noise ratio in the comments section. All comments deemed suitable for this website by its author become publicly available on this website at two places: on the comment page for the article you commented on ( example ) and on the overall comment index page at comments . Do not submit sensitive personal data in your comments.

0 views
iDiallo Yesterday

Software Engineers are Obsolete

In my first interview for a developer position, I shared a link to my personal project with the interviewer. It was a website for learning how to program. I created it from the ground up. I built the PHP app, designed the database schema, made a nice design to tie it all together. I wrote down my process, and it became the first tutorial on the site. Then I collected tutorials from all over the web and displayed them on my website, which acted as a portal. There was a section for PHP tutorials, for Ruby on Rails, for .NET, etc. Each one individually curated by me. My interviewer was so impressed. I got the job. Later, I added a section where anyone could submit their own tutorials. It was fascinating how quickly people found my website and started submitting links. The tutorials were coming in so fast that I removed the verification system and let people upload links directly. But then my mind wandered. What if I start a blog? Yes, I had another blog before this one. I built an entire blog engine from scratch. A colleague found my blog. He was so excited that he shared his own with me. At lunch, we would discuss ideas, and that same evening after work, we would buy a domain name and start a new project. We shared tips and tricks on how to rank on Google. We had a skill, being web developers, and we took full advantage. When we had an idea, we would fire up our computers that same night and build it. Friends and family would come to us for validation. We were the ultimate deciders of what was a good idea and what was a bad one. We were the gatekeepers. We knew how to program, and nobody outside our circle could say otherwise. Now, friends and family don't come to us anymore. They go straight to ChatGPT, and it tells them their idea is brilliant . They launch their favorite AI agent, which builds their entire product from a single prompt. Some of them even manage to host it on the web, accessible to the world, and they are seeing their first customers. 
People who used to confuse Java with JavaScript now tell me they have a platform. People who don't even know what programming is are standing at the forefront of software innovation, advocating, evangelizing, and making money. This skill I spent years honing has been made obsolete by everyday people. We, the developers, are no longer the gatekeepers. In fact, now we need to keep up or risk being left behind. Some commenters online tell me I'm just jealous, that I need to embrace progress. I don't want to be obsolete. I'm on openclaw, moltenclaw. I have accounts on all the video generation websites. I have accounts on ChatGPT, Claude, Gemini, and Mistral. Just as I'm getting the hang of one tool, my friend who works in a warehouse tells me, "just use Perplexity for that." But Perplexity isn't enough, because another friend says GenSpark is better. For some reason I can't sign into my Manus account anymore. And apparently, to get the most out of it, I need to get Meta Ray-Bans. Everyone is empowered, no one needs me, and that's that. The developer is now obsolete. But then, I opened LinkedIn. My peers, fellow developers who for some reason all have the word "AI" in their job title, are saying the opposite. "Developers are not losing their jobs to AI," they say. "Developers are losing their jobs to other developers who use AI." They are vibe-coping to the max. The history of technology has always been a story of nearly missing out. I remember another job I applied for and totally didn't get. The company had moved all their client-facing apps to Silverlight. If you're wondering what Silverlight is, you might understand why I chuckled when the interviewer described their plight: they were struggling to find developers to help them migrate to HTML and JavaScript. I'm fairly sure that chuckle is why they never called me back. It's one thing to embrace new technology. It's another thing entirely to put all your eggs in one basket.
Companies are betting everything on Silverlight. Sorry, I mean AI. Without thinking through what happens if things don't pan out. AI has lowered the barrier to entry. That's a good thing. More people can now bring a fresh pair of eyes to the software engineering field. But there's a problem. Those new entrants won't become better engineers over time. Why? Because they are not writing code, not reading code, not debugging code. Their growth path, with time and experience, is to become better prompters. What this means is that, amid all the noise, my role as a software engineer may seem obsolete. But in the long run, we will be back to square one, where engineers writing code with their own meatware will hold all the cards. These are the people who learned the hard way: by reading documentation, by debugging broken apps, by having their seemingly perfect Stack Overflow question closed as a duplicate. These are the engineers who will hold the keys to software. Not because they're guarding secrets, there are no secrets. It's simply that the new developer is not, and will never be, interested in learning. While we pride ourselves on producing more software than ever, it doesn't take long to realize that software is never truly finished at delivery. It has to be maintained. It's strange, computers whose entire purpose is to repeat the same process over and over, perfectly, somehow manage to degrade over time. My tutorial website, seemingly working fine, returned an error when I visited it after months of neglect. I restarted all the services and brought it back up. It was now full of spam and NSFW URLs. An application that worked perfectly yesterday is broken today. It could be a memory leak, unexpected input, or just users with fat fingers. Your completed application is suddenly incomplete, and you have to fix it. In an ideal world, we wouldn't keep producing more software. We would have working software, and less of it to maintain. AI thrives on quantity. 
If you need me, I'll be in the back, patiently waiting for you to realize you can't prompt your way out of a Silverlight migration. My rates just doubled.

0 views

What Time Is It?

On Palm OS, the interface for picking the start and end time of an event is represented as two columns, hour and minutes. The hours list either starts at 8AM and shows until 7PM (covering a full business day), or it starts at the next hour (if creating an event for today). Minutes are represented at every 5-minute interval, allowing every option to be shown at once. This interface is simple and requires an extremely low cognitive load to use. It's scannable and adaptive to the current situation (today vs another day). It limits options (i.e. you can't set a time of 12:33) to drive simplicity. If we compare to the time picker on Android, we can see it's significantly more complex. One must first tap the hour, then tap AM/PM, then tap the minutes section and tap the minute they need. While minute intervals of 5 are shown on the screen, the user is able to select specific minutes, if they know how (one must drag the circle to get a specific minute). The interface has many more taps and states, and a higher cognitive load. How about iOS? Like Palm OS, iOS limits you to 5-minute intervals. Similar to Android though, an additional interaction is needed to pick AM/PM. Picking hour and minutes is more involved as well: you must scroll the picker to the desired value. The Palm OS UI might not be the prettiest, but it's the fastest for most use-cases. The most common options (business hours and 5-minute intervals) are presented without the need for multiple states or scrolling. Setting the time is 2 taps away!
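The Palm OS hour-list behaviour described above is easy to state as code. A Python sketch of my reading of that description, not Palm's actual logic (in particular, whether the "today" list wraps past midnight is my assumption):

```python
from datetime import datetime

def palm_hour_options(for_today, now=None):
    """12 hour choices: 8AM-7PM for another day (a full business day),
    or starting at the next whole hour when the event is for today."""
    if for_today:
        start = ((now or datetime.now()).hour + 1) % 24
    else:
        start = 8
    return [(start + i) % 24 for i in range(12)]

def palm_minute_options():
    """Every 5-minute interval -- few enough options to show all at once."""
    return list(range(0, 60, 5))
```

Because both lists fit on screen at once, picking a time is a tap in each column, with no AM/PM toggle and no scrolling.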

2 views
Unsung Yesterday

“This is where your mouse becomes a cryptographic instrument.”

A fascinating 9-minute video from PawelCodeStuff about randomness in the context of computing: It explains those weird moments where sometimes the computer asks you to wiggle your mouse – to generate unpredictable numbers – although the specifics of what exactly was random in my wiggling were a surprise to me. There is something poetic about computers yearning for that one thing they can never get – complete unpredictability – and collecting it in a little pool like you would something very precious. Also fascinating that in modern CPUs, there now exist hardware components that gather truly random data from the real world. While I have never needed true randomness in my design career, knowing how to control pseudorandomness (specifically, how to replay it) has been helpful. Here’s an example. In my essay about Gorton, there is this interactive bit where you can drag a slider for “messiness.” With regular pseudorandomness, the experience is wiggly and gross: But when you always restart the PRNG from the same seed (“the Groundhog Day maneuver”), it feels much better: #details #motion design #security #youtube
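The "Groundhog Day maneuver" is simple to demonstrate. A minimal Python sketch, with invented function names, assuming the messiness slider maps to a single scale factor:

```python
import random

def messy_offsets(seed, messiness, n=5):
    """Re-seed the PRNG on every call, so the random pattern is stable:
    dragging the messiness slider scales one fixed wiggle instead of
    re-rolling a new one each frame."""
    rng = random.Random(seed)  # fresh generator, same seed every time
    return [rng.uniform(-1.0, 1.0) * messiness for _ in range(n)]
```

With a shared seed, `messy_offsets(7, 0.5)` and `messy_offsets(7, 1.0)` are the same shape at different intensities, which is why the replayed version feels smooth rather than wiggly.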

0 views
ava's blog Yesterday

new challenges at work

In the past, I have complained about some aspects of my work here and there. As I continue to grow, get more qualifications, visit conferences, and apply to interesting positions, I've put more effort into transforming the place where I'm at, to the best of my abilities. I've repeatedly asked for more work, I've asked for different tasks, and I helped create a new role. Not replacing my current role and work, but something on top/on the side next to my core tasks. I needed change and something worth logging in or coming into the office for, and of course I wanted to pivot more into my desired field. That brings some new challenges, which are desired, but can be uncomfortable at first. Years of doing the same tasks with comparatively little cooperation and following repetitive processes never forced me to put a lot of thought into what I put out, so to say. That can be very nice, and in the beginning, it was hard enough learning everything and doing everything correctly. With my core work, no one asks me to create anything from scratch, make any decisions, or organize anything independently; it's all set in stone. If I wanted to, I could just spend years doing the work-equivalent of "minding my own business" and keeping my head down, in which I work off what came in that day based on our rigid standards and use fixed email templates (not even having to formulate my own sentences), nothing more, nothing less, unbothered. That's what I did for years as I got used to everything, and as I was very sick. But now, when I want to do more challenging work, I notice that years of working like this have made me very comfortable. Not lazy, but it feels unusual and slightly scary to suddenly have a more "active" part of work where I actually have to plan meetings, host and lead them, prepare slides, and even approach people first about needing to find time to discuss something together. 
Completely normal office tasks for others, of course, and it's what I wanted in order to not stagnate further in something that bores me, but my brain still perceives it as a threat. Due to internal restructuring and moving of employees, we lost our sub-department's IT coordinator 1 (each sub-department has to assign someone). I asked my boss if I could be the new one, and she agreed. Unfortunately, at least in our department, this title is more decorative than anything else, as the IT coordinators don't even have any meetings to discuss anything at all. This has generally worked fine enough, as in " we are surviving ", but now with different AI model rollouts and other software changes, I notice employees becoming more and more confused and helpless, and a more proactive approach would be nicer. When I asked my boss for permission to be one, I said I would like to organize a meeting of all coordinators to discuss some challenges and more, and both she and the department head thought this was a good idea and asked me to schedule one soon. I didn't expect how much this task would make me freeze up; I didn't wanna be the newcomer in a group who piles more work and yet another meeting onto the other people as a first move. So I obsessed over a good way to introduce this, and how to make the first meeting worth it. I didn't want everyone to show up, discover we have nothing to discuss, and leave after 5 minutes. The invitation mail should stress that this is just a first, casual meeting in which we will talk about x, y and z topic, and then determine whether this should happen again and at what frequency. I also kept pondering whether I should already prepare a topic/mini-presentation so as not to come empty-handed myself as an organizer, and what that could be, putting a lot of pressure on finding something good enough.
The final hurdle was that no one in my department apparently even had a full list of who the other coordinators are; had to research that myself somehow and ask around. All that made me put off scheduling anything for a good 3 weeks. Yesterday, I finally dealt with this mess, as the task became more and more pressing and uncomfortable to think about, threatening to become this huge anxiety beast strangling me. Detangled my feelings, set realistic expectations, and scheduled it to mid June to have a bit of time. At the same time, I am finally officially the data protection coordinator of my department. My work never had any before, no other department has one either. This is just my department wanting to lead by example, and admittedly, also accommodating me and my ambitions, as I have asked for this for months. Leadership up top has repeatedly thwarted my attempts to move into the data protection team, or officially implement coordinators house-wide, and refused to even discuss it or process it in the idea management system, so this is my little rebellion, you could say. Doing things from the bottom up. I have already prepared the slides they will use to announce it in the next department meeting and the meeting of all department heads. I will also have to prepare a short presentation about data protection challenges in our department, scheduled around Q3 or Q4 of 2026 as I need time to get an overview of everything. I'll have to meet up and interview a lot of people about their team's data workflows to see what needs to be adjusted, write some analyses, write deletion concepts, create awareness, ensure compliance, and more. I'll also be the person to go to before the data protection officer is getting involved. It's what I wanted, but internally it also makes me very nervous. I finally get to create things and success will be about the quality, not just that something was done; but it opens the door for thoughts about whether I am good enough or not. 
Merely following process steps as described makes it easy to just be a bot that gets things done; creating things yourself, sharing your own ideas and opinions exposes you as a person, makes you vulnerable. There are people working there that will finally see that there is a person with a brain underneath the years of automatically generated emails they received in my name. There is no one else to watch and learn from, as I am the only one, and I get to make things up as I go for this new role. I will be the blueprint, for now. There are horror scenarios in my head of not knowing something in a meeting and everyone thinking I am an impostor who doesn't really know anything. That's not how real life goes, of course, and everyone is usually understanding when you say " Sorry, I will have to look that up and get back to you about that. ", but you know how brains are. I'll have to learn from every meeting. I am scared of not doing a good job and doing it all a disservice. The culture is an aspect of it too, because unfortunately, my place has a reputation of not being kind to ambitious people, and many people being rather hostile if anything is asked of them - time, expertise, feedback, a change in routine, a little bit of grace; anything. There are also a few coworkers that have proven again and again that they are unable to view younger people or people lacking this or that university degree as worth taking seriously. That's what I will be up against, and my own harsh standards I have for myself. I'm trying to reassure myself that I have time to figure things out and that I need to make mistakes to improve. Reply via email Published 13 May, 2026 The IT coordinators' role at my workplace is to share IT knowledge around in all kinds of teams so it isn't just concentrated in specific areas, and to ensure everyone is up-to-date on internal policies, new software options and more. 
They're also a sort of first responder to task-specific tech problems in that specific team before annoying our general helpdesk. The communication of our IT department can be lacking, and not everyone has the time to keep on top of new things (like the sudden rollout of Copilot recently, new options available in Teams, etc.), so having these people "posted" in each sub-department to share news and developments was supposed to help that. ↩

1 view