
The Nvidia AI PC, Project Solara, Microsoft AI

Good morning, I don’t normally give away my interview subjects ahead of time, but I’m going to make an exception this week given the subject and the below Update. I am writing this in San Francisco where I interviewed Microsoft CEO Satya Nadella after his Build developer conference keynote; normally I would want to publish that immediately so that you have the full context of my analysis. In this case, however, I came to the opinions below during the keynote, and before the interview, so for that reason (and a few logistical ones) I wanted to articulate them first (before you see my questions), and follow up with Nadella’s view on them (and a number of other topics) afterwards. So with that noted, on to the Update:
From CNBC: Nvidia has emerged as the world’s most valuable company by dominating the market for artificial intelligence chips in the data center. Now the company is expanding its prowess to chips that will serve as the main processor for personal computers, entering an arena that’s long been ruled by Intel, Advanced Micro Devices, Qualcomm and Apple. During a keynote address at Taiwan’s Computex conference on Monday, Nvidia CEO Jensen Huang unveiled a new PC processor made alongside Microsoft. The RTX Spark superchip, which Huang also referred to as the N1X, debuts in the fall on a fresh line of Windows PCs from Microsoft, Dell, HP, ASUS, Lenovo and MSI.
I’m actually starting in Taipei on Sunday, where Huang introduced the long-rumored Nvidia PC chip; from Tom’s Hardware: At full strength, this chip offers up to 20 Arm CPU cores, a Blackwell GPU with 6,144 CUDA cores, 128GB of LPDDR5X RAM, and up to 300 GB/s of memory bandwidth. That powerful CPU and GPU, connected over NVLink C2C, and the large memory pool give AI agents and 120-billion-parameter models plenty of power and space for long-running tasks with context lengths stretching to a million tokens, according to Nvidia.
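As a rough back-of-envelope on the quoted figures (a sketch, not a benchmark): token-by-token decoding is commonly limited by memory bandwidth, because each generated token requires streaming the active weights from memory once. The helper below simply divides bandwidth by bytes read per token. The function name, the dense-model assumption, and the 4-bit weight format are illustrative assumptions, not figures from Nvidia or Tom's Hardware.

```python
def decode_tokens_per_second(bandwidth_gb_s: float,
                             active_params_billion: float,
                             bytes_per_param: float) -> float:
    """Rough ceiling on decode speed if every token streams all active weights once.

    Ignores KV-cache reads, compute limits, and batching, so real throughput
    would be lower than this number.
    """
    gb_read_per_token = active_params_billion * bytes_per_param
    return bandwidth_gb_s / gb_read_per_token


# Quoted figures: 300 GB/s of memory bandwidth, a 120-billion-parameter model.
# Assumptions (not from the source): dense model, 4-bit weights (0.5 bytes/param).
ceiling = decode_tokens_per_second(300, 120, 0.5)
print(f"dense 120B at 4-bit: ~{ceiling:.1f} tokens/s ceiling")
```

A mixture-of-experts model that activates only a fraction of its parameters per token would read fewer bytes and so have a higher ceiling; the point is only that the bound scales linearly with memory bandwidth.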
We don’t have any benchmarks yet, but the RTX Spark appears to be broadly similar to the DGX Spark; that’s a decent chip that excels at prefill, but is slower than an M5 Max at decode (thanks to lower memory bandwidth), and significantly slower at CPU tasks. Huang appeared during the keynote via live video to discuss the chip. Satya Nadella: Suddenly, this concept of unmetered intelligence right at the edge is so hot again. So maybe you want to talk a little bit about this: you have thought about this, talked about this, and now, of course, with RTX Spark really delivered, I think, what’s a breakthrough system for AI to be much more ubiquitous. But maybe, Jensen, you can just share a little bit your vision around where you see this going. Jensen Huang: Well, this all started about three years ago between a conversation between you and I. And we were talking about how we could build a new class of PCs that’s incredible for designers and creators. And it would be incredible for artificial intelligence. And it would be one of these systems that has the processing capability, but also the software stack that’s integrated into the world’s design packages and creator packages. And, of course, all the things that we’re doing with AI. And here we are, three years later, we built an incredible new chip. And this system is supported by all of this new software that you created for Windows. And we now have the ability to have essentially an autonomous agent running on the PC. This clip explains why I find this chip specifically, and AI PCs generally, pretty underwhelming. Three years ago we were still in the ChatGPT era of AI, and I was very excited about the possibility of local inference. Then came the reasoning era, blowing up KV cache (which increases the need for more memory) and emphasizing the importance of decode (to generate that many more tokens). Now we’re in the agentic era, where CPU performance is incredibly important. 
To that end, the ideal setup for a local agent is strong local CPU performance and calling out to the cloud for inference. The RTX Spark, however, spends tons of die space on GPU cores that are inferior to the cloud (because of memory size and bandwidth if nothing else) at the expense of CPU. It’s a suitable chip if you just want a chatbot circa 2023; it’s hard to see it being worth the price — or the software compromises that are the reality of Windows on ARM — in 2026. Jump ahead to the Build keynote, which I found very underwhelming to start. Nadella opened with a brief overview of the AI stack, then started talking about Windows, and I was honestly pretty surprised at the lack of vision and enthusiasm. That’s when it occurred to me: I think that Nadella agrees with me! Sure, some local inference is nice, but that’s not where the AI that matters is going to be located. Nadella, keep in mind, has no real loyalty to Windows; indeed, I credit him with The End of Windows . Specifically, Nadella didn’t end Windows as a product, but he ended its run as the organizing principle around which the entire company operated, focusing on software that ran everywhere and a cloud that ran everything. That leads to a surprising takeaway, and the most interesting part of the Build keynote: what if Microsoft is actually well positioned to get back into AI devices? From GeekWire : A team inside Microsoft has been quietly building a platform for devices that run AI agents instead of apps, based on Android instead of Windows, with two working hardware designs so far, and an initial set of big-name companies lined up to run pilots. The platform, dubbed “Project Solara,” is Microsoft’s bet that AI will open up entirely new scenarios for computing — using agents to avoid the constraints of traditional software, and off‑the‑shelf components to develop new devices quickly and inexpensively. 
Project Solara is, to be clear, vaporware at this point, although the company did show real devices and has signed up Qualcomm and MediaTek as chip partners. It is also extremely compelling. Here’s how Nadella introduced it: So far, we’ve talked about the edge and the cloud. The current form factors, right? I mean, when I saw that Jensen picture from the weekend where he had all the desktops, I felt like, man, I’m back in the 90s, right? Because it was so cool to see the lineup of all the machines that I loved and I grew up with back yet again with new functionality, right? It’s the same form factor, but unbelievable new functionality because of the onboard AI capability, right? So that’s sort of what we’ve seen with the laptop, the desktop, and of course with the cloud. But it also, you know, sets up that next question: if you have that capability, which is new function, and you can put it into existing form factors, can you even purpose-build new form factors for the new function? Can you build a new platform even for the agent era? And that is the motivation behind Project Solara, which we’re introducing today. First off, note the framing: the PC is old tech with agents; what about new tech uniquely enabled by agents? And note the classic Microsoft hook: could that new tech sit on top of a new platform? Corporate Vice President Steve Bathiche, the head of Microsoft’s Applied Sciences Group, explained the vision: Before I talk about those awesome new devices you just saw, let me start with the why. Back at Build 2023, I talked about the outside AI application structure, where AI moves from operating within the application frame to operating globally, working across multiple apps and services to connect, coordinate, and maintain context across entire workflows, devices, and time scales. 
What if there were an ecosystem of devices specifically designed for that new type of application structure, for those types of agents, for that transformational interaction technology? That is the impetus behind Project Solara. But with so many possible forms, which one do you pick? What is the next device? You see, the big aha for us is that it’s not about choosing one specific form factor. It is about creating a system that extends your agent across a constellation of devices. The next computer is not one device. It is all these devices working together as one system, with agents showing up closer to where and when you need them. There was one brief moment in the promotional video that preceded Bathiche’s appearance that made the concept click for me: The problem with wearable devices is the interaction model: they are only useful when you are interacting with them, when the human is in the loop, but being in the loop with a wearable is annoying and inefficient. What is being demonstrated here, however, is a brief interaction, and then an agent doing work in the background. In other words, the usefulness happens in the cloud without the human needing to be involved, because an agent is doing the work. That’s what I find compelling. On one hand, you can make the case that of course Microsoft would be interested in a device model that uses the cloud as a platform, given that Microsoft doesn’t control a mobile device like an iPhone. What occurs to me, however, is that even if Microsoft doesn’t succeed with Project Solara, this model — where the cloud is the hub and multiple devices are the spoke, instead of the phone being in the center — is clearly a better one for agents. Agents work best in the cloud, and across apps and devices; yes, the phone might be one of those devices, but when it comes to agents it shouldn’t be the hub. Again, this is vaporware, and very much in Microsoft’s interest, so take Project Solara with the appropriate grain of salt. 
It’s a vision of the future, however, that does make a lot of sense, particularly in an enterprise scenario where all of the context and compute is already in the cloud (and Project Solara is focused on enterprise, not consumer). It’s also something completely different from the past, and fits my thesis that, in the age of AI, thin is in . From GeekWire : Microsoft has based much of its AI business on models from OpenAI, before expanding more recently to Anthropic. On Tuesday, the company showed how it plans to rely less on both. At the Build developer conference, the Microsoft AI Superintelligence Team unveiled a family of seven models built from scratch. It’s part of an ongoing effort by the company to build credible in-house alternatives to models from partners and rivals with competing allegiances… The flagship of the seven newly announced MAI models is MAI-Thinking-1, a reasoning model that Microsoft says draws even with Anthropic’s Claude Sonnet 4.6 in blind human testing, and matches the more capable Claude Opus 4.6 on a widely used coding benchmark. [CEO of Microsoft AI Mustafa] Suleyman stressed that MAI-Thinking-1 was trained from the ground up with no distillation from other companies’ models, looking to appeal to enterprises that care about clean data lineage. These models seem pretty decent, all things considered, but what was interesting to me was the framing: Microsoft emphasized that enterprises could take these models and make them their own. Suleyman said: This is what owning the full stack end-to-end looks like. It’s the foundation of Microsoft Frontier Tuning, it lets you customize the MAI models using our full stack hill climbing machine right where you want it. And it means that the disciplined and very relentless engineering that has gone into building our models is now available to all of you on a platform that you can trust, working on your behalf to create custom agents that you will control. 
So the really big thing, of course, that’s happened in the last year is these RLEs, reinforcement learning environments, these unique training gyms for your AIs. They create company and task-specific agents adapted only to you, built on MAI models. So for example, within Microsoft, we use our RLEs combined with our MAI models to climb towards the best agentic use cases on Excel. Our MAI-tuned model is now on par with GPT 5.4 on public and private benchmarks, whilst at the same time being 10 times more efficient on cost, and many other early adopters are seeing similar results. When we’ve tuned our models on McKinsey’s tasks, MAI delivered the highest win rate, even outperforming GPT 5.5, and again delivering 10x greater efficiency on cost. So to us, this is the advantage of very carefully calibrated frontier tuning. And importantly, unlike with some of the other companies, with MAI, you don’t rent intelligence from a shared model that learns from everybody. Only you keep the benefits of your hard-earned workflows, know-how, knowledge, and your own institutional data. Only you get to control the resulting model. And so with us, the RLEs and the models that you build inside of them, they become your moat. I really think this is distinct. It marks a new era in AI that we’re all very, very excited about. This has shades of AWS’s Nova Forge offering , which lets enterprises add their data at a checkpoint in pre-training; it’s a little different in that it’s more focused on reinforcement learning, but those lines are getting blurred. The concept is that enterprises get to have their own model for their own data, without sharing it with the frontier labs that want to eat their lunch, and it’s a concept that is certainly appealing in theory; the real test will be to see if enterprises that choose this route aren’t penalized by not being on the cutting edge of functionality. 
Then again, helping cautious enterprises embrace the future on their terms, without necessarily having to win on pure performance, is exactly how Microsoft has long maintained its position.

DHH Today

A pond of interesting problems

The great joy of having built a successful business that employs a broad team of talented people is that I get to fish for exactly the kind of problems that most interest me, most of the time. Usually, this coincides well with the needs of the business. When we moved out of the cloud, I spent months getting Kamal off the ground, so we didn't have to get mired in the complexity of Kubernetes. Fun problem to solve! And of course, the origin story of Ruby on Rails is that Basecamp gave birth to it all back in 2003. Because I simply wanted Ruby to work well for the web, and we needed a platform to build the business. But sometimes it's also a bit further afield. We had our big clash with Apple over the App Store's monopoly abuses back in 2020, but it wasn't until 2024 that I severed our exclusivity with the Mac on the engineering side by moving to Linux, and ultimately building Omarchy. I don't always get to choose, of course. There are occasionally urgent problems that just need our, and therefore my, full attention as a company, or humdrum issues that I just happen to be best qualified to tackle. But this is increasingly rare because of all those great people we've managed to assemble at 37signals. And that's how it should be! Building a successful business should yield dividends beyond just the financial ones. It should afford you more opportunity to press your comparative advantage, so you spend most of your time on the projects that stimulate a little Call of the Wild. Never to the point of being too good for anything, mind you. Taking out the trash is still everyone's job some of the time. But mostly, I want to be sitting by the pond of interesting problems, fishing for the ones that catch my eye and hook my motivation.  Who could wish to retire from that?


Scour - May Update

Hi friends, In May, Scour scoured 865,266 posts from 28,671 feeds (1,766 of which were newly added), and 260 new users signed up to bring it across the 3,000 user mark! Here's what's new in the product: Scour is now better at finding posts that match your interests. You should see more relevant content and far fewer off-topic articles in your feed. (This sounds simple, but it represents at least a full month's effort 😅.) The way this works under the hood was one of the single biggest changes I've made to Scour's core ranking system since I started working on it. At a high level, scoring now combines Scour's original fuzzy concept matching (embedding vector distance) with how much the article uses relevant vocabulary (lexical search). While these ingredients are well-established, I think the exact way Scour implements them might be a somewhat novel system design. The reason this was so complex to build was that existing approaches to lexical search did not work for Scour. For example, every Scour user has between a handful and hundreds of interests (I have 642), each of which might have 3-10+ relevant keywords. This means that every "search" is actually a search for thousands of terms (for my feed, it's around 5,000). Most search systems are built for individual queries with a handful of terms. The even more tricky issue is that lexical search algorithms like BM25 do not produce scores that are comparable across queries, because they are designed for ranking (ordering results for a specific query), not scoring . Scour, however, needs to know which of your interests a given post is most related to and it sorts the posts in your feed by how relevant they are for any of your interests. I believe that the custom scoring and indexing system Scour now uses provides both cross-query score comparability and efficient lookup for thousands of parallel queries. Stay tuned for more details! 🙏 Help me out! Please like, dislike, and report posts as off-topic as you're browsing. 
These signals help me tune the system and figure out the edge cases where it could be improved.
Scour bolds keywords in the post titles to make the feed easier to skim. The new lexical scoring layer discussed above makes it easier to bold exactly the words related to your interest.
Two other small changes let you peek under the hood of the new scoring system. On desktop, hovering over a post's title will show you the score breakdown between semantic and lexical. Separately, if you click on an interest tag and go to the single-interest page, there is now an Advanced link that will show you the terms the lexical scoring system is using to find and rank posts.
Here were some of my favorite posts that I found on Scour in May (you can tell from the topic concentration where my mind has been!):
- Latent Terms: Dense Retrievers Contain Trivially Extractable BM25-ready Zipfian Vocabularies
- Rethinking Agentic Search with Pi-Serini: Is Lexical Retrieval Sufficient?
- Re-autoresearching MSMARCO BM25, on Vespa
- How we made a SQL query optimization agent 59% more accurate using autoresearch and LLM Observability
- Your Vector Database Doesn't Know What Similar Means
- My Plan with RSS
- Agentic Coding is a Trap
Happy Scouring!
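To make the scoring idea in this update concrete, here is a minimal sketch of one way to blend an embedding similarity with a lexical score that stays comparable across queries. It is not Scour's implementation, which has not been published. The function names, the `alpha` weight, and the trick of dividing a BM25-style saturating score by its maximum are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def bounded_lexical(doc_tokens, keywords, idf, k1=1.2):
    """Saturating term-frequency score divided by its maximum possible value.

    Raw BM25 sums grow with the number and rarity of query terms, so they are
    not comparable across queries. Dividing by the total IDF weight keeps the
    result in [0, 1) for any keyword list (assuming positive IDF weights).
    """
    total_weight = sum(idf.get(term, 1.0) for term in keywords)
    if total_weight == 0:
        return 0.0
    score = 0.0
    for term in keywords:
        tf = doc_tokens.count(term)
        score += idf.get(term, 1.0) * tf / (tf + k1)  # BM25-style saturation
    return score / total_weight


def hybrid_score(doc_vec, doc_tokens, interest_vec, keywords, idf, alpha=0.5):
    """Weighted blend of semantic and lexical relevance for one interest."""
    semantic = cosine(doc_vec, interest_vec)
    lexical = bounded_lexical(doc_tokens, keywords, idf)
    return alpha * semantic + (1 - alpha) * lexical


doc = "bm25 ranking for lexical search".split()
score = hybrid_score([1.0, 0.0], doc, [1.0, 0.0],
                     ["bm25", "lexical"], {"bm25": 2.0, "lexical": 1.0})
print(round(score, 3))
```

Because both components are bounded, a post can be scored against every interest and the best match picked directly, which is the property the update says plain BM25 lacks.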

Unsung Today

Three good interactive explainers

Interactive explainers are one of my favourite things about the web: people passionate about things introduce them to others, for free, with care, and often using some interesting interactive or educational approaches in the process. I picked a few I particularly liked for this post. These aren’t just explaining things useful to know as a designer, but also themselves contain inspiring/​unique interactive moments worth knowing about: 1. Curves and Surfaces , Bartosz Ciechanowski 2. ASCII Characters Are Not Pixels , Alex Harri 3. Tab Roving , Niklas Gadermann Also, kudos to all three creators for their explainers working equally well on phones as they do on desktop computers. #above and beyond #interface design #web Every example has draggable points, but it also pops up an undo button once you start messing around, so you can feel safe experimenting: Specific ideas and definitions are color coded between text and the interactive pieces: For complex three-dimensional shapes, you can simply rotate them around to orient yourself better: I liked this trick of claiming something is impossible, but leaving a door open to try it anyway – I bet it will get some people more engaged, in the “but I’m sure I can lick my elbow” sense: I think the traditional “drag a divider to swap between two representations” interaction is actually kind of bad, but this essay subverts it by allowing you to toggle between representation A, representation B, or both side by side: A button to copy code to clipboard is always appreciated: I don’t know, I liked these minimalistic sliders: It’s hard to know what to do with complex interactivity, for example a specific sequence of keystrokes… let alone the fact that mobile phones don’t have easily accessible arrow or Tab keys. Here, a brilliant solution: not just on-screen soft keys, but also automatic playback!


An Ode to the Exacting Pedantry of Computers

The very first computer programming class I ever took introduced me to the idea of there being different kinds of numbers, like integers, floats, and doubles (it was a C++ course). “You mean, when I assign a variable, I have to say up front what kind of number this is?” It was such an odd concept to me. A number is a number. Why do I have to say it’s this kind of number or that kind of number? I dropped out of that class. A few years later, I decided I wanted to try programming again. So I took another intro class. This time they were teaching with Python instead of C++, so you can imagine my excitement to learn that I didn’t have to think of numbers in this way anymore! It felt like the computer was meeting me partway. Over time, I came to learn how pedantic computers are. They require a kind of exacting precision in saying what you want them to do. And they’ll only ever do exactly what you tell them to do, nothing more, nothing less. If there was a bug in your program, that wasn’t because the computer was doing something you told it not to. The computer was only ever doing exactly what you told it to do. A “bug” was very likely a flaw in your conception of how the program should execute, not the actual execution. It was a failure on your part to be more precise, to imagine a scenario where something happened that you didn’t anticipate — and therefore didn’t tell the program how to handle. “Do what I mean, not what I say!” But now, with LLMs, that kind of exacting precision in language and thought is disappearing. You can have a thought, ask the LLM to build it, and it will fill in all the details you didn’t specify or anticipate. All those pesky details which previously would’ve made you reflect, “Oh, I didn’t think of that. 
Maybe I should design this differently…” Or, “Oh, well now that I have to think about this some more, I can see that it might not actually be a very good idea…” The pedantic friction, which seemed like such a nuisance, was actually acting as a kind of tool for sharpening and improving your thinking and output. The exacting nature of the computer required you to think more. LLMs, however, have significantly lessened that friction. You can think less and move faster. And yet, that feels like our job as software makers: to think, to anticipate, to explicitly articulate intent. As a software user, I’d rather folks spend more time thinking so that I, in turn, have a better experience. This is preferable to giving me more stuff faster that’s only partly conceived. As an industry it feels like we’re headed in a direction where we think it’s better to ship more, faster, and fix the effects of half-conceived intent later, than to spend more time upfront discovering, sculpting, and specifying intent. That’s one thing writing code by hand has taught me: intent (what you want to build and how you want it to work) is shaped through the act of articulating it. That hard work is not required of us anymore. The LLM will fill in the details. The exacting pedantry of the computer is going away, and in its place are assumptions about intent, many of which we don’t even know about until our users run into their effects.


Yield Not Thy Core

Yield Not Thy Core Achilles Benetopoulos, Peter Alvaro, Andi Quinn, and Robert Soule EUROSYS’26 This paper describes a solution to the placement problem in distributed systems. If you model a computation as a directed graph, how do you optimally distribute the graph among a set of cooperating computers? The authors propose a dynamic placement system and implement it in Magpie . One common solution to the placement problem is to ship data over the network. For example, a set of compute nodes could access data via network requests to a separate set of nodes running Redis servers. At the opposite end of the spectrum, code can be shipped over the network. The canonical example is expressing computation as a SQL query which is sent to the node(s) that hold the relevant data. Magpie proposes a more fluid solution, where both code and data can move dynamically. In Magpie, an object represents data that is operated on. What makes Magpie objects unique is that pointers to data stored in an object are encoded as tuples. This allows Magpie to dynamically move objects around the system without invalidating pointers. The downside of this approach is that it prevents traditional libraries (that rely on raw pointers) from being used in user code. Magpie assumes a high degree of inter-object locality, so any given object is stored by exactly one node (i.e., a single object is never split between multiple nodes). User code is expressed in terms of nanotransactions and epics . A nanotransaction runs to completion on a single node and accesses a pre-specified set of objects. The Magpie runtime ensures that all objects accessed by a given nanotransaction are resident on a single node before executing the nanotransaction. The code for a nanotransaction is simple, because there is no need to query data over the network, and there is no need to deal with locking. If a hazard is present between two nanotransactions, they will execute serially. 
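The hazard rule can be pictured with a small sketch. This is not Magpie's code (the post notes real nanotransactions are written in Rust); it is a Python illustration under stated assumptions. The `NanoTxn` type, the split of the pre-specified object set into reads and writes, and the greedy batching are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NanoTxn:
    """Hypothetical nanotransaction with its object set declared up front."""
    name: str
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def has_hazard(a: NanoTxn, b: NanoTxn) -> bool:
    """True if either transaction writes an object the other one touches."""
    return bool(a.writes & (b.reads | b.writes) or b.writes & a.reads)


def schedule_batches(txns):
    """Group transactions into batches that run one after another.

    Transactions inside a batch have no hazards with each other, and a
    transaction always lands in a later batch than any earlier transaction
    it conflicts with, so conflicting pairs execute serially in order.
    """
    batches = []
    for txn in txns:
        slot = 0
        for index, batch in enumerate(batches):
            if any(has_hazard(txn, other) for other in batch):
                slot = index + 1
        if slot == len(batches):
            batches.append([])
        batches[slot].append(txn)
    return batches


t1 = NanoTxn("t1", writes=frozenset({"a"}))
t2 = NanoTxn("t2", reads=frozenset({"a"}))
t3 = NanoTxn("t3", reads=frozenset({"b"}))
print([[t.name for t in batch] for batch in schedule_batches([t1, t2, t3])])
```

Because the object set is known before execution, the conflict check needs no locks inside the transaction body, which matches the simplicity the paper summary describes.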
In Magpie, nanotransactions are written in Rust. An epic is a computation graph where each vertex is a nanotransaction and each edge is a data dependency. In contrast to nanotransactions, a single epic can be distributed across multiple nodes. Magpie schedules nanotransactions once all data dependencies are satisfied. Conflicts between concurrently running epics are handled via snapshot isolation. Any particular epic has a consistent view of each object and may abort in the event of a conflict. Scheduling and data movement are implemented hierarchically. A worker node can locally determine if it has ownership of all dependencies required for a nanotransaction. If this is the case, then the worker node executes the transaction immediately. Otherwise, the worker node uses a local ownership cache to try to determine if another node has all required dependencies and communicates with that node if possible. Failing that, scheduling is performed by a global orchestration node. Fig. 9 compares Magpie to memcached executing a workload that involves a user-specified read-modify-write operation: Source: https://dl.acm.org/doi/10.1145/3767295.3803616 Magpie is able to offer a lower latency because it is able to ship the entire read-modify-write operation to the server that holds the relevant data, rather than requiring multiple roundtrips. Some applications may benefit from being able to indicate that an object is rarely changed and thus can be distributed among multiple nodes at the same time.
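The three-tier scheduling decision described in this summary can be sketched as a single routing function. This is an illustration, not Magpie's API: the function name `route`, the decision labels, and the plain dictionary used as the ownership cache are all assumptions.

```python
def route(txn_objects, node_id, local_objects, ownership_cache):
    """Decide where a nanotransaction should run, checking the cheapest option first.

    txn_objects:     objects the nanotransaction declares it will access
    local_objects:   objects this worker currently owns
    ownership_cache: this worker's (possibly stale) guess of object -> owner
    """
    needed = set(txn_objects)

    # Tier 1: everything is already resident here, so run immediately.
    if needed <= set(local_objects):
        return ("run_local", node_id)

    # Tier 2: the local cache suggests one other node owns every object.
    owners = {ownership_cache.get(obj) for obj in needed}
    if len(owners) == 1 and None not in owners:
        owner = next(iter(owners))
        if owner != node_id:
            return ("forward", owner)

    # Tier 3: no single known owner, so defer to the global orchestrator.
    return ("ask_orchestrator", None)


print(route({"x"}, "n1", {"x", "y"}, {}))
print(route({"x", "y"}, "n1", {"x"}, {"x": "n2", "y": "n2"}))
print(route({"x", "y"}, "n1", set(), {"x": "n2"}))
```

Only the last tier involves a globally coordinated component, so the common case of good locality never leaves the worker.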


AI Doesn't Have ROI

Something changed in the last week. Shortly after Uber COO Andrew Macdonald said that it was “getting harder to justify” spending money on AI as it was “very hard to draw a line” from that spend to useful consumer features (after its CTO said Uber burned its entire annual token budget in four months), Axios’ Madison Mills reported that one company had accidentally spent $500 million in the space of a month on Anthropic’s models after failing to set spend limits. A few days later, Mills would report that other companies were now looking for ways to reduce their AI spend. That’s because, as I’ve said before, nobody can actually measure the ROI of AI, or even create a standard measurement of the cost of a task thanks to the inevitable hallucination-prone nature of LLMs and the ever-growing list of different harnesses and “agentic” (sigh) interfaces.
Every different prompt and project and interaction can go wrong in a way that is hard to predict or plan for other than having an eternal vigilance that the supposed “intelligence” doesn’t do something catastrophically stupid, because LLMs have no thoughts, consciousness or ability to learn outside of pre and post-training.  If you can’t measure how good something is, how much it might cost, or what your return on investment might be, it’s fair to ask why you’re even paying for it in the first place. People are (reasonably!) harping on about the ROI problem, but I think the “can’t really measure the cost” part is an even bigger problem.  Yesterday, Microsoft’s GitHub Copilot moved all customers to token-based billing from a premium request model ( as I reported a week before everyone ) as users had been allowed to burn thousands of dollars of tokens on a $39-a-month subscription .  Customers are irate. One burned through 50% of their monthly credits in a single prompt , another burned 60% in the space of a few hours , another 31% in a single prompt , another estimated that they’d burn their monthly credits in the space of a single five hour session , another burned nearly half of their credits in eight prompts , another around 14% of their credits in two prompts , and another lamented that GitHub Copilot had gone from their favorite subscription to their most-stressful overnight after burning 33% of their monthly balance in a few hours . And, to be clear, this is during a promotional period where you get $11 or $21 in free monthly credits: These users — much like the users of effectively every subsidized AI subscription — never really knew how much anything they did cost, because Microsoft intentionally hid the actual cost of prompts and allowed users to spend obscene amounts as a way of boosting growth for GitHub Copilot.  This problem is industry-wide. 
Every single user of every single AI subscription service is having their tokens subsidized and the actual cost of AI obfuscated. As a result, every frothy, fluffy hype-piece about Claude Code or AI in general is a kalopsia — the belief that something is more beautiful than it really is.  Think of it like this: if you’re using an AI subscription with rate limits but no actual costs , any mistakes a model makes — such as getting stuck in a loop or just doing the wrong thing — can be dismissed as the troubled nature of early-stage technology, because the “cost” was $20, $100, or $200 for the entire month. Anthropic, OpenAI and every other AI company deliberately obfuscated these costs because they knew that the second a user actually had to pay for the fuckups of an AI model they’d scream like they were being stung to death by bees. This issue bubbled to the surface in the last few months because Anthropic and OpenAI both quietly moved all of their enterprise customers to token-based billing in Q1 2026 , and because these enterprise customers are run by Business Idiots with no connection to actual work , CEOs encouraged (or actively incentivized ) their workers to use AI as much as possible, in some cases even making one’s AI use a KPI that could cost them their job.  These same workers were conditioned — through their use of AI subscription products that hide the true costs — to use them as if they cost nothing , all while being screamed at by useless middle managers to “make sure to adopt AI at scale,” all while never, ever having any awareness of what a particular unit of work cost. This was always a recipe for destruction. The overwhelming majority of AI users are completely divorced from and actively trained to ignore the true cost of AI tokens, which means they naturally use these services in a way that’s actively uneconomical. 
Every frothy hype-piece you’ve read has been written by somebody who has been conned into ignoring the true cost of AI, all in service of spreading a technology that’s unreliable, inconsistent and expensive at its core, and never, ever seems to get cheaper.  OpenAI, Anthropic and other AI companies have actively conspired to mislead the world about the true costs of AI, and it was working great right up until they decided to try charging what it actually cost. Less than a quarter into the shift to token-based billing, enterprises are freaking the fuck out, with Walmart setting token limits on its internal “Code Puppy” AI coding tool , with a spokesperson saying that it “wanted employees to apply AI in ways that create value” mere days after Amazon SVP Dave Treadwell told employees to “ not use AI just for the sake of using AI .” The last few years of AI hype have been built on lies. Every company has conspired to make you think that AI is affordable and sustainable, that profitability was possible, that hallucinations were fixable, and that any problems you faced today were a result of being in “ the early innings .” In reality, the AI industry has absorbed over a trillion dollars, effectively all tech talent, the majority of startup funding, the majority of media coverage, the art and work of millions of people, and been given chance after chance after chance to fix the obvious, glaring issues.  Every time a skeptic dared to stand out and say that none of this made sense, they were told that it was just like Uber ( it’s not ) or that Amazon Web Services cost a lot of money ( it cost $52 billion over the course of 14 years and was cash-flow positive in nine ), that “costs always come down,” and that everything would magically be alright as long as they were patient for an indeterminate amount of time. 
Four years and a trillion dollars in, AI is more expensive, its companies more cash-intensive, its products just as unreliable, and its boosters more desperate than ever to make you ignore reality as a means of empowering one of a few ultra-rich oafs. Products from OpenAI and Anthropic are built to ingratiate and coddle losers while creating work-shaped outputs that are good enough to impress braindead executives, imbeciles and middle management hall monitors that don’t do any real work, and the reason it’s worked this long is that both companies intentionally misled everybody about how much the real costs were. I must repeat myself: AI is more expensive today than it was three years ago, and it is not getting cheaper. Sam Altman’s comments about “ intelligence too cheap to meter ” were lies. NVIDIA’s Blackwell GPUs didn’t make it cheaper, and its Vera Rubin GPUs won’t either. Google’s TPUs won’t do it, Amazon’s Trainium or Inferentia chips won’t do it, Vera Rubin CPUs won’t do it, OpenAI’s chips won’t do it, and no, DeepSeek won’t do it either.  People chose — and still choose — to believe that AI would get cheaper because they think things got cheaper over time in the past, which is sort of true but not remotely similar in any way, because the cost of running and training AI models comes from using the hardware as well as its upfront cost. Large Language Models require expensive GPUs thanks to their reliance on power-intensive parallel processing, and larger, more-complex models in turn require more GPUs to both train and run inference with. And three generations in, NVIDIA GPUs don’t appear to be bringing the cost down at all, which heavily-suggests that the inherent business model of generative AI is broken. People love to compare AI to the Dot Com Bubble ( AI is far, far worse ) because it’s much easier to rationalize bad behavior than accept that we’re facing the largest misallocation of capital of all time. 
The Dot Com Bubble was really two bubbles — one around eCommerce and internet startups, and one around telecommunications infrastructure. Per Justin Kollar, the telecommunications bubble grew because of a fundamental misunderstanding of demand: As a result, infrastructure was built far in excess of actual demand, because most people weren’t online, and those who were had very slow internet connections. Per me: Here’s a critical difference between AI and the Dot Com Bubble: when people actually lit up the dark fiber, the underlying internet service was faster, better and cheaper than a dial-up connection. Services like TheGlobe, WebVan, and Pets Dot Com lost incredible sums of money not because of the costs associated with accessing their services, but because of the unrealistic and unsustainable business models themselves.  Their eventual functional forms — Facebook, Instacart, and Chewy — didn’t require fundamental scientific breakthroughs in how goods were delivered or internet services were accessed. Their failures were a result of poorly run businesses that lost money by expanding too rapidly or spending $400 to acquire each customer.   Dell and CoreWeave just turned on the first Vera Rubin GPUs, and you’ll notice nobody is saying the words “profitable” or “sustainable,” because NVIDIA is interested in making stuff more expensive rather than more efficient.  According to CEO Jensen Huang, AI data centers — which currently cost somewhere in the region of $50 billion per gigawatt — will cost between $80 billion and $100 billion per gigawatt in the future. Does this sound like it’s getting cheaper to you? Even if said data center packs theoretically more “power,” what does that “power” do for the customer running compute on it? Is it cheaper? More efficient? How do we not have these answers? 
All of this is to say that the Dot Com Bubble happened due to irrational exuberance and growth lust, and what was recovered at the end came not from scientific breakthroughs but the fact that the useful infrastructure existed and could be adapted and used to make things cheaper and more efficient. That isn’t the case with AI data centers, AI startups or anything else to do with the AI Bubble. Every few days somebody makes a post like this suggesting that “the internet didn’t go away” and “railways didn’t go away” when their bubbles popped, but I think this is a fundamental misunderstanding of what AI is. An AI data center full of AI GPUs is useful for AI and very little else. There are GPU-powered analytics tools, GPU-powered modeling and scientific applications, but the nature of GPUs — good at doing the same thing across big data sets in parallel, but bad at handling many little independent tasks — makes them impractical for most of what modern computing demands. The entire Dot Com Redemption storyline comes from the idea that it “left behind useful infrastructure,” by which they mean “cabling that allowed hundreds of millions of people to use the internet.” While there was some amount of further construction and capex to handle, the end result was useful fiber that connected people with a faster connection at a lower cost. No such story exists for AI. AI data centers are ruinously expensive, requiring billions in upfront funding with operating costs so high that they, at best, run at a loss for the first five or six years of service, if they ever recover their original costs at all. A rack of Vera Rubin or Blackwell GPUs will cost as much to run in five years as it does today, and an incomplete data center will cost just as much to finish building, connect to the grid, or supply with behind-the-meter power (i.e., generators).  
In the aftermath of the Dot Com Bubble, dead startups flooded the market with cheap server and office gear, which allowed plucky founders to cobble together their own services. A single Sun Microsystems Ultra Enterprise 3000 cost $43,000 ($89,000 in today’s money) and had a power draw of between 1,200W and 1,500W, but could run an entire company’s infrastructure. A single B200 Blackwell GPU uses 1,200W, and more-complex AI coding tasks can take up four to twelve of them for a single user’s output. Put simply, you can’t really do very much with a few of these GPUs, and what you can do isn’t profitable, scalable or valuable. Similarly, dark fiber could be lit up with the right transceivers and networking gear to create internet access. AI data centers are effectively large boxes with custom cooling built for a very limited subset of chips. Adapting them to other uses would require gutting the data center, which would mean that the vast majority of the capital expenditures were wasted.  Even if you were able to buy a hundred Blackwell GPUs from a dead neocloud, you, as a regular person, couldn’t do anything with them. In fact, nobody really could, because you’d still need a physical data center and bespoke cooling, which means that even if the chips were free, the associated construction capex or, at the very least, physical colocation space would still cost a great deal of money. The internet and railways didn’t go away because their upfront costs were the only real costs that mattered.   Even if somebody were able to pick up a cheap AI data center full of the latest generations of GPUs, the underlying operating expenses are awful, and the only way to make them even close to generating a profit is to have consistent use of all your GPUs. 
There’s a cost to having them sit idle — both in electricity and personnel — and unless the plan is to have them sit in a data center turned off until you can find somebody else to sell them to, you’ll have to come up with a business model for your AI services that actually makes a profit…which nobody appears to have done, even with unlimited capital and the entire focus of the tech industry. Then there’s the issue of training , which is entirely made up of opex. If you want to train a new model, you’ll likely need thousands — or even tens of thousands — of H100 or H200 GPUs, and they’ll cost just as much electricity whether or not you make anything useful. A failed or unhelpful training run could cost tens of millions or hundreds of millions of dollars , and that will require financial backing that won’t exist. While there could be a theoretical future of LLMs run at their true cost (IE: unaffordable for most) as I covered in last week’s premium newsletter , that would require demand, and as I’ve discussed above, the demand for AI services is a mirage built on subsidized subscriptions, and companies paying the actual costs are already screaming for mercy.  Once the bubble bursts, any excitement for AI — and by extension excitement to spend money on AI — goes out the window. AI startups won’t get funded . AI token budgets won’t get greenlit . AI data centers won’t be able to raise debt .  Every part of this bubble relies upon the momentum of hype to substantiate every link in the chain. Hype must exist around the nebulous concept of an “ AI factory ” to raise debt to buy NVIDIA GPUs and build data centers, hype must exist around AI software to convince enterprises to keep buying services from OpenAI and Anthropic, hype must exist around theoretical demand and outcomes from AI services to fund AI startups, and hype must exist perpetually in the media to make everybody ignore AI’s ruinous costs.  
This hype was unsustainable without buckets of lies, misinformation and a captured tech and business media. The value of AI has been inflated by the vagueness of how it’s discussed. For example, major media outlets will gladly write that “AI can build software,” but said sentence suggests that you can just type “build me Slack 2” into Claude and have it fart out a fully-functional, production-ready piece of software, rather than a quasi-functional mound of code-slop that can do enough to trick a business idiot or lazy journalist, but little else.  Said vagueness created a society-wide gravitational pull of consensus that you needed to be behind AI now, because it’s just like the new internet, except bigger, and if you say it’s not you’re going to be really embarrassed.   Creating this pressure was necessary, because without a society-wide aggression against those who didn’t adopt these tools, AI might have actually had to stand on its own merits. The fact that AI companies backed by the full manufactured consent of the markets and most of the economy still had to subsidize their products shows exactly how flimsy their value truly is. The only way to inflate the AI bubble both on a hardware and software level was to mislead the general public and investors on the costs and efficacy of AI models.  Now that organizations are having to pay the actual cost of AI, suddenly they’re concerned about its outcomes, and everybody has become a little hysterical. Late last week, SemiAnalysis wrote one of the most insane articles I’ve ever read — AI Dark Output: The Visible Cost of Invisible Output — saying that “AI output will be real before it is measurable,” and, well, whatever the fuck this is: SemiAnalysis is a semiconductor analyst firm with an obvious reason to keep the AI bubble inflated, and if they’re writing a piece that amounts to “AI has a return on investment, you just can’t see it,” things are getting desperate. 
Here’s how they define “Dark Output”: That “substitution dark output” is explained using a theoretical example of “...a simple legal document which in theoretical GDP should have the same inflation adjusted value to a user whether a lawyer drafts it or AI drafts it,” which is nonsense.   When you pay a lawyer, you don’t pay them to “create an output,” you buy their experience and time and ability to find and adapt case law to reach an outcome, such as in the process of filing stuff, avoiding or actively participating in litigation. Just because AI can fart out an approximation of what a human output may look like — likely riddled with hallucinations — doesn’t mean that said output was created with any “experience.” Models don’t think , they have no experiences , and even if a lawyer is prompting them , that doesn’t mean that the lawyer’s discernment or taste is reflected in the final output. Then there’s this bit: We’re four fucking years into it but we’re still using hypotheticals. Are “...the simplest documents now completed by AI and not lawyers”? You don’t get a lawyer to write a document because they’re the only ones who can write it — you get it to mitigate the risk using the experience of the law firm, both in the associate drafting the document and the partner overseeing it. This flimsy, half-assed logic is how the AI bubble got inflated in the first place. Supposedly smart people continually show a total lack of awareness of how jobs work at basically every level, and in this case — where it should be theoretically possible to find and talk to a lawyer doing this — the supposed “dark output” includes “the research done to complete this article.”  You may be wondering what that “new work done by AI that wasn’t previously being done by humans because AI made it cheap” is, and the answer is “literature reviews” and “summarizing the last six months of email,” and I wish I was kidding. 
But don’t worry, “...there are anecdotal signs that a large fraction of current token spend is for new work that wasn’t previously paid for rather than replacing existing work.” Have you ever noticed that every story about AI job loss reads like it was written by The Riddler? For example, last year a ton of outlets reported that “Oxford Economics had proven that entry-level workers were being replaced with AI,” but in reality, the study said that “... there are signs that entry-level positions are being displaced by artificial intelligence at higher rates ” with no actual data beyond post-2022 employment declines in some fields that AI might be able to do.  Similarly, CNBC’s brainless headline that an MIT study found that AI “could already replace 11.7% of the US workforce” was entirely based on a labor simulation tool rather than any economic analysis of the actual shit AI can do and what it’s doing in the real world. That’s because AI job loss is a fucking myth. Every company laying off people because of “the power of AI” is doing so because their shareholders are mad and because they know they’ll get headlines.  And if it were actually happening there’d be fucking riots in the streets! Unemployment would be spiking! Things would be burning!  The thing that everybody wants you to avoid thinking about is that if AI worked as advertised, there would be obvious, impossible-to-ignore economic signs: For all of these things to happen, AI would have to be both flawless , hallucination free, a completely different product capable of autonomous intelligence and having unique ideas.  The reason that we can’t measure “AI job loss” is because AI can’t do jobs. It can be used to replace some specific contract positions with extremely shitty versions that don’t scale , but it does not replace jobs because it is incapable of human work. 
It cannot speak to colleagues, it cannot accrue experience, it does not have instincts or culture or taste or anything other than whatever training data has been crammed up its ass or through endless post-training.  Nevertheless, the threat of AI job loss has been enough to allow both Sam Altman and Dario Amodei to raise hundreds of billions of dollars lying about it, and now that both of them have walked back their job loss scare-propaganda, every oaf and moron that believed them without actually checking should be booted out of their respective industries. It’s fucking embarrassing! You should all be ashamed of yourselves! As I said above, the ROI of AI should be really easy to measure if it actually existed.   If AI was magically able to build and maintain software, we’d have small companies that could build and deploy at the scale of a hyperscaler, and hyperscalers would, in theory, be expanding their margins so aggressively that it would create a new golden age of software revenues…or they’d become entirely infrastructure providers, as anybody else could compete on software. But on a far-simpler level, it would be extremely obvious. Anybody can access ChatGPT, Claude or Gemini, effectively anywhere in the world. The theoretical “power” of AI is that it “just does stuff,” and the proliferation of LLMs would mean that somebody would’ve “done” some “stuff” that we could point at with exceptional ease. Random guys in the Midwest would be pumping out profitable, functional, and feature-rich software. Lawsuits would be won by pro se plaintiffs with incredible counsel from a theoretical “country of geniuses in a data center.” Four years in, we’d have one major AI-powered company demolishing the competition in any industry, or every industry would become so saturated with (powerful) AI that it would effectively reduce the cost of the service to nothing.  We’d be able to point to companies that adopted AI and then completely fucking exploded. 
We’d be able to point to useless coworkers who were now doing impressive, meaningful work. There would be widespread economic upheaval, as the concept of a “large company” would lose meaning, because those theoretical “geniuses in the data center” would be automating all the work. There also wouldn’t be so many pieces insisting that AI is super powerful and so many quotes from Business Idiots saying it’s “real.” We wouldn’t talk about what AI could do at all. We wouldn’t need Anthropic to lie that Mythos was too powerful to release only to release it several months later.  We wouldn’t have to talk about the fucking potential at all because we’d be able to point to what was going on because it would be obvious! Last week, Bain & Co. released a study of 951 executives from companies with more than $100 million in revenue, and unsurprisingly, the data did not declaratively explain what the ROI of AI was: 10% of…what? What’s the cost you saved on? 10% of $10 million is a lot for a company with $100 million in revenue, but 10% of $1,000 isn’t, much like 20% or 30% isn’t either! Yet there are two punchlines to come: This also assumes that those savings are enough to warrant future spending, which…this data does not actually prove. Thankfully, Bain did manage to publish one of the single-funniest quotes of the AI bubble: Put another way, the technology “worked (?),” but did not provide value in doing so. Sounds like it didn’t fuckin’ work to me! Bain had one other crucial bit of advice: Just so we’re clear, Bain & Co, a management consultancy with billions in annual revenue, is advising its clients that they should make sure that they’re getting some sort of return on their investment? And that reinvesting in something that doesn’t have a return on investment would be bad? If AI was real, these fucknuts would be replaced first! They’d replace everybody who wrote this report! You don’t need somebody to tell you this, and if you do you’re a fucking moron!  
Thankfully, the AI industry is saved, as Sam Altman had the following to say about AI’s remarkable costs: Motherfucker you are the industry! You are the one that has to work this out! OpenAI is the AI industry! You are OpenAI’s CEO! You lazy, ignorant, dog-brained loser!  This was an opportunity for “journalist” David Faber to push back, and here’s how that went: This is how the AI bubble inflated! This is how it happened! It happened every time a journalist asked a meaningful question and then immediately diverted to a totally different imaginary topic that made the subject feel good! David Faber, resign and give your job to somebody who has an iota of courage or pride in their work! Unbelievable! Sam Altman is worth billions of dollars, and OpenAI is allegedly worth $852 billion too, and the best he can give us is “teehee, someone else will work it out,” because Sam Altman is a loser that ingratiates other losers empowered by losers to sell loser technology to other losers, and the only way that he’s been able to do this is because the people that should know better are sitting around with their thumbs up their asses asking him whether there will be data centers in space. If AI had ROI, we wouldn’t be debating whether it had ROI. We wouldn’t discuss its potential, or whether it could, theoretically, under different circumstances, in the future, in a way that nobody can describe, be super powerful and do all of the stuff it can’t do today.  If AI had ROI, we’d be able to point with specificity to inarguable examples of economic impacts. AI boosters can jerk their binguses all they like about how Spotify’s CEO said its best engineers don’t write any code anymore. What does that mean? Is Spotify shipping better features, and are those features launching at a rapid clip? Is the software more secure, or stable? Spotify’s design still looks like absolute dogshit! Most software is worse! Things keep breaking everywhere, and in many cases it’s because of AI coding tools! 
In fact, I’d be willing to believe that AI had a negative economic impact, increasing operating expenses across the board and giving some software engineers prompt-based concussions, both by automating some coding in a way that makes them lazy and bad at writing software and by speeding up the process of writing code until there’s so much of it that it’s impossible to review it all (see Mo Bitar’s video). LLMs appear to be able to write some code sometimes and do so at high speed, and they ingratiate software engineers that don’t really care about writing software by making them feel like they wrote it.  While it might allow some things to go theoretically faster, the overall economic impact of AI-generated code appears to be worse code, worse software, and massive, multi-million dollar bills from Anthropic and Cursor. I will concede that some software engineers seem to like these things, and that many software engineers appear to be using them, but I am yet to see a single one who obsessively posts about their token spend create anything of note or worth, and none of these people appear to be able to point to the actual ROI of all that AI they’re using. I realize I’m painting with a broad brush, so let me get a broader one: I believe anyone who relies on LLMs for anything is a mark.  I don’t give a shit if you use them to spit out a script or do some simple sideline part of your job, or transcribe or dictate into them, or if you’ve used them as a search engine (and even then, you best check every source!), but the moment you rely on and run your entire process on these things, I immediately doubt your ability to do anything, or at the very least wonder how gullible you truly are when somebody ingratiates you enough. Why? 
Because every single “AI setup” I’ve seen anyone ever use involves a Rube Goldberg machine of bullshit deterministic scripts to try and bring the hallucination-guaranteed nature of LLMs to heel, usually to the point that you’re doing more work making the LLM work than you did before they existed, and you’re only proud of it because you feel like you’re special. There are, of course, exceptions. I’ve talked to a few people who describe LLMs normally, without hype, who tell very specific stories of very specific outcomes that save indeterminate amounts of time. There are some that have used LLMs to create Python scripts to search and organize data, to which I say “you’re impressed with Python, not LLMs.”  If all we’re left with from this era is the ability for some people to write Python scripts without learning Python, this is still an egregious and horrifying waste of capital.  Remember: what you are using is the end result of over a trillion dollars of investment. It is only made possible through manufactured consent that actively misinforms people about the current and future capabilities of LLMs. They didn’t raise hundreds of billions of dollars by talking about any product currently on the market, and that’s because the current products are not very good products. You are all the victims of a con. No matter how “well” your Breakfast Machine of different API calls and if-this-then-that automations may or may not function, you have been sold a bill of goods for “artificial intelligence” that is impossibly stupid. When some of you are pushed to prove the ROI of AI, you immediately return to boring talking points about Uber, or the Dot Com Bubble, or some other slop fed to you by people actively conning you at this very moment.  I mean this with as much empathy as I can muster: if you’re a huge AI booster, why do you defend this so vociferously? What is it about my criticism that hurts? Is it that I’m yucking your yum? 
Is it that I don’t immediately ingest and regurgitate the theoretical idea that the thing you’re using all the time is or may become sentient? Is it because I’m not impressed?  I think it’s far more likely that people are angry that I’m asking simple questions that should — and don’t — have satisfying answers. I’m also fundamentally unimpressed with anything I’ve seen an LLM do, because my requirement for software or hardware is that it works as advertised, and the very fundament of the AI con is that LLMs are sold based on their theoretical capabilities. The reason nobody can show you the ROI from AI is that AI does not have a return on investment. Large Language Models can speed up some things in a way that becomes increasingly less-valuable and accurate with the complexity of the task, and more investment in AI data centers does not appear to do anything other than expand the number of tasks that an LLM can attempt.  While some people have been able to get something out of generative AI, that something never seems to be a tangible or impressive achievement. Every “successful” AI story is a result of either ignoring the obvious problems with LLMs or mitigating them at a great cost for an aggressively expensive and mediocre result.  LLMs are sold as “AI,” a technology best-known for automating things, yet they can’t be trusted to run anything on their own.  Instead, they manipulate the user into covering up their errors, explaining away their failures, coddling their meager returns and crediting them with the actual labor that LLMs are meant to automate away.  They do so by their investors and executives conning the media and the markets with outright lies and half-truths that exploit society’s weak points. 
The media and markets are informed by people that neither understand technology nor history, and Business Idiots that have reached the heights of their careers through diplomacy and ratfucking that care only about attention and adulation for things that other people do.  LLMs coddle the easily-led and narcissistic into believing that the model is doing the work as the human being has to constantly cater to the model’s inefficiencies and inabilities, using more energy and resources than any technology ever made.  And yet with all the money, all the attention, all the resources, all the land, all the power, all the affordances and excuses and endless fucking applause for mediocrity, nobody can actually point to the ROI of AI, because it doesn’t exist outside of it burping out stolen content and enriching and ingratiating billionaire dullards. Even at a hundredth of the price I’d be dismissive, because everything I’ve seen is so decidedly unexceptional. I realize that some will say I’m dismissive of LLMs’ capabilities, and I’m sorry — I’m just not impressed. You spent a trillion dollars to make it somewhat easier to code some things sometimes but not in such a way that it actually results in anything, research reports that nobody will read, shitty powerpoint decks and excel spreadsheets, and art that looks like stock images because that’s exactly what it was trained on.  This shit needs to work every time without fail and be absolutely flawless and autonomous.  You are paying for a tool. You are paying for software. You are a customer. Your job is not to explain to others why this is exciting, nor is it your job to cover up for its mistakes. If you truly love this stuff you should be either secure enough in doing so that you don’t feel compelled to defend it or be demeaning to those that disagree. The fact that I have to write that sentence is proof that something is very, very wrong with the AI industry, and that LLMs are about far more than software.  
If you liked this piece, you should subscribe to my premium newsletter. It’s $70 a year, or $7 a month, and in return you get a weekly newsletter that’s usually anywhere from 10,000 to 18,000 words, including vast, detailed analyses of the biggest events and companies in the AI bubble.  The foundation of software would be destroyed, as literally anyone could create and maintain any software they desired. Literally nobody would buy any software because they’d just type “computer make me a Slack clone for my organization” and it would magically appear on AWS.  The SaaSpocalypse (see my premium here) is a media and market-based hallucination where the collapsing growth of software companies is being explained as “AI taking their business” versus “private equity and venture capital overvalued software companies between 2018 and 2022,” to the point that Apollo’s John Zito said “all the marks are wrong,” which is very bad, but has nothing to do with AI. Accountancy would completely collapse, as nobody would need anyone but ChatGPT to do their taxes. Law schools would collapse, because legal internships would become useless and law firms would no longer have need for the thousands of new associates, because ChatGPT could just draft it all.  Legal salaries would also dramatically collapse. Research in effectively every discipline would collapse, because you could ask for a detailed report and said report would be better than any human being creates. The entirety of scientific research would change, because you could now automate many different disciplines out of existence.

Stratechery Yesterday

The Google Capital Company

What does the most beautiful business model of all time look like? First, imagine that your supply is free. Second, imagine that your customers willfully compete against each other to raise your prices. Third, imagine that your users decide which of your customers gets the privilege of paying you. All you have to do is build a bit of infrastructure to make it all happen, pay a nominal bit of depreciation on that infrastructure, and make billions of dollars on some of the greatest margins in the history of business. I am, of course, describing Google, a company so good that Warren Buffett, the legendary investor, could never quite bring himself to invest in it. Buffett explained in the 2017 Berkshire Hathaway annual meeting: We were their customer very early on with GEICO, for example, and we saw — these figures are way out of date — but as I remember, we were paying them $10 or $11 a click, or something like that. And any time you’re paying somebody $10 or $11 bucks every time somebody just punches a little thing where you got no cost at all, you know, that’s a good business unless somebody’s going to take it away from you. And so we were close up seeing the impact of that…But, you know, you’ve almost never seen a business like it. One of the characteristics of an Aggregator like Google is the way in which they maximize absolute value at the expense of relative value. For supply — i.e. content on the web — Google dramatically increases the number of visitors, even as the value of any one visitor who comes from Google is worth much less than a visitor who visits directly; for an advertiser, the value of one click makes up for thousands of impressions of an ad that make no difference; for a user, Google helps them discover what they are looking for amidst the overwhelming abundance that is downstream from distribution being free. 
In every case the Aggregator increases quantity at the expense of relative quality, confident that the absolute amount of quality will be more in the long run. What is interesting is that this is the exact inverse of why investors have valued these companies. The best tech companies are “asset-light”, predicated on maximizing zero marginal costs. Yes, they spend a lot of money on R&D and on the infrastructure to make markets happen, but they don’t actually participate in those markets; simply taking a skim and keeping the vast majority of that skim is what gets Wall Street excited. In other words, it was the relative amount of money made that was generally more important to the market than the absolute amount of money. Berkshire Hathaway was, before Buffett acquired it, a failing textile business; Buffett originally invested because the stock was worth less than the liquidation value, and ended up owning it outright after a dispute with management. It was a decision he regretted; from the company’s 1989 letter to shareholders: If you buy a stock at a sufficiently low price, there will usually be some hiccup in the fortunes of the business that gives you a chance to unload at a decent profit, even though the long-term performance of the business may be terrible…Time is the friend of the wonderful business, the enemy of the mediocre… I could give you other personal examples of “bargain-purchase” folly but I’m sure you get the picture: It’s far better to buy a wonderful company at a fair price than a fair company at a wonderful price. Charlie understood this early; I was a slow learner. But now, when buying companies or common stocks, we look for first-class businesses accompanied by first-class managements. One of the first-class businesses Berkshire Hathaway acquired was See’s Candies in 1972. 
Buffett explained in the 2007 shareholder letter : We bought See’s for $25 million when its sales were $30 million and pre-tax earnings were less than $5 million. The capital then required to conduct the business was $8 million. (Modest seasonal debt was also needed for a few months each year.) Consequently, the company was earning 60% pre-tax on invested capital… Last year See’s sales were $383 million, and pre-tax profits were $82 million. The capital now required to run the business is $40 million. This means we have had to reinvest only $32 million since 1972 to handle the modest physical growth – and somewhat immodest financial growth – of the business. In the meantime pre-tax earnings have totaled $1.35 billion. All of that, except for the $32 million, has been sent to Berkshire (or, in the early years, to Blue Chip). The “problem” with a See’s Candies is that there is nothing to be done with all of that profit; if it’s privately held then its owners end up with more cash than they know what to do with, and if it’s public, then the job is to figure out how to return that cash to shareholders through some combination of dividends and stock buybacks. What Berkshire Hathaway did, however, was use that cash to grow: After paying corporate taxes on the profits, we have used the rest to buy other attractive businesses. Just as Adam and Eve kick-started an activity that led to six billion humans, See’s has given birth to multiple new streams of cash for us. (The biblical command to “be fruitful and multiply” is one we take seriously at Berkshire.) One of the businesses Berkshire Hathaway used the See’s profits for was on the opposite end of the spectrum in terms of capital utilization: BNSF Railway. Railways require a lot of capital to operate; BNSF consumed $3.8 billion last year; they also make a lot of money: BNSF’s net income was $5.5 billion on revenue of $23.4 billion. 
To put that in perspective, the total amount that Berkshire Hathaway has made from See’s Candies is probably less than $3 billion (the last disclosure was “over $2 billion” in 2019), i.e. less than BNSF made last year. So which is the better business? In Q4 2019, the first year that Alphabet disclosed Google Cloud revenue, Google Services — the high margin beautiful business I described at the beginning — made $43.2 billion in revenue and $13.5 billion in operating profit; Google Cloud made $2.6 billion in revenue and lost $1.2 billion. Google Cloud revenue was 6% the size of Google Services. In Q1 2023, Google Cloud made a profit for the first time. In that quarter Google Services made $62.0 billion in revenue and $21.7 billion in profit; Google Cloud made $7.5 billion in revenue and $0.2 billion in profit. Google Cloud revenue was 12% the size of Google Services, and its profit was 1% the size of Google Services. In Q1 2026, Google Services made $89.6 billion in revenue and $40.6 billion in profit; Google Cloud made $20.0 billion in revenue and $6.6 billion in profit. Google Cloud revenue was 22% the size of Google Services, and its profit was 16% the size of Google Services. Google Services is, needless to say, a much more scalable business than See’s Candies. The growth just over the last seven years — more than doubling revenue and tripling profits — is astounding. And yet, at the same time, Google Cloud is growing faster, and while its margins are worse — 33% last quarter as compared to 45% for Google Services — they are expanding more rapidly. The bigger question is how big can those numbers go? Google Services’ advertising business is inherently high margin, but advertising is definitionally but a fraction of the overall economy; Google Cloud’s growth, meanwhile, is AI, which many people think/worry/hope might take over the entire economy. 
In other words, might we one day look back and realize that Google Services provided the cash flow to build a business with relatively worse margins but absolutely higher dollars, much like See’s helped fund BNSF? The context for this discussion is this news from Bloomberg : Google parent Alphabet Inc. is raising $80 billion through a package of equity offerings, including an investment deal with Berkshire Hathaway Inc., as the company races to fund its ambitious artificial intelligence spending plans. The undertaking includes a $40 billion so-called at-the-market program to sell shares from time to time beginning in the third quarter, according to a statement Monday. The company will also offer $30 billion in underwritten offerings of shares and mandatory convertible preferred stock, as well as the $10 billion deal with Berkshire. Together, the transactions represent one of the largest equity deals of all time — and they bring an unexpected twist to a blockbuster year for initial public offerings. First off, a decent portion of the ATM program, launching in the fall, is going towards paying tax obligations on Google equity awards (which are quite large thanks to the stock’s run-up in value). That leaves equity being issued now, particularly the $10 billion to Berkshire Hathaway, which is fascinating for a number of reasons. The first question is why did Google issue equity instead of debt? Debt is, all things being equal, the preferred instrument for investment: the proceeds of the latter pay off the former, and existing equity holders reap all of the benefits. Equity, on the other hand, removes the risk of debt, but at the cost of giving up a share of future profits. Google has to date funded its massive AI-related capital expenditures with free cash flow, and while the company does have around $81 billion in debt, that is more than balanced by $126 billion of cash. 
In other words, Google’s capacity to issue more debt — and to reap the financial benefits of doing so (because interest is tax-deductible) — is substantial. That leads to what may be the Occam’s Razor explanation: Google is also going to start issuing a lot more debt, which is to say that everyone continues to underestimate the amount of demand there is for compute. Of course that’s not far off from a more bearish interpretation: Google is uncertain about the return on investment of all that capex, and would prefer to share the risk (along with the upside). If there isn’t a substantial debt issuance down the road then this might be the right answer. The second question is why is Berkshire Hathaway suddenly, after all these years, interested in Google, and at only a slight discount to its all-time high price? Does it really just come down to the fact that Buffett is no longer making investment decisions, and Greg Abel, his successor as CEO, is? In fact, you can make the case that Abel is actually just replaying Buffett’s strategy, only this time Berkshire Hathaway is See’s Candies, and Google is BNSF. At the end of last quarter Berkshire Hathaway had $373 billion in cash, and $25 billion in free cash flow in 2025. How many companies could actually employ that cash in a way that generated a high rate of return? It’s hard to imagine a better option than Google. The company is not only investing in AI, but has optionality in terms of outcomes: its Services business benefits from the investment, it is in contention at the model layer with Gemini, and it can sell capacity to the frontier labs. Moreover, that capacity has a sustainable cost advantage because of TPUs, which means that in a world where compute becomes a commodity — as hard as that is to imagine right now — Google is the hyperscaler that is poised to make the most profit. It is worth noting that $10 billion is a relatively small amount of money to both companies. 
To that end, perhaps the primary utility is as a signaling mechanism. On Google’s side, the signal is that the expected demand is actually far greater than anyone thinks, and that the company is ready and willing to fund supply using all means at its disposal, including equity; for them Berkshire Hathaway’s investment is an endorsement of this view and a validation of the wisdom of the investment. And, on the flip side, if the signal is correct, then Berkshire Hathaway is getting a deal and putting its cash flow machines to work building the future. A couple of months ago, when Anthropic was clearly ascendant, OpenAI backers tried to make the case that actually OpenAI was in the driver’s seat because the frontier lab had secured more compute; I made the case in Mythos, Muse, and the Opportunity Cost of Compute that this was not at all dispositive: OpenAI is betting that this compute constraint — and the deals they have made to overcome it — will matter more than Anthropic’s current momentum with end users…I’m less certain that this will be dispositive. When it comes to AI, distribution and transaction costs are still free — the two preconditions for Aggregators — which means that the winners should be those with the most compelling products. Those products will win the most users, providing the money necessary to source the compute to serve them; consider Anthropic’s deal to secure a meaningful portion of TPU supply, which, given the capacity constraints at TSMC, is ultimately an example of taking supply from Google. I suspect that Anthropic can take more, including already built hyperscaler and neocloud capacity. Yes, that compute will be more expensive, but if demand is high enough the necessary cash flow will be there. That thesis was proven correct just weeks later when SpaceX ponied up the supply Anthropic needed (and yes, it was expensive): This deal is a perfect example of what really is basic economics. 
First, if demand exceeds supply, then prices should increase. At the same time, prices are elastic: if they are lower there is more demand, and if they are higher there is less demand. In this case, while there is broad demand for computing, Anthropic has arguably the most demand; furthermore, Anthropic has the most willingness to pay, not just because they are making meaningful revenue, but also because they have the capacity to raise money in the pursuit of winning in AI. Implicit in this analysis was that there was enough compute capacity in the world to be bought; what happens, however, when and if there isn’t? What if the ultimate battle — the one that determines who gets compute — becomes a matter of who can bring the most cash to bear? And what if that advantage compounds, such that the company with the most cash capacity ends up with the most compute capacity (which we already know they will sell, in addition to using themselves) driving the ability to generate more cash? In that world, what company would be your best bet? We now know which one Berkshire Hathaway is betting on.

1. As an aside, it’s notable that Alphabet has another business — Waymo — where the company has so far rejected an asset-light model of licensing their software to OEMs, and has instead to date pursued a much more capital intensive approach of owning and operating their own cars; that’s a choice that has always felt at odds with Google Services, but is perhaps more compelling and aligned with Google Cloud and the Google Capital Company.

Martin Fowler Yesterday

Fragments: June 2

Greg Wilson has noticed that lots of folks are using dodgy metrics to figure out if AI tools are worth their costs. Would you measure lines of code generated, or tickets closed? Or would you send out a survey asking whether developers feel more productive? Each of those approaches is flawed in a different way. He lists lots of common metrics, and why they are flawed. Sadly he doesn’t give any suggestions on what would be better. In my view, since we cannot measure productivity, any metrics are weak evidence at the best of times. I do somewhat use one of his flawed measures: “Asking Developers If They Feel More Productive”. While I acknowledge the problems he gives with this measure, I find that in an environment where decent measures are hard to find, even such a dim light is the best we have. In this situation these kinds of qualitative metrics may not be conclusive, but they are useful.

❄                ❄                ❄                ❄                ❄

Benedict Evans observes that extensive automation didn’t mean the demise of professions in the past. we spent a century automating accounting: we built calculating machines, punch cards, mainframes, data processing, databases, PCs, spreadsheets, ERPs, cloud… in fact, we built half of the tech industry around automating this. Yet the number of accountants kept going up. He goes into the myriad of problems that exist when we’re trying to forecast the impact of a technology on jobs. There’s the much-talked-about Jevons paradox - once something becomes cheaper, people do it more, which can increase demand. Often this leads to the nature of jobs changing, even if it’s called the same thing. Accountants today aren’t doing exactly the same work that they did in 1970 or 1980 ‘but more’ - they’re still called ‘accountants’ but the job is different. New technology often starts out being used for ‘the old thing but more’, but it rarely ends up like that. 
Technologies often affect whole businesses - consider the impact of the internet on news publishing. Did anyone observing the rise of smartphones in the early 2000s realize that one consequence would be a change in the economics of taxis due to the rise of ride-sharing apps? The conclusion is that it is, at the very least, almost impossible to forecast the impact of AI on our work.

❄                ❄                ❄                ❄                ❄

Stephen O’Grady looks at how closed and open models have performed on benchmarks over time. Closed models are setting the pace of innovation, and constantly breaking new ground from a capabilities standpoint. Open models are chasing them, and the cycle times seem to be getting shorter. There are no clear capability moats, and what is frontier today is table stakes tomorrow. It took 13-18 months for open models to catch up to GPT-4 on these benchmarks, but only 2-7 months to catch up to GPT-4o. There are a bunch of caveats to this analysis, which he lists, but it’s a worthwhile survey of how various kinds of models perform against the various measures we are trying to assess them with.

❄                ❄                ❄                ❄                ❄

One of the starkest examples of sloppy AI use is hallucinated citations - a give-away of both usage of LLMs and carelessness driving them. GPTZero is a company that makes tools to detect AI writing. I’ve no insight as to whether their tool is effective or not, but they do publish investigations of AI usage, and have published several articles highlighting hallucinated citations. One post focuses on Ernst & Young Canada’s report on cyber threats to loyalty systems and found that more than half its references were hallucinations. The post uses a lot of extremely annoying animations in how it presents its information (breaking Safari’s reader mode in the process). 
But the harm that these kinds of AI-generated reports can do goes further than just some misled humans: Publishing a report online is essentially a form of data injection into the pool of knowledge that is the internet. When the report includes fake information (either vibed citations or false claims) it can “poison the well” by misleading future researchers, especially if the report is published by a well-known consulting firm and hosted on a high-traffic website.

❄                ❄                ❄                ❄                ❄

As LLMs get more capable in programming, we are rightly worried that people will use them to attack software systems. But these models can also be used for defense, allowing teams to find bugs before attackers do. Some folks from Mozilla posted an article on how they’ve used an AI model to identify and fix an unprecedented number of latent security bugs in Firefox. Just a few months ago, AI-generated security bug reports to open source projects were mostly known for being unwanted slop. Dealing with reports that look plausibly correct but are wrong imposes an asymmetric cost on project maintainers: it’s cheap and easy to prompt an LLM to find a “problem” in code, but slow and expensive to respond to it. It is difficult to overstate how much this dynamic changed for us over a few short months. This was due to a combination of two main factors. First, the models got a lot more capable. Second, we dramatically improved our techniques for harnessing these models — steering them, scaling them, and stacking them to generate large amounts of signal and filter out the noise. During 2025, there were 17-31 security bugs fixed each month. In April 2026, they fixed 423.

❄                ❄                ❄                ❄                ❄

Pavel Voronin riffs on Unmesh Joshi’s post on What is Code. He observes that cruft in a codebase (technical debt) has always added friction to software development. 
But the consequences of this cruft are compounded when LLMs are using existing code as context for future work. In a degraded codebase, the model does not see “technical debt” as debt. It sees examples. It sees precedent. It sees a style to continue. LLMs multiply what’s currently happening. I hear reports that good code might take the place of much of what’s put in markdown, because LLMs will imitate what’s already in the code base. But bad code multiplies too. Inevitably he introduces another variation of rampant debt metaphors: Cognitive debt accumulates when a team uses abstractions it no longer understands. Generative debt accumulates when a codebase contains confused concepts that models are likely to continue. Cognitive debt is about what the team no longer understands. Generative debt is about what the model is now likely to reproduce.

❄                ❄                ❄                ❄                ❄

Jason Koebler, from the very worthwhile 404 Media, has written a plaintive essay on how AI-generated slop is driving us crazy. Not just because it’s filling the web with this slop, but also because of how it’s making us humans react to slop and the threat of slop. We review our own writing and notice: it’s not just reading AI slop that hurts us, it’s the risk that we write something that looks like AI slop. If I use phrasing that AI copied from me, does it seem like I’m copying AI? This has led to the appearance of “humanizers” - AI tools that make our writing look less like AI. Humanizers add typos, randomly replace words, remove “AI tells,” and sometimes insert random characters. It’s another step on the way to the Zombie internet: I called it the Zombie Internet because the truth is that large parts of the internet are not just bots talking to bots or bots talking to people. It’s people talking to bots, people talking to people, people creating “AI agents” and then instructing them to interact with people. 
[…] It’s my email inbox, in which I used to occasionally get poorly-formatted, poorly written, extremely long emails from delusional people who were positive the CIA had imprisoned them in a virtual torture chamber using undisclosed secret technology but where I now get well-formatted, passably written, extremely long emails from delusional people who are positive they have proven AI sentience and have the AI transcripts to prove it.

❄                ❄                ❄                ❄                ❄

Addy Osmani points out that spawning lots of agents is like launching a bunch of parallel processes that all rely on a single orchestrating thread - yourself. Python has the Global Interpreter Lock (GIL). You can spawn as many threads as you want but only one executes Python bytecode at a time because they must acquire the lock. You are the GIL of your AI agents. They all can run at once. But when any of their work needs genuine understanding of the architecture or resolving merge conflicts, that work has to acquire the lock. There is one lock. You hold it. This means you must design the workflow with the agents with that GIL in mind. You shouldn’t launch more agents than you can properly review. It’s handy to separate background tasks that can be offloaded to an agent from complex tasks that require applied attention. Don’t use that precious brain for things that the machine can verify itself. [And I’d add - do get the machine to build tools that ease human verification. For example, it’s better to surface test case data in tables rather than buried in assert statements.] Spawning agents is not the skill. Anyone can run 20. The real skill is designing the system around the one serial resource that cannot be cloned or parallelized. That resource is your attention.

❄                ❄                ❄                ❄                ❄

Jamie Hurst is a Principal Engineer at Booking.com, where he works in developer experience with a focus on AI tooling. 
He’s written realistically about the gains and losses of using LLMs in this work. The cost of building has collapsed, but the cost of aligning organisationally has not. If anything, it’s gone up. When three different teams can each produce a working solution to the same problem in the time it used to take to write a proposal, the bottleneck moves from engineering to coordination. He thinks he’s able to do more as a senior engineer, but is concerned about how sustainable it is, both for him personally and for the organization he works for. He’s able to shape directions for multiple workstreams at once, in a way that he couldn’t three years ago. But one loss is that he doesn’t have enough time for mentoring, which will exact a toll on his employer in the longer term. He also finds he doesn’t have enough time to think. The productivity gains from AI got captured by output volume rather than output quality. The org’s expectations rose to absorb the speed-up, and the slack that used to exist between tasks, the unstructured time where strategic thinking actually happens, got eaten first because it’s invisible on a dashboard. I’m at a point in my career where thinking is supposed to be most of the job, and most of it now happens on holiday because the working week doesn’t accommodate it.

Brain Baking Yesterday

Favourites of May 2026

May was another weird month here in Belgium: the last weeks have been unusually hot. It’s pouring now, but I’m glad that it is, as it gives our air conditioning units a few moments of respite. We’ll see what the upcoming summer months will bring. This month is packed with exams, grading, and deliberations, but after that, school’s out! Which means I’ll have to play daddy day care as my wife isn’t as lucky as me when it comes to paid leave. Hopefully I’ll emerge beaten up but victorious. Previous month: April 2026. By Wouter Groeneveld on 2 June 2026. I somehow managed to keep up the pace from last month and finished four (mostly small) games: Duck Detective: The Ghost of Glamping, which is the second episode of the Duck Detective’s deducktions-based adventure game. It’s nice but only two hours long. Forbidden Solitaire, without a doubt the best and most unique card game with horror vibes I’ve ever played. Its atmosphere is inspired by nineties FMV games. Check out the launch trailer and tell me you don’t want to rush out and buy it. Pipistrello and the Cursed Yoyo is inspired by GBA-era top-down adventures. Its unique mechanics, pixel art, and soundtrack are superb, but the platforming near the end was a bit too much for me. Strange Horticulture is a cosy puzzler/visual novel where you have to figure out which client wants which plant. Forbidden Solitaire is a candidate for my GOTY. Yes, it’s that good. Are we past the point of The Last Human-Written (academic) Paper? Scientists Liu et al. compile a hefty report on what they call agent-native research artefacts. The results are… worrying? Promising? Popcar reviewed every single UFO 50 game. I don’t agree with many ratings but they’re well-explained. Noah Clements discusses how he hacked his bicycle monitor. The “welcome to hell developer” message is hilarious! 
James Sweeting thinks there should be a better term for game remakes and remasters: some specific types could be called videogame remakes instead. PekoeBlaze sees a difference in private versus public creativity. Do you make something different if you don’t have an audience? Seth Godin reminds us that nostalgia used to be deadly: For hundreds of years, nostalgia was seen as a serious disease, with doctors across Europe scrambling for a cure. Hundreds of thousands of people died from it. I enjoyed Pablo Meier’s review of Tunic. Every blog post by Pablo is accompanied by a song (“The song for this post is…”) which is a whimsical way to start an article. What’s a metroidbrania? Even as a metroidvania lover, I hadn’t come across that term before. GiovanH explains the differences and intersection points. Is Animal Well a metroidbrania? Diederick de Vries played around with a retro printer and explained how you can use a “Retro Printer” daughterboard on top of a Raspberry Pi to connect an old electronics port to a modern printer. Inspired by my workspaces post, Diederick also shared his past and present computing workspaces. Nicole Express unboxes a unique high-tech computer. You’d be surprised. I enjoyed the many photos present in this one emphasizing the high-tech part. Slightly related: Stephen Sherratt restored his childhood family computer. I’m a sucker for these kinds of articles. Some more articles to rattle the cage: just fucking use Go. Fuck no: Go is a terrible programming language. Ty Porter explains how he built a Game Boy Game in 2021. Very educational! Blain Smith reflects on 30 years of programming at 44 years old. Profit-pursuing enterprises indeed kill the joy of programming. Matthijs van Boxsel maintains a “digital encyclopedia of foolishness” where in 2021 he wrote about the Homo Viator (in Dutch). Miss Booleana writes about the struggles of motherhood (in German). Her experiences with parenting felt very much like ours. 
Matthias Wiesmann shared his thoughts on the video game Loop Hero . I’ve had my eye on that one for a long while; perhaps it’s finally time to try it out. Fabian Sanglard expresses why he likes the Magic: The Gathering 40 card format more than the usual 60. Over at ResetERA, Toma played 120 roguelikes so that we don’t have to . Their report is very extensive, a must read if you’re into the genre. Jeff Kaufman again urges us to donate 80% of our income and he has the numbers to prove it works for his family. I came across david.reviews , another indie site David maintains where he collects his thoughts on the media he consumes. After ten years of tinkering, Steve McCrea managed the impossible: to recreate Ultima Underworld in the Unity engine. It’s freely available at itch.io and looks very impressive compared to the original. Keep It In Your Pants! What? Apparently, Nintendo thought it was a good idea to create a few bold commercials for their Game Boy—including cuffing a woman to the bed as a misplaced joke to claim the GB is “seriously distracting”. The roguelike Game Boy game Roguecraft got a physical release! Of course I’m too late and everything is sold out. https://www.jwt.io/ is a JWT token debugger that shows how a token is compiled. Retro Ready is a nice Dutch blog that reviews retro-inspired handhelds (think Anbernic, GameMT). It seemed that Atari acquired the rights to the first five Wizardry titles . I still hope they’ll sometime, somewhere do something useful with Wiz 7 & 8. They deserve a proper reimagining/remake/whatever-you-call-it. Geneat is a website that helps you dig into the genealogy of your family. https://cooklang.org/ is a… recipe markup language? Nerds!

ava's blog Yesterday

be a good cook when you use AI to edit your writing

Whenever someone talks about how they let AI improve their writing, I realize we are still taking the wrong things away from what good writing supposedly is. Not that I am the arbiter of good writing, but we can agree that at its core, good writing is a pleasure to read, connects to the reader, respects the context and chooses the correct tone for the audience. It’s also about correctly identifying when “good” writing is needed at all. I think the literacy crisis we are going through extends in this way, where people aren’t just lacking in media literacy, but lacking in the skills above. It’s easy to think that good language is always full of jargon, that being an expert on something means long, drawn out explanations, and that you should use a supposedly intelligent, professorial tone all the time. It’s only with education and reading a lot that you learn that good writing is a spectrum, and all these things depend on the author’s style, occasion and intent, and are used in the right moments. That’s why people who generate responses to casual chat messages aren’t being met with excitement; cases like when you get an AI-generated Happy Birthday text, or when your friend replies to your vent with an AI-generated professional therapist response. These people want to do it right, but don’t respect context-switching and why some interactions need to decidedly be personal and “imperfect” to others. They have only taken away that good writing is big words, many words, and they are willing to shoehorn it into everything. I cannot blame anyone for reading over an AI-generated improvement of their text and thinking “Wow, that’s so much better!”. On the first read, it does seem impressive. And I don’t wanna sit here and pretend humans don’t manage to choose a completely wrong tone for the occasion or audience without AI as well, but it seems like many don’t actually tell AI the context or audience, and AI guesses incorrectly. 
People know what you sound like and how you usually write. Of course you are allowed to improve and change your writing style, but people will know when it is very sudden, completely out of character, and not something you’d manage on your own. And if you overdo it, AI will turn a concise, engaging and personal read with your own endearing quirks into either SEO marketing language, or an extremely dry scientific journal style read. You should be able to detect when that happens and take a step back. Otherwise you will sit there, proud of yourself that you wrote that, when it is so markedly different to your usual style and draft that you essentially employed a ghostwriter and patted yourself on the back for its output. And weirdly enough, I get the feeling many of you were never interested in “improving” your writing when it didn’t mean just copying a machine’s work. That’s having an editor, not you improving on your skills. You can liken it to skills in the kitchen: People who are just learning how to cook are learning about spices and think: The more spice, the better, so throw all of it in! Until a dish doesn’t taste good at all; too salty, too intense, everything is clashing. There is a point when it doesn’t elevate the dish, but ruins it. Some occasions don’t call for a curry, but instead a salad. A good cook will know the right dish and how to use ingredients and spices to make it pop. Don’t come with the fine dining if the people want your rustic potato bake. Employing AI to improve your text into oblivion is a slippery slope to sounding uneducated and phony. Please get away from the notion that longer and more complicated is better just for the sake of it. Published 02 Jun, 2026

0 views
Unsung Yesterday

A mixed blessing

The otherwise excellent note-taking app Bear has an interesting bug that’s worth talking about while it’s still here. When you’re on a to-do item, you can press ⌘. (period) to toggle the task complete or incomplete. It’s actually a really fun shortcut in practice. But when you have a larger selection with a mixed state (some checkmarks are on, others off), each checkbox inverts individually, so the selection stays mixed. This feels like an obvious thing to implement, and this is also where the code itself wants to go when left alone. But this is not great. The rule is: When you have a mixed state, changing it should collapse (or: normalize) the entire selection to one state or the other, rather than perform individual inversion. Try ⌘B in your text editor on partially bolded text, and you can see that collapse in action. It feels strange to recommend that, particularly as it seems like it loses data. So, what gives? The first argument is “do not make the user jump through hoops” or maybe “respect a large selection.” If, as a user, I want to actually make sure all my tasks are done, the shortcut not being idempotent means that I now have to go through tasks one by one, and that’s a lot of work – especially since we’re talking about text selection, which is famously unpleasant. The second reason is that even the UI layer has an opinion here. In the above bolding example, Pages collapsing the selection to bold when you press the B toggle makes the toggle UI behave exactly as it normally would with a simple selection. Elsewhere, in Figma, typing a number on top of a “mixed” state changes all the properties of relevant objects to that number. Imagine how awful it would feel mechanically in both the above examples if your action would still leave the text in the “mixed” state.
It would simply appear like the UI broke, since the change didn’t fully “stick” – kind of like those tiny hated moments when you close the door, but it doesn’t latch on and reopens on its own, or when you engage the turn signal stalk, but it refuses to stay put and snaps right back. There is also one last reason. It’s the simplest one that I sometimes have to remind myself to put in my head before I jump too deep into the mechanics, or details, or technical nuances. Let’s say the toggles invert individually on a large selection. Who would ever benefit from it behaving this way? #bear #bugs #details #flow #keyboard
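The collapse rule from this post can be sketched in a few lines. This is only an illustration, not Bear’s code: the `Task` class and both function names are made up here, and the choice that a mixed selection collapses to “all done” is an assumption (the post only says it should collapse to one state or the other).

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical to-do item; not a type from Bear or any real app."""
    title: str
    done: bool = False


def toggle_each(tasks: list[Task]) -> None:
    """Per-item inversion: a mixed selection stays mixed."""
    for task in tasks:
        task.done = not task.done


def toggle_selection(tasks: list[Task]) -> None:
    """Collapse the whole selection to a single state.

    Assumed policy: if every selected task is already done, mark them all
    incomplete; otherwise (none done, or mixed) mark them all done.
    """
    target = not all(task.done for task in tasks)
    for task in tasks:
        task.done = target


if __name__ == "__main__":
    selection = [Task("write", True), Task("edit", False), Task("ship", True)]
    toggle_selection(selection)          # mixed -> everything done
    print([t.done for t in selection])
    toggle_selection(selection)          # uniform -> everything incomplete
    print([t.done for t in selection])
```

With this policy, one press always leaves the selection in a uniform state, and a second press flips that uniform state, which is the same thing a single checkbox would do.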

0 views
Unsung Yesterday

“Whoever is doing this had some fun with adding these portions.”

I hate lorem ipsum with a passion, but guess what? There’s more intentionality in it than I assumed, and even easter eggs, as this fun 25-minute mini documentary from Emily Zhang/The Rabbit Hole shows: You can tell that this was not the work of an amateur. The garbled text is done with a lot of care and knowledge. You can see a lot of rational decisions about why it looks the way it does in there. They are making very careful additions, such as adding letters… the “-and”s and the “-ng”s… The “y” got stuck in because that’s an English letter. […] I think the fact that it is garbled Latin text, and that it has those other letters in a fairly Latin-based alphabet amount of frequency, speaks to that it was done very, very carefully. Just a fun watch all around. #typography #youtube

0 views
annie's blog Yesterday

Adorable tiny skulls // Week 22 — 2026

Sunday came, Sunday went, but the notes can be week notes any day they want to be. Current situation:
Monday 25 May: Memorial Day, also hospital day. Having to work certain holidays is a new thing for me. The long light days of summer begin. Last night it was light until well after 8. Pool is open now. Hasn’t been very warm yet but that will change quickly. We’re a week away from June.
Tuesday 26 May: Thinking about the cost of optimizing: The more you optimize, the more difficult it is to be flexible. A danger of losing resilience. There’s an underlying principle at work here. Stress creates resilience. (Scott Hogan, Built from Broken) On the other hand: A nervous system that is constantly in sympathetic mode cannot hold complexity. (Nate Hagens, A Framework for Action, YouTube link)
Wednesday 27 May: Back to runnnning. I did a Couch ➡︎ 5K program over March and April, ended in early May. Took a few weeks off. Wasn’t sure how I’d do today but all the muscles seem to remember what to do. Now I’m pleasantly tired, my legs are sore, and I feel amazing. I kind of wish running didn’t feel so good because it’s also so goddamn awful. Got Rob all moved out today. 😭 It’s fine, it’s good, he’s ready, it’s great, it’s time, blah blah blah blah blah I hate it. He’s still in town, at least.
Thursday 28 May: One of those days where there are too many things. We gotta quit with all the things. So many things. One thing then another thing. LET ME NAP. The linden trees are blooming and they smell amazing. Also the magnolias. After everything, the wild went on. Of course it did. (Moonbound, Robin Sloan)
Friday 29 May: You know you are genuinely in old-person territory when sleeping till 7 feels late. Also, impossible to sleep past 7. My back has a strict time limit on when it must no longer be on a mattress. I find this upsetting and unfair. 💜 Mara here for the weekend!
Saturday 30 May: Hospital day. I have GOT to get better about having some sort of dinner mostly prepped on hospital days because otherwise I come home and just eat whatever I can grab like a starved maniac. Gremlin mode activates. I just stood in the kitchen shoveling stale old potato chips in my mouth for… I don’t know how long. Let’s not talk about this anymore.
Sunday 31 May: Hiking church. Very muddy after all the rain. Delightful water sounds everywhere. Spontaneous tattoo time, thanks to Mara who had her tattoo stuff with her and just casually freehand drew this design on my wrist from a couple of inspo pics I found. It’s a blackberry vine with a few tiny skulls. I LOVE IT SO MUCH. 📚 Finished Moonbound by Robin Sloan. Excellent. I loved it. Cozy but in the way I want cozy to be when I try to read a cozy genre book and am inevitably disappointed (bored?). This one is the feeling.
… other things happened like I remember going to the gym at least twice? OH WAIT FUCK YEAH I PR’D BENCHPRESS BABY!! 115 POUNDS. That was satisfying. I want to write more about that Nate Hagens video which I have not finished watching but which is really good but I am too tired.
To sum up: A week (or so) has passed, I was alive, I did things or did not do things, here we are, me and the adorable tiny skulls are ready for sleep now.

0 views

I went on the Built for Turbulence podcast

I joined Radical's "Built for Turbulence" podcast recently for a wide-ranging chat on what AI agents are doing to the economics of software. We got into whether 5 people with agents can really out-build a 500-person company, the "Figma Trap" where you end up paying your competitor through your own customer usage, and why I think running human-written code that hasn't been AI-audited is going to start looking reckless. We also got onto open weights quietly closing up, why "safe" enterprise AI tools may be handicapping the organisations using them, and whether the small-team-with-agents advantage is temporary or here to stay. You can find the episode here on Spotify, Apple Podcasts and everywhere else.

0 views

Is the Monaco Grand Prix decided at qualifying?

A Formula One driver triggered my fact-checkitis. They claimed that winning the Monaco Grand Prix in Monte Carlo is determined nine out of ten times by which position one starts in. That makes intuitive sense, because the Monte Carlo track is a narrow street track with few opportunities for overtakes. But … really? Is that an off-the-cuff remark or an accurate statistical prediction of the race? (Continue reading the full article on the web.)

0 views
Farid Zakaria Yesterday

Every byte matters

I have spent a large portion of my career working in Java. In that time, you get used to huge classes. New functionality? Just add a new method and field to the class. The cost of each new field is rarely considered. Performance is often considered from a classic computer science perspective, through asymptotic analysis of the algorithms and data structures in use. It turns out that even within a single growth class for your algorithm, such as a simple for-loop, time can vary dramatically if we have a slightly deeper understanding of the underlying hardware. First, let’s understand our current machine. Let’s take a peek at our cache line and page sizes. The instances number is a reflection of how the caches are shared amongst CPUs. With 10 CPUs, a cache with 10 instances is private to each CPU, whereas a cache with fewer instances is shared, for example between two of them. Our cache line size is 64 bytes. When you read a single byte from memory, the hardware will fill the surrounding 64 bytes into the cache line. The idea is that data access tends to be temporally and spatially local: items that are accessed together are often near each other in memory and close to each other in time. We can reference Jeff Dean’s famous “Latency numbers every programmer should know”; however, a quick recap with the values from our particular machine is the following: The size of each cache is the reported total divided by the number of cores or instances; i.e. 352 KiB ÷ 10 instances = ~35 KiB. We then determine the number of cache lines by dividing this number by 64; i.e. 35 KiB ÷ 64 bytes = 560 cache lines. How does this all matter? 🤔 Let’s consider an example where we want to iterate over many instances of a single struct and pull out one field to filter them. We create our struct, and in this particular example we need 64 bytes to represent a single Monster. If we had an array of Monsters and we iterate over them, the cache line would fill up like so. Each cache line would fill with a single monster, and we would use only the one byte we need.
This is often referred to as “Array of Structs”. If we instead normalize the data such that each field is in its own list, we can pack the cache lines much tighter. This type of layout is referred to as “Struct of Arrays”. How much of an impact can this have? We can observe up to 30x improvements when the Monster struct is 1 KiB 🤯 The delta is less observable when the struct is small because multiple Monster structs can still be fetched within a single cache line. This data access is incredibly hot though. Your CPU prefetcher knows it’s going sequentially and fetches the next cache line before you need it. You never actually have to wait for the memory to be fetched. What about random access patterns? Not all access patterns are sequential. Hash maps, trees, graph traversal, and pointer-heavy data structures jump to unpredictable locations. The CPU can’t prefetch what it can’t predict. With random access, the CPU needs the entire array to be present in the cache in order to avoid stalls due to memory lookup. This means the total size of your collection determines your performance tier. Doubling the struct from 64B to 128B doubles the working set for the same number of monsters, pushing the data into slower cache levels. At just 512 monsters, a 64B struct fits in L1d at ~3 ns, but a 128B struct has already spilled to L2 at ~11 ns. We can observe this with a pointer-chasing benchmark. We allocate N monster-sized nodes, wire them into a random order, and chase pointers. Each hop lands at an unpredictable address, defeating the CPU’s prefetcher entirely. Rather than graph it logarithmically, which I find can make the effect easy to miss, I have included a zoomed-in graph. We can see that all struct sizes hit the same staircase-like pattern as they go through the various cache levels; however, the larger struct sizes are shifted left, meaning they hit the increase earlier.
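The “wire them into a random order” step is worth a small sketch, since a naive shuffle can produce several short loops instead of one long chain. The snippet below is not the benchmark from this post; it assumes the chain is built with Sattolo’s algorithm, and `build_chase_cycle` and `chase` are hypothetical names. Python is used only to show the wiring: it has no control over node size or memory layout, so it will not reproduce the cache staircase, which needs a language with explicit struct layout.

```python
import random


def build_chase_cycle(n: int, seed: int = 0) -> list[int]:
    """Return next_index where following next_index visits all n nodes once.

    Sattolo's algorithm: like a Fisher-Yates shuffle, but j is always
    strictly less than i, which forces the result to be a single cycle.
    """
    rng = random.Random(seed)
    next_index = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i, never i itself
        next_index[i], next_index[j] = next_index[j], next_index[i]
    return next_index


def chase(next_index: list[int], hops: int, start: int = 0) -> int:
    """Follow the chain for a number of hops; each hop depends on the last."""
    node = start
    for _ in range(hops):
        node = next_index[node]
    return node


if __name__ == "__main__":
    chain = build_chase_cycle(512, seed=1)
    # After exactly n hops a single cycle returns to where it started.
    print(chase(chain, hops=512))
```

Because each hop reads the address of the next one, the loads cannot overlap, which is what makes this pattern a reasonable way to measure latency rather than bandwidth.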
This means for random access patterns, if you can keep tight control on your total working set size, you can drastically affect the time. Knowing your struct and working set size can make a substantial difference.
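The arithmetic in this post is easy to replay. The sketch below uses only the figures quoted above (64-byte cache lines, 352 KiB across 10 instances, 512 monsters at 64B and 128B); the function names are made up for illustration, and the cache-line counts are a simplified model that assumes an aligned array and a one-byte field, not a measurement.

```python
import math

CACHE_LINE_BYTES = 64  # cache line size quoted in the post


def per_instance_bytes(total_kib: float, instances: int) -> float:
    """Cache capacity available to one instance, in bytes."""
    return total_kib * 1024 / instances


def lines_touched_aos(count: int, struct_bytes: int,
                      line: int = CACHE_LINE_BYTES) -> int:
    """Approximate cache lines read when scanning one field in an array of structs.

    Once a struct is at least one line wide, every element costs a full line.
    """
    stride = min(struct_bytes, line)
    return math.ceil(count * stride / line)


def lines_touched_soa(count: int, field_bytes: int = 1,
                      line: int = CACHE_LINE_BYTES) -> int:
    """Approximate cache lines read when the field lives in its own array."""
    return math.ceil(count * field_bytes / line)


def working_set_bytes(count: int, struct_bytes: int) -> int:
    """Total bytes that must stay cached for stall-free random access."""
    return count * struct_bytes


if __name__ == "__main__":
    budget = per_instance_bytes(352, 10)       # about 35 KiB per instance
    print(round(budget / 1024, 1), "KiB per instance")
    print(35 * 1024 // CACHE_LINE_BYTES, "cache lines in 35 KiB")
    print(lines_touched_aos(1000, 64), "lines for AoS,",
          lines_touched_soa(1000), "lines for SoA")
    for size in (64, 128):
        fits = working_set_bytes(512, size) <= budget
        print(f"512 monsters x {size}B fits in that budget: {fits}")
```

In this model, 512 monsters at 64B need 32 KiB and fit within the ~35 KiB budget, while 128B monsters need 64 KiB and do not, which matches the spill to the next cache level described above.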

0 views
iDiallo Yesterday

The web is changing, and we are not going back

Whenever I saw someone type a natural language query into Google, it made me cringe. "It's not a person," I would say. "Type like you're talking to a machine." This was especially true for programmers and it was before AI took over everything. Instead of "how do I write a function that reads a file?", I would suggest they use specific keywords, something that sounded more like machine language than conversation. "js function to read csv file" or "css gradient background property example." This got you better results. Even though Google was a sophisticated search engine, it was still doing a kind of keyword matching under the hood. But not anymore. You don't get any advantage from writing in "machine language." Google understands natural language just as well. In fact, even better. How is it that in 2026, I Google things less than ever? It's not that I know everything now. It's more that I don't want to call the friend who always talks too much. If the height of the Eiffel Tower ever comes up in conversation, I'll type "eiffel tower wiki" and click through to Wikipedia. I don't want to have a conversation about it. Googling something these days feels like Google is trying to join my private conversation. Where it used to be a tool for finding answers elsewhere, now it's a buddy who gives you an answer. And just as you're about to leave, it says, "hey, did you also know that..." There used to be a machine between me and the information I was looking for. It was good at its job. It sorted, ranked, then presented information. But now, the machine is constantly pushing information at me, watching my reaction, learning from it, and feeding me more, unsolicited. Before, information lived on the web and was hard to find. Today, information still exists, but it's buried under noise. Google no longer helps you find it, it just gives you an answer. That answer might be right or wrong, and right below it, in small print: "AI responses may include mistakes." 
You rarely get to verify whether the answer is correct, because almost no one clicks through to the source. I know this firsthand. More than three-quarters of my Google referral traffic has disappeared, while my search impressions keep climbing. So what's left to do? I could mourn the old Google, the simpler web. But as the title says, we aren't going back. This is the new reality, and we have to adapt. Rather than blindly embracing change, I think it's smarter to pick and choose. Just last week, I wrote about the small web still being alive. And it did exactly what its name suggests. It stayed small. There are other search engines built for people who want more control. DuckDuckGo. Kagi (my personal favorite). The habit of Googling everything is learned behavior, and learned behaviors can be unlearned. What's harder to convey is that Google never presented us with facts, only sources and citations. The way the Google answer is presented, we have the impression they are giving us indisputable truths. When everyone is sharing screenshots of the answer they got, all you can do is share a screenshot of the opposite answer you got. The source gets lost. That's where we are now. Skimming the average sentiment of a Reddit thread, or confirming something we already believed. This is the new reality. We're not going back to keyword matching. But I also don't have to accept the new way as the only way. Google has made its search box AI-first and that's their right, it's their product. But it's also my opportunity to try something different. We are not going back. So I might as well choose where I go next.

0 views

Hackers Used Meta’s AI Support Bot to Seize Instagram Accounts

The Instagram accounts for the Obama White House and the Chief Master Sergeant of the U.S. Space Force were briefly defaced with pro-Iranian images and messages over the weekend, after instructions began circulating on Telegram showing how to trick Meta’s “AI support assistant” bot into resetting account passwords. A screenshot from a video released on Telegram claiming to show how Meta’s AI customer support bot could be tricked into resetting a target’s password. On May 31, word began to spread on several Telegram instant message channels that Meta’s AI bot would happily add an email address to an existing account as part of the bot’s standard password reset flow. A video released on Telegram by pro-Iran hackers claimed to document a remarkably simple exploit that appears to have involved using a VPN connection with an IP address that is in or near the target’s usual hometown, requesting a password reset for the account, and then choosing to chat with Meta’s AI support assistant. From there, the video shows the attacker told the bot to link the account in question to a new email address, after which the bot dutifully sent that address a one-time code that allowed a password reset. The Telegram account that posted the video also linked to screenshots of pro-Iran images, videos and messages that defaced the hacked Instagram accounts, saying hackers had used the exploit to hijack a number of valuable (read: short) Instagram account names that allegedly have a resale value of more than a half million dollars. Meta has not responded to requests for comment on the video’s claims, but the company reportedly did acknowledge the dormant Instagram account for the Obama White House was briefly compromised. The security blog thecybersecguru.com reports that Meta pushed an emergency patch over the weekend, and clarified that no back end database was breached. “Instagram has notoriously poor human support infrastructure,” Cybersecguru wrote. 
“Recovering a locked account – especially a high-value one can take weeks of back-and-forth with an automated ticketing system. Meta’s solution was to deploy a conversational AI layer to handle common recovery workflows: relinking a lost email address, triggering a password reset, verifying account ownership. The assistant, presumably, was supposed to reduce friction for legitimate users stuck in account-access hell.” Ian Goldin, a threat researcher at Lumen’s Black Lotus Labs, said we’re entering uncharted security territory as more large online platforms start allowing AI chatbots to handle sensitive account recovery requests. Just as human customer support employees can be socially engineered into providing unauthorized access to someone’s account, AI bots are equally eager to help and vulnerable to persuasion and trickery, he said. “AI chatbots create interesting new attack surface, and we’re likely going to see a lot more of these kinds of attacks,” Goldin said. Securing your various online accounts means taking full advantage of the most secure form of multi-factor authentication (MFA) offered (such as a passkey or security key). In this case, even using the least robust form of MFA that Instagram offers (a one-time code sent via SMS) likely would have blocked the exploit: The hackers who released the video on Telegram said their exploit failed to work against any accounts that had MFA enabled.

0 views