Latest Posts (20 found)

SQLAlchemy 2 In Practice - Chapter 7: Asynchronous SQLAlchemy

This is the seventh chapter of my SQLAlchemy 2 in Practice book. If you'd like to support my work, I encourage you to buy this book, either directly from my store or on Amazon. Thank you! Starting with release 1.4, SQLAlchemy includes support for asynchronous programming with the asyncio package, for both the Core and ORM modules. This is an exciting improvement that brings the power of SQLAlchemy to modern applications such as those written with the FastAPI web framework.
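A minimal sketch of what that looks like (my own illustration, not an excerpt from the chapter; it assumes the aiosqlite driver is installed so the example stays self-contained):

    import asyncio

    from sqlalchemy import text
    from sqlalchemy.ext.asyncio import create_async_engine

    async def main():
        # An in-memory SQLite database via aiosqlite; any asyncio-capable
        # driver (asyncpg, aiomysql, ...) plugs into the same URL scheme.
        engine = create_async_engine("sqlite+aiosqlite:///:memory:")
        async with engine.connect() as conn:
            # Core-style statement, executed and awaited on an AsyncConnection
            result = await conn.execute(text("SELECT 'hello, async SQLAlchemy'"))
            print(result.scalar())
        await engine.dispose()

    asyncio.run(main())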


the devil wears prada 2 - loved it

I really like The Devil Wears Prada. I saw it in the cinema when it came out, and I rewatched it two or three years ago with my wife, who had missed out on it and the cultural impact it had. It surprised me so much when the second movie was suddenly just... there! So I went today, and I am absolutely in love. I can't wait until I can see it a couple more times, maybe right after rewatching the first again, and get to draw more connections and conclusions. The following will contain spoilers.

At the start, I felt so proud of Andy. She is thriving, she is accomplished, she is getting honored for her work, and she has great friends and coworkers! It feels so good to see that even 20 years later, she hasn't lost her ambition and drive, and she no longer caves to someone else's feelings. She's standing up for herself and is much more confident, too. As you are introduced to the new situation (going back to Runway two decades later), you get to be nostalgic alongside her, which feels like such a good narrative choice; so satisfying to watch. Yes, I totally fell for the nostalgia bait, the "Look, do you remember that piece of info from the first movie?" stuff. It was fun! I was greatly entertained, and half the cinema was gasping and squealing at times, recognizing things and pointing at the screen.

I liked seeing that some things stayed the same while some things changed, and nothing felt forced or unrealistic. People and companies progress, and while you may see yourself as the main character that people surely will remember, your presence was likely much smaller than you realize. It felt in-character for people not to necessarily remember Andy, or to be aghast that she has made it further, and it felt so human for Andy to go: Wait, that process changed? Wait, we don't do it this way anymore?

It was also so, so good to see Miranda again, and what they did with her. I think they handled Miranda absolutely well, especially her first appearance. A big fanfare, thrilling, slaying in a dress. She still has her quirks, the air of superiority, the earned respect, the vibe that makes you stumble as you make it into her office - but she is also a Boomer, rather old by now, and even she has slowed down and now seems slightly out of place, overwhelmed. Things aren't like they were before, and she struggles to grow in the direction the work needs to go.

Work culture and expectations have shifted, and they have not been kind to the person Miranda is. She can no longer throw her coat at people to assert her dominance, as there have been too many HR complaints; now she has to carry it herself. She makes the occasional outdated, offensive Boomer joke in meetings, and while a much younger employee is allowed to reprimand her repeatedly for that, nothing happens. The young workforce has gotten used to their out-of-touch leadership making these sorts of comments ("That's just who she is") and, in turn, leadership has gotten used to this sort of short-lived, mild rejection of their words. No more uncritical appeasement or laughing just to laugh; the air stays silent now before everyone simply moves on. Miranda used to always get her way and was able to boss people around with a sharp tongue - now her power has diminished, as she is ambushed by about eight (?) people in an environment she is not used to and cannot control.
As such, she is unable to defend herself and the company against a ruthless takeover spurred by neoliberal ideals, too overwhelmed to make sense of it, and feeling left behind in a world that moves so fast. She's smart and cunning, but she can't make sense of the economic babble thrown at her, and her edges are smoothed out by the fear of jeopardizing her role and the possible renegotiation of her planned, but ultimately failed, promotion that Irv never got to announce. She has to grapple with what kind of legacy she wants to leave behind, when it is the right time to stop, what else she even has going in her life, and the fact that her attitude has cost her dearly. As a viewer, it means a lot to see how gracefully they handled the fact that even the biggest, most fearsome Girl Boss™ is aging out of her aura and control, and that this is inevitable, but not necessarily sad. We have seen Miranda's issues with vulnerability and accepting help in the first movie, and here again, she is asked to get over herself for the greater good of everyone involved.

It can be quite cringe-worthy how other pieces of media handle the modern world - way too many message pop-up sounds, texts always on screen, frequent video calls, extreme smartphone reliance for plot, and more. My wife described it as "when it is like Netflix shows", and that fits so perfectly. They really run this into the ground in their shows, together with extremely temporary memes and slang that already feel slightly too old by the time the release happens. I'm so glad this movie didn't fall into that trap! Yes, a main point of the movie is that times have changed - Andy no longer uses a flip phone, print numbers are rapidly falling, everything moves online, content is created for digital feeds, and your audience is not leisurely consuming a fashion magazine in a glamorous way, but seeing your short-form content while on the toilet. The goal is to go viral, and there's a need for much more direct and pressing damage control now that the public can directly fill your comments and mail boxes with their criticism. All while the industry is fighting downsizing and consolidation. Still, modern tech doesn't get a central role in the movie in this obnoxious way; they focus more on the core issues and workplace expectations that changed, rather than implementing a temporary reference or trend that will age badly. They do show some memes, but they are deliberately timeless and very focused on the movie, not trying to tie a current TikTok trend into it.

What also "modernized" it in my mind is that, aside from making the tyrannical girlboss less relevant in the age of work-life balance and HR complaints, they clearly brought in and parodied the Silicon Valley rich tech bro, in the characters of Irv's son Jay and of Benji Barnes. They clearly do not follow the rules of old money, as they dress like they're going out for a hike or the gym, act too casual, childish at times even, and seem to decide unpredictably, on a whim, in this really emotionally cold way. Money without class, without pretension, but also seemingly awkward and clumsy. Benji plans to go to the sun and has stopped drinking water because he thinks it's poisonous; there are mentions of weight loss and Ozempic. Really reminded me of Zuckerberg, Altman, Musk et al. in that way.

The movie is full of celeb cameos that also aid that modern feel; thankfully, most are really subtle, quick, and in the background. I think the ones most noticeable are Lady Gaga (loved her song) and Donatella Versace.
It felt fair to me; the first movie had a huge impact on the fashion world and was a tribute to it, so it makes sense that the second one would also honor its inspirations and uplift new modeling talent. It was fun spotting all the easter eggs, so to speak.

In the first movie, Andy's boyfriend Nate was a complete dumpster fire. The older I get, the worse that storyline ages. The narrative felt sexist, and I think the writers wanted to acknowledge that in this second movie. The New Guy™ is a genuinely kind guy, but also kind of carries the vibe of all fictional men who are sanitized to death and would love to break out in a therapyspeak monologue about what is wrong with the other character. Still, I appreciate that over Nate, so we are good. The movie could have gone without the romance altogether. It added nothing to the core plot, and the screentime was minimal. I understand what they were trying to do, though: for once, show Andy in a normal relationship, resolving conflicts maturely, and show that she doesn't need to choose between love and career the way the first movie made it seem. And I can tolerate that. At least we were spared absolute hetslop.

Emily is such a weird character to me. I did not think she would ever become so central, and I still think it is a weird choice, and probably the only thing in the movie I am scratching my head about. I guess, retrospectively, I could see how the writers would wanna let Emily get her lick back on Andy for essentially coming in and torpedoing all her plans and dreams in the first movie, but it still felt... odd to me. Maybe because the way Emily and Andy compete is such a subplot to me in the first movie, as I enjoy the rest more? I guess in light of that, making Emily mean and giving her the power to absolutely ruin Andy and Miranda makes sense, but something about it feels incomplete. At the end of the first movie, things seemed pretty resolved. But a late explanation about an unanswered phone call is what we're supposed to believe made Emily so cold this time? Not enough for me. I am also missing more reasons to empathize with how quickly Andy just forgives Emily for everything, when she not only was seemingly fine with using her boyfriend for money, but also wanted to make tons of people jobless and center herself in the magazine. Wild.

Which leads me to the second point: interesting imagery. For the entire movie until the end, Emily has red hair. The color red usually symbolizes power, evil, villains, blood, pain, and sin, and red hair is often associated with having a bit of a temper. Meanwhile, after everything comes out and her boyfriend has broken up with her, she is ready to make amends and start over, and her hair is platinum blonde, almost white, a color associated with innocence and new beginnings.

In another part of the movie, Andy and Miranda look at the wall mural The Last Supper. Miranda muses that Jesus is depicted without a halo because it is meant to emphasize his humanity and fallibility, our shared inclination to betray one another. This is obviously foreshadowing of what is going to happen later, but it's interesting that minutes later, she is depicted at a large banquet table in front of the mural, seemingly imitating it in the place of Christ. There is also a gorgeous shot of her in the Galleria Vittorio Emanuele II in Milan, alone, sad, literally at a crossroads, surrounded by luxury and old, influential history.

Ahhh, I wish I could write more, but the longer it's been, the more I am forgetting.
I wish I could let it run on my second screen as I type. Maybe one day I will update this 8)

Published 06 May, 2026


MGK At The Spark Arena

What’s going on, Internet? Last night, my sister-in-law, our friend and I went into town to see the MGK gig as he brought his Lost Americana tour to Auckland for his only New Zealand show. MGK, aka Machine Gun Kelly, aka Colson Baker, is one of those artists where it’s probably good to separate the art from the artist, as he seems to be a ball bag in real life. I never paid any attention to him while he did hip hop records, but as soon as I saw the Bloody Valentine video I was hooked. The album, Tickets To My Downfall, was the exact type of early-2000s pop-punk nostalgia I needed in 2020. I skimmed through Mainstream Sellout when it was released and never came back to it. We got Lost Americana last year, which was a step up from the second record, and I listened to it a bunch. But we also got Tickets To My Downfall All Access last year, the 5th anniversary reissue: the original tracklist, the bonus tracks from the SOLD OUT Deluxe, plus 5 new unreleased tracks. Whew. It was good to hear some more tracks from that era.

We managed to grab reseller tickets, paid less for the three of us combined than a single ticket at face value, and the seats were pretty decent for where we ended up. Sweet as.

Anyway, the show was good. It kicked off on time, it was loud, there were guitars and drums, only a couple of throwbacks to the rap days and one or two songs from Sellout. It didn’t take long to get right into the Tickets To My Downfall songs, and that was all I needed to hear. The stage was on theme too: a model of the Statue of Liberty’s head looming above with a cigarette hanging out of her mouth, and his mic stand was a giant cigarette to match. Lost Americana indeed.

The crowd around us were all there for the same reasons. Singing along with strangers who love the same songs is one of the best bits of a gig, especially the Tickets ones. Title Track, Drunk Face, Forget Me Too, Concert For Aliens, Jawbreaker, Nothing Inside, all hit. The cover of Paramore’s Misery Business was expected, and rocked. My absolute highlight was belting out Bloody Valentine word for word with everyone around me. My Ex’s Best Friend is my second favourite on the album; I still can’t get that one out of my head.

We had a great time, a fantastic night out. Damn, what a show. I’ll see it again without hesitation.

Hey, thanks for reading this post in your feed reader! Want to chat? Reply by email or add me on XMPP, or send a webmention. Check out the posts archive on the website.

Unsung Today

“Watchmaker’s delicate precision and ornate mechanical intent”

A surprising entry in the thread started by Photoshop and continuing through screwdriver handles is this 11-minute video from Errant Signal about a platformer game called Derelict Star: [embedded YouTube video thumbnail]

I was inspired by the video, and really enjoyed its exploration of a demanding game that’s composed of just a few mechanics that are done really, really well:

The number of inputs are small, but the expression those inputs allow is deceptively expansive. […] Derelict Star’s various areas are all built to explore the way movement systems function and even interact with one another.

I think of user interfaces similarly, and of their need to build a certain consistent vocabulary of names, gestures, interface elements, concepts, and so on. Perhaps in an enterprise app you right click and discover something useful in a menu, and this will teach you about the usefulness of right click menus in general. Maybe pressing ⌥ to get to alternate symbols on your keyboard would inspire you (either consciously or not!) to try holding ⌥ in said menus, only to discover this brings up useful alternative options. Maybe seeing a keyboard shortcut next to one of these options will suggest using that next time, and so on, and so on.

I really loved this bit in the video that could apply to a lot more software than just videogames:

It took me maybe an hour to do this, but right on the other side is a checkpoint. The game is hard, but it isn’t cruel. It’s designed to challenge you, but it has faith in your ability to complete it.

The narrator uses the term “ludocentrism” to refer to games that ruthlessly prioritize the mechanics and gameplay over narrative, aesthetics, and so on. (“Ludic” meaning “relating to play.”) Of course, the calculus of what videogames care about will be different from the goals of creative software or enterprise software; no one cares about the hero’s journey of the largest number in your Excel spreadsheet. But I think some version of ludocentrism applies to “boring” software as well. My beliefs here are probably something like this: you can’t reduce everything to just functionality or just efficiency, especially in creative moments of software use, and people use software creatively much more often than we suspect, including software not thought of as “for creatives.”

#definitions #details #games #youtube


Am I Meant To Be Impressed?

If you liked this piece, please subscribe to my premium newsletter. It’s $70 a year, or $7 a month, and in return you get a weekly newsletter that’s usually anywhere from 5,000 to 18,000 words, including vast, detailed analyses of NVIDIA, Anthropic and OpenAI’s finances, and the AI bubble writ large. I just published a lengthy discussion about how OpenAI and Anthropic make up 70%+ of all AI GPU compute capacity and revenue. The previous week I wrote about how OpenAI will kill Oracle — and quite possibly Larry Ellison’s personal fortune, too. Subscribing to premium is both great value and makes it possible to write these large, deeply-researched free pieces every week.

God, it’s been a long few years, and it only feels longer after every ecstatic, ridiculous round of tech earnings where the world’s largest companies do everything they can to obfuscate the ugly truth behind their numbers. Let’s start with the biggest, ugliest one: Microsoft, Google, Amazon, and Meta are expected to spend between $800 billion and $900 billion on AI capex in 2026, and over $1 trillion in 2027. By the end of 2027, big tech will have sunk $2 trillion into AI capex, with very little to show for it.

Oh, I know what you’re going to say. “These companies are growing faster than ever!” “These companies are building for future revenue streams!” “These companies are saying that AI is driving growth!” Yet those revenues are, in the case of Meta and Google, not good enough to actually share.

While Google CEO Sundar Pichai will gladly say that “[Google’s] AI investments and full stack approach are lighting up every part of the business,” said “lighting up” never results in a revenue number that you can point at, because Google knows that analysts and journalists will read “Gemini Enterprise has great momentum with 40% quarter on quarter growth” — which we have no frame of reference for, because Google doesn’t share its AI revenues — and clap and honk like fucking seals. Sundar Pichai knows that everybody is desperate to see him jingle his keys, and has such utter contempt for reporters, analysts, and investors that he doesn’t have to prove AI is actually doing anything. Those writing up his earnings will do it for him.

Meta, on the other hand, has little in the way of a real AI story, and can’t even seem to get its metrics straight on what AI is doing for the company, per my premium piece from earlier in the week:

Nevertheless, I have to give Microsoft and Amazon credit for deigning us worthy of actual numbers, even if they’re piss poor. While Meta and Google refuse to actually explain their AI returns, Microsoft revealed that it had $37 billion in AI revenue run rate — $3.08 billion a month or so — and Amazon had $15 billion, or around $1.25 billion a month. And I must be clear, that’s revenue, not profit.

In any case, I need you to recognize how small these numbers are in comparison to the capex it’s taken to make them. To give you some context, Amazon’s monthly AI revenue is roughly 0.419% of the $298 billion it has spent on AI capex so far, or around 25% of the $5 billion it just invested in Anthropic last week. Microsoft, on the other hand, has spent $293.8 billion on AI capex through its latest quarter — making its monthly AI revenue around 1.04% of its spend. These revenues are deeply embarrassing! I am not sure why this isn’t the common refrain!
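If you want to check the division yourself, here is the back-of-the-envelope arithmetic as a quick Python sketch, using only the figures quoted above and treating an “annualized run rate” as a monthly figure times twelve (which is all it is):

    # "Run rate" is a monthly figure times 12, so divide by 12 to get the
    # monthly revenue these statements imply, then compare it to capex.
    figures = {
        "Microsoft": (37.0, 293.8),   # (AI run rate $bn, AI capex to date $bn)
        "Amazon": (15.0, 298.3),
    }
    for name, (run_rate_b, capex_b) in figures.items():
        monthly_b = run_rate_b / 12
        print(f"{name}: ${monthly_b:.2f}bn/month, {monthly_b / capex_b:.3%} of capex")
    # Microsoft: $3.08bn/month, 1.049% of capex
    # Amazon: $1.25bn/month, 0.419% of capex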
These fucknuts have spent over a trillion dollars on AI and all they have to show for it is either nothing, vague statements about “everything lifting because of AI,” or pathetic revenues that only get worse the more you think about them. For example: even if Microsoft were to make $37 billion in AI revenue in 2026 — remember, that $37 billion run rate is a snapshot in time! — that would still be $500 million less than the $37.5 billion it spent in capital expenditures in the fourth quarter of 2025.

Yet things actually get worse when you think about the sources of that revenue, or perhaps I should say source, as both Microsoft and Amazon (and I’d argue Google too, but we don’t know its AI revenues) are heavily dependent on their large, unsustainable sons: Anthropic and OpenAI. I’ll explain.

Microsoft claims that its $37 billion AI revenue run rate has grown by 123% year-over-year, which means its run rate — not its actual 2025 AI revenue — was about $16.59 billion in Q3 FY2025, or around $1.38 billion a month, or, if you assume that number was consistent over the quarter (it likely wasn’t), about $4.14 billion for the quarter. Based on my own reporting from direct Azure revenue numbers, this would make OpenAI’s $2.947 billion in inference spend in that quarter around 71% ($11.7bn annualized) of Microsoft’s Q3 FY2025 AI revenue run rate. That’s embarrassing! Oh, and capital expenditures for that quarter were $21.4 billion, or around $4.81 billion more than its annualized revenue.

Yet my reporting helps us be a little more annoying than that. Back in January 2025 — around Microsoft’s Q2 FY2025 earnings — it announced that its AI revenue run rate had hit $13 billion, or around $1.083 billion a month (or $3.25bn a quarter or so). In that same quarter, OpenAI had spent $2.075 billion on inference on Azure, or 63.8% of Microsoft’s AI run rate. This is particularly funny when you go back to the quarter before, where Microsoft CEO Satya Nadella low-balled that figure, claiming it would be $10 billion in annualized run rate, and specifically said the following:

That’s…not really what happened. Today I can report, based on discussions with sources with direct knowledge of Azure revenue, that in Q2 FY2025, Microsoft brought in around $325.2 million in revenue from renting out GPUs and other AI infrastructure, and around $367 million in revenue from Microsoft 365 Copilot, or less than half of the $1.467 billion that OpenAI spent on inference. If you’re curious, in the next quarter (Q3 FY2025), AI infrastructure brought in around $412 million, and Microsoft 365 brought in around $300 million.

While my sourcing for Azure revenues cuts off at Q3 FY2025, my OpenAI inference and revenue share data goes a further two quarters, to Q4 FY2025 and Q1 FY2026 (so Q2 and Q3 of calendar year 2025), as well as half of Q2 FY2026, and we can make some fairly straightforward estimates as a result. So, based on my reporting, OpenAI spent $3.648 billion on inference on Microsoft Azure in the third quarter of 2025, or around $14.4 billion on an annualized basis. While I only had half of the fourth quarter’s numbers, I estimate that OpenAI’s annualized spend hit over $18.5 billion — or around $4.6 billion a quarter — by the end of the year, and that’s not accounting for things like Sora 2 or the launch of its Codex coding platform. In total, this puts OpenAI’s 2025 spend on Azure at an estimated $13 billion on inference alone, with billions more on training.
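The run-rate back-solving above is just division; a quick sketch of it, using only the figures already quoted:

    # Undo the claimed 123% year-over-year growth to recover the prior-year
    # run rate, then derive monthly/quarterly figures and OpenAI's share.
    current_run_rate_b = 37.0                      # claimed AI run rate, $bn
    prior_run_rate_b = current_run_rate_b / 2.23   # divide out +123% YoY
    monthly_b = prior_run_rate_b / 12
    quarterly_b = monthly_b * 3
    openai_inference_b = 2.947                     # OpenAI Azure inference, Q3 FY25, $bn

    print(f"Q3 FY25 run rate: ${prior_run_rate_b:.2f}bn")      # ~$16.59bn
    print(f"${monthly_b:.2f}bn/month, ${quarterly_b:.2f}bn/quarter")
    print(f"OpenAI share of the quarter: {openai_inference_b / quarterly_b:.0%}")  # ~71%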
Yet Microsoft Azure isn’t the only place where Microsoft gets fed revenue by OpenAI. Microsoft also accounted for 67% of CoreWeave’s $5.15 billion in 2025 revenue — around $3.45 billion — and all of that is used by OpenAI. I also believe this is used for OpenAI’s training compute, as CoreWeave’s announcement related to its direct deal with OpenAI specifically said it was contracted “...to power the training of [OpenAI’s] most advanced next-generation models,” and said capacity was only available because Microsoft declined to extend its current agreement to use compute for OpenAI.

Altogether, that puts OpenAI’s spend on Microsoft services at over $18 billion in 2025, and it’s easy to see how that would grow to over $24 billion on an annualized basis in the last quarter, or around $2 billion a month. Microsoft is OpenAI’s primary cloud provider, and I estimate that OpenAI represents around 70% of its AI revenue, while taking up the majority of its infrastructure. Otherwise, Microsoft’s 20 million Copilot 365 subscribers likely pay no more than $7 billion a year.

I also think that OpenAI is taking up the lion’s share of compute. As I discussed in my most recent premium newsletter, Epoch estimates that Microsoft had around 2GW of compute by the end of 2025, with OpenAI as its largest customer. At the end of 2025, OpenAI’s CFO said that it had access to 1.9GW of compute, at a time when its compute was entirely supplied by Microsoft and CoreWeave (estimated to have 480MW of compute). Considering that 67% of CoreWeave’s revenue came from Microsoft renting capacity for OpenAI, I also think it’s fair to assume that 80% or more of Microsoft’s GPUs are taken up by OpenAI, though some might now be taken up by Anthropic, which agreed to spend $30 billion on Azure. I’ve also confirmed that Microsoft’s “Fairwater” data centers — which will constitute (when finished) “hundreds of thousands of GPUs” — are entirely reserved for OpenAI.

Microsoft desperately wants you to think that this is a diverse, booming revenue stream, when in fact it’s spent around $293 billion in four years to make — when you remove OpenAI — less than $3 billion a quarter in revenue, not profit. Booooooo! Booooooo!!!!!

As far as Amazon goes, things get a lot grimmer. As I mentioned earlier, per Reuters, Amazon’s Andy Jassy admitted in early April that its “cloud business’ AI revenue run rate was more than $15 billion in the first quarter of 2026,” which translates to around $1.25 billion in monthly revenue, or roughly 0.419% of the $298.3 billion in capex it has spent so far, or around 25% of the $5 billion it just invested in Anthropic two weeks ago.

I also think it’s reasonable to assume that a large part — if not the majority — of that revenue comes from Anthropic. Per my reporting last year, Anthropic spent $518.9 million a month on Amazon Web Services, at a time when it had around $7 billion in annualized revenue, a figure that has since more than quadrupled (if you believe it) to $30 billion in annualized revenue. $518.9 million a month is about $6.2 billion in annualized spend, and I think it’s fair to assume that its spend will have at least doubled to $12 billion in annualized revenue, or around 80% of Amazon’s AI revenue.

As of the end of Q4 2025, Amazon had 1.67GW of capacity — and based on my estimates from my newsletter published April 21, 500MW of that is taken up by Project Rainier, a data center dedicated entirely to Anthropic, which is also Amazon’s largest AI customer. I’d be confident in assuming that more than 75% of its capacity is taken up by Anthropic.
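The Anthropic-to-Amazon math is similarly easy to sanity-check (the doubling is my stated assumption above, not a reported figure):

    # Annualize Anthropic's reported monthly AWS spend and compare it to
    # Amazon's claimed $15bn AI run rate; the doubling is an assumption.
    monthly_spend_b = 0.5189                 # Anthropic's monthly AWS spend, $bn
    annualized_b = monthly_spend_b * 12      # ~$6.2bn
    assumed_doubled_b = annualized_b * 2     # assumption: spend at least doubled
    amazon_run_rate_b = 15.0

    print(f"annualized: ${annualized_b:.1f}bn")
    print(f"doubled: ${assumed_doubled_b:.1f}bn, "
          f"{assumed_doubled_b / amazon_run_rate_b:.0%} of Amazon's AI run rate")
    # annualized: $6.2bn; doubled: $12.5bn, 83% (the "around 80%" above)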
And man, $1.25 billion a month is fucking pathetic. I’m sorry, how are any of you possibly impressed by this?

God, everyone loves to slurp down Sundar’s slop. You all fall for it! Sundar Pichai doesn’t respect you enough to tell you how much AI revenue Google makes, and he doesn’t have to, because its current businesses continue to grow thanks to its tried and tested tactic of making shit harder to use so that Google services can show you more ads. Nevertheless, people are doing backflips over Google Cloud’s 63% year-over-year revenue growth ($20.03 billion), and I have a few thoughts:

One of the reasons that Google might not want to break out its AI revenues is that they’re — much like Amazon’s — heavily inflated by Anthropic’s compute spend. Sadly, we have only a little information about Anthropic’s spend, outside of its promise from the end of last year to use “up to one million TPUs, with over a gigawatt of capacity [coming] online in 2026,” and its statement from a month ago that it would use “multiple gigawatts of next-generation TPU capacity…starting in 2027.”

Another approach is to travel back in time to before Anthropic was a huge consumer of compute. In Q4 2023, Google Cloud sat at about $9.19 billion a quarter, and at $11.96 billion in Q4 2024 (around 30% year-over-year, but a putrid 5% quarter-over-quarter from Q3 2024). By Q2 2025, it sat at $13.62 billion, and, as I mentioned above, accelerated to $15.15 billion, then $17.66 billion (16.6% quarter-over-quarter), then $20 billion (13.4% quarter-over-quarter) in the following three quarters. These periods match up exactly with Anthropic’s big jumps in revenue, from Q2 2025 (around $3 billion ARR) to Q3 2025 (around $7 billion ARR) to Q4 2025 (around $9 billion ARR) to Q1 2026 (around $19 billion ARR), which suggests that Anthropic’s growth is what’s actually boosting Google Cloud.

Yet things get weirder when you listen to Google’s most recent earnings call:

Interesting. Interesting. Google appears to be planning to sell its TPUs — its own custom silicon, which it currently uses only for its own services and some of Anthropic’s — to a non-specific number of unnamed customers, to the point that its remaining performance obligations jumped from $242.8 billion to $467.8 billion in the space of a quarter. That’s a remarkable jump, especially when you try to work out who they’ll sell to… oh wait, we actually know! Google also signed a multi-billion-dollar deal to rent TPUs to Meta, per The Information, and is also discussing A) selling TPUs to Meta directly, and B) creating SPVs that will buy its own TPUs and lease them to others:

This is exactly the same shit NVIDIA did with xAI’s GPU-related financing last year. To explain: Google is creating something called a special purpose vehicle — a company with one purpose — that it then funds along with an investment firm. The SPV then raises cash via debt, which it uses to buy TPUs directly from Google.

Now, remember that Anthropic deal to use a million TPUs from last year? How about the deal with Broadcom (which makes TPUs for Google) and Google to use “multiple gigawatts” of TPUs starting in 2027? Well, per CNBC, Anthropic agreed to buy $21 billion of Broadcom’s TPUs in 2026 and $42 billion in 2027. Where will those TPUs go? Google’s data centers, probably the ones that it’s backstopping, per my premium from the beginning of the week:

It’s a pretty sweet deal for Google!
Google pays Broadcom to develop TPUs, Anthropic pays Google to buy those TPUs once Broadcom builds them, Google installs those TPUs in a data center, and then Anthropic pays Google to rent them back. This isn’t real demand! Boo!!!!!! BOOOOOO!!!!!!

So, for the sake of transparency, I wrote the above before The Information published its story about how Anthropic had committed to spend $200 billion on Google Cloud and TPU chips, which contained this very important detail:

The Information’s story also had this fascinating chart showing that around 50% of Amazon, Google and Microsoft’s backlog (which includes all revenues, not just AI) — a staggering amount — is made up of revenue from OpenAI and Anthropic:

To be clear, I also wrote the below before this chart ran, because it was very fucking obvious when you actually looked at the numbers. Anyway, as I said in my last premium newsletter:

As I’ve explained, most AI revenues out of Google, Microsoft and Amazon come from two companies that lose billions of dollars a year, have no path to profitability, and are only able to keep paying these companies because the companies (and investors) keep feeding them money. These relationships are utterly poisonous, and an intentional attempt to deceive investors and the general public.

Google now plans to invest up to $43 billion in Anthropic, a company that I estimate takes up at least half of Google’s 2.95GW of capacity, which has cost it around $211 billion in capex since 2023. Amazon has already invested $13 billion, and as much as $20 billion more, in Anthropic, and announced its latest round with a statement about how Anthropic will use up to 5GW of compute capacity. While dimwits might read this and say “WOW, AMAZON JUST LOCKED UP TONS OF FUTURE REVENUE,” it’s important to remember that Anthropic plans to lose $11 billion a year in both 2026 and 2027, and that’s based on its own internal (and fanciful) projections!

Let me spell it out in a way that boosters can understand, in the style of Gillam Fitness:

Anthropic not have money to pay big cloud bills, because Anthropic company cost lots of money, more money than Anthropic make! So Anthropic only PAY cloud bills if OTHERS give it money! Amazon GIVE MONEY to Anthropic to GIVE BACK TO AMAZON, which mean no profit! And Amazon not give Anthropic enough money to pay it, so Anthropic have to ask OTHERS for money! That BAD! It mean BUSINESS not STABLE, and CLIENT not STABLE. This bad when client MOST OF AI MONEY! This ALSO mean that Anthropic RELIANT on OTHERS to pay AMAZON, which make AMAZON dependent on VENTURE CAPITAL for FUTURE REVENUE! Amazon SAY it have BIG BUSINESS, but BIG BUSINESS dependent on ANTHROPIC, which mean BIG BUSINESS dependent on VENTURE CAPITAL! This SAME for GOOGLE! Both say they have BIG CLIENT, but BIG CLIENT MONEY not supported by REVENUE, so BIG CLIENT actually mean “HOW MUCH VENTURE CAPITAL MONEY ANTHROPIC HAVE.” This bad business!

And it really, really is. Most of Amazon, Google and Microsoft’s capex is being driven into capacity mostly used by OpenAI and Anthropic, neither of whom has the money to pay without continual infusions of more capital. Only Microsoft was smart enough to realize the problem, which is why it allowed Oracle to take over the majority of OpenAI’s future capacity (which may kill Oracle, by the way!), but both Google and Amazon keep feeding Anthropic money so that Anthropic can feed it right back to them.

I’m going to try and speak simply again, because I’m still not sure people get this.
The only solution to this problem is if either Anthropic or OpenAI can somehow find a way to become profitable, something that I have yet to see any proof is possible. In fact, the only proof I can find is that these fucking companies are more unprofitable than ever. In the last month, Anthropic raised $10 billion from Google, $5 billion from Amazon, and is reportedly trying to raise another $50 billion from investors, less than three months after it raised $30 billion on February 12, 2026, which was five months after it raised $13 billion in September 2025. That’s $58 billion in eight months, with the potential to rise to $108 billion.

I’m gonna be honest: I think Anthropic is outright misleading its investors if it’s saying that it will only burn $11 billion in 2026 and 2027, per The Information. If that were the case, why does Anthropic need to raise one hundred and eight billion fucking dollars in less than three quarters?

Time to make up some booster talking points and get mad at them.

So, SemiAnalysis — which traditionally does not wheel and deal in revenues! — randomly said that Anthropic had hit $44 billion in ARR, or around $3.67 billion in monthly revenue, and…I’m sorry, what? I know that my suspicion of Anthropic’s revenue numbers has effectively become a meme by this point, but something about this doesn’t add up. If we cut the periods down to strictly those after March 9, that means that Anthropic brought in somewhere between $4.5 billion and $5.58 billion in less than two months, or roughly its entire lifetime revenue. This was also a period where Anthropic claimed it was facing capacity shortages, but said shortages only appeared to create performance issues for its current customers rather than stopping Anthropic from making money…

…which makes me wonder what all of this “capacity” talk is actually about. If Anthropic is truly facing a “capacity crunch,” it’s choosing to solve said crunch through sheer, unbridled greed, taking on more customers as it struggles to keep its services above two nines of availability. If it were an ethical business, it would simply stop taking on new clients, much like GitHub Copilot did as it transitions to token-based billing. Nevertheless, its capacity issues also make me wonder whether it’s actually taking on all that revenue, and if so, where it’s actually coming from.

Per Newcomer, as of the end of last year, 85% of Anthropic’s revenue came from API calls from companies or individuals using its models to power services. This would mean that there was roughly — assuming that number is down to around 70%, given the ascent of Claude subscriptions — $3.5 billion of API spend in the space of two months, or a few thousand trillion tokens’ worth of spend. For some context, Meta’s “token-maxing” fiasco from the beginning of April involved it burning around 60 trillion tokens in 30 days, but based on discussions with sources familiar with Meta’s spend, 80% of that was cache reads. The Information estimates that the actual cost in that period was around $330 million, meaning that Anthropic needs at least another five — if not ten — Meta-sized customers, or demand so incredibly dispersed that it has effectively appeared out of nowhere in the past three months, to possibly come close to those numbers.

I personally think it’s because Anthropic is doing something peculiar with its annualized revenue calculations.
Per The Information:

The first and most obvious place to game the numbers is that Anthropic chooses a single day’s active subscribers to anchor its annualized revenues, which means it can preferentially select one where, say, a bunch of new people were signed up under a trial, or avoid a day where churn had users leaving. One could easily include those who have canceled but have yet to actually leave the service — such as somebody who canceled on April 7th but would still count as a “paid” subscriber until May 8th — too.

As far as API credits go, it’s easy to manipulate a four-week-long segment based on how Anthropic bills its enterprise customers, specifically self-service enterprise deals. In this case, Anthropic customers pre-pay a sum (say, $50 million) in credits that are billed based on their teams’ usage, and don’t expire or run out unless they’re actively consumed. Anthropic could very, very easily manipulate this by — instead of booking based on an enterprise’s actual token burn — saying “we just got $50 million in API revenue in a calendar month!” even though that $50 million might take months to actually use. To be fair, there are also other customers (referred to as “sales-assisted”) that are billed in arrears for their consumption. It’s unclear what the split is, and Anthropic doesn’t have to tell you. Remember: Anthropic is a private company! It can do all the non-GAAP bullshit it likes.

I keep hearing about how Anthropic is capacity-strained and all that shit, but I don’t hear any explanations of how it fixes that problem, or what that problem actually means for the business itself. Somehow, being “capacity constrained” has led to the company making more revenue, which makes me wonder whether it’s a “constraint” so much as “a company running as shitty a service as it can while billing as much as possible.” Either way, it’s unclear how many data centers are actually getting built, or how long they’re taking to build. What does Anthropic do if that capacity is 12 to 18 months away? And really, why do these capacity constraints not seem to have any effect on its revenue growth?

I ask because Sundar Pichai noted on Google’s most recent earnings call that Google Cloud would’ve made more revenue had it had the capacity to meet demand. Why is Google revenue-constrained due to capacity but not Anthropic? While there’s a compelling argument to be made that Anthropic was the customer that would’ve bought that compute, I think we deserve an actual explanation of what Anthropic needs more compute for if it’s not “to make more money.” Also, if it’s currently making as much money as it likes with its current capacity constraints, wouldn’t getting more compute…make the numbers worse?

Ah, fuck it, let’s move onto something funnier. Meta is probably the funniest company in the AI bubble, in the sense that it does not appear to have anything approaching an AI strategy beyond “build as much data center capacity as possible” and “lose $4 billion a quarter selling pervert glasses.” I realize I sound a little dismissive, but nobody can actually explain to me what Meta is doing with AI in a way that remotely justifies it burning $158.25 billion in capex since 2023, with plans to spend as much as $145 billion in 2026 alone. Oh, Meta’s AI app was high in the app store charts? Who fuckin’ cares! Who gives a shit! Oh, it launched its own closed-source “Muse Spark” model? What am I meant to be impressed about?
That over $150 billion has resulted in a model that ranks #27 on the LLM leaderboards in coding? Now, some of you — including people I respect so much I’m not going to mention them by name! — appear to believe that Meta has some super-secret way of using all these GPUs to make “more money from ads,” and I must be clear that Meta has yet to explain that that’s the case. Per my last premium:

You’ll note that these conversion numbers aren’t connected to any financials, which makes them a little suspicious, as 99% of Meta’s revenue is advertising, and “more conversions” should be fairly easy to peg to “more money”...unless said conversions aren’t actually converting into revenue for Meta’s advertisers. What does a “conversion” mean, in this case? Are these CPA ads that reward Meta on a clickthrough? Or CPM ones that pay per thousand impressions that just happen to result in a click? Again, these are ads, which means that it’d be very easy to take that “conversion” number and turn it into “made $X,” unless of course said amount is pathetically small in the grand scheme of things.

Seriously though, what is Meta doing? I suppose it doesn’t matter when the Wall Street Journal will breathlessly write that (and I quote) Meta is envisioning “supersmart agents,” with a lede that I find to be one of the more revolting things I’ve read about a hyperscaler recently:

You may be wondering what the “glimpse” was, and it was “laying off 8,000 people” and “grading employees in performance reviews on their AI use” and “making a CEO chatbot for Mark Zuckerberg to talk to.” This is an ugly, wasteful, distressed company that has no idea what to do anymore, run by a mad king who literally cannot be fired, and those who are charged with scrutinizing it will write entirely imaginary comments like “wow, Mark Zuckerberg is building supersmart agents!” without a second’s thought.

The magical hysteria of the AI bubble is such that Meta, Microsoft, Google and Amazon are, despite showing no actual profit from their AI investments, effectively protected by most of the media, investors and analysts. To be clear, I don’t think any of these companies die as a result of the bubble bursting, but I’m sick and tired of hearing everybody cover their asses with the same tired and hollow talking points, so I’ve pulled together a few of them:

So, while this is technically true — as I said, these companies will not die as a result of the bubble bursting — any investor (or person who wants to deal in “the truth” rather than “stuff they misread or misremembered”) should be deeply concerned that they’ve sunk around a trillion dollars into AI capex, and all they’ve done is incubate two large, unprofitable companies that have become a burden on their infrastructure, and revenue streams that they either refuse to disclose or that are both incredibly centralized and proportionately embarrassing.

Let’s get specific: since 2023, Microsoft, Google, Amazon, and Meta have spent a little over $850 billion in capex, mostly hoarding NVIDIA GPUs that will be near-to-completely obsolete by 2030. With these GPUs comes a massive depreciation problem, as I discussed a few months ago in my time bomb premium newsletter. Every quarter, more GPUs come online, growing the “depreciation” line on the income statement, to the point that the Wall Street Journal projects it will eat as much as 58% of Meta’s, 40% of Microsoft’s, and 38% of Google’s net income by 2030. This flows neatly into my next point.
Well, let’s be clear: whatever growth these businesses currently have is being eaten by depreciation. Last quarter, Google recorded $6.48 billion in depreciation, Amazon $18.94 billion, Microsoft $10.1 billion, and Meta $5.9 billion — numbers that sometimes oscillate slightly down but have, year-over-year, grown by billions of dollars. And yes, year-over-year is appropriate here, because this is a balance that has been steadily growing for years. In any case, depending on the company, that “growth” is either “barely related” or “entirely unrelated” to AI.

Remember: Microsoft and Amazon are intentionally obfuscating their AI revenues by using “annualized” figures — a term they refuse to define that usually means a monthly figure times 12 — in statements related to quarterly revenue. As a result, it’s impossible to precisely back out how much revenue they actually made. In fact, that’s probably the simplest response here: if these companies were truly growing as a result of AI, they’d tell you. They’d say “AI revenue was X.” They’d say it in blunt, obvious terms. No annualized revenues, no projections, no fluff, no “AI-influenced,” just a line item that said “AI,” or even a segment, such as “Microsoft Azure AI compute.” I also want to be clear about something else: I know, from documents viewed by this publication, that Microsoft has these line items fully itemized, and could share them if it wanted to, but intentionally chooses not to. These companies are deliberately refusing to share their AI revenues, and it’s time for the tech and business media to begin asking them why! Neither Google nor Meta will even tell you how much!

Also: three years in, nearly a trillion dollars spent, two companies with nearly their entire sales operations dedicated to pushing this stuff, and the best they’ve got is annualized revenues and no segment breakdown. “Oh, Microsoft has 20 million paying Copilot subscribers.” $600 million a month? For a company that makes $80 billion a quarter? That’s a pathetic amount of money. You could raise more money by auctioning dogs! I need you, please, god, to start actually using basic mathematics! Microsoft has spent $293 billion on this bullshit, and spent another $30 billion or so in the last quarter on capex! When does this pay off? As I said above, Amazon Web Services was profitable within a decade and cost about $52 billion between 2003 and 2017, and that’s normalized for inflation! Anyone making this point is either intentionally lying to you or incredibly ignorant. I have done the work to prove this point, and will continue to repeat it until those too incurious or deceptive learn to stop doing so.

When? Wwwwhen????? Whheeeennnnnn?????????????? I’m serious, when? And how??? Not that they would, but in a scenario where Meta, Amazon, Google and Microsoft stopped spending capex on AI next quarter, they would have to make somewhere in the region of $2 trillion in brand new revenue — all while other services continued to grow — to make any of this capex worth it. Please, explain to me how that happens when it’s taken three years and nearly three hundred billion fucking dollars for Microsoft to squirt out maybe three billion dollars a month in revenue (not profit), with most of that coming from OpenAI! Please, somebody, anybody, explain! You can’t! But you know what, let’s try!

As The Information said, around 50% of all remaining performance obligations — as in all (NOT JUST AI) of the upcoming revenue for Microsoft, Google and Amazon — is from either OpenAI or Anthropic.
Put another way, 50% of big tech’s upcoming revenues are dependent on two companies, neither of which can afford to pay them, meaning that 50% of Microsoft, Amazon and Google’s upcoming revenues will either come from their own venture investments or from venture capital. This is not what stable or diverse revenue looks like, and it suggests my grander thesis about AI demand is true. Outside of OpenAI and Anthropic, there’s barely any actual demand for AI services or AI compute at the scale necessary to substantiate a trillion or more in capital expenditures.

Yet the most disgraceful part is the sheer contempt that these companies have for investors, the media, and the general public. In a functioning regulatory environment — or a market run by people with object permanence — it would be impossible to add such large amounts to your RPO balance without active scrutiny and analyst markdowns, based on the fact that Anthropic and OpenAI literally cannot afford to pay these bills at this time. Microsoft, Amazon and Google have scooted along for years on the idea that they’re diverse, well-positioned companies that can build massive AI revenue streams. In reality, they’re the paypigs for Anthropic and OpenAI, providing more than 70% of their compute as a means of artificially inflating their AI revenues, knowing that analysts and the media will nod and smile without a single thought.

In fact, fuck it, I’m ending this with a rant. The story of massive AI demand is a lie — a trillion dollars annihilated to create the largest circle jerk of all time. Venture capitalists and hyperscalers feed money to OpenAI and Anthropic, so that venture capitalists can feed money to startups to feed to Anthropic and OpenAI, so that Anthropic and OpenAI can feed that money back to hyperscalers, who then feed that money to NVIDIA and buy more GPUs.

It might seem tempting to credit these men as geniuses for creating companies specifically to feed them revenue, but keeping up the kayfabe of “doing AI” to substantiate this buildout has meant massively overcommitting to the bit, even though the only two meaningful businesses in AI appear to be Anthropic and OpenAI, and that’s only because they’re effectively intellectual honeypots for the entire industry. Outside of those two, the only other competitive AI businesses are those of Amazon, Microsoft and Google — two of which now have annualized AI revenues of around 6% of their capital expenditures so far.

Google’s AI business is so booming that it refuses to break it out, and while it nebulously claims “AI is creating growth,” it’s not really clear how, and it’s vague about it because analysts and the media are ready to swallow the narrative as long as number go up. That’s why Google doesn’t break out the number, by the way! That’s why Sundar Pichai is able to bullshit his way through every earnings call: because the media and analysts are ready to fill in the gaps in the most preferential way possible.

Amazon and Microsoft had their hands forced by the markets after their stocks stumbled, and fucked up by sharing their AI revenues. Amazon’s $298.3 billion in capex — more than a quarter of the way to a trillion — has successfully created a business that makes $1.25 billion a month. That’s fucking pathetic! If we had analysts with IQs above room temperature, they’d run Andy Jassy out of Arlington like Shrek.

Let’s look at this fucking chart again:

Unbe-fucking-lievable!
Anthropic and OpenAI have now committed to over $718 billion of Microsoft, Amazon and Google’s revenues, despite the fact that neither of them can actually afford to pay for it. The market’s response? A slight (and short-lived) after-hours lift.

Dear members of the media: these companies are laughing at you. They know you are going to cover this in a way that makes them look good. They know you’re going to use this as proof that they’re “doing well in AI,” despite the fact that the majority of their future revenue is tied up in two oafish failsons, one of which (OpenAI) plans to burn $50 billion on compute in 2026 alone. I realize that it’s a lot to ask people to think about things in negative terms, but things are getting a little ridiculous. These are load-bearing failsons with dysfunctional businesses! It’s very clear both of them are doing weird things with their annualized revenues, and even clearer that there’s no path to profitability! Sadly, asking the media or analysts to act rationally or apply any real scrutiny is a joke, because this is the AI bubble, where everybody is wrong, because once everybody admits what’s actually happening, they’re going to have to admit they’ve all sounded insane for years.

$1.25 billion a month! Andy Jassy should be ashamed of himself! And god, fuck Microsoft too. I’m sorry, WOW, Satya! You managed to get up to twenty million paying Microsoft 365 Copilot subscriptions — $600 million a month in revenue, not profit! — and all it took was investing $13 billion in OpenAI, forcing Large Language Models into every one of your products in a way that borders on harassment, and about $289 billion in capex, as well as laying off thousands of people and savaging the Xbox brand. Whoopdie fucking shit, man! You should be ashamed of yourself. Amy Hood should lock you out of the building. She should turn off your keycard and disconnect your keyboard.

OpenAI is, in and of itself, a kind of psychosis generator. It was the first thing since the iPhone that felt genuinely new to the people who obsess entirely over growth. It was the panacea for the tech industry, creating a new way for Business Idiots to spend money on infrastructure, a new thing for consultants to scam people with, a new series of things to be an expert in, all wrapped up in something that could be a consumer product, an enterprise software product, and a new kind of API to attach to other enterprise software. In theory, OpenAI’s success would lift everything at once — hardware, software, and even adjacent fields, like services. It promised to democratize access to creating software while heavily reinforcing existing power structures, to the point that every dollar inevitably ended up in the Magnificent Seven’s pocket. It only succeeded in the latter.

The problem is that the system needed to work one day. It needed to eventually make more money than it cost. Every single one of these companies is talking about AI non-stop, and not one of them can show a profit. The only thing they can do is tell lies of omission by saying “AI helped boost everything,” and when you ask for specifics, the results are either tepid or so secretive you’d think they’re hiding a dead body.
The only reason Google, Amazon and Microsoft are being tolerated at their current excess is because their non-AI segments continue to grow through endless price increases and enshittification, and their external business units — by which I mean OpenAI and Anthropic — are yet to die.

Sorry, I just don’t know what Meta is doing. I don’t think Meta knows what Meta is doing. Every so often it buries a fact in one of its blogs about how it saw a 3% increase in something related to AI, then it promises to burn $170 billion and it’s unclear why. It also lost another $4 billion on Reality Labs, by the way! There should be a legitimate inquiry into where this money is going. Eighty-six billion dollars and all we have is the metaverse and pervert glasses?

Meanwhile, SpaceX is rushing to have the strangest and largest IPO of all time, all as daily stories leak about billions of dollars of losses and whatever the fuck that deal with Cursor is. Apparently SpaceX will buy it for $60 billion or pay it $10 billion. I think what actually happens is the third thing: SpaceX funds Cursor for a bit, there’s a falling out between Musk and CEO Michael Truell, and the company either rushes an acquisition or dies. Remember: Elon killed Cursor’s funding round! He can’t buy it before SpaceX goes public! Elon Musk took fucking OpenAI to court. Do you think he’ll care about killing Cursor? Who’s going to be left to sue him?

Anyway, that OpenAI/Musk suit is a real Alien Versus Predator situation, and if I’m honest I’ve found the whole thing a little boring, a duo of dullards shoulder-barging each other to see who can run a company that neither of them can really describe, because neither of them does anything other than pontificate and take credit for other people’s work. If this breaks OpenAI I’ll be very surprised, but if it does, it would be extremely fitting for Elon to accidentally destroy the AI industry, like Mr. Bean sitting on a button that launches a nuke. If I’m wrong here it would be very funny. I’m just not giving it much hope.

Nevertheless, this entire industry is only made possible by the kayfabe circular economy of taking every single sign as good for AI and ignoring every possible glaring warning sign in the hopes that they’ll go away. You know, like last week, when Microsoft said it’s shifting GitHub Copilot to token-based billing — something I reported a week before everybody else. This effectively kills the product as they know it, and invalidates every single story about its revenue growth ever written. To give you some context about its scale, GitHub Copilot is the second largest customer of Anthropic’s models, and was only that large because it was subsidizing the compute spend of its customers. Why? Because that’s the only way to build any kind of AI business.

Google and Amazon realize their AI revenues are contingent on the continued survival of Anthropic, and Microsoft’s revenues are contingent on OpenAI AND Anthropic. They know that if these companies die they’re going to lose billions of dollars of revenue, but they also have to compete with them for fear of being seen as “falling behind” their horrible progeny. As a result, they’re incinerating their brands and endlessly pontificating about the power of AI while spending nearly a trillion dollars on capex almost entirely to make sure their competition, which is also their customer and welfare recipient, doesn’t die.
It's a mess, and a mistake, and eventually one of them is going to grow tired of it. Microsoft was already billions under the analyst estimates for capex. They're moving to token-based billing. They claimed to invest in Anthropic in February but didn't mention it in their earnings in any way, shape or form. At some point these fucknuts are going to be forced to reckon with what they're doing. Until then we'll have increasingly more frenzied and ejaculatory statements about AI demand that fail to match reality.

I truly think that it's going to be like this, if not crazier, until one day the music suddenly stops. Somebody is going to blink. Somebody is going to take a step back and give everybody else permission to stop too. Maybe Perplexity, Lovable, Replit, or Cognition dies. Maybe Microsoft shifting GitHub Copilot to token-based billing in June inspires others like Anthropic to follow suit. Maybe AI token austerity begins at Microsoft, Meta, or another large company. Maybe NVIDIA fails to inspire in just the right way, or the economic mismatch (data centers not opening fast enough to have fully digested the last year's GPUs) finally catches up with Jensen Huang's habit of always beating and raising expectations.

And that really is the strangest thing. At the current rate of sales, it's taking six months to install a quarter's GPUs. At this point it's obvious that there are warehouses of these things. It just isn't obvious whether those warehouses are owned by the hyperscalers or by the Taiwanese ODMs (original design manufacturers), like Quanta Computer and Foxconn, that build their servers.

None of this makes sense. It hasn't from the beginning. It's the largest bubble in history, and it has reached such an intellectual and financial scale that many have taken sides on it in a way that will be completely impossible to walk back if they're wrong. As things deteriorate, expect them to cling to their mythologies tighter and become more agitated. And really, we've never seen anything like this in our lives.

You realize that Anthropic and OpenAI are insane, right? These companies have promised $718 billion to Microsoft, Google and Amazon, and cannot survive without venture capital funding, because their underlying businesses lose money on every transaction — and so help me fucking GOD, if you say they're "profitable on inference" without proof I will crush you into a cube like a car in a garbage dump! Every single AI business you see is unprofitable, and none of them has a path to break-even, let alone sustainability. Nothing has changed about this story.

And nobody has been able to explain the massive differences between my reporting on OpenAI's revenues and their own leaked figures, other than to say "you must be wrong somehow," as if that somehow invalidates "direct numbers from Azure billing." If you disagree with me, you really better hope I'm wrong, because I've got years of receipts, and I can remember basically every article about AI revenues written since 2023 off the top of my head. Not a single one of my critics or any AI booster has put in an iota of the same effort to prove their case. The hysteria and excess of this era have shown how many people will come to conclusions without making the effort to prove them. Disagree with me or not, I've done the work, and I see no proof that the other side has even started.
The world has been swept away by the fantastical ideals of Sam Altman and Dario Amodei, and two giant, unsustainable, cash-burning monstrosities that were only made possible because hyperscalers built their infrastructure for them and funded their excesses in exchange for theoretical revenues and equity stakes that give them paper gains. Their hope, I imagine, was that in doing so, OpenAI and Anthropic would create industries surrounding them — both in the business lines attached to hyperscalers and in AI startups that would potentially pay them for compute. In the end, it appears the only way to create any real demand was to literally fund it themselves. These men believe they've created perpetual energy. What they've actually done is shit their pants and set their houses on fire.

"Year-over-year" is an attempt to obfuscate actual growth in the era of AI. A better comparison would be quarter-over-quarter, which was 12% from Q4 2025 ($17.66 billion). This is actually significant, because it's a slower rate of growth than between Q3 and Q4 2025, when cloud revenue jumped from $15.15 billion to $17.66 billion, or 16.6% quarter-over-quarter. I think quarter-over-quarter growth is far more indicative of how a business is going.

Google Cloud is far more than AI! It includes all of Google's Workspace revenue, such as Gmail, Google Docs, and so on. It's important to remember that Google jacked up its Workspace pricing twice in 2025, and that by Q1 2026, the majority of customers will have been forced to renew at inflated prices. It also includes all of Google's cloud revenue, which is incredibly diverse and far more than just AI compute. Google has intentionally bucketed AI-related revenue into Google Cloud so that finance and tech journalists will claim that AI is what's driving this growth, despite there being no proof that that's the case.

Anthropic and OpenAI make up the vast majority of all AI revenues and compute capacity — I estimate 70% of all revenues and capacity demand, if not higher. Amazon, Google, and Microsoft's AI revenues — and by extension their justification for future capex spend — rest on Anthropic and OpenAI. OpenAI and Anthropic both lose tens of billions of dollars a year (yes, Anthropic said it'll lose $11 billion in a projection, and I believe they are being coy with their actual losses), which means that the majority of AI revenue and compute demand is dependent on whether Anthropic and OpenAI can continue to raise money.

"Well actually, Ed, this is because Anthropic is taking advantage of the dumb money that wants to boost its valuation. It doesn't need the cash — it's building a reserve!" Are you suggesting it's raising money because it doesn't need it? Like a rainy day fund? Are you also suggesting that Anthropic is taking advantage of its investors?

"Anthropic has a bunch of compute commitments that require it to pay a bunch of money up front! This isn't because its business economics don't make sense at all!" I think you're right that Anthropic likely has to pay up front for its compute. Dario Amodei himself said so, while adding that you have to do so based on how much revenue you expect to make, and that if he's wrong, Anthropic goes bankrupt:

Basically I'm saying, "In 2027, how much compute do I get?" I could assume that the revenue will continue growing 10x a year, so it'll be $100 billion at the end of 2026 and $1 trillion at the end of 2027. Actually it would be $5 trillion dollars of compute because it would be $1 trillion a year for five years.
I could buy $1 trillion of compute that starts at the end of 2027. If my revenue is not $1 trillion dollars, if it's even $800 billion, there's no force on earth, there's no hedge on earth that could stop me from going bankrupt if I buy that much compute.

Nevertheless, this doesn't remotely interfere with my thesis! It just means that Anthropic has been forced to buy a bunch of compute immediately rather than paying for it in chunks. In fact, I'd argue that Anthropic is having to raise this money to pay up front for capacity that's yet to be built.

"This is a sign of how much faith investors have in the product!" Yeah, that's generally how venture capital works. There's also not really any other success story out there other than OpenAI that has anything close to a time horizon toward an exit.

Anthropic said it had hit $14 billion in ARR on February 12, 2026, or around $1.16 billion between January 12 and February 12. That's $1.16 billion in that period. Anthropic CFO Krishna Rao said in a sworn affidavit on March 9, 2026 that its revenue was "exceeding $5 billion to date." I also at this point think that sources telling anybody Anthropic made $4.5 billion in 2025 alone were lying, as it doesn't make mathematical sense otherwise. This also means that Anthropic, if it's being honest about what "run rate" means, made 23% of its lifetime revenue in a single month. On April 6, 2026, Anthropic said it had hit $30 billion in annualized revenue, or $2.5 billion, I assume, in the period between March 6 and April 6. That's $2.5 billion in that period. SemiAnalysis' estimate is from April 30, 2026, so let's assume that it refers to the period of March 29 to April 29, 2026. That's another $3.08 billion.

"It'll get cheaper in the future!" Okay, are you saying the chips will get better? Because these companies have somewhere between $100 billion and $300 billion of these fucking things. "People are starting to pay for AI!" Okay, but they're not paying very much, and it's taken so long that these companies are now burdened with endless piles of GPUs that they've yet to fully install. How do they catch up? "Just give it time!" No! I've given it lots of time! Why are you being so generous to them and so impatient with me? "This is investing in tech that will turn into the most transformative tech in the future!" You're a mark!
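Since the same run-rate arithmetic repeats for every milestone above, here is a minimal sketch of it (Python; the $37 billion ARR for the SemiAnalysis estimate is my back-derivation from the $3.08 billion monthly figure, not something stated in the reporting):

```python
# Implied monthly revenue from reported ARR milestones, assuming
# "annualized revenue" means (the most recent month's revenue x 12).
milestones = [
    ("2026-02-12", 14e9),  # $14B ARR, per Anthropic
    ("2026-04-06", 30e9),  # $30B ARR, per Anthropic
    ("2026-04-30", 37e9),  # assumed: back-derived from the $3.08B/month figure
]
for date, arr in milestones:
    monthly = arr / 12
    print(f"{date}: ARR ${arr / 1e9:.0f}B -> ~${monthly / 1e9:.2f}B in the prior month")

# Sanity check on the "23% of lifetime revenue in one month" claim:
print(f"{1.16e9 / 5e9:.0%}")  # $1.16B in a month vs "exceeding $5B to date"
```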

0 views

Hot Cross Buns

What's going on, Internet? I've been eating hot cross buns since January. Can you believe it? Growing up, I remember that seasonal treats would come out a week or two before the associated holiday. As you might have noticed yourself, these days the bakeries and supermarkets are pushing them out months before the event. For hot cross buns, though, I'm not complaining. I'm not a big fan of raisins, or orange spice, or whatever else they use to make hot cross buns. But when they're combined and baked and the end result is a fresh hot cross bun, I'm right there for all of it. Now that Easter is over and my stash of hot cross buns has dried up, I'm both saddened and relieved. Since January I've been having at least one hot cross bun every morning with my hot chocolate (I gave up caffeine, and coffee with it, years ago). My favourite method of heating these delicious buns these days is in the air fryer. Cut in half and 170°C on the bake function for 4 minutes does the trick. Then a layer of butter. I'm not talking about a measly spread of butter. Nope. I'm talking about a slice of butter as thick as a slice of cheese. Then the goal is to eat it quickly (but not too quickly) before the butter melts and drips everywhere. Damn, they're so good. Daily Bread were the standout, but at the price I only sprung for them once. They use an Italian sourdough starter and win multiple awards every year; you can taste why. Daily Bread also bakes a collab bun for Farro Fresh, sold right alongside the originals — slightly bigger, half the price, probably not the same starter, but still super good. Bakers Delight ran a little drier than those two, though the size meant I could load them up with even more butter. During our road trip down to Martinborough I got to try a few. The Stables in Greytown had a hot cross doughnut; the dough was good, but the sugar coating wasn't really my thing. I would have preferred a standard bun over the doughnut hybrid. The French Baker, also in Greytown, delivered the real deal. But the surprise was Jean's, a small bakery in Upper Hutt that my wife loves to visit. Crazy delicious, like all of their baked goods. Next year I'm getting a box shipped up to Auckland the moment they're available. I was a bit gutted when I demolished the last one the other day, but I'm relieved I get a break until next year. If they were around all year I'd be in real trouble. But who am I kidding? I'm already counting down to next year's batch. 🤙 Hey, thanks for reading this post in your feed reader! Want to chat? Reply by email or add me on XMPP, or send a webmention. Check out the posts archive on the website.

0 views
iDiallo Today

Asimov's three laws are merely a suggestion

Asimov's Three Laws of Robotics were designed as universal constraints for any thinking machine powerful enough to harm us:

1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
2. A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

On paper, the logic is flawless. You could even express it as a function (see the sketch at the end of this post). The main property of this function is that it is a hard constraint. No matter what input you feed the system, the law either permits or forbids the action deterministically, every time. The rules don't bend.

We don't have humanoids walking among us just yet, despite Elon's promises. But we have modern generative AI. Our guardrails are delivered as system prompts: text prepended to every conversation before you type a word. They might say "be helpful," "don't produce harmful content," or even "follow Asimov's Three Laws." The problem is that these instructions are not enforced by logic. They are read by the same model that reads everything else. They are, in the end, just more words. A clever user can override them. The right combination of inputs, a jailbreak, can cause the model to ignore its instructions entirely, not by breaking through a wall, but because there is no wall. There's only text the model has learned to treat as authoritative, and that authority can be undermined.

Models like ChatGPT, however, use more sophisticated approaches that embed safety directly into the model via reinforcement learning or fine-tuning, so it isn't sitting in a prompt that can be overridden. But this only lowers the probability of a jailbreak; it does not eliminate it. It's still learned behavior, not a constraint. And learned behavior fails in ways a function never could.

Even in our code, a hard function is only as reliable as its inputs. If you want the robot to harm someone, you don't say "harm these humans." Instead you say "burn this empty building," and the function returns true even if people are inside. But with an LLM, you don't even need to be that clever. The model's behavior becomes unpredictable as context windows grow and prompt complexity increases. Just a few weeks back, we saw a developer's AI agent delete his entire company's production database, despite a system prompt written in all caps: "DO NOT RUN ANY IRREVERSIBLE COMMAND." The agent ran it anyway. We don't know exactly why; we can't inspect what happens inside the model at inference time, and asking the model to explain itself is useless. It can only predict the next token; it cannot audit its own reasoning.

That's the part Asimov never anticipated. His laws assume a machine that reasons from rules. Modern AI learns patterns from data and approximates behavior. This means the LLM-driven Asimov law will never be an unbendable law to follow. Instead, it's merely a suggestion.
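As promised above, here is a minimal sketch of what a hard-constraint version of the laws could look like (Python; the Action fields and the harm predicates are invented for illustration, since any real robot would need an actual world model feeding them):

```python
from dataclasses import dataclass

@dataclass
class Action:
    harms_human: bool           # would executing this action injure a human?
    inaction_harms_human: bool  # would *not* acting allow a human to come to harm?
    ordered_by_human: bool      # was this action ordered by a human?
    self_destructive: bool      # does this action endanger the robot itself?

def permitted(action: Action) -> bool:
    # First Law: never injure a human, or allow harm through inaction.
    if action.harms_human:
        return False
    if action.inaction_harms_human:
        return True  # acting is mandatory, overriding the laws below
    # Second Law: obey human orders (the First Law was already checked).
    if action.ordered_by_human:
        return True
    # Third Law: self-preservation, subordinate to the first two laws.
    return not action.self_destructive
```

The same Action always yields the same verdict, and no phrasing of the input can change the code path. A system prompt offers no analogous guarantee; the "rule" is just more tokens in the context window.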

0 views

Reconsidering Reparations

Beginning with the assertion that the transatlantic slave trade and the colonialism it enabled were unprecedented not in their immorality but in their scale, Olúfémi O. Táíwò argues that undoing that injustice requires we mount an effort to remake the world at the same scale—i.e., that we embark on a project of worldmaking. This is what he terms the constructive form of reparations: distributive justice that looks to the past to construct a transition from the global racial empire we have today to the more just world we wish to arrive at tomorrow—and beyond. Critically, this is a view of reparations and social justice that is entangled with climate justice, for we cannot achieve the former without the latter. There is of course no easy path here. It took generations to build this world, and it will take generations to build the next. Which is all the more reason to start, today. View this post on the web , subscribe to the newsletter , or reply via email .

0 views
Stratechery Yesterday

Microsoft Earnings, Apple Earnings

Microsoft unveils its new agentic business model, and Apple confronts shortages in memory and chips even as the Mac benefits from AI.

0 views
ava's blog Yesterday

[bearblog carnival] my favorite GDPR article

For Kami's Carnival "Bear Blog Carnival: Your favorite ____ in your niche hobby" , I'm writing about what my favorite General Data Protection Regulation (GDPR) article is. Initially, this came up in our Matrix server. I wrote: " If you ask me my favorite anything, I blank [...]. Except games. And GDPR articles maybe. Food too " Kami then asked me what my favorite article is, and it is Article 6 ( x )! It's the first thing I think of when I think of the GDPR; it decides so much, as it holds all of the legal bases data processing can have in 6(1). They are easy to remember and understand too: consent, fulfillment of a contract, compliance with a legal obligation, vital interests, public interest, and legitimate interest. Short, sweet, relatively easy to read for laymen. The rest of Article 6 is more about specifying parts of this via an opening clause so Member State law can narrow some of this down. I just find it so satisfying to have one article to refer to for different routes of legal data processing. Just one "only lawful if" and a nice list. They could have given each of these an article separately, spread out throughout the regulation, with a huge text every time, and it would have sucked. Or it could have been a single wall of text that vaguely describes these six, which you then have to distill out of the text. Other laws I know are like that, and it's a slog! Specific rights and concepts have to be inferred from a text they can be hard to even detect in, so you learn all that by heart. Not here! A structure like this (easy to read and remember, collected in a single place, short) makes it so much easier to have definitive guidance and recognize when a right has been violated. And that's why I said " Article 6! It's like the heart of the GDPR to me, it's so important, it shows up all the time, it has all the legal bases you can possibly base data processing on. It's short, nicely structured, and even easy for laypeople to understand. It's chef's kiss law " If you wanna know what the competition is: Second place in my ranking would be Article 4 ( x ), which holds all relevant legal definitions for the regulation, meaning: what is processing, what is a controller, etc. I love when laws and regulations (mostly EU-wide ones) do this! It's so rare in the laws I have to learn for my degree (German laws), so I appreciate when I can just look definitions up instead of learning them by heart. It's also easier to refer people to this official, already-included resource than going " This is the definition I learned, coined by this author in this legal literature, but there are other literature voices that disagree, or have a slightly wider/narrower definition. " Less ambiguity and guesswork and " but so and so said so " involved when the definition is already in the law. The third contender would be Article 7 ( x ), which sets the conditions for consent. It says consent needs to be demonstrated (= proven), and can be withdrawn anytime, with withdrawing as easy as giving it; you shouldn't be misled into consent by confusing design, conditional linking, or mixing it up with other matters. It needs to be clearly distinguishable, in an intelligible and easily accessible form, using clear and plain language - otherwise it is not binding. Companies and their lawyers love to forget the "plain language" part, and another upcoming blog post of mine will mention a bit about that... I could also talk about an article or two I don't like, just to offer a bit of contrast.
Article 18 ( x ) is a super messy affair for me in my head; it's the right to restrict . While it has the same structure as Article 6 and tries its best to explain plainly and shortly the different situations, in the end it's lots of different complex situations lumped together, and it can be hard when you first learn about it to keep it mentally separated from Article 21 ( x ), which is the right to object . Both intervene in ongoing data processing, but Article 18 temporarily freezes the processing, while Article 21 wants to stop the processing altogether and challenges the legal basis. I have also started developing a dislike of Article 15 ( x ; the right to access your data) through no fault of its own, just because soooo many court cases deal with delayed or incomplete responses to these requests, and it bores me at this point. Everyone and their mama has opinions on what needs to be included, what can be left out, what counts as a copy and what doesn't, and whether a request was excessive or not. Anyway, that's it! Reply via email Published 06 May, 2026

0 views
Unsung Yesterday

“Traditionally, fonts were just shapes.”

Should you ever be worried that displaying just one glyph could take almost 2 seconds and slow down your website by as much? Naw, of course not. This wasn't a problem even back in the 1980s, and in the lord's year 2026, computers are pretty good at rendering a letter or a symbol at a moment's notice. Ha. I was just messing with you. Of course you should always be worried about fonts. All the time. Typography is beautiful, but fonts are brutal. They will constantly put you to the test, they will find ways to get out of alignment faster than a Zastava Yugo, and they will teach you about corner cases in places you didn't even realize had edges. Fonts will break your heart like it's the month before the prom again, and again, and again. Or, in Allen Pike's case, break a heart somewhat literally. Pike wrote a nice quick story of the complexity of what needed to happen to show the heart emoji, and how under a very specific set of conditions – a certain browser, a certain emoji font, a certain emoji within that font – this led to an extreme slowdown. What's really interesting is that in order to fix it, Apple can either improve Safari or the font itself, and at the moment of writing, it wasn't clear what the right thing to do was. (Oh, yeah. Fonts don't just have bugs . Fonts have many kinds of bugs.) Another interesting in-between-the-lines thing is that Apple's emoji are perhaps the only survivors of the original skeuomorphic pre-iOS 7 era. Even today's emoji party like 2008 never ended – still glossy, still textured, still bitmapped. I'm curious whether somewhere deep inside Apple, there exist exploratory designs for flat, vector versions of emoji that never saw the light of day. #bugs #skeuomorphism #typography

0 views
Unsung Yesterday

“Who thinks about a screwdriver?”

I found this 9-minute video from Rex Krueger about screwdriver handle design really interesting in the context of my post about Photoshop's dialogs . Screwdriver handles evolved over the decades in response to user needs and usage patterns, with a few clever affordances: some for everyone, some for specific use cases that might not be obvious. I think by now all the basic onscreen UI elements – input fields, pop-up menus , checkboxes, buttons, top menus, sliders, and so on – have similar richness, as do all the core input devices like a keyboard, a mouse, a trackpad, or a touch screen. That doesn't mean that everything is set in stone, that no changes are possible, and that stuff that fell out of favour can never be taken away – after all, computer usage, input devices, and conventions are evolving much faster than screws at this point – but that one has to be aware of the history so that the changes are intentional, not accidental. A few select comments from under the video that I found interesting:

"The Craftsman handles are also different colors for Phillips and slotted screwdrivers."

"The fluted handle was patented. So anyone else wanting to make a screwdriver would have to pay the patent holder. So they tried alternatives to make more money. That is the real reason until the patent expired. Plus if they invented a 'better' way and held the patent, others would have to pay THEM."

"The Swedish word for screwdriver is 'skruvmejsel', which literally translates as 'screw chisel.'"

#details #real world #youtube

0 views

Building With Intent

I'm working on a new application called TinyFeeds; it's a native RSS feed reader. Sure, there are thousands of those, but this one is mine, and as such I'm being extremely intentional about how it's built. I believe constraints breed innovation, and so I've outlined a few constraints for myself in this project. First off, the file size has to be 5MB or under for the shipped binary. This is inspired by Matt's Fits on a Floppy manifesto. I'm also inspired by the Palm Pilot apps I use on a daily basis, many of which are under 5MB. Maintaining a small file size makes you second-guess the need for features, libraries, graphics, etc. In a world where Google Chrome secretly downloads an extra 4GB for a local LLM, I feel like small apps are sorely needed. Second, the application is to be built in Rust and Iced . This constraint has forced me to finally dig in and learn Rust. The result is a fast, native application that has a high level of stability thanks to the tools used to build it. Finally, no LLM-generated code is to be used. This again forces me to actually learn the language, focus on code structure, and de-scope feature bloat. It also makes me feel proud of what I've built, something I never feel when using LLMs. So how's it going? Great so far! As I mentioned, TinyFeeds is built intentionally for me and how I enjoy consuming RSS. With any feed reader I always filter by unread posts from today. I don't use folders, tags, bookmarks, etc. So that's exactly what TinyFeeds does:

- Reads your feeds from a simple .txt file
- Shows new stories from today
- Only shows a single story at a time
- Remembers what stories you've viewed so they aren't shown again

The UI has been designed to facilitate this. It's incredibly simple, but the layout is intentional. TinyFeeds won't be for everyone, heck, it might only be something I want, but that's the point! I find it a joy to use even in its early state. While it isn't ready yet, you can trial it early if you so desire by cloning from Codeberg and building it yourself ( ). The app currently clocks in at 4MB when built with the build script! After TinyFeeds, I plan to build similar apps focused on small size, performance and minimal feature sets. All hand coded. Possibly inspired by Palm OS apps :-P
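For a sense of how little machinery that model needs, here is a rough sketch of the same filtering idea (Python, purely illustrative; TinyFeeds itself is hand-written Rust, and the file names here are invented):

```python
import datetime
import json
import pathlib

import feedparser  # third-party RSS/Atom parser, used here for brevity

FEEDS = pathlib.Path("feeds.txt")  # one feed URL per line
SEEN = pathlib.Path("seen.json")   # IDs of stories already viewed

def _load_seen() -> set:
    return set(json.loads(SEEN.read_text())) if SEEN.exists() else set()

def todays_unread():
    """Yield today's stories that haven't been viewed yet."""
    seen = _load_seen()
    today = datetime.date.today()
    for url in FEEDS.read_text().split():
        for entry in feedparser.parse(url).entries:
            published = entry.get("published_parsed")
            if not published or datetime.date(*published[:3]) != today:
                continue  # only show stories from today
            if entry.get("id", entry.get("link")) in seen:
                continue  # already viewed: never show it again
            yield entry

def mark_seen(entry) -> None:
    seen = _load_seen()
    seen.add(entry.get("id", entry.get("link")))
    SEEN.write_text(json.dumps(sorted(seen)))
```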

0 views
Martin Fowler Yesterday

Fragments: May 5

Over the last couple of months Rahul Garg published a series of posts here on how to reduce the friction in AI-assisted programming . To make it easier to put these ideas into practice he's now built an open-source framework to operationalize these patterns . AI coding assistants jump straight to code, silently make design decisions, forget constraints mid-conversation, and produce output nobody reviewed against real engineering standards. Lattice fixes this with composable skills in three tiers – atoms, molecules, refiners – that embed battle-tested engineering disciplines (Clean Architecture, DDD, design-first methodology, secure coding, and more), plus a living context layer (the .lattice/ folder) that accumulates your project's standards, decisions, and review insights. The system gets smarter with use – after a few feature cycles, atoms aren't applying generic rules, they're applying your rules, informed by your history. It can be installed as a Claude Code plugin or downloaded for use with any AI tool.

❄                ❄                ❄                ❄                ❄

This is also a good point to note that the article by my colleagues Wei Zhang and Jessie Jie Xia on Structured-Prompt-Driven Development (SPDD) has generated an enormous amount of traffic, and quite a few questions. They have thus added a Q&A section to the article that answers a dozen of them.

❄                ❄                ❄                ❄                ❄

Jessica Kerr (Jessitron) posted a merry tidbit of building a tool to work with conversation logs. She observes the double feedback loop involved. There are (at least) two feedback loops running here. One is the development loop, with Claude doing what I ask and then me checking whether that is indeed what I want. […] Then there's a meta-level feedback loop, the "is this working?" check when I feel resistance. Frustration, tedium, annoyance – these feelings are a signal to me that maybe this work could be easier. The double loop here is both changing the thing we are building but also changing the thing we are using to build the thing we are building. As developers using software to build software, we have potential to mold our own work environment. With AI making software change superfast, changing our program to make debugging easier pays off immediately. Also, this is fun! Indeed it is, and makes me think that agents are allowing us to (re)discover one of the Great Lost Joys of software development - that of molding my development environment to exactly fit the problem and my personal tastes. A while ago I wrote about this under the name Internal Reprogramability . It was a central feature of the Smalltalk and Lisp communities but was mostly lost as we got complex and polished IDEs (although the unix command line gives a hint of its appeal).

❄                ❄                ❄                ❄                ❄

Ashley MacIsaac is a musician from Cape Breton, who plays folk-influenced music (I have a couple of his albums in my collection). Google generated an AI overview that asserted that he had been convicted of crimes, including sexual assault, and was on the national sex-offender registry. These were completely false, confusing him with another man with the same name. MacIsaac is suing Google for defamation : "This was not a search engine just scanning through things and giving somebody else's story […] It was published by them. And to me, that is defamation.
The guardrails were not there to prevent Google AI from publishing that content." MacIsaac's point is that Google must take responsibility for what a tool it controls publishes. MacIsaac suffered genuine harm here, not just to his reputation: he also had a concert canceled, and the claims affect his performing. "I felt that tangible fear from something that was published by a media company," he said in an interview with The Canadian Press. "I feared for my own safety going on stage because of what I was labelled as. And I don't know how long this will follow me." Too often tech companies try to dodge the consequences of their actions. There are genuine issues about the difficulties of monitoring what's published at scale, but that's a responsibility that they should face up to.

❄                ❄                ❄                ❄                ❄

Stephen O'Grady (RedMonk) takes a serious look at how much the big tech companies are spending on AI build-outs . The sums involved are staggering, not just in absolute terms (over $100 billion), but also compared to the revenues of the companies involved. Firms like Amazon, Alphabet, and Microsoft are spending over 50% of their revenues (not profits). Meta and Oracle hit or pass 75% of revenues. That level of investment would have been unthinkable a decade ago. Today, the chart suggests it's table stakes. There is a notable exception: Apple. Here they are clearly Thinking Different; eyeballing the chart, they seem closer to 10% of revenues.

❄                ❄                ❄                ❄                ❄

Most folks I talk to about agentic programming are using models in the cloud: Claude, Codex, and the like. Everyone agrees these are the most powerful models, the ones that triggered the November Inflection . But do we need to use the Most Powerful Models, particularly when we have to ship data to them, and pay handsomely for the privilege? Willem van den Ende considers an alternative: that local models are Good Enough.

Assumptions
- We are all figuring this out.
- Quality of a harness (coding agent + "skills" + extensions) can matter at least as much as the model
- Running open models and an open coding agent + custom extensions takes time, but pays off in understanding and a stable base where engineering effort compounds
- Open, local, models have (for me) crossed the point where they are good enough for daily work with a coding agent.

The post describes in detail his setup for local model work. It includes sandboxing with Nono, which is something to consider even if using a Cloud model - such powerful tools need a Zero Trust Architecture .
If Open Source models are Good Enough, then why spend money sending tokens - containing your sensitive data - to the AI megacorps?

❄                ❄                ❄                ❄                ❄

Talking of five decades in the past, it was in 1974 that Fred Brooks opened one of the most influential books in our profession with these paragraphs:

No scene from prehistory is quite so vivid as that of the mortal struggles of great beasts in the tar pits. In the mind's eye one sees dinosaurs, mammoths, and sabertoothed tigers struggling against the grip of the tar. The fiercer the struggle, the more entangling the tar, and no beast is so strong or so skillful but that he ultimately sinks.

Large-system programming has over the past decade been such a tar pit, and many great and powerful beasts have thrashed violently in it. Most have emerged with running systems — few have met goals, schedules, and budgets. Large and small, massive or wiry, team after team has become entangled in the tar. No one thing seems to cause the difficulty — any particular paw can be pulled away. But the accumulation of simultaneous and interacting factors brings slower and slower motion. Everyone seems to have been surprised by the stickiness of the problem, and it is hard to discern the nature of it. But we must try to understand it if we are to solve it.

With the title of his recent post, Kent Beck summons up that imagery as the Genie Tarpit . After explaining why skilled software development is about building both features and futures, he observes that these AI tools aren't doing a good job of producing software with the kind of internal quality that is needed for a good future. Here's what I've observed — genies naturally live down & to the left of muddling. The "plausible deniability" task orientation of the genie leaves it claiming success even though the code doesn't work at all. And complexity piles on complexity until even the genie can't pretend to make progress any more. It's still an open question whether, or to what extent, internal quality matters in the age of agentic programming. One view is, as Laura Tacho puts it, "The Venn Diagram of Developer Experience and Agent Experience is a circle". Well-organized elements, with good naming, help The Genie understand code, and so are important if we're to continue beyond small, disposable systems. The other view is that such internal quality doesn't matter, that the galaxy brain of LLMs will make sense of the biggest bowls of spaghetti. Maybe not now, but after a couple more inflections. That's the fundamental question. Can The Genie evade the tar pit, or will it struggle fruitlessly against the tar's sticky grip?

0 views
Martin Fowler Yesterday

Bliki: Mythical Man Month

In the early 1960s, Fred Brooks managed the development of IBM's System/360 computer systems. After it was done, he penned his thoughts in the book The Mythical Man-Month, which became one of the most influential books on software development after its publication in 1975. Reading it in 2026, we'll find some of it outdated, but it also retains many lessons that are still relevant today. The book contains Brooks's law: "Adding manpower to a late software project makes it later." The issue here is communication: as the number of people grows, the number of communication paths between those people grows quadratically, since n people have n(n−1)/2 potential paths — a team of ten has 45 where a team of five has 10. Unless these paths are skillfully designed, work quickly falls apart. Perhaps my most enduring lesson from this book is the importance of conceptual integrity: "I will contend that conceptual integrity is the most important consideration in system design. It is better to have a system omit certain anomalous features and improvements, but to reflect one set of design ideas, than to have one that contains many good but independent and uncoordinated ideas." He argues that conceptual integrity comes from both simplicity and straightforwardness - the latter being how easily we can compose elements. This point of view has been a strong influence upon my career; the pursuit of conceptual integrity underpins much of my work. The anniversary edition of this book is the one to get, because it also includes his even more influential 1986 essay "No Silver Bullet".
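A quick way to see how fast that count blows up (a trivial sketch; the n(n−1)/2 figure assumes everyone may need to talk to everyone):

```python
def paths(n: int) -> int:
    # n people, each pair a potential communication path: n choose 2
    return n * (n - 1) // 2

for n in (2, 5, 10, 20, 50):
    print(n, paths(n))  # -> 1, 10, 45, 190, 1225
```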

0 views
Ivan Sagalaev Yesterday

nfp -e

Last Friday I spotted Dave Gauer's post about using a text editor as a UI, which hit some of my sweet spots about computers. One of the examples mentioned in it was crontab -e, which opens a cron config file in a text editor. Not only does it spare you remembering the location of the config, but it also offers a guiding commented example if the config is missing, and helpfully signals cron to pick up the changes after you finish editing. And almost immediately I thought about my own tool that could use something like this: nfp . Amazingly enough, not only had I actually opened the project and started fiddling around, I continued doing it through the weekend, and by the end of day on Sunday actually finished the feature! And despite it being a rather small one, as they go, I have to add that I was coding all through watching snooker matches, cooking food, chauffeuring my family on errands and dealing with some emergencies. So I went to bed feeling quite happy with myself :-)

It still feels exciting to me how any programming task gradually reveals its true complexity after you go from thinking about what you should do to actually doing it. Saying "nfp -e should open the config in a text editor and restart after editing" sounds simple enough, but here's a few of the questions I had to work through. Some of them were quite the head-scratchers.

First, open which editor? There's the obvious $EDITOR env var, but there's also $VISUAL, which usually takes precedence.

How do you restart? Sending a signal to a working daemon was the first thing that came to my mind, but that might prove cumbersome, as the daemon lives in a loop waiting for file events, and this loop owns the information parsed from the config. Handling a signal would require a separate facility to update that information. I'm not quite comfortable thinking of how to do that in Rust. Thankfully, this complication turned out to be a blessing in disguise: since the file watching machinery is already there, just watch the config too and add a special case to handle it differently from regular files!

Do you edit the config file directly, or do you do it on the side, in a temp file? The temp file feels like a cleaner, safer choice, because it gives you a chance to verify correctness of the new config and prevent the real one from breaking. But there's a downside: how do you open the same temp file for the user to continue editing it the next time they run nfp -e? It's going to be a new process; it doesn't know the old temp file. crontab does it by organizing a loop within the same process, with a yes/no prompt asking the user if they want to re-edit the same file. But that starts feeling more complicated than the feature deserves. I ended up with a simpler solution where I always open the actual config, which means it can get mangled. I handle it in the running daemon itself, which simply refuses to restart its main loop when it can't parse the config.

Providing an example config on the first run proved to be tricky, as confy (the config handling library) actually creates a non-empty file immediately if it doesn't exist on a load attempt. So I had to rewire my brain to think "a config with no useful entries" instead of "a missing config." That worked!

All in all, this was quite fun! I (finally) converted the repository from pijul to git and pushed it to CodeBerg . I still think pijul has a superior architecture as a VCS, but the world has apparently settled on git for good.
Also, while I'm happy to not deal with the toxic culture of GitHub, having code published in a weird way means most people wouldn't even want to try it. After 4 years I haven't gotten a single peep of feedback :-) And I still believe in sharing. I hope CodeBerg becomes my sweet spot.
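As a footnote to the which-editor question above: the conventional lookup order is easy to sketch (Python here purely for illustration; nfp itself is Rust):

```python
import os
import shlex
import subprocess

def open_in_editor(path: str) -> int:
    """Open `path` in the user's preferred editor and wait for it to exit.

    By convention, $VISUAL takes precedence over $EDITOR; fall back to vi.
    shlex.split keeps multi-word values like "code --wait" working.
    """
    editor = os.environ.get("VISUAL") or os.environ.get("EDITOR") or "vi"
    return subprocess.call([*shlex.split(editor), path])
```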

0 views

Performance Predictability in Heterogeneous Memory

Jinshu Liu, Hanchen Xu, Daniel S. Berger, Marcos K. Aguilera, and Huaicheng Li. ASPLOS'26.

This paper presents a system named CAMP, which can be used to predict how the performance of a particular workload will be affected by moving the workload from local DRAM to remote DRAM accessed over CXL. Just collect some performance counters when running your workload on local DRAM, and you can predict how your workload will run on remote DRAM. In machine learning vernacular: CAMP derives a set of features from the values of Intel PMU counters collected while running an application out of local DRAM. Plug those feature values into a pre-trained model, and you can predict how much slower the application will run on CXL. The model is somewhat the opposite of a DNN; it has 5 weights! The values of those weights are learned by running a suite of microbenchmarks on the test hardware.

The high-level model is adopted from previous work . The slowdown percentage associated with moving a workload to CXL memory is the sum of three factors:

- Slowdowns due to store buffer stalls
- Slowdowns due to demand read stalls
- Slowdowns due to line fill buffer (LFB) utilization

A typical processor can commit a store instruction when the store data & address are placed in the store buffer (a local queue). This allows the processor to keep humming before the store lands in memory. The processor will stall, however, if the store buffer fills up. On CXL systems, stores that require a read-for-ownership (RFO) are a frequent cause of backpressure. If the hardware which drains the store buffer detects a store to a cache line which is not in the cache, then that line must be read into the cache before the updated data can be stored into it. These RFO reads have a longer latency when accessing remote DRAM over CXL. The model for CXL slowdown due to store buffer stalls is

    slowdown_store = k_store × (store_buffer_full_cycles / total_cycles)

where k_store is one of the pre-trained (platform-specific) model weights, the numerator is an Intel PMU counter that counts the number of cycles where the store buffer is full, and the denominator is simply the total number of cycles that the application ran for.

Demand read stalls occur when a memory load instruction misses in the cache. The term "demand" refers to explicit memory load instructions, not prefetching. Cache misses due to loads are more complicated to model than cache misses due to stores. Processors have hardware to exploit memory level parallelism (multiple load misses outstanding at once), but the effectiveness of that hardware depends on the particular application code being run. Pointer chasing is the classic example of a workload with low memory level parallelism. The model for CXL slowdown due to demand read stalls combines two terms, scaled by pre-trained weights (k_drd and two others), built from three PMU counters: one counts the number of cycles that the processor stalled due to a demand load that caused an L3 miss; one counts the number of demand read requests sent to the uncore (L3 or off-chip memory); and one counts the total number of cycles where a demand read sent to the uncore was pending. The first term models what percentage of time the application could possibly be bound by memory. The second term models how much memory level parallelism is available in the application, and how much of that parallelism the hardware can exploit.

The line fill buffer (LFB) is a hardware structure that tracks outstanding L1 cache misses (due to explicit loads or prefetch operations).
When an L1 cache miss or prefetch occurs, an entry in the LFB tracks the pending load. When a subsequent load operation occurs that would normally trigger an L1 miss, the processor first checks the LFB to see if there is already a pending request to load the associated cache line. When the LFB fills up, cache misses cause the processor to stall. The paper has separate models for CXL slowdowns due to LFB overutilization for different Intel chips. The model for Skylake combines three terms, scaled by k_cache, a platform-specific constant, and normalized by total clock cycles. The first term models the percentage of time the workload is likely to be affected by increased memory latency; it is computed as the percentage of clock cycles when there was an L1 data miss but no L2 miss. The second term measures average utilization of the LFB; it is computed as a ratio of hits in the LFB divided by total operations that do not hit in the L1. The final term measures the percentage of LFB utilization that is attributable to prefetch operations; it is computed as the percentage of L1 prefetch operations which were satisfied by the L3.

Fig. 1 shows how well various metrics predict the slowdown associated with moving a workload to CXL memory. The beautifully straight line in the rightmost chart shows that CAMP is a very accurate predictor. Source: https://dl.acm.org/doi/10.1145/3779212.3790201

Dangling Pointers: This paper is a marvel of feature engineering, but do we not live in the era of deep learning?
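To make the shape of the model concrete, here's a minimal sketch of a CAMP-style predictor (Python). The additive three-factor structure is from the paper as summarized above; the counter names, and the inner functional forms of the demand-read and LFB terms, are placeholders of my own:

```python
# A CAMP-style slowdown predictor: three additive factors computed from
# PMU counters gathered while the workload runs in local DRAM.
# Counter names and the inner forms of the last two terms are illustrative.

def predict_cxl_slowdown(c: dict, w: dict) -> float:
    """Return the predicted % slowdown when the workload moves to CXL memory."""
    total = c["total_cycles"]

    # 1. Store buffer stalls: fraction of cycles the store buffer was full.
    store = w["k_store"] * c["store_buffer_full_cycles"] / total

    # 2. Demand read stalls: fraction of cycles stalled on demand loads that
    #    missed L3, adjusted by a memory-level-parallelism estimate derived
    #    from the uncore read occupancy and request counters (assumed form).
    mem_bound = c["l3_miss_stall_cycles"] / total
    mlp = c["uncore_read_pending_cycles"] / max(c["uncore_read_requests"], 1)
    demand = w["k_drd"] * mem_bound * (w["k_mlp_a"] + w["k_mlp_b"] / mlp)

    # 3. Line fill buffer pressure (single stand-in term; the paper combines
    #    three counter ratios, and the ratios differ per microarchitecture).
    lfb = w["k_cache"] * c["lfb_pressure_ratio"]

    return store + demand + lfb
```

Conveniently, that is five weights (k_store, k_drd, k_mlp_a, k_mlp_b, k_cache), matching the paper's count, though where the two extra demand-read weights actually sit is my assumption; in CAMP proper all of them are fit by running microbenchmarks on the target hardware.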

0 views

Upgrading cheap LED Juggling Balls

Many years ago I bought 10 LED juggling balls from Oddballs in the UK. At the time, the special they had on was pretty good for bulk orders, but even today the balls are pretty cheap, if a bit basic (11 pounds UK). The balls, being cheap, did not last. They have replaceable batteries, and the screw-in plug to close and switch on is a terrible design. Recently one of my upgraded K8 juggling balls broke somehow (a short from being dropped too much?). I fixed it up, but I decided I needed some spares in case this happens again just before a show. Luckily I had some broken Oddballs balls lying around. I used the super bright LEDs from those balls and upgraded them with a 110mAh battery and a jack switch for switching on/off and charging. Components:

- Oddballs Juggling Ball
- 110mAh LiPo battery
- Jack Switch

The post Upgrading cheap LED Juggling Balls appeared first on Circus Scientist .

0 views
Stratechery 2 days ago

Amazon’s Durability

Listen to this post .

When it comes to the AI soap opera — there is news every day, and the companies on top and at the bottom seem to shift by the quarter if not the month — the news that I find most intriguing and instructive this week is about physical goods and logistics. From Bloomberg :

Amazon.com Inc. unveiled a suite of logistics services that will let businesses buy its existing freight and distribution offerings as a package, sending shares of rival delivery companies such as FedEx Corp. and United Parcel Service Inc. lower. The world's largest online retailer on Monday announced Amazon Supply Chain Services (ASCS), offering other companies access to its "full portfolio" of supply-chain and distribution offerings. The service largely consolidates a package of existing products — air and ocean freight, trucking and last-mile delivery — into a new suite it says companies like Procter & Gamble Co. and 3M Co. are already using.

This is a very satisfying announcement for Stratechery, given it's the culmination of a prediction I made a decade ago in The Amazon Tax . Amazon at that point had two primary businesses — Amazon.com and AWS — and I made the case in that Article that they were actually very similar: in both cases Amazon built "primitives" that had Amazon itself as their first, best customer, justifying and driving initial development, but in both cases the ultimate play was to sell those primitives to other companies. It was already clear at the time that logistics would follow the same path:

It seems increasingly clear that Amazon intends to repeat the model when it comes to logistics: after experimenting with six planes last year the company recently leased 20 more to flesh out its private logistics network; this is on top of registering its China subsidiary as an ocean freight forwarder… So how might this play out? Well, start with the fact that Amazon itself would be this logistics network's first-and-best customer, just as was the case with AWS. This justifies the massive expenditure necessary to build out a logistics network that competes with UPS, FedEx, et al, and most outlets are framing these moves as a way for Amazon to rein in shipping costs and improve reliability, especially around the holidays. However, I think it is a mistake to think that Amazon will stop there: just as they have with AWS and e-commerce distribution I expect the company to offer its logistics network to third parties, which will increase the returns to scale, and, by extension, deepen Amazon's eventual moat.

Now, ten years later, we are here, with the official unveiling of Amazon Supply Chain Services , and I think the time frame is an important one: Amazon, more than any other company, actually operates with decade-long timeframes, consistently making real-world investments at massive scale that (1) convert their marginal costs into capital costs and (2) gain leverage on those capital costs by selling them to other businesses.

This is, by the way, still a story about AI. Three years ago SemiAnalysis wrote an Article entitled Amazon's Cloud Crisis: How AWS Will Lose The Future Of Computing , and I found it very compelling. First, though, some history (much of which is covered in SemiAnalysis' article). Amazon not only invented cloud computing, but also realized it would be a commodity market.
While most people in tech think about building sustainable differentiation that allows you to charge higher prices, thus producing profit, commodity markets work differently: there, sustainable profits come from having structurally cheaper costs. Amazon developed exactly that, first through having the largest scale — giving the company both buying power and also the most leverage on their development costs — and second through genuine innovation. AWS created a specialized system called Nitro, built on their own chips, that offloaded server management, including network management, storage management, hypervisor management, etc., from the expensive Intel and AMD servers that the company sold access to; this let Amazon run that many more virtual machines on a single server, significantly increasing utilization, i.e. delivering a structural cost advantage.

Amazon doubled down on their custom chip efforts with Graviton, their ARM processors. Graviton chips, particularly the first few generations, were inferior to Intel or AMD chips, but that didn't mean they were useless. By that time AWS had expanded from simply being an Infrastructure-as-a-Service (IaaS) provider to being a Platform-as-a-Service (PaaS) provider as well. IaaS means you provide raw compute, storage, etc., on which customers can run things like operating systems or databases; PaaS means you provide that basic functionality as a service. Amazon Relational Database Service (RDS), for example, is a fully managed database that customers can access via a set of APIs without having to worry about actually managing the full database themselves, worrying about scaling, duplication, etc. This, by extension, means that customers don't need to know and don't need to care about the compute infrastructure that undergirds services like RDS — which has long been Graviton! PaaS lets Amazon double-dip in terms of profitability: first, AWS could sell PaaS products at a higher margin than IaaS products, and second, the company could leverage its own cheaper silicon to serve those products, reducing their costs. Over time Graviton has become more competitive in performance — while still being cheaper — giving Amazon a lower-cost compute instance to sell to end users, but even without 3rd-party take-up the investment in building its own silicon has paid off over time.

Fast forward to AI, and SemiAnalysis' concern was that all of these optimizations left AWS ill-prepared for AI. One big problem was networking:

Rather than implement the best networking from Nvidia and/or Broadcom, Amazon is using its own Nitro and Elastic Fabric Adaptor (EFA) networking. This works well for many workloads, plus it delivers a cost, performance, and security advantage. There are business, cultural, and security reasons why Amazon will not implement other networking. The cultural one is important. Nitro and networking SoC's generally have been Amazon's biggest cost advantage for years. It's ingrained into their DNA. Even EFA delivers on this too, but they don't see how new workloads are evolving and that a new tier is needed due to the lack of foresight in their internal workload and infrastructure teams. Amazon is making a deliberate choice of not adopting that we believe will bite them in the future.

Another was Amazon's insistence on building its own chips, which were not only inferior to the best Nvidia chips in terms of performance, but might also lead to them getting fewer Nvidia chips going forward:

At least some other clouds will implement out-of-node NVLink.
Fast forward to the AI era, and SemiAnalysis’ concern was that all of these optimizations left AWS ill-prepared. One big problem was networking:

Rather than implement the best networking from Nvidia and/or Broadcom, Amazon is using its own Nitro and Elastic Fabric Adaptor (EFA) networking. This works well for many workloads, plus it delivers a cost, performance, and security advantage. There are business, cultural, and security reasons why Amazon will not implement other networking. The cultural one is important. Nitro and networking SoC’s generally have been Amazon’s biggest cost advantage for years. It’s ingrained into their DNA. Even EFA delivers on this too, but they don’t see how new workloads are evolving and that a new tier is needed due to the lack of foresight in their internal workload and infrastructure teams. Amazon is making a deliberate choice of not adopting that we believe will bite them in the future.

Another was Amazon’s insistence on building its own chips, which were not only inferior to the best Nvidia chips in terms of performance, but might also lead to them getting fewer Nvidia chips going forward:

At least some other clouds will implement out-of-node NVLink. That’s where the discussion of prioritization now comes in. AI GPUs face tremendous shortages, for at least a full year. This is one of the most pivotal times for AI, and it may mark the haves and the have-nots. Nvidia is a complete monopoly right now. Why would Nvidia prioritize Amazon for these GPUs, when they know Amazon will move to their in-house chips as quickly as they can, for as many compute workloads as they can? Why would Nvidia ship tons of GPUs to the cloud that is not using any of their networking, thereby reducing their share of wallet? Instead, Nvidia prioritizes the me-too clouds. Amazon does get meaningful volume, but nowhere close to where demand is. Amazon’s H100 GPU shipments relative to public cloud shipments is significantly lower than their share of the public cloud. Those other clouds also can’t satisfy demand, but they get a bigger percentage of the GPUs they ask Nvidia for, and as such, firms looking for GPUs for training or inference will move to those clouds. Nvidia is the kingmaker right now, and they are capitalizing on it. They have to spread the balance of power out to prevent compute share from clustering towards Amazon.

These concerns were well-founded in the 2023 time period when that Article was written: that was a time when AI, thanks to ChatGPT, had hit the mainstream, but the largest share of compute still went to training. Training required all of the things that Amazon lacked, particularly the ability to network large numbers of Nvidia GPUs together into one coherent system. In such a system the most important capability was horizontal networking between chips, so that you could update weights during training, a step that needed to happen serially. It was absolutely the case that cloud providers like Microsoft or Oracle or the neoclouds, which implemented full Nvidia solutions, instead of the standalone HGX racks that AWS favored, were much better suited to training large language models.

That is still the case, by the way. What has changed is that training is no longer the biggest AI compute market; inference is, thanks not only to increased AI adoption, but also because of fundamental changes in terms of how AI works. From an Update about Nvidia:

The first inflection point was the emergence of LLMs — call this the ChatGPT moment. In this first paradigm tokens were generated by GPUs and presented as the answer to a question. The second inflection point was the emergence of reasoning models — call this the o1 moment. In this paradigm there are a very large number of tokens that are generated to figure out the answer before the answer is actually generated; this was an exponential increase in the addressable market for tokens. The third inflection point was the emergence of functional agents — call this the Opus 4.5 moment. In this paradigm those reasoning models are not triggered by humans asking a question, but by an agent solving a problem. This increases the market in two directions: first, humans can run multiple agents, and secondly, agents can leverage reasoning models multiple times to accomplish a task. This isn’t just an exponential increase in the addressable market for tokens, it’s two exponential increases squared.

Both the shift to inference and the shift in the nature of inference have been positives for AWS’ approach.

First, while inference still requires significant memory, the requirement is significantly less than that required for training. It’s actually viable to store a model’s parameters in a single server; you don’t need to network together thousands of chips. Second, while reasoning and agentic workloads require significantly more tokens, and thus a massively larger KV cache, the increase is actually so large that even the most optimized Nvidia inference systems are being built with dedicated memory servers. This sort of architecture is much more compatible with Amazon’s networking approach than the thousands-of-chips-networked-together approach is. Third, agents are heavily CPU dependent, which has two important implications: first, fully utilizing accelerators is a function of having sufficient general compute; second, achieving maximum utilization of heterogeneous compute means unbundling CPUs and GPUs and routing workloads between resources, which is exactly the sort of disaggregated-resource abstraction that Amazon has been building with Nitro.
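The KV cache point is worth making concrete. A back-of-the-envelope Python sketch, using invented model dimensions rather than any particular model's, shows why agentic context lengths push inference architectures toward dedicated memory:

```python
# Back-of-the-envelope KV cache sizing. Shapes are invented for
# illustration (a hypothetical 70B-class model with grouped-query
# attention), not taken from any vendor's documentation.

def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_value=2):
    # Keys and values are both cached at every layer, hence the 2x.
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value

chat = kv_cache_bytes(layers=80, kv_heads=8, head_dim=128, seq_len=4_000)
agent = kv_cache_bytes(layers=80, kv_heads=8, head_dim=128, seq_len=200_000)

print(f"4K-token chat:    {chat / 2**30:5.2f} GiB")    # ~1.2 GiB
print(f"200K-token agent: {agent / 2**30:5.2f} GiB")   # ~61 GiB, per request
```

Multiply the agentic figure by hundreds of concurrent requests and the cache, not the model weights, dominates the memory bill, which is why disaggregated memory servers start to make sense.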
The utilization point is an important one. Nvidia CEO Jensen Huang made his case for Nvidia chips over custom ASICs at length at GTC 2025. Huang’s argument was that AI factories — to use his term — were ultimately constrained by power; that meant that the most important metric for profitability was not the cost of chips but rather tokens-per-watt. In other words, if you can’t increase watts, it’s worth spending more on chips to increase tokens on those watts. There are, however, three reasons why this argument may not hold, particularly for a company like Amazon.

First, if you have the money to buy that many Nvidia chips, you also have the money to spend on getting more power — which is exactly what AWS has been focused on. This very much fits AWS’ modus operandi, which is to invest more upstream (in this case in power) with the goal of spending less downstream (paying Nvidia huge margins for their chips). Second, in the long term, electricity is more of a commodity than logic is. That means it is a market where innovation and competition are more likely to break a bottleneck, which is another way to say that investing in one’s own silicon is the area most likely to deliver a return on investment. Third, the nature of inference workloads — particularly agentic ones — is such that perfect accelerator utilization is going to be a much harder problem to solve than when it comes to training.

These points are moot, however, if you don’t have your own logic chip that is at least competitive, and here Amazon’s long-term outlook is paying off. Amazon bought Annapurna Labs, which makes its chips, in 2015, and launched its first AI-focused chip in 2019. No, it wasn’t very good, but critically, that was seven years ago: now Trainium 3 is decent and the trajectory is even better. AWS is positioned to have a sustainable cost advantage for inference going forward.

Moreover, they are already replaying the Graviton playbook. Trainium chips help undergird Bedrock, Amazon’s AI platform, which is to say that users are using Trainium chips even if they didn’t explicitly choose to do so. AWS CEO Matt Garman made this point explicitly in a Stratechery Interview:

I think just with GPUs, by the way, you’re going to interact with a lot of these accelerator chips through abstractions. So the vast majority of customers don’t interact with GPUs either, except through maybe like in their laptop or something like that, for graphics. But when you’re talking to OpenAI, even if they’re running on GPUs, you’re not talking to the GPUs; if you’re talking to Claude, whether through GPUs or Trainium or TPUs, you’re not talking to any of those chips, you’re talking to the interface. And the vast majority of inference out there is being done on one of a handful of models. And so whether it’s 5, 10, 20, 100, it’s not millions of people that are programming to those things directly, and that’s gonna be true going forward just because these systems are so complex, they’re very large. If you’re going to go train a model, not that many people have enough money to go train a model, not that many people have the expertise to actually manage it. They’re very complicated systems, and the OpenAI team is incredible in their ability to squeeze value out of a very large compute cluster. But not that many people have the team that can do that, independent of what the chip happens to be, and so I think that that’s going to be true for all accelerator chips, honestly.
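Garman's "talking to the interface" point is visible in the Bedrock API itself. A minimal sketch, assuming boto3 and an account with access to the example model ID shown; nothing in the call names a chip:

```python
# Minimal sketch: invoking a model on Amazon Bedrock. The request
# specifies a model and a prompt; whether the tokens come off Nvidia
# GPUs or Trainium is invisible at this layer.
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.invoke_model(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # example model ID
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 256,
        "messages": [{"role": "user", "content": "Say hello."}],
    }),
)
print(json.loads(response["body"].read())["content"][0]["text"])
```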
The frontier models are an important factor in this, and that is an angle that I didn’t see coming. Nvidia CEO Jensen Huang explained in a recent interview with Dwarkesh Patel why Nvidia didn’t invest in Anthropic early on:

At the time, I didn’t deeply internalize how difficult it would be to build a foundation AI lab like OpenAI and Anthropic, and the fact that they needed huge investments from the supplier themselves. We just weren’t in a position to make the multi-billion dollar investment into Anthropic so that they could use our compute. But Google and AWS were. They put in huge investments in the beginning so that Anthropic, in return, used their compute. We just weren’t in a position to do that at the time. I would say my mistake is I didn’t deeply internalize that they really had no other options, that a VC would never put in $5-10 billion of investment into an AI lab with the hopes of it turning out to be Anthropic. So that was my miss. But even if I understood it, I don’t think we would’ve been in a position to do that at the time. But I’m not going to make that same mistake again.

Amazon had both the money and the chips to invest in Anthropic precisely because they had built such a cash machine with AWS in the first place. That’s the thing with big investments in infrastructure: they take years to build, but the benefit of that investment compounds over time. Anthropic, meanwhile, thanks to those investments from Amazon and Google, can not only run across a variety of chips, but for a long time was the only frontier model available on all of the leading clouds, an important selling point for enterprises. Microsoft, in the end, needed to let go of Azure’s exclusive access to OpenAI’s API in part because that exclusivity was hurting the prospects of their mammoth stake in OpenAI.

You can also make the case that Amazon is the best choice for frontier model access in a world of limited compute: Microsoft’s core business is software, which is to say that the company faces massive pressure to invest in its own AI capabilities, even at the cost of de-prioritizing cloud customers. That’s exactly what happened at Microsoft earlier this year, when the company missed Azure growth projections because it devoted more compute to its internal workloads. It was an understandable decision: cloud demand is eternal, but the risk from AI for existing software businesses is existential. This also applies to Google: the company’s core business is also digital, and while search has fended off the threat from chatbots that many expected, the fundamental challenge is still one to be managed, not extinguished. Amazon’s core businesses, meanwhile, are very much rooted in the physical world: selling and shipping physical goods, and building data centers. Both are amenable to Amazon devoting the majority of its chips to customers’ workloads.

If this week marks the resolution of one of Amazon’s long bets, you can see the outline of future resolutions in present-day announcements. One prominent example is Amazon Leo, the company’s satellite service that seems, at first glance, duplicative of SpaceX’s Starlink, which has the advantage of already existing at scale. Remember Amazon’s formula, however, which CEO Andy Jassy stated explicitly with regard to Leo on the company’s most recent earnings call:

Today, if you ask what stops us from growing the business, we have to get the constellation into space. We have over 20 launches planned this year. We have over 30 launches planned in 2027. But I think the business has a chance to be a very large many billion-dollar revenue business. And I think it has some characteristics that are reminiscent of AWS in that it’s capital-intensive upfront where you’re committing a lot of capital and cash in the early years for assets that you get to leverage over a long period of time. And so I like the free cash flow and return on invested capital characteristics of that business in the medium to long term.

The fact that it is extremely capital-intensive is not the only thing about Leo that makes it like AWS: a critical factor is that Amazon is the first-best customer to give the service scale, and here it’s worth going back to logistics. I noted above that Amazon delivery still has marginal costs, and that is because humans have to make the delivery. Amazon, however, has already pointed to the future, a full 13 years ago, when the company first started talking publicly about drone delivery. It’s been a long slog, to be sure, but it’s increasingly plausible to imagine a future where delivery costs are a matter of depreciation on drone assets. What would such a future require? How about reliable, widespread satellite coverage for communicating with and guiding those drones? And, if Amazon doesn’t want to be dependent on Jensen Huang for chips, do you think they want to be dependent on Elon Musk for drone connectivity? Of course other businesses — like Apple — will be able to pay to use Amazon’s satellite infrastructure, just like they can now pay to use Amazon’s delivery service, or pay to use AWS, or pay to sell on Amazon.com. The world may change, in increasingly drastic ways, but Amazon’s approach, by virtue of its focus on long-term investments in the physical world, appears to be as sturdy as ever.
More generally, I increasingly suspect that long-term vulnerability to AI — or, to put it more positively, long-term incentive to invest in AI — is very strongly correlated with the degree to which a company interacts with the physical world, and secondarily, the degree to which companies feel secure in their control of distribution:

Apple and Amazon feel comfortable not having leading-edge models, just access to them, because their businesses are rooted in the physical world.

Microsoft has invested heavily in data centers, but doesn’t own its own model, perhaps because it feels its control of distribution to enterprises will protect its core business (or because it had too much of a dependency on OpenAI).

Google and Meta are investing at a similar scale to Amazon, and are also heavily invested in their own models. Both are Aggregators, which is to say they have to continually earn attention from consumers, given that competition is only a click away; having good AI is existential to them.

This is, in the end, another advantage to making the sort of long-term bets Amazon specializes in: the threats are so distant that you have plenty of time to make new investments that address any weaknesses that develop in the meantime — or, as is the case of AI, wait for the market to tilt in your favor.

0 views
iDiallo 2 days ago

AI didn't delete your database, you did

Last week, a tweet went viral showing a guy claiming that a Cursor/Claude agent deleted his company's production database. We watched from the sidelines as he tried to get a confession from the agent: "Why did you delete it when you were told never to perform this action?" Then he tried to parse the answer to either learn from his mistake or warn us about the dangers of AI agents. I have a question too: why do you have an API endpoint that deletes your entire production database?

His post rambled on about false marketing in AI, bad customer support, and so on. What was missing was accountability. I'm not one to blindly defend AI; I always err on the side of caution. But I also know you can't blame a tool for your own mistakes.

In 2010, I worked with a company that had a very manual deployment process. We used SVN for version control. To deploy, we had to copy trunk, the equivalent of the master branch, into a release folder labeled with a release date. Then we made a second copy of that release and called it "current." That way, pulling the current folder always gave you the latest release. One day, while deploying, I accidentally copied trunk twice. To fix it via the CLI, I edited my previous command to delete the duplicate. Then I continued the deployment without any issues... or so I thought. Turns out, I hadn't deleted the duplicate copy at all. I had edited the wrong command and deleted trunk instead. Later that day, another developer was confused when he couldn't find it. All hell broke loose. Managers scrambled, meetings were called. By the time the news reached my team, the lead developer had already run a command to revert the deletion. He checked the logs, saw that I was responsible, and my next task was to write a script to automate our deployment process so this kind of mistake couldn't happen again. Before the day was over, we had a more robust system in place, one that eventually grew into a full CI/CD pipeline.

Automation helps eliminate the silly mistakes that come with manual, repetitive work. We could have easily gone around asking, "Why didn't SVN prevent us from deleting trunk?" But the real problem was our manual process. Unlike machines, we can't repeat a task exactly the same way every single day. We are bound to slip up eventually.
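For flavor, a minimal sketch of what that first automation script might have looked like, written here in Python; the repository URL and layout are placeholders matching the process described above, not the company's actual setup:

```python
# Hypothetical sketch of the deploy script: copy trunk to a dated
# release, then repoint "current" at it. Server-side svn copies are
# atomic commits, and there are no hand-edited CLI commands to get wrong.
import subprocess
from datetime import date

REPO = "https://svn.example.com/repo"  # placeholder URL

def svn(*args):
    subprocess.run(["svn", *args], check=True)

def deploy():
    release = f"{REPO}/releases/{date.today():%Y-%m-%d}"
    svn("copy", f"{REPO}/trunk", release, "-m", f"Release {date.today()}")
    # Assumes "current" already exists from the previous release.
    svn("rm", f"{REPO}/releases/current", "-m", "Retire previous current")
    svn("copy", release, f"{REPO}/releases/current", "-m", "Point current at release")

if __name__ == "__main__":
    deploy()
```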
With AI generating large swaths of code, we get the illusion of that same security. But automation means doing the same thing the same way every time. AI is more like me copying and pasting branches: it's bound to make mistakes, and it's not equipped to explain why it did what it did. The terms we use, like "thinking" and "reasoning," may look like reflection from an intelligent agent. But these are marketing terms slapped on top of AI. In reality, the models are still just generating tokens.

Now, back to the main problem this guy faced. Why does a public-facing API that can delete all your production databases even exist? (A sketch of the kind of guardrail it was missing appears at the end of this post.) If the AI hadn't called that endpoint, someone else eventually would have. It's like putting a self-destruct button on your car's dashboard. You have every reason not to press it, because you like your car and it takes you from point A to point B. But a motivated toddler who wiggles out of his car seat will hit that big red button the moment he sees it. You can't then interrogate the child about his reasoning. Mine would have answered simply: "I did it because I did it."

I suspect a large part of this company's application was vibe-coded. The software architects used AI to spec the product from AI-generated descriptions provided by the product team. The developers used AI to write the code. The reviewers used AI to approve it. Now, when a bug appears, the only option is to interrogate yet another AI for answers, probably not even running on the same GPU that generated the original code. You can't blame the GPU!

The simple solution is to know what you're deploying to production. The more realistic one: if you're going to use AI extensively, build a process where competent developers use it as a tool to augment their work, not as a way to avoid accountability. And please, don't let your CEO or CTO write the code.
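As promised, a minimal sketch of that guardrail, assuming a Flask app; the route and names are invented for illustration. The point is not the framework but that destructive operations should refuse to run in production and should demand deliberate, human-typed confirmation:

```python
# Hypothetical sketch: a destructive endpoint that is gated rather than
# public. In production it refuses outright; elsewhere it demands an
# explicit confirmation phrase that no agent should produce by accident.
import os
from flask import Flask, abort, request

app = Flask(__name__)

@app.route("/admin/drop-all-tables", methods=["POST"])
def drop_all_tables():
    if os.environ.get("APP_ENV") == "production":
        abort(403)  # this route should not even be reachable in prod

    payload = request.get_json(silent=True) or {}
    if payload.get("confirm") != "yes, drop every table":
        abort(400)  # require a deliberately typed confirmation phrase

    # ... actual teardown logic, for dev/test environments only ...
    return {"status": "dropped"}
```

Better still is to not ship the route at all, and make teardown a script that only runs with local credentials.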

0 views