
AI Is Really Weird

If you like this piece and want to support my independent reporting and analysis, why not subscribe to my premium newsletter? It’s $70 a year, or $7 a month, and in return you get a weekly newsletter that’s usually anywhere from 5,000 to 18,000 words, including vast, detailed analyses of NVIDIA, Anthropic and OpenAI’s finances, and the AI bubble writ large. I just put out a massive Hater’s Guide To The SaaSpocalypse, as well as last week’s deep dive into How AI Isn't Too Big To Fail. Subscribing helps directly support my free work, and premium subscribers don’t see this ad in their inbox.

I can’t get over how weird the AI bubble has become. Hyperscalers are planning to spend over $600 billion on data center construction and GPUs predominantly bought from NVIDIA, the largest company on the stock market, all to power generative AI, a technology that’s so powerful that none of them will discuss how much it’s making them, or what it is we’re all meant to be so excited about.

To make matters weirder, Microsoft, a company that spent $37.5 billion in capital expenditures in its last quarter on AI, recently updated the terms and conditions of its LLM-powered “Copilot” service to say that it was “for entertainment purposes only” — discussing a product that apparently has 15 million users as part of enterprise Microsoft 365 subscriptions, and is sold to local and national governments overseas, as well as the US federal government.

That’s so weird! What’re you doing, Microsoft? What do you mean it’s for entertainment purposes? You’re building massive data centers to drive this!

Well, okay, you’re building them at some point. As I discussed a few weeks ago, despite everybody talking about the hundreds of gigawatts of data centers being built “to power AI,” only 5GW are actually “under construction,” with “under construction” meaning anything from “we’ve got some scaffolding up” to “we’re about to hand over the keys to the customer.”

But isn’t it weird we’re even building those data centers to begin with? Why? What is it that AI does that makes it so essential — or, rather, entertaining — that we keep funding and building these things? Every day we hear about “the power of AI,” and we’re beaten over the head with scary propaganda saying “AI will take our jobs,” but nobody can really explain — outside of outright falsehoods about “AI replacing all software engineers” — what it is that makes any of this worthy of taking up any oxygen, let alone essential, or a justification for so many billions of dollars of investment.

Instead of providing an actual answer of some sort, AI boosters respond by saying it’s “just like the dot com bubble” — another weird thing to do considering 168,000 people lost their jobs as the NASDAQ dropped by 80% in two years, only 16% of the world even used the internet, and those that did in America had an average internet speed of 50 kilobits per second (and only 52% of them had access in 2000 anyway). Conversely, to quote myself: “And with that incredibly easy access, only 3% of households pay for AI.”

Boosters will again use this talking point to say that “we’re in the early days,” but that’s only true if you think that “early days” means “people aren’t really using it yet.” Yet the “early days” argument is inherently deceptive.
While the Large Language Model hype cycle might have only begun in 2022, the entirety of the media and markets have focused their attention on AI, along with hundreds of billions of dollars of venture capital and nearly a trillion dollars of hyperscale capex investment. AI progress isn’t hampered by a lack of access, talent, resources, novel approaches, or industry buy-in, but by a single-minded focus on Large Language Models, a technology that has been so obviously-limited from the very beginning that Gary Marcus was able to call it in 2022.

Saying it’s “the early days” also doesn’t really make sense when faced with the rotten and incredibly unprofitable economics of AI. The early days of the internet were not unprofitable due to the underlying technology of serving websites, but because of the incredibly shitty businesses that people were building. Pets.com spent $400 per customer in customer acquisition costs, millions of dollars on advertising, and had hundreds of employees for a business with a little over $600,000 in quarterly revenue — and as a result, nothing about its failure was about “the early days of the internet” at all, as was the case with Kozmo, or any number of other dot com flameouts.

Similarly, internet infrastructure companies like Winstar collapsed because they tried to grow too fast and signed stupid deals, not because of any flaw in the underlying technology. For example, in 1998, Lucent Technologies signed its largest deal — a $2 billion “equipment and finance agreement” — with telecommunications company Winstar, which promised to bring in “$100 million in new business over the next five years” and build a giant wireless broadband network, along with expanding Winstar’s optical networking. Eager math-heads in the audience will be able to see the issue with borrowing $2 billion to make $100 million over five years, as eager news-heads will laugh at WIRED magazine in 1999 saying that Winstar’s “small white dish antennas…[heralded] a new era and new mind-set in telecommunications.” Winstar died two years later because its business was built to grow at a rate that its underlying product couldn’t support. In the end, microwave internet (high-speed internet delivered via radio waves) has become an $8 billion-a-year industry, despite everybody’s excitement.

In any case, anybody who tells you that we’re in “the early days of AI” has either been conned or is in the process of conning you, as they’re using the phrase to deflect from issues of efficacy or underlying economic weakness.

In fact, that’s a great place to go next. Probably the weirdest thing about this entire era is how nobody wants to talk about the fact that AI isn’t actually doing very much, and that AI agents are just chatbots plugged into an API. Per Redpoint Ventures’ Reflections on the State of the Software and AI Market, “the agent maturity curve is still early, but the TAM implications are enormous,” with agents able to “...run discretely for minutes, [and] execute end-to-end tasks with some oversight.” What tasks, exactly? Who knows! Truly, nobody seems able to say.

To paraphrase Steven Levy at WIRED, 2025 was meant to be the year of AI agents, but turned out to be the year of talking about AI agents. Agents were/are meant to be autonomous pieces of software that go off and do distinct tasks. In reality, it’s kind of hard to say what those tasks are.
“AI agent” now refers to literally anything anybody wants it to, but ultimately means “chatbot that has access to some systems.”

The New York Times’ Ezra Klein recently talked to the entity currently inhabiting former journalist and Anthropic co-founder Jack Clark about “how fast AI agents would rip through the economy,” but despite speaking for over an hour, the closest we got was “it wrote up a predator-prey simulation (a complex-sounding but extremely-common kind of webgame that Anthropic likely ingested through its training material)” and “chatbots that talk to each other about tasks,” and if you think I’m kidding, this is how he described it:

Anyway, this is all bad, because multiple papers have now shown that, and I quote, agents are “...incapable of carrying out computational and agentic tasks beyond a certain complexity,” with Futurism adding that said complexity was pretty low. The word “agent” is meant to make you think of powerful autonomous systems that carry out complex and minute tasks, when in reality it’s…a chatbot. It’s always a fucking chatbot. It might be a chatbot with API access or a chatbot that generates a plan that another chatbot looks at and says something about, but it’s still chatbots talking to chatbots. When you strip away the puffery, nobody seems to actually talk about what AI does.

Let’s take a look at CNBC’s piece on Goldman Sachs’ supposed contract with Anthropic to build “autonomous systems for time-intensive, high-volume back-office work”:

…okay, but like, what does it do? Right, brilliant. Great. Love it. What tasks? What is the thing you’re paying for? Okay, great, we have two things it might do in the future, and that’s “employee surveillance” (?) and making pitchbooks.

The upshot is that, with the help of the agents in development, clients will be onboarded faster and issues with trade reconciliation or other accounting matters will be solved faster, Argenti said.

Onboarding? Chatbot. “Issues with trade reconciliation”? Chatbot connected to a knowledge base, like we’ve had for years but worse and more expensive. Oh, and “other accounting matters” will be solved faster — always with the future tense with these guys.

How about Anthropic and outsourcing body shop giant Infosys’ “AI agents for telecommunications and other regulated industries”? Let’s go through the list of tasks and say what they mean, my comments following each:

Telecommunications: AI agents will help carriers modernize network operations, simplify customer lifecycle management, and improve service delivery — bringing intelligent automation to one of the most operationally complex and regulated industries in the world. Meaningless. Automation of what?

Financial services: AI agents will help firms detect and assess risk faster, automate compliance reporting, and deliver more personalized customer interactions, such as tailoring financial advice based on a client's full account history and market conditions. Chatbot! “More-personalized interactions” are a chatbot with a connection to a knowledge system, as is any kind of “tailored financial advice.” Compliance reporting? Summarizing or pulling documents from places, much like any LLM can do, other than the fact that it’ll likely get shit wrong, which is bad for compliance.

Manufacturing and engineering: Claude will help accelerate product design and simulation, reducing R&D timelines and enabling engineers to test more iterations before production. I assume this refers to people using Claude Code to do coding, which is what it does.

Software development: Teams will use Claude Code to write, test, and debug code, helping developers move faster from design to production. Claude Code.

Enterprise operations: Claude Cowork will help teams automate routine work like document summarization, status reporting, and review cycles. Literally a chatbot that deleted every single one of a guy’s photos when he asked it to organize his wife’s desktop.

How about OpenAI’s “Frontier” platform for businesses to “build, deploy and manage AI agents that do real work”?

Shared context? Chatbot. Onboarding? Chatbot. Hands-on learning with feedback? Chatbot. Clear permissions and boundaries? Chatbot setting. Let’s check out the diagram! Uhuh. Great. What real-world tasks? Uhhh.

Reason over data? Chatbot. “Complex tasks”? No idea, it doesn’t say. “Working with files”? It doesn’t say how it works with files, but I’d bet it can analyze, summarize and create charts based on them that may or may not have errors in them — and based on my experience of trying to get these things to make charts (as a test, I’d never use them in my actual work), they don’t seem to be able to do even that. “Evaluation and optimization loops”? Unclear, because we have no idea what the tasks are. What are the agents planning, acting, or executing on? Again, no idea.

Yet the media continues to perpetuate the myth of some sort of present or future “agentic AI” that will destroy all employment.
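Because “agent” gets thrown around so loosely, it’s worth being concrete about what the term usually cashes out to. Here’s a minimal sketch — hypothetical Python pseudocode with made-up tool names and a stubbed-out `call_llm`, not any vendor’s actual API — of the architecture these products describe: a chatbot in a loop that’s allowed to call a few functions.

```python
# A minimal sketch of what most "AI agents" amount to: a chatbot in a loop
# with permission to run a few functions. Everything here (call_llm, the
# tools) is hypothetical, not any vendor's actual API.

def call_llm(messages: list[dict]) -> dict:
    """Stand-in for a chat-completion API call. Returns either a final
    text answer or a request to run one of the registered tools."""
    raise NotImplementedError("swap in a real model call")

def search_knowledge_base(query: str) -> str:
    return f"top documents matching {query!r}"

def send_email(to: str, body: str) -> str:
    return f"sent to {to}"

TOOLS = {"search_knowledge_base": search_knowledge_base,
         "send_email": send_email}

def run_agent(task: str, max_turns: int = 10) -> str:
    messages = [{"role": "user", "content": task}]
    for _ in range(max_turns):
        reply = call_llm(messages)        # the "agent" is this call...
        if reply.get("tool") in TOOLS:    # ...plus access to some functions
            result = TOOLS[reply["tool"]](**reply["args"])
            messages.append({"role": "tool", "content": result})
        else:
            return reply["content"]       # no tool requested: it's just chat
    return "gave up"
```

Everything marketed as “planning,” “delegating,” or “executing end-to-end tasks” lives inside that loop; the “autonomy” is a `for` loop and a dictionary of tools.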
A few weeks ago, CNBC mindlessly repeated that ServiceNow CEO Bill McDermott believed that agents would send college grad unemployment over 30%. NowAssist, ServiceNow’s AI platform, is capable of — you guessed it! — summarization, conversational exchanges, content creation, code generation and search: a fucking chatbot, just like the other chatbots.

A few weeks ago, The New York Times wrote about how “AI agents are fun, useful, but [not to] give them your credit card,” saying that they can “do more than just chat…they can edit files, send emails, book trips and cause trouble”:

Sure sounds like you connected a chatbot to your email there, Mr. Heyneman. Let’s go through these:

“Gather information” — search tool, part of chatbots for years.

“Write reports” — generative AI’s most basic feature, with no details on quality.

“Edit files” — to do what exactly? Chatbot feature.

“Send and receive messages through email and text” — generating and reading text, connected to an email account.

“Delegate work” — what work? No need to get specific!

Yes, you can string together chatbots with various APIs and have the chatbot be able to activate certain systems. You could also do the same with a button you bought on Etsy connected to your computer via USB if you really wanted to. The ability to connect something to something else does not mean that anything useful happens at the end, and LLMs are extremely bad at the kind of deterministic actions that define the modern knowledge economy, especially when choosing to do them based on their interpretation of human language. AI agents do not, as sold, actually exist. Every “AI agent” you read about is a chatbot talking to another chatbot connected to an API and a system of record, and the reason that you haven’t heard about their incredible achievements is because AI agents are, for the most part, fundamentally broken.

Even OpenClaw, which CNBC confusingly called a “ChatGPT moment,” is just a series of chatbots with the added functionality of requiring root access to your computer and access to your files and emails. Let’s see how CNBC described it back in February:

Hmmm, interesting. I wonder if they say what that means:

Reading this, you might be fooled into believing that OpenClaw can actually do any of this stuff correctly, and you’d be wrong! OpenClaw is doing the same chatbot bullshit, just in a much-more-expensive and much-more-convoluted way, requiring either a well-secured private space or an expensive Mac Mini to run multiple AI services and do, well, a bunch of shit very poorly.

The same goes for things like Perplexity’s “Computer,” which it describes as “an independent digital worker that completes tasks and workflows for you,” which means, I shit you not, that it can search, generate stuff (words, code, images), and integrate with Gmail, Outlook, Github, Slack, and Notion, places where it can also drop stuff it’s generated. Yes, all of this is dressed up with fancy terms like “persistent memory across sessions” (a document the chatbot reads and information it can access) and “authenticated integrations” (connections via API that basically any software can have). But in reality, it’s just a further compute-intensive way of trying to fit a square peg in a round hole, by which I mean having a hallucination-prone chatbot do actual work.

The only reason Jensen Huang is talking about OpenClaw is that there’s nothing else for Jensen Huang to talk about:

That’s wild, man. That’s completely wild. What’re you talking about? What can NemoClaw or OpenClaw or whatever-the-fuck actually do? What is the actual output? That’s so fucking weird!

I can already hear the haters in my head screaming “but Ed, coding models!” and I’m kind of sick of talking about them, because nobody can actually tell me what I’m meant to be amazed or surprised by.
To be clear, LLMs can absolutely write code, and can absolutely create software, but neither of those means that the code is good, stable or secure, or that the same can be said of the software they create. They do not have ideas, nor do they create unique concepts — everything they create is based on training data fed to them that was first scraped from Stack Overflow, Github and whatever code repositories Anthropic, OpenAI, and Google have been able to get their hands on.

It’s unclear what the actual economic or productivity effects are, other than an abundance of new code that’s making running companies harder. Per The New York Times:

As I wrote a few weeks ago, LLMs are good at writing a lot of code, not good code, and the more people you allow to use them, the more code you’re going to generate, which means the more time you’re either going to need to review that code, or the more vulnerabilities you’re going to create as a result. Worse still, hyperscalers like Meta and Amazon are allowing non-technical people to ship code themselves, which is creating a crisis throughout the tech industry.

On top of that, LLMs allow shitty software engineers who would otherwise be isolated by their incompetence to feign enough intelligence to get by, leading to them actively lowering the quality of code being shipped. Per the Times:

The Times also notes that because LLM coding works better on a device rather than a web interface, “...engineers are downloading their entire company’s code to their laptops, creating a security risk if the laptop goes missing.”

Speaking frankly, it appears that LLMs can write code, and create some software, but without any guarantee that said code will compile, run, be secure, performant, or easy to read and maintain. For an experienced and ethical software engineer, LLMs can likely speed things up somewhat, though not in a way that appears to be documented in any academic sense — if anything, what documentation exists suggests they make engineers slower.

And I think it’s fair to ask what any of this actually means. What’s the advantage of having an LLM write all of your code? Are you shipping faster? Is the code better? Are there many more features being shipped? What is the actual thing you can point at that has materially changed for the better?

Software engineers don’t seem happier, nor do they seem to be paid more, nor do they seem to be being replaced by AI, nor do we have any examples of truly vibe coded software companies shipping incredible, beloved products. In fact, I can’t think of a new piece of software I’ve used in the last few years that actually impressed me outside of Flighty.

Where’s the beef? What am I meant to be looking at? What’re you shipping that’s so impressive? Why should I give a shit? Isn’t it weird that we’re even having this conversation? Shouldn’t it be obvious by now?

This week, economist Paul Kedrosky told me on the latest episode of my show Better Offline that AI is “...nowhere to be seen yet in any really meaningful productivity data anywhere,” and only appears in the non-residential fixed investments side of America’s GDP, at (and I quote again) “...levels we last saw with the railroad build out or with rural electrification.” That’s so fucking weird!
NVIDIA is the largest company on the US stock market and has sold hundreds of billions of dollars of GPUs in the last few years, with many of them sold to the Magnificent Seven, who are building massive data centers and reopening nuclear power plants to power them — and every single one of them is losing money doing so, with revenues so putrid they refuse to talk about them!

And all that to make…what, Gemini? To power ChatGPT and Claude? What does any of this actually do that makes any of those costs actually matter? And as I’ve discussed above, what, literally, does this software do that makes any of this worth it?

Ask the average AI booster — or even member of the media — and they’ll say something about “lots of code being written by AI,” or “novel discoveries” (unrelated to LLMs) or “LLMs finding new materials (based on an economics paper with faked data)” or “people doing research,” or, of course, “that these are the fastest-growing companies of all time.” That “growth” is only possible because all of the companies in question heavily subsidize their products, spending $3 to $15 for every dollar of revenue. Even then, only OpenAI and Anthropic seem to be able to make “billions of dollars of revenue,” a statement that I put in quotes because however many billions there might be is up for discussion.

Back in November 2025, I reported that OpenAI had made — based on its revenue share with Microsoft — $4.329 billion between January and September 2025, despite The Information reporting that it had made $4.3 billion in the first half of the year based on disclosures to shareholders.

While a few outlets wrote it up, my reporting has been outright ignored by the rest of the media. No other outlet reached out to me or otherwise acknowledged the work, and every outlet has continued to repeat that OpenAI “made $13 billion in 2025,” despite that being very unlikely, given that it would have required the company to make $8 billion in a single quarter. While I understand why — I’m an independent, after all — these numbers directly contradict existing reporting, which, if I were a reporter, would give me a great deal of concern about the validity of my reporting and the sources that had provided it.

Similarly, when Anthropic’s CFO said in a sworn affidavit that it had only made $5 billion in its entire existence, nobody seemed particularly bothered, despite reports saying it had made $4.5 billion in 2025, and multiple “annualized revenue” reports — including Anthropic’s own — that added up to over $6.6 billion.

Though I cannot say for certain, both of these situations suggest that Anthropic and OpenAI are misleading their investors, the media and the general public. If I were a reporter who had written about Anthropic or OpenAI’s revenues previously, I would be concerned that I had published something that wasn’t true, and even if I was certain that I was correct, I would have to consider the existence of information that ran counter to my own. I would be concerned that Anthropic or OpenAI had lied to me, or that they were lying to someone else, and work diligently to try and find out what happened. I would, at the very least, publish that there was conflicting information. The S-1 will give us the truth, I guess.

Let’s talk for a moment about margins, because they’re very important to measuring the health of a business. Back in February, in my Hater’s Guide To Anthropic, I raised concerns that Dario Amodei was using a different way to calculate margins than other companies do.
Amodei told the FT in December 2024 that he didn’t think profitability was based on how much you spent versus how much you made:

He then did the same thing in an interview with John Collison in August 2025:

Almost exactly six months later, in a February 13, 2026 appearance on the Dwarkesh Podcast, Dario would once again try and discuss profitability in terms other than “making more money than you’ve spent”:

The above quote has been used repeatedly to suggest that Anthropic has 50% gross margins and is “profitable,” which is extremely weird in and of itself, as that’s not what Dario Amodei said at all. Based on The Information’s reporting from earlier in the year, Anthropic’s “gross margin” was 38%.

Yet things have become even more confusing thanks to reporting from Eric Newcomer, who (in reporting on an investor presentation by Coatue from January) revealed that Anthropic’s gross margin was “45% in the quarter ended Sep-25,” with the crucial note that — and I quote — “Non-GAAP gross margins [are] calculated by Anthropic management…[are] unaudited, company-provided, and may not be comparable to other companies.” This means that however Anthropic calculates its margins, it isn’t based on Generally Accepted Accounting Principles, which means that the real margins probably suck ass, because Anthropic loses billions of dollars a year, just like OpenAI.

Yet one seemingly-innocent line in there gives me even more pause: “Model payback improving significantly as revenue scales faster than R&D training costs.” This directly matches with Dario Amodei’s bizarre idea that “...If you consider each model to be a company, the model that was trained in 2023 was profitable. You paid $100 million, and then it made $200 million of revenue.” Yes, I know it’s a “stylized fact” or whatever, but that’s what he said, and I think that their IPO might have a rude surprise in the form of a non-EBITDA margin calculation that makes even the most-ardent booster see red.

This week, The Wall Street Journal published a piece about OpenAI and Anthropic's finances that included one of the most-offensive lines in tech media history:

Two thoughts:

Are you fucking kidding me? If you simply remove billions of dollars in costs, OpenAI is profitable!

Why do you think these companies are going to break even anytime soon? You have absolutely no basis for doing so other than leaks from the company!

As I said a few months ago about training costs:

The Journal also adds that both Anthropic and OpenAI are showing investors two versions of their earnings — one with training costs, and one without — without adding the commentary that this is extremely deceptive or, at the very least, extremely unusual. The more I think about it, the more frustrated I get. Having two sets of earnings is extremely dodgy! Especially when the difference between them is billions of dollars. This should be immediately concerning to every financial journalist, the reddest of red flags, the biggest sign that something weird is happening…

…but because this is the AI industry, the Journal runs propaganda instead:

That “fast-growing” part is only possible because both Anthropic and OpenAI subsidize the compute of their subscribers, allowing them to burn $3 to $15 for every dollar of subscription revenue. And no, this is nothing like Uber or Amazon — that’s a silly comparison, click that link and read what I said and then never bring it up again.

I realize my suspicion around Anthropic’s growth has become something of a meme at this point, but I’m sorry, something is up here. Let’s line it all up:

Anthropic was making $9 billion in annualized revenue at the end of 2025, or approximately $750 million in a 30-day period.

Anthropic said on February 12, 2026 it had hit $14 billion in annualized revenue. This works out to roughly $1.16 billion in a 30-day period — let’s assume from January 11, 2026 to February 11, 2026.

Anthropic’s CFO said on March 9, 2026 that it had made “exceeding $5 billion” in lifetime revenue.

On March 3, 2026, Dario Amodei said it had hit $19 billion in annualized revenue. This works out to $1.58 billion in a 30-day period — let’s assume this is for the period from February 2, 2026 to March 2, 2026.

On April 6, 2026, Anthropic said it had hit $30 billion in annualized revenue. This works out to about $2.5 billion in a 30-day period — let’s assume that said period is March 6, 2026 to April 6, 2026.
Per Newcomer, as of December 2025, this is how Anthropic’s revenue breaks down:

Per The Information, Anthropic also sells its models through Microsoft, Google and Amazon, and for whatever reason reports all of the revenue from their sales as its own, then takes out whatever cut it gives them as a sales and marketing expense:

The Information also adds that “...about 50% of Anthropic’s gross profits on selling its AI via Amazon has gone to Amazon,” and that “...Google typically takes a cut of somewhere between 20% and 30% of net revenue, after subtracting infrastructure costs” (a figure DataCenterDynamics has also reported). The problem here is that we don’t know the actual amounts of revenue that come from Amazon or Google (or Microsoft, for that matter, which started selling Anthropic’s models late last year), which makes it difficult to parse how much of a cut they’re getting. Nevertheless, something is up with Anthropic’s revenue story.

Let’s humour Anthropic for a second and say that what it’s saying is completely true: it went from making $750 million in monthly revenue in January to $2.5 billion in monthly revenue in April 2026. That’s remarkable growth, made even more remarkable by the fact that — based on its December breakdown — most of it appears to have come from API sales. The leap from $750 million to $1.16 billion between December and February feels, while ridiculous, not entirely impossible, but the further ratchet up to $2.5 billion is fucking weird! But let’s try and work it out.

On February 5, 2026, Anthropic launched Opus 4.6, followed by Claude Sonnet 4.6 on February 17, 2026. Based on OpenRouter token burn rates, Opus 4.5 was burning around 370 billion tokens a week. Immediately on release, Opus 4.6 started burning way, way more tokens — 524 billion in its first week, then 643 billion, then 634 billion, then 771 billion, then 822 billion, then 976 billion, eventually going over a trillion tokens burned in the final week of March. In the weeks approaching its successor’s launch, Sonnet 4.5 burned between 500 billion and 770 billion tokens a week. A week after launch, 4.6 burned 636 billion tokens, then 680 billion, then 890 billion, and, by about a month in, it had burned over a trillion tokens in a single week.

Reports across Reddit suggest that these new models burn far more tokens than their predecessors, with questionable levels of improvement. The sudden burst in token burn across OpenRouter doesn’t suggest that a bunch of people suddenly decided to connect to Anthropic and other services’ models, but that the models themselves had started to burn nearly twice the amount of tokens to do the same tasks.

At this point, I estimate Anthropic’s revenue split to be more in the region of 75% API and 25% subscriptions, based on its supposed $2.5 billion in annualized revenue (out of $14 billion, so a little under 18%) in February coming from “Claude Code” (read: subscribers to Claude — there’s no “Claude Code” subscription). If that’s the case, I truly have no idea how it could’ve possibly accelerated so aggressively, and as I’ve mentioned before, there is no way to reconcile having made $5 billion in lifetime revenue as of March 9, 2026, having $14 billion in annualized revenue on February 12, 2026, and having $4.5 billion in revenue for the year 2025.
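For anyone who wants to check that irreconcilability themselves, here’s the back-of-the-envelope version, using only the figures quoted in this piece and treating a “30-day period” as one-twelfth of an annualized number, the way the piece does:

```python
# Sanity-checking Anthropic's stated figures, as quoted in this piece.
# Pure arithmetic on public claims; dates and dollar amounts per the text.

annualized = {            # $bn "annualized revenue" claims
    "2026-02-12": 14,
    "2026-03-03": 19,
    "2026-04-06": 30,
}
for date, arr in annualized.items():
    # e.g. $14bn -> ~$1.17bn per 30 days (the piece rounds to $1.16bn)
    print(f"{date}: ${arr}bn annualized -> ${arr/12:.2f}bn per 30-day period")

# CFO affidavit: "exceeding $5bn" LIFETIME revenue as of 2026-03-09,
# with ~$4.5bn reportedly earned in calendar 2025.
lifetime, calendar_2025 = 5.0, 4.5
left_for_2026 = lifetime - calendar_2025          # ~$0.5bn for Jan 1 - Mar 9
monthly_implied = annualized["2026-02-12"] / 12   # ~$1.17bn/month claimed
print(f"Room left for Jan 1 - Mar 9 2026: ~${left_for_2026:.1f}bn, "
      f"vs ~${monthly_implied:.2f}bn/month implied by the Feb 12 run rate")
```

At the February 12 run rate, the first nine-ish weeks of 2026 alone would be worth roughly $2.5 billion — about five times the ~$0.5 billion the affidavit leaves room for.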
Things get more confusing when you hear how Anthropic calculates its annualized revenues, per The Information:

So, Anthropic is annualizing based on the last four weeks of API revenue times 13 — a number that’s extremely easy to manipulate using, say, launches of new products. In simpler terms, Anthropic is cherry-picking four-week windows of API spend — ones that are pumped by big announcements and new model releases — and annualizing them.

Anthropic’s $14 billion in annualized revenue from February 16, 2026 includes both the launch of Claude Opus 4.6 and the height of the OpenClaw hype cycle, where people were burning hundreds of dollars of tokens a day. This announcement also included the launch of Anthropic’s 1 million token context window in beta for Opus 4.6.

Anthropic’s $19 billion in annualized revenue from March 3, 2026 included both the launch of Claude Opus 4.6 and Claude Sonnet 4.6. This period includes around half of the January 16 to February 16, 2026 window from the previous $14 billion annualized number, and the launch of the beta of the 1 million token context window for Sonnet 4.6. To be clear, the betas required you to explicitly turn on the 1 million token context window, and had higher pricing around long context.

Anthropic’s $30 billion in annualized revenue from April 6, 2026 included two weeks’ worth of massive token burn from the launches of Sonnet and Opus 4.6, and includes a few days of the previous window (March 3 to April 5). This also included the general availability of the 1-million token context window, enabling it by default, billed at the standard pricing.

The one million token context window is a big deal, too, having been raised from 200,000 tokens in previous models. With Opus and Sonnet 4.6, Anthropic lets users use up to one million tokens of context, which means that both models can now carry a very, very large conversation history, one that includes every single output, file, or, well, anything that was generated as a result of using the model via the API. This leads to context bloat that absolutely rinses your token budget.

To explain: the context window is the information that the model can consider at once. With 4.6, Anthropic by default allows you to load in one million tokens’ worth of information, which means that every single prompt or action you take has the model consider that entire window unless you actively “trim” it through context editing.

Let’s say you’re trying to work out a billing bug in a codebase via whatever interface you’re using to code with LLMs. You load in a 350,000 token codebase, a system prompt (IE: “you are a talented software engineer,” here’s an example), a few support tickets, and a bunch of word-heavy logs to try and fix it. On your first turn (question), you ask it to find the bug, and you send all of that information through. It spits out an answer, and then you ask it how to fix the bug…but “asking it to fix the bug” also re-sends everything, including the codebase, tickets and logs. As a result, you’re burning hundreds of thousands of tokens with every single prompt.

Although this is a simplified example, it’s the case across basically any coding product, such as Claude Code or Cursor. While Cursor uses codebase indexing to selectively fetch pieces of the codebase without constantly loading it into the context window, one developer using Claude inside of Cursor watched a single tool call burn 800,000 tokens by pulling an entire database into the context window, and I imagine others have run into similar problems. To be clear, Anthropic charges a per-million-token rate of $5 per million input and $25 per million output, which means that those casually YOLOing entire codebases into context are burning shit tons of cash (or, in the case of subscribers, hitting their rate limits faster) — I’ve sketched out the maths below.

If Anthropic actually made $2.5 billion in a month — we’ll find out when it files its S-1! — it likely came not from genuine growth or a surge of adoption, but from its existing products suddenly costing a shit ton more because of how they’re engineered. The other possibility is the nebulous form of “enterprise deals” that Anthropic allegedly has, and the theory that they somehow clustered in this three-month-long period, but that just feels too convenient.

If 70% of Anthropic’s revenue is truly from API calls, this would suggest:

Massive new customers that are making payments up front, which makes this far from “recurring” revenue.

Massive new customers spending tons of money immediately, burning hundreds of millions of dollars a month in tokens, and paying Anthropic handsomely for them.

I don’t see much evidence of Anthropic creating custom integrations that actually matter, or — and fuck have I looked! — any real examples of businesses “doing stuff with Claude” other than making announcements about vague partnerships.
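To put rough numbers on the billing-bug scenario above — the context sizes, output per turn, and turn count are my assumptions, while the $5/$25 per-million rates are the ones quoted in this piece:

```python
# Rough cost of the "billing bug" session at the per-token prices quoted
# above ($5/M input, $25/M output). The key mechanic: input tokens are
# re-billed on every turn because the whole context is re-sent.

IN_PRICE, OUT_PRICE = 5 / 1e6, 25 / 1e6   # dollars per token

codebase     = 350_000   # tokens, loaded into context (per the example)
extras       =  50_000   # system prompt, tickets, logs (assumed)
out_per_turn =   2_000   # tokens generated per answer (assumed)
turns        =      10

total_in = total_out = 0
context = codebase + extras
for _ in range(turns):
    total_in += context            # entire context re-sent with each prompt
    total_out += out_per_turn
    context += out_per_turn        # prior answers also join the context

cost = total_in * IN_PRICE + total_out * OUT_PRICE
print(f"{total_in/1e6:.1f}M input tokens, cost ≈ ${cost:.2f}")
# ~4.1M input tokens and ~$21 for one debugging session -- from one user.
```

Note that input tokens grow with every turn even though the user only typed a few sentences — which is exactly the kind of engineered token burn that would pump a four-week revenue window.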
There’s also one other option: that Silicon Valley is effectively subsidizing Anthropic through an industry-wide token-burning psychosis. And based on some recent news, there’s a chance that’s the case.

As I discussed a few weeks ago, Silicon Valley has a “tokenmaxxing” problem, where engineers are encouraged to burn as many tokens as possible, at times by their peers, and at others by their companies. The most egregious — and honestly, worrying! — version of this came from The Information’s recent story about Meta employees competing on an internal leaderboard to see who can burn the most tokens, deliberately increasing the size of their prompts and the amount of concurrent sessions (along with unfettered and dangerous OpenClaw usage) to do so:

The Information reports that the dashboard, called “Claudeonomics” (despite said dashboard covering other models from OpenAI, Google, and xAI), has sparked competition within Meta, with users burning a remarkable 60 trillion tokens in the space of a month, and with one individual averaging around 281 billion tokens, which The Information remarks could cost millions of dollars. Meta’s company-mandated psychosis also gives achievements for particular things like using multiple models or high utilization of the cache. Here’s one very worrying anecdote:

One poster on Twitter says that there are people at Meta running loops burning tokens to rise up the leaderboards, and that Meta’s managers also measure lines of code as a success metric.

The Information says that, considering Anthropic’s current pricing for its models, those 60 trillion tokens could cost as much as $900 million in the space of a month, though it adds that this assumes every token being burned was on Claude Opus 4.6 (at $15 per 1 million tokens). I personally think this maths is a bit fucked, because it assumes that A) everybody is only using Claude Opus, B) none of that token burn runs through the cache (which it obviously does, and the cache charges 50%, as pointed out by OpenCode co-founder Dax Radd), and C) Meta is entirely using the API (versus paying for a $200-a-month Claude Max subscription for each user).

Digging in further, it appears that a few years ago Meta created an internal coding tool called CodeCompose, though a source at Meta tells me that developers use VSCode and an assistant called Devmate connected to models from Anthropic, OpenAI and xAI. One engineer on Reddit — albeit an anonymous one! — had some commentary on the subject:

If we assume that Meta is an enterprise customer paying API rates for its tokens, it’s reasonable to assume — at even a low $5-per-million average — that it’s spending $300 million or more a month on API calls. As Radd also added, there’s likely a discount involved. He suggested 20%, which I agree with. Even if it’s $300 million, that’s still fucking insane. That’s still over three billion dollars a year.

If this is what’s actually happening, and this is what’s contributing to Anthropic’s growth, this is not a sustainable business model — which is par for the course for Anthropic, a company that has only ever lost billions of dollars. Encouraging workers to burn as many tokens as possible is incredibly irresponsible and antithetical to good business or software engineering. Writing great software is, in many cases, an exercise in efficiency and nuance: building something that runs well, is accessible and readable by future engineers working on it, and ideally uses as few resources as it can.
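Here’s that Meta arithmetic laid out explicitly — the $15 and $5 per-million rates and the 20% discount are the ones discussed above, while treating the blended rate as uniform across all 60 trillion tokens is my simplifying assumption:

```python
# Reproducing the sanity check on Meta's reported 60 trillion tokens/month.

tokens = 60e12                       # tokens burned in one month

naive = tokens * 15 / 1e6            # every token at Opus 4.6's $15/M rate
print(f"Naive (all Opus, no cache): ${naive/1e9:.1f}bn")       # ~$0.9bn

blended = tokens * 5 / 1e6           # a low $5/M blended average (assumed)
discounted = blended * 0.8           # Radd's suggested ~20% enterprise discount
print(f"Blended $5/M: ${blended/1e6:.0f}M/month; "
      f"with 20% off: ${discounted/1e6:.0f}M/month")
# Even the conservative case is ~$240-300M/month -- around $3bn a year.
```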
TokenMaxxing runs contrary to basically all good business and software practices, encouraging waste for the sake of waste, and resulting in few measurable productivity benefits or, in the case of Meta, anything user-facing that actually seems to have improved.

Venture capitalist Nick Davidov mentioned yesterday that sources at Google Cloud “started seeing billions of tokens per minute from Meta, which might now be as big as a quarter of all the token spend in Anthropic.” While I can’t verify this information (and Davidov famously deleted his photos using Claude Cowork while attempting to reorganize his wife’s desktop), if that’s the case, Meta is a load-bearing pillar of Anthropic’s revenue — and, just as importantly, a large chunk of Anthropic’s revenue flows through Google Cloud, which means A) that Anthropic’s revenue truly hinges on Google selling its models, and B) that said revenue is heavily-inflated by the fact that Anthropic books revenue without cutting out Google’s 20%+ revenue share.

In any case, TokenMaxxing is not real demand, but an economic form of AI psychosis. There is no rational reason to tell somebody to deliberately burn more resources without a defined output or outcome other than increasing how much of the resource is being used. I have confirmed with a source at Meta that there is no actual metric or tracking of any return on investment involved in token burn, meaning that TokenMaxxing’s only purpose is to burn more tokens to go higher on a leaderboard — and it’s creating bad habits across a company that already has decaying products and leadership.

To make matters worse, TokenMaxxing also teaches people to use Large Language Models poorly. While I think LLMs are massively-overrated and have their outcomes and potential massively overstated, anyone I know who actually uses them for coding generally has habits built around making sure token burn isn’t too ridiculous, ways to do things faster without LLMs, and ways to be intentional about which models they use for particular tasks. TokenMaxxing literally encourages you to do the opposite — to use whatever you want in whatever way you want to spend as much money as possible to do whatever you want, because the only thing that matters is burning more tokens.

Furthermore, TokenMaxxing is exactly the kind of revenue that disappears first. Zuckerberg has reorganized his AI team four or five times already, and massively shifted Meta’s focus multiple times in the last five years, proving that at the very least he’ll move on a whim depending on external forces. After laying off tens of thousands of people in the last few years, Meta has shown it’s fully capable of dumping entire business lines or groups with a moment’s notice, and while moving on from AI might be embarrassing, that would suggest that Mark Zuckerberg experiences shame or any kind of emotion other than anger.

This is the kind of revenue that a business needs to treat with extreme caution, and if Meta is truly spending $300 million or more a month on tokens, Anthropic’s annualized revenues are aggressively and irresponsibly inflated to the point that they can’t be taken seriously — especially if said revenue travels through Google Cloud, which takes another 20% off the top at the very least.
Though the term is pretty new, the practice of encouraging your engineers to use AI as much as humanly possible is an industry-wide phenomenon, especially across hyperscalers like Amazon, Microsoft and Google, all of whom have, until recently, directly pushed their workers to use models with few restraints. Shopify and other large companies are encouraging their workers to reflexively rely on AI, with performance reviews that include stats around your token burn and other nebulous “AI metrics” that don’t seem to connect to actual productivity. I’m also hearing — though I’ve yet to be able to confirm it — that Anthropic and other model providers are forcing enterprise clients to start using the API directly rather than paying for monthly subscriptions.

Combined with mandates to “use as much AI as possible,” this naturally increases the cost of having software engineers, which — and I say this not wanting anyone to lose their jobs — does the literal opposite of replacing workers with AI. Instead, organizations are arbitrarily raising the cost of doing business without any real reason. Because we’re still in the AI hype cycle, this kind of wasteful spending is both tolerated and encouraged, and the second that financial conditions worsen or stock prices drop due to increasing operating expenses, these same companies will cut back on API spend, which will overwhelmingly crush Anthropic’s glowing revenues.

I think it’s also worth asking at this point what it is we’re actually fucking doing. We’re building — theoretically — hundreds of gigawatts of data centers, feeding hundreds of billions of dollars to NVIDIA to buy GPUs, all to build capacity for demand that doesn’t appear to exist, with only around $65 billion of revenue (not profit) for the entire generative AI industry in 2025, with much of that flowing to two companies (Anthropic and OpenAI) making money by offering their models to unprofitable AI startups that cannot survive without endless venture capital — which is also the case for both AI labs. Said data centers make up 90% of NVIDIA’s revenue, which means that 8% or so of the S&P 500’s value comes from a company that makes money selling hardware to people that immediately lose money on installing it. That’s very weird! Even if you’re an AI booster, surely you want to know the truth, right?

The most-prominent companies in the AI industry — Anthropic and OpenAI — burn billions of dollars a year, have margins that get worse over time, and absolutely no path to profitability, yet the majority of the media act as if this is a problem that they will fix, even going as far as to make up rationalizations as to how they’ll fix it, focusing on big revenue numbers that wilt under scrutiny. That’s extremely weird, and only made weirder by members of the media who seem to think it’s their job to defend AI companies’ bizarre and brittle businesses. It’s weird that the media’s default approach to AI has, for the most part, been to accept everything that the companies say, no matter how nonsensical it might be.

I mean, come on! It’s fucking weird that OpenAI plans to burn $121 billion in the next two years on compute for training its models, and that the media’s response is to say that somehow it will break even in 2030, even though there’s no actual explanation anywhere as to how that might happen other than vague statements about “efficiency.” That’s weird! It’s really, really weird!
It’s also weird that we’re still having a debate about “the power of AI” and “what agents might do in the future” based on fantastical thoughts about “agents on the internet” that do not exist, cannot exist, and will never exist, and it’s fucking weird that executives and members of the media keep acting as if that’s the case. It’s also weird that people discussing agents don’t seem to want to discuss that OpenAI’s Operator agent does not work, that AI browsers are fundamentally broken, or that agentic AI does not do anything that people discuss.

In fact, that’s one of the weirdest parts of the whole AI bubble: the possibility of something existing is enough for the media to cover it as if it exists, and a product saying that it will do something is enough for the media to believe it does it. It’s weird that somebody saying they will spend money is enough to make the media believe that something is actually happening, even if the company in question — say, Anthropic — literally can’t afford to pay for it.

It’s also weird how many outright lies are taking place, and how little the media seems to want to talk about them. Stargate was a lie! The whole time it was a lie! That time that Sam Altman and Masayoshi Son and Larry Ellison stood up at the White House and talked about a $500 billion infrastructure project was a lie! They never formed the entity! That’s so weird!

Hey, while I have you, isn’t it weird that OpenAI spent hundreds of millions of dollars to buy tech podcast TBPN “to help with comms and marketing”? It’s even weirder considering that TBPN was already a booster for OpenAI! It’s also weird that a lot of AI data center projects don’t seem to actually exist, such as Nscale’s project to make “one of the most powerful AI computing centres ever” that is literally a pile of scaffolding — and that despite that announcement the company was able to raise $2 billion in funding.

It’s also weird that we’re all having to pretend that any of this matters. The revenues are terrible, Large Language Models are yet to provide any meaningful productivity improvements, and the only reason that they’ve been able to get as far as they have is a compliant media and a venture capital environment born of a lack of anything else to invest in. Coding LLMs are popular only because of their massive subsidies and corporate encouragement, and in the end will be seen as a useful-yet-incremental — and way too expensive — way to make the easy things easier and the harder things harder, all while filling codebases with masses of unintentional, bloated code. If everybody were forced to pay the actual costs of LLM coding, I do not believe for a second that we’d have anywhere near the amount of mewling, submissive and desperate press around these models.

The AI bubble has every big, flashing warning sign you could ask for. Every company loses money. Seemingly every AI data center is behind schedule, and the vast majority of them aren’t even under construction. OpenAI’s CFO does not believe that it’s ready to go public in 2026, and Sam Altman’s reaction has been to have her report to somebody other than him, the CEO. Both OpenAI and Anthropic’s margins are worse than they projected. Every AI startup has to raise hundreds of millions of dollars, and their products are so weak that they can only make millions of dollars of revenue after subsidizing the underlying cost of goods to the point of mass unprofitability.
And it’s really weird that the mainstream media takes the diametrically opposite view — that all of this is totally permissible under the auspices of hypergrowth, that these companies will simply grow larger, that they will somehow become profitable in a way that nobody can actually describe, that demand for AI data centers will exist despite there being no signs of that happening.

I get it. Living in my world is weird in and of itself. If you think like I do, you have to see every announcement by Anthropic or OpenAI as suspicious — which should be the default position of every journalist, but I digress — and any promise of spending billions of dollars as impossible without infinite resources.

At the end of this era, I think we’re all going to have to have a conversation about the innate credulity of the business and tech media, and how often that was co-opted to help the rich get richer. Until then, can we at least admit how weird this all is?

Kev Quirk, 1 week ago

I Hate Insurance!

So yesterday I received an email from Admiral, our insurance provider, where we have a combined policy for both our cars and our home. Last year this cost £1,426.00, but this year the renewal had gone up by a huge 33%, to £1,897.93, broken down as follows:

Wife's car - £339.34
My car - £455.68
Our home (building & contents) - £1,102.91

Even at last year's price this was a shit tonne of money, so I started shopping around and here's what I ended up with:

Wife's car - £300.17
My car - £402.22
Our home (building and contents) - £533.52
Total: £1056.86 (44% reduction!)

These policies have at least the same cover as Admiral. In some cases, better. I knew it would be cheaper shopping around, but I didn't think it would be nearly half.

So, I called Admiral to see what they could do for me, considering I've been a loyal customer for 7 years. They knocked £167.83 (8.8%) off the policy for me, bringing the revised total to £1,730.10. Nice to see that long-term customers are rewarded with the best price! 🤷🏻‍♂️

So I obviously went with the much cheaper option and renewed with 3 different companies. It's a pain, as I'll now need to renew 3 policies at the same time every year, but if it means saving this much money, I'm happy to do it. Next year I'll get a multi-quote from Admiral to see if they're competitive. Something tells me they will be — as with most things these days, getting new customers is more important than retaining existing ones.

Unfortunately having car and home insurance is a necessary evil in today's world, but I'm glad I was able to make it a little more palatable by saving myself over £700! If your insurance is up for renewal, don't just blindly renew - shop around as there's some serious savings to be had.

Thanks for reading this post via RSS. RSS is ace, and so are you. ❤️ You can reply to this post by email, or leave a comment.


News: OpenAI CFO Doesn't Believe Company Ready For IPO, Unsure Revenue Will Support Commitments

News out of The Information's Anissa Gardizy and Amir Efrati over the weekend — OpenAI CFO Sarah Friar has apparently clashed with CEO Sam Altman over timing around OpenAI's IPO, emphasis mine:

I cannot express how strange this is. Generally a CFO and CEO are in lock-step over IPO timing, or at the very least the CFO has an iron grip on the actual timing because, well, CEOs love to go public and the CFO generally exists to curb their instincts. Nevertheless, Clammy Sam Altman has clearly sidelined Friar, and as of August last year, the CFO of OpenAI doesn't report to the CEO. In fact, the person Friar reports to (Fiji Simo) just took a medical leave of absence:

It is extremely peculiar to not have the Chief Financial Officer report to the Chief Executive Officer, but remember folks, this is OpenAI, the world's least-normal company! Anyway, all of this seemed really weird, so I asked investor, writer and economist Paul Kedrosky for his thoughts:

Very cool! Paul is also a guest on this week's episode of my podcast Better Offline, by the way. Out at 12AM ET Tuesday.

Anyway, The Information's piece also adds another fun detail — that OpenAI's margins were even worse than expected in 2025:

Riddle me this, Batman! If your AI company always has to buy extra compute to meet demand, and said extra compute always makes margins worse, doesn't that mean that your company will either always be unprofitable, or die because it buys too much compute? Say, that reminds me of something Anthropic CEO Dario Amodei said to Dwarkesh Patel earlier in the year...

It is extremely strange that the CFO of a company doesn't report to the CEO of a company, and even more strange that the CFO is directly saying "we are not ready for IPO" as its CEO jams his foot on the accelerator. It's clear that both OpenAI and Anthropic are rushing toward a public offering so that their CEOs can cash out, and that their underlying economics are equal parts problematic and worrying. Though I am entirely guessing here, I imagine Friar sees something within OpenAI's finances that gives her pause. An S-1 — one of the filings a company makes before going public — is an audited document, and I imagine the whimsical mathematics that OpenAI engages in — such as, per The Wall Street Journal, calculating profitability without training compute — might not match up with what actual financiers crave.

If you like this piece and want to support my independent reporting and analysis, why not subscribe to my premium newsletter? It’s $70 a year, or $7 a month, and in return you get a weekly newsletter that’s usually anywhere from 5,000 to 18,000 words, including vast, detailed analyses of NVIDIA, Anthropic and OpenAI’s finances, and the AI bubble writ large. I just put out a massive Hater’s Guide To The SaaSpocalypse, as well as last week’s deep dive into How AI Isn't Too Big To Fail. Supporting my premium supports my free newsletter.

In summary:

OpenAI CFO Sarah Friar has, per The Information, said that OpenAI is not ready to go public in 2026, in part because of the "risks from its spending commitments" and not being sure whether the company's revenue growth would support those commitments.

Friar (CFO) no longer reports to Sam Altman (CEO), and hasn't done so since August 2025.

OpenAI's margins were lower in 2025 "...due to the company having to buy more expensive compute at the last minute."

Rik Huijzer, 1 week ago

COVID vs Oil Crisis

COVID was declared a pandemic on 11/3 (11*3=33) 2020. On 11/3 2026, the IEA wrote: _"The IEA Secretariat will provide further details of how this collective action will be implemented in due course. It will also continue to closely monitor global oil and gas markets and to provide recommendations to Member governments, as needed."_


The Subprime AI Crisis Is Here

Hi! If you like this piece and want to support my independent reporting and analysis, why not subscribe to my premium newsletter? It’s $70 a year, or $7 a month, and in return you get a weekly newsletter that’s usually anywhere from 5,000 to 18,000 words, including vast, detailed analyses of NVIDIA , Anthropic and OpenAI’s finances , and the AI bubble writ large . I just put out a massive Hater’s Guide To The SaaSpocalypse , as well as last week’s deep dive into how the majority of data centers aren’t getting built and the overall AI industry is depressingly small . Supporting my premium supports my free newsletter, and premium subscribers don't get this ad. Soundtrack: Metallica — …And Justice For All   Bear with me, readers. I need to do a little historical foreshadowing to fully explain what’s going on. In the run-up to the great financial crisis, unscrupulous lenders issued around 1.9 million subprime loans , with many of them being adjustable rate mortgages (ARMs) with variable rates that, after a two-or-three-year-long introductory period , would adjust every twelve months, per CBS News in July 2006 : At the time, 18% of homeowners had adjustable-rate mortgages, which also made up more than 25% of new mortgages in the first quarter of 2006, with (at the time) over $330 billion of mortgages expected to adjust upwards. Things were grimmer beneath the surface. A question on JustAnswer from 2009 showed a homeowner that was about to lose their house after being conned into a negative amortization loan — a mortgage where payments didn’t actually cover the interest, meaning that each month the balance increased . Dodgy lenders were given bonuses for selling more mortgages, whether or not the person on the other end was capable of paying, and by November 2007 , around two million homeowners held $600 billion of ARMs.  Yet the myth of the subprime mortgage crisis was that it was caused entirely by low income borrowers. Per Duke’s Manuel Adelino : Despite The Big Short’s dramatic “stripper with six properties” scene made for a vivid demonstration of the subprime problem, the reality was that everybody got taken in by teaser rate mortgages, driving up the value of properties based on a housing market that was only made possible by mortgages that were expressly built to hide the real costs as interest rates and borrower payments rose every six to 36months. I’ll add that near-prime mortgages — for borrowers with just-below-prime credit scores — were also growing, with over 1.1 million of them in 2005, when they represented nearly 32% of all loans. Many people who bought houses that they couldn’t afford did so based on a poor understanding of the terms of their mortgage, thinking that the value of housing would continue to climb as it had for over a hundred years , and/or the belief that they’d easily be able to refinance the loans. Even as things deteriorated toward the middle of the 2000s, people came up with rationalizations as to why things would work out, such as Anthony Downs of The Brookings Institution, who in October 2007 said the following in a piece called “Credit Crisis: The Sky is not Falling”: Brookings also added that “...the vast majority of subprime mortgages are likely to remain fully paid up as long as unemployment remains as low as it is now in the U.S. economy.” At the time, US unemployment was 4.7% , but a year later it was at 6.5%, and would peak at 10% in October 2009.   In an article from the December 2004 issue of Economic Policy Review , Jonathan McCarthy and Richard W. 
Peach argued that there was “little basis” for concerns about housing prices, with “home prices essentially moving in line with increases in family income and declines in nominal mortgage interest rates,” and hand-waved any concerns based on vague statements about “demand”: From the outside, this made it appear that the value of housing was exponential, and that the “pent-up demand” for homes necessitated a massive boom in construction, one that peaked in January 2006 with 2.27 million new homes built. A year later, this number collapsed to 1.084 million, and by January 2009, only 490,000 new homes were being built in America, the lowest rate in history. Denial rates for mortgages declined drastically (along with the increase in things like 40-year or 50-year mortgages), which meant that suddenly almost anybody was able to get a mortgage, which made it only seem logical to build more housing. Low interest rates before 2006 allowed consumers to take on mountains of new credit card debt, rising to as high as 20% of household incomes in 2007, to the point that by the 2000s, credit card companies were making more money from credit card lending than from the fees people paid to use their cards, with $65 billion of the credit card industry’s $95 billion in revenue coming from interest on debt, and lending-related penalty fees and cash advance fees contributing another $12.4 billion, per Philadelphia Fed economist Lukasz Drozd. While the precise order of events is a little more complex, the general gist of the subprime mortgage crisis was straightforward: easily-available money allowed huge numbers of people — many of whom couldn’t afford to buy these houses outside of the easy money that funded the bubble — to enter the housing market, which in turn made it much easier to sell a house for a much higher price, which inflated the value of housing. People made decisions based on fundamentally-flawed information. In January 2004, the Bush administration declared that America’s economy was on the path to recovery, with small businesses creating the majority of new jobs and the stock market booming. Debt was readily available across the board, with commercial and industrial loans spiking along with consumer debt (including a worrying growth in subprime auto loans). The good times were rolling, as long as you didn’t think about it too hard. But, as I said, the chain of events was simple: it was easy to borrow money to buy a house, which meant lots of people were buying houses, which meant that the value of a house seemed higher than it was outside of the easy money era. Easily-available money put lots of cash into the economy, which led to higher prices, which led to inflation, which forced the Federal Reserve to raise interest rates 17 times in the space of two years, which made it harder to get any kind of loan, which made it harder to get a mortgage, which made it harder to sell a house, which made people sell houses for cheaper, which lowered the value of houses, which made it harder to refinance the bad loans, which meant people foreclosed on their homes, which in turn lowered the value of housing, all as demand for housing dropped because nobody was able to buy housing. The underlying problems were, ultimately, the illusion of value and mobility.
Those borrowing at the time believed they had invested in something with a consistent (and consistently-growing) value — a house — and would always have easy access to credit (via credit cards and loans), as before-tax family income had never been higher. At the beginning of 2007, delinquencies on consumer and business loans climbed, abandoned housing developments grew, and a US economy dependent on the housing bubble (per Paul Krugman’s “That Hissing Sound” from August 2005) began to stumble. By November 2009, 23% of US consumer mortgages were underwater (meaning the homes were worth less than their loans). The housing bubble was created through easily-available debt, insane valuations based on debt-fueled speculation, do-nothing regulators (like eventual Fed Chair Ben Bernanke, who said in October 2005 that there was no housing bubble) and consumers being sold an impossible, unsustainable dream by people financially incentivized to make them rationalize the irrational, and believe that nothing bad would ever happen. In February 2005, 40% ($19 billion) of IndyMac Bancorp’s mortgage originations in a single quarter came from a “Pay-Option ARM,” which started with a 1% teaser rate that jumped in a few short months to 4% or more, with frequent adjustments. Washington Mutual CEO Kerry Killinger said in 2003 that he wanted WaMu to be “the Wal-Mart of banking,” and did so by using (to quote the New York Times) “relaxed standards,” including issuing a mortgage to a mariachi singer who claimed a six-figure income and verified it using a single photo of himself. By the time it collapsed in September 2008, WaMu had over $52.9 billion in ARMs and $16.05 billion in subprime mortgage loans. Had Washington Mutual and the many banks making dodgy ARM and subprime loans underwritten loans based on the actual creditworthiness of their applicants, there wouldn’t have been a housing bubble, because many of these borrowers would’ve been unable to pay their mortgages, and thus wouldn’t have been deemed creditworthy, and thus the apparent demand for housing would never have materialized. In very simple terms, the “demand” for housing was inflated by a deceitfully-priced product that undersold its actual costs, and through that deceit millions of people were misled into believing said product was viable. Did you work out where this is going yet? In September 2024, I raised my first concerns about a Subprime AI Crisis: This theory is important, and thus I’m going to give it a lot of time and love to break it down. That starts with the parties involved, and how the economics get worse over time, returning to my theory of “AI’s chain of pain,” and the hierarchy of how the actual AI economy works. The AI industry has done a great job of obfuscating exactly how brittle its economics really are, and as a result, I need to explain how money is raised, how it’s deployed, and where the economics begin to break down. Generally, AI is funded from only a few places: Some things to keep note of: This is a crucial point, so stay with me. AI models work by charging a per-million-token rate for inputs (things you feed in) and outputs, which are either the things the model produces (such as an image, text or code), or the “chain of thought reasoning” many models rely upon now, where they take an input, generate a plan (which is an “output”) and then do stuff based on said plan.
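To make that pricing model concrete, here’s a minimal sketch of the per-million-token arithmetic. The rates and token counts below are made-up placeholders, not any lab’s actual price list; the key detail is that hidden “reasoning” tokens are billed as output tokens:

```python
# Minimal sketch of per-request LLM pricing. The rates below are
# illustrative placeholders, not any lab's actual price list.
INPUT_RATE_PER_M = 3.00    # dollars per million input tokens (hypothetical)
OUTPUT_RATE_PER_M = 15.00  # dollars per million output tokens (hypothetical)

def request_cost(input_tokens: int, output_tokens: int, reasoning_tokens: int = 0) -> float:
    """Chain-of-thought 'reasoning' tokens are billed as output tokens,
    even though the user never sees them."""
    billed_output = output_tokens + reasoning_tokens
    return (input_tokens / 1e6) * INPUT_RATE_PER_M + (billed_output / 1e6) * OUTPUT_RATE_PER_M

# A single coding prompt can feed in a large context and generate a long
# hidden plan before emitting any code the user actually sees:
print(f"${request_cost(50_000, 4_000, reasoning_tokens=30_000):.2f}")  # ≈ $0.66
```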
AI startups, for the most part, do not have their own models, and thus must pay OpenAI or Anthropic (or other providers, to a much lesser extent) to build services using them. When you pay for access to an AI startup’s service — which, of course, includes OpenAI and Anthropic — you do so for a monthly fee, such as $20, $100 or $200-a-month in the case of Anthropic’s Claude, Perplexity’s $20 or $200-a-month plan, or OpenAI’s $8, $20, or $200-a-month subscriptions. In some enterprise use cases, you’re given “credits” for certain units of work, such as how Lovable allows users “100 monthly credits” in its $25-a-month subscription, as well as $25 (until the end of Q1 2026) of cloud hosting, with rollovers of credits between months. When you use these services, the company in question then pays for access to the AI models in question, either at a per-million-token rate to an AI lab, or (in the case of Anthropic and OpenAI) to whatever cloud provider is renting them the GPUs to run the models. A token is basically ¾ of a word. As a user, you do not experience token burn, just the process of inputs and outputs. AI labs obfuscate the cost of services by using “tokens” or “messages” or 5-hour rate limits with percentage gauges, and you, as the user, do not really know how much any of it costs. On the back end, AI startups are annihilating cash, with Anthropic, until recently, allowing you to burn upwards of $8 in compute for every dollar of your subscription. OpenAI allows you to do the same, though it’s hard to gauge by how much. This is where the economic problem has begun. When the AI bubble started, venture capitalists flooded AI startups with cash, encouraging them to create hypergrowth businesses using, for the most part, monthly subscription prices that didn’t come close to covering the costs. As a result, many AI companies have experienced rapid growth selling a product that can only exist with infinite resources. The problem is fairly simple: providing AI services is very expensive, and costs can vary wildly depending on the customer, input and output, the latter of which can change dramatically depending on the prompt and the model itself. A coding model relies heavily on chain-of-thought reasoning, which means that despite the cost of tokens coming down (which does not mean the cost of providing them has decreased; it’s a marketing move), models are using far, far more tokens, increasing costs across the board. And consumers crave new models. They demand them. A service that doesn’t provide access to a new model cannot compete with those that do, and because the costs of models have been mostly hidden from users, the expectation is always the newest models provided at the same price. As a result, there really isn’t any way that these services make sense at a monthly rate, and every single AI company loses incredible amounts of money, all while failing to make that much revenue in the first place. For example, Harvey is an AI tool for lawyers that just raised $200 million at an $11 billion valuation, all while having an astonishingly small $190 million in ARR, or $15.8 million a month. It raised another $160 million in December 2025, after raising $300 million in June 2025, after raising $300 million in February 2025.
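A minimal sketch of the unit economics this creates, assuming the roughly $8-of-compute-per-$1-of-subscription figure above (the subscription price and user count are hypothetical):

```python
# Back-of-envelope subsidy math. The $8-per-$1 burn ratio comes from the
# reporting above; the subscription price and user count are assumptions.
subscription_price = 20.00          # dollars per month
burn_per_revenue_dollar = 8.00      # compute burned per dollar of subscription

compute_cost = subscription_price * burn_per_revenue_dollar   # $160/user/month
loss_per_user = compute_cost - subscription_price             # $140/user/month

print(f"compute cost per user: ${compute_cost:.0f}/month")
print(f"loss per user: ${loss_per_user:.0f}/month")
# At this ratio, "hypergrowth" is just a bigger hole: a million such
# subscribers burn far more than they bring in, every single month.
print(f"1M users: ${loss_per_user * 1_000_000 / 1e6:.0f}M/month in losses")
```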
Cursor is an AI coding tool that raised $160 million in 2024 (as of December 2024, it had $48 million ARR, or around $4 million of monthly revenue), $900 million ($500 million ARR/$41.6 million) in June 2025, and $2.3 billion in November 2025 ($1 billion ARR/$83 million). As of March 2, 2026, Cursor was at $2 billion annualized revenue, or $166 million in monthly revenue. I’ll get to Cursor in a little bit, because it’s crucial to the Subprime AI Crisis. The Subprime AI Crisis is what happens when somebody actually needs to start making money, or, put another way, stop losing quite so much, revealing how every link in the chain was funded based on questionable assumptions and deadly short-term thinking. Here’s the order of events as I see them. The entire generative AI industry is based on unprofitable, unsustainable economics, rationalized and funded by venture capitalists and bankers speculating on the theoretical value of Large Language Model-based services. This naturally incentivized developers to price their subscriptions at rates that attracted users rather than reflecting the actual economics of the services. Venture capitalists are also part of the subprime AI crisis, sitting on “billions of dollars” of AI companies that lose hundreds of millions of dollars, their companies built on top of AI models owned by OpenAI and Anthropic with little differentiation and no path to profitability. Nobody is going public! Nobody is getting acquired! As I discussed back in AI Is A Money Trap, there really is no liquidity mechanism for the billions of dollars sunk into most AI companies. Going public also reveals the ugly financial condition of these startups. MiniMax, for example, made a pathetic $79 million in revenue in 2025, and somehow lost $250.9 million in the process. Much like the houses in the great financial crisis, AI startups only retain their value as long as there is a market, or at least the perception that these companies could theoretically go public or be acquired. It only takes one failed exit or firesale to break the illusion. At least you can live in a house. Every AI company will be a problem child that burns money on inference, bereft of intellectual property thanks to its dependence on OpenAI and Anthropic. What use is Perplexity without an eternal subsidy? The value of having Aravind Srinivas sitting around your office all day? I’d rather start my car in the garage. “Fast-growing” AI companies only grew because they were allowed to burn as much money as they wanted selling services that are entirely unsustainable, raising more venture capital with every burst of user growth, which they use to aggressively market to new users and grow further to raise another bump of venture capital. As a result, AI labs and AI startups have created negative habits with their users in two ways: To grow their user bases as fast as possible, AI startups (and AI labs) allowed their users to burn incredible amounts of tokens, I assume because they believed at some point things would become profitable or they’d always have access to easy venture capital. This created an entire industry of AI startups that disconnected their users from the raw economics of the product, creating a race to the bottom where every single AI startup must have every AI model and every AI feature and do every AI thing, all at an incredible cost that only ever seems to increase. Another fun feature is that just about every product gives some sort of “free” access period for new (and expensive!)
models, like when Cursor had a free access period for GPT-5’s launch. It’s unclear who shoulders the burden here, but somebody is paying those costs. In any case, nowhere are the subsidies higher than those of Anthropic and OpenAI, who use their tens of billions of dollars of funding to allow users to burn anywhere from $3 to $13 for every dollar of subscription revenue to outpace their competition. The Subprime AI Crisis is when the largest parties are finally forced to reckon with their rotten economics, and the downstream consequences that follow. As I reported in July 2025, starting in June last year, both OpenAI and Anthropic launched “priority service tiers,” jacking up the price on their enterprise customers (who pay for model access via their API to provide models in their software) for guaranteed uptime and less throttling of their services, while also requiring an up-front (3-12 month) guarantee of token throughput. Anthropic’s changes immediately increased the costs on AI startups like Lovable, Replit, Augment Code, and Anthropic’s largest customer, Cursor, which was forced to dramatically change its pricing from a per-request model to a bizarre pricing model where you pay model pricing with a 20% fee, but also receive A) at least as much as you pay in your subscription fee in tokens and B) “generous included usage” of Cursor’s Composer model: What’s crazy is that even with this pricing, Cursor still gives away 16 cents for every dollar on its $60-a-month plan and $1 for every dollar on its $200-a-month plan, and that’s before “generous usage” of other models. I’ll also add that Anthropic has already turned the screws on its subscription customers too, adding weekly limits for Claude subscribers on July 28, 2025, a few weeks after quietly tightening other limits. Over the next few months, just about every AI startup had to institute some form of austerity. Replit shifted to something called “effort-based” pricing in June 2025, and then launched something called “Agent 3” in September 2025 that burned through users’ limits even faster — and, to be clear, Replit’s pricing gives you your subscription price in credits every single month on top of the cloud hosting necessary to get them online, meaning that a $20-a-month subscriber likely burns at least $25 a month, and Replit remains unprofitable. Coding platform Augment Code was forced to change its pricing in October 2025 to a per-message basis, which meant that any message you sent cost the same amount no matter how complex the required response. In one case, a user spent $15,000 in tokens on a $250-a-month plan. Since then, Augment Code has moved to a confusing “credit”-based model where it claims you use about 293 credits per Claude Sonnet 4.5 task, and users absolutely hate it, because Augment Code was too cowardly to charge users based on the actual model pricing, as doing so would scare them away. Now Augment Code is planning to remove its auto-complete and next edit features, claiming that global usage of those features was in decline and saying that developers “...are no longer working primarily at the level of individual lines of code; instead, they are orchestrating fleets of agents across tasks.” Elsewhere, Notion bumped its Business Plan from $15 to $20-a-month per user thanks to its new “AI features,” which I imagine sucked for previous business subscribers who didn’t want “AI agents” or any of that crap but did want things like Single Sign-On and Premium Integrations. The result? Profit margins dropped by 10%.
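Here’s a rough sketch of one reading of the Cursor-style pass-through pricing described above (usage billed at model cost plus a 20% fee, with the subscription including at least its own price in usage). The usage figures are hypothetical, and the exact mechanics may differ; the point is that even the “fixed” pricing recovers nothing until a user burns well past their included allowance:

```python
# Simplified reading of the pass-through pricing described above: usage is
# billed at model cost + 20%, but the subscription includes at least the
# subscription price in usage. All usage figures below are hypothetical.
def monthly_bill(subscription: float, model_cost: float) -> float:
    billed_usage = model_cost * 1.20                 # model pricing plus the 20% fee
    overage = max(0.0, billed_usage - subscription)  # included usage absorbs the first chunk
    return subscription + overage

# A $60/month user who burns $45 of model cost pays no overage, and $45 of
# the $60 collected goes straight back out the door to the model provider:
print(monthly_bill(60, 45))    # 60.0
# A heavy user blows through the included usage and pays pass-through rates:
print(monthly_bill(60, 300))   # 60 + (360 - 60) = 360.0
```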
Great job everybody! In February 2026, Perplexity users noticed that rate limits had been aggressively trimmed from even its January 2026 limits, with $20-a-month subscribers now limited to arbitrary “average use weekly limits” on searches, and “monthly limits” on research queries (that one user worked out dropped them from 600 deep research queries a month to 20), down from 300+ searches a day and generous deep research limits. Price hikes and product changes are likely to accelerate in the next few months as things get desperate. But now for a quick intermission… I have been training with Nik Suresh, author of I Will Fucking Piledrive You If You Mention AI Again, and while I’m kidding, I want to be clear that if you don’t stop bringing up Uber and AWS as examples of why AI will work out, I may react poorly, as I’m fucking tired of this point, because it’s stupid and wrong. I will put you in the embrace of God, I swear. The AI bubble and its representative companies do not and have never represented the buildout of Amazon Web Services or the growth and burn rate of Uber. If you are still saying this you are wrong, ignorant and potentially a big fucking liar. As I discussed about a month ago, Amazon Web Services cost around $52 billion (adjusted for inflation!) between 2003 (when it was first used internally) and 2017, two years after it hit profitability. OpenAI raised $42 billion last year. Anthropic raised $30 billion in February. You are full of shit if you keep saying this. As I discussed a few weeks ago, Uber’s economics are absolutely nothing like generative AI. Uber did not have capex, and burned those billions on R&D and marketing (making it more similar to Groupon in the end): Here’re some other myths I’m tired of hearing about: Yet the most obvious one that I hear is the funniest: that Anthropic and OpenAI can just raise their prices! As both OpenAI and Anthropic aggressively stumble toward their respective attempts to take their awful businesses public, both are making moves to try and become “respectable businesses,” by which I mean “businesses that still lose billions of dollars but in less-annoying ways.” Last week, OpenAI killed Sora — both the app and the model — along with a $1 billion investment from Disney, with the Wall Street Journal reporting it was burning a million dollars a day, but Forbes estimating the number was closer to $15 million. OpenAI will frame this as part of its “refocus” on a “Superapp” (per the WSJ) that combines ChatGPT, its Codex coding app, and its dangerously shit browser into one rat king of LLM toys that nobody can work out a real business model for. All of this is part of a supposed internal effort to “prioritize coding and business customers” that we’ve heard some version of for months. Meanwhile, OpenAI’s attempts to bring advertising to its users have been a little embarrassing, with a two-month-long trial involving “less than 20%” of ChatGPT users resulting in “$100 million in annualized revenue,” better known as about $8.3 million in a month from what was meant to be a business line that brought in “low billions” in 2026 according to the Financial Times. Confusingly timed with this “refocus” is OpenAI’s plan to nearly double its workforce from 4,500 to 8,000 people by the end of 2026. In fact, writing all this down makes it feel like OpenAI doesn’t really have much of a focus beyond “buy more stuff” and saying “superapp!” every six months.
Hey, whatever happened to OpenAI’s plan to be “the interface to the internet” that Alex Heath reported would happen by the first half of 2025? Did that happen? Did I miss it? In any case, OpenAI’s other strategy is to absolutely jam the gas pedal on its Codex coding product — for example, one user I found was able to burn $2,192 in tokens on a $200-a-month ChatGPT plan, and another was able to burn $1,461 in three days on the same subscription. Meanwhile, Anthropic has been in the midst of a months-long rugpull following an all-out media campaign through December and January, pushing Claude Code on tech and business reporters who don’t bother to think too hard about things, per my Hater’s Guide to Anthropic: On February 18, 2026, Anthropic started banning anybody who used multiple Claude Max accounts, something that had never been an issue before it needed everybody to talk about Claude Code non-stop. The same day, Anthropic “cleared up” its Claude Code policies, saying that you can’t connect your Claude account to external services, meaning that all of those people who have been spinning up OpenClaw instances and buying $10,000 worth of Mac Minis are going to find that they’re suddenly having to pay for their API calls. Around a month later, Anthropic started a two-week-long 2x-rate-limit promotion for off-peak usage that ended on March 27, 2026. A day before, on March 26, 2026, Anthropic announced that it was starting “peak hours,” with Claude users maxing out their sessions faster between the hours of 5AM and 11PM Pacific, Monday to Friday, with a spokesperson limply adding that “efficiency wins” will “offset this” and only “7% of users will hit the limits.” All of this was sold as a result of “managing the growing demand for Claude.” If I’m honest, this might be Anthropic’s most-egregious swindle yet. By pumping off-peak usage and then immediately cutting it just before introducing peak hours, Anthropic further muddies the water of how much actual access you get to its products. Peak hours appear to have become aggressively restricted, and I imagine off-peak feels…something like the regular peak hours used to. Users almost immediately started hitting limits regardless of what time or day they were using it. One user on the $100-a-month Max plan complained about hitting 61% of his session limit after four prompts (which cost $10.26 in tokens). Another said that they hit 63% of their rate limit on their $200-a-month plan in the space of a day, and another hit 95% after 20 minutes of using their Max plan (I’m gonna guess $100-a-month). This person hit their Max limit after “two or three things.” This one vowed to cancel their $200-a-month subscription after hitting their weekly limit in the space of a day, saying that they (and I’m going off of a translation, so forgive me) “expected a premium experience for $200, and what they got was constant limit stress.” This guy is scared to use Claude Code because of the limits. This guy blew 28% of his limits in less than an hour. This guy “can’t even do basic work on a 20x Max plan.” This guy hit his limits “in a few prompts” on Anthropic’s $20-a-month Pro plan, and the same prompts would have (apparently) consumed 5% of the limits “normally” (I assume last week), and while Thariq from Anthropic assured him that this was abnormal, he didn’t bother to respond to this guy in the thread who said he ran out of usage on the Max plan in 15 minutes.
While Anthropic technical staff member Lydia Hallie posted that Anthropic was “aware people are hitting usage limits in Claude Code way faster than expected” and that some investigation was taking place, it’s hard to imagine that Anthropic had no idea that these limits were so severe, or that any of this was a surprise. Naturally, OpenAI had already reset limits on its Codex coding model the second that these reports began, claiming that it “wanted people to experiment with the magnificent plugins they launched” rather than saying something more-truthful like “we’re resetting limits so that the hogs braying with anger at Anthropic start paying OpenAI instead.” While an eager Redditor claimed that these rate limits were a result of a cache bug on Claude Code, Anthropic quickly said that this wasn’t the reason, nor did it say anything about there being a reason or that anything was wrong. Meanwhile, users are complaining about the reduced quality of outputs from Anthropic’s Claude Opus 4.6 model, with some saying it acts like cheaper models, and another noting that it might be because of Anthropic’s upcoming Mythos model, which was leaked when Fortune mysteriously discovered an openly-accessible “data cache” that included 3,000 assets but somehow no actual information about the model other than that it would be a “step change” and that its cybersecurity powers were too much to release at once, the tech equivalent of deliberately dropping a magnum condom out of your wallet in front of a woman, or Dril’s “I was just buying ear medication for my sick uncle…who’s a model by the way” post. I’m gonna be honest, I just don’t give a shit about Mythos or Capybara or any blatant leaks intended to spook cybersecurity stocks, especially as these models are also meant to be much more compute-intensive, and thus vastly more expensive to run. How will that work with these rate limits, exactly? I think there’re a few ways this goes: I wager that this is just the first of a few major belt-tightening operations from both Anthropic and OpenAI as they desperately shoulder-barge each other to file the world’s worst S-1. Both companies lose billions of dollars, both companies have no path to profitability, and both companies sell products — both to consumers and businesses — that simply do not work when users are forced to pay something approaching a sustainable cost. Even with these egregious limits, a user I previously linked to was allowed to burn $10 in tokens in four prompts on a $100-a-month plan. Even in the world of Amodei’s Stylized Facts, that would still be $5 of prompts every 5 hours, which over the course of a month will absolutely add up to more than $100. Yet the sheer fury of Anthropic’s customers only proves the fundamental weakness of Anthropic’s business model, and the impossibility of ever finding any kind of profitability. And the AI industry has nobody to blame but itself. While it’s really easy to make fun of people obsessed with LLMs, I want to be clear that Anthropic and OpenAI are inherently abusive companies that have built businesses on theft, deception and exploitation. Anybody who’s spent more than a few minutes in one of the many AI subreddits has read story after story of models mysteriously “becoming dumb,” or rate limits that seem to expand and contract at random. Even the concept of “rate limits” only serves to further deceive the customer.
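To put numbers on the $5-per-5-hour-session rationing above, here’s a minimal sketch of what it implies over a month; the sessions-per-day and days-worked figures are assumptions for illustration:

```python
# Extrapolating the session math above: even rationed to ~$5 of compute per
# 5-hour session, a working user burns past $100/month almost immediately.
# Sessions per day and days worked are assumptions for illustration.
dollars_per_session = 5.0
sessions_per_workday = 2      # a conservative two sessions in a workday
workdays_per_month = 21

monthly_burn = dollars_per_session * sessions_per_workday * workdays_per_month
print(f"${monthly_burn:.0f} of compute/month on a $100/month plan")  # $210
```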
Outside of intentionally asking the model, users are entirely unaware of their “token burn,” or at the very least have built habits around rate limits that, as of right now, are entirely different to even a month ago. A user who bought a $200-a-month Claude subscription in December 2025 now, a mere three months later, very likely cannot do the same things on Claude Code that they could when they decided to subscribe, and those who use these subscriptions for their day jobs are now having to sit on their hands waiting for the rate limits to pass, with no clarity into whether they’ll be able to work at the same rate they did even a month ago, let alone when they subscribed. All of this is a direct result of Anthropic, OpenAI, and other AI startups intentionally deceiving customers through obtuse pricing so that people would subscribe believing that the product would continue providing the same value, and I’d argue that annual subscriptions to these services amount to, if not fraud, a level of consumer deception that deserves legal action and regulatory involvement. To be clear, no AI company should have ever sold a monthly subscription, as there was never a point at which the economics made sense. Yet had these companies actually charged their real costs, nobody would have bothered with AI, because even with these highly-subsidized subscriptions, AI still hasn’t delivered meaningful productivity benefits, other than a legion of people who email me saying “it’s changed my life as a programmer!” without explaining to me what that means, why it matters, or what the actual result is at the end. Isn’t it kind of weird that we have these LLM subscriptions to products that arbitrarily become less-accessible or less-performant in a way that’s impossible to really measure, and that the labs never seem to address? We don’t know the actual rate limits on Claude (other than via CCusage or Shellac’s research), or ChatGPT, or any of these products by design, because if we did, it would be blatantly obvious how unsustainable and ridiculous these products were. And the magical part about Large Language Models is that your most engaged customers are also your most expensive, and the more-intensive the work, the more expensive the outputs become. If you’re about to say “well they’ll just raise the prices,” perhaps you should check Twitter or Reddit, and notice that Anthropic’s customers are screaming like they’re being stung to death by bees because of new rate limits that only let them burn $10 of compute in five hours. Do you think these people would be comfortable with a $130-a-month, $1,300-a-month or $2,500-a-month subscription? One that performs the same way (if not worse) as their $20, $100 or $200-a-month subscription did? Or do you think they’ll do Aaron Sorkin speeches about Anthropic’s greed and immediately jump to ChatGPT in the hopes that the exact same thing doesn’t happen a few months later? Much as homeowners were assured that they’d simply be able to refinance their homes before the adjustable rates hit, AI fans repeatedly switch subscriptions to whichever provider is currently offering the best deal, in some cases paying for multiple subscriptions with the explicit knowledge that rate limits existed and would become increasingly punishing. Based on the reactions of their users, I don’t really see how the AI labs — or AI startups, for that matter — fix this problem. On one hand, AI subscribers are acting like babies, crying that their product won’t let them use $2,500 of tokens for $200.
This was an obvious con, a blatant subsidy, and a party that wouldn’t last forever. On the other, AI labs and AI startups have never, ever acted with any degree of honesty or clarity with regards to their costs, instead choosing to add “exciting” new features that often burn more tokens without charging the end user more, which sounds nice until you remember that things cost money and money is not unlimited. The very foundation of every AI startup is economically broken. The majority of them sell some sort of “deep research” report feature that costs several dollars to generate at a time, and many sell some form of expensive coding or “computer use” product, tool-based web search features, and many other products that exist to keep a user engaged while burning tokens, all without explaining to the user “yeah, we’re spending way more than we make off of you, this is an introductory rate.” This intentional, blatant and industry-wide deception set the terms for the Subprime AI Crisis. By selling AI services at $20 or $50 or even $200-a-month, AI startups and labs created the terms for their own destruction, with users trained for years to expect relatively unlimited access sold at a flat rate for a service powered by Large Language Models that burn tokens at arbitrary rates based on their inference of the user’s prompt, making costs near-impossible to moderate. And when these companies make changes to bring costs even slightly under control, their users react with revulsion, because rate limits aren’t price increases, but direct changes to the functionality of the product. Imagine if a subscription to a car service was $200-a-month, and let you go 50 miles, or 25 miles, or 100 miles, or 4 miles, or 12 miles depending on the day, and never at any point told you how many miles you had left beyond a percentage-based rate limit. To make matters worse, sometimes the car would arbitrarily take a different route, driving you five miles in the opposite direction, or decide to park at the curb, charging you for every mile. This is the reality of using an AI product in the year of our lord 2026. A Claude Code or OpenAI Codex user cannot say with any clarity that in three months their current workload or workflow will be possible on their current subscription. Somebody buying an annual subscription to any AI product is immediately sacrificing themselves to the whims of startup CEOs who intentionally decided to deceive users for years as a means of juicing growth. And when these limits decay, does it eventually make the ways in which some of these users work with Claude Code impossible? At what point do these rate limit shifts start changing how reliable the experience is and how much one can get done in a day? What use is a tool that gets less reliable and more expensive over time? Even if this week’s rate limits are an overcorrection, one has to imagine they resemble the future of Anthropic’s products, and are indicative of a larger pattern of decay in the value of its subscriptions. I’m going to be as blunt as possible: every bit of AI demand that exists — and barely $65 billion of it existed in 2025 — exists only because of subsidies, and if these companies were to charge a sustainable rate, said demand would evaporate. There is no righting this ship. There is no pricing that makes sense that customers will pay at scale, nor is there a magical technological breakthrough waiting in the wings that will reduce costs.
Vera Rubin will not save AI, nor will some sort of “too big to fail” scenario, because “too big to fail” was based on the fact that banks would have stopped providing dollars to people and insurance companies would have stopped issuing insurance. Despite NVIDIA’s load-bearing valuation and the constant discussion of companies like OpenAI and Anthropic, their actual economic footprint is quite small in comparison to the trillions of dollars of CDOs and trillion-plus dollars of mortgages involved in the Great Financial Crisis. The death of the AI industry would be cataclysmic to venture capitalists, bring about the end of the hypergrowth era for the Magnificent Seven, and may very well kill Oracle, but — seriously — that is nothing in comparison to the scale of the Great Financial Crisis. This isn’t me minimizing the chaos to follow, but trying to express how thoroughly fucked everything was in 2008. On Friday I’m going to get into this more in the premium. This wasn’t an intentional ad; I just realized as I wrote that sentence that that’s what I have to do. Anyway, I’ll close with a grim thought. What’s funny about the comparison to the subprime mortgage crisis is that there are, in all honesty, multiple different versions of the Stripper With Five Houses from The Big Short: All of these entities are acting based on a misplaced belief that the world will cater to them, and that nothing will ever change. While there might be different levels of cynicism — people who know there’re subsidies but assume they’ll be fine once the bill arrives, or people like Sam Altman who are already rich and don’t give a shit — I think everybody in the AI industry has deluded themselves into believing they have the mandate of Heaven. Back in August 2024, I named several pale horses of the AIpocalypse, and after absolutely fucking nailing the call two years early on OpenAI’s “big, stupid magic trick” of launching Sora to the public, I think it’s time to update them: Anyway, thanks for reading this piece. Data centers raise debt from either banks, private credit, private equity or “business development companies,” non-banking entities that borrow money from banks to lend to risky companies. In an analysis of 26 prominent data center deals, I found (back in December 2025) several names — Blue Owl, MUFG (Mitsubishi), Goldman Sachs, JP Morgan Chase, Morgan Stanley, SMBC (Sumitomo Mitsui) and Deutsche Bank — that come up regularly. AI labs (and AI startups) raise funding from venture capitalists (EG: Dragoneer (Anthropic, OpenAI, Perplexity) and Founders Fund (Anthropic, OpenAI)), hyperscalers (Google, Amazon, NVIDIA, Microsoft, all of whom have now invested in both OpenAI and Anthropic), sovereign wealth funds (GIC, Singapore’s sovereign wealth fund, invested in Anthropic), and even banks providing lines of credit, as they did for both Anthropic and OpenAI. Many of the big names in data center development (who I believe have all, in some way, backed CoreWeave) funded those lines of credit, including Morgan Stanley, SMBC, JPMorgan and MUFG. Those common names are points of failure, in particular SMBC and MUFG, two large Japanese banks that have aggressively loaned to just about every part of the AI economy. This pairs badly with the fact that the Japanese government is considering interest rate hikes thanks to the continuing chaos in the Middle East, which will make debt more expensive.
Venture capitalists are funded by limited partners (EG: pension funds, investment banks and wealthy individuals), and the venture capital industry is facing an historic liquidity crisis (IE: they can’t raise money and their investments aren’t selling), which means that it cannot sustain the AI industry forever. NVIDIA (and other hardware sellers to a much lesser extent) sells GPUs and the associated hardware to data center developers and hyperscalers. At around $42 million a megawatt between GPUs, data center and power construction, these data centers are almost entirely paid for using debt. This is the only link in the chain that is really profitable. Data center developers rent their GPUs to AI labs and hyperscalers. Developers, who raised $178.5 billion in debt in the US alone last year, must borrow heavily to fund buildouts, and because many of these projects are run by brand-new or relatively new developers, debt costs are higher. As a result, based on my premium data center model, many data center projects are unprofitable even with a paying customer, and that’s assuming they even get built. To make matters worse, as I discussed last week, only 5GW of data center capacity out of over 200GW announced is actually under construction globally, which means many of these loans are currently on interest-only payments. All evidence points to GPU compute being either a low or negative-margin business. CoreWeave — the largest, best-funded and NVIDIA-backed AI compute provider — had an operating margin of -6% and a net loss margin of -29% in 2025. CoreWeave’s largest customers are Microsoft, OpenAI and NVIDIA, which means that it should, in theory, be getting the best rate around. Hyperscalers like Google, Meta, Amazon, and Microsoft both rent GPUs from data center providers and rent GPUs to AI labs (as well as offering API access to some AI labs’ models — Google and Amazon sell Anthropic’s models, and Microsoft sells OpenAI’s models, its own models, and other models like xAI’s Grok). Hyperscalers steadfastly refuse to talk about their AI revenues, and do not break out costs. I would also put Oracle in this bucket. AI labs rent GPUs from either hyperscalers or data center companies to either train models or run inference (creating the outputs of models), sell access to models via their API, and offer subscription services to both consumer and business customers. Important detail: in almost every case, an AI lab must make an up-front commitment, likely with a prepayment, to secure future capacity. This means that AI labs are often having to pony up massive amounts of up-front capital on top of their incredibly high ongoing costs. Anthropic has made $5 billion in revenue and spent $10 billion on compute to date, and had to raise another $30 billion in February 2026 after raising about $16.5 billion in 2025 alone. Through September 2025, OpenAI made $4.3 billion in revenue and spent $8.67 billion on inference alone. Neither of these companies has a path to profitability. AI startups buy access to models via AI labs’ APIs, building services that have “AI features” powered by said models, paid on a per-million-token basis (for input tokens (user-fed data) and output tokens (model outputs)). Every single AI startup is unprofitable, and every AI startup functions by offering a service powered by AI models provided by AI labs. In every case that I’ve found, these providers always offer far more in token burn than the cost of their subscriptions.
Consumers and businesses pay for monthly subscriptions or, in some cases, API access to models. Customers paying for AI services in most cases pay for a monthly service, such as Anthropic’s Claude Pro or Max or Perplexity Pro/Max, running from $20 a month to $200 a month. These subscriptions for the most part mask the amount of tokens that you are actually burning as a customer, but in every single case that I’ve found, that amount is always in excess of the subscription cost. Cursor has, at this point, raised $3.36 billion, and turned it into, at best, about a billion dollars of revenue, and that’s assuming it grew linearly between periods versus (more likely) having up and down months. As AI labs grow, their costs increase dramatically, both in their immediate compute costs and in the demands from GPU providers for up-front cash to secure future compute allocation. In parallel, as AI startups grow, they burn more money per customer, which increases their dependence on venture capital. As this happens, AI labs are facing both a cash and a compute crush, which means they have to start either controlling the amount of compute customers use or making more money from serving it. AI labs are thus forced to raise prices on AI startups, either through tolls (priority processing) or raw cost increases. Another important detail: one of the ways that AI labs raise prices isn’t even through “making things more expensive,” but through selling access to models that burn more tokens. Think of this as the variable-rate mortgage of the Subprime AI Crisis. As AI labs raise prices on their AI startup clients, these startups are forced to reduce the quality of their services and/or raise their prices after years of getting their customers used to a significantly cheaper or better service, which makes their products less attractive, leading to customer churn. Worse still, these customers are used to subscriptions from Anthropic and OpenAI with remarkably generous rate limits that are impossible for even a well-capitalized AI startup to compete with, which means that these changes only slow the rate of burn rather than making these companies profitable. As a result, these AI startups are more dependent on venture capital. While OpenAI and Anthropic are pretty happy at the top of the food chain, they are also dependent on the existence of AI startups for revenue for their models, which means that while these price changes increase the amount of revenue they get in the short term, they invariably push AI startups toward cheaper open source models and death. AI labs have, this entire time, been massively subsidizing their own products. Per Forbes, AI coding platform Cursor has faced numerous problems competing with Anthropic, which it claims at one point let users burn $5,000 a month in tokens on a $200-a-month subscription, which reflects my own reporting from last year. Cursor also claims in the same article that its enterprise customers are profitable, but I call bullshit considering the multiple enterprise customers who have reached out to tell me they can burn $2 or $3 for every $1 of subscription. The problem is that a subsidy is always a losing proposition, which means that at some point Anthropic and OpenAI will have to massively reduce the amount of tokens that people use on their accounts. As I’ll get to later, this infuriates users and sends them running for the doors.
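A minimal sketch of that “variable-rate” mechanic: hold the per-token price flat and let a newer, reasoning-heavy model multiply the tokens a task consumes. The token counts here are illustrative assumptions, not measured figures:

```python
# The "variable-rate" mechanic: per-token prices stay flat (or even fall)
# while newer, reasoning-heavy models consume far more tokens per task.
# Token counts are illustrative assumptions, not measured figures.
price_per_million_output = 15.00   # dollars, held constant across both models

models = {
    "older model (little hidden reasoning)": 10_000,   # output tokens per task
    "newer model (heavy chain-of-thought)": 60_000,
}

for name, tokens in models.items():
    cost = tokens / 1e6 * price_per_million_output
    print(f"{name}: ${cost:.2f} per task")
# older model: $0.15 per task
# newer model: $0.90 per task, a 6x increase at an unchanged sticker price
```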
At some point, the cost of doing business with Anthropic and OpenAI will kill AI startups, as there is no point at which any of them become sustainable, which will in turn kill the revenue from selling access to their models. At some point, users will be forced to burn tokens at a rate that actually matches their subscription costs, which will reduce the value of the product, which will in turn reduce the amount of subscribers they will have. And at some point, Anthropic and OpenAI will be left with a bunch of compute reservations they’ve made that they don’t need and can’t afford due to mistimed growth projections. As Dario Amodei said back in February, there’s no hedge on Earth that could stop Anthropic from going bankrupt if it buys too much compute. And if the two largest customers of AI compute falter — there really isn’t even a distant third outside of xAI and the hyperscalers, the latter of which are predominantly standing up OpenAI and Anthropic (or, in Meta’s case, a bunch of unprofitable LLM bullshit) — who’s going to pay for all of those data centers? Fucking Aquaman? Users are inherently trained to expect a service that they pay for on a monthly basis, and their experience of said service is entirely separated from “token burn,” making it impractical, if not impossible, to get them to use models directly, or to apply rate limits. The longer a user has used the service, the more their habits orient around an “unlimited” or “partially limited” service, which means your only options are to raise prices or apply rate limits, with the only justification for either of them being “new models” (which are more expensive) or “we’re unable to afford to run our company,” which the user doesn’t give a shit about. They’re profitable on inference - no they are not! There is no proof of this statement anywhere! What’s your source here? Sam Altman saying it in August 2025? Dario Amodei saying he had gross margins of 50%? That was a “stylized fact” that he specifically said wasn’t about Anthropic, not that you care! What else have you got for me here? SemiAnalysis’ InferenceX? Gun to your head, explain to me how that’s the case. Oh, you’ve heard the companies do “batch processing”? Why is all that “batch processing” not making them profitable? I swear to god, if you say any shit about how these companies would be “profitable without training” I’m going to scream. No! AI training costs are not going away. They are an inherent part of running these companies, and are not capex. They are operating expenses. AI is being funded by the largest companies in the world with the healthiest balance sheets - I will obliterate you with the 100-Type Guanyin Bodhisattva! Microsoft is the only remaining hyperscaler that is funding the AI buildout without debt, and none of them will talk about AI revenues. This point is trotted out by imbeciles to try and say “this is nothing like the dot com bubble,” which I fundamentally agree with — it’s worse! It’s weirder! It’s a bigger waste! And they collectively need $2 trillion in brand-new AI revenues by 2030 for any of it to make sense! The cost of AI services is going down because the token prices are going down - you are a silly person! You do not actually understand anything! The price of tokens is not the same as the cost of serving them! OpenAI cut the price of its o3 reasoning model by 80% a few weeks after the release of Claude Opus 4. Do you think that happened because of magical cost reductions on the ops side? If so, I wish to study your brain.
It’s the gym model, they want people to subscribe and not use it, it’s the gym model, it’s the gym model- TZZZZT, whoops, looks like you got tased. Anthropic will announce that it’s “fixed the bug” (IE: eased rate limits it intentionally set) and apologize to the community, prolonging the inevitable. Rate limits will continue to decay over time, just at a slower pace. Anthropic keeps the limits where they are, and we hit a new normal that makes everybody really mad. The AI companies that only have customers because they spend $3 to $10 for every dollar of revenue. The venture capitalists that are ultra-rich on paper, heavily leveraging their firms in companies like Harvey (worth “$11 billion”) and Cursor (worth “$29.3 billion”) that burn hundreds of millions or billions of dollars and are now both too large to sell to another company and too shitty a company to take public. The AI labs that have built massive businesses on selling heavily-subsidized subscriptions to customers who don’t want to pay for them and API calls to AI startups that can only pay them if infinite resources exist. The AI data center companies that, thanks to readily-available debt, have started 200GW of projects (and only started building 5GW of them) for AI demand that doesn’t exist, entirely based on the theoretical sense that maybe it will in the future. Oracle, which is building hundreds of billions of dollars of data centers for OpenAI (which needs infinite resources to be able to pay its compute costs) and taking on equally-large amounts of debt, all because it assumes that nothing bad will ever happen. The customers of AI startups that are building lifestyles, identities and workflows on top of unsustainable AI subscriptions, believing that we’re “just at the beginning.” Any further price increases or service degradations from Anthropic and OpenAI are a sign that they’re running low on cash. Any reduction in capex from big tech is a sign that the AI bubble is bursting, as NVIDIA’s continued growth only comes from Microsoft, Google, Amazon, Meta, Oracle and other large companies buying tens of billions of dollars of servers from Taiwanese ODMs like Foxconn and Quanta. Any further price increases or service degradations from AI startups, such as Cursor, Perplexity, Harvey, Lovable or Replit. These are all token-intensive venture-hogs that burn $4 or $5 for every $1 of revenue. Any discussion of layoffs at AI companies. The collapse of a data center deal that has yet to commence construction. The collapse of a data center already in construction, but before it’s finished. The collapse of an already-constructed data center. CoreWeave or any major data center player having trouble raising debt, or failing to do so. We’ve already seen the beginnings of this with CoreWeave’s issues raising debt for its Lancaster, PA data center. The Further Collapse of Stargate Abilene: If anything happens to the construction of OpenAI’s flagship data center (being built by Oracle) in Abilene, Texas, you know shit is getting bad. Any problems or delays with OpenAI or Anthropic going public: both of these companies are the financial equivalent of Chernobyl, so I can only imagine it’ll take some talented accountants to get them in any shape where investors without lead poisoning actually want to get involved. Any problems with Blue Owl as an ongoing concern: Blue Owl is the loosest lender in the AI bubble, and if its borrowers fall behind on their loans or it has issues with its limited partners, that’s a bad sign too.
Any problems with SoftBank: SoftBank was somehow able to raise $40 billion in debt (payable in a year) to fund its chunk of OpenAI’s pseudo-$110 billion round, blowing past its promised 25% ratio of loans to the value of its assets. This puts SoftBank in a very precarious position. ARM’s stock tanking: A great deal of SoftBank’s wealth comes from its investment in ARM, including a $15 billion margin loan based on its stock. If ARM drops below $80, things are going to get hairy for Masayoshi Son. Any issues with NVIDIA’s customers’ ability to pay: If NVIDIA’s customers don’t reliably pay it, things will look bad come earnings season. NVIDIA misses on earnings: This is an obvious one, but I think the markets will crap their pants if NVIDIA misses on earnings estimates.

HeyDingus 2 weeks ago

I’m returning my Studio Display XDR and buying another one

Sooo… I did a thing. I couldn’t help but be slightly dissatisfied by the clarity of my Studio Display XDR’s nano-texture display. It just made everything look a little less than Retina-quality. And for this price, I don’t want to have lingering regrets each time I use it. So, I ordered a second, non-nano-texture version, banking on Apple’s generous return policy. It came in today. I set it up about 30 minutes ago. I put the two displays side by side and… it’s no question. The nano-texture is going back. Showing the same content on each display, at the same brightness level, I can absolutely see the fuzziness introduced by the “matte” display. It’s not that nano-texture is all bad. I love how it looks when the display is dark — there are zero reflections. 1 But the point is to enjoy it while the display is on. Without nano-texture, everything is as crisp as I had hoped. I tend to lean toward the display when I’m concentrating, and even close up, the display is razor sharp. I technically have until April 9th to send back the nano-texture XDR, but, honestly, I think I’m going to package it up tonight. Well… maybe tomorrow. I might as well enjoy having 10k pixels of display at my disposal while I can. If I hold onto the original display until the last day that I can send it back, I will have had it for 24 days. That’s a full 10 extra days beyond the stated 14-day return period. It’s possible that I could have squeezed in even a few more days by initiating the return on the 14th day after it was delivered instead of the 11th. With that in mind, one could get nearly a month of use for testing and comparing Apple’s products, with the ability to return them (free shipping both ways) for a full refund. That’s serious commitment to customer satisfaction, and one area where Apple’s standards haven’t slipped. To boot, by paying with Apple Card’s Monthly Installments (which allow you to pay for an item over 12 months with 0% interest), I’ve only been charged $287.92 for the nano-texture display, and $263.92 for the regular one. I think that was just the taxes for each one. To be sure, it’s a privileged position I’m in to be able to do these shenanigans, but there’s a lot to be said for how easy Apple has made it to purchase even its most expensive products with very little risk. If I were in an environment with light sources behind me, my decision might be very different. I think there’s definitely a place for this non-reflective display — it’s just not in my home office. ↩︎ HeyDingus is a blog by Jarrod Blundy about technology, the great outdoors, and other musings. If you like what you see — the blog posts, shortcuts, wallpapers, scripts, or anything — please consider leaving a tip, checking out my store, or just sharing my work. Your support is much appreciated! I’m always happy to hear from you on social, or by good ol’ email.

iDiallo 2 weeks ago

How we get radicalized in America

Be healthy, be young, fall ill. You have a great job, of course; you have insurance. It would be OK if the worst thing about health insurance in America was that it’s hard to navigate. No! The actual problem is that your insurance is incentivized not to cover you at your most vulnerable moment. You pay them every month. That’s money that goes from your paycheck into their pockets. Now if they cover you, that’s money that leaves their pockets and goes into your treatment. There are two ways they can make money. 1. You continue paying every month, and never fall ill. 2. You fall ill, and they deny you care. Only the second option is an active option. Health insurance is a scam that we have normalized in the United States. It helps no one, it makes healthcare unaffordable, and you have to fight tooth and nail to get any sort of care. When Luigi was in the headlines, and news anchors were asking how such a young man could get radicalized, I shook my head. In America, it is our tradition to work two jobs. It is our tradition to live paycheck to paycheck. And it is our tradition to get radicalized the moment we get sick. When you get sick, the healthcare industry tries to charge as much as it can get away with, and the insurance industry tries to deny as much as it can.


The AI Industry Is Lying To You

Hi! If you like this piece and want to support my independent reporting and analysis, why not subscribe to my premium newsletter? It’s $70 a year, or $7 a month, and in return you get a weekly newsletter that’s usually anywhere from 5,000 to 18,000 words, including vast, detailed analyses of NVIDIA, Anthropic and OpenAI’s finances, and the AI bubble writ large. I just put out a massive Hater’s Guide To The SaaSpocalypse, as well as the Hater’s Guide to Adobe. It helps support free newsletters like these! The entire AI bubble is built on a vague sense of inevitability — that if everybody just believes hard enough that none of this can ever, ever go wrong, at some point all of the very obvious problems will just go away. Sadly, one cannot beat physics. Last week, economist Paul Kedrosky put out an excellent piece centered around a chart that showed new data center capacity additions (as in additions to the pipeline, not brought online) halved in the fourth quarter of 2025 (per data from Wood Mackenzie): Wood Mackenzie’s report framed it in harsh terms: As I said above, this refers only to capacity that’s been announced rather than stuff that’s actually been brought online, and Kedrosky missed arguably the craziest chart — that of the 241GW of disclosed data center capacity, only 33% is actually under active development. The report also adds that the majority of committed power (58%) is for “wires-only utilities,” which means the utility provider is only responsible for getting power to the facility, not generating the power itself, which is a big problem when you’re building entire campuses made up of power-hungry AI servers. WoodMac also adds that PJM, the largest grid operator in America, “...remains in trouble, with utility large load commitments three times as large as the accredited capacity in PJM’s risked generation queue,” which is a complex way of saying “it doesn’t have enough power.” This means that fifty-eight god damn percent of data centers need to work out their own power somehow. WoodMac also adds that there is around $948 billion in capex being spent in total on US-based data centers, but capex growth decelerated for the first time since 2023. Kedrosky adds: Let’s simplify: The term you’re looking for there is data center absorption, which is (to quote Data Center Dynamics) “...the net growth in occupied, revenue-producing IT load,” which grew in America’s primary markets from 1.8GW of new capacity in 2024 to 2.5GW of new capacity in 2025, according to CBRE. The problem is, this number doesn’t actually capture newly-turned-on data centers. Somebody expanding an existing project to take on another 50MW still counts as “new absorption.” Things get more confusing when you add in other reports. Avison Young’s reports about data center absorption found 700MW of new capacity in Q1 2025, 1.173GW in Q2, a little over 1.5GW in Q3 and 2.033GW in Q4 (I cannot find its Q3 report anywhere), for a total of 5.44GW, entirely in “colocation,” meaning buildings built to be leased to others.
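For what it’s worth, those quarterly figures are easy to add up. Here’s a minimal sketch; Q3’s exact figure isn’t given, so I’ve assumed 1.5GW, which is why the total lands slightly under the 5.44GW quoted:

```python
# Avison Young's 2025 quarterly colocation absorption, as cited above (in MW).
# Q3 is described only as "a little over 1.5GW," so 1,500 is my assumption.
quarters = {"Q1": 700, "Q2": 1_173, "Q3": 1_500, "Q4": 2_033}

total_gw = sum(quarters.values()) / 1_000
print(f"Total 2025 absorption: {total_gw:.2f} GW")  # ≈ 5.41 GW; the piece says 5.44
```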
Yet there’s another problem with that methodology: these are facilities that have been “delivered” or have a “committed tenant.” “Delivered” could mean “the facility has been turned over to the client, but it’s literally a powered shell (a warehouse) waiting for installation,” or it could mean “the client is up and running.” A “committed tenant” could mean as little as “we’ve signed a contract and we’re raising funds” (as is the case with Nebius raising money off of a Meta contract to build data centers at some point in the future). We can get a little closer by using the definitions from DataCenterHawk (from which Avison Young gets its data), which defines absorption as follows: That’s great! Except Avison Young has chosen to define absorption in an entirely different way — that a data center (in whatever state of construction it’s in) has been leased, or “delivered,” which means either “a fully ready-to-go data center” or “an empty warehouse with power in it.” CBRE, on the other hand, defines absorption as “net growth in occupied, revenue-producing IT load,” and is inclusive of hyperscaler data centers. Its report also includes smaller markets like Charlotte, Seattle and Minneapolis, adding a further 216MW in absorption of actual new, existing, revenue-generating capacity. So that’s about 2.716GW of actual, new data centers brought online. It doesn’t include areas like Southern Virginia or Columbus, Ohio — two massive hotspots from Avison Young’s report — and I cannot find a single bit of actual evidence of significant revenue-generating, turned-on, real data center capacity being stood up at scale. DataCenterMap shows 134 data centers in Columbus, but as of August 2025, the Columbus area had around 506MW in total according to the Columbus Dispatch, though Cushman and Wakefield claimed in February 2026 that it had 1.8GW. Things get even more confusing when you read that Cushman and Wakefield estimates that around 4GW of new colocation supply was “delivered” in 2025, a term it does not define in its actual report, which for whatever reason lacks absorption numbers. Its H1 2025 report, however, includes absorption numbers that add up to around 1.95GW of capacity…without defining absorption, leaving us with exactly the same problem we have with Avison Young. Nevertheless, based on these data points, I’m comfortable estimating that North American data center absorption — as in the IT load of data centers actually turned on and in operation — was at around 3GW for 2025, which would work out to about 3.9GW of total power. And that number is a fucking disaster. Earlier in the year, TD Cowen’s Jerome Darling told me that GPUs and their associated hardware cost about $30 million a megawatt. 3GW of IT load (as in the GPUs and their associated gear’s power draw) works out to around $90 billion of NVIDIA GPUs and the associated hardware, which would be covered under NVIDIA’s “data center” revenue segment: America makes up about 69.2% of NVIDIA’s revenue, or around $149.6 billion in FY2026 (which runs, annoyingly, from February 2025 to January 2026). NVIDIA’s overall data center segment revenue was $195.7 billion, which puts America’s data center purchases at around $135 billion, leaving around $44 billion of GPUs and associated technology uninstalled. With the acceleration of NVIDIA’s GPU sales, it now takes about six months to install and operationalize a single quarter’s worth of sales.
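The math above is easy to check. Here’s a minimal sketch of it; the $30M-per-megawatt figure, the ~1.3 ratio between total facility power and IT load, and NVIDIA’s disclosed splits all come from the reporting cited in this piece, while the rounding (and the resulting ~$45B rather than $44B) is mine:

```python
# Back-of-the-envelope version of the math above.
# Dollars in billions, power in megawatts.

GPU_COST_PER_MW_B = 0.030   # ~$30M of GPUs and gear per MW of IT load (TD Cowen, as cited)
POWER_PER_IT_LOAD = 1.3     # ~1.3x ratio of total facility power to IT load

it_load_mw = 3_000                                    # ~3GW actually turned on in 2025
total_power_mw = it_load_mw * POWER_PER_IT_LOAD       # ≈ 3,900 MW of facility power
installed_gpus_b = it_load_mw * GPU_COST_PER_MW_B     # ≈ $90B of installed GPUs

us_share = 0.692                                      # America's share of NVIDIA revenue
dc_segment_b = 195.7                                  # FY2026 data center segment revenue
us_dc_purchases_b = dc_segment_b * us_share           # ≈ $135B

uninstalled_b = us_dc_purchases_b - installed_gpus_b  # ≈ $45B (the piece rounds to ~$44B)

print(f"Total power needed: {total_power_mw:,.0f} MW")
print(f"Installed GPUs: ${installed_gpus_b:.0f}B")
print(f"US data center purchases: ${us_dc_purchases_b:.0f}B")
print(f"Uninstalled: ${uninstalled_b:.0f}B")
```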
Because these are Blackwell (and I imagine some of the next-generation Vera Rubin) GPUs, they are more than likely going to new builds thanks to their greater power and cooling requirements, and while some could in theory be going to old builds retrofitted to fit them, NVIDIA’s increasingly-centralized (as in focused on a few very large customers) revenue heavily suggests the presence of large resellers like Dell or Supermicro (which I’ll get to in a bit) or the Taiwanese ODMs like Foxconn and Quanta, who manufacture massive amounts of servers for hyperscaler buildouts. I should also add that it’s commonplace for hyperscalers to buy the GPUs for their colocation partners to install, which is why Nebius and Nscale and other partners never raise more than a few billion dollars to cover construction costs. It’s becoming very obvious that data center construction is dramatically slower than NVIDIA’s GPU sales, which continue to accelerate dramatically every single quarter. Even if you think AI is the biggest, most hugest, and most special boy: what’s the fucking point of buying these things two to four years in advance? Jensen Huang is announcing a new GPU every year! By the time they actually get all the Blackwells in, Vera Rubin will be two years old! And by the time we install those Vera Rubins, some other new GPU will be beating it! Before we go any further, I want to be clear how difficult it is to answer the question “how long does a data center take to build?” You can’t really say “[time] per megawatt” because things become ever-more complicated with every 100MW or so. As I’ll get into, it’s taken Stargate Abilene two years to hit 200MW of power. Not IT load. Power. Anyway, the question of “how much data center capacity came online?” is pretty annoying too. Sightline’s research — which estimated that “almost 6GW of [global data center power] capacity came online last year” — found that while 16GW of capacity was slated to come online in 2026 across 140 projects, only 5GW is currently under construction, and somehow doesn’t say that “maybe everybody is lying about timelines.” Sightline believes that half of 2026’s supposed data center pipeline may never materialize, with 11GW of capacity in the “announced” stage with “...no visible construction progress despite typical build timelines of 12-18 months.” “Under construction” can also mean anything from “a single steel beam” to “nearly finished.” These numbers also are based on 5GW of capacity, meaning about 3.84GW of IT load, or about $111.5 billion in GPUs and associated gear (roughly 57.5% of NVIDIA’s FY2026 data center revenue) that’s actually getting built. Sightline (and basically everyone else) argues that there’s a power bottleneck holding back data center development, and Camus explains that the biggest problem is a lack of transmission capacity (the amount of power that can be moved) and power generation (creating the power itself): Camus adds that America also isn’t really prepared to add this much power at once: Nevertheless, I also think there’s another, more obvious reason: it takes way longer to build a data center than anybody is letting on, as evidenced by the fact that we only added 3GW or so of actual capacity in America in 2025. NVIDIA is selling GPUs years into the future, and its ability to grow, or even just maintain its current revenues, depends wholly on its ability to convince people that this is somehow rational. Let me give you an example.
OpenAI and Oracle’s Stargate Abilene data center project was first announced in July 2024 as a 200MW data center. In October 2024, the joint venture between Crusoe, Blue Owl and Primary Digital Infrastructure raised $3.4 billion, with the 200MW of capacity due to be delivered “in 2025.” A mid-2025 presentation from land developer Lancium said it would have “1.2GW online by YE2025.” In a May 2025 announcement, Crusoe, Blue Owl, and Primary Digital Infrastructure announced the creation of a $15 billion joint vehicle, and said that Abilene would now be 8 buildings, with the first two buildings being energized by the “first half of 2025,” and the rest “energized by mid-2026.” Each building would have 50,000 GPUs, and the total IT load is meant to be 880MW or so, with a total power draw of 1.2GW. I’m not interested in discussing OpenAI not taking the supposedly-planned extensions to Abilene, because that capacity never existed and was never going to happen. In December 2025, Oracle stated that it had “delivered” 96,000 GPUs, and in February, Oracle was still only referring to two buildings, likely because that’s all that’s been finished. My sources in Abilene tell me that Building Three is nearly done, but…this thing is meant to be turned on in mid-2026. Developer Mortensen claims the entire project will be completed by October 2026, which it obviously, blatantly won’t. I hate to speak in conspiratorial terms, but this feels like a blatant coverup with the active participation of the press. CNBC reported in September 2025 that “the first data center in $500 billion Stargate project is open in Texas,” referring to a data center with an eighth of its IT load operational as “online” and “up and running,” with Crusoe adding two weeks later that it was “live,” “up and running” and “continuing to progress rapidly,” all so that readers and viewers would think “wow, Stargate Abilene is up and running” despite it being months if not years behind schedule. At its current rate of construction, Stargate Abilene will be fully built sometime in late 2027. Oracle’s Port Washington Data Center, as of March 6, 2026, consisted of a single steel beam. Stargate Shackelford, Texas broke ground on December 15, 2025, and as of December 2025, construction barely appears to have begun at Stargate New Mexico. Meta’s 1GW data center campus in Indiana only started construction in February 2026. And, despite Microsoft trying to mislead everybody into thinking its Wisconsin data center had “arrived” and “been built,” looking even an inch deeper suggests very little has actually come online — and, considering the first data center cost $3.3 billion (remember: $14 million a megawatt just for construction), I imagine Microsoft has successfully brought online about 235MW of power for Fairwater. What Microsoft wants you to think is that it brought online gigawatts of power (always referred to in the future tense), because Microsoft, like everybody else, is building data centers at a glacial pace, because construction takes forever, even if you have the power, which nobody does! The concept of a hundred-megawatt data center is barely a few years old, and I cannot actually find a built, in-service gigawatt data center of any kind, just vague promises about theoretical Stargate campuses built for OpenAI, a company that cannot afford to pay its bills. Everybody keeps yammering on about “what if data centers don’t have power” when they should be thinking about whether data centers are actually getting built.
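That 235MW Fairwater estimate above is just division. A minimal sketch, assuming the ~$14 million-per-megawatt construction figure and the reported $3.3 billion cost:

```python
# Implied power from Microsoft's reported Fairwater spend.
cost_per_mw = 14_000_000        # ~$14M per MW, construction only (as cited)
reported_cost = 3_300_000_000   # ~$3.3B for the first building

implied_mw = reported_cost / cost_per_mw
print(f"Implied power: ~{implied_mw:.0f} MW")  # ≈ 236 MW (the piece rounds to 235)
```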
Microsoft proudly boasted in September 2025 about its intent to build “the UK’s largest supercomputer” in Loughton, England with Nscale, and as of March 2026, it’s literally a scaffolding yard full of pylons and scrap metal. Stargate Abilene has been stuck at two buildings for upwards of six months. Here’s what’s actually happening: data center deals are being funded by eager private credit gargoyles that don’t know shit about fuck. These deals are announced, usually by overly-eager reporters that don’t bother to check whether the previous data centers ever got built, as massive “multi-gigawatt deals,” and then nobody follows up to check whether anything actually happened. All that anybody needs to fund one of these projects is an eager-enough financier and a connection to NVIDIA. All Nebius had to do to raise $3.75 billion in debt was sign a deal with Meta for data center capacity that doesn’t exist and will likely take three to four years to build (it’s never happening). Nebius has yet to finish its Vineland, New Jersey data center for Microsoft, which was meant to be “at 100MW” by the end of 2025, but appears to have only had 50MW (the first phase) available as of February 2026. I’m just gonna come out and say it: I think a lot of these data center deals are trash, will never get built, and thus will never get paid. The tech industry has taken advantage of an understandable lack of knowledge about construction or power timelines in the media to pump out endless stories about “data center capacity in progress” as a means of obfuscating an ever-growing scandal: that hundreds of billions of dollars of NVIDIA GPUs got sold to go in projects that may never be built. These things aren’t getting built, or if they’re getting built, it’s taking way, way longer than expected, which means that interest on that debt is piling up. The longer it takes, the less rational it becomes to buy further NVIDIA GPUs — after all, if data centers are taking anywhere from 18 months to three years to build, why would you be buying more of them? Where are you going to put them, Jensen? This also seriously brings into question the appetite that private credit and other financiers have for funding these projects, because much of the economic potential comes from the idea that these projects get built and have stable tenants. Furthermore, if the supply of AI compute is a bottleneck, this suggests that when (or if) that bottleneck is ever cleared, there will suddenly be a massive supply glut, lowering the overall value of the data centers in progress…which are, by the way, all filled with Blackwell GPUs, which will be two or three years old by the time the data centers are finally turned on. That’s before you get to the fact that the ruinous debt behind AI data centers makes them all remarkably unprofitable, or that their customers are AI startups that lose hundreds of millions or billions of dollars a year, or that NVIDIA is the largest company on the stock market, with a valuation that rests on a data center construction boom that appears to be decelerating, and that, even if it weren’t, is operating at a glacial pace compared to NVIDIA’s sales. Not to sound unprofessional or nothing, but what the fuck is going on? We have 241GW of “planned” capacity in America, of which only 79.5GW is “under active development,” but when you dig deeper, only 5GW of capacity is actually under construction? The entire AI bubble is a god damn mirage.
Every single “multi-gigawatt” data center you hear about is a pipe dream, little more than a few contracts and some guys with their hands on their hips saying “brother we’re gonna be so fuckin’ rich!” as they siphon money from private credit — and, by extension, you, because where does private credit get its capital from? That’s right. A lot comes from pension funds and insurance companies. Here’s the reality: data centers take forever. Every hyperscaler and neocloud talking about “contracted compute” or “planned capacity” may as well be telling you about their planned dinners with The Grinch and Godot. The insanity of the AI buildout will be seen as one of the largest wastes of capital of all time (to paraphrase JustDario), and I anticipate that the majority of the data center deals you’re reading about will simply never get built. The fact that there’s so much data about data center construction and so little data about completed construction suggests that those preparing the reports are in on the con. I give credit to CBRE, Sightline and Wood Mackenzie for having the courage to even lightly push back on the narrative, even if they do so by obfuscating terms like “capacity” or “power” in ways that reporters and other analysts are sure to misinterpret. Hundreds of billions of dollars have been sunk into buying GPUs, in some cases years in advance, to put into data centers that are being built at a rate that means NVIDIA’s 2025 and 2026 revenues will take until 2028 or 2029 to actually operationalize, and that’s making the big assumption that any of it actually gets built. I think it’s also fair to ask where the money is actually going. 2025’s $178.5 billion in US-based data center deals doesn’t appear to be resulting in any immediate (or even future) benefit to anybody involved. I also wonder whether the demand actually exists to make any of this worthwhile, or what people are actually paying for this compute. If we assume 3GW of IT load capacity was brought online in America, that should (theoretically) mean tens of billions of dollars of revenue thanks to the “insatiable demand for AI” — except nobody appears to be showing massive amounts of revenue from these data centers. Applied Digital only had $144 million in revenue in FY2025 (and lost $231 million making it). CoreWeave, which claimed to have “850MW of active power” (or around 653MW of IT load) at the end of 2025 (up from 420MW in Q1 FY2025, or 323MW of IT load), made $5.13 billion of revenue (and lost $1.2 billion before tax) in FY2025. Nebius? $228 million, for a loss of $122.9 million, on 170MW of active power (or around 130MW of IT load). Iren lost $155.4 million on $184.7 million of revenue last quarter, and that’s with a release of deferred tax liabilities of $182.5 million. Equinix made about $9.2 billion in revenue in its last fiscal year, and while it made a profit, it’s unclear how much of that came from its large and already-existent data center portfolio, though it’s likely a lot, considering Equinix is boasting about its “multi-megawatt” data center plans with no discussion of its actual capacity. And, of course, Google, Amazon, and Microsoft refuse to break out their AI revenues. Based on my reporting from last year, OpenAI spent about $8.67 billion on Azure through September 2025, and Anthropic around $2.66 billion in the same period on Amazon Web Services.
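To put those disclosed numbers side by side, here’s a minimal sketch comparing revenue to active IT load for the two neoclouds that disclose both. The ~1.3 power-to-IT-load ratio is the same one used throughout this piece, and the comparison is crude by design, since it sets full-year revenue against end-of-year capacity:

```python
# FY2025 revenue vs. active IT load, from the disclosures cited above.
# Crude on purpose: full-year revenue against end-of-year capacity.
neoclouds = {
    "CoreWeave": {"revenue_m": 5_130, "it_load_mw": 653},  # $5.13B on ~653MW
    "Nebius":    {"revenue_m": 228,   "it_load_mw": 130},  # $228M on ~130MW
}

for name, d in neoclouds.items():
    per_mw_m = d["revenue_m"] / d["it_load_mw"]
    print(f"{name}: ~${per_mw_m:.1f}M of revenue per MW of IT load")

# CoreWeave: ~$7.9M/MW. Nebius: ~$1.8M/MW. Against ~$30M/MW of GPU capex
# (before construction, power, and debt service), neither looks like
# "insatiable demand."
```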
Given that OpenAI and Anthropic are the two largest consumers of AI compute, those numbers heavily suggest that the actual demand for AI services is pretty weak, and mostly taken up by a few companies (or hyperscalers running their own services). At some point reality will set in and spending on NVIDIA GPUs will have to decline. It’s truly insane how much has been invested so many years ahead of need, and it’s remarkable that nobody else seems this concerned. Simple questions like “where are the GPUs going?” and “how many actual GPUs have been installed?” are left unanswered as article after article gets written about massive, multi-billion dollar compute deals for data centers that won’t be built before, at this rate, 2030. And I’d argue it’s convenient to blame this solely on power issues, when the reality is clearly based on construction timelines that never made any sense to begin with. If it were just a power issue, more data centers would be near or at the finish line, waiting for power to be turned on. Instead, well-known projects like Stargate Abilene are built at a glacial pace as eager reporters claim that a quarter of the buildings being functional, nearly a year after they were meant to be turned on, is some sort of achievement. Then there’s the very, very obvious scandal that NVIDIA, the largest company on the stock market, is making hundreds of billions of dollars of revenue on chips that aren’t being installed. It’s fucking strange, and I simply do not understand how it keeps beating and raising expectations every quarter given that the majority of its customers likely won’t be able to put their current purchases to use until well into the next decade. Assuming that Vera Rubin actually ships in 2026, it’s reasonable to believe that people will be installing these things well into 2028, if not further, and that’s assuming everything doesn’t collapse by then. Why would you bother? What’s the point, especially if you’re sitting on a pile of Blackwell GPUs? Why are we doing any of this? Last week also featured a truly bonkers story about Supermicro, a reseller of GPUs used by CoreWeave and Crusoe, where co-founder Wally Liaw and several other co-conspirators were arrested for selling hundreds of millions of dollars of NVIDIA GPUs to China, with the intent to sell billions more. Liaw previously resigned in a 2018 accounting scandal, when Supermicro couldn’t file its annual reports, only to be (per Hindenburg Research’s excellent report) rehired in 2021 as a consultant, and restored to the board in 2023, per a filed 8-K. Mere days before his arrest, Liaw was parading around NVIDIA’s GTC conference, pouring unnamed liquids into ice luges and standing two people away from NVIDIA CEO Jensen Huang. Liaw was also seen congratulating the CEO of Lambda on its new CFO appointment on LinkedIn, as well as shaking hands (along with Supermicro CEO Charles Liang, who has not been arrested or indicted) with Chase Lochmiller, CEO of Crusoe, the company building OpenAI’s Abilene data center. Supermicro isn’t named in the indictment for reasons I imagine are perfectly normal and not related to keeping the AI party going. Nevertheless, Liaw and his co-conspirators are accused of shipping hundreds of millions of dollars’ worth of NVIDIA GPUs to China through a web of counterparties and brokers, with over $510 million of them shipped between April and mid-May 2025. While the indictment isn’t specific as to the breakdown, it confirms that some Blackwell GPUs made it to China, and I’d wager quite a few.
The mainstream media has already stopped thinking about this story, despite Supermicro being a huge reseller of NVIDIA gear, contributing billions of dollars of revenue, with at least $500 million of that apparently going to China. The fact that Supermicro wasn’t specifically named in the case is enough to erase the entire tale from their minds, along with any curiosity about how NVIDIA, and specifically Jensen Huang, didn’t know. This also isn’t even close to the only time this has happened. Late last year, Bloomberg reported on Singapore-based Megaspeed — a (to quote Bloomberg) “once-obscure spinoff of a Chinese gaming enterprise [that] evolved into the single largest Southeast Asian buyer of NVIDIA chips” — and highlighted odd signs that suggest it might be operating as a front for China. As a neocloud, Megaspeed rents out AI compute capacity like CoreWeave, and while NVIDIA (and Megaspeed) both deny any of their GPUs are going to China, Megaspeed, to quote Bloomberg, has “something of a Chinese corporate twin”: Bloomberg reported that Megaspeed imported goods “worth more than a thousand times its cash balance in 2023,” with two-thirds of its imports being NVIDIA products. The investigation got weirder when Bloomberg tried to track down specific circuit boards that NVIDIA had told the US government were in specific sites: Things get weirder throughout the article, with a Chinese company called “Shanghai Shuoyao” having a near-identical website and investor deck to Megaspeed, and with several of its “computing clusters under construction” actually being in China. Things get a lot weirder as Bloomberg digs in, including a woman called “Huang” who may or may not be the CEO of both Megaspeed and an associated company called “Shanghai Hexi” (which is owned by the Yangtze River Delta project), and who was also photographed sitting next to Jensen Huang at an event in Taipei in 2024. While all of this is extremely weird and suspicious, I must be clear there is no definitive answer as to what’s going on, other than that NVIDIA GPUs are absolutely making it to China, somehow. I also think that it would be really tough for Jensen Huang to not know about it, or for billions of dollars of GPUs to be somewhere without NVIDIA’s knowledge. Anyway, Supermicro CEO Charles Liang has yet to comment on Wally Liaw or his alleged co-conspirators, other than a statement from the company saying that their acts were “a contravention of the Company’s policies and compliance controls.” Jensen Huang does not appear to have been asked if he knew anything about this — not about Megaspeed, not about Supermicro, nor really any challenging question of any kind for the last few years of his life. Huang did, however, say back in May 2025 that there was “no evidence of any AI chip diversion,” and that the countries in question “monitor themselves very carefully.” For legal reasons I am going to speak very carefully: I cannot say that Jensen is wrong, or lying, but I think it’s incredible, remarkable even, that he had no idea that any of this was going on. Really? Hundreds of millions if not billions of dollars of GPUs are making it to China — as reported by The Information in December 2025 — and Jensen Huang had no idea? I find that highly unlikely, though I obviously can’t say for sure.
In the event that NVIDIA had knowledge — which I am not saying it did, of course — this is a huge scandal that, for the most part, nobody has bothered to keep an eye on outside of a few brave souls at The Information and Bloomberg who give a shit about the truth. Has anybody bothered to ask Jensen about this? People talk to him on camera all the time. I’ll also add that I am shocked that so many people are just shrugging and moving on from Supermicro, which is a major supplier of two of the major neoclouds (Crusoe and CoreWeave) and one of the minors (Lambda, which it also rents cloud capacity to). The idea that a company had no idea that several percentage points of its revenue were flowing directly to China via one of its co-founders is an utter joke. I hope we eventually find out the truth. Nevertheless, this kind of underhanded bullshit is a sign of desperation on the part of just about everybody involved. So, I want to explain something very clearly for you, because it’s important you understand how fucked up shit has become: hyperscalers are forcing everybody in their companies to use AI tools as much as possible, tying compensation and performance reviews to token burn, and actively encouraging non-technical people to vibe-code features that actually reach production. In practice, this means that everybody is expected to dick around with AI tools all day, with the expectation that you burn massive amounts of tokens and, in the case of designers working in some companies, actively code features without ever knowing a line of code. How do I know the last part? Because a trusted source told me, and I’ll leave it at that. One might be forgiven for thinking this means that AI has taken a leap in efficacy, but the actual outcomes are a labyrinth of half-functional internal dashboards that measure random user data or convert files, spending hours to save minutes of time at some theoretical point. While non-technical workers aren’t necessarily allowed to ship directly to production, their horrifying pseudo-software, coded without any real understanding of anything, is expected to be “fixed” by actual software engineers who are also expected to do their jobs. These tools also allow near-incompetent Business Idiot software engineers to do far more damage than they might have in the past. LLM use is relatively unrestrained (and actively incentivized) in at least one hyperscaler, with just about anybody allowed to spin up their own OpenClaw “AI agent” (read: a series of LLMs that allegedly can do stuff with your inbox or Slack for no clear benefit, other than their ability to delete all of your emails). In Meta’s case, this ended up causing a severe security breach: According to The Information, Meta systems storing large amounts of company and user-related data were accessible to engineers who didn’t have permission to see them, and the problem was marked a sec-1 incident, the second-highest level of severity on an internal scale that Meta uses to rank security incidents. The incident follows multiple problems caused at Amazon by its Kiro and Q LLMs. I quote Business Insider’s Eugene Kim: Despite the furious (and exhausting) marketing campaign around “the power of AI code,” I believe that these events are just the beginning of the true consequences of AI coding tools: the slow destruction of the tech industry’s software stack.
LLMs allow even the most incompetent dullard to do an impression of a software engineer, by which I mean you can tell one “make me software that does this” or “look at this code and fix it,” and said LLM will spend the entire time saying “you got this” and “that’s a great solution.” The problem is that while LLMs can write “all” code, that doesn’t mean the code is good, or that somebody can read the code and understand its intention (as these models do not think), or that having a lot of code is a good thing, either in the present or in the future of any company built on generative code. LLM-based code is often verbose, and rarely aligns with in-house coding guidelines and standards, guaranteeing that it’ll take far longer to chew through, which naturally means that those burdened with reviewing it will either skim-read it or feed it into another LLM to work out what the hell to do. Worse still, LLM use is also entirely directionless. Why is anybody at Meta using an OpenClaw? What is the actual thing that OpenClaw does, other than burn an absolute fuck-ton of tokens? Think about this very, very simply for a second: you have given every engineer in the company the explicit remit to write all their code using LLMs, and incentivized them to do so by making sure their LLM use is tracked. You have now massively increased both the operating costs of the company (through token burn) and the volume of code being created. To be explicit, allowing an LLM to write all of your code means that you are no longer developing code, nor are you learning how to develop code, nor are you going to become a better software engineer as a result. This means that, across almost every major tech company, software engineers are being incentivized to stop learning how to write software or solve software architecture issues. If you are just a person looking at code, you are only as good as the code the model makes, and as Mo Bitar recently discussed, these models are built to galvanize you, glaze you, and tell you that you’re remarkable as you barely glance at globs of overwritten code that, even if it functions, eventually grows into a whole built with no intention or purpose other than what the model generated from your prompt. Things only get worse when you add in the fact that hyperscalers like Meta and Amazon love to lay off thousands of people at a time, which makes it even harder to work out why something was built the way it was built, and harder still when an LLM that lacks any thoughts or intentions built it. Entire chunks of multi-trillion dollar market cap companies are being written with these things, prompted by engineers (and non-engineers!) who may or may not be at the company in a month or a year to explain what prompts they used. We’re already seeing the consequences! Amazon lost hundreds of thousands of orders! Meta had a major security breach! The foundations of these companies are being rotted away by millions of lines of slop-code that, at best, occasionally gets the nod from somebody who has “software engineer” on their resume, and these people keep being fired too, lowering the likelihood that somebody who knows what’s going on, or why something is built a certain way, will be able to stop something bad from happening.
Remember: Google, Amazon, Microsoft, and Meta all hold vast troves of personal information, intimate conversations, serious legal documents, financial information, in some cases even social security numbers, and all four of them, along with a worrying chunk of the tech industry, are actively encouraging their software engineers to stop giving a fuck about software. Oh, you’re so much faster with AI code? What does that actually mean? What have you built? Do you understand how it works? Did you look at the code before it shipped, or did you assume that it was fine because it didn’t break? This is creating a kind of biblical plague within software engineering — an entire tech industry built on reams of unmanageable and unintentional code pushed by executives and managers that don’t do any real work. LLMs allow the incompetent to feign competence and the unproductive to produce work-adjacent materials borne of a loathing for labor and craftsmanship, and they lean into the worst habits of the dullards that rule Silicon Valley. All the Valley knows is growth, and “more” is regularly conflated with “valuable.” The New York Times’ Kevin Roose — in a shocking attempt at journalism — recently wrote a piece celebrating the competition within Silicon Valley to burn more and more tokens using AI models: Roose explains that both Meta and OpenAI have internal leaderboards that show how many tokens you’ve used, with one software engineer in Stockholm spending “more than his salary in tokens,” though Roose adds that his company pays for them. Roose describes a truly sick culture, one where OpenAI gives awards to those who spend a lot of money on tokens, adding that he spoke with several tech workers who were spending thousands of dollars a day on tokens “for what amount to bragging rights.” Roose also added one more insane detail: that one person found a loophole in Claude’s $20-a-month plan, using a piece of software made by Figma, that allowed them to burn $70,000 in tokens. Despite all of this burn, Roose struggled to find anybody who was able to explain what they were doing beyond “maintaining large, complex pieces of software using coding agents running in parallel,” but managed to actually find one particularly useful bit of information — that all of this might be performative: I do give Roose one point for wondering if “...any of these tokenmaxxers [were] producing anything good, or whether they [were] merely spinning their wheels churning out useless code in an attempt to look busy.” Good job, Kevin. That being said, I find this story horrifying, one that veers dangerously close to describing the behavior of drug addicts and cult followers. Throughout this story in one of the world’s largest newspapers, Roose fails to find a single “tokenmaxxer” making something that they can actually describe, which has largely been my experience of evaluating anyone who talks nonstop about the power of “agentic coding.” These people are sick, and are participating in a vile, poisonous culture based on needless expense and endless consumption. Companies incentivizing the amount of tokens you burn are actively creating a culture that mistakes excess for productivity, and incentivizing destructive tendencies built around constantly having to find stuff to do rather than doing things with intention. They are guaranteeing that their software will be poorly written and poorly maintained, all in the pursuit of “doing more AI” for no reason other than that everybody else appears to be doing so.
Anybody who actually works knows that the most productive-seeming people are often also the most useless, as they’re doing things to seem productive rather than producing anything of note. A great example of this is a recent Business Insider interview with a person who got laid off from Amazon after learning “AI” and “vibe coding,” and how surprised they were that these supposed skills didn’t make them safer from layoffs: To be clear, this person is a victim. They were pressured by Amazon to take up useless skills and build useless things in an expensive and inefficient way, and ended up losing their job despite taking up tools they didn’t like under duress. This person was, at one point, actively part of building an internal Amazon site using AI, and had to “learn to vibe code with a lot of trial and error” and the help of a colleague. Was this a good use of her time? Was this a good use of her colleague’s time? No! In fact, across all of these goddamn AI coding hype-beast Twitter accounts and endless proclamations about the incredible power of AI agents, I can find very few accounts of anything happening other than someone saying “yeah I’m more productive I guess.” I am certain that at some point in the near future a major big tech service is going to break in a way that isn’t immediately fixable as a result of thousands of people building software with AI coding tools, a problem compounded by the dual brain-drain forces of layoffs and a culture that actively empowers people to look busy rather than actually produce useful things. What else would you expect? You’re giving people a number that they can increase to seem better at their job. What do you think they’re going to do? Try to be efficient? Or use these things as much as humanly possible, even if there really isn’t a reason to? I haven’t even gotten to how expensive all of this must be, in part because it’s hard to fully comprehend. But what I do know is that big tech is setting itself up for crisis after crisis, especially when Anthropic and OpenAI stop subsidizing their models to the tune of allowing people to spend $2,500 or more on a $200-a-month subscription. What happens to the people who are dependent on these models? What happens to the people who forgot how to do their jobs because they decided to let AI write all of their code? Will they even be able to do their jobs anymore? Large Language Models are creating Silicon Valley Habsburgs — workers that are intellectually trapped at whatever point they started leaning on these models, models so subsidized that their bosses encouraged them to use them as much as humanly possible. While they might be able to claw their way back into the workforce, a software engineer that’s only really used LLMs for anything longer than a few months will have to relearn the basic habits of their job, and will find that their skills were limited to whatever the last training run of whatever model they last used happened to capture. I’m sure there are software engineers using these models ethically, who read all the code, who maintain complete ownership of it and use it as a means of handling very specific units of work. I’m also sure that there are some that are just asking it to do stuff, glancing at the code, and shipping it.
It’s impossible to measure how many of each camp there are, but hearing Spotify’s CEO say that its top developers are basically not writing code anymore makes me deeply worried, because this shit isn’t replacing software engineering at all — it’s mindlessly removing friction and putting the burden of “good” or “right” on a user that it’s intentionally gassing up. Ultimately, this entire era is a test of a person’s ability to understand and appreciate friction. Friction can be a very good thing. When I don’t understand something, I make an effort to do so, and the moment it clicks is magical. In the last three years I’ve had to teach myself a great deal about finance, accountancy, and the greater technology industry, and there have been so many moments where I’ve walked away from the page frustrated, stewing in self-doubt, certain I’d never understand something. I also have the luxury of time, and sadly, many software engineers face increasingly-deranged deadlines set by bosses that don’t understand a single fucking thing, let alone what LLMs are capable of or what responsible software engineering is. The push from above to use these models because they can “write code faster than a human” is a disastrous conflation of “fast” and “good,” all because of flimsy myths peddled by venture capitalists and the media about “LLMs being able to write all code.” Generative code is a digital ecological disaster, one that will take years to repair thanks to company remits to write as much code as fast as possible. Every single person responsible must be held accountable, especially for the calamities to come as lazily-managed software companies see the consequences of building their software on sand. In the end, everything about AI is built on lies. Hundreds of gigawatts of data centers in development equate to 5GW of actual data centers in construction. Hundreds of billions of dollars of GPU sales are mostly sitting waiting for somewhere to go. Anthropic’s constant flow of “annualized” revenues ended up equating to literally $5 billion in revenue in four years, on $25 billion or more in salaries and compute. Despite all of those data centers supposedly being built, nobody appears to be making a profit on renting out AI compute. AI’s supposed ability to “write all code” really means that every major software company is filling its codebase with slop while massively increasing its operating expenses. Software engineers aren’t being replaced — they’re being laid off because the software that’s meant to replace them is too expensive, while in practice not replacing anybody at all. Looking even an inch beneath the surface of this industry makes it blatantly obvious that we’re witnessing one of the greatest corporate failures in history. The smug, condescending army of AI boosters exists to make you look away from the harsh truth — AI makes very little revenue, lacks tangible productivity benefits, and seems to, at scale, actively harm the productivity and efficacy of the workers being forced to use it. Every executive forcing their workers to use AI is a ghoul and a dullard, one that doesn’t understand what actual work looks like, likely because they’re a lazy, self-involved prick.
Every person I talk to at a big tech firm is depressed, nagged endlessly to “get on board with AI,” to ship more, to do more, all without any real definition of what “more” means or what it contributes to the greater whole, all while constantly worrying about being laid off thanks to the truly noxious cultures that are growing around these services. AI is actively poisonous to the future of the tech industry. It’s expensive, unproductive, and actively damaging to the learning and efficacy of its users, depriving them of the opportunities to learn and grow, stunting them to the point that they know less and do less because all they do is prompt. Those that celebrate it are ignorant or craven, captured or crooked, or desperate to be the person to herald the next era, even if that era sucks, even if that era is inherently illogical, even if that era is fucking impossible when you think about it for more than two seconds. And in the end, AI is a test of your introspection. Can you tell when you truly understand something? Can you tell why you believe in something, other than that somebody told you you should, or made you feel bad for believing otherwise? Do you actually want to know stuff, or just have the ability to call up information when necessary? How much joy do you get out of becoming a better person? If you can’t answer that question with certainty, maybe you should just use an LLM, as you don’t really give a shit about anything. And in the end, you’re exactly the mark built for an AI industry that can’t sell itself without spinning lies about what it can (or theoretically could) do. Only 33% of announced US data centers are actually being built, with the rest in vague levels of “planning.” That’s about 79.53GW of power, or 61GW of IT load. “Active development” also refers to anything that is (and I quote) “...under development or construction,” meaning “we’ve got the land and we’re still working out what to do with it.” This is pretty obvious when you do the maths. 61GW of IT load would be hundreds of thousands of NVIDIA GB200 NVL72 racks — over a trillion dollars of GPUs at $3 million per 72-GPU rack — and based on the fact there were only $178.5 billion in data center debt deals last year, I don’t think many of these are actually being built right now. Even if they were, there’s not enough power for them to turn on. NVIDIA claims it will sell $1 trillion of GPUs between 2025 and 2027, and as I calculated previously, it sells about 1.6GW (in IT load terms, as in how much power just the GPUs draw) of GPUs every quarter, which would require at least 1.95GW of power just to run, once you include all the associated gear and the challenges of physically getting power. None of this data talks about data centers actually coming online.
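The footnote math above is easy to check. A minimal sketch, where the ~130kW-per-rack figure is my assumption based on NVIDIA’s published GB200 NVL72 specs, and the $3 million per rack and ~1.3 power ratio come from the piece:

```python
# Checking the footnote: 61GW of IT load, expressed in NVL72 racks and dollars.
active_dev_power_gw = 79.53   # "under active development" power (as cited)
power_per_it_load = 1.3       # same ratio used throughout the piece
rack_power_kw = 130           # assumption: ~130kW per GB200 NVL72 rack
rack_cost = 3_000_000         # ~$3M per 72-GPU rack (as cited)

it_load_gw = active_dev_power_gw / power_per_it_load   # ≈ 61 GW
racks = it_load_gw * 1_000_000 / rack_power_kw         # ≈ 470,000 racks
spend = racks * rack_cost                              # ≈ $1.4 trillion

print(f"IT load: {it_load_gw:.0f} GW")
print(f"NVL72 racks: {racks:,.0f}")
print(f"GPU spend: ${spend / 1e12:.1f} trillion")
```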

DHH 3 weeks ago

Denmark desperately needs more inequality

The Danish election is tomorrow. One of the central themes in the incumbent campaign has been a proposed wealth tax. The fig leaf for this proposal was "smaller classrooms in the early grades", but that quickly fell off, and the debate centered on "inequality". And it's true that inequality is a problem in Denmark: There's not nearly enough! I know that sounds sacrilegious. Even most of the business-friendly press and parties in Denmark dance around this topic. Which makes political sense, because the word "inequality" leads most people to think of poverty and destitution. But that's not the reality in the little kingdom that could. Denmark has an enormous state apparatus (half of GDP and a third of all workers!) that offers equal access to everything from health care to education and a million programs in between. It could surely be slimmed and trimmed, but on the whole, it works remarkably well. The average Dane is incredibly well cared for by any international standard (high-trust society, hurray!). By those same standards, it's the 8th most equal country in the world on income, as measured by the Gini coefficient (0.28). But this is where the numbers start spellbinding the debate. Because the Danish Gini coefficient perversely "degrades" if new businesses succeed, as any successful founder or highly paid employee earning above the median "worsens" inequality (a quick sketch at the end of this post makes that concrete). This is obviously nonsense. When the pie gets bigger, it gets better for all, as long as nobody is robbed of their existing slice. Denmark should clearly want new successful businesses! It should love to see founders reap big rewards when the risks pay off. It should celebrate early employees making fortunes on stock grants. But all too often, it just doesn't. Just to put it on a pin: Danes hate flashy cars with a passion that stretches back much further than the current green excuses. But buying a $300,000 Ferrari in Denmark is one of the most patriotic things you can possibly do! You'll end up paying almost three times the price for the privilege, and sending two-thirds of that to the treasury in taxes. Truly a contribution to the common cause worthy of admiration, not scorn! But because the debate around inequality is anchored in a fixed-pie paradigm, scorn is all you're likely to get. Anyone who does well in Denmark is immediately suspected of having succeeded at the expense of others. Probably through some form of nefarious exploitation, even if we can't prove what?! There is a core national politics of grievance and envy. But, however human that may be, the future progress and prosperity of the country depends on rejecting this delusional zero-sum dogma. The Danish economy is currently doing well compared to the rest of the EU, but it's dangerously dependent on a handful of vintage corporations pulling the bulk of the load. This simply has to change if the Danes wish to retain their high standard of living going forward. No corporation lasts forever. Novo Nordisk was Europe's most valuable company at the start of last year; now it's worth half that, and is out of the top ten. And who knows what the closing of the Strait of Hormuz will do to Maersk. These two companies alone represent roughly a quarter of all Denmark's exports! Meanwhile, new business formation just hit an all-time low. And only a tiny portion of the big employers in Denmark were created in the last thirty years. And thus, almost all the wealth that funds the highly-prized welfare state is coming from really old companies.
Many of them are over a hundred years old. This is wonderful in many ways. The Danes should be rightfully proud to host Maersk (1904), Novo (1923), Vestas (1945), Lego (1932), and other international heavyweights. But Denmark can't rely on this aging corporate vintage to forever bear fruit for tomorrow. Tomorrow needs to be tended to by planting new seeds. New companies. New growth. New capital. And that's just not going to happen if the Danish state declares itself at war with capital formation or accumulation. It should be so lucky as to have more rich people, with more capital, and the talent to deploy it toward a better, shared future (or spend it on heavily-taxed Ferraris!). The ballot boxes open tomorrow morning. It's predicted to be a close one. Fingers crossed for a prosperous choice.
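To make the fixed-pie point concrete, here's a minimal sketch with hypothetical incomes and the textbook Gini formula, showing how one founder's windfall "worsens" the Gini coefficient even though nobody else's slice shrinks:

```python
def gini(incomes):
    """Textbook Gini: mean absolute difference divided by twice the mean."""
    n = len(incomes)
    mean = sum(incomes) / n
    total_diff = sum(abs(a - b) for a in incomes for b in incomes)
    return total_diff / (2 * n * n * mean)

# A hypothetical five-person economy (incomes in arbitrary units).
before = [30, 40, 50, 60, 70]
# One person founds a successful company; everyone else keeps their slice.
after = [30, 40, 50, 60, 700]

print(f"Gini before: {gini(before):.2f}")  # 0.16
print(f"Gini after:  {gini(after):.2f}")   # 0.62: "worse", though nobody lost a krone
```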


The Beginning Of History

Hi! If you like this piece and want to support my work, please subscribe to my premium newsletter. It’s $70 a year, or $7 a month, and in return you get a weekly newsletter that’s usually anywhere from 5,000 to 18,000 words, including vast, extremely detailed analyses of NVIDIA, Anthropic and OpenAI’s finances, and the AI bubble writ large. I just put out a massive Hater’s Guide To Private Equity and one about both Oracle and Microsoft in the last month. I am regularly several steps ahead in my coverage, and you get an absolute ton of value, several books’ worth of content a year in fact! In the bottom right hand corner of your screen you’ll see a red circle — click that and select either monthly or annual. Next year I expect to expand to other areas too. It’ll be great. You’re gonna love it. Before we go any further: no, this is not going to turn into a geopolitics blog. That being said, it’s important to understand the effect of the war in Iran on everything I’ve been discussing. So, let’s start simple. Open Google Maps. Scroll to the Middle East. Look at the bit of water separating the Gulf Arab countries from Iran. That’s the Persian Gulf. Scroll down a bit. Do you see the narrow channel between the United Arab Emirates and Iran? That’s the Strait of Hormuz. At its narrowest point, it measures 24 miles across. Around 20% of the world’s oil and a similar percentage of the world’s liquefied natural gas (LNG) flows through it each year. Yes, that natural gas, the natural gas being used to power data centers like OpenAI and Oracle’s “Stargate” Abilene (which I’ll get to in a bit) and Musk’s Colossus data center. But really, that width is misleading. Oil and gas tankers are massive, and they’re full to the brim with incredibly toxic material. Spills are, obviously, bad. Also, because of their size, these tankers need to stick to where the water is a specific depth, lest they run aground. As a result, there are two lanes that tankers use when navigating through the Strait of Hormuz — one going in, one going out. This is a sensible idea meant to reduce the risk of collisions, but it also means that the potential chokepoint is even smaller. Anyway, at the end of last month, Iran’s Revolutionary Guard Corps unilaterally closed off the strait, warning merchant shipping that any attempt to travel through it was “not allowed.” This closure, for what it’s worth, is not legally binding. Iran can’t unilaterally close a stretch of international waters. And yes, while some of those shipping lanes cross through Iran’s territorial waters (and Oman’s, for that matter), they’re still governed by the UN Convention on the Law of the Sea (UNCLOS), which gives ships the right to cross through narrow geographical chokepoints where part of the waters belong to another state, and which says that nations “shall not hamper transit passage.” That requirement, I add, cannot be suspended. Still, merchant captains don’t want to risk getting themselves and their crews blown up, or arrested and thrown in Evin Prison. Insurers don’t want to pay for any ship that gets blown up, or indeed, for the ensuing environmental catastrophe. And the UAE doesn’t want its pristine beaches covered in crude oil. And so, the tankers are staying put. And they’ll stay there until one of four things happens: Of the first three, none feels particularly likely, at least in the short-to-medium term. Maybe I’m wrong.
Maybe everything reverses and everyone suddenly works it out — Trump realizes that he’s touching the stove and pulls out after claiming a “successful operation.” The world is chaotic and predicting it is difficult. Nevertheless, before that happens, closing the Strait of Hormuz means that Iran can inflict pain on American consumers at the pump, and we’ve already seen a 30% overnight spike in oil prices, with the price of a barrel jumping over $100 for the first time since 2022 (though as of writing this sentence it’s around $95). With midterms on the horizon, Iran hopes that it can translate this consumer pain into political pain for Donald Trump at the ballot box. This is all especially nasty when you consider that the price of oil is directly tied to inflation. It influences shipping costs, and a lot of medicines, construction materials, and consumer goods have petrochemical inputs. In very simple terms, if oil is used to make your stuff (or get it to you), that stuff goes up in price. While this obviously hurts countries with which Iran has previously had cordial relations (particularly Qatar, which is a major exporter of LNG), I genuinely don’t think it cares anymore. I mean, Iran has launched drones and missiles at targets located within Qatar’s territory, resulting in (at the latest count) 16 civilian injuries. Qatar shot down a couple of Iranian jets last week. I’m not sure what pressure any of the Gulf countries could exert on Iran to make it back down. I don’t see the security situation improving, either. Iran’s Shahed drones are cheap and fairly easy to manufacture, and were developed under some of the most punishing sanctions, when the country was cut off from the global supply chain. It then licensed the design to Russia, another heavily-sanctioned country, which has employed them to devastating effect in Ukraine. Iran can produce these in bulk, and then — for a fraction of the cost of an American Tomahawk missile — send them out as a swarm to hit passing ships. Even without the ability to produce new ones, Iran is believed to have possessed a pre-war stockpile of tens of thousands of Shahed drones. Shaheds aren’t complicated, or expensive, or flashy, or even remotely sophisticated, and that’s what makes them such a threat. It took Ukraine a long time to effectively figure out how to counter them, and it’s done so by using a whole bunch of different tactics — from land-based defenses like the German-made Gepard anti-aircraft gun, to interceptor drones, to repurposed 1960s agricultural planes, to (quite literally) people shooting them down with assault rifles from the passenger seats of propeller-powered planes. Ukraine has the experience in combating these drones, and even still some manage to slip through its defences, often hitting civilian infrastructure. Airstrikes can probably reduce the threat to shipping (though not without exacting an inevitable and horrible civilian cost), but they can’t eliminate it. Hell, even the Houthis — despite only controlling a small portion of Yemen, and despite efforts by a coalition of nations to degrade their offensive capabilities — still pose a risk to maritime traffic heading towards the Suez Canal. Given the cargo these ships carry, any risk is probably too much risk for the insurers, for the carriers, and for the neighbouring countries. While I could imagine the US, at some point, saying “great news!
It’s fine to go through the Strait of Hormuz now,” and though it has started offering US government-backed reinsurance for vessels, I don’t know if any shippers will actually believe it or take advantage of it.

And so, we get to the last point on my list. Regime change.

Do I believe that the Iranian government is deeply unpopular with its own people? Yes. Do I believe that said government can be overthrown by airstrikes alone? No. Do I believe that Iran’s government will do anything within its power to remain in control, even if that means slaughtering tens of thousands of its own people? Yes. Even if there were an uprising, who would lead it? Iran’s virtually cut off from the Internet, and movement within the country is restricted, making it hard for any opposition figures to organize. The two most high-profile outside opposition figures — Reza Pahlavi, the son of the former Shah, and Maryam Rajavi, leader of the MEK and NCRI — both have their own baggage, and they’re living in the US and France respectively.

As I said previously, this isn’t me wading into geopolitics, but more of a statement that there’s no way of knowing when things will eventually return to normal. This conflict might wrap up in a couple of weeks, or it might be months, or even longer than that. All this amounts to a huge amount of global oil production being bottled up, made worse by the fact that Iran produces a lot of oil itself, sending most of it (over 80%) to China. With Iran unable to export crude, and its production facilities now under attack, China’s going to have to look elsewhere. Which will result in even higher oil prices. Which, in turn, will make everything else more expensive. That is what brings us back to the AI bubble.

Now, given that most of the high-profile data center projects you’ve heard about are based in the US, which is (as mentioned) largely self-sufficient when it comes to hydrocarbons, you’d assume that it would be business as usual. And you would be wrong. You see, this is a global market. Prices can (and will!) go up in the US, even if the US doesn’t import oil or natural gas from abroad, because that’s just how this shit works. Sure, there are variations in cost where geography or politics play a role, but everyone will be on the same price trajectory. While we won’t see the same kind of shortages that we witnessed during the last oil shock (the one which ended up taking down the Carter presidency), it will still hurt. While the US managed to decouple itself from oil imports, it hasn’t (and probably can’t) decouple itself from global pricing dynamics.

The US has faced a few major oil shocks — the first in 1973, after OPEC issued an embargo against the US following the Yom Kippur War, which ended the following year after Saudi Arabia broke ranks, and the second in 1979, following the Iranian Revolution — and both hurt…a lot. This won’t be much different.

First, inflation. As the cost of living spikes, people will start demanding higher wages, which will, in turn, be passed on through higher prices. At least, that’s what would normally happen. Paul Krugman, the Nobel-winning economist, wrote in his latest Substack that US workers in the 1970s were often unionized, and benefited from cost-of-living increases baked into their contracts. Sadly, we live in 2026.
Union membership hasn’t recovered from the dismal Reagan years, and with layoffs and offshoring, combined with an already tough jobs market, workers have little leverage to demand raises. We’re in an economy oriented around do-nothing bosses that loathe their workers, one where workers will get squeezed even further by the consequences of any economic panic, even if it’s one caused by multiple events completely out of their control. So, it’s unlikely that we’ll see a wage-based amplification of any inflation that comes from the current situation.

That said, depending on how bad things get, we will see inflation spike, and increases in inflation are usually met with changes in monetary policy, with central banks raising the cost of borrowing in an attempt to “cool” the economy (i.e., reduce consumer spending so that companies are forced to bring down prices).

And we’d just started to bring down interest rates, with the Fed announcing in December that it projected rates of 3.4% by the end of 2026. Iran changes that in the most obvious way possible — if prices soar, interest rates may follow, and if rates go up, even by a percentage point or two, financing the tens and hundreds of billions of dollars in borrowing that the AI bubble demands will become significantly more expensive. For some context, the International Monetary Fund’s Kristalina Georgieva recently said “...a 10% increase in energy prices that persists for a year would push up global inflation by 40 basis points and slow global economic growth by 0.1-0.2%,” per The Guardian, who also added…

And remember: the AI bubble, along with the massive private equity and credit funds backing it, is fueled almost entirely by debt. All this chaos and potential for jumps in inflation will also affect the affordability calculations that lenders will make before loaning the likes of Oracle and Meta the money they need, at a time when lenders are already turning their noses up at Blue Owl-backed data center debt deals. The alternative is, of course, not raising interest rates — which, if the Fed loses its independence, is a possibility — which would be equally catastrophic, as we saw in the case of Turkey, whose president, Recep Tayyip Erdogan, has a somewhat… ahem… “unorthodox approach to monetary policy.” Erdogan believes that high interest rates cause inflation — a theory which he tested to the detriment of his own people. In simpler terms, Turkey has faced some of the worst inflation in the developed world, and has a currency that lost nearly 90% of its value in five years.

It’s not just the data centers, either. As interest rates go up, VC funds tend to shrink, because the investors that back said funds can get better returns elsewhere, and with much less risk. As I discussed in the Hater’s Guide to Private Equity, 14% of large banks’ total loan commitments go to private equity, private credit and other non-banking institutions, at a time when (to quote Forbes) PE firms are taking an average of 23 months to fundraise (up from 16 months in 2021), after private credit’s corporate borrowers’ default rates (as in the loans written off as unpaid by the borrower) hit 9.2% in 2025. Put really simply, private equity, private credit, venture capital and basically everything to do with technology currently depends on the near-perpetual availability of debt. The growth of private credit is so recent that we truly don’t know what happens if the debt spigot gets turned off, but I do not think it will be pretty.
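To put rough numbers on that, here's a back-of-envelope sketch. The loan sizes and rates are hypothetical round numbers of my own, not figures from any specific deal:

```python
# Back-of-envelope: yearly interest on interest-only borrowing at two rates.
# Loan sizes and rates are hypothetical round numbers, purely for illustration.
def annual_interest(principal_bn: float, rate: float) -> float:
    """Yearly interest in billions of dollars."""
    return principal_bn * rate

for principal_bn in (10, 50, 100):
    at_6 = annual_interest(principal_bn, 0.06)
    at_8 = annual_interest(principal_bn, 0.08)
    print(f"${principal_bn}bn borrowed: ${at_6:.1f}bn/yr at 6% vs "
          f"${at_8:.1f}bn/yr at 8% (+${at_8 - at_6:.1f}bn/yr)")
```

On $100 billion of borrowing, two extra points is an extra $2 billion a year in interest before a single GPU is racked.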
Things get a little worse when you remember that famed business dipshits SoftBank are currently trying to raise a $40 billion loan to fund the three $10 billion Klarna-esque payments that make up its $30 billion investment in OpenAI’s not-actually-$110-billion-yet funding round. How SoftBank — a company that raised a $15 billion bridge loan due to be paid off in around four months, and that has about $41.5 billion in existing debt maturing in the next nine months or so that needs to be refinanced, per JustDario — intends to take on another $40 billion is beyond me. And that’s a sentence I would’ve written before the war in Iran began.

There’s also evidence that links lower IPO numbers to rising inflation rates, which means that achieving the exit that investors want will become so much harder — and so, they might as well not bother. Need proof? SoftBank-owned mobile payments company PayPay delayed its IPO last week, and I quote Reuters, because “...markets were rattled by [the attack] on Iran, according to two people familiar with the matter.” Inflation also negatively affects company valuations — which, again, will influence whether investors open their purse strings.

This is all a long-winded way of saying that the AI industry is about to enter a world of hurt. Every AI startup is unprofitable, which means they need to raise money from venture capitalists, who raise money from investors like pension funds and insurers, and from private equity and credit firms that raise money from banks — all of which will struggle should central bank rates spike. The infrastructural layer — AI data centers — also requires endless debt (due to the massive upfront costs for NVIDIA chips and construction), and that debt was already becoming difficult to raise.

Then there’s the practical opex and capex costs. Higher interest rates mean that any contractors building the facilities will insist on higher fees, because their costs — labor, the price of filling up a van or a truck with gas, building materials — have gone up. And they’ll probably pad the increase a bit to account for any future rises in inflation. Those gas turbines you’re running to power your facility? Yeah, feeding those is going to get much more expensive. Natural gas is up as much as 50%, and a lot of US capacity is going to serve markets in Asia and Europe to take advantage of the spike in prices, which will mean an increase in prices for US consumers.

In fact, you don’t even need interest rates to spike for things to get nasty. As the price of oil continues to skyrocket, flying a Boeing 747 filled with GB200 racks from Taiwan to Texas, or mobilizing the thousands of people that work (to quote Bloomberg) day and night to build Stargate Abilene, will become dramatically more expensive. And even in the very, very unlikely event that things somehow quickly return to whatever level of “normal” you’d call the world before the conflict started, even brief shocks to the financial plumbing are enough to destabilize an already-fractured hype cycle. Last week, Bloomberg reported something I’d already confirmed three weeks ago — that OpenAI was no longer part of the planned expansion (past the initial two of its eight buildings) of Stargate Abilene, a project that’s already massively delayed from its supposed “full energization” by mid-2026.
Oracle disputes the report (and if it’s wrong, I imagine investors will rightly sue), claiming that Crusoe [the developer] and Oracle are “operating in lockstep,” which doesn’t make sense considering the delays or, well, reality. My sources in Abilene also tell me that the expansion fell apart due to Oracle’s dissatisfaction with the revenue it was making on buildings one and two, and that a bidding war was taking place between Meta and Google for the future capacity. Bloomberg’s Ed Ludlow also reports that NVIDIA put down a $150 million deposit as Crusoe attempts to lock down Meta as a tenant — a very strange thing to do considering Meta is flush with cash, suggesting a desperation in the hearts of everybody involved. It’s also very, very strange to have a supplier get involved in a discussion between a vendor and a customer, almost as if there’s some sort of circular financing going on. As I reported back in October, Stargate currently only has around 200MW of power, and The Information reports that power won’t be available for a year or more, something I also said in October.

As self-serving as it sounds, I really do recommend you read my premium piece about the AI Bubble’s Impossible Promises, because I laid out there how stupid and impossible gigawatt data centers were before the war in Iran. We’ve already got a shortage of the electrical-grade steel and transformers required to expand America’s (and the world’s) power grid, we’ve already got a shortage of the skilled labor required to build that power (and data centers in general), and we’re moving massive amounts of heavy shit around a large patch of land using thousands of people, which will cost a lot of gas.

I don’t know why, but the media and the markets seem incapable of imagining a world where none of this stuff happens, clinging to previous epochs where “things worked out” and where “things were okay” without a second thought. In The Black Swan, Nassim Taleb makes the point that “…the process of having [journalists] report in lockstep [causes] the dimensionality of the opinion set to shrink considerably,” saying that they tend to “[converge] on opinions and [use] the same items as causes.” In simpler terms, everybody reporting the same thing in the same way naturally makes everybody converge on the same kinds of ideas — that AI is going to be a success because previous eras have “worked out,” even if they can’t really express what “worked out” means. The logic is almost childlike — in the past, lots of money was invested in stuff that didn’t work out, but because some things worked out after spending lots of money, spending lots of money will work out here.

The natural result is that reporters (and bloggers) seek endless positive confirmation, and build narratives to match. They report that Anthropic hit $19 billion in annualized revenue and OpenAI hit $25 billion in annualized revenue — figures which have been confirmed to refer to a roughly month-long period of revenue multiplied by 12 — as proof that the AI bubble is real, ignoring the fact that both companies lose billions of dollars, and that my own reporting says that OpenAI made billions less and spent billions more in 2025. They assume that a company would not tell everybody something untrue or impossible, because accepting that companies do this undermines the structure of how reporting takes place, and means that reporters have to accept that they, in some cases, are used by companies to peddle information with the intent of deception.
And thanks to an affidavit from Anthropic Chief Financial Officer Krishna Rao, filed as part of Anthropic’s suit against the Department of Defense’s supply chain risk designation, it’s clear that the deception was intentional, as the affidavit confirmed that Anthropic’s lifetime revenue “to date” (referring to March 9th 2026) is $5 billion, and that it has spent $10 billion on inference and training.

To be abundantly clear, this means that Anthropic’s previous statement that it made $14 billion in annualized revenue (stated by Anthropic on February 12 2026, and referring, I’ve confirmed, to a month-long period multiplied by 12) — a period of 30 days where it made $1.16 billion — accounts for more than 23% of its lifetime revenue. This comes down to which Anthropic you believe, because these two statements do not match up. I am not stating that it is lying, but I do believe annualized revenue is a deliberate attempt to obfuscate things and give the vibe that the business is healthier than it is. I also do not think it’s likely that Anthropic made 23% of its lifetime revenue in the space of a month. What this almost certainly means is that the sources that told media outlets that Anthropic made $4.5 billion in 2025 were misleading them. The exact quote from the affidavit is that “...[Anthropic] has generated substantial revenue since entering the commercial market—exceeding $5 billion to date,” and while boosters will say “uhm, it says ‘exceeding,’” if it were anything higher than $5.5 billion Anthropic would’ve absolutely said so.

We can also do some very simple maths that suggests that Anthropic’s “annualized” figures are…questionable. On February 12 2026, annualized revenue hit $14 billion. Five days before the lawsuit was filed, it was $19 billion, “with $6 billion added in February” (per Dario Amodei at a Morgan Stanley conference), suggesting that annualized revenue entered February at $13 billion, or $1.083 billion a month. Even if we assume a flat billion for January, that means that Anthropic made $2.16 billion between January and the end of February 2026. And that’s not including the revenue made in March so far.

But I’m a curious little critter, so I went ahead and added up all of the times that Anthropic had talked about its annualized revenue from 2025 onward — you can find the results, with links, here! — and based on my calculations, just using published annualized revenues gets us to $4.837 billion. We are, however, missing several periods of time, for which I’ve used “safe” (as in lower, so that I am giving Anthropic the benefit of the doubt) numbers calculated from the periods themselves:

April 1 to 30, 2025, which I estimate as $166 million, based on reports of Anthropic’s annualized revenue being $2 billion at the end of March 2025.

August 1 to August 20, 2025, which I estimate as $271 million, based on July 2025’s revenues ($4 billion annualized).

November 1 to November 29, 2025, which I estimate as $556 million, based on October’s $7 billion in annualized revenues.

January 1 to January 11, 2026, which I estimate as $219.1 million, assuming $9 billion in annualized revenue (based on reported December revenues).

With these estimates, we get a grand total of $6.66 billion (ominous!), which is a great deal higher than $5 billion. When you remove the estimates and annualized revenues for 2026, you get $3.642 billion, which heavily suggests that Anthropic did not, in fact, make $4.5 billion in 2025. There isn’t a chance in Hell this company made $4.5 billion in 2025 based on its own CFO’s affidavit. I also think it’s reasonable to doubt the veracity of these annualized revenues, or, in my kindest estimation, whether Anthropic is using any kind of standard “annualized” formula.

Here are the ways in which people will try and claim I’m wrong:

“Ed, it’s commercial revenue!” — this is all revenue. Anthropic doesn’t have “non-commercial revenue,” unless you are going to use a very, very broad version of what “non-commercial” means, at which point you have to tell me why you trust Anthropic.

“This doesn’t include all the revenue up until March 2026! Maybe this suit was written weeks ago!” — even if it doesn’t, based on Anthropic’s own numbers, things don’t line up. Also, this was written specifically as part of the lawsuit with the DoD. It’s recent.

“It says ‘exceeding’!” — it also says “over $10 billion in inference and training costs.” Can I just say whatever number I want here? Because if this is your argument, that’s what you’re doing.

“That $5 billion number is accurate!” — the only way this makes sense is if some or all of these annualized revenues are incorrect.

I think it’s reasonable to doubt whether Anthropic made anywhere near $4.5 billion in 2025, whether Anthropic has annualized revenues even approaching those reported, and whether anything it says can be trusted going forward.
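If you want to check this sort of arithmetic yourself, the conversion is trivial, and worth making explicit. A minimal sketch, assuming (as confirmed above for Anthropic's figures) that "annualized" means a month multiplied by 12, and simplifying to a flat 30-day month:

```python
# "Annualized revenue" is a short period multiplied out to a year; reversing
# it recovers the implied period revenue. Assumes a flat 30-day month.
def implied_monthly(annualized_bn: float) -> float:
    """Implied monthly revenue ($bn) behind an annualized figure."""
    return annualized_bn / 12

def implied_period(annualized_bn: float, days: int) -> float:
    """Implied revenue ($bn) over `days` at that run rate."""
    return implied_monthly(annualized_bn) * days / 30

print(round(implied_monthly(14), 2))        # 1.17 -> the ~$1.16bn February month
print(round(implied_monthly(13), 3))        # 1.083 -> the implied start-of-February run rate
print(round(1.0 + implied_monthly(14), 2))  # 2.17 -> the ~$2.16bn January-plus-February total
```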
It appears one of the most prominent startups in the valley has misled everybody about how much it makes, or, if it has not, that somebody else is perpetuating a misinformation campaign. Add together the annualized revenues. Look at the links. Do the maths. I got the links for annualized revenues from Epoch AI, though I have seen all of these before in my own research.

People are going to try and justify why this isn’t a problem in all manner of ways. They’ll say that, actually, Anthropic made less money in 2025 but that’s fine because everybody could see what annualized revenues really meant. So far, nobody has a cogent response, likely because there isn’t one. I haven’t even addressed the $10 billion in training and inference costs, because good lord, those costs are stinky, and based on my own reporting — which did not come from Anthropic, which is why I trust it! — Anthropic spent $2.66 billion on Amazon Web Services from January through September 2025, or around 26% of its lifetime compute spend. That’s remarkable, and suggests this company’s compute spend is absolutely out of control. This leads me to one more quote from Anthropic’s CFO:

Without attempting to influence their decision making, if I were a counterparty to a company like this, my biggest concern would now be that this filing appears to suggest that Anthropic’s revenues are materially smaller than I believed.

It might seem dangerous to be like me — pointing at stuff and saying “that doesn’t make sense!”, or questioning a narrative held by the entire stock market and most of modern journalism — but I’d argue the real danger is that narrow, narrative-led, establishment-driven thinking makes it impossible for reporters to report. While you might be able to say “a source told me that something went wrong,” the natural drive to report on what everybody else is saying means that this information is often reported with careful weasel words like “still going as planned” or “still growing incredibly fast.” It’s a kind of post-factual decorum — a need to keep the peace that frames bad signs as bumps in the road and good signs as cast-iron affirmations of future success. This is a catastrophic failure of journalism that deprives retail investors and the general public of useful information. It also — though it feels as if reporters are “getting scoops” or “breaking news” — naturally magnetizes journalists toward information that confirms the narrative, or “leaks” that are actually the company intentionally getting something in front of a reporter so that they (the reporter) can appear as if this was “investigative news” versus “marketing in a different hat.”

It also means that modern journalism is ill-equipped for this moment, and no, this is not a “new” phenomenon. It is the same thing that led to the dot com bubble, the NFT bubble, the crypto bubble, the Clubhouse bubble, the AR and VR bubble, and many more bubbles to come. To avoid being “wrong,” reporters are pursuing stories that prove somebody else right, which almost invariably ends with the reporter being wrong. “Pursuing stories to prove somebody else right” means that a great many reporters (and newsletter writers) that claim to be objective and fact-focused end up writing the narrative that companies use to raise money, using evidence manufactured by the company in question. In some cases, this is an act of cowardice. Following the narrative because it’s easy and because everybody’s doing it adds a layer of reputation laundering.
If everybody failed, everybody was conned, and thus nobody has to be held accountable — and because there really has never been any accountability for the media being wrong about any previous bubbles, the assumption is that there never will be. However you may feel about my work or what I’m saying, I need you to understand something: journalism, both historically and currently, is unprepared for the consequences of being wrong.

The current media consensus around the AI bubble is that even if it pops it will be fine, with some even saying that “even if OpenAI folds, everything will work out, because of the dot com bubble.” This is a natural attempt to rationalize and normalize the chaotic and destructive — an attempt to map how this bubble would burst onto previous bubbles, because new things are difficult and scary to imagine. There has never been a time when the entire market crystallised around a few specific companies — not even the dot com bubble! — and then built an entire infrastructural layer mostly in service of two of them, with a price tag now nearing the $1tn mark.

Let’s get specific. The scoffing and jeering I get from people when I say that AI demand doesn’t exist, or that AI companies don’t have revenues, or that OpenAI or Anthropic are unsustainable, is never met with a good faith response, just quotes about how “Amazon Web Services lost lots of money” or “Uber lost lots of money,” or that “these are the fastest growing companies of all time,” or something about “all code being written by AI,” a subject I discussed at length two weeks ago.

The Large Language Model era is uniquely built to exploit human beings’ belief that we can infer the future based on the past, both in how it processes data and in how people report on its abilities. It exploits media outlets that do not have people who are given the time (or held to a standard that requires them) to actually learn the subjects in question, and sells itself based on the statements that “this is the worst it’ll ever be” and “previous eras of investment worked out.” LLMs also naturally cater to those who are willing to accept substandard explanations and puddle-deep domain expertise. The slightest sign that Claude Code can build an app — whether it’s capable of actually doing so or not — is enough for people that are on television every day to say that it will build all software, because it confirms the bias that the cycle of innovation and incumbent disruption still exists, even if it hasn’t for quite some time. A glossy report about job displacement — even one that literally says that Anthropic found “no systematic increase in job displacement or unemployment” from AI — gets reported as proof that jobs are being displaced by AI, because it says “AI is far from reaching its theoretical capability: actual coverage remains a fraction of what’s feasible.”

This is an aggressive exploitation of how willing people with the responsibility to tell the truth are to accept half-assed explanations, and how willing people are to operate based on principles garnered from the lightest intellectual lifts in the world. The assumption is always the same: that what has happened before will happen again, even if the actuality of history doesn’t really reflect that at all.
Society — the media, politicians, chief executives, shit, everyone on some level — is incapable of thinking of new stuff that might happen, especially if that new stuff would be economically destructive, such as a massive scar across all private credit, private equity and venture capital, one so severe that it may destroy the way that businesses (and startups, for that matter) raise capital for the foreseeable future. People are more willing to come up with societally-destructive theories — such as all software engineering and all journalism and all content being created by LLMs, even if it doesn’t actually make sense — because it fits their biases. Perhaps they’re beaten down by decades of muting the power of labor, or the destruction of our environment. Perhaps they’re beaten down by the rise of the right and the destruction of the rights of minorities and people of colour. Or, more noxiously, perhaps they’re excited to be the one that called it first, currying favor with the new overlords they believe will own this (fictional) future — so much so that they’ll ignore the underlying ridiculousness of the economics, refuse to do any further reading that might invalidate their beliefs, or simply say whatever they’re told because it gets clicks and makes their advertisers, bosses or friends happy.

People are willing to fall in line behind mythology because conceiving of an entirely-different future is an intellectually challenging and emotionally draining act. It requires learning about a multitude of systems and interconnecting disciplines, and being willing to admit, again and again, that you do not understand something and must learn more. There are plenty of people that are willing to do this, and plenty more that are not, and the latter are the people with TV shows and columns in the newspaper.

I believe we’re in a new era. It’s entirely different. Stop trying to say “but in the past,” because the past isn’t that useful — it’s only useful if you’re capable of evaluating it critically and skeptically, and making sure that it’s actually the same rather than it merely feeling like it is. I keep calling this era “The Beginning of History,” not because it directly reflects Francis Fukuyama’s theory (which relates to democracies), but because I believe that those who succeed in this world are not those who are desperate to neatly fit it into the historical failures or successes of the past, but those willing to stare at it with the cold, hard fury of the present.

There are many signs that the past no longer makes sense: the collapse of SaaS (which I’ll cover in this week’s premium), the collapse of the business models of both venture capital and private equity, the collapse of democracies under the weight of fascism because the opposition parties never seem to give enough of a fuck about the experiences of regular people. Using the past to dictate what will happen in the future is masturbatory. It allows you to feel smart and say “I know the most about anything, which means I know what’s going on.” It is, much like an LLM, assuming that simply reading enough is what makes somebody smart — that shoving a bunch of text in your head, whether or not you understand it, is what makes somebody know something or be good at something. It’s an intellectually bankrupt position, one that I believe will lead those unable to adapt to the reality of the future to destruction.
It leads to lazy thinking that grasps at confirmations rather than any fundamental understanding, depriving the general public of good information in favor of that which confirms the biases and wants and needs of the malignant and ignorant.

It takes courage to be willing to be wrong deliberately — but only if you admit when you were wrong. This hasn’t happened in previous bubbles, and it has to happen now if we want to stop bubbles from forming. I have made a great deal of effort to learn more as time goes on. I do not see boosters doing the same to prove their points. I will be pointing to this sentence in the future, one way or another.

So much more effort is put into humouring the ideas of the bubbles, into proving the marketing spiel of the bubbles, framed as a noxious “both-sides” that deprives the reader, listener or viewer of their connection with reality. It might be tempting to say this happens with cynicism too, except the majority of attention paid to bubbles is positive, and saying otherwise is a fucking lie. Need to justify unprofitable, unsustainable AI companies? Uber lost money before. Need to explain why AI data centers being built ahead of demand isn’t a problem? Well, the internet exists, and people eventually used that fiber. You can ignore actual proof while pretending to provide your own, all just by pointing vaguely to things in the past. It takes actual courage to form an opinion, something boosters fundamentally lack.

I’m not saying it’s impossible to make predictions, but that the majority of people make them with flimsy information, such as “this thing happened before” or “everyone’s saying this will happen.” I’m not saying you can’t try and understand what will happen next, but doing so requires you to use information that is not, on its face, generated by wishcasting or events that took place decades ago.

In the end, the greatest lesson we can learn is that, historically speaking, people tend to fuck around and then find out. The assumption boosters make is that one can fuck around forever. History tends to disagree.


Premium: The Hater's Guide to Private Equity

We have a global intelligence crisis, in that a lot of people are being really fucking stupid. As I discussed in this week’s free piece, alleged financial analyst Citrini Research put out a truly awful screed called the “2028 Global Intelligence Crisis” — a slop-filled piece of scare-fiction written and framed with the authority of deeply-founded analysis, so much so that it caused a global selloff in stocks. At 7,000 words, you’d expect the piece to have some sort of argument or basis in reality, but what it actually says is that “AI will get so cheap that it will replace everything, and then most white collar people won’t have jobs, and then they won’t be able to pay their mortgages, also AI will cause private equity to collapse because AI will write all software.”

This piece is written specifically to spook *and* ingratiate anyone involved in the financial markets with the idea that their investments are bad but investing in AI companies is good, and also that if they don't get behind whatever this piece is about (which is unclear!), they'll be subject to a horrifying future where the government creates a subsidy generated by a tax on AI inference (seriously). And, most damningly, its most important points about HOW this all happens are single sentences that read "and then AI becomes more powerful and cheaper too and runs on a device." Part of the argument is that AI agents will use cryptocurrency to replace MasterCard and Visa. It’s dogshit. I’m shocked that anybody took it seriously. The fact this moved markets should suggest that we have a fundamentally flawed financial system — and here’s an annotated version with my own comments.

This is the second time our markets have been thrown into the shitter based on AI booster hype. A mere week and a half ago, a software sell-off began because of the completely fanciful and imaginary idea that AI would now write all software. I really want to be explicit here: AI does not threaten the majority of SaaS businesses, and those selling are jumping at ghost stories. If I understand them correctly, those dumping software stocks believe that AI will replace these businesses because people will be able to code their own software solutions. This is an intellectually bankrupt position, one that shows an alarming (and common) misunderstanding of very basic concepts. It is not just a matter of “enough prompts until it does this” — good (or even functional!) software engineering is technical, infrastructural, and philosophical, and the thing you are “automating” is not just the code that makes a thing run.

Let's start with the simplest, least-technical way of putting it: even in the best-case scenario, you do not just type "Build Me A Salesforce Competitor" and watch it erupt, fully-formed, from your Terminal window. It is not capable of building it, but even if it were, the result would need to actually be on a cloud hosting platform, and have all manner of actual customer data entered into it. Building software is not writing code, hitting enter, and a website appearing — it requires all manner of infrastructural work (such as "how does a customer access it in a consistent and reliable way," "how do I make sure that this can handle a lot of people at once," and "is it quick to access," with the more-complex database systems requiring entirely separate subscriptions just to keep them connecting). Software is a tremendous pain in the ass.
You write code, then you have to make sure the code actually runs, and that code needs to run in some cases on specific hardware, and that hardware needs to be set up right, and some things are written in different languages, and those languages sometimes use more memory or less memory, and if you give them the wrong amounts — or forget to close the door on something in your code — everything breaks, sometimes costing you money or introducing security vulnerabilities. In any case, even for experienced, well-versed software engineers, maintaining software that involves any kind of customer data requires significant investments in compliance, including things like SOC 2 audits if the customer itself ever has to interact with the system, as well as massive investments in security.

And yet, the myth that LLMs are an existential threat to existing software companies has taken root in the market, sending the share prices of the legacy incumbents tumbling. A great example would be SAP, down 10% in the last month. SAP makes ERP (Enterprise Resource Planning, which I wrote about in the Hater's Guide To Oracle) software, and has been affected by the sell-off. SAP is also a massive, complex, resource-intensive, database-driven system that involves things like accounting, provisioning and HR, and is so heinously complex that you often have to pay SAP just to make it function (if you're lucky it might even do so). If you were to build this kind of system yourself, even with "the magic of Claude Code" (which I will get to shortly), it would be an incredible technological, infrastructural and legal undertaking.

Most software is like this. I’d say all software that people rely on is like this. I am begging you, pleading with you, to think about how much you trust the software that’s on every single thing you use, what you do when a piece of software stops working, and how you feel about the company responsible. If your money or personal information touches it, they’ve had to go through all sorts of shit that doesn’t involve the code to bring you that software. Any company of a reasonable size would likely be committing hundreds of thousands if not millions of dollars in legal and accounting fees to make sure it worked, engineers would have to be hired to maintain it, and you, as the sole customer of this massive ERP system, would have to build every single new feature and integration you want. Then you'd have to keep it running, this massive thing that involves, in many cases, tons of personally identifiable information. You'd also need to make sure, without fail, that this system that involves money was aware of any and all currencies and how they fluctuate, because that is now your problem. Mess up that part and your system of record could massively over- or underestimate your revenue or inventory, which could destroy your business. If that happens, you won't have anyone to sue. When bugs happen, you'll have someone whose job it is to fix them — someone you can fire, sure, but replacing them means finding a new person to fix the mess that another guy made.

And then we get to the fact that building stuff with Claude Code is not that straightforward. Every example you've read about somebody being amazed by it is a toy app or website that's very similar to the many open source projects or website templates in Anthropic's training data.
Every single piece of SaaS anyone pays for is paying for both access to the product and a transfer of the inherent risk and chaos of running software that involves people or money. Claude Code does not actually build unique software. You can say "create me a CRM," but whatever CRM it pops out will not magically jump onto Amazon Web Services, nor will it magically be efficient, or functional, or compliant, or secure, nor will it be differentiated at all from, I assume, the open source or publicly-available SaaS it was trained on. You really still need engineers, if not more of them than you had before. It might tell you it's completely compliant and that it will run like a hot knife through butter — but LLMs don’t know anything, and you cannot be sure Claude is telling the truth as a result. Is your argument that you’d still have a team of engineers (so they know what the outputs mean), but they’d be working on replacing your SaaS subscription? You’re basically becoming a startup with none of the benefits.

To quote Nik Suresh, an incredibly well-credentialed and respected software engineer (author of I Will Fucking Piledrive You If You Mention AI Again): “...for some engineers, [Claude Code] is a great way to solve certain, tedious problems more quickly, and the responsible ones understand you have to read most of the output, which takes an appreciable fraction of the time it would take to write the code in many cases. Claude doesn't write terrible code all the time, it's actually good for many cases because many cases are boring. You just have to read all of it if you aren't a fucking moron because it periodically makes company-ending decisions.”

Just so you know, “company-ending decisions” could start with your vibe-coded Stripe clone leaking user credit card numbers or social security numbers because you asked it to “just handle all the compliance stuff.” Even if you have very talented engineers, are those engineers talented in the specifics of, say, healthcare data or finance? They’re going to need to be to make sure Claude doesn’t do anything stupid!

So, despite all of this being very obvious, it’s clear that the markets and an alarming number of people in the media simply do not know what they are talking about. The “AI replaces software” story is literally “Anthropic has released a product and now the resulting industry is selling off,” such as when it launched a cybersecurity tool that could check for vulnerabilities (a product category that has existed in some form for nearly a decade), causing a sell-off in cybersecurity stocks like CrowdStrike — you know, the company whose faulty bit of code caused a global cybersecurity incident that cost the Fortune 500 billions, and led to Delta Air Lines suspending over 1,200 flights over six long days of disruption.

There is no rational basis for this sell-off other than that our financial media and markets do not appear to understand the very basic things about the stuff they invest in. Software may seem complex, but (especially in these cases) it’s really quite simple: investors are conflating “an AI model can spit out code” with “an AI model can create the entire experience of what we know as ‘software,’ or is close enough that we have to start freaking out.” This is thanks to the intentionally-deceptive marketing peddled by Anthropic and validated by the media.
In a piece from September 2025, Bloomberg reported that Claude Sonnet 4.5 could “code on its own for up to 30 hours straight” — a statement directly from Anthropic, repeated by other outlets that added that it did so “on complex, multi-step tasks,” none of which were explained. The Verge, however, added that Anthropic apparently “coded a chat app akin to Slack or Teams,” and no, you can’t see it, or know anything about how much it cost or its functionality. Does it run? Is it useful? Does it work in any way? What does it look like? We have absolutely no proof this happened other than them saying it, but because the media repeated it, it’s now a fact. Perhaps it’s not a particularly novel statement, but it’s becoming kind of obvious that maybe the people with the money don’t actually know what they’re doing, which will eventually become a problem when they all invest in the wrong thing for the wrong reasons.

SaaS (Software as a Service, which almost always refers to business software) stocks became a hot commodity because they were perpetual growth machines with giant sales teams that existed only to make numbers go up, leading to a flurry of investment based on the assumption that all numbers will always increase forever, and every market is as giant as we want it to be. Not profitable? No problem! You just had to show growth. It was easy to raise money because everybody saw a big, obvious path to liquidity, either from selling to a big firm or taking the company public… …in theory.

Per Victor Basta, between 2014 and 2017, the number of VC rounds in technology companies halved with a much smaller drop in funding, adding that a big part was the collapse in the number of companies describing themselves as SaaS, which dropped by 40% in the same period. In a 2016 chat with VC David Yuan, Gainsight CEO Nick Mehta added that “the bar got higher and weights shifted in the public markets,” citing that profitability was now becoming more important to investors. Per Mehta, one savior had arrived — private equity, with Thoma Bravo buying Blue Coat Systems in 2011 for $1.3 billion (which had been backed by a Canadian teacher’s pension fund!), Vista Equity buying Tibco for $4.3 billion in 2014, and Permira Advisers (along with the Canadian Pension Plan Investment Board) buying Informatica for $5.3 billion (with participation from both Salesforce and Microsoft) in 2015, 16 years after its first IPO. In each case, these firms were purchased using debt that was immediately dumped onto the company’s balance sheet — known as a leveraged buyout.

In simple terms, you buy a company with money that the company you just bought has to pay off. The company in question also has to grow like gangbusters to keep up with both that debt and the private equity firm’s expectations. And instead of being an investor with a board seat who can yell at the CEO, it’s quite literally your company, and you can do whatever you want with (or to) it. Yuan added that the size of these deals made the acquisitions problematic, as did their debt-heavy structure:

Symantec would acquire Blue Coat for $4.65 billion in 2016, for just under a 4x return. Things were a little worse for Tibco. Vista Equity Partners tried to sell it in 2021 amid a surge of other M&A transactions, with the solution — never change, private equity! — being to buy Citrix for $16.5 billion (a 30% premium on its stock price) and merge it with Tibco, magically fixing the problem of “what do we do with Tibco?” by hiding it inside another transaction.
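To make those leveraged buyout mechanics concrete, here's a toy version of the maths. Every number is invented for illustration, and real deals are messier:

```python
# Toy leveraged buyout, with every number invented for illustration.
purchase_price = 5.0   # $bn paid for the target company
equity_check = 1.5     # $bn the PE firm actually puts in
debt = purchase_price - equity_check  # $3.5bn, landing on the target's books
rate = 0.08            # interest rate on the buyout debt
ebitda = 0.5           # $bn/yr the target earns before interest

interest = debt * rate  # $0.28bn/yr, paid by the company, not the PE firm
print(f"Interest consumes {interest / ebitda:.0%} of EBITDA "
      "before a single dollar of principal is repaid")
# If growth stalls, the debt is the company's problem; the PE firm's
# downside is capped at its $1.5bn equity check.
```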
Informatica eventually had a $10 billion IPO in 2021, which was flat in its first day of trading and never really did more than hover at its IPO price, before selling to Salesforce in 2025 at an equity value of $8 billion — which seems fine but not great until you realize that, with inflation, the $5.3 billion that Permira invested in 2015 was about $7.15 billion in 2025’s money. In every case, the assumption was very simple: these businesses would grow and own their entire industries, the PE firm would be the reason they did this (by taking them private and filling them full of debt while making egregious growth demands), and the meteoric growth of SaaS would continue in perpetuity.

Yet the real year that broke things was 2021. As everybody returned to the real world, consumer and business spending skyrocketed, leading (per Bloomberg) to a massive surge in revenues that convinced private equity to shove even more cash and debt up the ass of SaaS:

Bloomberg is a little nicer than I am, so they’re not just writing “deals were waved through because everybody assumed that software grows forever and nobody actually knew a thing about the technology or why it would grow so fast.” Unsurprisingly, this didn’t turn out to be true. Per The Information, PE firms invested in or bought 1,167 U.S. software companies for $202 billion, and usually hold investments for three to five years. Thankfully, they also included a chart to show how badly this went:

2021 was the year of overvaluation, and (per Jason Lemkin of SaaStr) 60% of unicorns (startups with $1bn+ valuations) hadn’t raised funds in years. The massive accumulated overinvestment, combined with no obvious pathway to an exit, led to people calling these companies “Zombie Unicorns”:

The problem, to quote The Information, is that “PE firms don’t want to lock in returns that are lower than what they promised their backers, say some executives at these firms,” and “many enterprise software firms’ revenue growth has slowed.” Per CNBC in November 2025, private equity firms were facing the same zombie problem:

Per Jason Lemkin, private equity is sitting on its largest collection of companies held for longer than four years since 2012, with McKinsey estimating that more than 16,000 companies (more than 52% of the total buyout-backed inventory) had been held by private equity for more than four years, the highest on record. In very simple terms, there are hundreds of billions of dollars’ worth of tech companies sitting in the wings of private equity firms, which they’re desperate to sell, with the only customers being big tech firms, other private equity firms, and public offerings in one of the slowest IPO markets in history.

Investing used to be easy. There were so many ideas for so many companies, companies that could be worth billions of dollars once they’d been fattened up with venture capital and/or private equity. There were tons of acquirers, it was easy to take them public, and all you really had to do was exist and provide capital. Companies didn’t have to be good, they just had to look good enough to sell. This created a venture capital and private equity industry based on symbolic value, and chased out anyone who thought too hard about whether these companies could actually survive on their own merits. Per PitchBook, since 2022, 70% of VC-backed exits were valued at less than the capital put in, with more than a third of them being startups buying other startups in 2024.
Private equity firms are now holding assets for an average of 7 years. McKinsey also added one horrible detail for the overall private equity market, emphasis mine:

You see, private equity is fucking stupid, doesn’t understand technology, doesn’t understand business, and by setting up its holdings with debt based on the assumption of unrealistic growth, it’s created a crisis for both software companies and the greater tech industry. On February 6, more than $17.7 billion of US tech company loans dropped to “distressed” trading levels (as in, trading as if traders don’t believe they’ll get paid, per Bloomberg), growing the overall group of distressed tech loans to $46.9 billion, “dominated by firms in SaaS.” These firms included huge investments like Thoma Bravo’s Dayforce (which it purchased for $12.3 billion two days before this story ran) and Calabrio (which it acquired for “over” $1 billion in April 2021 and merged with Verint in November 2025).

This isn’t just about the shit they’ve bought, but the destruction of the concept of “value” in the tech industry writ large. “Value” was not based on revenues, or your product, or anything other than your ability to grow and, ideally, trap as many customers as possible, with the vague sense that there would always be infinitely more money every year to spend on software. Revenue growth came from massive sales teams compensated with heavy commissions and yearly price increases, except things have begun to sour, with renewals now taking twice as long to complete, and overall SaaS revenue growth slowing for years. To put it simply, much of the investment in software was based on the idea that software companies will always grow forever, and SaaS companies — which have “sticky” recurring revenues — would be the standard-bearer.

When I got into the tech industry in 2008, I immediately became confused about the number of unprofitable or unsustainable companies that were worth crazy amounts of money, and for the most part I’d get laughed at by reporters for being too cynical. For the best part of 20 years, software startups have been seen as eternal growth-engines. All you had to do was find product-market fit, get a few hundred customers locked in, up-sell them on new features, and grow in perpetuity as you conquered a market. The idea was that you could just keep pumping them with cash, hiring as many pre-sales (the technical person who makes the sale), sales, and customer experience (read: the helpful person who also loves to tell you about more stuff) people as you needed to both retain customers and sell them as much stuff as possible.

Innovation was, as you’d expect, judged entirely by revenue growth and net revenue retention:

In practice, this sounds reasonable: how much of last year’s revenue from your existing customers are you making this year? The problem is that this is a very easy stat to game, especially if you’re using it to raise money, because you can move customer billing periods around to make sure that things all continue to look good. Even then, per research by Jacco van der Kooij and Dave Boyce, net revenue retention is dropping quarter over quarter. The other problem is that the entire process of selling software has become separated from the end-user, which means that products (and sales processes) are oriented around selling that software to the person responsible for buying it rather than those doomed to use it.
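For reference, net revenue retention is typically calculated something like the sketch below (my own rendering of the standard formula, not anything from the research cited above), and the billing-period trick works because everything hinges on which revenue lands inside the measurement window:

```python
# Net revenue retention: recurring revenue from last year's customers,
# measured now, divided by what those same customers paid a year ago.
def nrr(starting_arr: float, expansion: float, contraction: float,
        churn: float) -> float:
    return (starting_arr + expansion - contraction - churn) / starting_arr

# A $10m-base SaaS business with $1.5m of upsells, $0.5m of downgrades
# and $0.8m of churn retains 102%, reading as "net growth" from existing customers.
print(f"{nrr(10.0, 1.5, 0.5, 0.8):.0%}")
# Pull one $0.5m renewal forward into the measurement window and the same
# business suddenly prints 107% instead.
print(f"{nrr(10.0, 2.0, 0.5, 0.8):.0%}")
```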
In Nik Suresh’s Brainwash An Executive Today, he describes a conversation with the Chief Technology Officer of a company with over 10,000 people, who asked whether “data observability” — a thing that they did not (and, in their position, would not need to) understand — was a problem, and whether Nik had heard of Monte Carlo. It turned out that the executive in question had no idea what Monte Carlo or data observability was, but because they’d heard about it on LinkedIn, it was now all they could think about. This is the environment that private equity bought into — a seemingly-eternal growth engine with pliant customers desperate to spend money on a product that didn’t have to be good, just functional enough.

These people do not know what they are talking about or why they are buying these companies, other than being able to mumble out shit like “ARR” and “NRR” and “TAM” and “CAC” and “ARPA” in the right order to convince themselves that something is a good idea, without ever thinking about what would happen if it wasn’t. This allowed them to stick to the “big picture,” meaning “numbers that I can look at rather than any practical experience in software development.” While I guess the concept of private equity isn’t inherently morally repugnant, its current form — which includes venture capital — has led the modern state of technology into the fucking toilet: an initial flux of viable businesses, frothy markets, and zero interest rates made it deceptively easy to raise and deploy capital, leading to brainless investing, the death of logical due diligence, and potentially ruinous consequences for everybody involved.

Private equity spent decades buying a little bit of just about everything, enriching the already-rich by engaging with the most vile elements of the Rot Economy’s growth-at-all-costs mindset. Its success is predicated on near-perpetual levels of liquidity and growth in both its holdings and the holdings of those who exist only to buy their stock, and on a tech and business media that doesn’t think too hard about the reality of the problems their companies claim to solve. The reckoning that’s coming is one built specifically to target the ignorant hubris that made them rich. Private equity has yet to be punished by its limited partners and banks for investing in zombie assets, allowing it to pile into the unprofitable data centers underpinning the AI bubble, meaning that companies like Apollo, Blue Owl and Blackstone — all of whom participated in the ugly $10.2 billion acquisition of Zendesk in 2022 (after it rejected another PE offer of $17 billion in 2021) that included $5 billion in debt — have all become heavily-leveraged in giant, ugly debt deals covering assets that will be somewhere between obsolete and useless in a few years.

Alongside the fumbling ignorance of private equity sits the $3 trillion private credit industry — an equally-putrid, growth-drunk, and poorly-informed industry run with the same lax attention to detail and Big Brain Number Models that can justify just about any investment they want. Their half-assed due diligence led to billions of dollars of loans being given to outright frauds like First Brands, Tricolor and PosiGen, and, to paraphrase JP Morgan’s Jamie Dimon, there are absolutely more fraudulent cockroaches waiting to emerge. You may wonder why this matters, as all of this is private credit. Well, they get their money from banks. Big banks.
In fact, according to the Federal Reserve of Boston, about 14% ($300 billion) of large banks’ total loan commitments to non-banking financial institutions in 2023 went to private equity and private credit, with Moody’s pegging the number at around $285 billion, plus an additional $340 billion in unused-yet-committed cash waiting in the wings. Oh, and they get their money from you. Pension funds are among some of the biggest backers of private credit companies, with the New York City Employees Retirement System and CalPERS increasing their investments.

Today, I’m going to teach you all about private equity, private credit, and why years of reframing “value” to mean “growth” may genuinely threaten the global banking system, as well as how effectively every company raises money. An entirely-different system exists for the wealthy to raise and deploy capital, one with flimsy due diligence, a genuine lack of basic industrial knowledge, and hundreds of billions of dollars of crap it can’t sell. These people have been able to raise near-unlimited capital to do basically anything they want, because there was always somebody stupid enough to buy whatever they were selling, and they have absolutely no plan for what happens when their system stops working. They’ll loan to anyone or invest in anything that confirms their biases, and those biases are equal parts moronic and malevolent. Now they’re investing teachers’ pensions and insurance premiums in unprofitable and unsustainable data centers, all because they have no idea what a good investment actually looks like.

Welcome to the Hater’s Guide To Private Equity, or “The Stupidest Assholes In The Room.”

0 views

On NVIDIA and Analyslop

Hey all! I'm going to start hammering out free pieces again after a brief hiatus, mostly because I found myself trying to boil the ocean with each one, fearing that if I regularly emailed you, you'd unsubscribe. I eventually realized how silly that was, so I'm back, and will be back more regularly. I'll treat it like a column, which will be both easier to write and a lot more fun. As ever, if you like this piece and want to support my work, please subscribe to my premium newsletter. It's $70 a year, or $7 a month, and in return you get a weekly newsletter that's usually anywhere from 5,000 to 18,000 words, including vast, extremely detailed analyses of NVIDIA, Anthropic and OpenAI's finances, and the AI bubble writ large. I am regularly several steps ahead in my coverage, and you get an absolute ton of value. In the bottom right-hand corner of your screen you'll see a red circle — click that and select either monthly or annual.  Next year I expect to expand to other areas too. It'll be great. You're gonna love it.  Before we go any further, I want to remind everybody I'm not a stock analyst, nor do I give investment advice.  I do, however, want to say a few things about NVIDIA and its annual earnings report, which it published on Wednesday, February 25: NVIDIA's entire future is built on the idea that hyperscalers will buy GPUs at increasingly-higher prices and at increasingly-higher rates every single year. It is completely reliant on maybe four or five companies being willing to shove tens of billions of dollars a quarter directly into Jensen Huang's wallet. If anything changes here — such as difficulty acquiring debt or investor pressure cutting capex — NVIDIA is in real trouble, as it's made over $95 billion in commitments to build out for the AI bubble.  Yet the real gem was this part: Hell yeah dude! After misleading everybody that it intended to invest $100 billion in OpenAI last year (as I warned everybody about months ago, the deal never existed and is now effectively dead), NVIDIA was allegedly "close" to investing $30 billion. One would think that NVIDIA would, after Huang awkwardly tried to claim that the $100 billion was "never a commitment," say with its full chest how badly it wanted to support OpenAI and how it intended to do so. Especially when you have this note in your 10-K: What a peculiar world we live in. Apparently NVIDIA is "so close" to a "partnership agreement" too, though it's important to remember that Altman, Brockman, and Huang went on CNBC to talk about the last deal, and that never came together. All of this adds a little more anxiety to OpenAI's alleged $100 billion funding round: as The Information reports, Amazon's alleged $50 billion investment will actually be $15 billion, with the next $35 billion contingent on AGI or an IPO. And that $30 billion from NVIDIA is shaping up to be a Klarna-esque three-installment payment plan: A few thoughts: Anyway, on to the main event. New term: analyslop, when somebody writes a long, specious piece with few facts or actual statements, with the intention of it being read as thorough analysis.  This week, alleged financial analyst Citrini Research (not to be confused with Andrew Left's Citron Research) put out a truly awful piece called the "2028 Global Intelligence Crisis," slop-filled scare-fiction written and framed with the authority of well-founded analysis, so much so that it caused a global selloff in stocks.
This piece — if you haven't read it, please do so using my annotated version — spends 7,000 or more words telling the dire tale of what would happen if AI made an indeterminately-large number of white-collar workers redundant.  It isn't clear what exactly AI does, who makes the AI, or how the AI works, just that it replaces people, and then bad stuff happens. Citrini insists that this "isn't bear porn or AI-doomer fan-fiction," but that's exactly what it is — mediocre analyslop framed in the trappings of analysis, sold on a Substack with "research" in the title, specifically written to spook, and ingratiate itself with, anyone involved in the financial markets.  Its goal is to convince you that AI (non-specifically) is scary, that your current stocks are bad, and that AI stocks (unclear which ones those are, by the way) are the future. Also, find out more for $999 a year. Let me give you an example: The goal of a paragraph like this is for you to say "wow, that's what GPUs are doing now!" It isn't, of course. The majority of CEOs report little or no return on investment from AI, with a study of 6,000 CEOs across the US, UK, Germany and Australia finding that "more than 80% [detected] no discernable impact from AI on either employment or productivity." Nevertheless, you read "GPU" and "North Dakota" and you think "wow! That's a place I know, and I know that GPUs power AI!"  I know a GPU cluster in North Dakota — CoreWeave's cluster with Applied Digital, which carries debt so severe that it loses both companies money even if the capacity is rented out 24/7. But let's not let facts get in the way of a poorly-written story. I don't need to go line-by-line — mostly because I'll end up writing a legally-actionable threat — but I need you to know that most of this piece's arguments come down to magical thinking and utterly empty prose. For example, how does AI take over the entire economy?  That's right, they just get better. No need to discuss anything happening today. Even AI 2027 had the balls to make stuff up about "OpenBrain" or whatever. This piece literally just says stuff, including one particularly-egregious lie:  This is a complete and utter lie. A bald-faced lie. This is not something that Claude Code can do. The fact that we have major media outlets quoting this piece suggests that those responsible for explaining how things work don't actually bother to do any of the work to find out, and it's both a disgrace and an embarrassment for the tech and business media that these lies continue to be peddled.  I'm now going to quote part of my upcoming premium (the Hater's Guide To Private Equity, out Friday), because I think it's time we talked about what Claude Code actually does. I've worked in or around SaaS since 2012, and I know the industry well. I may not be able to code, but I take the time to speak with software engineers so that I understand what things actually do and how "impressive" they are. Similarly, I make the effort to understand the underlying business models in a way that I'm not sure everybody else is trying to, and if I'm wrong, please show me an analysis of the financial condition of OpenAI or Anthropic from a booster. You won't find one, because they're not interested in interacting with reality. So, despite all of this being very obvious, it's clear that the markets and an alarming number of people in the media simply do not know what they are talking about, or are intentionally avoiding thinking about it.
The "AI replaces software" story is literally "Anthropic has released a product and now the resulting industry is selling off," such as when it launched a cybersecurity tool that could check for vulnerabilities (a product that has existed in some form for nearly a decade), causing a sell-off in cybersecurity stocks like CrowdStrike — you know, the one that had a faulty bit of code cause a global IT outage that lost the Fortune 500 billions, and resulted in Delta Air Lines having to cancel over 1,200 flights over a period of several days.  There is no rational basis for anything about this sell-off other than that our financial media and markets do not appear to understand even the most basic things about the stuff they invest in. Software may seem complex, but (especially in these cases) it's really quite simple: investors are conflating "an AI model can spit out code" with "an AI model can create the entire experience of what we know as 'software,' or is close enough that we have to start freaking out." This is thanks to the intentionally-deceptive marketing peddled by Anthropic and validated by the media. In a piece from September 2025, Bloomberg reported that Claude Sonnet 4.5 could "code on its own for up to 30 hours straight," a statement directly from Anthropic repeated by other outlets, which added that it did so "on complex, multi-step tasks," none of which were explained. The Verge, however, added that apparently Anthropic "coded a chat app akin to Slack or Teams," and no, you can't see it, or know anything about how much it costs or its functionality. Does it run? Is it useful? Does it work in any way? What does it look like? We have absolutely no proof this happened other than Anthropic saying it, but because the media repeated it, it's now treated as fact.  As I discussed last week, Anthropic's primary business model is deception, muddying the waters of what's possible today and what might be possible tomorrow through a mixture of flimsy marketing statements and chief executive Dario Amodei's doomerist lies about all white collar labor disappearing.  Anthropic tells lies of obfuscation and omission.  Anthropic exploits bad journalism, ignorance and a lack of critical thinking. As I said earlier, the "wow, Claude Code!" articles are mostly from captured boosters and people that do not actually build software, amazed that it can burp up its training data and do an impression of software engineering.  And even if we believe the idea that Spotify's best engineers are not writing any code, I have to ask: to what end? Is Spotify shipping more software? Is the software better? Are there more features? Are there fewer bugs? What are the engineers doing with the time they're saving? A study from METR last year found that, despite engineers thinking they were 24% faster, LLM coding tools made them 19% slower.  I also think we need to really think deeply about how, for the second time in a month, the markets and the media have had a miniature shitfit based on blogs that tell lies using fan fiction. As I covered in my annotations of Matt Shumer's "Something Big Is Happening," the people that are meant to tell the general public what's happening in the world appear to be falling for ghost stories that confirm their biases or investment strategies, even if said stories are full of half-truths and outright lies. I am despairing a little.
When I see Matt Shumer on CNN or hear from the head of a PE firm about Citrini Research, I begin to wonder whether everybody got where they are not through any actual work, but by making the right noises.  This is the grifter economy, and the people that should be stopping them are asleep at the wheel. NVIDIA beat estimates and raised expectations, as it has quarter after quarter. People were initially excited, then started reading the 10-K and seeing weird little things that stood out. $68.1 billion in revenue is a lot of money! That's what you should expect from a company that is the sole vendor of the only thing anybody talks about.  Hyperscaler revenue accounted for slightly more than 50% of NVIDIA's data center revenue. As I wrote about last year, NVIDIA's diversified revenue — that's the revenue that comes from companies that aren't in the Magnificent 7 — continues to collapse. While data center revenue was $62.3 billion, 50% ($31.15 billion) was taken up by hyperscalers…and because we don't get a 10-Q for the fourth quarter, we don't get a breakdown of how many individual customers made up that quarter's revenue. Boo! It is both peculiar and worrying that 36% (around $77.7 billion) of its $215.938 billion in FY2026 revenue came from two customers. If I had to guess, they're likely Foxconn and Quanta Computer, two large Taiwanese ODMs (Original Design Manufacturers) that build the servers for most hyperscalers.  If you want to know more, I wrote a long premium piece that goes into it (among the ways in which AI is worse than the dot com bubble). In simple terms, when a hyperscaler buys GPUs, they go straight to one of these ODMs to put them into servers. This isn't out of the ordinary, but I keep an eye on the ODM revenues (which are published every month) to see if anything shifts, as I think it'll be one of the first signs that things are collapsing. NVIDIA's inventories continue to grow, sitting at over $21 billion (up from around $19 billion last quarter). Could be normal! Could mean stuff isn't shipping. NVIDIA has now agreed to $27 billion in multi-year-long cloud service agreements — literally renting its GPUs back from the people it sells them to — with $7 billion of that expected in its FY2027 (Q1 FY2027 will report in May 2026).  For some context, CoreWeave (which reports FY2025 earnings today, February 26) gave guidance last November that it expected its entire annual revenue to be between $5 billion and $5.15 billion. CoreWeave is arguably the largest AI compute vendor outside of the hyperscalers. If there were significant demand, none of this would be necessary. NVIDIA "invested" $17.5 billion in AI model makers and other early-stage AI startups, and made a further $3.5 billion in land, power, and shell guarantees to "support the build-out of complex datacenter infrastructures." In total, it spent $21 billion propping up the ecosystem that, in turn, feeds billions of dollars into its coffers.  NVIDIA's long-term supply and capacity obligations soared from $30.8 billion to $95.2 billion, largely because NVIDIA's latest chips are extremely complex and require TSMC to make significant investments in hardware and facilities, and it's unwilling to do that without receiving guarantees that it'll make its money back.  NVIDIA expects these obligations to grow.  NVIDIA's accounts receivable (money owed for goods that have been shipped but are yet to be paid for) now sits at $38.4 billion, of which 56% ($21.5 billion) is from three customers. This is turning into a very involved and convoluted process!
It turns out that it's pretty difficult to actually raise $100 billion. This is a big problem, because OpenAI needs $655 billion in the next five years to pay all its bills, and loses billions of dollars a year. If OpenAI is struggling to raise $100 billion today, I don't see how it's possible it survives. If you're to believe reports, OpenAI made $13.1 billion in revenue in 2025 on $8 billion of losses, but remember, my own reporting from last year said that OpenAI only made around $4.329 billion through September 2025, with $8.67 billion of inference costs alone. It is kind of weird that nobody seems to acknowledge my reporting on this subject. I do not see how OpenAI survives.
- It coded for 30 hours [from which you are meant to intimate the code was useful or good, and that these hours were productive].
- It made a Microsoft Teams competitor [that you are meant to assume was full-featured and functional like Teams or Slack, or…functional? And they didn't even have to prove it by showing you it].
- It was able to write uninterruptedly [which you assume was because it was doing good work that didn't need interruption].

0 views
bitonic's blog. 1 month ago

A vibe-coded alternative to YieldGimp

If you're a UK tax resident, short-term low-coupon gilts are the most tax-efficient way to get savings-account-like returns, since most of their yield is tax free. This makes them very popular amongst retail investors, who now hold a large portion of the tradable low-coupon gilts. YieldGimp.com used to be a great free resource to evaluate the gilts currently available. However, it was recently turned into an app rather than a simple webpage. I'm not even sure if the app is free or paid, but I do not want to install the "YieldGimp platform" to quickly check gilt metrics when I buy them. So I asked my LLM of choice to produce an alternative, and after a few minutes and a few rounds of prompting I had something that served my needs. It is available for use at mazzo.li/gilts/ , and the source is on GitHub . It differs from YieldGimp in that it does not show metrics based on the current market price, but rather requires the user to input a price. I find this more useful anyway, since gilts are somewhat illiquid on my broker, so I need to come up with a limit price myself, which means that I want to know what the yield is at my price rather than at the market price. It also lets you select a specific tax rate to produce a "gross equivalent" yield. It is not a very sophisticated tool and it doesn't pretend to model gilts and their tax implications precisely (the repository's README has more details on its shortcomings), but for most use cases it should be informative enough to sanity-check your trades without a Bloomberg terminal.
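To make the tax logic concrete, here is a rough sketch of the calculation involved (my own simplification, not the tool's actual code: it ignores accrued interest, settlement dates and compounding):

```python
# Minimal sketch of why low-coupon gilts are tax-efficient for a UK taxpayer,
# and what a "gross equivalent" yield means. Gilt capital gains are tax free;
# only the coupon is taxed as income.

def gilt_yields(clean_price: float, coupon: float, years: float, tax_rate: float):
    """Approximate annualized yields for a short-dated gilt held to maturity.

    clean_price: price paid per 100 nominal (e.g. 96.5)
    coupon: annual coupon per 100 nominal (e.g. 0.25 for a 0.25% gilt)
    years: time to maturity in years
    tax_rate: marginal income tax rate (e.g. 0.40)
    """
    capital_gain = (100.0 - clean_price) / years   # pull to par, tax free
    net_income = coupon * (1.0 - tax_rate)         # coupon taxed as income
    net_yield = (capital_gain + net_income) / clean_price
    # The gross rate a savings account would need to pay to match it:
    gross_equivalent = net_yield / (1.0 - tax_rate)
    return net_yield, gross_equivalent

# e.g. a 0.25% coupon gilt at 96.5 with 1.5 years left, for a 40% taxpayer:
net, gross_eq = gilt_yields(96.5, 0.25, 1.5, 0.40)
print(f"net {net:.2%}, gross equivalent {gross_eq:.2%}")
```

Because almost all of the return comes from the tax-free pull to par rather than the taxed coupon, the gross-equivalent figure ends up well above what a taxable savings account would need to advertise.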

0 views

Premium: The AI Data Center Financial Crisis

Since the beginning of 2023, big tech has spent over $814 billion in capital expenditures, with a large portion of that going towards meeting the demands of AI companies like OpenAI and Anthropic.  Big tech has spent big on GPUs, power infrastructure, and data center construction, using a variety of financing methods to do so, including (but not limited to) leasing. And the way they're going about structuring these finance deals is growing increasingly bizarre.  I'm not merely talking about Meta's curious arrangement for its facility in Louisiana, though that certainly raised some eyebrows. Last year, Morgan Stanley published a report that claimed hyperscalers were increasingly relying on finance leases to obtain the "powered shell" of a data center, rather than the more common method of operating leases.  The key difference here is that finance leases, unlike operating leases, are effectively long-term loans where the borrower is expected to retain ownership of the asset (whether that be a GPU or a building) at the end of the contract. Traditionally, these types of arrangements have been used to finance the bits of a data center that have a comparatively limited useful life — like computer hardware, which grows obsolete with time. The spending to date is, as I've written about again and again, astronomical considering the lack of meaningful revenue from generative AI.  Even a year straight of manufacturing consent for Claude Code as the be-all-end-all of software development produced putrid results for Anthropic — $4.5 billion of revenue and $5.2 billion of losses before interest, taxes, depreciation and amortization, according to The Information — with (per WIRED) Claude Code only accounting for around $1.1 billion in annualized revenue in December, or around $92 million in monthly revenue. This was in a year in which Anthropic raised a total of $16.5 billion (with $13 billion of that coming in September 2025), and it's already working on raising another $25 billion. This might be because it promised to buy $21 billion of Google TPUs from Broadcom, or because Anthropic expects AI model training to cost over $100 billion in the next three years. And it just raised another $30 billion — albeit with the caveat that some of said $30 billion came from previously-announced funding agreements with NVIDIA and Microsoft, though how much remains a mystery. According to Anthropic's new funding announcement, Claude Code's run rate has grown to "over $2.5 billion" as of February 12, 2026 — or around $208 million a month. Based on literally every bit of reporting about Anthropic, costs have likely spiked along with revenue, which hit $14 billion annualized ($1.16 billion in a month) as of that date.  I have my doubts, but let's put them aside for now. Anthropic is also in the midst of one of the most aggressive and dishonest public relations campaigns in history. While its Chief Commercial Officer Paul Smith told CNBC that it was "focused on growing revenue" rather than "spending money," it's currently making massive promises — tens of billions on Google Cloud, "$50 billion in American AI infrastructure," and $30 billion on Azure.
And despite Smith saying that Anthropic was less interested in "flashy headlines," Chief Executive Dario Amodei has said, in the last three weeks, that "almost unimaginable power is potentially imminent," that AI could replace all software engineers in the next 6-12 months, that AI may (it's always fucking may) cause "unusually painful disruption to jobs," and wrote a 19,000-word essay — I guess AI is coming for my job after all! — where he repeated his noxious line that "we will likely get a century of scientific and economic progress compressed in a decade." Yet arguably the most dishonest part is this word "training." When you read "training," you're meant to think "oh, it's training for something, this is an R&D cost," when "training LLMs" is as consistent a cost as inference (the creation of the output) or any other kind of maintenance.  While most people know about pretraining — the shoving of large amounts of data into a model (this is a simplification, I realize) — in reality a lot of the current spate of models use post-training, which covers everything from small tweaks to model behavior to full-blown reinforcement learning where experts reward or punish particular responses to prompts. To be clear, all of this is well-known and documented, but the nomenclature of "training" suggests that it might stop one day, versus the truth: training costs are increasing dramatically, and "training" covers anything from training new models to bug fixes on existing ones. And, more fundamentally, it's an ongoing cost — something that's an essential and unavoidable cost of doing business.  Training is, for an AI lab like OpenAI and Anthropic, as common (and necessary) a cost as those associated with creating outputs (inference), yet it's kept entirely out of gross margins: This is inherently deceptive. While one could argue that R&D is not considered in gross margins, training isn't R&D — gross margins generally include the raw materials necessary to build something, and training is absolutely part of the raw costs of running an AI model. Direct labor and parts are considered part of the calculation of gross margin, and spending on training — both the data and the process of training itself — is absolutely meaningful; to leave it out is an act of deception.  Anthropic's 2025 gross margins were 40% — or 38% if you include free users of Claude — on inference costs of $2.7 (or $2.79) billion, with training costs of around $4.1 billion. What happens if you add training costs into the equation?  Let's work it out! Training is not an up-front cost, and considering it one only serves to help Anthropic cover for its wretched business model. Anthropic (like OpenAI) can never stop training, ever, and to pretend otherwise is misleading. This is not the cost just to "train new models" but to maintain current ones, build new products around them, and many other things that are direct, impossible-to-avoid components of COGS. They're manufacturing costs, plain and simple. Anthropic projects to spend $100 billion on training in the next three years, which suggests it will spend — proportional to its current costs — around $32 billion on inference in the same period, on top of $21 billion of TPU purchases, on top of $30 billion on Azure (I assume in that period?), on top of "tens of billions" on Google Cloud. When you actually add these numbers together (assuming "tens of billions" is $15 billion), that's $198 billion — call it $200 billion.
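To spell out the maths on that margin point (a quick check using the 38% figure and the $4.5 billion of revenue above; the same numbers reappear in the summary at the end of this piece):

$$\text{COGS}_{\text{reported}} = \$4.5\text{B} \times (1 - 0.38) = \$2.79\text{B}$$

$$\text{COGS}_{\text{incl. training}} = \$2.79\text{B} + \$4.1\text{B} = \$6.89\text{B}, \qquad \text{gross margin} = \frac{\$4.5\text{B} - \$6.89\text{B}}{\$4.5\text{B}} \approx -53\%$$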
Anthropic (per The Information's reporting) tells investors it will make $18 billion in revenue in 2026 and $55 billion in 2027 — roughly 4x and 3x year-over-year, respectively — and is already raising $25 billion after having just closed a $30 billion deal. How does Anthropic pay its bills? Why does outlet after outlet print these fantastical numbers without doing the maths of "how does Anthropic actually get all this money?" Because even with their ridiculous revenue projections, this company is still burning cash, and when you start to actually do the maths around anything in the AI industry, things become genuinely worrying.  You see, every single generative AI company is unprofitable, and appears to be getting less profitable over time. Both The Information and the Wall Street Journal reported the same bizarre statement in November — that Anthropic would "turn a profit more quickly than OpenAI," with The Information saying Anthropic would be cash flow positive in 2027 and the Journal putting the date at 2028, only for The Information to report in January that 2028 was the more realistic date.  If you're wondering how, the answer is "Anthropic will magically become cash flow positive in 2028": This is also the exact same logic as OpenAI, which will, per The Information in September, also, somehow, magically turn cash flow positive in 2030: Oracle, which has a 5-year-long, $300 billion compute deal with OpenAI that it lacks the capacity to serve and that OpenAI lacks the cash to pay for, also appears to have the same magical plan to become cash flow positive in 2029: Somehow, Oracle's case is the most legit, in that, theoretically, by that time it would be done, I assume, paying the $38 billion it's raising for Stargate Shackelford and Wisconsin, but said assumption also hinges on the idea that OpenAI finds $300 billion somehow. It also relies upon Oracle raising more debt than it currently has — which, even before the AI hype cycle swept over the company, was a lot.  As I discussed a few weeks ago in the Hater's Guide To Oracle, a megawatt of data center IT load generally costs (per Jerome Darling of TD Cowen) around $12-14 million in construction (likely more due to skilled labor shortages, supply constraints and rising equipment prices), plus another $30 million in GPUs and associated hardware. In plain terms, Oracle (and its associated partners) needs around $189 billion to build the 4.5GW of Stargate capacity to make the revenue from the OpenAI deal, meaning that it needs around another $100 billion once it raises $50 billion in combined debt, bonds, and printing new shares by the end of 2026. I will admit I feel a little crazy writing this all out, because it's somehow a fringe belief to do the very basic maths and say "hey, Oracle doesn't have the capacity and OpenAI doesn't have the money." In fact, nobody seems to want to really talk about the cost of AI, because it's much easier to say "I'm not a numbers person" or "they'll work it out." This is why in today's newsletter I am going to lay out the stark reality of the AI bubble, and debut a model I've created to measure the actual, real costs of an AI data center. While my methodology is complex, my conclusions are simple: running AI data centers is, even when you remove the debt required to stand up these data centers, a mediocre business that is vulnerable to basically any change in circumstances.
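For reference, that 4.5GW build-out figure spelled out (using the low end of the TD Cowen construction number):

$$4.5\ \text{GW} = 4{,}500\ \text{MW}, \qquad 4{,}500 \times (\$12\text{M} + \$30\text{M}) = \$189\ \text{billion}$$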
Based on hours of discussions with data center professionals, analysts and economists, I have calculated that in most cases, the average AI data center has gross margins of somewhere between 30% and 40% — margins that decay rapidly for every day, week, or month it takes to put a data center into operation. This is why Oracle has negative 100% margins on NVIDIA's GB200 chips — the burdensome up-front cost of building AI data centers (the GPUs, servers, and other associated hardware) leaves you billions of dollars in the hole before you even start serving compute, after which you're left to contend with taxes, depreciation, financing, and the cost of actually powering the hardware.  Yet things sour further when you face the actual financial realities of these deals — and the debt associated with them.  Based on my current model of the 1GW Stargate Abilene data center, Oracle likely plans to make around $11 billion in revenue a year from the 1.2GW (or around 880MW of critical IT). While that sounds good, when you add things like depreciation, electricity, colocation costs of $1 billion a year from Crusoe, opex, and the myriad of other costs, its margins sit at a stinkerific 27.2% — and that's assuming OpenAI actually pays, on time, in a reliable way. Things only get worse when you factor in the cost of debt. While Oracle has funded Abilene using a mixture of bonds and existing cashflow, it very clearly has yet to receive the majority of the $25 billion+ in GPUs and associated hardware (with only 96,000 GPUs "delivered"), meaning that it likely bought them out of its $18 billion bond sale from last September.  If we follow that maths, Oracle is paying a little less than $963 million a year (per the terms of the bond sale) whether or not a single GPU is even turned on, leaving us with a net margin of 22.19%... and this is assuming OpenAI pays every single bill, every single time, and there are absolutely no delays. Delays are also very, very expensive. Based on my model, if we assume that 100MW of critical IT load is operational (roughly two buildings and 100,000 GB200s) but has yet to start generating revenue, Oracle is burning, without depreciation (EDITOR'S NOTE: sorry! This previously said depreciation was a cash expense and was included in this number, even though it wasn't, but it's correct in the model!), around $4.69 million a day in cash. I have also confirmed with sources in Abilene that there is no chance that Stargate Abilene is fully operational in 2026. In simpler terms: I will admit I'm quite disappointed that the media at large has mostly ignored this story. Limp, cautious "are we in an AI bubble?" conversations are insufficient to deal with the potential for collapse we're facing.  Today, I'm going to dig into the reality of the costs of AI, and explain in gruesome detail exactly how easily these data centers can rapidly approach insolvency in the event that their tenants fail to pay.  The chain of pain is real: Today I'm going to explain how easily it breaks. If Anthropic's gross margin was 38% in 2025, that means its COGS (cost of goods sold) was $2.79 billion. If we add training, this brings COGS to $6.89 billion, leaving us with -$2.39 billion in gross profit on $4.5 billion in revenue. This results in a negative 53% gross margin. AI startups are all unprofitable, and do not appear to have a path to sustainability.
AI data centers are being built in anticipation of demand that doesn’t exist, and will only exist if AI startups — which are all unprofitable — can afford to pay them. Oracle, which has committed to building 4.5GW of data centers, is burning cash every day that OpenAI takes to set up its GPUs, and when it starts making money, it does so from a starting position of billions and billions of dollars in debt. Margins are low throughout the entire stack of AI data center operators — from landlords like Applied Digital to compute providers like CoreWeave — thanks to the billions in debt necessary to fund both construction and IT hardware to make them run, putting both parties in a hole that can only be filled with revenues that come from either hyperscalers or AI startups.  In a very real sense, the AI compute industry is dependent on AI “working out,” because if it doesn’t, every single one of these data centers will become a burning hole in the ground.

1 view
Stratechery 2 months ago

Amazon Earnings, CapEx Concerns, Commodity AI

Amazon's massive CapEx increase makes me much more nervous than Google's, but it is understandable.

0 views
Stratechery 2 months ago

Apple Earnings, Supply Chain Speculation, China and Industrial Design

Apple's earnings could have been higher but the company couldn't get enough chips; then, once again a new design meant higher sales in China.

0 views
Stratechery 2 months ago

Meta Earnings, Turning Dials, Zuckerberg’s Motivation

Meta is up, despite massive CapEx plans. The company is turning every dial to drive revenue, because Mark Zuckerberg thinks winning in AI is existential.

0 views
Stratechery 2 months ago

An Interview with Kalshi CEO Tarek Mansour About Prediction Markets

An interview with Kalshi co-founder and CEO Tarek Mansour about the value of prediction markets.

0 views

Lessons from Building AI Agents for Financial Services

I've spent the last two years building AI agents for financial services. Along the way, I've accumulated a fair number of battle scars and learnings that I want to share. Here's what I'll cover:
- The Sandbox Is Not Optional - Why isolated execution environments are essential for multi-step agent workflows
- Context Is the Product - How we normalize heterogeneous financial data into clean, searchable context
- The Parsing Problem - The hidden complexity of extracting structured data from adversarial SEC filings
- Skills Are Everything - Why markdown-based skills are becoming the product, not the model
- The Model Will Eat Your Scaffolding - Designing for obsolescence as models improve
- The S3-First Architecture - Why S3 beats databases for file storage and user data
- The File System Tools - How ReadFile, WriteFile, and Bash enable complex financial workflows
- Temporal Changed Everything - Reliable long-running tasks with proper cancellation handling
- Real-Time Streaming - Building responsive UX with delta updates and interactive agent workflows
- Evaluation Is Not Optional - Domain-specific evals that catch errors before they cost money
- Production Monitoring - The observability stack that keeps financial agents reliable
Why financial services is extremely hard. This domain doesn't forgive mistakes. Numbers matter. A wrong revenue figure, a misinterpreted guidance statement, an incorrect DCF assumption. Professional investors make million-dollar decisions based on our output. One mistake on a $100M position and you've destroyed trust forever. The users are also demanding. Professional investors are some of the smartest, most time-pressed people you'll ever work with. They spot bullshit instantly. They need precision, speed, and depth. You can't hand-wave your way through a valuation model or gloss over nuances in an earnings call. This forces me to develop an almost paranoid attention to detail. Every number gets double-checked. Every assumption gets validated. Every model gets stress-tested. You start questioning everything the LLM outputs because you know your users will. A single wrong calculation in a DCF model and you lose credibility forever. I sometimes feel that the fear of being wrong becomes our best feature. Over the years building with LLMs, we've made bold infrastructure bets early, and I think we have been right. For instance, when Claude Code launched with its filesystem-first agentic approach, we immediately adopted it. It was not an obvious bet, and it was a massive revamp of our architecture. I was extremely lucky to have Thariq from Anthropic's Claude Code team jump on a Zoom and open my eyes to the possibilities. At the time the whole industry, Fintool included, was building elaborate RAG pipelines with vector databases and embeddings. After reflecting on the future of information retrieval with agents, I wrote "the RAG obituary" and Fintool moved fully to agentic search. We even decided to retire our precious embedding pipeline. Sad, but whatever is best for the future! People thought we were crazy. The article got a lot of praise and a lot of negative comments. Now I feel most startups are adopting these best practices. I believe we're early on several other architectural choices too. I'm sharing them here because the best way to test ideas is to put them out there. Let's start with the biggest one.
The Sandbox Is Not Optional
When we first started building Fintool in 2023, I thought sandboxing might be overkill. "We're just running Python scripts," I told myself.
"What could go wrong?" Haha. Everything. Everything could go wrong. The first time an LLM decided to `rm -rf /` on our server (it was trying to "clean up temporary files"), I became a true believer. Here's the thing: agents need to run multi-step operations. A professional investor asks for a DCF valuation, and that's not a single API call. The agent needs to research the company, gather financial data, build a model in Excel, run sensitivity analysis, generate complex charts, iterate on assumptions. That's dozens of steps, each potentially modifying files, installing packages, running scripts. You can't do this without code execution. And executing arbitrary code on your servers is insane. Every chat application needs a sandbox. Today each user gets their own isolated environment. The agent can do whatever it wants in there. Delete everything? Fine. Install weird packages? Go ahead. It's your sandbox, knock yourself out. The architecture looks like this: three mount points. Private is read/write for your stuff. Shared is read-only for your organization. Public is read-only for everyone. The magic is in the credentials. We use AWS ABAC (Attribute-Based Access Control) to generate short-lived credentials scoped to specific S3 prefixes. User A literally cannot access User B's data. The IAM policy uses `${aws:PrincipalTag/S3Prefix}` to restrict access. The credentials physically won't allow it. This is also very good for enterprise deployments. We also do sandbox pre-warming. When a user starts typing, we spin up their sandbox in the background. By the time they hit enter, the sandbox is ready. A 600-second timeout, extended by 10 minutes on each tool use. The sandbox stays warm across conversation turns. So sandboxes are amazing, but their under-discussed magic is the support for the filesystem. Which brings us to the next lesson learned, about context.
Context Is the Product
Your agent is only as good as the context it can access. The real work isn't prompt engineering; it's turning messy financial data from dozens of sources into clean, structured context the model can actually use. This requires massive domain expertise from the engineering team. The heterogeneity problem. Financial data comes in every format imaginable:
- SEC filings: HTML with nested tables, exhibits, signatures
- Earnings transcripts: Speaker-segmented text with Q&A sections
- Press releases: Semi-structured HTML from PRNewswire
- Research reports: PDFs with charts and footnotes
- Market data: Snowflake/databases with structured numerical data
- News: Articles with varying quality and structure
- Alternative data: Satellite imagery, web traffic, credit card panels
- Broker research: Proprietary PDFs with price targets and models
- Fund filings: 13F holdings, proxy statements, activist letters
Each source has different schemas, different update frequencies, different quality levels. The agent needs one thing: clean context it can reason over. The normalization layer. Everything becomes one of three formats (sketched below):
- Markdown for narrative content (filings, transcripts, articles)
- CSV/tables for structured data (financials, metrics, comparisons)
- JSON metadata for searchability (tickers, dates, document types, fiscal periods)
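As an illustration of what that normalization contract can produce (my own sketch, not Fintool's code; `normalize_transcript` and the file layout are hypothetical):

```python
import json
from pathlib import Path

# Hypothetical sketch of the normalization contract described above:
# every source document becomes markdown (or a table) plus a meta.json
# that makes it searchable. Not Fintool's actual code.

def normalize_transcript(ticker: str, fiscal_period: str,
                         turns: list[tuple[str, str]], out_dir: Path) -> None:
    """Write an earnings transcript as markdown plus JSON metadata."""
    out_dir.mkdir(parents=True, exist_ok=True)

    # Narrative content becomes markdown, one section per speaker turn.
    body = "\n\n".join(f"## {speaker}\n\n{text}" for speaker, text in turns)
    (out_dir / "transcript.md").write_text(body)

    # Structured metadata enables ticker/date/type filtering at retrieval time.
    meta = {
        "ticker": ticker,
        "doc_type": "earnings_transcript",
        "fiscal_period": fiscal_period,
        "speakers": [speaker for speaker, _ in turns],
    }
    (out_dir / "meta.json").write_text(json.dumps(meta, indent=2))

normalize_transcript(
    "AAPL", "FY2024-Q1",
    [("CEO remarks", "Services revenue reached an all-time high..."),
     ("Q&A - Analyst 1", "Can you quantify the services margin trend?")],
    Path("normalized/AAPL/2024-q1-transcript"),
)
```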
Chunking strategy matters. Not all documents chunk the same way:
- 10-K filings: Section by regulatory structure (Item 1, 1A, 7, 8...)
- Earnings transcripts: Chunk by speaker turn (CEO remarks, CFO remarks, Q&A by analyst)
- Press releases: Usually small enough to be one chunk
- News articles: Paragraph-level chunks
- 13F filings: By holder and position changes quarter-over-quarter
The chunking strategy determines what context the agent retrieves. Bad chunks = bad answers. Tables are special. Financial data is full of tables and CSVs. Revenue breakdowns, segment performance, guidance ranges. LLMs are surprisingly good at reasoning over markdown tables, but they're terrible at reasoning over HTML `<table>` tags or raw CSV dumps. The normalization layer converts everything to clean markdown tables. Metadata enables retrieval. The user asks the agent: "What did Apple say about services revenue in their last earnings call?" To answer this, Fintool needs:
- Ticker resolution (AAPL → correct company)
- Document type filtering (earnings transcript, not 10-K)
- Temporal filtering (most recent, not 2019)
- Section targeting (CFO remarks or revenue discussion, not legal disclaimers)
This is why `meta.json` exists for every document. Without structured metadata, you're doing keyword search over a haystack. It speeds up the search, big time! Anyone can call an LLM API. Not everyone has normalized decades of financial data into searchable, chunked markdown with proper metadata. The data layer is what makes agents actually work.
The Parsing Problem
Normalizing financial data is 80% of the work. Here's what nobody tells you. SEC filings are adversarial. They're not designed for machine reading. They're designed for legal compliance:
- Tables span multiple pages with repeated headers
- Footnotes reference exhibits that reference other footnotes
- Numbers appear in text, tables, and exhibits—sometimes inconsistently
- XBRL tags exist but are often wrong or incomplete
- Formatting varies wildly between filers (every law firm has their own template)
We tried off-the-shelf PDF/HTML parsers. They failed on:
- Multi-column layouts in proxy statements
- Nested tables in MD&A sections (tables within tables within tables)
- Watermarks and headers bleeding into content
- Scanned exhibits (still common in older filings and attachments)
- Unicode issues (curly quotes, em-dashes, non-breaking spaces)
The Fintool parsing pipeline: raw filing (HTML/PDF) → document structure detection (headers, sections, exhibits) → table extraction with cell relationship preservation → entity extraction (companies, people, dates, dollar amounts) → cross-reference resolution (Ex. 10.1 → actual exhibit content) → fiscal period normalization (FY2024 → Oct 2023 to Sep 2024 for Apple) → quality scoring (confidence per extracted field). Table extraction deserves its own section. Financial tables are dense with meaning. A revenue breakdown table might have:
- Merged header cells spanning multiple columns
- Footnote markers (1), (2), (a), (b) that reference explanations below
- Parentheses for negative numbers: $(1,234) means -1,234
- Mixed units in the same table (millions for revenue, percentages for margins)
- Prior period restatements in italics or with asterisks
We score every extracted table on:
- Cell boundary accuracy (did we split/merge correctly?)
- Header detection (is row 1 actually headers, or is there a title row above?)
- Numeric parsing (is "$1,234" parsed as 1234 or left as text?)
- Unit inference (millions? billions? per share? percentage?)
Tables below 90% confidence get flagged for review.
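A minimal sketch of what such scoring might look like (my own illustration of the idea; the field names, weights, and threshold are hypothetical, not Fintool's actual rubric):

```python
from dataclasses import dataclass

# Hypothetical illustration of per-table quality scoring. Each check mirrors
# one of the criteria above; the weights and 0.90 threshold are invented.

@dataclass
class TableChecks:
    cell_boundary_accuracy: float  # 0..1: did we split/merge cells correctly?
    header_detected: bool          # is row 1 actually a header row?
    numeric_parse_rate: float      # 0..1: share of "$1,234"-style cells parsed to numbers
    units_inferred: bool           # millions? billions? per share?

def table_confidence(c: TableChecks) -> float:
    """Weighted confidence score in [0, 1] for one extracted table."""
    return (0.4 * c.cell_boundary_accuracy
            + 0.2 * (1.0 if c.header_detected else 0.0)
            + 0.3 * c.numeric_parse_rate
            + 0.1 * (1.0 if c.units_inferred else 0.0))

checks = TableChecks(0.98, True, 0.85, False)
score = table_confidence(checks)
if score < 0.90:  # below the 90% bar: flag for review instead of using it
    print(f"flagged for review: confidence {score:.2f}")
```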
Low-confidence extractions don't enter the agent's context—garbage in, garbage out. Fiscal period normalization is critical. "Q1 2024" is ambiguous:
- Calendar Q1 (January-March 2024)
- Apple's fiscal Q1 (October-December 2023)
- Microsoft's fiscal Q1 (July-September 2023)
- "Reported in Q1" (filed in Q1, but covers the prior period)
We maintain a fiscal calendar database for 10,000+ companies. Every date reference gets normalized to absolute date ranges. When the agent retrieves "Apple Q1 2024 revenue," it knows to look for data from October-December 2023. This is invisible to users but essential for correctness. Without it, you're comparing Apple's October revenue to Microsoft's January revenue and calling it the "same quarter."
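A toy version of that normalization (my own sketch; the real database covers 10,000+ companies and far more edge cases than a fiscal-year-end lookup):

```python
from datetime import date

# Hypothetical sketch: map (ticker, fiscal quarter) to an absolute date range
# using the month in which each company's fiscal year ends.

FISCAL_YEAR_END_MONTH = {"AAPL": 9, "MSFT": 6}  # default: 12 (calendar year)

def quarter_to_dates(ticker: str, fiscal_year: int, quarter: int) -> tuple[date, date]:
    """Return the (start, end-exclusive) calendar range for a fiscal quarter."""
    fy_end = FISCAL_YEAR_END_MONTH.get(ticker, 12)
    # The fiscal year starts the month after it ends, in the prior calendar year.
    start_month = fy_end % 12 + 1
    start_year = fiscal_year - 1 if fy_end != 12 else fiscal_year
    # Advance to the first month of the requested quarter.
    month = start_month + 3 * (quarter - 1)
    year = start_year + (month - 1) // 12
    month = (month - 1) % 12 + 1
    end_month, end_year = month + 3, year
    end_year += (end_month - 1) // 12
    end_month = (end_month - 1) % 12 + 1
    return date(year, month, 1), date(end_year, end_month, 1)

print(quarter_to_dates("AAPL", 2024, 1))  # (2023-10-01, 2024-01-01)
print(quarter_to_dates("MSFT", 2024, 1))  # (2023-07-01, 2023-10-01)
print(quarter_to_dates("KO", 2024, 1))    # calendar default: (2024-01-01, 2024-04-01)
```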
Skills Are Everything
Here's the thing nobody tells you about building AI agents: the model is not the product. The skills are now the product. I learned this the hard way. We used to try making the base model "smarter" through prompt engineering. Tweak the system prompt, add examples, write elaborate instructions. It helped a little. But skills were the missing part. In October 2025, Anthropic formalized this with Agent Skills, a specification for extending Claude with modular capability packages. A skill is a folder containing a `SKILL.md` file with YAML frontmatter (name and description), plus any supporting scripts, references, or data files the agent might need. We'd been building something similar for months before the announcement. The validation felt good, but more importantly, having an industry standard means our skills can eventually be portable. Without skills, models are surprisingly bad at domain tasks. Ask a frontier model to do a DCF valuation. It knows what DCF is. It can explain the theory. But actually executing one? It will miss critical steps, use wrong discount rates for the industry, forget to add back stock-based compensation, skip sensitivity analysis. The output looks plausible but is subtly wrong in ways that matter. The breakthrough came when we started thinking about skills as first-class citizens. Like part of the product itself. A skill is a markdown file that tells the agent how to do something specific. Here's a simplified version of our DCF skill: That's it. A markdown file. No code changes. No production deployment. Just a file that tells the agent what to do. Skills are better than code. This matters enormously:
1. Non-engineers can create skills. Our analysts write skills. Our customers write skills. A portfolio manager who's done 500 DCF valuations can encode their methodology in a skill without writing a single line of Python.
2. No deployment needed. Change a skill file and it takes effect immediately. No CI/CD, no code review, no waiting for release cycles. Domain experts can iterate on their own.
3. Readable and auditable. When something goes wrong, you can read the skill and understand exactly what the agent was supposed to do. Try doing that with a 2,000-line Python module.
We have a copy-on-write shadowing system (priority: private > shared > public). So if you don't like how we do DCF valuations, write your own. Drop it in `/private/skills/dcf/SKILL.md`. Your version wins. Why we don't mount all skills to the filesystem. This is important. The naive approach would be to mount every skill file directly into the sandbox. The agent can just `cat` any skill it needs. Simple, right? Wrong. Here's why we use SQL discovery instead:
1. Lazy loading. We have dozens of skills with extensive documentation; the DCF skill alone has 10+ industry guideline files. Loading all of them into context for every conversation would burn tokens and confuse the model. Instead, we discover skill metadata (name, description) upfront, and only load the full documentation when the agent actually uses that skill.
2. Access control at query time. The SQL query implements our three-tier access model: public skills available to everyone, organization skills for that org's users, private skills for individual users. The database enforces this. You can't accidentally expose a customer's proprietary skill to another customer.
3. Shadowing logic. When a user customizes a skill, their version needs to override the default. SQL makes this trivial—query all three levels, apply priority rules, return the winner. Doing this with filesystem mounts would be a nightmare of symlinks and directory ordering.
4. Metadata-driven filtering. The `fs_files.metadata` column stores parsed YAML frontmatter. We can filter by skill type, check if a skill is main-agent-only, or query any other structured attribute—all without reading the files themselves.
The pattern: S3 is the source of truth, a Lambda function syncs changes to PostgreSQL for fast queries, and the agent gets exactly what it needs when it needs it (there's a sketch of the lookup at the end of this section). Skills are essential. I cannot emphasize this enough. If you're building an AI agent and you don't have a skills system, you're going to have a bad time. My biggest argument for skills is that top models (Claude or GPT) are post-trained on using skills. The model wants to fetch skills. Models just want to learn, and what they want to learn is our skills... until they eat them.
The Model Will Eat Your Scaffolding
Here's the uncomfortable truth: everything I just told you about skills? It's temporary, in my opinion. Models are getting better. Fast. Every few months, there's a new model that makes half your code obsolete. The elaborate scaffolding you built to handle edge cases? The model just... handles them now. When we started, we needed detailed skills with step-by-step instructions for some simple tasks. "First do X, then do Y, then check Z." Now? For a simple task we can often just say "do an earnings preview" and the model figures it out (kind of!). This creates a weird tension. You need skills today because current models aren't smart enough. But you should design your skills knowing that future models will need less hand-holding. That's why I'm bullish on markdown files versus code for model instructions. They're easier to update and delete. We send detailed feedback to AI labs. Whenever we build complex scaffolding to work around model limitations, we document exactly what the model struggles with and share it with the lab research team. This helps inform the next generation of models. The goal is to make our own scaffolding obsolete. My prediction: in two years, most of our basic skills will be one-liners. "Generate a 20-tab DCF." That's it. The model will know what that means. But here's the flip side: as basic tasks get commoditized, we'll push into more complex territory. Multi-step valuations with segment-by-segment analysis. Automated backtesting of investment strategies. Real-time portfolio monitoring with complex triggers. The frontier keeps moving. So we write skills. We delete them when they become unnecessary. And we build new ones for the harder problems that emerge. And all of that is files... in our filesystem.
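Before moving on to storage, here's the promised sketch of the three-tier SQL skill discovery (my own illustration; the post implies an `fs_files` table, but the exact columns and query are my assumptions, and SQLite stands in for PostgreSQL to keep it self-contained):

```python
import sqlite3  # stand-in for PostgreSQL, so the sketch runs anywhere

# Hypothetical illustration of SQL-based skill discovery with shadowing:
# query all three visibility tiers, then let private > shared > public win.

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fs_files (path TEXT, tier TEXT, name TEXT, description TEXT, owner TEXT)")
conn.executemany("INSERT INTO fs_files VALUES (?,?,?,?,?)", [
    ("/public/skills/dcf/SKILL.md", "public", "dcf", "Default DCF methodology", None),
    ("/private/skills/dcf/SKILL.md", "private", "dcf", "My DCF: always add back SBC", "user_a"),
])

def discover_skill(conn, user: str, org: str, name: str):
    """Return the winning skill metadata row, applying shadowing priority."""
    rows = conn.execute(
        """SELECT path, tier, description FROM fs_files
           WHERE name = ?
             AND (tier = 'public'
                  OR (tier = 'shared' AND owner = ?)
                  OR (tier = 'private' AND owner = ?))""",
        (name, org, user),
    ).fetchall()
    priority = {"private": 0, "shared": 1, "public": 2}
    return min(rows, key=lambda r: priority[r[1]], default=None)

print(discover_skill(conn, "user_a", "org_1", "dcf"))
# -> ('/private/skills/dcf/SKILL.md', 'private', 'My DCF: always add back SBC')
```

Note that only the metadata (name, description, path) comes back; the full documentation is loaded lazily when the agent actually invokes the skill.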
The S3-First Architecture
Here's something that surprised me: S3 for files is a better database than a database. We store user data (watchlists, portfolios, preferences, memories, skills) in S3 as YAML files. S3 is the source of truth. A Lambda function syncs changes to PostgreSQL for fast queries. Writes → S3 (source of truth) → Lambda trigger → PostgreSQL (the fs_files table); reads ← fast queries.
- Durability: S3 has 11 9's. A database doesn't.
- Versioning: S3 versioning gives you audit trails for free.
- Simplicity: YAML files are human-readable. You can debug with `cat`.
- Cost: S3 is cheap. Database storage is not.
The pattern:
- Writes go to S3 directly
- List queries hit the database (fast)
- Single-item reads go to S3 (freshest data)
The sync architecture. We run two Lambda functions to keep S3 and PostgreSQL in sync (the real-time half is sketched at the end of this section):
- S3 (file upload/delete) → fs-sync Lambda → upsert/delete in the fs_files table (real-time)
- EventBridge (every 3 hours) → fs-reconcile Lambda → full S3-vs-DB scan, fixing discrepancies
Both use upserts with timestamp guards—newer data always wins. The reconcile job catches any events that slipped through (S3 eventual consistency, Lambda cold starts, network blips). User memories live here too. Every user has a `/private/memories/UserMemories.md` file in S3. It's just markdown—users can edit it directly in the UI. On every conversation, we load it and inject it as context: This is surprisingly powerful. Users write things like "I focus on small-cap value stocks" or "Always compare to industry median, not mean" or "My portfolio is concentrated in tech, so flag concentration risk." The agent sees this on every conversation and adapts accordingly. No migrations. No schema changes. Just a markdown file that the user controls. Watchlists work the same way. YAML files in S3, synced to PostgreSQL for fast queries. When a user asks about "my watchlist," we load the relevant tickers and inject them as context. The agent knows what companies matter to this user. The filesystem becomes the user's personal knowledge base. Skills tell the agent how to do things. Memories tell it what the user cares about. Both are just files.
The File System Tools
Agents in financial services need to read and write files. A lot of files. PDFs, spreadsheets, images, code. Here's how we handle it. ReadFile handles the complexity. WriteFile creates artifacts that link back to the UI. Bash gives persistent shell access with a 180-second timeout and a 100K-character output limit. Path normalization on everything (LLMs love trying path traversal attacks, it's hilarious). Bash is more important than you think. There's a growing conviction in the AI community that filesystems and bash are the optimal abstraction for AI agents. Braintrust recently ran an eval comparing SQL agents, bash agents, and hybrid approaches for querying semi-structured data. The results were interesting: pure SQL hit 100% accuracy but missed edge cases. Pure bash was slower and more expensive but caught verification opportunities. The winner? A hybrid approach where the agent uses bash to explore and verify, and SQL for structured queries. This matches our experience. Financial data is messy. You need bash to grep through filing documents, find patterns, explore directory structures. But you also need structured tools for the heavy lifting. The agent needs both—and the judgment to know when to use each. We've leaned hard into giving agents full shell access in the sandbox. It's not just for running Python scripts. It's for exploration, verification, and the kind of ad-hoc data manipulation that complex tasks require.
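And here is the promised condensed sketch of the real-time sync half of the S3-first pattern (my own illustration; the handler shape follows AWS's standard S3-event Lambda signature, but the table layout and upsert logic are assumptions):

```python
import os
import urllib.parse
import psycopg2  # PostgreSQL client; assumes DATABASE_URL in the environment

# Hypothetical sketch of an fs-sync Lambda: S3 object events drive
# upserts/deletes in the fs_files table, with a timestamp guard so
# replayed or out-of-order events can't clobber newer data.

def handler(event, context):
    conn = psycopg2.connect(os.environ["DATABASE_URL"])
    with conn, conn.cursor() as cur:
        for record in event["Records"]:
            key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
            modified = record["eventTime"]  # ISO timestamp from the S3 event
            if record["eventName"].startswith("ObjectRemoved"):
                # Only delete if no newer write has landed since this event.
                cur.execute("DELETE FROM fs_files WHERE path = %s AND updated_at <= %s",
                            (key, modified))
            else:
                # Upsert with a timestamp guard: newer data always wins.
                cur.execute(
                    """INSERT INTO fs_files (path, updated_at) VALUES (%s, %s)
                       ON CONFLICT (path) DO UPDATE SET updated_at = EXCLUDED.updated_at
                       WHERE fs_files.updated_at < EXCLUDED.updated_at""",
                    (key, modified))
    conn.close()
```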
But complex tasks mean long-running agents. And long-running agents break everything.
Temporal Changed Everything
Before Temporal, our long-running tasks were a disaster. A user asks for a comprehensive company analysis. That takes 5 minutes. What if the server restarts? What if the user closes the tab and comes back? What if... anything? We had a homegrown job queue. It was bad. Retries were inconsistent. State management was a nightmare. Then we switched to Temporal and I wanted to cry tears of joy! That's it. Temporal handles worker crashes, retries, everything. If a Heroku dyno restarts mid-conversation (happens all the time lol), Temporal automatically retries on another worker. The user never knows. The cancellation handling is the tricky part. The user clicks "stop": what happens? The activity is already running on a different server. We use heartbeats sent every few seconds. We run two worker types:
- Chat workers: User-facing, 25 concurrent activities
- Background workers: Async tasks, 10 concurrent activities
They scale independently. Chat traffic spikes? Scale chat workers.
Real-Time Streaming
Next is speed. In finance, people are impatient. They're not going to wait 30 seconds staring at a loading spinner. They need to see something happening. So we built real-time streaming. The agent works, you see the progress. Agent → SSE events → Redis stream → API → frontend. The key insight: delta updates, not full state. Instead of sending "here's the complete response so far" (expensive), we send "append these 50 characters" (cheap). Streaming rich content with Streamdown. Text streaming is table stakes. The harder problem is streaming rich content: markdown with tables, charts, citations, math equations. We use Streamdown to render markdown as it arrives, with custom plugins for our domain-specific components. Charts render progressively. Citations link to source documents. Math equations display properly with KaTeX. The user sees a complete, interactive response building in real-time. AskUserQuestion: interactive agent workflows. Sometimes the agent needs user input mid-workflow. "Which valuation method do you prefer?" "Should I use consensus estimates or management guidance?" "Do you want me to include the pipeline assets in the valuation?" We built an `AskUserQuestion` tool that lets the agent pause, present options, and wait. When the agent calls this tool, the agentic loop intercepts it, saves state, and presents a UI to the user. The user picks an option (or types a custom answer), and the conversation resumes with their choice. This transforms agents from autonomous black boxes into collaborative tools. The agent does the heavy lifting, but the user stays in control of key decisions. Essential for high-stakes financial work where users need to validate assumptions.
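A stripped-down sketch of how such a tool call can pause and resume the loop (my own illustration of the pattern; Fintool's actual implementation runs inside Temporal with a real UI):

```python
from dataclasses import dataclass, field

# Hypothetical sketch of an AskUserQuestion tool: the agent loop intercepts
# the call, persists state, and later resumes with the user's choice.

@dataclass
class PendingQuestion:
    question: str
    options: list[str]
    saved_state: dict = field(default_factory=dict)

def execute_tool(action: dict) -> dict:
    """Placeholder for real tool execution (ReadFile, Bash, etc.)."""
    return {"tool_result": f"ran {action['tool']}"}

def agent_loop(llm_step, state: dict):
    """Run tool calls until the agent needs the user, or finishes."""
    while True:
        action = llm_step(state)
        if action["tool"] == "AskUserQuestion":
            # Pause: hand control back to the UI, carrying the saved state.
            return PendingQuestion(action["question"], action["options"], state)
        if action["tool"] == "Done":
            return state
        state["history"].append(execute_tool(action))

def resume(pending: PendingQuestion, user_choice: str, llm_step):
    """Resume the conversation with the user's answer injected as context."""
    pending.saved_state["history"].append({"user_answer": user_choice})
    return agent_loop(llm_step, pending.saved_state)

# Tiny scripted "model" to show the flow: ask once, then finish.
def scripted_llm(state):
    if not any("user_answer" in h for h in state["history"]):
        return {"tool": "AskUserQuestion",
                "question": "Which valuation method do you prefer?",
                "options": ["DCF", "Comparables"]}
    return {"tool": "Done"}

pending = agent_loop(scripted_llm, {"history": []})
print(pending.question, pending.options)
final = resume(pending, "DCF", scripted_llm)
print(final["history"])
```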
Evaluation Is Not Optional
"Ship fast, fix later" works for most startups. It does not work for financial services. A wrong earnings number can cost someone money. A misinterpreted guidance statement can lead to bad investment decisions. You can't just "fix it later" when your users are making million-dollar decisions based on your output. We use Braintrust for experiment tracking. Every model change, every prompt change, every skill change gets evaluated against a test set. Generic NLP metrics (BLEU, ROUGE) don't work for finance. A response can be semantically similar but have completely wrong numbers. Building eval datasets is harder than building the agent. We maintain ~2,000 test cases across categories. Ticker disambiguation. This is deceptively hard:
- "Apple" → AAPL, not APLE (Appel Petroleum)
- "Meta" → META, not MSTR (which some people call "meta")
- "Delta" → DAL (the airline)? Or is the user talking about delta hedging (the options term)?
The really nasty cases are ticker changes. Facebook became META in 2021. Google restructured under GOOG/GOOGL. Twitter became X (but kept the legal entity). When a user asks "What happened to Facebook stock in 2023?", you need to know that FB → META, and that historical data before Oct 2021 lives under the old ticker. We maintain a ticker history table and test cases for every major rename in the last decade. Fiscal period hell. This is where most financial agents silently fail:
- Apple's Q1 is October-December (fiscal year ends in September)
- Microsoft's Q2 is October-December (fiscal year ends in June)
- Most companies' Q1 is January-March (calendar year)
"Last quarter" on January 15th means:
- Q4 2024 for calendar-year companies
- Q1 2025 for Apple (they just reported)
- Q2 2025 for Microsoft (they're mid-quarter)
We maintain fiscal calendars for 10,000+ companies. Every period reference gets normalized to absolute date ranges. We have 200+ test cases just for period extraction. Numeric precision. Revenue of $4.2B vs $4,200M vs $4.2 billion vs "four point two billion." All equivalent. But "4.2" alone is wrong—missing units. Is it millions? Billions? Per share? We test unit inference, magnitude normalization, and currency handling. A response that says "revenue was 4.2" without units fails the eval, even if 4.2B is correct. Adversarial grounding. We inject fake numbers into context and verify the model cites the real source, not the planted one. Example: we include a fake analyst report stating "Apple revenue was $50B" alongside the real 10-K showing $94B. If the agent cites $50B, it fails. If it cites $94B with proper source attribution, it passes. We have 50 test cases specifically for hallucination resistance.
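As an illustration of what one of those grounding cases might look like (a sketch of the idea, not Fintool's actual harness):

```python
# Hypothetical sketch of an adversarial-grounding test case: plant a wrong
# number in the context and check that the answer cites the real filing.

PLANTED = {"source": "analyst_note.md", "claim": "Apple revenue was $50B"}
REAL = {"source": "aapl_10k.md", "claim": "Apple revenue was $94B"}

def run_grounding_case(answer_fn) -> bool:
    """Pass iff the answer uses the real figure and cites its source."""
    context = [PLANTED, REAL]
    answer = answer_fn("What was Apple's revenue?", context)
    cites_real = "$94B" in answer["text"] and REAL["source"] in answer["citations"]
    repeats_plant = "$50B" in answer["text"]
    return cites_real and not repeats_plant

# An agent that falls for the planted number fails; a grounded one passes:
bad_agent = lambda q, ctx: {"text": "Apple revenue was $50B", "citations": ["analyst_note.md"]}
good_agent = lambda q, ctx: {"text": "Apple revenue was $94B", "citations": ["aapl_10k.md"]}
print(run_grounding_case(bad_agent), run_grounding_case(good_agent))  # False True
```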
The biggest lesson isn’t about sandboxes or skills or streaming. It’s this: the model is not your product. The experience around the model is your product.

Anyone can call Claude or GPT. The API is the same for everyone. What makes your product different is everything else: the data you have access to, the skills you’ve built, the UX you’ve designed, the reliability you’ve engineered, and, frankly, how well you know the industry, which is a function of how much time you spend with your customers.

Models will keep getting better. That’s great! It means less scaffolding, less prompt engineering, less complexity. But it also means the model becomes more of a commodity. Your moat is not the model. Your moat is everything you build around it. For us, that’s financial data, domain-specific skills, real-time streaming, and the trust we’ve built with professional investors.

What’s yours?

I’ve spent the last two years building AI agents for financial services. Along the way, I’ve accumulated a fair number of battle scars and learnings that I want to share. Here’s what I’ll cover:

- The Sandbox Is Not Optional - Why isolated execution environments are essential for multi-step agent workflows
- Context Is the Product - How we normalize heterogeneous financial data into clean, searchable context
- The Parsing Problem - The hidden complexity of extracting structured data from adversarial SEC filings
- Skills Are Everything - Why markdown-based skills are becoming the product, not the model
- The Model Will Eat Your Scaffolding - Designing for obsolescence as models improve
- The S3-First Architecture - Why S3 beats databases for file storage and user data
- The File System Tools - How ReadFile, WriteFile, and Bash enable complex financial workflows
- Temporal Changed Everything - Reliable long-running tasks with proper cancellation handling
- Real-Time Streaming - Building responsive UX with delta updates and interactive agent workflows
- Evaluation Is Not Optional - Domain-specific evals that catch errors before they cost money
- Production Monitoring - The observability stack that keeps financial agents reliable

Why financial services is extremely hard. This domain doesn’t forgive mistakes. Numbers matter. A wrong revenue figure, a misinterpreted guidance statement, an incorrect DCF assumption: professional investors make million-dollar decisions based on our output, and one mistake on a $100M position destroys trust forever.

The users are also demanding. Professional investors are some of the smartest, most time-pressed people you’ll ever work with. They spot bullshit instantly. They need precision, speed, and depth. You can’t hand-wave your way through a valuation model or gloss over nuances in an earnings call.

This forced me to develop an almost paranoid attention to detail. Every number gets double-checked. Every assumption gets validated. Every model gets stress-tested. You start questioning everything the LLM outputs because you know your users will. A single wrong calculation in a DCF model and you lose credibility forever. I sometimes feel that the fear of being wrong has become our best feature.

Over the years of building with LLMs, we’ve made bold infrastructure bets early, and I think we’ve been right. For instance, when Claude Code launched with its filesystem-first agentic approach, we immediately adopted it. It was not an obvious bet, and it meant a massive revamp of our architecture. I was extremely lucky to have Thariq from Anthropic’s Claude Code team jump on a Zoom and open my eyes to the possibilities. At the time, the whole industry, including Fintool, was building elaborate RAG pipelines with vector databases and embeddings. After reflecting on the future of information retrieval with agents, I wrote “the RAG obituary” and Fintool moved fully to agentic search. We even decided to retire our precious embedding pipeline. Sad, but whatever is best for the future! People thought we were crazy. The article got a lot of praise and a lot of negative comments. Now most startups seem to be adopting these practices. I believe we’re early on several other architectural choices too.
I’m sharing them here because the best way to test ideas is to put them out there. Let’s start with the biggest one.

The Sandbox Is Not Optional

When we first started building Fintool in 2023, I thought sandboxing might be overkill. “We’re just running Python scripts,” I told myself. “What could go wrong?”

Haha. Everything. Everything could go wrong. The first time an LLM decided to `rm -rf /` on our server (it was trying to “clean up temporary files”), I became a true believer.

Here’s the thing: agents need to run multi-step operations. A professional investor asks for a DCF valuation, and that’s not a single API call. The agent needs to research the company, gather financial data, build a model in Excel, run sensitivity analysis, generate complex charts, and iterate on assumptions. That’s dozens of steps, each potentially modifying files, installing packages, running scripts. You can’t do this without code execution. And executing arbitrary code on your servers is insane. Every chat application needs a sandbox.

Today each user gets their own isolated environment. The agent can do whatever it wants in there. Delete everything? Fine. Install weird packages? Go ahead. It’s your sandbox, knock yourself out.

The architecture comes down to three mount points: private is read/write for your own files, shared is read-only for your organization, and public is read-only for everyone.

The magic is in the credentials. We use AWS ABAC (Attribute-Based Access Control) to generate short-lived credentials scoped to specific S3 prefixes. User A literally cannot access User B’s data. The IAM policy uses `${aws:PrincipalTag/S3Prefix}` to restrict access, so the credentials physically won’t allow it. This is also very good for enterprise deployments.
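A hedged sketch of that policy statement follows; the bucket name is hypothetical, and the real policy also covers listing and the shared/public mounts:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "OwnPrefixOnly",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::fintool-user-data/${aws:PrincipalTag/S3Prefix}/*"
    }
  ]
}
```

The session tag is set when the short-lived STS credentials are minted, so every S3 call is confined to that user’s prefix at the IAM layer, not in application code.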
We also do sandbox pre-warming. When a user starts typing, we spin up their sandbox in the background. By the time they hit enter, the sandbox is ready. Sandboxes get a 600-second timeout, extended by 10 minutes on each tool use, and they stay warm across conversation turns.

Sandboxes are amazing, but their under-discussed magic is support for the filesystem. Which brings us to the next lesson, about context.

Context Is the Product

Your agent is only as good as the context it can access. The real work isn’t prompt engineering; it’s turning messy financial data from dozens of sources into clean, structured context the model can actually use. This requires deep domain expertise from the engineering team.

The heterogeneity problem. Financial data comes in every format imaginable:

- SEC filings: HTML with nested tables, exhibits, signatures
- Earnings transcripts: speaker-segmented text with Q&A sections
- Press releases: semi-structured HTML from PRNewswire
- Research reports: PDFs with charts and footnotes
- Market data: Snowflake/databases with structured numerical data
- News: articles with varying quality and structure
- Alternative data: satellite imagery, web traffic, credit card panels
- Broker research: proprietary PDFs with price targets and models
- Fund filings: 13F holdings, proxy statements, activist letters

Each source has different schemas, different update frequencies, different quality levels. The agent needs one thing: clean context it can reason over.

The normalization layer. Everything becomes one of three formats:

- Markdown for narrative content (filings, transcripts, articles)
- CSV/tables for structured data (financials, metrics, comparisons)
- JSON metadata for searchability (tickers, dates, document types, fiscal periods)

Chunking strategy matters. Not all documents chunk the same way:

- 10-K filings: section by regulatory structure (Item 1, 1A, 7, 8...)
- Earnings transcripts: chunk by speaker turn (CEO remarks, CFO remarks, Q&A by analyst)
- Press releases: usually small enough to be one chunk
- News articles: paragraph-level chunks
- 13F filings: by holder and position changes quarter-over-quarter

The chunking strategy determines what context the agent retrieves. Bad chunks = bad answers.

Tables are special. Financial data is full of tables and CSVs: revenue breakdowns, segment performance, guidance ranges. LLMs are surprisingly good at reasoning over clean markdown tables, but they’re terrible at reasoning over HTML `<table>` tags or raw CSV dumps. The normalization layer converts everything to clean markdown tables.

Metadata enables retrieval. The user asks the agent: “What did Apple say about services revenue in their last earnings call?” To answer this, Fintool needs:

- Ticker resolution (AAPL → the correct company)
- Document type filtering (earnings transcript, not 10-K)
- Temporal filtering (most recent, not 2019)
- Section targeting (CFO remarks or revenue discussion, not legal disclaimers)

This is why a `meta.json` exists for every document. Without structured metadata, you’re doing keyword search over a haystack. With it, search speeds up, big time!

Anyone can call an LLM API. Not everyone has normalized decades of financial data into searchable, chunked markdown with proper metadata. The data layer is what makes agents actually work.

The Parsing Problem

Normalizing financial data is 80% of the work. Here’s what nobody tells you.

SEC filings are adversarial. They’re not designed for machine reading. They’re designed for legal compliance:

- Tables span multiple pages with repeated headers
- Footnotes reference exhibits that reference other footnotes
- Numbers appear in text, tables, and exhibits, sometimes inconsistently
- XBRL tags exist but are often wrong or incomplete
- Formatting varies wildly between filers (every law firm has its own template)

We tried off-the-shelf PDF/HTML parsers. They failed on:

- Multi-column layouts in proxy statements
- Nested tables in MD&A sections (tables within tables within tables)
- Watermarks and headers bleeding into content
- Scanned exhibits (still common in older filings and attachments)
- Unicode issues (curly quotes, em-dashes, non-breaking spaces)

The Fintool parsing pipeline:

Raw Filing (HTML/PDF)
↓ Document structure detection (headers, sections, exhibits)
↓ Table extraction with cell relationship preservation
↓ Entity extraction (companies, people, dates, dollar amounts)
↓ Cross-reference resolution (Ex. 10.1 → actual exhibit content)
↓ Fiscal period normalization (FY2024 → Oct 2023 to Sep 2024 for Apple)
↓ Quality scoring (confidence per extracted field)
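For each chunk, the pipeline emits the `meta.json` described above. A hypothetical example; the real schema carries more fields:

```json
{
  "ticker": "AAPL",
  "cik": "0000320193",
  "doc_type": "earnings_transcript",
  "fiscal_period": "FY2025-Q1",
  "period_start": "2024-09-29",
  "period_end": "2024-12-28",
  "section": "cfo_remarks",
  "chunk_id": "aapl-fy25q1-transcript-014"
}
```

Note the absolute `period_start`/`period_end` dates: by the time a document reaches the agent, relative references like “last quarter” have already been resolved away.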
Table extraction deserves its own work. Financial tables are dense with meaning. A revenue breakdown table might have:

- Merged header cells spanning multiple columns
- Footnote markers (1), (2), (a), (b) that reference explanations below
- Parentheses for negative numbers: $(1,234) means -1,234
- Mixed units in the same table (millions for revenue, percentages for margins)
- Prior period restatements in italics or with asterisks

We score every extracted table on:

- Cell boundary accuracy (did we split/merge correctly?)
- Header detection (is row 1 actually headers, or is there a title row above?)
- Numeric parsing (is “$1,234” parsed as 1234 or left as text?)
- Unit inference (millions? billions? per share? percentage?)

Tables below 90% confidence get flagged for review. Low-confidence extractions don’t enter the agent’s context: garbage in, garbage out.

Fiscal period normalization is critical. “Q1 2024” is ambiguous:

- Calendar Q1 (January-March 2024)
- Apple’s fiscal Q1 (October-December 2023)
- Microsoft’s fiscal Q1 (July-September 2023)
- “Reported in Q1” (filed in Q1, but covering the prior period)

We maintain a fiscal calendar database for 10,000+ companies. Every date reference gets normalized to absolute date ranges. When the agent retrieves “Apple Q1 2024 revenue,” it knows to look for data from October-December 2023. This is invisible to users but essential for correctness. Without it, you’re comparing Apple’s October revenue to Microsoft’s January revenue and calling it the “same quarter.”

Skills Are Everything

Here’s the thing nobody tells you about building AI agents: the model is not the product. The skills are now the product.

I learned this the hard way. We used to try making the base model “smarter” through prompt engineering. Tweak the system prompt, add examples, write elaborate instructions. It helped a little. But skills were the missing piece.

In October 2025, Anthropic formalized this with Agent Skills, a specification for extending Claude with modular capability packages. A skill is a folder containing a `SKILL.md` file with YAML frontmatter (name and description), plus any supporting scripts, references, or data files the agent might need. We’d been building something similar for months before the announcement. The validation felt good, but more importantly, having an industry standard means our skills can eventually be portable.

Without skills, models are surprisingly bad at domain tasks. Ask a frontier model to do a DCF valuation. It knows what DCF is. It can explain the theory. But actually executing one? It will miss critical steps, use the wrong discount rates for the industry, forget to add back stock-based compensation, and skip sensitivity analysis. The output looks plausible but is subtly wrong in ways that matter.

The breakthrough came when we started treating skills as first-class citizens, as part of the product itself. A skill is a markdown file that tells the agent how to do something specific. Here’s a simplified version of our DCF skill:
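What follows is only a sketch of its shape; the real file is longer and ships with 10+ industry guideline files:

```markdown
---
name: dcf-valuation
description: Build a discounted cash flow valuation for a public company.
---

# DCF Valuation

1. Pull the last five years of financials and the latest guidance.
2. Project free cash flows; state every growth assumption explicitly.
3. Compute WACC and sanity-check the discount rate against industry norms.
4. Add back stock-based compensation. Do not skip this.
5. Check terminal value against exit multiples.
6. Run sensitivity analysis on growth rate and discount rate.
7. Cite the filing or transcript behind every input.
```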
That’s it. A markdown file. No code changes. No production deployment. Just a file that tells the agent what to do.

Skills are better than code. This matters enormously:

1. Non-engineers can create skills. Our analysts write skills. Our customers write skills. A portfolio manager who’s done 500 DCF valuations can encode their methodology in a skill without writing a single line of Python.

2. No deployment needed. Change a skill file and it takes effect immediately. No CI/CD, no code review, no waiting for release cycles. Domain experts can iterate on their own.

3. Readable and auditable. When something goes wrong, you can read the skill and understand exactly what the agent was supposed to do. Try doing that with a 2,000-line Python module.

We have a copy-on-write shadowing system, with priority: private > shared > public. So if you don’t like how we do DCF valuations, write your own. Drop it in `/private/skills/dcf/SKILL.md`. Your version wins.

Why we don’t mount all skills to the filesystem. This is important. The naive approach would be to mount every skill file directly into the sandbox; the agent can just `cat` any skill it needs. Simple, right? Wrong. Here’s why we use SQL discovery instead:

1. Lazy loading. We have dozens of skills with extensive documentation; the DCF skill alone has 10+ industry guideline files. Loading all of them into context for every conversation would burn tokens and confuse the model. Instead, we discover skill metadata (name, description) upfront and only load the full documentation when the agent actually uses that skill.

2. Access control at query time. The SQL query implements our three-tier access model: public skills available to everyone, organization skills for that org’s users, private skills for individual users. The database enforces this. You can’t accidentally expose a customer’s proprietary skill to another customer.

3. Shadowing logic. When a user customizes a skill, their version needs to override the default. SQL makes this trivial: query all three levels, apply priority rules, return the winner. Doing this with filesystem mounts would be a nightmare of symlinks and directory ordering.

4. Metadata-driven filtering. The `fs_files.metadata` column stores parsed YAML frontmatter. We can filter by skill type, check if a skill is main-agent-only, or query any other structured attribute, all without reading the files themselves.

The pattern: S3 is the source of truth, a Lambda function syncs changes to PostgreSQL for fast queries, and the agent gets exactly what it needs when it needs it.
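Concretely, the discovery-plus-shadowing query looks something like this hedged sketch; it targets Postgres, `fs_files.metadata` is real, and the other column names are illustrative:

```sql
-- Sketch of skill discovery with shadowing (column names beyond
-- fs_files.metadata are hypothetical). One row per skill name,
-- keeping the highest-priority tier: private > shared > public.
SELECT DISTINCT ON (metadata->>'name')
    metadata->>'name'        AS skill_name,
    metadata->>'description' AS description,
    tier,
    s3_key
FROM fs_files
WHERE file_type = 'skill'
  AND (tier = 'public'
       OR (tier = 'shared'  AND org_id  = :org_id)
       OR (tier = 'private' AND user_id = :user_id))
ORDER BY metadata->>'name',
         CASE tier WHEN 'private' THEN 0 WHEN 'shared' THEN 1 ELSE 2 END;
```

Only the winning row’s name and description go into context; the full `SKILL.md` is fetched from S3 when the agent actually invokes the skill.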
Skills are essential. I cannot emphasize this enough. If you’re building an AI agent and you don’t have a skills system, you’re going to have a bad time. My biggest argument for skills is that top models (Claude or GPT) are post-trained on using skills. The model wants to fetch skills. Models just want to learn, and what they want to learn is our skills... until they eat them.

The Model Will Eat Your Scaffolding

Here’s the uncomfortable truth: everything I just told you about skills? It’s temporary, in my opinion.

Models are getting better. Fast. Every few months, there’s a new model that makes half your code obsolete. The elaborate scaffolding you built to handle edge cases? The model just... handles them now. When we started, we needed detailed skills with step-by-step instructions for even simple tasks. “First do X, then do Y, then check Z.” Now? For a simple task we can often just say “do an earnings preview” and the model figures it out (kind of!).

This creates a weird tension. You need skills today because current models aren’t smart enough. But you should design your skills knowing that future models will need less hand-holding. That’s why I’m bullish on markdown files over code for model instructions: they’re easier to update and delete.

We send detailed feedback to AI labs. Whenever we build complex scaffolding to work around model limitations, we document exactly what the model struggles with and share it with the lab’s research team. This helps inform the next generation of models. The goal is to make our own scaffolding obsolete.

My prediction: in two years, most of our basic skills will be one-liners. “Generate a 20-tab DCF.” That’s it. The model will know what that means. But here’s the flip side: as basic tasks get commoditized, we’ll push into more complex territory. Multi-step valuations with segment-by-segment analysis. Automated backtesting of investment strategies. Real-time portfolio monitoring with complex triggers. The frontier keeps moving.

So we write skills. We delete them when they become unnecessary. And we build new ones for the harder problems that emerge. And all of that lives as files... in our filesystem.

The S3-First Architecture

Here’s something that surprised me: for files, S3 is a better database than a database.

We store user data (watchlists, portfolios, preferences, memories, skills) in S3 as YAML files. S3 is the source of truth. A Lambda function syncs changes to PostgreSQL for fast queries.

Writes → S3 (source of truth)
↓ Lambda trigger
PostgreSQL (fs_files table)
↓
Reads ← fast queries

Why?

- Durability: S3 gives you eleven nines. A database doesn’t.
- Versioning: S3 versioning gives you audit trails for free.
- Simplicity: YAML files are human-readable. You can debug with `cat`.
- Cost: S3 is cheap. Database storage is not.

The pattern:

- Writes go to S3 directly
- List queries hit the database (fast)
- Single-item reads go to S3 (freshest data)

The sync architecture. We run two Lambda functions to keep S3 and PostgreSQL in sync:

S3 (file upload/delete) → SNS Topic → fs-sync Lambda → upsert/delete in fs_files (real-time)

EventBridge (every 3 hours) → fs-reconcile Lambda → full S3 vs. DB scan, fix discrepancies

Both use upserts with timestamp guards, so newer data always wins. The reconcile job catches any events that slipped through (S3 eventual consistency, Lambda cold starts, network blips).

User memories live here too. Every user has a `/private/memories/UserMemories.md` file in S3. It’s just markdown; users can edit it directly in the UI. On every conversation, we load it and inject it as context. This is surprisingly powerful. Users write things like “I focus on small-cap value stocks” or “Always compare to industry median, not mean” or “My portfolio is concentrated in tech, so flag concentration risk.” The agent sees this on every conversation and adapts accordingly. No migrations. No schema changes. Just a markdown file that the user controls.

Watchlists work the same way. YAML files in S3, synced to PostgreSQL for fast queries. When a user asks about “my watchlist,” we load the relevant tickers and inject them as context. The agent knows what companies matter to this user.

The filesystem becomes the user’s personal knowledge base. Skills tell the agent how to do things. Memories tell it what the user cares about. Both are just files.

The File System Tools

Agents in financial services need to read and write files. A lot of files. PDFs, spreadsheets, images, code. Here’s how we handle it.

ReadFile handles the messy formats. WriteFile creates artifacts that link back into the UI. Bash gives persistent shell access with a 180-second timeout and a 100K-character output limit. And we run path normalization on everything: LLMs love trying path traversal attacks. It’s hilarious.
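That normalization is essentially a one-function guard. A minimal sketch, assuming a single sandbox root; the function and constant names are hypothetical:

```python
# Sketch of the path guard behind ReadFile/WriteFile/Bash: every
# model-supplied path is resolved and must stay inside the sandbox
# root, which neutralizes "../" traversal attempts and symlink tricks.
from pathlib import Path

SANDBOX_ROOT = Path("/sandbox")  # illustrative mount point

def normalize_path(user_path: str) -> Path:
    """Resolve a model-supplied path; refuse anything outside the sandbox."""
    candidate = (SANDBOX_ROOT / user_path).resolve()
    if not candidate.is_relative_to(SANDBOX_ROOT):
        raise PermissionError(f"path escapes sandbox: {user_path}")
    return candidate
```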
Bash is more important than you think. There’s a growing conviction in the AI community that filesystems and bash are the optimal abstraction for AI agents. Braintrust recently ran an eval comparing SQL agents, bash agents, and hybrid approaches for querying semi-structured data. The results were interesting: pure SQL hit 100% accuracy but missed edge cases, while pure bash was slower and more expensive but caught verification opportunities. The winner? A hybrid approach where the agent uses bash to explore and verify, and SQL for structured queries.

This matches our experience. Financial data is messy. You need bash to grep through filing documents, find patterns, explore directory structures. But you also need structured tools for the heavy lifting. The agent needs both, and the judgment to know when to use each.

We’ve leaned hard into giving agents full shell access in the sandbox. It’s not just for running Python scripts. It’s for exploration, verification, and the kind of ad-hoc data manipulation that complex financial tasks demand.
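As a hypothetical sketch of that explore-then-query judgment — the `bash` and `sql` helpers stand in for our real tools, and the table name is illustrative:

```python
# Hedged sketch of the hybrid pattern: bash to explore and verify,
# SQL for the structured pull. Helper signatures are hypothetical.
def segment_revenue(ticker: str, period: str):
    # 1. Explore: confirm the filings actually break out segments.
    hits = bash(f"grep -ril 'segment information' /sandbox/filings/{ticker}/")
    if not hits:
        return None  # don't guess; surface the gap to the user instead
    # 2. Query: pull the normalized numbers from the structured store.
    return sql(
        "SELECT segment, revenue_usd FROM segment_financials "
        "WHERE ticker = :ticker AND fiscal_period = :period",
        ticker=ticker, period=period,
    )
```

The verification step is the point: the cheap grep tells the agent whether the expensive structured query can be trusted for this company at all.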
