Amazon Buys Globalstar, Delta to Add Leo, The Apple Angle
Amazon's Globalstar acquisition is being framed as Amazon versus SpaceX, but I think the real story is about Apple.
Breaking down OpenAI's internal memo about taking on Anthropic in the enterprise.
Listen to this post:

In January 2025, Doug O’Laughlin at Fabricated Knowledge declared that o1 and reasoning models marked the end of Aggregation Theory:

I believe that there is no practical limit to the improvements of models other than economics, and I think that will be the real constraint in the future. It is reasonable that if we spent infinite dollars on a model, it would be improved. The problem is whether infinite dollars would make sense for a business. That is going to be the key question for 2025. How do the economics of AI make this work?

One of the core assumptions about the internet has just been broken. Marginal costs now exist again, meaning that most hyperscalers will become increasingly capital-intensive. The era of Aggregation Theory is behind us, and AI is again making technology expensive. This relation of increased cost from increased consumption is anti-internet era thinking. And this will be the big problem that will be reckoned with this year.

Hyperscaler’s business models are mainly underpinned by the marginal cost being zero. So, as long as you set up the infrastructure and fill an internet-scale product with users, you can make money. This era will soon be over, and the future will be much weirder and more compute-intensive. Looking back on the 2010s, we will probably consider them a naive time in the long arc of technology. One of our fundamental assumptions about this period is unraveling. This will be the single most significant change in the technology landscape going forward.

Aggregation Theory was, if I may say so myself, the single best way to understand the 2010s, particularly consumer tech. It explained the dynamics undergirding Google and Facebook’s dominance, as well as the App Store and Amazon’s e-commerce business; it was also a useful (albeit incomplete) framework to understand an entire host of consumer services like Uber, Airbnb, and Netflix.
It’s worth pointing out, however, that some of the critical insights undergirding Aggregation Theory are much older, and are embedded in the fundamental nature of tech itself. They are, as O’Laughlin notes, rooted in the concept of zero marginal costs.

Marginal costs are how much it costs to make one more unit of a good. Consider a widget-making factory: land and machines are clearly fixed costs; you have to have both to get started, and you are paying for both whether or not you make one more widget. Raw material, on the other hand, is clearly a marginal cost: if you make one more widget, you need one more widget’s worth of raw material. When it comes to physical goods, electricity and humans are also marginal costs: you need more or fewer of them depending on whether you make more or fewer widgets.

Where marginal costs matter is that they provide a price floor. Companies will operate even while unprofitable, because profit and loss is an accounting concept that incorporates depreciation, i.e. your fixed costs. For example, imagine that a company spent $1,000 on a factory to make widgets that have a marginal cost of $10: as long as the price of widgets is greater than $10, the company will make them even if they don’t earn enough money to cover their depreciation costs (i.e. they operate at a loss), because at least they are still making a marginal profit on each widget. (What the company may not do is invest in any more fixed costs, and it will probably, eventually, go bankrupt from interest on the debt that likely financed those fixed costs.)

I explain all of this precisely because it’s almost completely immaterial to tech. First, there generally are no raw material costs, because the outputs are digital.
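The arithmetic here can be sketched in a few lines of Python; the numbers are my own illustration, not from the text: a $1,000 factory depreciated straight-line over ten periods, a $10 marginal cost, and an assumed $12 selling price.

```python
def period_results(units, price=12.0, marginal_cost=10.0,
                   fixed_costs=1_000.0, depreciation_periods=10):
    """Return (marginal_profit, accounting_profit) for one period."""
    marginal_profit = units * (price - marginal_cost)   # ignores fixed costs entirely
    depreciation = fixed_costs / depreciation_periods   # straight-line depreciation
    accounting_profit = marginal_profit - depreciation
    return marginal_profit, accounting_profit

# 30 widgets at a $2 contribution each: $60 of marginal profit,
# but a $100 depreciation charge turns it into a $40 accounting loss.
marginal, accounting = period_results(units=30)
```

The point of the sketch: as long as `price > marginal_cost`, every additional widget improves the company's cash position, so production continues even while the accounting bottom line is negative.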
Second, because there are no raw material costs, and because the fixed costs are so large, electricity and humans are generally treated as fixed costs, not marginal costs: of course you will run your servers all of the time and at full capacity, because every scrap of additional revenue you can generate is worth it. AI very much fits in this paradigm: the output is digital, and while AI chips use a lot of electricity, that cost is a fraction of the cost of the chips themselves, which is to say that no one with AI chips is making marginal cost calculations about whether to utilize them. They’re going to be used! Rather, the decision that matters is what they will be used for.

Consider Microsoft: last quarter the company missed the Street’s Azure growth expectations not because there wasn’t demand, but because the company decided to use its capacity for its own products. CFO Amy Hood said on the company’s earnings call:

I think it’s probably better to think about the Azure guidance that we give as an allocated capacity guide about what we can deliver in Azure revenue. Because as we spend the capital and put GPUs specifically, it applies to CPUs, the GPUs more specifically, we’re really making long-term decisions. And the first thing we’re doing is solving for the increased usage in sales and the accelerating pace of M365 Copilot as well as GitHub Copilot, our first-party apps. Then we make sure we’re investing in the long-term nature of R&D and product innovation. And much of the acceleration that I think you’ve seen from us and products over the past a bit is coming because we are allocating GPUs and capacity to many of the talented AI people we’ve been hiring over the past years. Then, when you end up, is that, you end up with the remainder going towards serving the Azure capacity that continues to grow in terms of demand.
And a way to think about it, because I think, I get asked this question sometimes, is if I had taken the GPUs that just came online in Q1 and Q2 in terms of GPUs and allocated them all to Azure, the KPI would have been over 40. And I think the most important thing to realize is that this is about investing in all the layers of the stack that benefit customers. And I think that’s hopefully helpful in terms of thinking about capital growth, it shows in every piece, it shows in revenue growth across the business and shows as OpEx growth as we invest in our people.

The cost that Microsoft is contending with here is not marginal cost, but rather opportunity cost: compute spent in one area cannot be used in another area. In the case of these earnings, Microsoft was admitting that it could have made its Azure number if it wanted to, but chose to prioritize its own workloads because, as CEO Satya Nadella noted later in the call, those have higher gross margin profiles and higher lifetime value.

It’s opportunity costs, not marginal costs, that are the challenge facing hyperscalers. How much compute should go to customers, and which ones? How much should be reserved for internal workloads? Microsoft needs to balance Azure — both for its enterprise customers and OpenAI — and its software business; Amazon needs to balance its e-commerce business, AWS, and its strategic investments in both Anthropic and OpenAI. Google has to balance GCP, its own strategic investment in Anthropic, and its consumer businesses.

Last week Anthropic announced Mythos, its most advanced model. And, in somewhat typical Anthropic fashion, it did so by focusing on its dangers; from the introductory post for Project Glasswing, the company’s initiative for leveraging Mythos to address security:

We formed Project Glasswing because of capabilities we’ve observed in a new frontier model trained by Anthropic that we believe could reshape cybersecurity.
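As a toy illustration of the allocation problem Hood is describing (with entirely hypothetical workloads, demand figures, and margins, not Microsoft's actual numbers): given a fixed pool of GPU-hours, you serve the highest-margin workloads first, and external customers get whatever remains.

```python
def allocate(capacity, workloads):
    """Greedily grant capacity to workloads sorted by margin per GPU-hour."""
    allocation = {}
    for w in sorted(workloads, key=lambda w: w["margin_per_gpu_hour"], reverse=True):
        granted = min(w["demand_gpu_hours"], capacity)
        allocation[w["name"]] = granted
        capacity -= granted
    return allocation

# Hypothetical numbers for illustration only.
workloads = [
    {"name": "first-party Copilots", "demand_gpu_hours": 40, "margin_per_gpu_hour": 5.0},
    {"name": "R&D / training",       "demand_gpu_hours": 30, "margin_per_gpu_hour": 4.0},
    {"name": "external customers",   "demand_gpu_hours": 60, "margin_per_gpu_hour": 2.5},
]
# With 100 GPU-hours of capacity, external customers are capped at the 30
# hours left over, even though they demand 60: that gap is the opportunity cost.
allocation = allocate(100, workloads)
```

The unmet 30 GPU-hours of external demand is exactly the "missed Azure number" dynamic: capacity existed, but it was worth more elsewhere.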
Claude Mythos Preview is a general-purpose, unreleased frontier model that reveals a stark fact: AI models have reached a level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities. Mythos Preview has already found thousands of high-severity vulnerabilities, including some in every major operating system and web browser. Given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely. The fallout — for economies, public safety, and national security — could be severe. Project Glasswing is an urgent attempt to put these capabilities to work for defensive purposes.

In an Update last week I analogized Anthropic’s “disaster-porn-as-marketing-tool” approach to The Boy Who Cried Wolf; what’s important about that analogy is not just that the boy raised false alarms, but also that, in the end, the wolf did come. To that end, I wrote two weeks ago about the myriad security issues that underpin all software, and my optimism that AI would solve these issues in the long run, even if it made things much worse in the short run. In other words, it’s actually not important whether or not Mythos represents a major security threat: if this model doesn’t, a future model will; to that end, I do support leveraging Mythos to proactively find and fix bugs before bad actors can find and exploit them.

At the same time, it’s also worth noting that there are other reasons for Anthropic to not make Mythos widely available, limiting access to a finite number of companies with a high capacity and willingness to pay. The first are those opportunity costs: Anthropic is already short on compute serving its current models; X was overrun with complaints and debates this weekend about Anthropic allegedly dumbing down Claude over the last month or so.
Making Mythos more widely available — particularly to subscription plans that don’t pay per usage — would make the situation much worse. In other words, Anthropic isn’t facing a marginal cost problem, but an opportunity cost problem: where to allocate its compute. Of course this could become a margin problem: I suspect that Anthropic is going to overcome its conservatism in terms of compute by acquiring more compute from hyperscalers and neoclouds, and paying dearly for the privilege. The key to handling those costs will be to charge more for Claude going forward; that, by extension, means maintaining pricing power, which leads to a second benefit of not releasing Mythos broadly.

Anthropic certainly faces competition from OpenAI; for both frontier labs, however, the real competition in the long run is open source models. Right now those primarily come from China, and a key ingredient in fast-following frontier models is distillation; from Anthropic’s blog:

We have identified industrial-scale campaigns by three AI laboratories—DeepSeek, Moonshot, and MiniMax—to illicitly extract Claude’s capabilities to improve their own models. These labs generated over 16 million exchanges with Claude through approximately 24,000 fraudulent accounts, in violation of our terms of service and regional access restrictions.

These labs used a technique called “distillation,” which involves training a less capable model on the outputs of a stronger one. Distillation is a widely used and legitimate training method. For example, frontier AI labs routinely distill their own models to create smaller, cheaper versions for their customers. But distillation can also be used for illicit purposes: competitors can use it to acquire powerful capabilities from other labs in a fraction of the time, and at a fraction of the cost, that it would take to develop them independently.

I absolutely believe this is a real problem, and wrote as much when DeepSeek R1 was released last year.
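Mechanically, distillation is simple, which is part of why it is so hard to police. A minimal sketch of the core idea, using made-up three-token distributions: the student is trained to match the teacher's full output distribution, which carries far more signal per example than a single "correct" token.

```python
import math

def kl_divergence(teacher, student):
    """KL(teacher || student), the standard distillation training objective."""
    return sum(t * math.log(t / s) for t, s in zip(teacher, student) if t > 0)

# Hypothetical next-token distributions over a three-token vocabulary.
teacher        = [0.70, 0.20, 0.10]
student_before = [0.40, 0.30, 0.30]
student_after  = [0.68, 0.21, 0.11]  # after gradient steps toward the teacher

# Training drives the student's distribution toward the teacher's,
# so the divergence (the loss) shrinks.
improved = kl_divergence(teacher, student_after) < kl_divergence(teacher, student_before)
```

In a real pipeline the teacher outputs come from API calls, which is exactly what the fraudulent-account campaigns described above were harvesting.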
I also think it’s in the interest of everyone other than the frontier labs to pretend that it isn’t; open source models are not subject to the frontier labs’ markup or compute constraints, which is exactly why it benefits most companies to have them available, whether or not they are distilled. Of course that doesn’t mean they are free to run: you still need to provide the compute. Notice, however, how that makes stopping distillation even more of a priority for the frontier labs: first, they want to protect their margins. Second, their biggest cost is opportunity cost: the customers they can’t serve because they don’t have enough compute. The extent to which they can make compute less useful for their potential customers — by stopping open source models from distilling their models — is the extent to which they can acquire that compute for themselves at more favorable rates.

Mythos wasn’t the only new model announced last week: Meta released the first fruit of its new frontier lab as well. From the company’s blog post:

Today, we’re excited to introduce Muse Spark, the first in the Muse family of models developed by Meta Superintelligence Labs. Muse Spark is a natively multimodal reasoning model with support for tool-use, visual chain of thought, and multi-agent orchestration. Muse Spark is the first step on our scaling ladder and the first product of a ground-up overhaul of our AI efforts. To support further scaling, we are making strategic investments across the entire stack — from research and model training to infrastructure, including the Hyperion data center…

Muse Spark offers competitive performance in multimodal perception, reasoning, health, and agentic tasks. We continue to invest in areas with current performance gaps, such as long-horizon agentic systems and coding workflows.

Muse Spark isn’t state of the art, but it’s in the game, and overall a positive first impression from Meta Superintelligence Labs.
What is most notable to me, however, is the extent to which the last nine months of AI have made clear that CEO Mark Zuckerberg made the right call to embark on that “ground-up overhaul of [Meta’s] AI efforts”. The trigger for O’Laughlin’s post that I opened this Article with was reasoning, where models using more tokens led to better answers; since then agents have exponentially increased token demand, as they can use LLMs continuously without a human in the loop. This is a huge driver of the sky-rocketing demand for Claude, as well as OpenAI’s Codex. Moreover, this use case is so potentially profitable that not only is Anthropic’s revenue sky-rocketing, but OpenAI is pivoting its focus to enterprise.

Indeed, you can make the argument that one of OpenAI’s biggest challenges is the fact it has such a popular consumer product in ChatGPT. I, with my Aggregation Theory lens, have long maintained that that userbase was a big advantage for OpenAI, but that assumed the company could effectively monetize it, which is why I have argued so vociferously for an advertising model. OpenAI has big projections for exactly that, but until it materializes, that big consumer base is a big opportunity cost in terms of OpenAI’s focus and compute. The company has, to its credit and in the face of widespread skepticism, made significant investments in more compute, but the temptation to allocate more and more compute to agentic use cases that enterprises will pay for, even at the expense of the consumer business, will be very large.

This puts Meta in a unique position relative to everyone else in the industry: unlike any of the hyperscalers or the frontier labs, Meta does not have an enterprise or cloud business to worry about. That means that serving the consumer market comes with essentially no opportunity costs; even if it did, those costs would be much smaller, given that Meta already has an at-scale advertising business to monetize usage.
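A back-of-the-envelope sketch (hypothetical token counts, not any lab's real numbers) of why agents change the demand picture: each step re-reads the growing transcript before producing its output, so tokens consumed grow roughly quadratically with the number of steps, with no human in the loop to slow things down.

```python
def tokens_consumed(steps, prompt_tokens=500, output_per_step=300):
    """Total tokens processed across an agent run of `steps` model calls."""
    context, total = prompt_tokens, 0
    for _ in range(steps):
        total += context + output_per_step   # the model re-reads the whole context
        context += output_per_step           # its output is appended for the next step
    return total

# A single chat turn versus a 50-step agent run over the same prompt:
single_turn = tokens_consumed(1)    # 800 tokens
agent_run = tokens_consumed(50)     # 407,500 tokens
```

Real agents also prune and cache context, so this overstates the exact curve; the qualitative point is that a 50x increase in steps produces far more than a 50x increase in tokens.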
In other words, Meta may actually face less competition in winning the consumer space than it might have seemed a few months ago, simply because that is their primary focus — and because they have their own model, which means they don’t need to worry about not having access to the frontier labs (much of this analysis applies to Google, of course). This, by the same token, is why Meta should open source Muse, just like they did Llama. The entities that will be most hurt by widespread availability of a frontier model are other frontier labs, who will see their pricing power reduced and face increased competition for compute. This will make it even harder for them to bear the opportunity cost of pursuing the consumer market, leaving it for Meta.

So is “the era of Aggregation Theory…behind us”? On one hand, the insight that the way to create and maintain value is to own the customer is almost certainly going to continue to be the case. On the consumer side, owning customers leads to advertising, which provides the revenue to serve those customers. On the enterprise side — which, I would note, has never been an arena where Aggregation Theory was meant to be applied — I think it’s likely that both Anthropic and OpenAI continue to move up the stack and deliver features that compete with software providers directly (an approach that is also in line with not making leading edge models publicly available). On the other hand, O’Laughlin’s observation that we are and will continue to be compute constrained is an important one: companies will not be able to assume they can serve everyone, because serving one set of customers imposes the opportunity cost of not serving another.
This won’t, at least in theory, last forever: at some point AI will be “good enough” for enough use cases that there will be enough compute capacity to take advantage of the fact that there really aren’t meaningful marginal costs entailed in serving AI; that theoretical future, however, feels further away than ever. OpenAI is betting that this compute constraint — and the deals they have made to overcome it — will matter more than Anthropic’s current momentum with end users. From Bloomberg:

OpenAI told investors this week that its early push to dramatically increase computing resources gives it a key advantage over Anthropic PBC at a moment when its longtime rival is gaining ground and mulling a potential public offering. The ChatGPT maker said it has outpaced Anthropic by “rapidly and consistently” adding computing capacity to support wider adoption of its software, according to a note the company sent to some of its investors after Anthropic announced a more powerful AI model called Mythos. The ambitious infrastructure build-out, criticized by some as too costly, has enabled OpenAI to better keep pace with rising demand for AI products, the memo states.

I’m less certain that this will be dispositive. When it comes to AI, distribution costs and transaction costs are still zero — the two preconditions for Aggregators — which means that the winners should be those with the most compelling products. Those products will win the most users, providing the money necessary to source the compute to serve them; consider Anthropic’s deal to secure a meaningful portion of TPU supply, which, given the capacity constraints at TSMC, is ultimately an example of taking supply from Google. I suspect that Anthropic can take more, including already-built hyperscaler and neocloud capacity. Yes, that compute will be more expensive, but if demand is high enough the necessary cash flow will be there.
In other words, my bet is that owning demand will ultimately trump owning supply, suggesting that the underlying principles of Aggregation Theory live on. To put it another way, I think that OpenAI will need to win with better products, not just more compute; then again, if more compute is the key to better products, then does supply matter most? Regardless, they’ll certainly be focused on delivering both to the enterprise customers who are driving Anthropic’s astonishing growth. The real cost may be the consumer market they currently dominate, given that Meta has nothing to lose and everything to gain.
Welcome back to This Week in Stratechery! As a reminder, each week, every Friday, we’re sending out this overview of content in the Stratechery bundle; highlighted links are free for everyone. Additionally, you have complete control over what we send to you. If you don’t want to receive This Week in Stratechery emails (there is no podcast), please uncheck the box in your delivery settings. On that note, here were a few of our favorites this week.

This week’s Sharp Tech video is on why OpenAI’s enterprise pivot makes sense.

Anthropic Anthropic Anthropic. In the current AI era, it feels like a new company is crowned the winner every few months, and right now Anthropic is wearing the crown. However, a point I make on Sharp Tech is that Anthropic’s exponential growth includes the part of the curve everyone misses: the company has been on this once-barely-visible trajectory for nearly two years now. Now the company has what is undoubtedly the most powerful model in the world — so powerful, in fact, that Anthropic says it can’t release it publicly. There’s reason for cynicism, given Anthropic’s history, but the part of “The Boy Who Cried Wolf” everyone forgets is that the wolf did come in the end. — Ben Thompson

The New York Times and Another Paradigm Shift. If you’re interested in media, this week’s Stratechery Interview with New York Times CEO Meredith Kopit Levien is a fantastic listen. The Times has nailed the internet era better than any media company in the world, and they’ve succeeded by making deliberate choices — a paywall before it was cool, a clear point of view, integrated business and editorial strategies — to differentiate themselves from a sea of commoditized content in an era of aggregators and content abundance. That playbook worked wonders for the Times in the previous generation of the internet, and I enjoyed hearing Levien’s thoughts on updating it for an era dominated by AI and video. — Andrew Sharp

The New Yorker Explains Sam Altman.
This week’s Sharp Text hit a few different beats, including thoughts on the Strait of Hormuz and a fun bit of E-ZPass history, but I opened with a take on the sprawling Sam Altman profile from the New Yorker. The 16,000-word profile is certainly an exhaustive recital of questions that have been asked about Altman for more than a decade, but better topics went unexplored. It’s frustrating — and representative of too much tech coverage — that so much effort went into what’s effectively a well-written Wikipedia entry, anchored by a predetermined conclusion, while ignoring questions more dramatic than whether Sam Altman is a good person. — AS

OpenAI Buys TBPN, Tech and the Token Tsunami — OpenAI’s purchase of TBPN makes no sense, which may be par for the course for OpenAI. Then, AI is breaking stuff, starting with tech services.

Anthropic’s New TPU Deal, Anthropic’s Computing Crunch, The Anthropic-Google Alliance — Anthropic needs compute, and Google has the most: it’s a natural partnership, particularly for Google.

Anthropic’s New Model, The Mythos Wolf, Glasswing and Alignment — Anthropic says its new model is too dangerous to release; there are reasons to be skeptical, but to the extent Anthropic is right, that raises even deeper concerns.

An Interview with New York Times CEO Meredith Kopit Levien About Betting on Humans With Expertise — An interview with New York Times Company CEO Meredith Kopit Levien about human expertise as a moat against Aggregators and AI.

Hormuz, Rushmore and a Sam Altman Story That Missed the Story — On the New Yorker’s profile of Sam Altman, the future in the Middle East, and the power of E-ZPass history.
OpenAI Buys TBPN
Mythos, Altman, New York Times
VLIW: The “Impossible” Computer
Gas Turbine Blades and their Heat-Defying Single-Crystal Superalloys
A Ceasefire and Reports of PRC Pressure; Another Politburo Investigation; Mythos, DeepSeek, and a Token Crunch
An Exclusive Hornets-Suns Report and Mail on LeBron, Wemby, the Pistons, ABS in the NBA, Bulls Fandom for Kids
Malone to Carolina and Karnisovas Out in Chicago, Cooper and Kon Battling to the Finish, A Jokic-Wemby Classic in Denver
Mythos and Project Glasswing, The Year of Anthropic Continues Apace, Q&A on the NYT, Altman, De-globalization
Listen to this post:

Good morning,

This week’s Stratechery Interview is with New York Times Company CEO Meredith Kopit Levien. Levien became CEO in 2020, after previously serving as Chief Operating Officer, Chief Revenue Officer, and Head of Advertising. I previously interviewed Kopit Levien in August 2022.

The New York Times editorial team always elicits strong reactions, both in the political realm and also in tech, but that’s not what this interview is about; what is indisputable is that the New York Times as a business is both incredibly interesting and incredibly successful. Over the last decade the newspaper has gone from strength to strength, building a thriving subscription business, expanding its bundle from news to Games to Sports to Cooking and more, and now — to take things full circle — has a rapidly growing advertising business.

We discuss all of that in this interview, starting with the Games and Sports categories, why the bundle is about expanding the New York Times brand, and the company’s recent push into vertical video. Then we discuss what it means to be a destination site, while also using Aggregators to acquire customers. We spend time on AI, including the New York Times lawsuit against OpenAI, why Kopit Levien sees humans as the moat against AI content, and how the company is using AI on both the business and editorial sides. Finally we discuss the potential for building communities, why advertising is working, and how surviving in an Aggregator and AI world is about fighting entropy.

As a reminder, all Stratechery content, including interviews, is available as a podcast; click the link at the top of this email to add Stratechery to your podcast player.

On to the Interview. This interview is lightly edited for clarity.

Meredith Kopit Levien, welcome back to Stratechery.

MKL: Hi Ben, thanks for having me, so happy to be here.
It’s hard to believe, but it has been four-and-a-half years since you last came on — I was thinking two or three years ago — nope, it’s almost half a decade. I was actually shocked that I’ve been doing interviews for that long, but apparently I’ve been doing them for like six, six-and-a-half years.

MKL: You have, and I’ve listened to a lot of them!

I appreciate it. Well, we already did the whole background conversation then, we both worked for the student newspaper, lots of commonality there. So let’s fast forward to the time of that interview. It was August 2022, and speaking of mind-blowing lengths of time, you had bought Wordle earlier that year — it’s hard to believe it’s been that long — and then you had just purchased The Athletic. How do you feel about those acquisitions five years on?

MKL: That’s such a fun place to start. We acquired both of them, if I remember correctly, within a week of each other, and I would say we feel great about both of them and both of them have exceeded our expectations in so many ways.

Is Wordle the greatest media acquisition of all time?

MKL: You know what I tell people? That New York Times Games is the most up-and-to-the-right thing I’ve experienced in my career in terms of just people’s attention to it and the way it kind of touched culture and still touches culture every day, and the ability for Wordle to be like a megaphone for these other incredible games that we already had that most people didn’t know about. And then what’s so amazing to me is we now have, I think, 11 games — half of them are free, half of them are paid games, tens of millions of people play our games every day. And we have made the vast majority, we’ve made those games. And before Wordle and after Wordle — Wordle in and of itself is extraordinary, but before and after, we’ve made other extraordinary games, it’s so awesome.
Is it a bit annoying that everyone thinks about Wordle, “Oh, you bought Wordle”, and you’re like, “Look, we made most of these, give us more credit here!”?

MKL: Listen, credit to Josh Wardle, it’s an awesome game, and it just touched culture like nothing else. But it has served us so brilliantly — it has just shined this huge light on all these other games and it’s given us a chance to prove our chops as a game studio and we just keep making hits. I am so proud of our games team, Jonathan Knight and the whole team around him, they have done such good work and they are still hard, hard at it, that team works so hard. I’m a Connections player, so Wyna Liu is my hero, but they’re all amazing and they put out really good work. Games, it’s going swimmingly, I hope we get to talk even more about it.

As long as we’re here, how has your — because we were talking a bit about how Wordle sort of came out of the blue — it was this game that popped up, you snapped it up, super smart — and we were talking in our interview about it being an in-point to the New York Times broadly.

MKL: Yeah.

Has that evolved as you expected or has it evolved in different ways? In the context of not just Games being a property but also it tying into the whole thing.

MKL: What a great question. To answer that, let me step back for a minute and say our strategy is for the whole of the New York Times and all the different parts of the portfolio to be an essential subscription for curious people everywhere who want to understand the world and make the most of their lives.
We’ve got three pillars to that: 1) be, and become even more every day, the world’s best news destination; 2) have these leading lifestyle products, including Games, but also Sports, Recipes, shopping advice, that really help people pursue their passion more deeply or better or enjoy it even more; and then 3) put those two things together, news and the lifestyle products, in an interconnected experience so that the New York Times is incredibly relevant to you every single day, whatever is going on in the world or your world.

Right. This is a point you made before: you wanted the New York Times to not just be — sometimes the news is slow, or sometimes stuff’s happening you don’t care about — and you wanted to have other stuff for people along the way.

MKL: Listen, I want to be really clear. We are first and foremost a high quality independent news journalism company, that is our mission, it is the most value-creating thing we do for society and economically, and that is by miles. And to your original question, it’s just amazing to have all these other points of introduction to people and all these other ways to bring people into the Times ecosystem and to get them to form a habit with us. Once we do that, once we can engage them in something, our bet is that we can engage them in more and more, and there’s lots of examples of that.

You mentioned you had three things: you had the news, you had the lifestyle, what was the third one?

MKL: Yeah, so news — news is such a small word for such a big idea.

You mentioned that sports is a lifestyle, so is sports not news? Is that lifestyle? It’s kind of interesting where that fits.

MKL: We do sports news, we do sports journalism, we do news journalism. But let me stay on the news thing for a minute, because we’re often even trying ourselves in how we articulate it to not let it be this small idea.
We do high quality, original, independent journalism, which means we are unearthing new and important information through reporting and also providing often deeply reported commentary and analysis on the really big topics that are going on in the world, and also on things that just matter at the level of relevance of people’s daily lives. You could read us today for what is happening with this fragile ceasefire in Iran, and you could also read us today for health advice or for what movie to go see or what restaurant people are eating in in New York City right now. News is this very broad thing at The New York Times, and we’ve got these four lifestyle products. I would say to you that what we’re doing with The Athletic is absolutely journalism — often it is news journalism — and make no mistake, we are doing it with the rigor and the independence that The Times does. It’s journalism, but we are doing it for fans.

Right. It never occurred to me until you sort of mentioned it — it’s not wrong to say that sports is a lifestyle category.

MKL: Totally.

That intersection is actually kind of interesting to think about.

MKL: Let me tell you something — I have an almost 15-year-old, he is an athlete, and he is a giant sports fan, and when I think, “What are his lifestyle pursuits?”, when I fill out the parent statement in the school applications, first he’s a sports fan, and The Athletic is serving that fandom.

Do you think there’s a bit where some of this sports journalism has been caught up in the “We are journalists” bit and has missed the fact that people watch sports in many cases as a pastime to relax? I look forward to turning on the baseball game at night, I don’t want the perils of the world, this is supposed to be an escape. It’s also most helpful to put it in this lifestyle category because that’s actually meeting people where they are.

MKL: I think that’s a great point.
What I will say is The Athletic often does very hard-hitting sports journalism — it is certainly covering the important topics and the tough topics across the major leagues and teams in the United States and European football and a bunch of other things, so it is doing that, hard stop. But if you look at the multiplicity of things they’re doing in a day’s time — it’s probably well over 100 stories that get published every day — an enormous amount of that is beat reporting on what happened to your team in the league you most likely watch, and it is literally meant to bring you closer to the team, the fan, the game. I think all high quality information is — consumers of information want uncompromised information, and The Athletic is uncompromised the way The Times is uncompromised: it’s going to pursue the truth wherever it may lead, even when that’s to uncomfortable places. But the whole purpose of the broad set of things we do at The Athletic is to make you a better fan, and we know that. And again, that does not mean we don’t do hard-hitting journalism, we absolutely do, but we are independent of anyone’s interest in that journalism but the sports fan’s. And for the Times, we’re not writing or producing our work for any particular audience, we’re doing it in service to the public’s interest. Is that the value of keeping The Athletic brand separate from the New York Times? MKL: We are absolutely committed to building the brand The Athletic. It was a deliberate choice, I’m very invested in that choice, and we’ve still got a lot of running room to build it. I’d say the biggest opportunity with The Athletic is just to make more sports fans. We’re making real progress with it, and let me tell you — you asked me at the beginning, “How’s it going?” — we bought a company that was losing a ton of money because they were investing in a huge sports newsroom; it’s like a giant newsroom with a little business.
We said it would take some time, but then it would be accretive to the Times — it is absolutely that. We got there in many ways earlier and better than we expected, and today we’ve got well over 500 journalists at The Athletic. So it’s an even bigger journalistic proposition and it’s really contributing as a business to The Times, and we’re thrilled about that — and I want to say we’re only four years and a few months in, we’re just getting started on all the ways we can support fandom of the major sports. I think we’re nailing the journalism thing — you’re always going to get better and better at that; they were good at it before we acquired them, we’ve helped them be even better at it, do it more robustly, do it in a more edited way, and add a layer of national, and in some cases global, sports coverage. But there’s just a lot of white space in the market to serve fans deeply reported, uncompromised information, and we’re going to do that. You have such a good product organization and you have the whole Games initiative — how much do you think about the prospects for games in the context of sports? Whether this be fantasy sports or a whole host of daily pick-ems — it’s interesting because there’s obviously a huge gambling angle to this, but how many of those sorts of offerings are possible without necessarily being gambling, or whatever it might be? MKL: Yeah, great question. We think there’s real opportunity for Puzzles/Games and Sports — we think we’re good at both of those things. We already have our first collab, I think it’s about a year old: we launched a Sports Connections puzzle, it is super fun. We did some great marketing for it with famous athletes, which was hilarious, and it’s played a lot, so people love it, and I would say that is early. We’re building out the team — we just hired a new Chief Product Officer at The Athletic, he comes to us after years of building communities at Facebook.
We took one of the guys from the Times newsroom who’d been a leader of the Upshot, who’s incredible at building interactive work, and he’s now leading interactive work at The Athletic, so we think there’s real opportunity for that. And I’ll tell you, just this week — it might even be today, I’m losing track of my dates — we are launching something called The Beast. I don’t know if you’re an NFL fan, but it is the most comprehensive guide I think that exists on the planet to the NFL draft class, and it includes information on literally thousands of players who are draft hopefuls and then very deep profiles of 400 of them. Before we owned The Athletic, and actually until a year ago, we’d publish it as a book, a physical book — it’s this monster book because there’s so much information in it, and teams use it, there’s nothing else like it. Now you’ll see as it launches this week, it’s got all these incredible interactive features on the individual player profiles, and if you love an NFL team and you really care, you’re going to pay attention to The Beast. So I think we’re just getting started on features that may be games and also other things that support a fan who’s super passionate about their team. I keep interrupting you, but you mentioned three things, so we’ve got to get that third thing. What was the third thing in addition to news and lifestyle? MKL: World’s best news destination, leading lifestyle products, and put those two things together in an interconnected product experience — a bundle that makes The Times relevant for whatever is going on in your world, or the bigger world, every single day. That’s the idea. Got it. We talked a lot about bundling last time and obviously that’s really the core of your strategy; how, though, has that evolved in the last five years? Is it really that most people are coming in the door through these lifestyle brands and you’re bringing them to the news, whereas it used to be the other way before?
I’m throwing that out there as a hypothesis — how does that actually work? MKL: I actually think the essence of it is about having this portfolio of world-class news coverage, news broadly defined, and then not just products, but products that either are or are becoming the leaders in their category. These categories are giant spaces where tens of millions, in some cases hundreds of millions, of people spend a lot of time. It’s the fact that we have rare and valuable news coverage and lifestyle products in these huge spaces that’s really working. So to me, the word “bundle” can mean — the lowest common denominator version of it is, “It’s a marketing concept or merchandising concept” — but in our case, we’ve got this singular idea of being essential in meeting a lot of different kinds of information and experience needs in a person’s life. Rather than it being this idea of, “We’ve got one big important thing” — I’m going to come back to news in a minute because news is central to all of it — where you’ve got this one major hero thing and then you append a bunch of other stuff so the consumer thinks there’s some other value there, we have invested in and built these products out in such a way that each thing should be deeply valuable on its own: the person who cares about buying the right products and is going to deeply research them, therefore, uses Wirecutter. You talked about expanding the brand — is this what you mean? Where you hear “New York Times”, it’s not — of course news is always the most important, I know you’re going to say that, so I’ll say it for you. MKL: I’m going to say that again and again, because it’s true. It’s also the most economic-value creating thing we do. Right. But you want people to think, “New York Times, that’s the best games”, or, “That’s the best cooking”.
MKL: New York Times makes the best puzzles, it has the best recipes — and, by the way, just advice for home cooks who want to cook — it’s where I go if I’m a sports fan, and it’s absolutely going to give me the best uncompromised shopping advice. That’s sort of the spirit of it. It’s not just a news indicator, it’s like a “stamp of quality” indicator. MKL: It’s a stamp of rigor and quality, and I’m going to keep using this word, “uncompromised”. Really high quality information that’s done in an uncompromised way and therefore has value at real scale. And the “uncompromised” comes from the business model? MKL: Uncompromised comes from the idea that at our core what we do is independent journalism. You could even say every bit of it — even the games are journalistic in that they are planned in a very deliberate way and thought out. Right. They’re not randomly generated, someone is actually editing every puzzle. MKL: That’s right. Humans with expertise are making these things, and in some cases harnessing technology to do that even better. It’s really working, and I want to say to you, I wouldn’t have had these words four-and-a-half years ago, but at the core, what we’re trying to do in a very complex information ecosystem — really shaped and controlled by a small number of dominant tech platforms — is make news coverage and products that are so good that people seek them out and ask for them by name. A destination site. MKL: Seek them out, ask for them by name, make room in their lives. The destination site has been — there are a few companies that I always feel very pleased about, I feel like they’re my children in a way. MKL: Are we one of your kids? You are one of my kids! MKL: I appreciate that, we could use all the parents, we could use it.
That’s why I loved that strategy document you guys put out — it’s been like a decade now — I’m like, “This is beautiful”, and I think it really was on this point of destination sites: the way around a world of Aggregators that commoditizes everything is that people have to seek you out directly. Google will say competition is only a click away and no one seems to take that seriously, but people can actually click on you and go there. MKL: My answer — we all read your Aggregation Theory and all the updates you’ve done to Aggregation Theory. The way I think about it is, for more than a decade, we have had these four D’s that we’re obsessed with. Ready? So what do I mean by that? We know we exist in an ecosystem shaped by these dominant tech platforms, and so we have to have a wide free layer for our work — we have to, otherwise you can’t bring in the next subscribers. So we are very deliberate where we can be about how we go about doing that, and the idea is we need to be able to get you to sample our stuff and fall in love with it, and we’ve got to give you enough time and space to make a habit of it so that ultimately you subscribe. Yeah, that’s really interesting. I was going to ask this towards the end, but that’s a good lead into it. You’ve had a big focus on video recently, and it’s super interesting — actually, I have a few questions about this. One is, it’s pretty weird to go to the video tab on the desktop and all the videos are vertical. Was that very controversial? MKL: There’s video all over the site now, so you’re gonna see it in a lot of places. When we say destination, we know a lot of people during the workday are reading us or watching us or listening to us on the desktop web, but we are very much phone-first. Our bet is that if you’re watching a video on a phone, you are going to want it in vertical, and we now have a home for it in this tab.
I encourage everybody: download our app and you get the best version of what we’re doing — make sure you register your user account and get the experience. It’s really interesting, because I’ve noticed with Stratechery actually, a huge portion of my audience now is just audio — I think more than half my subscribers listen instead of read. You mentioned you mostly listen, which is fine. But as far as the reading goes, I still have a huge number of people reading on the desktop as compared to mobile. MKL: By the way, I listen when I run, because all my other media time is reading. And now I’m forcing myself to watch. Right, you’ve got to dogfood it. MKL: I’m like listening to YouTube when I run. Just talking shop, is there a bit where, as you look back on the evolution of media, it turned out that the browser ended up being a text medium, and then the phone was actually the multimedia platform? MKL: That’s such a great question, that’s so well put, and I need to take that in for a minute and think about it. What I’ll say that I think is related to that: in a web world, we needed a website that people would type in and then pin and always be able to go back to — that worked, and the Times has been very good at that. In an iOS and Android world, we need an app, and we’re very, very good at that. I would actually say to you, we’re still pretty early in getting more and more people to use our app. Today, the majority of people who use our app are subscribers — the engagement is enormous, but it’s mostly the people who subscribe. We have not made the app a really important place for prospects, and we’re starting to do that; the Watch tab is part of that. I think it remains to be seen whether the Times can be as preferred a brand and a source for watching as it is for reading and listening.
Which, by the way, I want to say to you, those things are not going to go away — we’ve been at this for 175 years. MKL: The old media doesn’t go away, the people who do it still do it. They vary it a bit, but many of them still do it. To your point, this is a big part of your approach: you have this huge reporting base, and the medium — it’s all ones and zeros — they can write an article, and they can be on a podcast, and they can show up in video. MKL: And they can put a camera — they can literally hold a camera in front of them from somewhere on the edges of Iran and describe what they’re seeing. So I think it remains to be seen, I think the market is still kind of forming and structuring. We regard video as doing three really important things for us. One is it helps us engage the people we already have, and anything that helps us engage the people we already have is very good for business. Churn mitigation is always a win if you’re a subscription business. MKL: It’s good for business, and I would argue it’s good for journalistic impact and everything. Good for society, but very good for business. We also think there is an enormous number of people in all generations of life, but especially young people, who spend time watching, and they’re either watching news or they’re watching things that are in an adjacent zone. We are the only generation that really just maximized text — it’s been all downhill ever since. We got all the text in the world, we read it all, and now everyone’s just watching video. MKL: I could do a whole other episode on that, and the fight to get my very intelligent kid to just sit back and read, and how important I think that is to brain development. But we think video will help us engage whole new audiences — that is a big bet we’re making, we’re already starting to see some of that, and we are very excited about it.
And then the third thing that video does for us, and I think that’s really important — I think we all know that trust in all institutions is at an all-time low, trust in media is at an all-time low; I hate the word “media” because it lumps in journalism and a bunch of other things, but trust in all of it is low. And the more we can show you the work, the more we believe you will come to understand what an independent journalistic process to pursue the truth wherever it may lead looks like. Interesting. So it’s brand-enhancing for what you’re going for overall. MKL: Totally, and trust building. I’ll just tell you, we are much more aggressive today than we’ve been. One of the formats that we’ve scaled the most, and there’s still so much room to go, is just a reporter on camera describing the story. Which, by the way — then your production is vertical anyway, so it ties right in. MKL: But there are times you go into a studio and explain something, so it doesn’t have to only be vertical; it goes a really long way. And we have made a very deliberate choice where we’ve said, we don’t particularly have a business model on TikTok or Instagram or YouTube Shorts, but we’ve got to be in those places. I wanted to ask you about that, because when you think about podcasts, for example, there’s a huge push in general to be on YouTube, and I think it’s pretty obvious why: podcasts are incredible for audience retention. I’ve talked about this for my business — all these people listening to Stratechery don’t go anywhere, whereas before, people would have emails build up and think, “I have too many emails, I should just unsubscribe”. The problem is I get much less sharing, because it’s much easier to forward an email; with a podcast, you just go to the next podcast and then it’s sort of done. So you have podcasts in general going to YouTube because they feel like the algorithm is the way to acquire new users.
The reason to bring this up is I go to the New York Times YouTube page right now, and your last main video is from seven days ago. Your last Short is more recent, but it’s about Trump escalating threats to destroy Iran — well, there have been some news developments since those threats. MKL: You think? Consult the top of the app. But the point is, clearly it’s not a priority for you. How does that tie into the balance of destination site versus customer acquisition and all those sorts of things? MKL: It’s a great question. Let me start by saying our general thesis — and I’ve been here a long time now, so I’ve got enough reps to say it bears out: if we make great work that should scale because it’s unlike anything else out there, and it’s important, it will. I want to say that, that is our bet. And so I will say to you, we’re still at — That’s my bet too. MKL: I’ve listened to enough of your work to know you think that too. It’s a really important principle that we’ve hit again and again and again as a business. First, we have to make the best stuff there is, and it’s got to be done in an independent way, and it’s got to be done with rigor to a high standard of quality. So the chapter we’re in now with video is very much scaling production, which is, “What are we making?”, “What is it?”, “What is the New York Times if you can watch it?”. We are early in that, and we’re going to admit that all over the place. We are, as I started to say, putting a lot of that work out there. The best place to experience it is to come to our app or go to the website — even if some of it is shot for vertical, the best place to experience it is our destinations. But we need to be in the places where huge numbers of people are. So the work is also on TikTok and Instagram; it’s on YouTube in short form, and we’re starting to put our longer-form stuff there too.
And the truth is, it’s a place where we can see — you are right, a lot of it is dictated by algorithms — but you also get a sense of what is a hit. I’m going to name a few things that are just unequivocally hits at the New York Times as video. The Ezra Klein Show was only a podcast, it’s now a video show too — that guy is so brilliant, he has such an incredible following, we are so excited about that show. Right around the time we were putting him on video, we launched a show with Ross Douthat: to the extent that Ezra is examining the biggest ideas on the left, Ross is examining the biggest ideas that are animating the right. Ross has been a longtime columnist at the Times; I think we launched the pod and the video at the same time, it was one of the first ones where we said, we’re going out with both. You say they’re going huge — are they going huge on your properties, or are they going huge on the RSS feeds and the other platforms? MKL: Out in the ecosystem. And when I say huge — we were early in all of this — they’re building audiences and growing. The Daily is huge. The Morning — we have the largest general interest news newsletter, I think, on the Internet in terms of readership; five or six million people open it every day. And do you see very tangible, measurable — people are finding this on other platforms and coming back to the Times and subscribing? Or is this more ethereal: this is enhancing the brand, and in the long run it will pay off? MKL: It’s a great question. The broad answer I’m going to give you — and I ran the subscription business for a long time, I was on top of the product organization, I was accountable for it — the thing I’m sure of is that we have to make stuff that is so good that it’s worth paying for even in the presence of free and less expensive alternatives, and we also have to have many tens of millions of people who do not yet pay who are regularly engaging with our work.
We do believe we have to be out there in the ecosystem — of course, you and I both know, we see a receding link-based economy. Did you see that discussion between Nate Silver and Nikita Bier the other day? MKL: Oh, I haven’t seen it yet. They were talking about — because Nate Silver did some sort of article about who’s getting prominence on X and things along those lines — and one of Nikita’s pushbacks about The New York Times not having prominence, not just on X but on all social platforms, is that you do what I do, which is — we’re old and lazy and just post an article with a link, and Twitter doesn’t feature links anymore. Fine, it is what it is, I have my built-in audience, it’s okay. But if you actually want to grow, you have to do the whole thread thing — “This is what’s in this article” — and at the end there’s a link. And Nikita pointed out that the New York Times does the bare minimum, it’s basically like an RSS feed for links; of course they’re not getting featured. Is that something where — I’m telling you now, you didn’t read it — you’re like, “Oh yeah, we should fix that”? Or is it, “Well, you know what? We’re not a social media company, we are a destination site, and that’s just the way it’s going to be”? MKL: It’s a fair question. I think you should regard us as first and most importantly trying to make the best stuff that can and should scale because it’s amazing. And remind me — I’m going to mention two other video shows to you that are so different. And then we are also looking to always master the evolving audience ecosystem. And I think if you’ve followed us, it’s interesting — on YouTube, we’re doing more now show by show to build audience. So just like you mentioned the New York Times channel — Ezra’s feed is surely updated, Ross Douthat’s feed is updated. I’ll mention these two other shows.
Our cooking team launched a show maybe six months ago called The Pizza Interview — we have this amazing test kitchen on the west side of Manhattan, and every major celebrity with something important to say can come on that show now; they make a pizza and they talk about their work. So the cast of Stranger Things came with the finale, Ariana Grande came. That’s a great concept. MKL: It’s amazing. And that show is building so much momentum — so different than what you would expect. It is fun, it’s really working. We’ve had a show — I don’t know if you’re a music fan, Ben, but we’ve got a music critic and a music reporter, Jon Caramanica and Joe Coscarelli, and they have had a podcast at The Times for like a decade called Popcast, where they talk about music. It was sort of made at the edges of the enterprise, these guys are so talented, and we’ve just brought them to video and kind of prime time, and man, is that scaling. They actually did a live show at an all-company meeting with Lizzo, it was unbelievable. They’re getting everybody, it’s so, so great. What you see is we are just in the early days of saying, “How and where should we build the big audience for this?”. The Daily, which nine years in is still among the top podcasts — I think it’s the largest general interest news podcast — most people do not listen to it on The New York Times, they listen on Apple or Spotify. MKL: And you know that because of what you do for a living. So we’re open-minded about that, and also pushing really hard on the companies that shape the ecosystem to make it so that great stuff can scale. Yeah, I’ve had plenty of discussions with YouTube. MKL: I’m sure we’re going to talk about that too. Well, we’ve actually gone quite long, but I do need to ask you about — there’s this technology called AI you may have heard of, and I do have a few questions for you on that. Just to get it out of the way, you’re in ongoing litigation with OpenAI.
Obviously, I’m sure that constrains what you can talk about to a certain extent. But big picture, what’s the point of this? What do you want to accomplish? MKL: We’re in ongoing litigation — two-and-a-half years now — with OpenAI and Microsoft, and we’ve also sued Perplexity. Why? They stole our stuff, they used it without permission, without fair value exchange — copyright infringement — and they build products that compete with us, so that’s why. Let me just say, why did the Times do this? We have spent over 175 years and an enormous amount of resources on high-quality independent journalism, and I want to say this: we’re fighting here, obviously, for the Times, but also for the industry writ large, for high quality journalism and content creation writ large, and for the public to have high quality information and content. We have made an enormous investment, we’ve been doing it for a very long time, and we have a huge number of works. Is your biggest concern the training or the output? MKL: We believe that there should be sustainable fair value exchange for our work used in any way — number one, fair value exchange, sustainably. Number two, we believe we should have control — and the law says we should have control — over how our work is used, and I would say those are kind of for everyone. And for the Times very specifically, by the way, we’re not just suing; we have a deal with Amazon, we choose to deal. These things are of a piece: enforcement of our rights in court and dealing are both ways to put a stake in the ground to say high quality journalism deserves to be paid for, and it should be. And, by the way, the LLMs are only going to be as good as the information that courses through them. The third bit is, can we do a deal that’s consistent with our long-term strategy, which involves ultimately having direct relationships with our consumers.
Do you worry about — you’ve had this huge growth in these lifestyle verticals, things like recommendations, things like cooking. Some of those things AI is really, really good and useful at; do you feel a threat there? Have you seen an impact there? MKL: We’re enforcing our rights in court for very specific reasons. I want to do a number of AI categories, so let’s set aside the court case. Let’s just say, in terms of NYT Cooking — super compelling. But also, I can go to ChatGPT, ask for a recipe, and it will give me one. MKL: Totally fair question. I want to say to you first, we’re also using AI assertively in our product. Right, my next question is how you’re actually using it. MKL: Let’s come back to that. The most important part of our strategy — and maybe, to the extent there’s a theme from this conversation — is that The New York Times creates human-led, high quality news journalism and all this other stuff, including recipes, that are better because of the humanity, the expertise, the professional process that goes into them. And I want to say, because you asked about cooking specifically: we have 25,000 recipes and counting in a database, and every one of them is human-tasted, human-tested — they’re better. People say to me all the time, “Your recipes are just better” — yes! Because professional chefs and cooks are making them, and a recipe doesn’t get published until we’ve done that. We think that’s going to have enduring value; we think in an information ecosystem where it’s harder and harder to find quality stuff, brands are going to matter more and human-made content is going to matter more. The week you filed the lawsuit, when I wrote about it, I entitled it The New York Times’ AI Opportunity. MKL: I remember what you wrote about it. In this world of everyone getting individualized content, that actually makes you more valuable, not less. MKL: Listen, society needs a shared fact base.
People need high quality, uncompromised information, and they need to be able to find it with ease, and they need to be able to know what is true and worth their time — and we think the Times and each of our portfolio brands, each of our lifestyle brands, is a signal of that. So we are obviously investing enormously into all that. Has that been validated in the numbers? MKL: Look at our business results. It’s been a strong period for our business results; I can’t tell you what will happen in the future, but I can tell you we are very, very focused on two things. One, making our products even more rare and valuable at real scale to people. And we are also incredibly focused — part of how I got into this chair — we are incredibly focused on harnessing technology to make the journalism richer where it can help us do that, to make our journalists able to get to more things or get to things more deeply. We are incredibly focused on using technology, and this includes AI, to make the work more accessible. I told you earlier, I’m a runner — you can listen to almost every article now. You can’t listen to the live journalism, but everything else you can listen to in an automated voice, and I think we’re on the third generation of that voice; it’s so much better. It’ll still mispronounce one or two things, but it’s great. See, I read my own articles and I still mispronounce things, so maybe that’s actually the human component. The moment it starts pronouncing things perfectly, I’ll know it’s a robot. MKL: We’ve been aggressive with that.
Let me give you an example in the journalism: the Epstein Files — I think it was like three-and-a-half million pages — came out late in the day on a Friday, and we’ve got a whole AI Initiatives team in the newsroom, and they built a tool to be able to comb those documents. The magic of what we were able to do with them was that we could create this tool that said, there are all these different story angles to get to — how do you get at them with ease? And then the beat reporters and the editors who have the expertise and the rigor to say, “What should the public know from this?” — it’s the combination of those things that made it awesome. I’m going to give you one more example, where I just immediately said, “Oh, there’s a real interesting opportunity here”. Remember the Sydney Sweeney jeans/genes thing? MKL: So the early read on that was that the left was up in arms about this Sydney Sweeney ad, and we had journalists who did a story using AI to comb social media to say, “How did this happen?”, and what they found was that it was actually a construction on the right — it started as a construction. The idea that there was fury about it started as a construction on the right and then became a bigger thing. So I think with any new technology, it is our job — it is my job — to see that people are not afraid of it and are using it in responsible and appropriate ways. We’ve just rolled out Claude Code to our product engineering team, so they can prototype faster and do all kinds of things. So The Times is not anti-AI or anti any other tech. We have laid a stake in the ground to say this next chapter of the ecosystem has got to be shaped in a way that allows high quality journalism organizations, and other high quality creative content organizations, to do their work in a way where they can earn the living they should from that work — but we are certainly not anti-tech.
Just to go back to this AI bit and The New York Times AI Opportunity idea. You just touched on the, “This is a trusted brand, it’s validated by humans”, it’s leaning into the humanity of it. I’ve expanded that bit a little bit as well as I’ve been thinking about this thesis, and I have this concept that I’ve been thinking about called totem content, where if everyone is reading AI content, everyone’s reading different stuff. The idea of having one piece that, “Did you read the Stratechery article today?”, or whatever it might be, is actually going to be more valuable, not less. I’ve been thinking about this in the context of community, it feels like no content company has ever solved community. You have a thriving comment section, but you’re not making friends in the comment section, it’s sort of a performative bit. MKL: We’re not introducing friends to one another, not necessarily yet. If I know someone who is interested in the same sports team or is interested in Wordle or Connections or whatever it might be or is interested in a particular facet of the world and I knew who they were, there’s something there and there’s a continual trigger for us to talk about it. Where’s your thinking about this? You do this all the time, there’s lots of group chats with New York Times articles shared in them, is that something, though, that you want to or you see an opportunity to lean more into? MKL: My very short answer is yes, with like a double underline. Yes, yes, yes. At the core of the mission’s role is to help society make sense of itself in a way that serves the common interest, the public interest, “common” is the main word in community. So yes, and I agree with you, I don’t think it’s been solved in any way yet by us or anybody else in the sort of publishing or journalism industry, but we’re beginning to focus on it much more earnestly. I want to say two other things.
Within the news report, we do a ton of culture and lifestyle journalism, and going back a couple of years, we launched the 100 Best Books, and we launched it with a bunch of input from experts beyond the Times, but of course, all coalescing around our books experts and we launched it with a bunch of features, because it was like an inherently shareable idea, “I read these books, Ben, you should read these books, what’s on your book list?”, and then we did it for movies. We’re just at the beginning of it, I think it’s a huge opportunity, I am super interested in it. And the last thing I want to say, and it kind of brings us back to where you started with me. I will never forget, I was with my son and his friend, on the ferry to the Vineyard, and his friend was like, “Oh my gosh, I play Wordle every day and then after that, I go and I play…”, and he named four rip-offs because he liked the game so much. Point being, we need to make more games, we have, we did, we’re still making more. But none of those competitors’ games, you know, people may play them, but like you don’t hear about them the way you hear about Wordle, they haven’t broken through. Why is that? There is one puzzle a day from a company whose brand ethos is it makes you smarter that you do with the people you love and by the way, it’s true for Wordle and Connections and Strands. Everyone’s playing the exact same puzzle. MKL: And it is a shared experience. Just to go back, you asked me about sports, fandom is a shared experience, and we’re thinking very hard about how we support that game moment in a way that I think The Athletic has a very big opportunity here. And I think in news, what we want, journalism can’t solve society’s big problems, and there are many big problems, but society’s problems cannot be solved without high quality independent journalism.
So the idea of, “Can we get more people engaged with one another?”, on really big, important, weighty topics that need independent journalism, I think that’s a big idea and a big opportunity for The Times, for journalism, for the country, for the world. Has the New York Times fully crossed the Valley of Despair in terms of advertising? Part of all this was you had to like build a subscription business but now that you’re known as a subscription business, advertising is suddenly a growth opportunity instead of a decline to manage? MKL: I came to run the ad business, the woman who runs the ad business now, Joy Robins, she’s an extraordinary leader. The ad business, I joke all the time, is going so much better under her than it ever went many years ago. I think that we have really found a formula that works. What is that formula? MKL: We are a, and I bet, long after I’m here, we are a subscription-first business, meaning we make things that are meant to be extraordinary to consumers at great scale. So many of our ads are shown to subscribers because so much of our engagement is from subscribers and we’re obsessed, especially in a changing ecosystem, with getting the next group, the prospects, really, really, really engaged with our work and our obsession with engagement and with quality products in giant spaces that marketers want to be near, news broadly defined, but on the authority of news. Marketers want to be next to other healthy, thriving brands, and I think The Times is that today, but they also want to be in sports and they want to be next to our games, which are cultural sensations, and by the way, do you think marketers like shopping? Quality shopping and cooking, there are so many marketers who want to do stuff with that.
I do think we’ve arrived, I’ve been more optimistic and excited about our ad business over the last year than I’ve been at any other point and I think given the scale that we have achieved — Ben, you and I both grew up on the web, just think about the number of page views the New York Times has, like, all that engagement. And we’ve spent half a decade, longer than that, building very sophisticated first-party data. So we’re never going to have the scale of a platform or the targetability of a platform, but we are certainly well above what I would suspect any other kind of publisher can do. That’s the question — is there anything actually generalizable from the New York Times? Like you’ve done it, you’ve won it, can anyone actually replicate this? MKL: First of all, we have not won anything, I want to say that very clearly. We have so much more to do, to grow, to make sure. Relative to basically every other newspaper, I’m going to declare you a winner. MKL: Let me tell you the few things that I think are absolutely extensible. I often say we’ve spent so much of our time wanting to make a market and then support a market for digital subscriptions to journalism, and journalism being something of value that is worth paying for. We believe that a thriving, healthy ecosystem with lots of competitors who we’re fighting every day with is actually better, it’s certainly better for society, we think it’s just better generally. And I want to say there’s you, Puck, there are so many other things that have been invented since I came to The New York Times. So in some ways, there are aspects of the information ecosystem and journalism that are thriving, certainly not local journalism, certainly not deeply reported journalism and that’s very unfortunate. The things that I think are extensible, one, when I get asked, “Why has the Times succeeded?”, if I can only give one short answer, it is we kept investing in journalism, that’s it.
Good times, bad times, we kept investing in the journalism. There was something there that actually was worth paying for, one. And two, we stuck to our values. So the Times can’t be bought, the journalism is never compromised, we can’t be cowed, we can be hated in lots of places, and people know they’re still going to get our best understanding, they’re going to get the results of a pursuit of truth wherever it will lead, even when that’s to uncomfortable places. If I had to boil it down to like two short things, I’m ripping off a line from our publisher, AG Sulzberger, that I think does it so beautifully, he says, “It’s value and values”, we kept investing to make sure the product was still really valuable and then we just never let go of our values, I think that those are ideas that are extensible to everyone. The other thing I’ll say to you, and this is maybe my contribution, we clocked early on, 9 or 10 years ago, we are competing for engagement with the most powerful companies, information companies the world has ever known, who are so much richer than us, so dominant, and we’ve got to get really good at engagement. We’ve got to get really good at making people want to come back, and we’ve also believed in the power of brands as signals to get people to ask for us. I say all the time, they’ve got to ask for us by name. The New York Times, Wordle, Connections, Strands, The Athletic, Cooking, Wirecutter, people have to ask for us by name, and we’ve invested into all those things, I think those are all extensible ideas. Well that’s why I say you’re one of my idea children, destination site, I write about Aggregators and my personal strategy is to do everything the exact opposite of them because why would I want to even compete in that game? So that certainly resonates. MKL: And you have so many readers and listeners at The New York Times, we’ve been reading you for so long that you have felt like a parent to us. Well, I appreciate it.
You are, for the record, older than me, The New York Times I should say. 175 years this year, very exciting, congratulations. MKL: (laughing) Very exciting. Can I say one thing? If we can do anything with like a 175th — Is it a birthday? Is it an anniversary? — if we can do anything in this moment, the most important thing we want to accomplish is just raising people’s consciousness for the idea of what high quality independent journalism is and does. It is human beings with a professional process and real expertise going out into the world and unearthing new information, following a very honed professional process to do so, so that the public can know what’s happening. We are spending a lot of our energy this year at 175 years old, just trying to remind people what that is and there’s so many other things you can do in media now. You know, I listen to a bunch of stuff, there’s so many things that are like adjacent to news. Oh, I appreciate it. I’m not a reporter, so I need someone to actually go out and unearth facts. MKL: But it is not that, most of it is not that and I think as local journalism has been in such dire straits for so long, and there are so few local newspapers and fewer journalists and as people get more and more of their media diet fed to them by an algorithm that’s meant to match the things they already think and as leaders work to discredit independent journalism with all those forces going on in the world, I think the public has a — I think it’s just harder to know or remember or be conscious of the importance of the thing our journalists are doing every single day. There’s one thing, I know we’ve gone slightly long, but when you say that, what I find inspiring and why I like to talk to you and write about the New York Times is, I’m sure it’s a relief to you, I’m just completely independent of any partisanship or political angle. MKL: Totally, you’re not compromised.
I find it so interesting from a business perspective, and what you’re articulating there is inspiring: it’s a fight against entropy, where the easiest path for people and for publications is to just give in to the algorithm, as it were. And it’s kind of nice to go to YouTube and not see any of your videos there, because it’s sort of like an assertion that that’s not the path we’re going to go, and I certainly can relate to that and find that inspiring and that’s why I enjoyed talking to you. MKL: I enjoyed talking to you, this was a lot of fun, thank you. This Daily Update Interview is also available as a podcast. To receive it in your podcast player, visit Stratechery. The Daily Update is intended for a single recipient, but occasional forwarding is totally fine! If you would like to order multiple subscriptions for your team with a group discount (minimum 5), please contact me directly. Thanks for being a supporter, and have a great day!
We have to be a daily habit.
We have to have direct relationships with people.
We have to be a destination, and let me say to you, by destination, I mean, we do most of the economic value creation and we also give the best experience if you actually come to us in the whole of the experience.
Then I say the fourth D is we only do drive-bys if they’re deliberate.
Welcome back to This Week in Stratechery! As a reminder, each week, every Friday, we’re sending out this overview of content in the Stratechery bundle; highlighted links are free for everyone. Additionally, you have complete control over what we send to you. If you don’t want to receive This Week in Stratechery emails (there is no podcast), please uncheck the box in your delivery settings. On that note, here were a few of our favorites this week. This week’s Stratechery video is on Agents Over Bubbles.
Formula 1 Spins Off Track. Last fall Eddy Cue and Tim Cook agreed to pay a reported $750 million over the next five years to become the U.S. broadcaster of Formula 1. That deal kicked in this year, and for this week’s Sharp Text I wrote about the disastrous first month of the new era. In short: through no fault of Apple’s, Formula 1 is suddenly an acrimonious mess. Fans are mocking every race, the newly redesigned engines are a problem, and the greatest driver in the sport is threatening to retire at 28 years old. Where will it all go? I have no idea, but the Miami Grand Prix is one month away, and it’s now time for everyone to search for solutions. — Andrew Sharp
Apple’s First — and Next — 50 Years. When it comes to discussing Apple’s last 50 years, and their prospects for the next fifty years, there are two obvious choices: John Gruber and Horace Dediu. I have the pleasure of talking to John twice a week on Dithering — we discussed Apple’s anniversary on both Tuesday and Friday — but I was particularly excited to spend 90 minutes with Dediu on this week’s Stratechery Interview. This wasn’t just a podcast about Apple, but about how tech has changed over the last fifty years, and why AI makes even the most reliable narrators of history increasingly uncertain about the future. — Ben Thompson
Security and AI. Glancing at headlines in the aftermath of the Axios hack, I was briefly under the impression that a buzzy D.C.
news organization had just suffered a breach of its email list. Unfortunately the real story is a bit more ominous. So if you, like me, had never heard of Axios or a “supply chain attack” before Monday, start with Ben’s Daily Update on Wednesday, which made the mechanics of the hack more legible. We went deeper on Sharp Tech to explain why the Axios hack matters, and what sort of tension this week’s news portends — including why AI will make security issues worse in the short-term, but may be the solution in the long run. — AS
Apple’s 50 Years of Integration — Apple has survived 50 years by being the only company integrating hardware and software; if the company loses because of AI it will be because the point of integration changes.
Axios Supply Chain Attack, Claude Code Code Leaked, AI and Security — AI is going to be bad for security in the short-term, but much better than humans in the long-term.
An Interview with Asymco’s Horace Dediu About Apple at 50 — An interview with Asymco’s Horace Dediu about his career in tech, Apple’s first 50 years, and the prospects for the next 50, particularly in the face of AI.
A Snap of Oversteer — Formula 1 began a brand new era with a very bad month. Can the sport get back on track?
Apple at 50
Will AI Disrupt Apple?
The Supercritical CO2 Turbine: Waterless Wonder
The U.S., China and Iran; A PRC-Pakistan Peace Plan; KMT Chair Set to Visit China; Huawei, Manus and ZXMOTO
Don’t Blame Cayden Boozer, Stretch Run Notes on the Sixers, Celtics, and SGA, Geriatric Millennials Approach Extinction
Five Questions on Apple at 50 Years Old, The Axios Hack and AI Security, Q&A on Starlink, AI IPOs, AirPods
Listen to this post: There is a weird phenomenon as a sports fan where the athletes on the field or court are older than you…and then they’re your age…and then they’re all younger than you; for me the last athlete I could look up to, at least in terms of age, was Tom Brady. Tech companies are similar, in a way. I like to write about tech history, and the importance of origin stories for understanding company cultures, and I’m fortunate enough to have witnessed most of those origins. However, there are still some companies that pre-date me — the Tom Bradys of the industry, if you will — and one of those is Apple, which turns 50 tomorrow. My first computer was a hand-me-down IBM-compatible 286 — I don’t even remember the brand — but I mostly cut my teeth building my own computers with overclocked Celeron chips in college, using parts procured by leveraging unsustainable dot-com era customer acquisition strategies (a unique email address meant a PayPal account with a free $25 and a single-use credit card with another free $25 used for a Value America account with a $50 off coupon). Needless to say, I not only witnessed many of these companies’ births, but also their deaths! There were Apple IIs at my elementary school, where I would type out programs in BASIC, but my first serious interaction with the company’s products was at the college newspaper doing layout in QuarkXPress; after I graduated I was smitten by the iMac G4 and its adjustable arm, and the GarageBand addition to the iLife suite; I ended up buying an iBook, and here I am, a quarter of a century later, typing this Article on a MacBook Pro. In my history is much of Apple’s history.
I missed the very early years, when the Apple I was a mere circuit board created by Steve Wozniak; Steve Jobs bought the parts for the initial batch on net-30 terms and paid them off by receiving cash-on-delivery from a computer shop in Mountain View; it was the Apple II, released in 1977, that made the company, and that was my first encounter with Apple. The Mac came out in 1984, and found its niche in desktop publishing; that’s how I came back to Apple in college. Apple, however, was struggling in the face of more capable modular Windows PCs, which I was happily building in the meantime. It was OS X that changed Apple’s fortunes with nerds, and Jony Ive’s stunning designs that changed the value proposition for everyone else; iLife, meanwhile, made the Mac useful from day one. It was the combination of all three that made me a customer, and as the Internet destroyed lock-in, it was the fit and finish of the operating system and Apple’s independent developer ecosystem that made my two years at Microsoft with Windows a drag; then, in 2020, Apple’s differentiation came full circle: Macs were the fastest personal computers — particularly laptops — in the world. There were, of course, other parts of the Apple story, including the iPod and, most importantly, the iPhone. Those were the products that made Apple the most valuable company in the world for years (today Apple is surpassed only by Nvidia). These products addressed a far larger market, but were still very much Apple, a company that, all these years later, faces no competition when it comes to integrating hardware and software. What do I mean by “no competition”? Well, consider Apple’s nominal competitors through the years:
IBM: This is, perhaps, the most iconic photo from early Apple. The Apple I launched in a world where computing was primarily for the enterprise, and primarily happened on IBM’s mainframes.
Increased accessibility of processors and memory, however, made hobbyist computers possible, which is exactly what the Apple I was. It was the Apple II, however, that made IBM pay attention; I explained in 2013’s The Truth About Windows Versus the Mac: In the late 1970s and very early 1980s, a new breed of personal computers were appearing on the scene, including the Commodore, MITS Altair, Apple II, and more. Some employees were bringing them into the workplace, which major corporations found unacceptable, so IT departments asked IBM for something similar. After all, “No one ever got fired…” IBM spun up a separate team in Florida to put together something they could sell IT departments. Pressed for time, the Florida team put together a minicomputer using mostly off-the-shelf components; IBM’s RISC processors and the OS they had under development were technically superior, but Intel had a CISC processor for sale immediately, and a new company called Microsoft said their OS — DOS — could be ready in six months. For the sake of expediency, IBM decided to go with Intel and Microsoft. IBM was, in the end, just a hardware maker; they couldn’t be bothered to make the software.
Microsoft: Software fell to Microsoft. Continuing from that 2013 Article: The rest, as they say, is history. The demand from corporations for IBM PCs was overwhelming, and DOS — and applications written for it — became entrenched. By the time the Mac appeared in 1984, the die had long since been cast. Ultimately, it would take Microsoft a decade to approach the Mac’s ease-of-use, but Windows’ DOS underpinnings and associated application library meant the Microsoft position was secure regardless. For decades after the fact, conventional wisdom was that Microsoft’s modular approach — the one that let me build my own computers — was unquestionably superior to Apple’s integration of hardware and software.
In fact, it was Apple’s integration that kept the company afloat: all of those Macs used for desktop publishing were expensive, and gave Apple enough revenue to (barely) stay in business; the company’s brief foray into licensing Macintosh OS was a major contributor to the company nearly going bankrupt. Or, to put it another way, Apple only briefly competed with Microsoft, and it nearly killed them.
Consumer Electronics Companies: It’s difficult to choose a company to represent the iPod era, because Apple didn’t really face any meaningful competition. There was Sony and the Discman, and Diamond and Creative with some of the first MP3 players, but the reality is that no one had the combination of hardware and software that made the iPod special; in this case, the software was iTunes, and putting iTunes on Windows is what propelled Apple far beyond the Macintosh, and laid the groundwork for what came next.
RIM, Palm, and Nokia: It was early smartphone makers who were, in the framing I am taking in this Article, the only true competition Apple has ever had. All three of these companies integrated hardware and software, which makes sense given that the smartphone category was so nascent — that’s when integration is particularly important. The iPhone, however, was different in one important regard: RIM, Palm (which also sold phones with Microsoft’s Windows Mobile), and Nokia first and foremost made phones; the iPhone was a full-blown computer, built on a foundation of OS X. That, combined with the iPhone’s innovative multi-touch input method, resulted in a vastly more capable and compelling device that wiped out all three companies.
Android: Android is, in many respects, the Windows to Apple’s iOS — which was why many commentators predicted that Apple was doomed. One critical difference, however, is in the Article I excerpted above: whereas DOS came before the Mac, the iPhone came before Android.
That meant that Apple had a critical mass of users and developers first, in contrast to the 1980s. Another difference is that the iPhone sold to end users, not IT departments, who actually cared about the look and feel of the device they were spending their money on. A third difference is that Apple had (and continues to have) the performance advantage, thanks to their investment in their own silicon, a stark difference from the dead end the company found itself in with the Mac. Android is, of course, a big success, with more unit market share worldwide (although the iPhone has majority share in the U.S.). There is a place for modularity, and companies like Samsung have done well to build high-end Android-powered devices, with a host of Chinese companies in particular filling in the lower-end. And, it should be noted, Google makes its own Pixel phones as well; that is true competition, albeit one that barely registers given Google’s commitment to the entire Android ecosystem (so there are few, if any, Pixel-exclusive features, at least not for long), and Apple’s grip on the high-end of the market.
Perhaps Apple’s most interesting new product is one that takes the company full circle. The MacBook Neo is the cheapest Mac laptop ever, and has the company poised for major gains in the low-end of the market. Notably, in defiance of the assumption that modular offerings take share by being cheaper and “good enough”, Apple, by making everything from operating system to device to chip, is selling a computer that is both higher quality and has higher performance with lower component costs than the alternatives in its class; and, now that there is no more software lock-in — the Neo runs a browser and an AI chat client just like Windows machines do — Apple is poised to make major gains in its oldest market. More generally, Apple’s market share in all of its markets, including the phone, continues to increase over time, not decrease.
This is happening despite the fact that Apple is not investing at a meaningful level — at least compared to its Big Tech peers — in AI server capacity, and has yet to ship the new AI-empowered Siri it promised nearly two years ago. The reason it doesn’t matter is that no matter how powerful AI becomes, you still need to access it with a device, and Apple, thanks to its integration of hardware and software, makes the best devices. Now, according to Bloomberg, Apple is planning to leverage its position with end users to give access to multiple AI providers: Apple Inc. plans to open Siri to outside artificial intelligence assistants, a major move aimed at bolstering the iPhone as an AI platform. The company is preparing to make the change as part of a Siri overhaul in its upcoming iOS 27 operating system update, according to people with knowledge of the matter. The assistant can already tap into ChatGPT through a partnership with OpenAI, but Apple will now allow competing services to do the same… The company is developing new tools to allow AI chatbot apps installed via the App Store to integrate with the Siri assistant, said the people, who asked not to be identified because the plans haven’t been announced. The chatbots will also work with an upcoming Siri app and other features in the Apple Intelligence platform. That means, for instance, if users have Alphabet Inc.’s Google Gemini or Anthropic PBC’s Claude installed, they’d be able to send queries to those services from within the Siri voice assistant, just like they have been able to with ChatGPT since Apple Intelligence launched in 2024. The approach also should allow Apple to generate more money from third-party AI subscriptions through the App Store.
This isn’t quite Safari search, wherein Apple earns a revenue share from Google for searches made through the iPhone’s built-in browser, but given that AI assistants are largely monetized through subscriptions, it’s not far off: Apple will happily sell subscriptions through the App Store and take 30% of the price for the first year, and 15% after that. Owning the device means Apple gets to aggregate AI (and the company is already making $1 billion a year from chatbot subscriptions). This is exactly what I expected after Apple announced that initial partnership with OpenAI; from a 2024 Update: Apple, probably more than any other company, deeply understands its position in the value chains in which it operates, and brings that position to bear to get other companies to serve its interests on its terms; we see it with developers, we see it with carriers, we see it with music labels, and now I think we see it with AI. Apple — assuming it delivers on what it showed with Apple Intelligence — is promising to deliver features only it can deliver, and in the process lock in its ability to compel partners to invest heavily in features it has no interest in developing but wants to make available to Apple’s users on Apple’s terms. The company that owns the point of integration in the value chain never wants to have an exclusive supplier; it wants to commoditize its complements, which means creating a modular interface for multiple companies to compete on the integrator’s terms, which is exactly what these AI extensions for App Store apps sound like. Of course there still is the matter of getting Apple Intelligence to work; this upcoming feature is separate from Apple’s deal with Gemini for foundation models for Siri.
I explained the distinction in this Update, and concluded: The big problem with this vision is that it assumed that Apple Intelligence would be competent, and it simply wasn’t; just as the iPhone search deal wouldn’t be worth much if the iPhone sucked, Siri chatbot integration isn’t worth much if Siri sucks. Now, however, Google is selling the underlying model to make Siri good, and their biggest hope is that they can pay Apple all of their money back — and more! — to have a money-making Gemini sit on top. Apple will let the users decide who is on top; I’m sure the company would also be amenable to being paid to be the default! Many people are taking a victory lap about Apple’s decision to not compete in AI models, claiming that the company is winning by not trying; I previously linked to Horace Dediu’s The most brilliant move in corporate history?, but it’s a good articulation of the argument: The hyperscalers are now spending 94% of their operating cash flows on AI infrastructure. Amazon is projected to go negative free cash flow this year with as much as $28 billion in the red. Alphabet’s free cash flow is expected to collapse 90% from $73 billion to $8 billion. These companies used to be the greatest cash machines ever built. Now they’re borrowing money to keep the data center lights on… And what are they getting for that $650 billion? AI services generate roughly $35 billion in total revenue or 5% of what’s being spent on infrastructure. There are dreams of more of course, but the business models of AI have yet to resonate, especially for consumers… Apple didn’t miss the AI revolution. It just bet that the winners won’t be the ones who build the infrastructure. They’ll be the ones who own the customer and no one else on Earth owns the best customers. Apple owns the best customers because it makes the best devices, thanks to its integration of hardware and software. And, as I recounted above, it is somehow, fifty years on, the only company of its kind.
There is, however, an emerging threat that Apple is seeking to head off. Again from Bloomberg: Apple Inc. awarded rare bonuses to iPhone hardware designers this week, aiming to stem a wave of departures to AI startups like OpenAI that are building their own devices. The company granted out-of-cycle bonuses worth several hundred thousand dollars to many members of its iPhone Product Design team, according to people with knowledge of the matter. Apple’s leadership has grown increasingly concerned about the number of engineers being poached by potential rivals. OpenAI, which has tapped former Apple design chief Jony Ive to help design a new generation of AI-centric products, has emerged as a particular threat…OpenAI’s hardware division is run in part by Apple veteran Tang Tan. He used to oversee the iPhone product design team that’s receiving the bonuses. Tan’s group at OpenAI has hired several dozen Apple engineers, and not just ones who worked on the iPhone. The startup has lured employees who helped develop the iPad, Apple Watch and Vision Pro. OpenAI isn’t just hiring designers; the company is also building out operations capabilities to be able to actually make the upcoming Ive-designed device at scale (presumably in China). Still, many are wondering about the status of OpenAI’s hardware device given the news about Sora; from the Wall Street Journal: OpenAI is planning to pull the plug on its Sora video platform, a product it released to great fanfare last year that has since fallen from public view. The move is one of a number of steps OpenAI is taking to refocus on business and coding functions ahead of a potential initial public offering as soon as the fourth quarter of this year. CEO Sam Altman announced the changes to staff on Tuesday, writing that the company would wind down products that use its video models.
In addition to the consumer app, OpenAI is also discontinuing a version of Sora for developers and won’t support video functionality inside ChatGPT, either. OpenAI is in the middle of a strategy shift to redirect the company’s computing resources and top talent toward so-called productivity tools that can be used by both enterprises and individual users. Last week, OpenAI announced that it was combining its ChatGPT desktop app, coding tool Codex and browser into one “superapp.” The company expects the consolidated product to align its employees around a single vision. In fact, cutting Sora but keeping the hardware initiative fits this strategy shift: Sora, along with the also indefinitely delayed adult mode, were products that drove attention, which lends itself to the more traditional consumer business model of advertising. Productivity, on the other hand, is a much better fit for enterprise, where Anthropic is making major gains. The problem, however, is that most consumers aren’t willing to pay for software; what they are willing to pay for are devices. This was the secret of the iPhone; from 2016’s Everything as a Service: Apple has arguably perfected the manufacturing model: most of the company’s corporate employees are employed in California in the design and marketing of iconic devices that are created in Chinese factories built and run to Apple’s exacting standards (including a substantial number of employees on site), and then transported all over the world to consumers eager for best-in-class smartphones, tablets, computers, and smartwatches. What makes this model so effective — and so profitable — is that Apple has differentiated its otherwise commoditizable hardware with software. Software is a completely new type of good in that it is both infinitely differentiable yet infinitely copyable; this means that any piece of software is both completely unique yet has unlimited supply, leading to a theoretical price of $0.
However, by combining the differentiable qualities of software with hardware that requires real assets and commodities to manufacture, Apple is able to charge an incredible premium for its products. OpenAI is approaching this space from the opposite direction: it has a massive consumer user base for ChatGPT and an impressively large number of subscribers; it is also adding advertising. However, to truly monetize consumers, the most attractive business model is the Apple model: integrated hardware and software. The truth is that Apple’s lack of investment in AI was always going to be a short-to-medium-term win: the company doesn’t have to spend on infrastructure, and everyone still needs a device. The real threat is in the long term: what happens if AI becomes so good that it obviates traditional user interfaces? Or, to put it another way, what if the point of integration that is most compelling is not a traditional operating system and hardware device, but rather AI and a dedicated device? If this threat materializes, it won’t be with OpenAI’s initial offering; the smartphone is the ultimate form factor, and does so many jobs that depend on its flexibility and capability and 3rd-party ecosystem that no new entrant could hope to compete (indeed, Google and Android are arguably a bigger threat for this reason). However, just how capable might AI be not just next year, but in five years, or ten years? If a better interaction paradigm were ever to succeed the smartphone, it would surely be rooted in AI — and Apple, by giving up now, won’t be in the game. This absolutely is not a prediction. Indeed, if I had to bet, I would bet on Apple keeping its place: First, there is the likelihood that the smartphone, thanks to its screen, connectivity, and battery life, is in fact the best device for AI, and that furthermore, AI will be just one capability alongside everything a smartphone already does. Second, to the extent that AI inference moves to the edge, Apple has a big advantage thanks to its industry-leading chips. Third, Apple always has the option of opening up its devices to allow for much deeper integration with 3rd-party AI providers other than OpenAI, in order to effectively fight off a potential threat. It’s also worth noting that OpenAI has, in its relatively short life, managed to frame itself as a competitor to basically everyone in tech, from Google to Meta to Microsoft, only to find itself forced to pivot in the face of Anthropic and its focused approach on coding and productivity in the enterprise. The audacity of taking on everyone is impressive; the effectiveness of fighting everyone for everything may be less so. Still, there is an angle here for OpenAI, and a point of vulnerability for Apple. The company made it fifty years with no one truly competing with its integrated business model; the fate of its next fifty years may rest on the question of just how compelling AI ends up being — and if OpenAI can out-Apple the original.
Welcome back to This Week in Stratechery! As a reminder, each week, every Friday, we’re sending out this overview of content in the Stratechery bundle; highlighted links are free for everyone. Additionally, you have complete control over what we send to you. If you don’t want to receive This Week in Stratechery emails (there is no podcast), please uncheck the box in your delivery settings. On that note, here were a few of our favorites this week. This week’s Stratechery video is on Agents Over Bubbles. R.I.P. Sora, 2025-2026. AI Sam came, AI Sam saw, and AI Sam stole those GPUs. We’ll always have the memories. Unfortunately, it turns out that Sam would rather have the GPUs, so on Sharp Tech this week, Ben and I eulogized the app that took over the world for about two weeks last year. That included thoughts on copyright battles that may have sealed its fate, why Ben’s reluctant to be too critical, and more signs that OpenAI is serious about its enterprise pivot. Come for that conversation, and then stay for a rollicking spring mailbag that includes a great take on search advertising, F1 venting, the Vision Pro and my wife, kids and phones, and more. — Andrew Sharp The 2026 Bullseye List. The NBA Playoffs are only a few weeks away, which means Ben Golliver and I are already in preparation mode, including a delightful episode today running through a “Bullseye List” of superstars who will be under pressure this spring. We discuss everyone from Kevin Durant and Alperen Sengun to Jalen Brunson, Chet Holmgren, and Victor Wembanyama, a debatable inclusion, but undeniably the most magnetic star in the league right now. And yes, given my Luka takes in January, and Luka looking incredible throughout March, I did take accountability and add myself to the bullseye list. — AS Arm’s Big Shift.
If you wanted more evidence that AI is changing everything, look no further than Arm: the company was famous for its high-margin IP-licensing business model, but this week announced that instead of (just) facilitating other companies making chips, it would start making and selling chips itself. Naturally, its first offering is explicitly focused on AI data centers. I explained Arm’s motivations in Wednesday’s Update, and interviewed Arm CEO Rene Haas to get his point of view on Thursday. — Ben Thompson Arm Launches Own CPU, Arm’s Motivation, Constraints and Systems — Arm is selling its own chips, not just licensing IP. It’s a big change compared to Arm’s history, but not surprising given how computing is evolving. An Interview with Arm CEO Rene Haas About Selling Chips — An interview with Arm CEO Rene Haas about the company’s decision to not just license IP but make their own chips. Tilting at Windmills — As the Iran war continues, let’s take a look at the Democratic Party, institutional media, and offshore wind farms. John Ternus and Responsible Individuals Sora and Mac Pro Dead Singapore’s Sound Card Hero A Giant Mess with Super Micro; Completely Correct Xiong’an Progress; The PRC’s Balancing Act on Iran; Manus, Apple and Router News The Intrigue(?) in the East, Peterson and Acuff On Center Stage, Revisiting Draft Kevin Durant The BULLSEYE List in 2026: Playoff Questions for Ant, Chet, Tatum, Mitchell, Wemby, and Beyond A Spring Break Mailbag: RIP Sora, Ads and Surplus, F1 Going in Reverse, Elon Inc., Smartphone Parenting, and More
Listen to this post: Good morning, This week’s Stratechery Interview is with Arm CEO Rene Haas, who I previously spoke to in January 2024, and who recently made a major announcement at Arm’s first-ever standalone keynote: the long-time IP-licensing company is undergoing a dramatic shift in its business model and selling its own chips for the first time. We dive deep into that decision in this interview, including the meta of the keynote, Arm’s history, and how the company has evolved, particularly under Haas’ leadership. Then we get into why CPUs matter for AI, and how Arm’s CPU compares to Nvidia’s, x86, and other custom Arm silicon. At the end we discuss the risks Arm faces, including a maxed-out supply chain, and how the company will need to change to support this new direction. As a reminder, all Stratechery content, including interviews, is available as a podcast; click the link at the top of this email to add Stratechery to your podcast player. On to the Interview: This interview is lightly edited for clarity. Rene Haas, welcome back to Stratechery. RH: Ben Thompson, thank you. Well, you used to be someone special, I think you were the only CEO I talked to who did nothing other than license IP, now you’re just another fabless chip guy like [Nvidia CEO] Jensen [Huang] or [Qualcomm CEO] Cristiano [Amon]. RH: (laugh) Yeah, you can put me in that category, I guess. Well, the reason to talk this week is about the momentous announcements you made at the Arm Everywhere keynote — you will be selling your own chip. But before I get to the chip, I’m kind of interested in the meta of the keynote itself: is this Arm Everywhere concept new, as far as being a keynote? Why have your own event? RH: You know, we were talking a little bit about this going into the day. I don’t think we’ve ever as a company done anything like this.
Yeah, I didn’t think so either, I was trying to verify just to make sure my memory was correct, but yes, it’s usually at Computex or something like that. RH: Our product launches have usually been lower key, we usually tie them to OEM products that are using our IP, that use our partners’ chips, but we just felt like this was such a momentous day for the company, a very different day for the company, that we wanted to do something very, very unique. So it was very intentional; we were chatting about it prior, I don’t think we’ve done anything like it before. Who was the customer for the keynote specifically? Because you’re making a chip — Meta is your first customer, they knew about this, they don’t need to be told — what was the motivation here? Who are you targeting? RH: When you prepare for these things, that’s one of the first questions you ask yourself, “Who is this for?”, “Is it for the ecosystem?”, “Is it for customers?”, “Is it for investors?”, “Is it for employees?”, and I think under the umbrella of Arm Everywhere, the answer to those questions was “Yes”, everybody. We felt we needed to, because a lot of questions come up on this, right, Ben, in terms of, “What are we doing?”, “Why are we doing it?”, “What’s this all about?”, and the answer to that question was “Yes”, it was for everyone. One more question: Why the name “Arm Everywhere”? RH: We were trying to come up with something that was going to thematically remind people a bit about who Arm was and what we are and what we encompass, but not actually tease out that we were going to be announcing something. Right, you can’t say “Arm’s New Chip Event”. RH: (laughing) Yes, exactly, “Come to the new product launch that we’ve not yet announced”. So we just decided that that would be enough of a teaser to get people interested. Just to note, you said, “What Arm was”, what was Arm? You used the past tense there.
RH: Yeah, and I will say, we are still doing IP licensing, you can still buy CSSs [Compute Subsystem Platforms], so we are still offering all of the products we did before that day, plus chips, so I’m not yet just another chip CEO, I think I’m still very different from the other folks you talked to. Actually, back up, give me the whole Rene Haas version of the history of Arm. RH: Oh, my goodness gracious. The company was born out of a joint venture way back in the day between Acorn Computer and then ultimately Apple and VLSI to design a low-power CPU to power PDAs. The thing that was kind of important was, “I need something that is going to run in a plastic package” — you may remember back then just about everything was in ceramic — “I can’t melt the PDA, and oh, by the way, this thing’s got to run off a battery”. So they chose a RISC architecture, and that’s where the ARM ISA [instruction set architecture] was born and that’s what the first chip was intended to do, and the thing wasn’t very successful. Fast forward, however: the founders, and then a very, very important guy in Arm’s history, Robin Saxby, put out a goal to make the ARM ISA the global standard for CPUs. And if you go back to the early 1990s, there were a lot of CPUs out there, and also there was not an IP business, there really wasn’t a very good fabless semiconductor model, and there was not a very good set of tools to develop SoCs [system on a chip]. So in some ways, and this is what I love about the company, it was a bit of a crazy idea because you didn’t really have all the things in place necessary to go off and do that. But back then, there were a lot of companies designing their own CPUs, if you will, and the idea there being that ultimately this would be something that customers could be able to access, acquire, and build, and then ultimately build a standard upon it.
What was ultimately the killer design win for the company, and I know you’re a strategist and historian as well around this area, is the classic accidental example: TI was developing the baseband modem for an applications processor for the Nokia GSM phone and they needed a microcontroller, something to kind of manage the overall process, and they stumbled across what we were doing, and we licensed them the IP. That was kind of the first killer license that got the company off the ground, and that’s what really got us into mobile. People may think, “You were the heart of the smartphone and you had this premonition to design around iOS” or, “You worked really closely in the early days of Android”; it was accidental, we found our way into the Nokia phone, GSM phone, Symbian gets ported to ARM, and then there starts to be at least enough of a buzz around nascent software, but that’s how the company was born. I did enjoy that for the keynote, you had a bunch of different Arm devices in the run-up running on the screen, and my heart did do a little pitter-patter when the Nokia phones popped on. Another day, to be sure. RH: Yeah, cool stuff, right? But that’s kind of how the company got off the ground, and it was a general-purpose CPU, which meant we didn’t really have it designed for, “It’s going to be good at X”, or, “It’s going to be good at Y, it’s going to be good at Z”; it turned out that because it was low power, it was pretty good to run in a mobile application. I think the historic design win where the company took off was obviously the iPhone, and the precursor to the iPhone was the iPod, which was using a chipset from PortalPlayer that used the ARM7, and the Mac OS was all x86, and then inside the company, it was Tony Fadell’s team arguing, “Let’s use this PortalPlayer architecture”, versus, “Do we go with Intel’s x86 and a derivative, Atom”, back in the day, and once a decision was made that “We’re going to port to ARM for iOS”, that’s where the tailwind took off.
So is it definitely making up too much history to go back and say, “The reason Arm was a joint venture to start is because people knew you needed to have an ecosystem and not be owned by any one company”, or whatever it might be; that’s being too cute about things; the reality is it was just stumbling around, barely surviving, and it just fell backwards into this? RH: Which, by the way, is kind of how the formula works for every good startup that’s really been successful. You stumble around in the dark, you find something you’re good at, and then you engage with a customer and you find what ultimately is sticky, and that’s really what happened with Arm. When you consider the changes that you’ve made at Arm, and I want to get your description of the changes that you’ve made, but how many of the challenges that you faced were based on legitimate market fears about, “We’re going to alienate customers” or whatever it might be, versus maybe more cultural values like, “We serve everyone”, versus almost a fear like, “This is just the market we’ve got, let’s hold on to it”? RH: I think, Ben, we thought about it much more broadly. When I took over, and you and I met not long after that, there were a couple of things happening in the market in terms of a need to develop SoCs faster, a need to get to market more quickly, and we knew intuitively that no one knew how to combine 128 Arm cores together with a mesh network and have it perform better than we could, because that’s what we had to do to go off and verify the cores. So we knew that doing compute subsystems really mattered, but I came from a bit of a different belief: if you own the ISA at the end of the day, you are the platform, you are the compute platform, and it is incumbent upon you to think about how to have a closer connection between the hardware and the software, that is just table stakes.
I don’t think it’s anything new, if you think about what Steve Jobs thought about with Apple and everything we’ve seen with Microsoft, with Wintel. I felt with Arm, particularly not long after I started, in 2023 and 2024, this was only getting accelerated with AI. Because with AI, the models and innovation were moving way, way faster than the hardware could possibly keep up. I just felt for the company in the long term that this was a direction that we had to strongly consider, because if you are the ISA and you are the platform, the chip is not the product, the system is. That’s the thing that I was sort of driving at when I was writing about your launch. There’s an aspect where you’ve made these big changes: you were originally just the ISA, then you were doing your own cores, not selling them, but basically designing the cores, then you moved to these system-on-a-chip designs, and now you’re selling your own chips. But it feels like your portion of the overall, “What is a computer?”, has stayed fairly stable, actually, because, “What is a computer?”, is just becoming dramatically more expansive. RH: I think that’s exactly right. Again, if you are a curator of the architecture and you are an owner of the ISA, as good as the performance-per-watt is, as interesting as the microarchitecture is, as cool as it is in terms of how you do branch prediction, the software ecosystem determines your destiny. And the software ecosystem for anyone building a platform needs to have a much closer relationship between hardware and software, simply in terms of just how fast can you bring features to market, how fast can you accelerate the ecosystem, and how can you move with the direction of travel in terms of how things are evolving.
You mentioned the big turning point or biggest design win was the iPhone way back in the day, and the way I’ve thought about Arm versus x86: you could make the case that ARM/RISC has been theoretically more efficient than CISC, and I’ve talked to Pat Gelsinger about how there was a big debate in Intel way back in the 80s about whether to switch from CISC to RISC, and he was on the side of, and won, the argument that by the time we port everything to RISC we could have just built a faster CISC chip that is going to make up all the difference, and that carried the day for a very long time. However, mobile required a total restart, you had to rebuild everything from scratch to deliver the power efficiency, and I guess the question is, you’ve had a similar dynamic for a long time about Arm in the data center, theoretically it’s better, you care about power efficiency, etc., is there something now — is this an iPhone-type moment where there’s actually an opportunity for a total reset to get all the software rewritten that needs to be done? Or have companies like Amazon and Qualcomm, with whatever efforts they’ve done, paved the ground so that it’s not so stark of a change? RH: It’s a combination of both. One of the big advantages we got with Amazon doing Graviton in 2019, and then subsequently the designs we had with Google, with Axion, and Microsoft with Cobalt, is it just really accelerated everything going on with cloud-native, and anything that moves to cloud-native has kind of started with ARM. What do you mean by cloud native? RH: Cloud-native meaning these are applications that are starting from scratch to be ported to ARM. Built on a Linux distro, but not having to carry anything about running super old legacy software or running COBOL or something of that nature on-prem, so that was a huge benefit for us in terms of the go-forward.
Certainly we got a huge injection of growth when Nvidia went from the generation before Hopper (which I think was Volta or Pascal, I may be mixing up their versions), which was an x86 connect, to Grace. So when they went to Grace Hopper, then Grace Blackwell, and now Vera, the AI stack for the head node now starts to look like ARM, and that helps a lot in terms of how the data center is organized, so we certainly got a benefit with that. I think for us, the penny-drop moment was, and it’s probably the 2018-19 timeframe, when Red Hat had production Linux distros for ARM, and that really also accelerated things in terms of the open source community, the uploads and things that made things a lot, a lot easier from the software standpoint. Give me the timeline of this chip. When did you make the decision to build this chip? You can tell me now, when did this start? RH: You know, it started with a CSS, right? And we were talking to Meta about the CSS implementation. Right. And just for listeners, CSS is where you’re basically delivering the design for a whole system on a chip sort of thing. RH: Compute subsystem, yeah, so it’s the whole system on a chip. And by the way, it’s probably 95% of the IP that sits on a chip. What doesn’t it include? It doesn’t include the I/O, the PCIe controllers, the memory controllers, but it’s most of the IP. And this is what undergirds — is Cobalt really the first real shipping CSS chip? Or does Graviton fall under this as well? RH: Cobalt’s probably the first incarnation of using that, so Meta was looking at using that, and I think the discussions were taking place in the 2025 timeframe, mid-2025 timeframe. Here’s the key thing, Ben: not that long ago. Right. Well, that was my sense, it was not that long ago, so I’m glad to hear that confirmed. RH: Not that long ago.
Because CSS takes you a lot of the way there. So that discussion, around the 2025 timeframe, where we were going back and forth on, “Are you licensing CSS”, versus, “Could you build something for us?”, and we had been musing about, “Was this the right thing for us to do from a strategy standpoint?”, and how we thought about it, but ultimately it came down to Meta saying, “We really want you to do this for us, we think this is going to be the best way to accelerate time to market and give us a chip that’s performant and in the schedule that we need”, so somewhere in the 2025-ish timeframe, we agreed that, yes, we’ll do this for you. Why did Meta want you to do it instead of them finishing it off themselves? RH: I think they just did the ROI, in terms of, “I’ve got a lot of people working on things like MTIA, I’ve got a whole bunch of different projects internally, is it better that you do it versus we do it”? “How much can we actually differentiate a CPU”? RH: Yeah, and by the way, that is ultimately what it comes down to at some point in time, and the fact that the first one that came back works, it’s going to be able to go into production, and it’s ready to go. I’m not going to say they were shocked, but we kind of knew that was going to happen because we knew how to do this stuff and the products were highly performant and tested in the CSS, so it happened fast is the short answer. So if we talk about Arm crossing the Rubicon, was it actually not when you started selling this chip, but when you did CSS? RH: One could say that that was a big step. When we started talking about doing CSSs, let me step back, we made a decision to do CSSs— Explain CSSs and that decision, because I think that’s actually quite interesting. RH: What is a CSS? It’s a compute subsystem, it takes all of the blocks of IP that we sold individually and puts them together in a fully configured, verified, performant deliverable that we can just hand to the customer and they can go off and complete the SoC.
Some customers have told us it saves a year, some say a year-and-a-half and this is really around the test and verification in terms of the flow. One of the examples I gave, it’s a little cheeky, but it kind of worked during the road show, was when we were trying to explain to investors, “What’s IP, what’s a CSS?”, I said, go to the Lego store, and you’ve got a bin of Legos, yellow Legos, red Legos, blue Legos, trying to buy all those Legos and building the Statue of Liberty is a pain, or you can go over to the boxes where it’s the Statue of Liberty and just put those pieces together, and the Statue of Liberty is going to look beautiful. This is what the CSS was. I just want to jump in on that, because I was actually thinking about this, the Lego block concept is a common one that’s used when talking about semiconductors, but I remember being back in business school, and this was 2010, somewhere around then, and one of the case studies that we did was actually Lego, and the case study was the thought process of Lego deciding whether or not to pursue IP licensing as opposed to sticking with their traditional model, and all these trade-offs about, “We’re going to change our market”, “We’re going to lose what Lego is”, the creativity aspect, “It’s going to become these set pieces”. I just thought about that in this context where I came down very firmly on the side of, “Of course they should do this IP licensing”, but it was almost the counter was this sort of traditionalist argument which is kind of true — Legos today are kind of like toys for adults to a certain extent, and you build it once, reading directions and you think back to when I was a kid and you had all the Legos and it was just your creativity and your imagination and I’m like, “Maybe this analogy with Arm is actually more apt than it seems”. 
There’s a very romantic notion of IP licensing, you go out and make new things, “We got this for you”, versus, “No, we’re just giving you the whole chip”, or in the case of CSS, to your point, you could go get the Statue of Liberty and not even bother building it yourself. RH: And I think I came across this in the early days. In the 1990s, I was working with ASIC design at Compaq Computer, and they were doing all their ASICs for Northbridge, Southbridge, VGA controllers, and this is when the whole chipset industry took off. And I remember one of the senior guys at Compaq explaining why they were doing this; he said, “I’m all about differentiation, but there needs to be a difference”. And to some extent, that’s a little bit of this, right? You can spend all the time building it, but if it’s all built and you spent all this time and it’s not functionally different or more performant, well, if you’re playing around with Legos and you’ve got all day, that’s fine — but if you’re running a business and you’re trying to get products out quickly, then time is everything, and that’s really what CSS did. It kind of established to folks that, “My gosh, I can save a lot of time on the work I was doing that was not highly differentiated”, and in fact, in some cases, it was undifferentiated, because we could get to a solution faster in such a way that it was much more performant than what folks might be trying to get to in the last mile. So when we started talking about this to investors back in 2023 during the roadshow, their first question was, “Aren’t you going to be competing with your customers?”, and, “Isn’t this what your customers do?”, and, “Aren’t they going to be annoyed by it?”, and my answer was, “If it provides them benefit, they’ll buy it, if it does not present a benefit, they won’t buy it”, that’s it.
And what we found is a lot of people are taking it, even in mobile, where what we were told was, “No, no, these are the black belts and they’re going to grind out the last mile and you can’t really add a lot of value” — we’ve done a bunch in the mobile space, too. So with Meta, was the deal like, “Okay, we’ll do the whole thing for you, but then we get to sell it to everyone?”, and they’re like, “That’s fine, we don’t care, it doesn’t matter”? RH: Yes, exactly. We said, “If we’re going to do this, how do you feel about us selling it to other customers?”, and they said, “We’re fine with that”. When did you realize that the CPU was going to be critical to AI? RH: Oh, I think we always thought it was. I had a cheeky little slide in the keynote about the demise of the CPU, and I had to spend a lot of time on it. I mean, I don’t know, I might have talked to someone recently who I swear was pretty adamant that a lot of CPUs should be replaced with GPUs, and now they’re selling CPUs, too. RH: I had to talk to investors and media to explain to them why a CPU was even needed. They were a little bit like, “Can’t the GPU run by itself?”, like it’s a kite that doesn’t need anything to hang on to. First off, on table stakes, obviously you need it in the data center, but particularly as AI moves into smaller form factors, physical AI, edge, where you obviously have to have a CPU because you’re running display, you have I/O, you have human interface. It’s how do you add accelerated AI onto the CPU? So yeah, I think we kind of always knew it was going to be there, and there was going to be continued demand for it. Right, but there’s a difference between everyone on the edge is going to have a CPU so we can layer on some AI capabilities. It doesn’t have the power envelope or the cost structure to support a dedicated GPU, that’s fair, that’s all correct.
It’s also correct that, to your point, a GPU needs a CPU to manage its scheduling and its I/O and all those sorts of things, but what I’m asking about specifically is that we’re going to have these agentic workflows, where what the agent does is CPU tasks, and so it’s not just that we will continue to need CPUs, we might actually need astronomically more CPUs. Was that part of your thesis all along? RH: I think we have instinctively thought that to be the case. And what drives that? The sheer generation of tokens, tokens by the pound, tokens by the dump truck, if you will. The more tokens that the accelerators are generating, whether that’s done by agentic input, human input, whatever the input is, the more tokens that are generated, those tokens have to be distributed. And the distribution of those tokens, how they are managed, how they are orchestrated, how they are scheduled, that is a CPU task purely. So we kind of intuitively felt that over time, as these data centers go from hundreds of megawatts to gigawatts, you are going to need, at a minimum, CPUs that have more cores, period. There was this belief that 64 cores might be enough and maybe 128 cores would be the limit; Graviton 5 is 192 cores, the Arm AGI CPU is 136. We were already starting to see core counts go up, and we started thinking about, “What’s driving all these core counts going up, is it agentic AI?”. A proxy for it was just sheer tokens being generated in a larger fashion that needed to be distributed in a fast way, and what was layered onto that was things like Codex, where latency matters, performance matters, delivering the token at speed matters. So I think all of that was bringing us to a place where we thought, “Yeah, you know what?”, we’re seeing this core count thing really starting to go up, we were seeing that about a year ago, Ben. So am I surprised that the CPU demand is exploding the way it is? Not really.
Agentic AI, just the acceleration of how these agents have been launched, certainly is another tailwind kicker. Which happens to line up with your mid-2025 decision that, “Maybe we should sell CPUs”. RH: Yeah, it all kind of lines up. We were seeing that, you know what, we think that this is going to be a potentially really, really large market where not only does core count matter, but efficiency matters, because we could imagine a world where each one of these cores is running an agent or a hypervisor, and the number of cores can really, really matter in the system, which aligned with what we were thinking about in terms of, “Okay, we can see a path here in terms of where things are going”. So CSSs with greater than 128 cores in the implementation? Absolutely. Do I think, could I see 256? Absolutely. Could I see 512? Possibly. I think then it comes down to the memory subsystem, how you keep them fed, etc., but yeah, so short answer, about a year ago we started seeing this. Do you think that core count is going to be most important or is it going to be performance-per-core? RH: I think core count is going to be quite important because, again, I have a belief that each one of these cores will want to potentially run their own agent, launch a hypervisor job, launch a job that can be run independently — launch it, get the work done, go to sleep. The performance of the core is going to matter, no doubt about it, but I think the efficiency of that core is probably going to matter just as much as the performance is. Well, the reason I ask is because you talked a lot in this presentation about the efficiency advantage, where the company was born from a battery or whatever your phrase was, and that certainly, I think, rings true, particularly in isolation. But in a large data center, if the biggest cost is the GPUs, then isn’t it more important to keep the GPUs fed?
Which is basically to say, is a chip’s capability to feed GPUs actually more important on a systemic level than the chip’s efficiency on its own? RH: I’m going to plead the fifth and say yes to both. You’ve got to pick one! RH: Well, what’s important? I think the design choice that Nvidia made with Vera was very important. Vera is designed to feed Rubin, it has a very specific interface — NVLink Fusion, or NVLink chip-to-chip, provides a blazing fast interface — and it has the right number of cores to keep that GPU fed optimally. But at the same time, is it the right configuration in a general-purpose application where you want to run an air-cooled rack in the same data hall? Think about a data hall where you might have a liquid-cooled Vera Rubin rack sitting right next to another liquid-cooled rack, but somewhere else inside the data center, you’ve got room for multiple air-cooled racks. That space that you may have not used in the past for CPU, you want to use because of the problem statement that I just gave. So I actually think it’s a “both” world, which is why when people ask me, “Oh my gosh, aren’t you competing with Nvidia Vera, and aren’t people going to get confused?” — not particularly, I think there’s ample space for both. So you feel like Nvidia might be selling standalone Vera racks but that’s not necessarily what Vera was designed for, that’s what you’re designed for, and you think that’s where you’re going to be different. RH: Yes, and I mean, if you look at what’s been announced so far from Nvidia, they announced a giant 256-CPU liquid-cooled rack, and the first implementation that we’re doing with Meta is a much smaller air-cooled rack. So very, very different right off the get-go. But you will have a liquid-cooled option? RH: If customers want that, we can do that too. I think that differentiation makes sense. Well, speaking of differentiation, why ARM versus x86? Why is there an opportunity here? RH: Performance-per-watt, period.
Graviton sort of started it, and they’ve been very public about their 40% to 50%; Microsoft stated the same with Cobalt, Google with Axion, Nvidia has stated the same. Just on table stakes, 2x performance-per-watt is pretty undeniable. And that, I think, is where it starts as probably the primary value proposition. What is x86 still better at? You can’t say legacy software — other than legacy software. RH: Go back to the earlier part of our conversation, right? The ISA — what is the value of the ISA? It is the software that it runs, right? It is the software that it runs. So if you were to look at where x86 has a stronghold, x86 is very good at legacy on-prem software. Okay, fine, we’ll give you legacy on-prem software, and I think part of the thesis here, to your point, is that a lot of this agentic work is on Linux, it’s using containers, it’s all relatively new, it all by and large works well on ARM already, but you did have a bit in the presentation where you interviewed a guy from Meta that was about porting software. How much work still needs to be done there? RH: There’s a delta between the porting work and the optimization work. Graviton — what Amazon will tell you is that greater than 50% of their new deployments is ARM-based, and accelerating. And, yes, am I the CEO of Arm and do I have a biased opinion? Of course. But I find it hard to see — on a clean sheet design, if you were starting from scratch and the software porting was done and you had either cloud-native or the application space established, or as a head node — why you’d start with x86. What about — why are you doing ARM? We did ARM versus x86, I’m sort of working my way down the chain here — actually, I did it backwards, we stuck in Vera already — but why you versus custom silicon generally? You talked about Amazon. Why do you need to do the whole thing? RH: So let’s think about an Amazon, for example. Amazon does Graviton. Would I like Amazon to buy the Arm AGI CPU? Yes.
Am I going to be heartbroken if they never buy one? No, I’m perfectly fine if they stay building what they’re building. Are they ever going to buy one? No. RH: I hope they do! But if they don’t, it’s not going to be the end of the world. SAP — SAP runs a lot of software on Amazon, they run SAP HANA on Amazon, they also have a desire to do stuff on-prem, and if they’re doing something on-prem in a smaller space and they’re looking to leverage that work, they’d love to have something that is ARM-based. Prior to us doing this product, there was no option at all, right? So that’s a very, very good example. Similar with a Cloudflare. Is Cloudflare going to do their own implementation? Likely not. Do they run on other people’s clouds? Sure, they do. Do they have an application that could be on-prem running on ARM? Absolutely. So we think that — and I don’t want to prefetch this, Ben, but we had a lot of questions from folks like, “Amazon won’t buy from you”, “Google won’t buy from you”, “Microsoft won’t buy from you”, because you’re competing with them. And we say, well, Google builds TPUs, yet they buy a lot of Nvidia GPUs, so it’s not so binary. That’s true. They’ll buy what their customers ask them to buy. RH: 100%. And if we solve a problem with an implementation that theirs does not, they’ll buy it, and if we don’t, they won’t. Just between you and me, is the only custom silicon that is truly potentially competitive Qualcomm, and you’re just not too worried about making them mad? RH: This is off the record here? (laughing) I didn’t say off the record. RH: Qualcomm — it’s funny, I had a question at the investor conference about competing with Nvidia. And I said, you know, a month ago, no one would have asked about any Arm person competing with anybody. So it’s wonderful to have these kinds of conversations; the market is underserved and there aren’t choices.
There isn’t a product from Qualcomm, there isn’t a product from MediaTek, there isn’t a product from Infineon, there just isn’t. Is that sort of your case? If there were a bunch of options in the market, would you still be entering? RH: We entered this because Meta asked us to, and because Meta asked us to, we did. So if I was to answer your question, would we have entered if those other four or five hypothetical guys were there? I don’t know that Meta would have asked us. The Arm AGI CPU is being built on TSMC’s 3nm node, which is kind of impossible to get allocation for. How’d you get allocation? If you started this in 2025, how’d you pull that off? RH: We’re working through a back-end ASIC partner that helps secure the allocation for us. Oh, interesting. Are you concerned about that in the long run? Like this business blows up and actually you just can’t make enough chips? RH: I’m probably less worried about that at the moment than I am about memory. I think that the business — the demand is very, very high actually for the chip, Ben, and through our partner, we’re able to secure upside through TSMC, that has not been a problem. But memory is quite challenging, and I think if there’s any limit to how big this business can get — I would say that what we provided to investors as a financial forecast is based upon the capacity we’ve secured on both memory and logic, but if there was more memory, could we sell more? Yes. This is sort of the sweet spot though of making predictions, everyone gets to say, “Wow, how are your predictions so accurate?”, and it’s like, “Well, it’s because I knew exactly how much I would be able to make”. RH: Yeah, if there was more memory we’d be even more aggressive on the numbers. How did you make the memory decisions that you did in terms of memory bandwidth and all those sorts of pieces, particularly given the short timeline in which you made this?
That wasn’t necessarily part of the CSS spec before, so how were you thinking about that? RH: The things we kind of looked at were — we sort of started with LP versus standard DRAM. Because Vera’s doing LP and you decided to do standard. RH: We’re doing standard DRAM, yeah. We thought we’d be a little bit better on the cost side, which could help, and at the same time, a little bit better on the capacity side. So it really kind of drove down to, we’re going to solve for capacity, because we thought that that might matter in a more generalized application space to give the broader width of use, which then brought us to standard DDR versus LP. I think the reason we talked last time was in the context of you making a deal with Intel to get Arm working on 18A, and this was going to be a multi-generational partnership. What happened to that? Is that still around? RH: It’s still around. We did a lot of work on 18A because we felt that it was going to be really, really important, if someone wanted to build on Intel 18A, that the Arm IP was available. So we did our part relative to if someone wants to go build an ARM-based SoC on Intel process, but that unfortunately hasn’t come to pass just yet. It’s interesting you mentioned that you’re actually not worried about TSMC capacity but you are worried about memory — I didn’t fully think through that being another headwind for Intel, where they could really use TSMC having insufficient capacity to help them, but if memory is the first constraint then no one’s even getting there. RH: First off, obviously HBM [high bandwidth memory] being such a capacity hog, and then people moving from LP into HBM at the memory guys, then compounding on it, all of the explosion of the CPU demand drives up memory demand. So it all kind of adds on to itself, which makes the memory problem pretty acute. What exactly is in the bill of materials that you’re selling?
You showed racks but you mentioned a partnership with Super Micro for example — if I buy a chip from Arm, what exactly am I buying? You’ve mentioned memory obviously, so what else is in that? And what are you getting from partners? RH: Yeah, so we’ll send you a voucher code after the show, and you can place your orders. Just the SoCs. If you need to secure the memory, that’s on you, we’re not securing memory at this point in time. We did a lot of work with Super Micro, with Lenovo, with ASRock. So there’s a full 1U, 2U server blade reference architecture, so the full BOM relative to all the passives and everything you need from an interconnect standpoint is all there. There’s a full BOM, which, as we mentioned in the session — the rack itself physically complies with OCP standards, and then we’ve done all the work in terms of the reference design. So we can provide the full BOM of the reference platform, including memory, but what we are selling is only the SoC. Very nerdy question here, but how are you going to report this from an accounting perspective? Just right off the top, chips have a very different margin profile — is this going to all be broken out? How are you thinking about that? RH: We’ll probably do that. Today we break down licensing and royalty of the IP business; we’ll probably break out chips as a separate revenue stream. To go back to — you did call this event Arm Everywhere — will you ever sell a smartphone chip? RH: I don’t know, that’s a really hard question. I think we’re going to look at areas where we think we could add significant value to a market that’s underserved; that market’s pretty well served. It’s very well served, and this agentic AI, potentially a new market, fresh software stack, makes sense to me. What risks are you worried about with this? You come across as very confident, “This is very obviously what we should do” — how does this go wrong?
RH: Most of my career has been spent actually in companies that have chips as their end business as opposed to IP. I’ve been at Arm 12 years, 13 years; I’ve been the CEO for about four-and-a-half. I did a few years — actually five — at a company called Tensilica, but most of my career was either NEC Semiconductor, Texas Instruments, or Nvidia. The chip business is not easy, right? You introduce a whole different new set of characteristics. You have to introduce this term called “inventory” to your company. RH: RMAs, inventory, customer field failures, just a whole cadre of things that’s very new for our company; there certainly is execution risk that we’ve added that has not existed before. We had a 35-year machine being built that is incredibly good at delivering world-class IP to customers — doing chips is a whole different deal. I don’t want to minimize that, but at the same time, I don’t want to communicate that that’s something that we haven’t thought about deeply over the years, and we’ve got a lot of people who have done that work inside the company. A lot of my senior executive team is ex-Broadcom, ex-Marvell, ex-Nvidia; we’ve got a lot of people inside the engineering organization who have come from that world, and we’ve built up an operations team to go off and support that. So while there is risk, we’ve been taking a lot of steps inside the company to add the resources. We’ve been increasing our OpEx quite a bit in the quarters leading up to this, about 25% year-on-year; investors were asking a ton of questions about, “When are we going to see why you’re adding all those people?”, and Arm Everywhere explained that. We also told investors that that’s now going to taper off because we’ve got, we think, what we need to go off and execute on all this. But I think that’s the biggest thing, Ben. And the upside is just absolute revenue dollars, I guess absolute profit dollars.
RH: I think there’s a financial upside, certainly, in terms of financial dollars. But I think back to the platform — I think by being closer to the hardware and the software and the systems, we can develop even better products around IP, CSS, etc., because I think when you are the compute platform, it is incumbent upon you to have as close a relationship as you can with the software that’s developed on your platform. What’s the state of the business in China these days, by the way? RH: China still represents probably 15% of our revenue, we still have a joint venture in China, and the majority of our business is royalties; royalties is much bigger than licensing in China. We still have a lot of design wins coming in the mobile space for people doing their own SoCs, like a Xiaomi. The hyperscaler market is strong between Alibaba, ByteDance, Tencent, and then most of the robotics and EV guys are doing stuff based on ARM, whether it’s XPeng, BYD, Horizon Robotics. So our business is pretty healthy in China. You do have the Immortalis and Mali GPUs. Are those good at AI? RH: Yes, they can be very good. We’ve added a lot of things to our GPUs around what we call neural graphics — this is adding essentially a convolution and vector engine that can help with AI. Right now the focus has been really more around AI in a graphics application, whether it’s around things like DLSS and other areas, but we’ve got a lot of ingredients in those GPUs. So we should stay tuned, sounds very interesting. You did have one moment in the presentation that was a little weird, where you were trying to say that this AI thing is definitely a real thing, but you’re like, “Well, it might be a financial bubble, but the AI is real”. You are making a play for a piece of all this money that is going into this, but is there some consternation in that regard?
RH: No, what I was trying to indicate was that when people talk about bubbles, typically it’s either valuation bubbles or investment bubbles. The valuation bubbles, those come and go over time. The investment bubble, I’m not as worried about in the sense of, “Is there going to be real ROI on the investment being made?”. I actually worry more about the, “Can you get all the stuff required to build out all of the scale?” — we just talked about memory, there’s TSMC capacity. I think the memory will be solved; they will ultimately not be able to help themselves, they will build more capacity. I’m worried about leading edge. TSMC will help themselves if they don’t have any challengers. RH: Turbines, right? You’ve got companies like GE Vernova or Mitsubishi — building factories well ahead of demand to go serve an extra 5 to 10 gigawatts of power is not their world. So I think TSMC is super disciplined, and they’ve been world class at that throughout their history. Will the memory guys be able to help themselves? The numbers are now so large that even at the SanDisks of the world and storage, everything has kind of gotten bananas, and that is a concern: if just one of those key components of the supply chain blinks and decides not to invest to provide the capacity, then things kind of slow down. But the numbers, Ben, the numbers we’re talking about are numbers we’ve never seen before. $200 billion CapEx from an Amazon or $200 billion CapEx from a Google. And then you have companies like Anthropic talking about $6 billion revenue increases over a three-to-four month period, which are the size of some software companies. So we are in some very stratospheric levels in terms of spend. Would I be surprised if there was a pause in something just as people calibrate? Yeah, I wouldn’t be surprised at all. But if I think about the 5 to 10-year trajectory, there’s no way you can say this is a bubble.
If you said, “I think machines that can think as well as humans and make us more productive, that’s kind of a fad” — I don’t actually think that’s going to happen, it’s almost nonsensical. Just to sort of go full circle, you’ve been on the edge, and now this new product that gets the Arm Everywhere moniker is about being in the data center — is the edge dead? Or if not dead, are we in a fundamental shift where the most important compute is going to be in data centers, or is there a bit where AI is real but it actually does leave the data center and go to the edge, and that’s a bigger challenge? RH: I think until something is invented that is different than the transformer, and we talk about some very different model as to how AI is trained and inferred, then we’re looking at a lot of compute in the data center and some level of compute on the edge. I think if you just suspend animation for a second and we say, you know what, the transformer is it, and that’s what the world looks like for the next 5 to 10 years, the edge is not going to be dead. The edge is going to have to run some level of native compute for whatever the thing has to do, and it’s going to run some AI acceleration, of course. But is everything going to happen in your pocket? No. I mean, that’s not going to happen. I’ve come down to that side too. I think in the fullness of time, at least for now, the thin client model looks like it’s going to be it. I guess that seems to be your case as well, because you had a big event, and it is for a data center CPU. Arm is Everywhere, but not everyone can buy it. RH: And power efficiency was a nice-to-have in the data center, but I would say it wasn’t existential. It is now, though.
And I say that’s another big change because, again, one of the examples I gave: if you’re 4x-ing or 5x-ing or 6x-ing the CPUs in a given data center and you don’t want to give up one ounce of GPU accelerator power, then you’re going to squeeze everywhere you can, and that, I think, is a thing that’s in our favor. Where’s Arm in 10 years? RH: I would like Arm to be thought of as one of the most important semiconductor companies on the planet. We’re not there yet, but that’s how I would like the company to be thought about. Rene Haas, congratulations, great to talk. RH: Thank you, Ben. This Daily Update Interview is also available as a podcast. To receive it in your podcast player, visit Stratechery. The Daily Update is intended for a single recipient, but occasional forwarding is totally fine! If you would like to order multiple subscriptions for your team with a group discount (minimum 5), please contact me directly. Thanks for being a supporter, and have a great day!
Arm is selling its own chips, not just licensing IP. It's a big change compared to Arm's history, but not surprising given how computing is evolving.
Welcome back to This Week in Stratechery! As a reminder, each week, every Friday, we’re sending out this overview of content in the Stratechery bundle; highlighted links are free for everyone. Additionally, you have complete control over what we send to you. If you don’t want to receive This Week in Stratechery emails (there is no podcast), please uncheck the box in your delivery settings. On that note, here were a few of our favorites this week. This week’s Sharp Tech video is on Questions about Anthropic vs. the U.S. Government. Everything I Didn’t Write. This was one of those weeks where far more happened than I could write about — and that’s partly my fault for taking a stand on bubbles! To that end, I highly suggest this week’s episode of Sharp Tech, where we cover:

- OpenAI’s pivot to enterprise, and why AI might look like the PC in the 1980s
- Why I think that agents are not only real, but also the reason we are not in a bubble
- OpenClaw as evidence that my thesis that OpenAI and Anthropic are sustainably differentiated through their integration of harness and model is wrong
- Nvidia’s inference pivot, and why Nvidia is particularly concerned about a world dominated by OpenAI and Anthropic (and why Microsoft might be in trouble)
- And, for good measure, why I don’t mind Wisconsin winters

I think that each of these points could be another Update, but also, I’m taking a few days off for vacation, so I hope you’ll listen to this episode in particular. — Ben Thompson What Jensen Huang Has In Common with Steve Jobs. I really enjoyed this week’s Dithering covering Nvidia’s announcements at GTC Monday, including a near-perfect inversion of what Jensen Huang was telling the world about Nvidia’s approach to inference workloads just one year ago.
In their trademark 15-minute format, Ben explains how and why Nvidia’s inference messaging is now different (see also: this week’s Stratechery Interview), while Gruber draws on decades of Apple experience to note the similarities between Huang and Steve Jobs. It’s a great listen that renders legible an easily missed strategic inflection point at the most valuable company in the world. — Andrew Sharp Trump’s Trip to Beijing, Delayed Indefinitely. As the war in Iran continues, this week’s Sharp China covered the news that President Trump will delay a trip to Beijing that had been scheduled to begin March 31st. Come to hear why both sides are likely relieved by the delay, and stay to hear about softened Taiwan threat assessments from the U.S. intelligence community and a succession of PLA military scientists who are being purged for reasons that aren’t entirely clear. — AS Agents Over Bubbles — Agents are fundamentally changing the shape of demand for compute, both in terms of how they work and in terms of who will use them. They’re so compelling that I no longer believe we’re in a bubble. An Interview with Nvidia CEO Jensen Huang About Accelerated Computing — An interview with Nvidia CEO Jensen Huang about his GTC 2026 keynote, navigating China and DC, and remembering Nvidia’s true nature. Jensen Huang and Andy Grove, Groq LPUs and Vera CPUs, Hotel California — GTC 2026 marked an important inflection point for Nvidia, as the company is selling multiple architectures instead of focusing on just one GPU. The motivation is to serve all needs and keep all customers. What the NBA Could Be Getting from College Basketball — College basketball is fantastic, and the NBA should take advantage of its success by raising the age limit for the NBA Draft.
- LLM Paradigm Changes
- Jensen Huang’s Jobsian Keynote
- From Fiber to AI: A Laser Giant’s Rebirth
- Mexico City’s Sinking Lands
- The War in Iran and the Visit to Beijing; New DNI Assessments on Taiwan; Military Scientists Disappearing From Public View
- How to Miss a Free Throw, The Biggest Top 100 Disappointments, Expansion is Afoot (Again)
- How NOT to Miss a Free Throw, Generic Houston Rockets Slander, The Top 100 Pleasant Surprises
- OpenAI’s Enterprise Pivot, The Rise of Agents and Bubble Counterpoints, Nvidia Changes Its Inference Story
Stratechery is on a bit of a disjointed Spring Break, as my usual week off will be spread out: I will return to my usual posting schedule on Tuesday, March 31. All other Stratechery Plus content, including my podcasts, will stay on schedule.

- There will be no Update on Thursday, March 19
- There will be no Update on Monday and Tuesday, March 23–24; there will be an Update and Interview on Wednesday and Thursday, March 25–26
- There will be no Update on Monday, March 30
GTC 2026 marked an important inflection point for Nvidia, as the company is selling multiple architectures instead of focusing on just one GPU. The motivation is to serve all needs and keep all customers.
Listen to this post: Good morning, This week’s Stratechery Interview is running early, as I had the chance to speak in person with Nvidia CEO Jensen Huang at the conclusion of his GTC 2026 keynote, which took place yesterday in San Jose. I have spoken to Huang four times previously, in May 2025, March 2023, September 2022, and March 2022. In this interview we talk about a keynote that came across like a bit of a history lesson, and what that says about a company that still feels small even as it’s the most valuable in the world, as well as what has changed in AI over the last year. Then we discuss a number of announcements that might feel like a change in approach (although Huang disagrees), including Nvidia’s burgeoning CPU business and the Groq acquisition. Finally we discuss scarcity in the AI stack and how that affects Nvidia, the China question, and Huang’s frustration with doomers and their influence in Washington. As a reminder, all Stratechery content, including interviews, is available as a podcast; click the link at the top of this email to add Stratechery to your podcast player. On to the Interview: This interview is lightly edited for clarity. Jensen Huang, welcome back to Stratechery. JH: It’s great to be with you. You literally just walked off the stage — went a little long, I think — but you spent a lot of this keynote, which I quite enjoyed, explaining what Nvidia is, starting with the history of the programmable shader and the launch of CUDA 20 years ago.
We don’t need to spend too much time recounting this, you did a good job, and Stratechery readers are certainly familiar — sorry, this is a bit of a lead-up here — Stratechery readers are familiar, and I remember this exactly: someone asked me to explain how it is that Nvidia can announce so many things at a single GTC, this is like six, seven years ago, maybe even longer than that, and I explained that the whole thing with CUDA and all the libraries is it’s just sort of doing the same thing again and again, but for specific industries. That’s the story you told today, and it’s kind of a back-to-the-future moment after the last few GTC keynotes have been pretty AI-centered, CES was pretty AI-centered. Why did you feel the need to tell that story now? To recast CUDA and why it is important? JH: Well, because we’re going into a whole lot of new industries, and because AI is going to use tools, and when AI uses tools, those are tools that we created for humans. AI is going to use Excel, AI is going to use Photoshop, AI is going to use logic synthesis tools, Synopsys tools, and Cadence tools. Those tools have to be super-accelerated; they’re going to use databases, which have to be super-accelerated, because AIs are fast. And so I think in this era, we need to get all of the world’s software accelerated as fast as possible, and then put it in front of AI so that AI could agentically use it. So is that a bit where we’ve already done this for a bunch of sectors and now we’re going to do it for a bunch more? JH: Yeah, a whole bunch more. For example, data processing. Well, that was sort of a surprise. I didn’t expect you to be opening with an IBM partnership. JH: Yeah, right, that kind of puts it in perspective. I mean, they really started it all. You wrote last week that AI is a five-layer cake: power, chips, infrastructure, models, and applications.
Is there a concern, over the last four or five years, about being squeezed into the chips box, so it’s important to both remind people and also yourselves that you are this vertically integrated company — not just in terms of building systems, but into the entire software stack, you’re not just a chip company? JH: I guess my mind doesn’t start with, “What I’m not”, it starts with, “What do we need to be?”. And back then, we realized that accelerated computing was a full stack problem — you have to understand the application to accelerate it. We realized that we had to understand the application, we had to have the developer ecosystem, we needed to have excellent expertise in algorithm development, because the old algorithms that were developed for CPUs don’t work well for GPUs, so we had to rewrite, refactor algorithms so that they could be accelerated by our GPUs. If we do that, though, you get 50 times speed up, 100 times speed up, 10 times speed up, and so it’s totally worth it. I think since the very beginning, we realized, “Okay, what do we want to do, and what does it take to achieve that?”. Now, today we’re building AI factories, we’re building AI infrastructure all over the world. That’s a lot more than building chips, and building chips is obviously important, it’s the foundation of it. Right, that’s like one full stack of doing the networking and doing the storage, and now you’re into CPUs. JH: Now you’ve got to put it all together into these giant systems — a gigawatt factory is probably $50, $60 billion. Out of that $50, $60 billion, probably about, call it $15, $17 billion or so, is infrastructure: land, power, and shell. The rest of it is compute and networking and storage and things like that, and so at that level of investment, unless you’re helping customers achieve the level of confidence that they’re going to succeed in building it, you just have no hope — nobody’s going to risk $50 billion.
So I think that that’s the big idea, that we need to help customers not just build chips, but build systems, and then after we build systems, not just build systems, but build AI factories. AI factories have a lot of software inside, it’s not just our software, it’s a ton of software for cooling management and electricals and things like that, and redundancies, and a lot of it is over-designed, and it’s over-designed because nobody talked to each other. When you have a lot of people who don’t talk to each other integrating systems, you have to, by definition, over-design your part of it. But if we’re working together as one team, we’ll make sure that we can push the limits and get more throughput out of the power that we have, or save money for whatever throughput you want to have. Just to go back to that software bit, you mentioned Excel wasn’t designed to be used by AI. Claude, for example, has this new functionality to use Excel, so when you talk about that, you want to invest in these libraries, is that to enable models like that to do better? Or is that something for Microsoft or for enterprises — you want to use this, you don’t want to be beholden to this sort of other player in the world? JH: Well, SQL’s a good example. SQL’s used by people, and we bang on SQL systems like anybody else, and it is the ground truth of businesses. Well, it’s not just gonna be people banging on our SQL database now, it’s gonna be a whole bunch of agents banging on it. Right, they’re gonna do it way faster. JH: They’re gonna need to do it way faster. And so the first thing we have to do is accelerate SQL, that’s kind of the simple logic of it. That makes sense. In terms of models, you noted that language models are only one category. “Some of the most transformative work is happening in protein AI, chemical AI, physical simulation, robots, and autonomous systems”, and this is from the piece you wrote last week.
You’ve previously made this point while noting in other keynotes, “Everything is a token”, I think, is a phrase that you’ve used before. Do you see transformers as being the key to everything, or do we need new fundamental breakthroughs to enable these applications? JH: We need all kinds of new models. For example, transformers: their ability to do attention scales quadratically, and so how do you have quite long memory? How can you have a conversation that lasts a very long time and not have the KV cache essentially become, over time, garbage? Or have entire racks of solid-state drives that are holding KV cache. JH: And of course, let’s say that you were able to record all of our conversation: when you go back and reference some conversation, which part of the reference is most important? There needs to be some new architecture that thinks about attention properly and is able to process that very quickly. We came up with a hybrid architecture of a transformer with an SSM, and that is what enables Nemotron 3 to be super intelligent and super efficient at the same time, that’s an example. Another example is coming up with models that are geometry-aware, meaning a lot of things in life, in nature, are symmetrical. And so when you’re generating with these models, you don’t want them to generate what is just statistically plausible, it has to also be physically based, and so it has to come out symmetrical. And so cuEquivariance, for example, allows you to do things like that. So we have all these different technologies that are designed — or, for example, when we’re generating tokens in words, it comes out in chunks at a time, little bits, tokens at a time; when you’re generating motion, you need it to be continuous. And so there’s discrete information that you generate and understand, and there’s continuous information that you want to generate and understand. Transformers are not ideal for both. Right, that makes sense.
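The quadratic-attention point can be made concrete with a back-of-the-envelope model. This is an illustrative sketch only; the model dimensions (`d_model`, layer count, FP16 storage) are assumptions, not the configuration of any particular model:

```python
# Back-of-the-envelope scaling of transformer attention with context length.
# Illustrative only: d_model, layer count, and FP16 storage are assumptions,
# not the configuration of any particular model.

def attention_cost(context_len, d_model=4096, n_layers=32, bytes_per_val=2):
    """Return (attention-score FLOPs, KV-cache bytes) for one forward pass."""
    # Building the attention-score matrix is O(n^2 * d) per layer.
    score_flops = 2 * context_len**2 * d_model * n_layers
    # The KV cache grows only linearly: one K and one V tensor per layer.
    kv_cache_bytes = 2 * context_len * d_model * n_layers * bytes_per_val
    return score_flops, kv_cache_bytes

for n in (8_192, 131_072):
    flops, cache = attention_cost(n)
    print(f"{n:>7} tokens: {flops:.2e} score FLOPs, {cache / 2**30:.1f} GiB KV cache")
```

A 16x longer context costs 256x the attention-score compute but only 16x the KV cache, which is the pressure behind long-memory designs like the transformer-plus-SSM hybrid described here.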
One more quote from the piece, you write, “In the past year, AI crossed an important threshold. Models became good enough to be useful at scale. Reasoning improved. Hallucinations dropped. Grounding improved dramatically. For the first time, applications built on AI began generating real economic value”. What specifically was that change? Because when I think about the timing, I feel like this upcoming year is definitely about agents, I just wrote about it today — but for last year, was that the reasoning? Was that the big breakthrough? JH: Generative, of course, was a big breakthrough, but it hallucinated a lot, and so we had to ground it, and the way to ground it is reasoning, reflection, retrieval, search, so we helped it ground. Without reasoning, you couldn’t do any of that, and so reasoning allowed us to ground the generative AI. And once you ground it, then you could use that system to reason through problems and decompose them, decompose them into things that you could actually do something about, and so the next generation was tool use. It turns out, and it probably tells you something, that search was a service that nobody paid for, and the reason for that is that getting information is very important and very useful, but it’s not something you pay for. The bar to reach to get somebody to pay you for something has to be higher than just information. “Where’s a good restaurant?” — that information is just not, I don’t think, worthy enough to get paid for. Some people pay for it, I pay for it. We now know that we’ve now crossed that threshold. Not only is it able to converse with us and generate information for us, it can now, of course, do things for us. Coding is just a perfect example of that.
If you think about it for a second, you realize this: coding is not really the same modality as language, you have to teach it empty spaces and indentations and symbols, it’s almost like a new modality, and you can’t generate code just one token at a time, you have to reflect on the chunk of code. That chunk of code has to be factored properly and has to be optimal and has to obviously compile; it has to be grounded not on probable truth, it has to be grounded on execution. Right, does it run or not? JH: It has to run or not. And so I think learning that code modality was a big deal. Once you’re able to now do — we pay engineers several hundred thousand dollars a year to code, and so now they have a coding assistant. They can think about architecture. Instead of describing programs in code, which is very laborious, they can now describe software in specifications, which is much more abstract and allows them to be much more productive. And so they describe specifications and architecture, they’re able to use their time to solve and innovate, and so our software engineers 100% use coding agents now. Many of them haven’t generated a line of code in a while, but they’re super productive and super busy. Do you think there is a temptation to over-extrapolate from coding, though, precisely because it’s verifiable? You have this agent idea where they can go — it’s not just that they will generate code, they can actually verify it, see if it works, and if it doesn’t, they can go back and do it again, and this can happen all without humans because there’s a clear, “Does it work or not?”. JH: Well, because you can reflect, you could have, let’s say, design a house. Designing a house or designing a kitchen used to be the work of architects, designers, but now you could have carpenters do that. So now you’ve elevated the capability of a carpenter, now you use an agent for that carpenter to go design a house, design a kitchen, come up with some interesting styles.
The agent doesn’t have some tool to execute against. However, you could give it an example. You say, “These are the styles I’m looking for, I want it to be aesthetic like that”. Because the agent is able to reflect, is able to compare its quality of code, its quality of result against some reference, it could say, “You know what, it didn’t turn out as well as I hoped, I’m going to go back at it again”, and so it iterates. It doesn’t have to be fully executable; in fact, the more probabilistic, the more aesthetic, the more subjective, if you will, AI actually does better. Right, well that’s why you almost have two extremes. You have generating images, where there’s no right answer, and then you have coding, where there is a right answer, and AI seems to do well at those extremes, and the question is how much will it collapse into the middle there. JH: We’re fairly certain it could do architecture now, we’re fairly certain it could design kitchens and living rooms. Well, to this point, one of the big things with agents coming online is, you’ve talked a lot about accelerated computing, and I think you’ve trash-talked the CPUs, as it were, to this day: they’re all gonna be removed, like everything’s gonna be accelerated. Suddenly CPUs are hot again. It turns out they’re pretty useful and important, to the extent you are selling CPUs now. How’s it feel to be a CPU salesman? JH: There’s no question that Moore’s law is over. Accelerated computing is not parallel computing. Go back in time — 30 years ago, there were probably 10, 20, 30 parallel computing companies; only one survived, Nvidia, and the reason why is because we had the good wisdom of recognizing the goal wasn’t to get rid of the CPU, the goal was to accelerate the application. So what I just falsely accused you of was actually true for everybody else. JH: We were never against CPUs, we don’t want to violate Amdahl’s Law.
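The Amdahl's Law point is worth a worked example: overall speedup is capped by whatever fraction of the work is not accelerated, which is why a slow serial component can hold back an entire accelerated system. A minimal sketch, with illustrative numbers (the 95%/100x figures are assumptions, not measurements):

```python
# Amdahl's Law: total speedup is capped by the fraction of work that is
# not accelerated. The 95%/100x numbers below are illustrative assumptions.

def amdahl_speedup(accelerated_fraction, accel_factor):
    """Overall speedup when `accelerated_fraction` of the work runs
    `accel_factor` times faster and the rest runs at original speed."""
    serial = 1.0 - accelerated_fraction
    return 1.0 / (serial + accelerated_fraction / accel_factor)

# A 100x GPU speedup on 95% of the work yields only ~16.8x overall;
# even an infinitely fast accelerator is capped at 1 / 0.05 = 20x.
print(amdahl_speedup(0.95, 100))
print(amdahl_speedup(0.95, 1e12))
```

The serial 5% dominates long before the accelerator does, which is the logic behind pairing GPUs with the fastest single-threaded CPUs available.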
Accelerated computing, in fact, inside our systems, we choose the best CPUs, we buy the most expensive CPUs, and the reason for that is because that CPU, if it’s not the best and not the most performant, holds back millions of dollars of chips. When it comes to branch prediction, you used to worry about wasting CPU time; now you’re worried about wasting GPU time. JH: That’s right, you just never can have GPUs be squandered, GPU time be idle. And so we always use the best CPUs, to the point where we went and built Grace so that we could have the highest-performance single-threaded CPU and move data around a lot faster. And so accelerated computing was never against CPUs; my thesis is still true that Moore’s Law is over, the idea that you would use general purpose computing and just keep adding transistors, that is so dead, and so I think fundamentally we’re not against CPUs. However, these agents are now able to do tool use, and the tools that they want to use are tools created for humans, and there are basically two types. There’s the stuff that we run in data centers, and most of it is SQL, most of it is database-related, and the other type is personal computers. We’re now going to have AIs that are able to learn unstructured tool use; the first type of tool use is structured. CLIs are tool use, APIs, they’re all structured tool use, the commands are very explicit, the arguments are explicit, the way you talk to that application is very specific. However, there’s a whole bunch of applications that were never designed to have CLIs and APIs, and those tools need AIs to learn multi-modal, unstructured tool use; the AI has to go and be able to go surf a website, and it has to be able to recognize buttons and pull-down menus and just kind of work its way through it like we do. That kind of tool use is going to want to use PCs, and we have both sides: we have incredibly great data processing systems, and as you know, Nvidia’s PCs are the most performant in the world.
So what makes an agent-focused CPU different from other CPUs? So you’re going to have a rack of just Vera CPUs. JH: Oh, really good, excellent. So the way that CPUs were designed in the last decade, they were all designed for hyperscale cloud and the way that hyperscale cloud monetizes CPUs is by the CPU core. So you want to design CPUs that have as many cores as possible that are rentable, the performance of it is kind of secondary. You’re dealing with web latency by and large. JH: That’s exactly right, exactly. And so the number of CPU instances is what you’re optimizing for. That’s why you see these CPUs with a couple of hundred, 300, 400 cores coming. Well, they’re not performant and for tool use, where you have this GPU waiting for the tool use— And you’re going over NVLink. JH: That’s right, you want the fastest single-threaded computer you can possibly get. So is it just the speed? Or does the CPU itself need to be increasingly parallel so it doesn’t have misses and things like that? Or so it’s like just all the way down the pipeline is very different? JH: Yeah, the most important thing is single-threaded performance and the I/O has to be really great. Because it’s now in the data center, the number of single-threaded instances running is going to be quite high and therefore, it’s going to bang on the I/O system, it’s going to bang on the memory controller really hard. Vera’s bandwidth-per-CPU core, bandwidth-per-CPU, is three times higher than any CPU that’s ever been designed, and so it’s designed so that it has lots and lots of I/O bandwidth and lots and lots of memory bandwidth, so that it never throttles the CPU. If the CPU gets throttled, then we’re holding back a whole bunch of GPUs. Is this Vera rack, is it still, you talked about it being very tightly linked to the GPU rack, but is it still disaggregated so that the GPUs can be serving multiple different Vera cores? Whereas you have a Vera core on a board with- Okay, got it, that makes sense. 
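The earlier point that a slow CPU "holds back millions of dollars of chips" can be sketched with toy numbers. All dollar figures and the amortization period below are illustrative assumptions, not actual Nvidia pricing:

```python
# A toy model of why an underpowered CPU "holds back millions of dollars
# of chips": every moment the GPUs wait on the CPU, amortized rack cost
# is burned. All dollar figures are illustrative assumptions.

RACK_COST = 3_000_000          # assumed cost of one GPU rack, USD
LIFETIME_HOURS = 5 * 365 * 24  # amortized over an assumed 5-year life
COST_PER_SECOND = RACK_COST / (LIFETIME_HOURS * 3600)

def stall_cost(stall_fraction, hours):
    """Dollars of amortized rack capacity wasted while GPUs idle on the CPU."""
    return stall_fraction * hours * 3600 * COST_PER_SECOND

# GPUs idling 10% of the time wastes 10% of the rack's amortized cost:
print(f"${stall_cost(0.10, 365 * 24):,.0f} per rack per year")
```

Under these assumptions a 10% stall rate burns tens of thousands of dollars per rack per year, which is why over-provisioning single-threaded CPU performance and I/O bandwidth is cheap by comparison.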
How does your Intel partnership and the NVLink thing fit into this, if at all? JH: Excellent. Some of the world is happy with Arm, some of the world still needs, particularly, you know, enterprise computing, a whole bunch of stacks that people don’t want to move, and so x86 is really important to that. Has the resiliency of x86 code been surprising to you? JH: No. Nvidia’s PC is still x86, all of our workstations are x86. I did want to congratulate you, as you talked about in the keynote today, you are the token king. So in your article, you also talked about how energy is the first principle of AI infrastructure and the constraint on how much intelligence the system can produce. If that’s the case, if it’s the amount of tokens you can produce and you’re constrained by how much energy is in the data center, why do companies even try to compete with the token king? JH: It’s going to be hard because it’s not reasonable to build a chip and somehow achieve results that are fairly dramatic. Even in the case of Groq, Groq couldn’t deliver the results unless we paired it with Vera Rubin. Well tell me about this, my next question was about Groq. JH: So if you look at the entire envelope of inference, on the one hand, you want to deliver as much throughput as possible, on the other hand, you want to deliver as many smart tokens as possible — the smarter the token, the higher price you could charge. These two balance, this tension of maximizing throughput on the one hand, maximizing intelligence on the other hand, is really, really tough to work out. I do have to say, last year you had a slide talking about this Pareto Curve, and you talked about, I think it was when you introduced Dynamo, how your GPUs could cover the whole thing, and so you didn’t have to think about it, just buy an Nvidia GPU, and Dynamo will do both. But now you’re here saying, “Well, it doesn’t quite cover the whole thing”. JH: We cover the whole thing still better than any system that can do it.
Where we could extend that Pareto curve is particularly at the extremely high token rates and extremely low latency, but that also reduces the throughput. However, because of coding agents, because there are now AI agents that are producing really, really great economics, and because the agents are being attached to humans that are, I mean, extremely valuable. Right, they’re even more expensive than GPUs. JH: And so I want to give my software engineers the highest token rate service, and so if Anthropic has a tier of Anthropic Claude Code that increases coding rate by a factor of 10, I would pay for it, I would absolutely pay for it. So you’re building this product for yourself? JH: I think most great products are kind of because you see a pain point and you feel the pain point and you know that that’s where the market’s going to go. We would love for our coding agents to run 10 times faster, but it’s just very, very difficult to do that in a high throughput system, and so we decided to add the Groq low latency system to it, and then we basically co-run, co-process. Right. And is this just separating decode and prefill? JH: We’re going to do even the high-processing, high-FLOPS part of decode, the attention part of decode. So you’re disaggregating even down to the decode level. JH: That’s right, and that requires really tight coupling and really, really close integration of software. So how are you able to do that? You say you’re shipping later this year, and this deal was just announced a couple of months ago. JH: Well, we started working on disaggregated inferencing; Dynamo really put Nvidia’s ideas on the table. The day that I announced Dynamo, everybody should have internalized that I was already thinking about, “How do we disaggregate inference across a heterogeneous infrastructure more finely?”, and Groq’s architecture is such an extreme version of ours, they had a very hard time.
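To make the disaggregation idea concrete: prefill is compute-bound and batches well, while decode is latency- and memory-bandwidth-bound, so the two phases can be served by different hardware pools. The sketch below is a toy router illustrating the concept only; it is not Dynamo's actual architecture or API, and the pool names are invented for illustration:

```python
# Toy sketch of disaggregated inference scheduling: prefill (compute-bound,
# batch-friendly) and decode (latency-sensitive, bandwidth-bound) are routed
# to different hardware pools. Illustration only, not Dynamo's real design.

from dataclasses import dataclass

@dataclass
class Request:
    prompt_tokens: int
    phase: str = "prefill"   # "prefill" -> "decode"

class DisaggregatedRouter:
    def __init__(self):
        self.prefill_pool = []   # high-throughput GPUs (e.g. a GPU rack)
        self.decode_pool = []    # low-latency accelerators (e.g. LPU-style)

    def route(self, req: Request) -> str:
        if req.phase == "prefill":
            self.prefill_pool.append(req)
            return "prefill_pool"
        self.decode_pool.append(req)
        return "decode_pool"

    def finish_prefill(self, req: Request) -> str:
        # After prefill, the KV cache is handed off and the request
        # migrates to the decode pool for token-by-token generation.
        self.prefill_pool.remove(req)
        req.phase = "decode"
        return self.route(req)

router = DisaggregatedRouter()
r = Request(prompt_tokens=2048)
assert router.route(r) == "prefill_pool"
assert router.finish_prefill(r) == "decode_pool"
```

Pushing the split below the phase level, such as running the FLOPS-heavy attention portion of decode on different silicon, is the finer-grained version of the same handoff.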
Dynamo was a year ago, and Groq just happened sort of over Christmas. Was there an event that sort of made you think this needed to happen? JH: Well remember, I announced Dynamo a year ago, and we’d been working on Dynamo for two years, so we’ve been thinking about disaggregated inference for two, three years, and we started working with Groq maybe before we announced the deal, maybe six months earlier. So we’d been thinking about working with them on unifying Grace Blackwell and Groq fairly early on. So as for the interaction with them, I really like the team, and we don’t want their cloud service. They had another business that they really believe in and they still believe in, they’re doing really well with it, and that wasn’t a part of the business that we wanted, so we decided to acquire the team and license the technology. Then we’ll take the fundamental architecture and we’ll evolve it from here. So it was just a happy coincidence, or not a happy coincidence, maybe not a happy coincidence. JH: Strategic serendipity. Because OpenAI, you know, has an instance now with Cerebras that they announced in January. JH: That was done completely independent of us and frankly, I didn’t even know about it, but it wouldn’t have changed anything. I think the Groq architecture is the one I would have chosen anyways, it’s much more sensible to us. Was this the first time where there was sort of an ASIC approach that made you raise your eyebrows, like, “Oh, that’s actually fundamentally different”? JH: No, Mellanox. That’s a good example. JH: Yeah, Mellanox. We took a bunch of our computing stack and we put it into the Mellanox stack. NVLink wouldn’t be possible, you know, at the scale we’re talking about without the in-network fabric computing that we did with Mellanox. Taking the software stack, disaggregating it, and putting it where it needs to be, is a specialty of Nvidia. We’re not obsessed about where computing is done, we just want to accelerate the application.
Remember, Nvidia is an accelerated computing company, not a GPU company. Right. So you talk about power being the constraint. When your customers are thinking about what to buy, whether all sorts of traditional GPUs or these LPU racks, should they be thinking about it in terms of you being confident they can drive way more revenue? JH: It really depends on the kind of products they have. Suppose you really don’t have enterprise use cases at the moment: I don’t really think that adding Groq makes much sense, and the reason for that is because most of your customers are free tier customers, and they’re moving towards paying. So it might be two-thirds free tier, one-third paid; in that case, adding Groq to it, you’re adding a lot of expense. You’re taking some power, it’s not worth it. Complexity. And you’re taking away servers, the opportunity cost. JH: That you could actually be using to serve the free tier, yeah. However, if you have an Anthropic-like business, and you have an OpenAI-like business where Codex is capturing really great economics, but you just wish you could generate more tokens, this is where adding that accelerator can really boost your revenues. Are we actually constrained by power right now in 2026, or by fab capacity, or what? Everyone’s saying we don’t have enough supply. What’s the actual limiting factor? JH: I think it’s probably close on everything. You couldn’t double anything, really. Because you’ll hit some other constraints. It does feel like, though, the U.S. has, I think, done a pretty good job of scrounging up power, maybe more than people expected a couple years ago; it feels like chips are really much more of a limiter right now. JH: Our supply chain is fairly well planned. You know, we were planning for a very, very big year, and we’re planning for a very big year next year. We saw all the soju drinking and fried chickens. JH: (laughing) Yeah, right.
We’re planning, we plan for, in our supply chain, we have got, you know, a couple of hundred partners in our supply chain, and we’ve got long-term partnerships with them. So I feel pretty good about that part of it. I don’t think we have twice as much power as we need, I don’t think we have twice as much chip supply as we need, I don’t think we have twice as much of anything as we need. But I think everything that I see on the horizon, we will be able to support from a supply chain perspective, and the thing that I wish for probably more than anything is that all the land, power, and shell would just get stood up faster. Is it fair to say, is there a bit where Nvidia is actually the biggest beneficiary of scarcity, though, to the extent it exists? Like, if there’s a power scarcity, you’re the most efficient chip, so you’re going to be utilizing that power better. Or if there’s fab capacity, like you just said, you’ve been out there securing the supply chain, you’ve got it sort of sorted, are you the big winners in that regard? JH: Well, we’re the largest company in this space, and we did a good job planning. And we plan upstream of the supply chain, we plan downstream of the supply chain, and so I think we’ve done a really good job preparing everyone for growth. Right, but is this a bit where, at its core, not having access to the Chinese market maybe is a threat? Like, if China ends up with plenty of power and plenty of chips, even though those chips are only 7nm, they have the capacity to build up an ecosystem to potentially rival CUDA in the long run, is that the concern that you have? JH: There’s no question we need to have the American tech stack in China, and I’ve been very consistent about that since the very beginning, recognizing that open source software will come. No country contributes more to open source software than China does, and we also know that 50% of the world’s AI researchers come from China, and we also know that they’re really inventive.
DeepSeek is not a nominal piece of technology, it’s really, really good. And Kimi is really good, and Qwen is really good, and they make unique contributions to architecture, and they make unique contributions to the AI stack, so I think we have to take these companies seriously. To the extent that the American tech stack is what the world builds on top of, then when that technology diffuses out of China, which it will, because it’s open source, and when it comes out of China, it goes into American industries, it goes into Southeast Asia, it goes into Europe, the American tech stack will be prepared to receive them. I’ve been really consistent that this is probably the single most important geopolitical and strategic issue for the American tech industry. Yeah, when we talked last time, the Trump administration had banned the H20. Were you surprised you were able to get the Trump administration to see your point of view? And then were you even more surprised that now you’re stymied by the Chinese government? JH: I’m not surprised by us being stymied by them, and the reason for that is because, of course, China would like to have their tech stack develop. In the time that we’ve left that market, you know how fast the Chinese industry moves, and Huawei achieved a record year in their company’s history. This is a very long-running company, and they had a record year. They had, what, five, six IPOs of chip companies that are addressing the AI industry. I think we need to be more strategic in how we think about American leadership and American geopolitical and technology leadership. AI is not just a model, and that’s a deep misunderstanding — AI, as I said and as you mentioned in the beginning, AI is a five-layer cake, and we have to win the infrastructure layer, we have to win the chips layer, we have to win the platform layer, we have to win the model layer, and we have to win the application layer.
Some of the things that we do are jeopardizing our ability as a country to lead in each one of those five layers. I think it’s a terrible mistake to think that the way to win is to bundle all of it top-to-bottom and tie every company together into one holistic stack, so that we can only win at the limits of what any one of the layers can win. We’ve got to let all the layers go out and try to win the market. Have those other layers maybe benefited from their longer experience in Washington, and you sort of showed up a little late to the scene? JH: Yeah, maybe. What have you learned? What’s been the biggest thing you’ve learned about Washington? JH: Well, the thing that I was surprised by is how deeply the doomers were integrated into Washington D.C. and how the messages of doomers affected the psychology of the policy makers. Everyone was scared instead of optimistic. JH: That’s right, and I think it has two fundamental problems. In this Industrial Revolution, if we don’t allow the technology to diffuse across the United States and we don’t take advantage of it ourselves, what will happen to us is what happened to Europe in the last Industrial Revolution — we left them behind. And in a lot of ways, they invented all the technologies of the last Industrial Revolution, and we just took advantage of them. I hope that we have the historical wisdom and the technological understanding to not get trapped in science fiction, doomerism, these incredible stories that are being invented to scare the living daylights out of policy makers who don’t understand technology very well, giving them these science fiction embodiments that are just not helpful. One of the situations that is most concerning to me is that when you poll the United States population, the popularity of AI is decreasing; that’s a real problem.
It’s no different than if the popularity of electricity, the popularity of electric motors, the popularity of gasoline engines had decreased in the last Industrial Revolution. The popularity of the Internet, could you just imagine? Other countries would have taken advantage of it much more quickly than we did, and the technology would have diffused into their industries and societies much more quickly, and so we just have to be much, much more concerned that we don’t give this technology some kind of a mystical science fiction embodiment that’s just not helpful and scares people. And so I don’t like it when doomers are out scaring people; I think there’s a difference between genuinely being concerned and warning people versus creating rhetoric that scares people. I think a characteristic you see all the time is people put on their big thinking hats and try to tease out all these nuances, and forget the fact that actual popular communication is done in broad strokes. You don’t get to say, “Oh, you’re a little scared of this, but not this XYZ” — you’re just communicating fear as opposed to communicating optimism. JH: Yeah, and somehow it makes them sound smarter. People love to sound smart. JH: Sometimes, and we now know this, it helps them with their fundraising, and sometimes it helps them secure regulatory capture. So there’s a lot of different reasons why they do it, and these are incredibly smart people, but I would just warn them that most of these things will likely backfire and come back around, and they’ll probably be disappointed that they did it someday. I’m gonna tie a few questions together because I know we’re a little short on time. In the self-driving car space, you’re working with multiple automakers, you have your Alpamayo model, while still supplying chips to Tesla.
You had a big bit about OpenClaw today in your presentation — meanwhile, a huge thing driving the Vera chips, for example, when we talk about agents, is what’s happening with, say, Claude Code and what’s happening with Codex from OpenAI. Am I right to tease out a consistent element here (your investment in your open source models goes with that), where you’re happy to supply the leading provider, or the inventor in a space, with chips, but then you’re going to fast-follow what they do for everyone else that is threatened by them? So you simultaneously broaden your customer base, you’re not just dependent on the leaders, but then also the leaders are helping you sell to everyone else because they’re worried about being left behind. JH: No, nothing like that. We’re at the frontier in so many different domains. In a lot of ways, we are the leader in many of these domains, but we never turn them into products. We’re a technology stack, and so we have to be at the frontier, we have to be the world leader of the technology stack, but we’re not a solutions manufacturer, we’re not a service provider. And so that’s number one. Will that always be the case? JH: Yeah, it will always be the case. There’s no reason to, and we’re delighted not to. And so we create all this technology, we make it available to everybody. Well, it’s funny though, if you go back to, like, your boards, for example, the products you ship, more and more of that — there’s what, 30,000 specific SKUs in a rack today or something like that — more and more of those are defined by you, “This is what it’s going to be”, in part to make it easier to assemble, all those sorts of pieces. Is there a bit where that’s gonna happen on the software side too, as you talk about those vertical bits and your open source model? JH: We create a thing vertically and then we open it horizontally, and so everybody can use whatever piece they would like. As long as they’re running on Nvidia chips?
JH: Whatever piece they would like, they don’t have to use all Nvidia chips, they don’t have to use all Nvidia software. We have to build it vertically, we have to integrate it vertically and optimize it vertically. But afterwards, we give them source, we give them — they just figure out how they want to do it. Do you think Nvidia can actually produce and keep up in terms of having a frontier model that can win that space, or be a necessary provider of that space, given that folks like Meta seem to have fallen off, and the alternative seems to be, by and large, Chinese models? JH: Winning that space is not important to us. Right, well, important not in terms of winning, but important in terms of there needing to be an open source frontier model, so if not you, then who? JH: That’s right, that’s right, somebody has to create open source models, and Nvidia has a real capability in doing so. Whenever we create these open source models, we also learn a lot about the computation. Was that a bit of a problem with Blackwell? I’ve heard mutters that the training runs were maybe a little more difficult than they were sort of previously. JH: The challenge with Blackwell was 100% NVLink 72; NVLink 72 was backbreaking work. And it was the only time that I thanked the audience for working with us. Yeah, I noticed when you said that today, it came across as very sincere. JH: Yeah, because we tortured everybody, but everybody loves it now. This is the second time we’ve had a chance to talk in person, and my takeaway when I met you previously in Taipei was the extent to which Nvidia still feels like a small company. Are you worried about getting stretched too thin, or do you still think you have sort of that CUDA-esque flywheel where, “It looks like we’re doing a lot, we’re just kind of doing the same thing over and over again?”.
JH: The reason why Nvidia can move so fast is because we always have a unifying theory for the company, and that’s my job: I need to come up with a unifying theory for what’s important, why things connect together, and how they connect together, and then create an organization, an organism, that’s really, really good at delivering on that unifying theory. And so the unifying theory for Nvidia is actually fairly simple. On the one hand, we have the computing platform, the software platform that’s related to CUDA-X. On the other hand, we’re a computing systems company: we optimize things vertically, we apply extreme co-design across the stack and all the different components of a computer, and now that computer is a platform of ours, and we integrate that platform into all the clouds and to all the OEMs, and then we have another platform that’s now the data center platform, or the AI factory platform. So once you have a unifying theory about what Nvidia builds and how it goes about doing it — and I used the keynote to kind of tell that story, even partly to our own employees.

That’s what it felt like. That whole first hour of the keynote felt like you talking to your employees, reminding them of what you do.

JH: It’s important that we’re always constantly reminded of what’s important to us, and AI is important to us, but of course CUDA-X and all of the solvers and all of the applications that we can accelerate are really important to us.

Thank you very much.

JH: Thank you. It’s great to see you, Ben. Keep up the good work.
There is a weird paradox in terms of AI prognostication: on one hand, you don’t want to be the one to completely dismiss the most terrifying doomsday scenarios; who wants to be found out to be foolishly optimistic? At the same time, there is also pressure to give credence to the possibility that we are in a bubble, and that all of this hype and spending is going to go belly up. While I have argued against the former, I have very much been on board with the latter, making the case that bubbles can be good. Sitting here in March 2026, however, on the morning of Nvidia’s GTC, I’ve come to a different conclusion: I don’t think we’re in a bubble (which, paradoxically, maybe is the truest evidence that we are).

Over the last couple of weeks, first in the context of Nvidia’s earnings, and then last week in the context of Oracle’s, I’ve talked about three LLM inflection points.

ChatGPT: The first LLM inflection point was the November 2022 launch of ChatGPT, which hardly needs an explanation. Yes, transformer-based large language models were introduced in 2017, and the capabilities were both impressive and growing, but under-appreciated; Stratechery started an interview series with Daniel Gross and Nat Friedman in October 2022 under the premise that there was an incredible new technology that was sorely lacking in product applications and startup energy. Needless to say, that was entirely flipped on its head just weeks later.

ChatGPT opened the eyes of the world to what LLMs were capable of, but the initial versions had two flaws that have stuck in many people’s minds, particularly those convinced that we are in a bubble. The first flaw is that LLMs frequently got things wrong; worse, they would hallucinate when they didn’t know the answer. This made LLMs feel like something of a parlor trick: amazing when they worked, but not something that you could count on.
The second was related to the first: even in that flawed state LLMs were tremendously useful, but you needed to have an idea of what to use them for, and you needed to proactively take care to manage mistakes and verify the output in case it was hallucinated.

o1: The second LLM inflection point was the release of OpenAI’s o1 model in September 2024. By that point LLMs had improved tremendously, both thanks to new foundation models and also because of continued improvements in post-training; that meant that the stream of tokens that constituted an answer in ChatGPT or Claude was now much more likely to be right, and somewhat less likely to be hallucinated. What made o1 different, however, was that it reasoned over its answer before delivering it to you. I explained in an Update at the time:

The big challenge for traditional LLMs is that they are path-dependent; while they can consider the puzzle as a whole, as soon as they commit to a particular guess they are locked in, and doomed to failure. This is a fundamental weakness of what are known as “auto-regressive large language models”, which, to date, is all of them. Reasoning models self-evaluate: they work through an answer and then consider if the answer is correct, or if they should consider other alternatives.

To put it in terms of the weaknesses I identified above, reasoning models were internally proactive in terms of managing mistakes, reducing the burden on the user to continually and actively guide the LLM, and the results were remarkable. From my perspective, if the brilliance of ChatGPT was in making LLMs much more accessible and useful, the brilliance of o1 was in making LLMs much more reliable and essential.

Opus 4.5: Anthropic released Opus 4.5 on November 24, 2025, to relatively little fanfare; then, at some point in December, Claude Code with Opus 4.5 suddenly seemed to be able to do things that were never possible previously.
OpenAI released GPT-5.2-Codex around the same time, on December 18, and it was similarly capable. People had been talking about “agents” for a while; suddenly, however, both Claude and Codex were actually accomplishing tasks — some of which took hours — and doing them correctly.

That bit about the Opus 4.5 model’s release date is interesting, however: the key thing about agentic workloads is that they are about more than the model, or using the model recursively, like o1. Rather, a critical component of making agentic workloads work is the “harness”, i.e. the software that actually controls the model. To put it another way, Claude Code and OpenAI’s Codex actually abstract the user away from the model: you give instructions to an agent, which actually directs the model; critically, the agent can also use other deterministic tools, which means that it can verify its results. To put it in the context of coding: in paradigm one, an LLM would generate code; in paradigm two, an LLM would think about the code it was generating and iterate towards a better answer; in this paradigm, an agent directs a model to generate code, then checks to see if the code actually works, and if it doesn’t, tries again, all without the user needing to be involved.

In other words, many of the biggest flaws of the original ChatGPT have been substantially mitigated, at least for verifiable use cases like coding: LLMs are much more likely to be right the first time, they reason over their results to increase their chances, and now agents actively verify the results without humans needing to be in the loop. That leaves one flaw: actually figuring out what to use these for.

The reason I’ve been writing about these three inflection points over the last couple of weeks has been to explain why the industry is so compute constrained and why the massive investment in capex by the hyperscalers is justified. It’s how this third point will be manifested that I think is under-appreciated.
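The generate-verify-retry loop described above can be sketched in a few lines. To be clear, this is a toy illustration of the pattern, not any vendor’s actual harness: `call_model` is a hypothetical stand-in for a real model API, stubbed here so the example is self-contained, and `verify` is the deterministic tool (here, just running the code against a known test).

```python
def call_model(task: str, attempt: int) -> str:
    """Hypothetical model call; stubbed with canned drafts so the sketch runs."""
    drafts = [
        "def add(a, b): return a - b",  # first draft: buggy
        "def add(a, b): return a + b",  # second draft: correct
    ]
    return drafts[min(attempt, len(drafts) - 1)]

def verify(code: str) -> bool:
    """Deterministic check: execute the generated code against a known test."""
    scope = {}
    try:
        exec(code, scope)
        return scope["add"](2, 3) == 5
    except Exception:
        return False

def agent(task: str, max_attempts: int = 5):
    """Harness loop: generate, verify, retry -- no human in the loop."""
    for attempt in range(max_attempts):
        code = call_model(task, attempt)
        if verify(code):
            return code
    return None  # give up after max_attempts

result = agent("write an add function")
```

The design point is in the structure: the user talks only to `agent`, never to `call_model` directly, and it is the deterministic `verify` step that lets the loop run unattended until the output actually works.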
After all, far more people use chatbots than use agents, and I would make the case that most people are not using chatbots as much as they should! It’s been a question of agency: to get the most from AI requires actually taking the initiative to use AI; I wrote in 2024’s MKBHD’s For Everything:

Large language models are intelligent, but they do not have goals or values or drive. They are tools to be used by, well, anyone who is willing and able to take the initiative to use them. I don’t think either Brownlee or I particularly need AI, or, to put it another way, are overly threatened by it…The connection between us and AI, though, is precisely the fact that we haven’t needed it: the nature of media is such that we could already create text and video on our own, and take advantage of the Internet to — at least in the case of Brownlee — deliver finishing blows to $230 million startups. How many industries, though, are not media, in that they still need a team to implement the vision of one person? How many apps or services are there that haven’t been built, not because one person can’t imagine them or create them in their mind, but because they haven’t had the resources or team or coordination capabilities to actually ship them?

This gets at the vector through which AI impacts the world above and beyond cost savings in customer support, or whatever other obvious low-hanging fruit there may be: as the ability of large language models to understand and execute complex commands — with deterministic computing as needed — increases, so too does the potential power of the sovereign individual telling AI what to do. The Internet removed the necessity — and inherent defensibility — of complex cost structures for media; AI has the potential to do the same for a far greater host of industries.
It’s interesting to read that two years on: to realize that I was writing about the latest paradigm shift well before it happened, and yet to feel completely blown away by that paradigm shift all the same. That’s how big of a deal actually functional agents are: you can see them coming and yet still be amazed when they arrive — and, as one must say with everything related to AI, in a form that is the worst they will ever be.

It’s the implications for agency, however, that are the most profound: yes, you need agency to use agents, and yes, the number of people who will have that agency is probably far smaller than the number who might use a chatbot. Of course you can make the (almost certainly accurate) case that chatbots will become agent managers in their own right, but the more critical observation is that by abstracting humans away from direct model management, any one single human can control multiple agents. What this means in terms of compute — and by extension, economic impact — is that it actually won’t require that many people with agency to drastically increase the amount of compute that is actively utilized to create products with meaningful economic impact. In other words, the rise of agents doesn’t just mean a dramatic increase in compute, but also a narrowing of the need for widescale adoption by humans for that demand to manifest. Yes, AI still needs agency; it just doesn’t need agency from that many people for its impact to be profound.

Apple-focused media, in the wake of the recent MacBook Neo launch, latched onto comments from Asus CFO Nick Wu on the company’s recent earnings call describing the $599 computer as “a shock to the entire market”; equally interesting, however, was how Wu sought to downplay the Neo’s potential effects on that market:

Actually, we heard about the MacBook Neo shipments coming online back in the second half of last year. So we made some internal preparations.
But after the product officially released, we found the specs to have some limitations. For example, the memory is not upgradable, and it only has 8 gigabytes of memory. So this may limit certain applications. So I think when Apple positioned the product, it’s probably focused more on content consumption. This differs somewhat from mainstream notebook usage scenarios because in that case, the Neo feels more like a tablet, because tablets are mostly for content consumption.

This feels like a bit of a cop-out, given just how capable the Neo’s processor is, and how well macOS operates on 8GB of RAM, thanks in part to Apple’s deep integration of hardware and software; at the same time, Wu is tapping into something that is true, which is that most consumers mostly do just want to consume content (which, I would add, means he should be more worried about the Neo, not less). This is why your favorite productivity application always ends up pivoting to the enterprise: it is companies that are willing to pay for productivity, because they are the ones actually paying for the workers whom they want to be more productive.

It’s reasonable to expect this to apply to AI as well: the most compelling consumer applications of AI, at least in the near term, are Google and Meta’s advertising businesses, which sit alongside content. By the same token, it was always unrealistic for OpenAI to think that it could convert more than a small percentage of consumers into subscribers; that’s both why an ad model is essential, and also why that won’t be enough to pay the bills. It’s definitely the case that most people don’t want to pay for AI; it remains to be seen if they want to use it enough to make the ad model work. That is another way of saying that Anthropic got it right by focusing almost entirely on the enterprise market: companies have a demonstrated willingness to pay for software that makes their employees more productive, and AI certainly fits the bill in that regard.
What makes enterprise executives truly salivate, however, is the prospect of AI not simply eliminating jobs, but doing so in a way that makes the company as a whole more productive. It’s always been the case, even in large companies, that a relatively small number of people actually move the needle and drive the company forward in meaningful ways. That drive, however, has been filtered through a huge apparatus, filled with humans, who accelerate the effort in some vectors and retard it in others. That apparatus makes broad impact possible, but it carries massive coordination costs. Agents, however, will tilt much more heavily towards pure acceleration, making those drivers of value much more impactful. I’m sympathetic to the argument that the best companies will want to use AI to do more, not simply save money; the reality of large organizations, however, is that the positive impact of AI will not be in eliminating jobs, but rather in replacing hard-to-manage-and-motivate human cogs in the organizational machine with agents that not only do what they are told but do so tirelessly and continuously until the job is done.

This only makes the argument that we are not in a bubble that much more compelling: in this context, is it any wonder that every single hyperscaler says that demand for compute exceeds supply, and that every single hyperscaler is, in the face of stock market skepticism, announcing capex plans that blow away expectations? This is also why the impending wave of layoffs that is going to be credited to AI shouldn’t be completely dismissed as a useful cover for correcting over-hiring decisions in the COVID era, or right-sizing compensation structures in the wake of multiple contractions. That is all true!
At the same time, it’s worth considering that companies become bloated because that has long been the only way to scale, and it’s hard to know at what point the diminishing returns that come from the drag of coordination costs and a sprawling workforce outweigh the benefits of the marginal employee; you only find that point once you have blown past it, and it’s hard to go backwards. AI, however, not only gives the aforementioned excuse to undo that bloat, but also moves the “rightsize” point significantly towards a much smaller workforce. More and more companies are not simply going to wonder if they hired too much for a pre-AI world, but also whether they hired too much for a post-AI world; the most forward-looking and future-proof approach will likely be to cut more rather than less, with the hope that those who remain have no choice but to rebuild scale with agents. After all, if they don’t, dramatically smaller competitors built with AI from the beginning will soon be nipping at their heels, with both smaller cost structures and capabilities that will structurally increase over time. There is a good chance this is going to get ugly; I’m not advocating for this outcome, but rather analyzing why it is probably going to happen. The economic imperatives are going to be impossible to resist, and will fuel demand for even more compute over time, further supporting the case that this is no bubble.

Another important bubble question is about the sky-high valuations of Anthropic and OpenAI: sure, maybe all of this stuff is real, but if models are a commodity, is there any profit to be made? Horace Dediu raises these questions at Asymco and wonders if Apple is executing The Most Brilliant Move in Corporate History:

Here is where Apple’s bet becomes genius. AI models are commoditizing faster than anyone predicted. Software and hardware both have tendencies to commodify. Protections exist, but they have to do with integration and distribution.
DeepSeek built a model for $6 million that matches systems costing $100 million. Open source models now power 80% of startups seeking VC funding. The moat these companies are spending hundreds of billions to build is evaporating. Apple understood this before anyone else. It didn’t build its own AI model, it licensed Google’s Gemini for about $1 billion a year. Why spend $100 billion building a factory when outsourcing costs a billion? And if a better model appears next year, Apple just switches vendors…Apple didn’t miss the AI revolution. It just bet that the winners won’t be the ones who build the infrastructure. They’ll be the ones who own the customer, and no one else on Earth owns the best customers.

I think that nearly all of these assertions were defensible during the first LLM paradigm. It didn’t take long for multiple base models to be more than good enough for what most people use LLMs for, like, say, cooking or basic medical advice, or as a therapist or companion. Moreover, it was reasonable to expect that models of this quality would soon be able to run locally; I made the case that this was Apple’s opportunity myself, back when their own models — which they absolutely did try to build, contra Dediu — failed to ship.

The reasoning paradigm, however, blew a significant hole in the local inference case. Not only do reasoning models require fast compute, given the number of tokens generated, but they also need dramatically more memory to accommodate much larger context windows, which is the biggest limitation of local models. Apple makes incredible chips with a compelling unified memory architecture that makes basic inference more plausible for their devices than anyone else; there is still no scenario, though, where capable reasoning models that are remotely competitive with cloud-based models run locally in the foreseeable future. It is agents, however, that may strike the fatal blow to Dediu’s argument.
Specifically, I noted above that what made Opus 4.5 compelling was not the model release itself, but changes to the Claude Code harness that made it suddenly dramatically more useful. What this means is that model performance isn’t the only thing that matters: the integration between model and harness is where true agent differentiation is found. This is a very big deal when it comes to figuring out the future structure of the AI industry and where profits will flow, because profits flow away from modular parts of the value chain — which are commoditized — and flow towards integrated parts of the value chain, which are differentiated. Apple is of course the ultimate example of this: its hardware is not commoditized because it is integrated with its software, which is why Apple can charge sustainably higher prices and capture nearly the entirety of PC and smartphone sector profits.

It follows, then, that if agents require integration between model and harness, the companies building that integration — specifically Anthropic and OpenAI (Gemini is a strong model, but Google hasn’t yet shipped a compelling harness) — are actually poised to be significantly more profitable than it might have seemed as recently as late last year. And, by the same token, companies that were betting on model commoditization may struggle to deliver competitive products.

The canary in the coal mine in this regard is Microsoft. Microsoft once fancied itself an integrated AI provider, bragging on an earnings call about how its deep integration with OpenAI would mean sustainably differentiated infrastructure; a month later OpenAI nearly imploded, and Microsoft pivoted, talking increasingly about models as commodities and a Core AI strategy that entailed building infrastructure around models that would themselves be interchangeable and abstracted away from Microsoft’s customers.
Fast forward to last week, however, when Microsoft revealed how they will handle the potential business impact of AI reducing seats, which is a bit of a problem for their seat-based business model: the company is going to bundle AI into a new higher-tiered enterprise offering, E7, which is going to cost twice as much — $99 per seat per month — as the formerly top-of-the-line E5. That’s a big increase, which Microsoft needs to justify with AI that actually makes those seats more productive, and the product they launched with the new bundle was Copilot Cowork. If the “Cowork” name sounds familiar, it’s because this is basically the enterprise version of Claude Cowork, a GUI-ified version of Claude Code that Anthropic released earlier this year. There are important differences in the Microsoft version, including the fact that it runs in the cloud and is grounded in your organizational data, with all of the permission and access policies that go with it.

What is crucial, however, is that Copilot Cowork — unlike the Copilot chatbot — is not model agnostic: Cowork is an agent, which means it needs both a model and a harness, and those are two integrated pieces, not modular components. The implications of this are significant: Microsoft is admitting, at least for now, that delivering a truly compelling agentic product that enterprises are willing to pay for means abandoning their stated goal of being model agnostic; that, by extension, raises the possibility that models are not and will not be commodities, because agents require more than models. This certainly raises questions about Apple’s decision to merely license Gemini and build a harness themselves in the form of the new Siri. Microsoft decided that they couldn’t deliver a compelling product by going that route; what has Apple done to inspire faith that they can do a better job?
If anything, the company’s saving grace is the point that Dediu ended with: consumers may simply not care that much about agents, in which case Apple will be fine with good enough, even as Microsoft, with enterprise customers who do care, realizes it needs to share more margin than it might want to with Anthropic. What matters in terms of this Article, however, is that if agents are making Anthropic and OpenAI the point of integration in the value chain, then the bubble argument that these companies are overvalued, or that the massive investments other companies are making on their behalf in data centers are unwarranted, may not be correct.

I must, in the end, address my opening parenthetical: I’ve long maintained that there is no need to be worried about a bubble as long as everyone is worried about a bubble; it’s the moment when caution is flung to the wind and assurances are made that this is definitely not a bubble that we might actually be in one. And, well, I think the rise of agents means we are not in a bubble. The capex is warranted, and Anthropic and OpenAI look more durable than ever. If my declaring there is no bubble means there is one, then so be it!

The first paradigm required a lot of compute for training, but inference — actually answering a question — was relatively efficient: you simply sent the user whatever the model spit out. The second paradigm dramatically increased the amount of compute needed for inference, for two reasons: first, generating an answer required a lot more tokens, because all of the “reasoning” required tokens in addition to the answer itself; second, the fact that reasoning made the models so much more useful meant that they were used more, which drove increased token usage in its own right. It’s the third paradigm, however, that has truly tipped the scales in favor of capex not being speculative investment but rather badly needed investment in meeting demand that far exceeds supply.
First, generating an answer will often entail multiple calls to a reasoning model. Second, the agent itself needs compute, and that compute — and the tools the agent uses — is better handled by CPUs than GPUs. Third, agents are another step-function increase in usefulness, which means they are going to be used even more than reasoning models in a chatbot.

In short: first, all of the weaknesses of LLMs are being addressed by exponential increases in compute; second, the number of people who need to wield AI effectively for demand to skyrocket is decreasing; third, the economic returns from using agents aren’t just impactful on the bottom line, but on the top line as well.