Latest Posts (20 found)
Xe Iaso -3 days ago

Giving your Go apps Tigris superpowers

Tigris is S3-compatible, which means you can point the AWS SDK at it and most things just work. The catch is that the Tigris-exclusive features (bucket forking, snapshots, object renaming, and the like) need verbose workarounds because the AWS SDK doesn't know they exist. So we wrote a Go SDK that does. It comes in two flavors: one package is a drop-in replacement for the standard S3 client with first-class methods for the Tigris-specific operations, and the other is a higher-level client for the common single-bucket case that infers its configuration from the environment so you stop passing the same parameters over and over. You can adopt the Tigris features incrementally without refactoring your existing S3 code, and the simpler API still works against other S3-compatible providers. I wrote up how it works and why we built it over on the Tigris blog.

0 views

The Archivist In Me Turned This Blog Into a Book

Four years ago, in the article What Happens To My Digital Identity When I Die? , I wrote the following prophetic words: […] Which gets me back to this website. My intentions are to someday publish its contents in the form of a book, which can also be stored at the KBR [Royal Library of Belgium]. This allows the people dear to me to still have access to the silly stuff I write here. Two years later, I claimed that Good Blogging Habits Yield a Book Each Year of more than words. That means compiling a hefty tome to compress all these years of productive blogging into a single physical volume might be a bit more challenging than I initially anticipated. Yet not impossible. So the last months, I’ve kept myself busy by doing just that: turning this blog into a book! The book flipped open on the blog post 'Three Little GameCube Mods' from 05 December 2021. This was a special experiment with a high probability of failure as I wasn’t sure how it would turn out. What should a Brain Baking book look like compared to browsing this website? How should it feel to flip through the pages? Will I be able to squeeze everything in there (nope)? Do I want to publish this publicly or just generate a , send it to the presses just for myself and call it a day? And if so, which service to use, as the past ones I’ve relied on all showed their shortcomings? Luckily, it turned out all right. I call it Brain Baking DX: Blog Archives 2016 - 2026 and it is available at Amazon under ISBN-13 number 979-8197112897 . My first attempt yielded more than 500 pages and I couldn’t find a publishing service that was eager to print something like that for less than . At Amazon, the book costs… . And it’s globally available, should you be crazy enough to want a copy. Be warned, though: the book is mostly unedited , I created this mainly for myself. I sent out a few copies to friends but have no intention of setting up a marketing campaign let alone making money off of it. 
I intentionally set the price low and receive for every sale. Please do not buy this just to support me. So why a book? As mentioned before, I want my writing to be a bit more permanent than the fleeting medium called the internet. What happens when my VPS is blown up, my backups burned away, and my motivation to restore all this along with it? In Belgium every author of “proper” books (this is debatable nowadays… Is Brain Baking DX a proper book?) is legally obliged to deposit two copies to the Royal Library in Brussels, where the books disappear into the winding depths of the archive deep below the capital. Plus, I like books. I like flipping through this one and rediscovering old writings: it feels very different than clicking through the online archive. Also, since I like to add photos in my blog posts to help shape the atmosphere, preserving these mostly personal photos in the book makes me feel warm and fuzzy inside when I flip through the book and look at them. Some of these photos are snapshots of my life as a kid, my old and new desktop setup, destinations I once biked to, etc. It’s a nice memento to have these included. Why Brain Baking DX ? What’s up with that? Most readers of this blog know that I grew up with a Game Boy and became a big retro gaming nut(case) because of it. The black DX cartridge editions transformed their original Game Boy release into the wonderful world of colours: black carts work with the original Game Boy and the then new Game Boy Color. DX was simply the “DeluXe” treatment to your beloved Link’s Awakening or Tetris . Funny, as Brain Baking DX might be called anything but deluxe: financial constraints prevent me from publishing this book in full colour mode and space constraints prevent me from simply dumping everything that I’ve ever written in here. Perhaps both are for the better. Still, in a way, a printed edition of ten years worth of Brain Baking blog posts can certainly be called deluxe. 
My method for compiling the book wasn’t as simple as throwing every Markdown article source file at Pandoc to compile a single . I didn’t want to preserve everything I post here: it had to be a deliberate, curated selection. Things I didn’t want in there include:
- Research and topics regarding creativity & bread baking: I have other published books that delve into this.
- Monthly link sharing posts and other posts that are mainly lists or links.
- Technical posts on programming, coding, Hugo tips, etc.
- Design mistake posts.
- Overly negative posts.
- Anything that has the word “AI” in it (except my more elaborate commentary).
- Posts too short to be worth printing.
- Posts too photo/screenshot-intensive to be worth printing.
After making a first selection, I categorised the posts into major themes that became the parts of the book: parenting, journaling & writing, work, the web, technology, retro, video and board games, life & philosophy, food & cooking, and living in Belgium. Then I employed my usual Markdown/Pandoc/TeX magic and inspected the results. [Figure: The front cover of Brain Baking DX.] 600+ pages. Ouch. Now what? Maybe it is time to think about the layout: how do I want to present all this text? Clearly, a typical book layout won’t do. I turned down the font size, opted for a two-column layout, selected a wider book format ( ) and squeezed everything I could out of those margins. As a last resort, I also allowed chapters to start on any page (as opposed to the right page only, which introduces a lot of blank pages). I admit I might have overdone it a bit as the top margin is very thin, but all these changes did reduce the page count to a more manageable 470. After ordering a few copies for myself to inspect the result, I was afraid that the text would not be very readable, or that the margins where the book would be glued would be too narrow. Fortunately, the end result is surprisingly pleasant to read. The cream paper Amazon provides is a nice match although the paper feels a bit too thin for my taste. Yet a hefty tome like this for is ridiculously cheap so I can’t complain. The fact that the book is printed in black & white does not work against the many photos and screenshots included. Additionally, because it’s Amazon, it allows the book to be distributed and printed virtually anywhere. My copies were printed in Brétigny-sur-Orge in France. Better than China!
When you are selecting blog posts to be included in the book, you’ll notice recurring themes you wrote about. For example, I have wasted too many words on physical video game collecting. Instead of just pasting these chapters next to each other, I wanted them to “flow” better in the book so I did rewrite portions to better match the medium. Also, on many occasions, a new chapter (thus blog post) starts with a reference to the previous one. On the site, this is just a link, but on paper, you don’t want to print “in this article”. Speaking of links, a blog or website is an interconnected medium: how to approach this on paper? I ended up putting all LaTeX links in the margin footer on the same page but did a diagonal sweep to remove the excessive ones. On the site, a long link is just hidden behind a click, but on paper, a link to a long URL is not only ugly but will never be typed over or “used” in that way. Also, internal Brain Baking links usually start with . In the end, I decided to keep it that way as prepending everywhere would mean even more text wasted. I did make a note of this in the newly written introduction. Besides the “in this post” link adaptations (don’t do this: it’s also bad for accessibility in your online blog!), I noticed I also had to do something about the images. Because of the two-column layout, wide figures such as graphs will be squeezed into a barely readable square. You can fix this by manually adding a to the ones you want to be displayed as a full-page spread ( ). But I made another mistake: in many posts, I write something and then add an image to emphasise the statement, ending in a semicolon to point to the image. Yet in a book, you never know precisely where that image will be included! In my future writings, I’ll take these things into account to more easily compile Brain Baking DX II in ten years.
This was a lovely month project that rewarded me with a physical artefact of an ever-evolving digital medium, solidifying words, sentences, and paragraphs in a way that might even make The Internet Archive envious. As a hopelessly sentimental person, flipping through the book, looking at the figures and reading the text makes me happy. And also embarrassed, as there are plenty of contextual and grammatical mistakes solidified in there as well. I’m looking forward to revisiting the project in ten years! If you want to attempt something like this for yourself and don’t know how to approach this technically, drop me a line and I’ll be more than glad to help you out. Related topics: archiving. By Wouter Groeneveld on 5 June 2026.

0 views
Xe Iaso Today

IPv6 zones in URLs are a mistake

IPv6 is weird. One of the more strange parts of the standard is that every interface's link local addresses are in . If you have a machine with two network interfaces, both of them will be in , so if you have a packet destined to , how do you disambiguate it? The answer is you use IPv6 scopes/zones . The exact format of what goes into a zone is OS dependent, but on Linux it's the interface name and on Windows it's the interface ID. This lets the kernel's routing table know how to handle an address range conflict. On my tower, this would be represented like this: Where is the name of my tower's ethernet device. When you create a host:port bindhost, you normally separate the hostname and port with a colon. IPv6 uses colons to separate hex groups. In order to disambiguate what's the host and what's the port, you typically format the IPv6 address in square brackets, so on port 80 would look like this: And with the right scope it looks like this: Now let's get URL encoding into the mix. From high orbit, you can imagine a URL's format as being something like this: An IPv6 zone would then be part of the hostname, just like with that port 80 example from earlier. So you'd think the URL would be something like this: But if you try to parse this as a URL in Go, you get an error: This happens because URLs can't represent all Unicode values, so any values that don't fit into the grammar of a URL become percent-encoded . This is why sometimes you'll see a in URLs in the wild; that's encoding the ascii space key, which is invalid in URLs. In order to work around this, you need to percent-encode the percent sign in the IPv6 zone: In theory, there is guidance for how to properly handle IPv6 zones in user interfaces in RFC 9844 , but there's no such guidance for URLs . Go also does not seem to follow this RFC in net/url . EDIT: It seems that this behaviour is compliant with RFC 6874 and that this is in fact how it is meant to be done. Our industry confounds me. 
So in the meantime, in order for Anubis to point to IPv6 zoned addresses, you need to percent-encode the percent sign in the zone. This is horrible, but it seems that this is an edge case that applies to other frameworks, programming languages, and libraries:
- https://trac.nginx.org/nginx/ticket/623
- https://github.com/psf/requests/issues/6808
- https://datatracker.ietf.org/doc/html/draft-schinazi-httpbis-link-local-uri-bcp-03 : Browsers don't currently support IPv6 zones because it breaks the concept of an "origin", which is used for many subtle things. This RFC draft attempts to define a zone origin in IPv6 so that browsers have a leg to stand on.
Maybe some day in the future there will be a better option here. In the meantime my policy of not forking the Go standard library means that this somewhat terrible UX for an edge case is acceptable. I hate it, but what can you do? TL;DR: computers were a mistake.
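As a rough illustration of the percent-encoding workaround described above, here is a minimal Python sketch that assembles a URL for a zoned link-local address. The address `fe80::1`, the zone `eth0`, and the helper name `zoned_ipv6_url` are placeholders made up for this example. The sketch only builds the string; it makes no claim about how any particular URL parser treats the result.

```python
from urllib.parse import quote


def zoned_ipv6_url(address, zone, port, scheme="http", path="/"):
    """Build a URL whose host is an IPv6 literal with a zone identifier.

    The "%" that separates the address from the zone is written as "%25",
    which is the form RFC 6874 describes for URLs.
    """
    # Percent-encode anything in the zone that is not an unreserved character.
    encoded_zone = quote(zone, safe="")
    # Square brackets keep the colons in the address apart from the port colon.
    return f"{scheme}://[{address}%25{encoded_zone}]:{port}{path}"


if __name__ == "__main__":
    # Hypothetical values: a link-local address on an interface named eth0.
    print(zoned_ipv6_url("fe80::1", "eth0", 80))
```

With those placeholder values the helper returns `http://[fe80::1%25eth0]:80/`, so the literal percent sign never appears unescaped inside the brackets.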

0 views
Unsung Today

“Then suddenly we were boring, bloated, and not particularly interesting.”

In 2021 and 2022, product manager Steven Sinofsky wrote a… …first-person account of what I saw at the PC revolution from the perspective of joining Microsoft as a newly hired software design engineer fresh from graduate school working on developer tools, through my time as a program manager and ultimately leading Office, and then moving to Windows, and everything in between. Sinofsky called the series Hardcore Software: Inside the Rise and Fall of the PC Revolution . It covers 1989–2012 and somewhere inside over 100 chapters, there is a fascinating six parter about the “ ribbon ” redesign of Office 2007. The first part covers the challenge of the team in 2007, taking stock of Office after almost 25 years of its evolution. (Number of toolbars in 1983: one. Number of toolbars in 2003: 31.) The second part shows great screenshots of all the Office versions from 1.0 until then, and the remaining four cover the Ribbon redesign process. Regardless of how you feel about Microsoft Office today, and whether you consider the Ribbon interface a success, it’s a perfect weekend read as it covers universal challenges of software complexity and change management. It’s such a potent series I’m sure we’ll come back to it. It covers a lot, including – in the first part – wrestling with a definition of bloat or complexity, which in the context of Office was less about the number of functions available, and more about mastery: […] In practice, bloat comes from the fact […] that Office does so many things that customers just assume the product can do whatever they need it to do. Despite that fact, customers have no idea how to make the product do what they need. This feeling of helplessness that leads to frustration. […] Bloat is owning a product that you cannot master. 
This below is a great observation about the perils of an idea of a “simple mode,” which Sinofsky argues is always a leaky abstraction : We tried reducing bloat by hiding features […], but that only added to the mystery of the product. Mac, Windows, and Office all went through periods of “simple means fewer” and tried mechanisms such as short menus, simple mode, or adaptive toolbars. But that frustrated or confused people. No one really wanted to use a simple mode and there was always one command missing that was needed, so simple mode became a complicated way to do that one thing that made someone’s work unique. It was great to see this argument for a broad definition of a bug, as it slides exactly into my post from a while back : Ages ago in ancient Microsoft history there was a debate on the original apps team about what it means for something to be a bug. Is it a crash? Is it data loss? Is it a typo in an error message and so on? Out of that was created a notion of bug severity, a measure for how serious a bug might be from losing all data all the way to simple cosmetic issues. However, when it came to talking about bugs with product support or ultimately customers the definition of a bug was very simple “a bug is any time the software does not do what a customer expects”. This definition created a discipline of documenting everything reported about the product and always making sure every issue was looked at, even if a code change did not result. The key lesson was how helpful an expansive definition was. There are also observations and research about how users “debug” the product to make it achieve something they know is possible, but they don’t know how: We called the futzing document debugging, and it created a frustration that the product was powerful yet overwhelming. People believed a specific result was achievable but getting from point A to B seemed impossible or unlearnable. 
And some about the challenges of figuring out what features people use: […] Most people didn’t know or care what buttons they clicked on or menus they chose so long as it was working for them—and that meant when asked, “Did you use X?” most people couldn’t recall. To a skeptical press or IT manager (and they all were) that meant unused features. I should stop quoting and let you read in peace. But, check this out. Lisa wasn’t the only one having linguistic fun: Early keyboard shortcuts were simple, like using Ins(ert) key to copy text from the scrap (clipboard). #case study #complexity #definitions #flow #software evolution What is Software Bloat, Really? A Tour of “Ye Olde Museum Of Office Past” Competing Designs, Better Design Progress From Vision to Beta First Feedback and a Surprise Defying Conventional Wisdom to Finish Office

0 views
Chris Coyier Yesterday

The New Van

I got something I’ve wanted for years and years! A camper van! I’m a camper van guy now! It’s a Mercedes-Benz Sprinter, and even more technically a Winnebago Revel. I scoured Craigslist, Facebook Marketplace, and RV-specific sites for a long time drooling over these things. I’ve rented a half dozen of them on Outdoorsy over the years. I’ve borrowed friends’. So I feel like I knew what I wanted and I knew the specific price range I could go, so it took a little while to find it. Ultimately one that came up on Craigslist led back to one sitting on the lot at a local spot called Just Used Cars. I liked the size of it. Just a normal length, not “extended”. Plenty of height to stand up in. I like that it’s an actual Sprinter base because of its nice poise/stance compared to other van bases. Also it’s 4WD and has good ground clearance. That wasn’t a requirement for me, but this will make me trust it driving in the winter and up to Mt. Bachelor and such, which will be really nice. Although funnily enough, I’ve already gotten it stuck once out in Pacific City at the beach, even in 4WD Low and using the traction boards on the roof. Sand is rough. I like the tan color as well. Maybe I’ll get some cool decal or have an artist paint the side or something someday. I wasn’t specifically looking for a Winnebago Revel, but that’s just how those roll, and honestly, the Winnebago name sounds nice to me. Long history making campers, obviously. The interest rate on this thing was horrible. It’s 8 or 9% or something. It doesn’t really matter, as my plan is to pay it off in the next few months. My thinking is that it will do great things for my credit score this way. We’ll see. I co-signed for a “normal” new car just recently, and that rate was 3%, which seems fine/good. I started writing a bunch more little stories about the van. I’ve been using it and thinking about it and working on it a ton, so there is a bunch to say.
But I think I’ll break those out into smaller blog posts as I go! One quick one: after I bought it, the dealership called and told me the previous owner wanted to talk to me. I approved them giving him my phone number, we chatted, and he came over to see the van. I was able to return to him some things that belonged to him tucked away into crannies all over the van. He was a nice guy who just really really wanted the new owner to understand it . All the little details about how it worked and where you can put things and quirks and whatnot. We spent a few hours going over things. I really appreciated that, and it shows how attached some people can get to these homes-on-wheels.

0 views

Broker-Visible vs Client-Local Parallelism

This post is a little side-quest from my “Kafka Share Groups and Parallelizing Consumption” series. My “Kafka Share Groups and Parallelizing Consumption” series ( part 1 , part 2 ) has been laser focused on how different configurations and behaviors affect parallel consumption in share groups (Queues for Kafka). So far I’ve shown that you most definitely can hold share groups wrong . You could quite easily and inadvertently create a work queue and with the right combination of things going against you, see a small number of consumers dominate, leaving most consumers starved of messages. All the while lag builds and builds. You need to know the settings and what they do. Don’t just rely on the defaults. But it’s worth asking the question: is parallelizing consumption what share groups are for? The answer is no. If your only concern is parallel consumption, then there are other options. Chuck Larrieu Casias wrote a good post on LinkedIn pointing out that people shouldn’t be thinking of share groups as THE solution to parallelizing work (without exploding the partition count). Share groups exist to expose queue-like semantics over a log. Unlike a normal consumer group, a share group lets you accept one record and reject another for retry. A consumer group tracks one committed offset per partition. A share group has to track many individual records independently: which records are available, which have been delivered (to whom), which have been acknowledged, and which should become available again. But just because share groups don’t exist primarily to parallelize work doesn’t mean it’s not a tool that can be used for that purpose. If your messages are independent or you are otherwise ok with loose ordering then share groups could be a simple choice for breaking away from partition count as the unit of parallelism. The central theme I took from Chuck’s post is that parallelism has to be accounted for somewhere . 
The unit of parallelism can be broker-visible and broker-managed, or client-local and client-managed. Broker-visible/managed can only take you so far. When you need to process 1,000 messages in parallel to cope with the producer rate, what represents those 1,000 parallel units of work? Is it partitions, consumers, virtual threads/async tasks? If the unit of parallelism is the consumer itself then we must scale out serial consumers to scale the parallel processing (with a matching partition count with consumer groups). Every parallel unit of work (consumer) becomes visible to the broker as protocol interactions and state plus one or more TCP connections. If parallelism comes in part from the client itself, the unit of parallelism could be a virtual thread, an async task or even an OS thread. This is invisible to the broker. You need fewer consumers, fewer TCP connections, and less broker-visible protocol interaction/state. This split of where the unit of parallelism is accounted for, broker-side vs client-side, exists across all messaging systems. It’s not specific to Kafka. The calculation for aggregate parallelism is simple: multiply the message rate by the processing time per message. Some examples:
60000 msg/s * 1s = 60000
60000 msg/s * 5s = 300000
100 msg/s * 20s = 2000
10000 msg/s * 0.5s = 5000
50 msg/s * 5s = 250
Once you know how many messages must be processed in parallel, you can figure out your tactics. The formula tells you how much parallelism you need, then it’s up to you to figure out where that parallelism should live. Let’s use our 60,000 messages per second workload from the share group series. If it takes 1 second to process each message, then we need to support 60,000 messages being processed at any given moment. If each unit of parallelism is a serial consumer, then that means 60,000 consumers! That’s a lot of connections, a lot of protocol state, and a really big consumer group. If it takes 10 seconds on average to process a message, you’d need 600,000 consumers, and well over 1 million TCP connections!
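The calculation above can be sketched in a few lines of Python. The function names and the `per_consumer_concurrency` parameter are inventions for this example; the rates and processing times are the ones listed above.

```python
import math


def required_parallelism(rate_per_s, processing_s):
    """Messages that must be in flight at once: rate multiplied by processing time."""
    return rate_per_s * processing_s


def consumers_needed(rate_per_s, processing_s, per_consumer_concurrency=1):
    """Consumers needed if each one handles per_consumer_concurrency messages at a time."""
    in_flight = required_parallelism(rate_per_s, processing_s)
    return math.ceil(in_flight / per_consumer_concurrency)


if __name__ == "__main__":
    workloads = [(60000, 1), (60000, 5), (100, 20), (10000, 0.5), (50, 5)]
    for rate, seconds in workloads:
        print(f"{rate} msg/s * {seconds}s = {required_parallelism(rate, seconds):g}")
    # Serial consumers (one message at a time) for the 60,000 msg/s workload.
    print(consumers_needed(60000, 1))
    print(consumers_needed(60000, 10))
```

Leaving `per_consumer_concurrency` at 1 models serial consumers, which is where the 60,000 and 600,000 consumer counts come from. Raising it models moving part of the parallelism into the client.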
If most of the work is I/O, and the CPU spends a lot of time waiting around, then can’t we make a single client do more work? What if one client can handle processing 1000 messages in parallel? Then we’d only need 60 consumers for the “60K msg/s + 1 second processing time” example. Fig 1. Left: Parallel work across N serial consumers. Right: Parallel work across N parallel-capable consumers. If the ultimate unit of parallelism is visible to the broker as something it must manage, it can get really expensive in resources for highly parallel workloads (no matter which messaging system you use). Managing virtual threads, or even OS threads, is much cheaper than managing one or more TCP connections + metadata per unit of parallelism. This is true of all messaging systems I have ever used. The cost is greater complexity on the client, but if you don’t want to roll your own logic, there are libraries to help here (see Chuck’s post for some). Unfortunately, the ParallelConsumer library is no longer being maintained (though a fork might appear in the future). This library not only added internal client-side parallel processing but queue semantics as well (on top of consumer groups). Now that we have share groups, perhaps we need a new library that adds client-side parallelism to share groups. I’m going back to writing Part 3 of my parallelism in share groups series. We’ll be comparing broker-managed vs client-managed parallelism with share groups and consumer groups.
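To make the idea of client-local parallelism a little more concrete, here is a minimal Python sketch in which a single client processes many messages concurrently using async tasks. It involves no Kafka client at all: `process_batch` and `slow_io_handler` are hypothetical names, the messages are plain integers, and the sleep merely stands in for an I/O-bound call. Acknowledgement, retries, and ordering are deliberately left out.

```python
import asyncio


async def process_batch(messages, handler, max_in_flight):
    """Run handler over messages with at most max_in_flight tasks at once.

    Returns the results in input order plus the peak number of tasks
    that were running at the same time.
    """
    semaphore = asyncio.Semaphore(max_in_flight)
    in_flight = 0
    peak = 0

    async def run_one(message):
        nonlocal in_flight, peak
        async with semaphore:  # caps the client-local parallelism
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await handler(message)
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(run_one(m) for m in messages))
    return results, peak


async def slow_io_handler(message):
    """Placeholder for I/O-bound work such as a database or HTTP call."""
    await asyncio.sleep(0.01)
    return message * 2


if __name__ == "__main__":
    results, peak = asyncio.run(process_batch(range(20), slow_io_handler, 5))
    print(len(results), peak)
```

From the outside this still looks like one consumer: the semaphore limit, rather than the number of connections, decides how many messages are in flight.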

0 views
Kev Quirk Yesterday

It's Just Broken: Oh WordPress

by Pup On Tech. In a recent post, Pup On Tech perfectly captures the absolute nightmare that is building a self-hosted WordPress site. What starts as a simple VPS setup quickly devolves into a bloated mess of heavy themes, dozens of conflicting plugins, and rigid page builders. By the time you’ve fought with broken caching layers and terrible performance, you realise that fixing the bloat defeats the entire purpose of using WordPress. WordPress really is a nightmare, and this post by Pup On Tech really encapsulated that! Should have just used a flat-file system or an SSG from the start. 🙃

0 views
Stratechery Yesterday

An Interview with Microsoft CEO Satya Nadella About Finding Core Competencies

Listen to this post: Good morning, This week’s Stratechery Interview is with Microsoft CEO Satya Nadella . I have previously interviewed Nadella in May 2024 , October 2022 , April 2020 , and May 2019 . As I noted yesterday , I spoke to Nadella shortly after the conclusion of his keynote at Build , Microsoft’s annual developer conference . One notable thing about the keynote was the fact that Nadella was — outside of product demos — the sole presenter; one gets the sense he has shifted into a much more hands-on role at Microsoft over the last year. The reasons why are clear: my first question to Nadella was if he was happy about where Microsoft was currently positioned as a company. We talk about the reasons for that question, the status of the company’s partnership with OpenAI, and whether Microsoft has invested sufficiently in AI infrastructure. Then we talk about the future of software, Microsoft’s business model in the age of AI, and if they can operate independently from the leading edge models. At the end we talk about Project Solara and whether Microsoft will ever pay residents to build data centers. One note, with regards to a misunderstanding towards the end of the interview: there is no documentation I could find about being able to use Copilot Cowork with non-Anthropic models; Microsoft’s own documentation fits my understanding. As a reminder, all Stratechery content, including interviews, is available as a podcast; click the link at the top of this email to add Stratechery to your podcast player. On to the Interview: This interview is lightly edited for clarity. Satya Nadella, welcome back to Stratechery. SN: It’s great to be with you, Ben. So first off, I don’t know if you realize this, but at least according to my daughter, the defining word for the real grinders in Gen Z — first off, LinkedIn is like the social network. SN: That’s great! 
Number two, the word they all use is “build”, “I’m building, I’m building”, so who knew when I was at the first Build, I think, in 2010? Or was it 2011? Who knew you were such a trendsetter? SN: (laughing) There you go, I’m thrilled that your daughter is building and is on LinkedIn. Yeah, well, I’m not sure if she’s on there, she’s more making fun of people, so we’ll see how it works. We last talked the summer of 2024 after Build, this was up in Seattle. To say a lot has changed since then is an understatement. I had a bunch of questions I wanted to ask you about the business as a whole, things going on, I’m going to start with those, then I have questions about the presentation at the end. But relative to that, I want to ask you one simple question: Are you happy with Microsoft’s current competitive position? SN: You know, always this is the trickiest thing, you can sit here and say, “I’m happy” — that means you’re not ambitious enough and when you say, “If you’re not competitive, what the heck are you doing?”. And plus you have like 57 different product lines. SN: I’d say the thing in these platform shifts in particular is to, one, get the conceptual model of, “Where is the opportunity for us as a company?” — most people measure competitive position as if it’s a complete zero-sum game, and it’s never been the case. Which is, it is not the case with the cloud, it is not the case in client-server, and so to me, “What is Microsoft uniquely capable of doing in this new world” — that’s the key thing that we have to answer before we even get to the competitive position. In that context, “What is it that we really have a shot at?”, which is we can be a trusted purveyor of a platform, which is what we’ve always done, that allows people to create more value on top of a platform, which is again the DNA we have. Even in a world where these frontier models seem to have no limit— A very large appetite. SN: They have large appetite. 
That is what I feel even this Build , this conference, we are at that state where we can now really turn this from any one frontier model to saying, “Hey, there is actually a way for a frontier ecosystem to emerge where there are many stakeholders who all actually are operating with their own frontier intelligence”, that is a place where I think we have a unique shot, a unique competitive angle, and most importantly, brand permission. This is the other thing I’ve learned, Ben, which is every company thinks they can do everything, and then they realize that the world doesn’t need them to, the world wants them to do the one thing. Is that a lesson that you had to learn? SN: Yeah, absolutely. I’ve always said this, at Microsoft we are at our best when we do what the world expects us to do, we are at our worst when we do things out of envy, which is just because somebody else had some cool hit, somewhere, doesn’t mean we should go do that. But enough about the Zune, right? SN: (laughing) Yeah, Zune was a great device, but the world didn’t need Zune from us, and so that was the end of it. This identification of your unique capabilities, is that one of the changes over the last two years where that has emerged? SN: Yeah, in fact, it has emerged and also the world’s kind of gotten to it. Has it been forced on you to an extent? SN: Yeah, even my own conceptual understanding, I started by thinking of, “What are models?”, models are kind of like some stateless APIs, then I adjusted and said, “Oh, maybe there’ll be like databases” — they’re really more than that. I don’t remember talking about this with you, but last time I talked to [Microsoft CTO] Kevin [Scott], we analogized it to processors at some point, and you actually did make a comparison in terms of the partnership to your partnership with Intel. SN: Exactly. 
So the question now is, it’s a better conceptual model to think of what we’re doing is you have to really build a learning machine, and any company has to build a learning machine, so what I want to build is essentially a multi-tenant learning system that allows everybody to have their own hill-climbing machine. So that conceptual idea, now I’ve turned what is essentially frontier is not about any frontier model — I want to build whatever you did with M365 or with Azure into a platform which allows everybody to basically build their own hill-climbing machine, right? Because the future of a firm, at a foundational level, is that they’ll have human capital, they’ll have token capital, and for the token capital they need their own hill-climbing machine. All right, so I’ll jump to the end, you released seven new models, you emphasized the work you’ve done to build these models from scratch, not with distilling, not with using other models as teachers — so did you just articulate what the ambitions are with these models? SN: Yeah, there are two sets of things. One is we wanted to build from ground up with clean lineage, the models that we will have that we can license and allow enterprises to continuously hill-climb, so that’s why we want that model. By the way you talked about distillation — the point is to not use distillation during any of our own hill-climbing but at the very end, in fact some of the things that we are doing is, after all, we have all the OpenAI IP, in fact some of the performance gains we get is by doing RKLD, which is reverse knowledge distillation, and RL on top of it. So we have effectively two frontiers, we have our own, we have the OpenAI, and we’re going to use these things to eval match. And the clock is ticking to get to the right state you need to be while you still have that access. SN: Yeah, and there’s five years of it. 
But the bottom line is at any given point in time, I want to make sure that I’m using the best, most efficient model for whether it’s in coding, whether it’s in security, making sure also in our case, we’ll have a harness that’s independent of these models, we have the GitHub Copilot harness that’s used everywhere across Microsoft. Our goal is to make sure we have a model lineage, which we control end-to-end, we then use OpenAI IP, even with all of the capability it has — ultimately, the tests are going to be the evals for us and our customers. In the long run, the way it was framed today, and I thought it was very compelling, and it speaks to what you just said, was this idea of enterprises being able to take these models and in their own RL environments incorporate their data at a much deeper level than sort of a slap-on RAG implementation or basic post-training. Is that the end goal, though? SN: Yeah, the end goal for me is the following, which is I go back and say, let’s say that they’re a generalist model — if you go back even, Windows could have a release, then another release, and Adobe and Autodesk could keep building and keep going up, what’s the moral equivalent of that? That is the thing. And then in the first time, we said fine-tuning, it kind of didn’t work because we didn’t have the tools, we didn’t have the data collection regime, none of that. But now we have it. So let’s say the generalist models keep getting better, MAI models, let’s say, or OpenAI models, then you have this RLE. Right, but this deep customization of the models you’re talking about is only possible with MAI models. SN: That’s correct, but the thing that we want to start getting everyone on is this multi-tenant hill-climbing system — so if you think about it, we literally turned your use of M365, which already is a multi-tenant system, into a hill-climbing system for you. 
Okay, I’m gonna have to stop you, I’m going to give you an ELI5 opportunity, explain hill-climbing to the audience. SN: Hill-climbing is basically when you think about, “What does AI do?” — AI is all about taking an objective and continuously learning how to go predict and create that output that is the representation of that objective, and do so continuously. So that’s why a metaphor of hill-climbing is the best way to describe learning. And you want everybody to do this individually on their own hill. SN: Individually on their own. As opposed to like, hitching along. SN: What is your moat as a company? Your moat as a company is your tacit knowledge. In a world where AI exists, and network effects of AI exist, you need your own hill-climbing machine in which the models are learning. So the first thing we want you to do is, people don’t talk enough about this, but the private outputs, the evals, as I think about as, maybe the most important IP a firm creates are these private benchmarks and the private evals where you are tastefully recognizing what’s the output, the quality. And by the way, today’s failure cases are informing you to change the benchmark continuously, it’s not a static thing, that’s kind of how the evals work. And so if you have your private evals, then you have your own reinforcement learning environment that you’ve created, then you invite all the models to show up, and then you say, “Model A, generate the output that is maxing this eval using my environment and my trajectories and model B…”, and I can switch. In that context, the MAI models is one more lineage that you can put into, and what we proved today was even a very efficiently trained reasoning model or a coding model can hill-climb using your traces and that will be more token-efficient and it will be fundamentally a great advantage. Exclusive to you the customer. SN: Yeah, that’s right. But is that just for now? 
If you fast-forward, is your vision that actually MAI models are fully competitive on the frontier with the other general models? SN: They are. Even today, when you start saying that — the world will keep getting better in general. Well, I guess this goes back to, is this about how you need to do what you’re good at? SN: Correct. One, what we’re good at and also what’s the equilibrium of the world? Which is, if you believe there are only going to be two firms in the world, then of course, they only need two frontier models, but if you fundamentally believe that there are going to be as many firms as there are today and more, then what is the firm in the age of AI? It’s going to have human capital and token capital, how did that token capital get created? It’s not a bunch of API calls, it’s actually some set of weights even they have. Right. And so do you want to accrue that advantage or do you want to give it to OpenAI and Anthropic? Well, speaking of the OpenAI partnership, I mentioned you referred to it like the Microsoft-Intel partnership, and sometimes partnerships are the only way to get ahead. How do you think about that partnership now? SN: I still think that it’s — I’m very proud of the fact that we came together, you remember the circumstances in which we came together were very different and the fact that there is a company now that may go public and be a trillion-dollar company— This is my question — how long were the knockdown, drag out fights between in this corner, there’s Satya Nadella, the operator, and in this corner, there’s Satya Nadella, the investor, tussling over what to do? SN: (laughing) At the end of the day, we are an operating company, investment is just more of an accident. Yeah, but the shareholders are ultimately those investors! SN: I’m glad and it’s a fantastic outcome for our shareholders too and what have you. 
But I think the way I came at this, Ben, is to say genuinely I’ve always approached it as, if there’s a partner that we can partner with and ourselves innovate, and they’re also successful, that’s fantastic. I always go back to the story of having built SQL Server with SAP. SAP was successful, we were successful, we also then went on to do other things. And so therefore, I think OpenAI, I’m glad we worked with them, we’re working with them, they continue to be a premier partner. As I said, until 2032, we still have a lot as a customer of theirs, them as a customer of ours, as an IP partner. So every day OpenAI does well, Microsoft does well. Is there a bit where everyone thought you were so far ahead because of your partnership with OpenAI, and now when we talk about things like your MAI models, it’s like actually “We got a little bit lulled to sleep because we offloaded too much to them, and now we’re having to recalibrate”? SN: Lots of things, one is, like all things, there’s a lot more competition, there is OpenAI, there is Anthropic, there’s Google, there is tons of folks who are in there. And so I think for us, the beginning, it was great that we got started with OpenAI. Think about where we were in 2018 to where we are in 2026, here we are competing with Google and a bunch of people whose names I wouldn’t have known in 2018, and so that itself proves that to your very first question, “How competitive is Microsoft?” — I’m glad Microsoft took that shot. Here we are competing with a bunch of new people, a bunch of old people, and we have our own game. So we already talked about Satya Nadella, the operator, and Satya Nadella, the investor. What about Satya Nadella, the capital allocator? There were a lot of reports in about early 2025 about Microsoft pausing and reconsidering some data center investments, you guys have sort of spun that as, “Lots of speculative stuff”, “We’re streamlining”, etc. 
— but at the same time, your percentage of free cash flow committed to CapEx lags fairly significantly behind your peers. Four months ago, that was a compliment. Now, is it a diss? How are you feeling about that? SN: The last time I checked, my free cash flow is getting allocated pretty well to capital return that makes sense. Is there a case that you’ve underinvested? SN: Not really. I think the key thing that at least we wanted to make sure is we were not upside down on building — we have a hyperscale business, we have our own application business, and we have our own research compute to allocate, there are three buckets, we wanted to allocate with great discipline on all three. So take the hyperscale business. Hyperscale businesses are about having a few big customers, but also having a massive long tail, so you can’t have a book of business that is just a few model companies — in fact, one model company — that was the fundamental decision. And you wanted to get out of that business. SN: Not just get out. They’re still there, they’re a major tenant. SN: They’re a major tenant. But, let’s face it, Anthropic over time or OpenAI over time will build their own, it makes sense. They would use — I’m not saying that they won’t use other cloud providers. So to me, it was clear as day that, what I wanted to do was not allocate all my compute only to one player and so that was the adjustment. And once you make that adjustment, you can’t build 10 gigawatts in Texas and say, “That’s it”, you’ve got to build a plant that is spread around the world, around the United States, and that adjustment is what we want to do on hyperscale. The other thing that I have to do is make sure we’re doing also the long-term thing for our investors, which is, “Let’s invest in ourselves”, which is inference compute has exploded, whether it’s in GitHub or whether it’s in M365 and we needed to make sure we fund our own applications. And then our own research compute, these MAI models. 
So I just took the approach of putting these three, we will definitely want to allocate as we see progress on all this and we’ll see how it all shakes out. But to me, I’m not literally matching quarter-to-quarter. By the way, the other interesting thing is the catch-up, we started early. You were early, and you got a lot of the good spots, a lot of the good power generation. SN: Yeah, and also two years of cash flow. Yeah, for sure. Well, speaking of the balance between the three, in January 2026 , you missed Azure earnings by like 0.1%, so it was very small, and you said on the call , you allocated more compute to internal R&D and applications. Setting aside the earlier question about whether or not you erred by the total amount of capacity, you talked in that call about having a portfolio approach in terms of investment, balancing Azure, and those two other businesses. That’s all well and good, but if there is a constraint, you do have to choose, do you think you made the right choice then? And is that the choice you’ll make going forward? Where you are at the end of the day, you have a higher lifetime value, higher margin on your own businesses, and that’s going to be number one. SN: Yeah, and also research compute. Ben, I think that for all of us, quite frankly, we have to really, at the end of the day, that’s why I think quarterly earnings are interesting, which is, of course, The Street should hold every one of us very accountable for “What did you do for me lately?”. But was that a very particular, annoying, being held accountable for the wrong thing? SN: It’s their job, everyone’s got to do their job, and so I can’t accuse them of them asking, “Hey, what did you do for me this quarter?”, that’s the question they rightfully should ask. 
And the right answer for me is, “I’ve done enough for you this quarter, and we’re also making sure that 10 quarters from now, Microsoft’s continuing to thrive”, and that’s the job, and sometimes there’s a little bit of disconnect on it. But when I look at the three things, you just have to be disciplined that you’re doing what you can add value, it can’t be, “Oh, I’m misallocated”. To your point, you get punished if you do things where you’re not producing. So that’s why research compute, here is now an MAI model output. Today, it’s just not a model output as an academic thing, that’s now in differentiating our Foundry where we now are able to license it, it’s going to grow Foundry revenue. And so as long as I’ve felt that as long as Microsoft can continue to invest in ways that show results, then we will have the ability to do the right thing in the long run and in the short run deliver results. For the last quarter, was there a bit of, “Let’s give a little bit more compute to Azure?” SN: Last quarter, no. In fact, that one was just a little more of the compute — we are supply-constrained. I know, but that’s what makes it so interesting. SN: We are not at all, like at this point, if anything, the thing that we do not want to do is to disappoint especially our enterprise customers on Azure. That was the question, right? Because if they look at that quarter and they’re like, “Hmm, Microsoft’s saying we’re supply-constrained and also we’re prioritizing our higher margin, higher lifetime value businesses, where does that leave me? I’m competing against my supplier”. SN: That’s one of the reasons why we had to make some very hard choices around, for example, raw GPUs. We’re not selling raw GPUs to a bunch of Neolabs, for example. I wish I could add more Neolabs on Azure, we just cannot. And so therefore, we are being very disciplined on some business that we turn away. Were those some of the conversations you had to have? 
SN: Yeah, and so to me, in a world where you have constraints, you want to basically make sure you’re building for both what the world expects and the customers who have trusted you in the longest and so we will definitely make sure that Azure has capacity, it’s just that we are not going to go for what I’ll call in this context, “easy money”. Which is, you can always, in today’s day and age, if you want to have short term Azure revenue, it’s pretty easy. Oh yeah, we’ve seen that, to say the least. SN: Yeah, all you gotta do is turn up, you know, and go sell it to a Neolab. So when it comes to AI infrastructure specifically, as you look out in the long run, you mentioned it may very well be rational for the frontier labs to build their own hardware, for example. You have all these Neolabs, you have whatever controls [Nvidia CEO] Jensen [Huang]’s allocation of GPUs, you have different ASICs, what is your true differentiation as a hyperscaler? Is it just lower cost of capital? SN: First of all, think of our hyperscale business as this portfolio, everything from what we are trying to get done is build a system which we have to be competitive in when it comes to tokens-per-dollar-per-watt, that’s one side of it. We can unpack that and what our thesis is there. Well, I just noticed when you were talking about some of your chips, sometimes it was tokens-per-watt, sometimes it was tokens-per-dollar. SN: Yeah, I think of all three, right? It’s like tokens as a function of both power and dollars and so that’s a systems thing that we have to be world class at and be competitive at. And I would be able to claim, and that’s where I think [Microsoft AI CEO] Mustafa [Suleyman] talked about it, like unless and until you build your own model, you can’t, there’s no point. I believe that you don’t want to build accelerators without building a model, you kind of have to co-design. 
In the long run, the only way to be super efficient on that is to think about, the network is a great example, which is you want the network, the model, all to come together in ways that make sense, so therefore that’s one side. Then the other side for us is the differentiation has to come from, “If I’m building agents on top of this infrastructure, what agents does Microsoft produce?”. I have three domains in which we are going to try and major on: coding, security, and knowledge work. Luckily these are three massive domains where tokens make sense — I’m not saying there won’t be others, science is another one we will enable but I think there will be others who will do great work in there. But to me the three primary domains in which all this is going to be exercised use. So when I think about the portfolio of building a system plus model plus these three domains, then I feel like that’s where our differentiation will come from. But is that just a re-articulation of circling back to, in the long run, our true differentiation is from our higher margin, our own businesses, higher LTV? Where does that leave just customers who— SN: I think it’s not higher margin. The overall margin dollars from our infrastructure business may be higher. In fact, they already are getting close to being higher than our total margin dollars from our high margin businesses. So I think that Microsoft has always benefited from having a portfolio of businesses, and we’ve been comfortable managing through it, where it’s not one margin profile. But in aggregate, we will have high ROIC, and we will make sure that we have an infrastructure business that’s got ROIC that’s commensurate with an infrastructure business, and we have a business that builds on top of it, which I’d like call it like the new apps are agents. So we’ll have agent businesses in security, in coding, in knowledge work, as the three big domains. 
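The exchange above about tokens-per-watt versus tokens-per-dollar can be made concrete with a little arithmetic. What follows is a minimal sketch, not anything Microsoft has published: the `SystemProfile` class, both system names, and every number are hypothetical, and "tokens per watt" is read here as throughput divided by power draw (tokens per second per watt). It only illustrates why a system that leads on one ratio can trail on the other, which is why both have to be tracked together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemProfile:
    """Hypothetical serving system; all figures are made up for illustration."""
    name: str
    tokens_per_second: float   # sustained throughput
    power_watts: float         # power draw while serving
    dollars_per_hour: float    # all-in hourly cost

    def tokens_per_dollar(self) -> float:
        # Tokens produced in one hour divided by the cost of that hour.
        return self.tokens_per_second * 3600 / self.dollars_per_hour

    def tokens_per_watt(self) -> float:
        # Throughput per watt of power draw (equivalently, tokens per joule).
        return self.tokens_per_second / self.power_watts


system_a = SystemProfile("system-a", tokens_per_second=10_000, power_watts=1_000, dollars_per_hour=4.0)
system_b = SystemProfile("system-b", tokens_per_second=12_000, power_watts=2_000, dollars_per_hour=4.0)

# The two metrics can rank the same pair of systems differently.
best_per_dollar = max((system_a, system_b), key=SystemProfile.tokens_per_dollar)
best_per_watt = max((system_a, system_b), key=SystemProfile.tokens_per_watt)
```

With these invented figures, `system-b` wins on tokens per dollar while `system-a` wins on tokens per watt, so a power-constrained site and a budget-constrained site would pick differently.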
We’ll get to agents in a little bit, but I didn’t expect to ask this question, big news this week, will you ever issue equity to fund this build out ? SN: Yeah, I just saw the news, I think Google just did it. Were you as surprised as everyone else? SN: I’m not sure, exactly, I’ve not studied it, it came last night, I think, so I’ve got to go understand what’s happening. But, it’s like maybe it’s the thing to do is everybody is going public or reissuing equity, maybe that’s the season. Gobble up some of the money. Is software dead? SN: I think software is alive, but the way I think this entire meme has come about is, like, if you take the SaaS question in particular, right? We built in a particular way where I had a data model, and then I had a business logic tier, and then I had a UI tier, I coupled the three, then had a business model. Integration is a beautiful thing. SN: Look at this, Ben, right now, we took what is the database that no one knows about underneath Microsoft 365 and said, “Oh, WorkIQ is available , it’s just a skill/MCP, and it’s out there”, and suddenly people are falling in love with, “I can now interrogate and have an agent continuously hit this database to reason over and plan over, act over from any place”. By the way, it requires a new business model. So, for example, when Cowork is using WorkIQ, that’s going to be a usage-based business model, so I think what needs to happen is we now need to take what we built, rebuild it for the agent era and change the levers of the business model such that you have a per-user business model and you have a consumption business model. So the hybrid business model, you do think that is going to be the future? SN: 100%. 
And once you have that then I think what happened between servers — even I had not understood it when we moved to the cloud, even I was a little worried about, “Oh man, we move to the cloud, we’ll sell the same servers”, and it turned out we sold a lot more subscriptions because people who never bought servers from us were buying subscriptions. I think that’s what’s happening already with agents, I see that on GitHub, I see that on M365, I see that on security, because everyone is building these agent systems that are continuously “working” and so what we built and thought of as the end-user compute is completely getting rebuilt. Is there a bit where, if you have to zoom out a hybrid system where a combination of per-seat but also usage, where does E7 fit in this idea, it’s like double the price, it seems it’s an attempt to respond to maybe a secular decrease in seats by increasing ARPU? Is that the right way to think about it? SN: The way you think about this is, see per-seat is a very important element still because what is per-seat? Per-seat is basically a set of usage entitlements, so anyone who is budgeting really will push you. That’s right, people don’t like usage, we’re seeing that right now, it could explode. SN: Exactly, so therefore you just want to take packaging or bundling of usage into per-seat so that there’s some way for people to budget. So I kind of think about the E7, E5, these things will continue and then you’ll always have the outcall consumption. People also talk about, “Hey, maybe people want outcome-based pricing”. Outcome-based pricing, we’ll be thrilled about some of that, but remember, outcome-based pricing is also called royalty. When a customer has a great outcome, they necessarily don’t want to share their outcome so I think what is really being thought about is, ultimately, there is real marginal cost to software, that’s kind of what it is, and that’s going to be priced through. 
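The hybrid model described above, where a seat is "basically a set of usage entitlements" and anything beyond it is billed as consumption, can be sketched in a few lines. This is an illustration under stated assumptions, not Microsoft's actual pricing: the `SeatPlan` fields, the pooling of entitlements across seats, and all prices and unit counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeatPlan:
    """Hypothetical per-seat plan: a flat fee that bundles a usage entitlement."""
    name: str
    monthly_fee: float      # flat price per seat (illustrative)
    included_units: int     # usage units bundled into each seat
    overage_rate: float     # price per unit beyond the entitlement


def monthly_bill(plan: SeatPlan, seats: int, units_used: int) -> float:
    """Seat fees plus consumption charges for usage above the entitlement.

    Assumption: entitlements are pooled across all seats in the tenant.
    """
    pooled_entitlement = plan.included_units * seats
    overage_units = max(0, units_used - pooled_entitlement)
    return seats * plan.monthly_fee + overage_units * plan.overage_rate


plan = SeatPlan(name="example-tier", monthly_fee=50.0, included_units=1_000, overage_rate=0.25)

# Usage inside the bundled entitlement: the bill is predictable seat fees only.
bill_within = monthly_bill(plan, seats=10, units_used=8_000)

# Agents push usage past the entitlement: the extra units are metered.
bill_over = monthly_bill(plan, seats=10, units_used=15_000)
```

The seat fee gives the budgeting predictability discussed in the interview, while the overage term is where the marginal cost of heavy agent usage gets priced through.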
When did that really click for you, the implications of that? SN: I think that I would say agents. Before agents, if it is still human interaction— Right, you can imagine a world where just like basic inference got super cheap and easy. SN: Exactly, the Moore’s Law itself. Like, if you think about it, if I just used Moore’s Law, get software efficiency, I used software for efficiency and drive that home for customers to have more functionality. In fact, I used to always think about, “Hey, how much more value did we add in M365 and not raise price?” — we didn’t raise prices for a decade plus. That’s all thanks to the software efficiencies on top of hardware. But now where you are, and if you have a thousand autonomous agents that are all working continuously 24/7 hitting Work IQ, then that is a lot and so that is where I think, and so the real test for me Ben is, that’s why evals, outcomes — no customer will use consumption or their seats if it’s not creating value for them. Therefore, they now are going to be a lot more disciplined on, “What exactly did this stuff do for me?”, “How do I measure it?”, “How do I get into the efficient?”. And if you think back to going back to the 80s or 90s, where back then it’s like, “Don’t waste time on optimization, the next processor will come out and solve all your problems”, is that now totally the wrong paradigm? SN: In some sense, you want that to happen, but you can’t just count on that. It will happen, but your prices will explode. SN: Exactly, and more importantly, you will be found out if you don’t optimize. Take that example we showed with Land O’Lakes today, which is, here’s an agent, and there is an outcome you care about, I was able to use a model that is using 500B, I was able to use a 5B, and have it really deliver the same outcome, why would I not use that? That does seem to be a very different thing about this period. 
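The Land O’Lakes point, that a much smaller model can be used once it delivers the same outcome, combines naturally with the earlier idea of private evals: score every candidate against your own checks and keep the cheapest one that clears the bar. The sketch below assumes that reading. `Candidate`, `pass_rate`, and `cheapest_passing` are hypothetical names, and the "models" are stub functions standing in for real model calls.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

# A private eval case: a prompt plus a checker that judges the model's output.
EvalCase = Tuple[str, Callable[[str], bool]]


@dataclass(frozen=True)
class Candidate:
    """Hypothetical model candidate; `run` stands in for a real model call."""
    name: str
    cost_per_task: float
    run: Callable[[str], str]


def pass_rate(model: Candidate, eval_set: Sequence[EvalCase]) -> float:
    """Fraction of private eval cases whose checker accepts the model's output."""
    passed = sum(1 for prompt, check in eval_set if check(model.run(prompt)))
    return passed / len(eval_set)


def cheapest_passing(
    candidates: Sequence[Candidate], eval_set: Sequence[EvalCase], threshold: float
) -> Optional[Candidate]:
    """Return the lowest-cost candidate whose pass rate meets the threshold."""
    for candidate in sorted(candidates, key=lambda c: c.cost_per_task):
        if pass_rate(candidate, eval_set) >= threshold:
            return candidate
    return None


# Toy eval set and stub models, purely for illustration.
eval_set = [
    ("route order 17", lambda out: "17" in out),
    ("route order 42", lambda out: "42" in out),
]
tiny = Candidate("tiny-stub", 0.001, lambda prompt: "unknown")
small = Candidate("small-stub", 0.01, lambda prompt: f"routed: {prompt}")
large = Candidate("large-stub", 1.0, lambda prompt: f"routed: {prompt}")

chosen = cheapest_passing([large, small, tiny], eval_set, threshold=1.0)
```

Here the stub for the large model also passes every case, but the small one is selected because it meets the same bar at lower cost. Because the eval set is just data, failure cases can be added to it over time without changing the selection logic.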
It seems clear that’s going to be a huge thing in enterprise going forward, using the right model, optimizing, it’s like we didn’t get to the optimization stage of the PC era. SN: That’s right. I don’t think we ever did get there. SN: We never got there. Stuff’s still bloated as ever, because everyone just assumes it’s going to get faster, it’s going to be fine. SN: Exactly, because things were not priced for it. Once you have consumption, everyone will optimize. For E7, it does seem like the real lure there is Cowork . It’s like this new capability, it’s super powerful, it’s taking Anthropic’s Cowork, which is on your PC, now it’s in the cloud, has all the niceties around that, permissions, controls, all those sorts of things. Is that why it’s there? Is that the hook? SN: Yeah, there’s also the Agent 365 , so there’s a whole lot. Like always, these things, we’re going to take everything from what I’ll talk about as what is an end-user thing and an IT thing, bring it all together. You guys know bundling. SN: And security. Yeah, definitely, and they’re all about, ultimately, how do we get the value equation right such that the customer can cover, because right now, it’s kind of fascinating. You have an agent, you immediately say, “Oh, I’ve got to secure it, I’ve got to have observability on it, I need a sandbox for it”. So it’s just that if you don’t bundle, you kind of are sending the customer down the chase of five different things. With that, though, the reason I find that striking is you’ve talked a lot about — to what extent do you think the point of integration that really matters is it does seem to be increasingly between the models and the harness themselves ? 
You’ve talked about things like your CoreAI initiative and GitHub Copilot, a lot of which is, “We’re going to build the harness and you can slip the models in and out”, and that works right now for Copilot and you can choose your model and even then, from what I’ve heard, not quite as easy as you might think it might be, but it’s still there, the selector’s there. Cowork seems like, “Yeah, that’s right, it has to be the whole package and it’s important for us to have a selling point on E7” — that this feels like maybe it’s not easily substitutable. SN: No, it is. The same thing on Cowork. In fact, right now, the Cowork that I’m using is already mostly defaulted GPT. Okay, so it is going to be fully interchangeable? SN: We’re using the same harness that we use in GitHub and the same thing in security, too. So we have the same harness that’s a multi-model harness in which we will rotate through — obviously MAI by default gets trained in our harness, but we will have GPT, we will have Anthropic in there and any open weight model. We will allow anyone to take any of the models they fine-tune or build. In fact, they can take an open weight model from Fireworks, tune it, put it into Copilot, no problem. All right, so I am misinformed, so I will take the L on that. Explain what is Cowork then and what is the connection with Anthropic as far as that product goes? SN: Cowork, to me, it’s kind of like Copilot. I took the term Cowork, it’s part of there and it’s definitely got the Anthropic models in there. Cowork is — think of it as a form factor, the best way to describe it is we built a chat interface first for Copilot, then we now have built Cowork for Copilot, and now we’re building autopilots, as I described it there, think of it as the enterprise-grade OpenClaws. So basically, I think of these as different form factors of agents — chat was the first thing, Cowork is the next thing and in fact, you can even go back to the developer thing. Developers, how did we start? 
We started with code completions first, then we went to— I get all this, but I’m genuinely confused here, because I go back to the blog post. It says, “Working closely with Anthropic, we took what they’ve done with Cowork…”. SN: Yeah, that’s what we launched first. All I’m saying is it’s evolved. It’s kind of like, Copilot today. Got it, which started out with ChatGPT. SN: ChatGPT, now it has both Opus and GPT models. Got it, okay. SN: So, they’re going to be all over. All right. So, I wasn’t completely off the reservation. SN: That’s right. I failed to catch up, I will accept that. [Editor’s Note: the FAQ for Cowork still says it uses Anthropic models, just like the original blog post] SN: Every product of ours, you’ll have both Anthropic and OpenAI models, and MAI models, and your ability to put your own models, and that, I think, is the fundamental promise. Oh, by the way, I should mention this. The amount of auto — I don’t know how much you’re doing selection, I’m mostly auto — and so then one of the biggest pieces of work at Microsoft is all the training models to do auto-routing. That, by the way, is perhaps one of the biggest continuous learning things. It’s interesting because I probably approach it more from a consumer perspective, so I just literally choose the app that I want to do something in or call from the CLI. What happened to GitHub Copilot? You’re talking about it very positively, but I think a negative spin would be two or three years ago, you were first to market with autocomplete, everyone assumed you got there, you won, and now it’s like, “We’re going to catch up with GitHub Copilot”. SN: I think what happened is this is one of those classic cases — remember, it was a tools business before, and now it is the business, who would have thought that coding is everything? Right, it should have been everything, but it seems like for some period of time, it wasn’t? 
SN: For us, I think what has happened is we have continued — there are two things that are happening in GitHub, before I even talk about Copilot, I should talk about GitHub. All these coding agents have shown up to work, and where have they shown up? In GitHub. And so the first thing that, quite frankly, I wish we had anticipated better, was the amount of agenting. The whole GitHub reliability thing is like one thing, but for Copilot specifically. SN: I’ll say the first thing, that’s kind of, at some level I take that job seriously, because job number one before you want to get to Copilot is go make sure that we are scaling, so let’s leave that alone. There’s a lot of people very unhappy about that. SN: Yeah, and we’re going to work it and they should have higher expectations of us and we need to deliver for them. Then the next thing is on the Copilot side, you’re absolutely right, we started by saying, “This must be just a code completions thing in the IDE”, we added chat, we added tasks, and guess what? Let’s give credit where it needs to be given. Anthropic showed up with a model. Well, this is like Cursor’s story , they ate your lunch even before Anthropic did. Or you’re saying that that was also an Anthropic story? SN: Not really, I mean it’s kind of like Cursor/Microsoft, it’s like Borland v us , it’s not like that was not the end all be all. It was really the Anthropic coming in with a completely different approach, a more agentic approach. SN: That’s right, with a different approach. With a model and what they’ve done there, and essentially the agent loop is what the change was. In fact, if you look at it, Cursor never, total volume-wise— They got eaten by the same thing, they’re facing the same challenges. SN: Also even the market share and so on — Cursor did fantastic, they forked VS Code, did a good job, lots of credit to them. 
But the real thing was agentic coding became real and now the good news is the agentic coding really drives — people want choice, we will be there, we will have our own models. GitHub itself and Copilot itself will have both the Anthropic and Claude. In fact, the rubber duck feature is my most favorite feature , which is I can use it to check the others. The headline announcement from this week, I guess is these new Nvidia-based PCs running Windows . However, the announcement I found much more interesting — or not an announcement, preview — Project Solara , viewing these devices as ways to access agents in the cloud, totally different center of gravity. I don’t know if it was you that said it or the presenter, something which I thought was really compelling, which is a limitation of wearables is if you have to interact with them continuously, they get very tiring, so their utility is fundamentally limited. But if you can ask an agent to do something, then you can go do something else and meanwhile, it’s running in the background. Super compelling. I guess the question is, this feels totally different than Windows — it was weird to start this keynote talking about Windows and the AI PC, and that’s nice, and local inference, but this is like, “Actually, what if everything was in the cloud?”. SN: Yeah, I always find this frame back from 2014 of ubiquitous computing and ambient intelligence and it’s becoming more and more real each day. First of all, the first part of it was, “I’m so thrilled to have these Windows machines”, and the fact that Jensen had that beautiful slide, the picture of him with all the desktops, I was like “God, yes, I’ve been waiting for it”, which is it’s great, so I think because it makes sense, it makes logical sense to have powerful silicon systems with power that really have it with unmetered intelligence. 
When I worked at Windows, I had to like furtively hide my iPhone and then it was okay to show up on campus with an iPhone, now I’m here with a MacBook Air — next time I interview you do I have to feel bad that I don’t have an Nvidia AI PC? SN: You will always have choice, Ben, and I hope you choose the right thing. I’m excited about that stuff because I think there’s unmetered intelligence, even there was one little feature that we showed, which is that ability to have eight agents running continuously, analyzing logs and so on, but all of them were unmetered. Right, but that feels like it’s a side project, side quest. SN: It’s kind of like a billion users all having that, that’s not a side quest. To me, it’s as fundamental as like I think the people are going to want for their knowledge work, for their security work, for their coding work, machines— They’ll want for themselves. Is this actually the new consumer/enterprise separation? SN: The enterprise — the business model, we had this long conversation about enterprises continuously optimizing — in fact, I think the biggest value prop of a Windows machine in the enterprise will be unmetered intelligence. So people are going to say, “Oh wow, instead of having my cloud bill keep going up, I’m going to have Windows machine and amortize it that way”, so I think that there is going to be a real value to — because in a world where you have infinite amount of tokens you want to consume, you want to optimize, and why would I not optimize using everything? I don’t know, I just feel like — as you know, I’ve been very impressed with the job you’ve done with Microsoft, ending the stranglehold Windows had on the company, I still remember I was actually in the Bay Area, I was sitting at the bar at The Westin by the airport typing The End of Windows , recounting all these things you did to not kill Windows, but not make it the center of gravity for the company. SN: And that I think is what goes to Solara. 
I don’t think Windows, we are trying to make Windows— SN: Solara, to your point, I thought it was a great question, because the thing that I want us to take a shot at is the following which is, “Can you think of a platform and platform rules, by the way, which are built for the agent era?” — because right now, what is everyone else who are “platform owners” who will try to move from the phone to this wearables will try to bring their apps to the same game, right? I want to open that up, so I would like, for example, like what we were able to do with Teams devices, and that’s where we built some of this sort of distribution capability, so I want to use that connected to this agent world so I’m excited I’m in MediaTek, Qualcomm. Well I have a great analogy for you, I think. So there’s a bit where I think you just circle back to the great job you’ve done as CEO — this is the butter-up portion of the interview — there is a bit where I think you benefited from following the follower as it were. Steve Ballmer’s one that had to go after Bill Gates and he for better or worse created the conditions for you to succeed, I think is one way to put it, is it possible that for this, your opportunity device space — like can Apple ever really make an agent that works everywhere as long as they’re stuck on the phone? SN: That’s a great question. That is the question for all of us which is you know the reality is it’s easy to say for someone who’s been so successful with something that in fact continues to have a lot of success and say, “I’m going to burn it all down and build something else”. But to the point, the way they’re architectured, everyone’s vertical. SN: Exactly, it’s not natural. 
Like you think about it, we’re saying, “Building agents is easy”, the SOCs are jumping out everywhere, they’re there, the silicon is easy, the system is easy, the operating system is built, and now you’re telling me that I have only one choice for an ambient thing in a hotel, in a restaurant, in a healthcare setting? It makes no sense. So therefore, I imagine that building these ambient devices using Project Solara will be as easy — if you’re successful a year from now, everybody, even in the enterprise, is going to say, “Oh, I’m just going to order a bunch of these things from a no-name ODM who just built it for me”. I think it’s super smart to start at the enterprise only. Do you have dreams that maybe this will eventually spill over? SN: Right now, I want us to again do what I think is natural, like where am I seeing people— Well, that’s where you have the Microsoft 365 environment, you have all the context there. SN: And also the agents, where would people build agents? The thing is, the consumer one will be like, “I need the one agent I want”, so it’s not like I’m not building a Copilot device, I’m building an agentic platform where the healthcare provider can have their own agent, so that’s the right place for Microsoft to start, let’s see how it goes. One last question. You had a data center segment appropriately focused on communities, you talked about things like paying your way for electricity, not using water, building up the tax base, education, etc. Why not just pay the residents ? Just pay them a dividend? SN: I’m open to all ideas here, I’m not close-minded at all because at the end of the day, I think the fundamental thing you’re asking about is, “How does this industry, including Microsoft, have permission to do what we’re doing in terms of infrastructure build out?”. My theory is we get to everything backwards in the US, this is how we back into UBI [Universal Basic Income], is we’re just paying people to build data centers. SN: Yeah. 
And I mean, one thing that I have an issue with things like UBI and so on are the— I’m anti-UBI. That’s how you get there while being anti-UBI. SN: I want people and communities to have control, have agency, humans to have real dignity in their work and you’re 100% right in saying, “Look, we have to do what it takes to get that permission”. And so right now, there’s so much about our industry that’s so glorious, so good, so great. What about the you’re going to lose your job part? SN: Yeah, that’s the problem. Self-obsession about our own glory and our own — if you’re not creating opportunity, why would anybody want you to succeed? That’s the fundamental memo that needs to be re-sent to everyone across our industry, and then we have to live up to it. Satya Nadella, great to talk to you again. SN: Thank you so much, Ben, as always.

Unsung Yesterday

Good type against all odds

This is not italics. This is not even oblique. This is a side effect of how those displays work. Instead of a whole rectangle of pixels being changed at once, the display is updated line by line, starting from the top one. As it’s moving towards the bottom, the internal horizontal position might have already advanced, the subsequent lines will be drawn slightly to the left, and it all leads to a slanted appearance. (This is in effect the same problem as rolling shutter in photography.) The interesting thing is that it could’ve gone the other way. Twice. In English or German, we treat scrolling left to be natural, and we consider only one direction of italic slant good. The first has to do with the direction of reading. I believe the second is, like many things in typography, customary; there’s nothing inherently better about right-leaning letters, except we’re used to them since those are the only ones we see. But, the person putting it all together could’ve just as well done it the other way: scrolling to the right, or slanting to the left (by updating the display bottom to top – not as unusual as you might think!). Were those intentional choices, or was it a default? I’m not sure, but it points to the value of knowing this stuff, or creating a culture where this stuff is treasured. Often, more craft will require more work. Sometimes, however, you will get it for free – but only if you choose the right fork in the road. While we’re here, how about a few other examples of delightful moments in typography where I did not expect them? These, I believe, will be all intentional. But whether you consider them craft, or even good, I don’t know. 
Here are some surprising small caps:

Here’s a cute depiction of a train carriage, somewhat hampered by the limitations of a similar workhorse 5×7 pixel font display:

But here’s something even better. This icon of a stadium cleverly leaned into the same limitations. It’s so delightful. These are, I believe, four characters side by side:

Here, someone added nice decoration to fill out the space:

Here, someone removed all the line height to create a fascinating vertical ligature. This is Gorton and the letters are carved into the plastic, so this required some effort!

Speaking of obliques, this NOT is too thick, and slightly too large, but you have to appreciate someone actually slanting the text rather than underlining it, or decorating in a simpler way:

Even if you underline, you can go a little… well, below and beyond:

Or, here, with maybe the most impressive, three-dimensional underline I’ve ever seen:

This I spotted on an old typesetting machine, and I would like to believe this is an intentional easter egg:

This was on a computer keyboard. You don’t expect hyphenation in this context…

…and you definitely don’t expect an old-fashioned contraction:

#above and beyond #craft #details #real world #typography


Is datacentre sovereignty really that important?

In the UK (and I'm sure elsewhere) politicians and commentators are falling over themselves to suggest that without huge fleets of datacentres built in the UK we are going to be hopelessly left behind. I'm not convinced this is the case, and it risks really falling into the same (mostly misguided) obsession many politicians have for heavy industry revival. This is going to be a rare UK-centric post on my blog. Apologies for my mostly global readership; the argument may be different where you live. One of the first (and most easily dismissed) arguments I've heard is that without datacentres close to the users, the latency will be too high to use AI services. This would therefore make them too slow to use. Clearly, this isn't the case. Nearly all AI use cases are not hugely latency sensitive. To put this in context, the time to first token (how quickly the AI responds) on Opus models is between 1.6s and 3.6s. The round trip latency introduced from the UK to the East Coast of the US is around 80ms, to Europe 10-20ms, and to Asia around 200ms. So the latency on the provider's side is typically one to two orders of magnitude higher than the latency for a UK-based user to reach an overseas datacentre. It is fair to say that real-time voice or video applications benefit from lower latency than these typically text-based use cases. But these are a tiny fraction of AI usage (at the moment) and even in that case European datacentres can provide reasonable latency for these - it doesn't have to be in the UK itself. And my personal belief is that real-time audio-based agents are likely to work best when they can run on device entirely (so there is 0 network latency) - so without a data centre requirement at all. 
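To make the latency comparison concrete, here is a rough sketch using only the figures quoted above. It assumes a single network round trip per request and takes 15ms as a midpoint for the quoted 10-20ms European range; both are simplifications for illustration, and the function names are made up for this example.

```python
# Figures quoted in the post. The single-round-trip model and the
# 15ms European midpoint are assumptions made for this sketch.
TTFT_RANGE_S = (1.6, 3.6)  # time to first token, fastest and slowest
RTT_MS = {"US East Coast": 80, "Europe": 15, "Asia": 200}


def network_share(rtt_ms: float, ttft_s: float) -> float:
    """Fraction of the total wait (round trip + TTFT) spent on the network."""
    rtt_s = rtt_ms / 1000
    return rtt_s / (rtt_s + ttft_s)


def summarise() -> dict:
    """Return (smallest, largest) network share for each region."""
    fastest, slowest = TTFT_RANGE_S
    return {
        region: (network_share(rtt, slowest), network_share(rtt, fastest))
        for region, rtt in RTT_MS.items()
    }


if __name__ == "__main__":
    for region, (low, high) in summarise().items():
        print(f"{region}: network is {low:.1%} to {high:.1%} of the wait")
```

Under these assumptions the network accounts for a small minority of the wait even for the most distant region, which is the point being made. A real request may involve more than one round trip (connection setup, for instance), so treat the output as an order-of-magnitude guide only.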
Regardless, many of these same commentators also suggest locating datacentres in the very north of Scotland (to take advantage of the excess wind power), but ironically these would have significantly worse latency for users from the densely populated south of England - Paris, Amsterdam, etc. are all closer, and thus faster to respond. The next argument that is often floated is that it becomes a tax base - in the UK business rates are applied to commercial buildings and are paid to the local authority in question. The formula for calculating this is, in true UK tax law style, overly complicated, but in essence it works on the rateable value of the building in question - what the estimated annual rent would be to rent the property - including relevant fit out. This is then multiplied by 0.508 to arrive at the annual business rate value. To take a very rough example, my research found that buildings 5-8 in the Virtus London campus can support 100MW [1] of load. These are valued as far as I can tell at around £12m/yr of rateable value. So the local authority (London Borough of Hillingdon) gets approx £6m/yr from this in business rates. If we scale that up to 1GW, it's fair to say that the local authority might get somewhere close to £100m/yr of business rates. While this is not nothing - and certainly gives local authorities a valuable source of revenue - it really is a rounding error under the current system. If we moved every single datacentre under construction globally (30GW) to the UK instead, it would bring in approximately £3bn/yr, or around 0.2% of government spending. Detractors may say that this is the current system and the tax base could be changed. But by doing that you massively reduce the attractiveness of the UK as a place to build the aforementioned datacentres. And the potential tax rates to be at all material would have to be punishingly high. 
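The business rates arithmetic can be sketched as follows, using the multiplier, rateable value and capacity quoted above. The assumption that rates scale linearly with capacity is mine, made purely for illustration, and the function names are hypothetical.

```python
# Inputs quoted in the post; linear scaling with capacity is an
# assumption of this sketch, not a feature of the rating system.
MULTIPLIER = 0.508          # business rates multiplier
RATEABLE_VALUE_GBP = 12e6   # Virtus London buildings 5-8, per year
CAPACITY_MW = 100           # supported load of those buildings


def business_rates(rateable_value: float, multiplier: float = MULTIPLIER) -> float:
    """Annual business rates: rateable value times the multiplier."""
    return rateable_value * multiplier


def scaled_rates(capacity_mw: float) -> float:
    """Annual rates for a given capacity, assuming linear scaling per MW."""
    per_mw = business_rates(RATEABLE_VALUE_GBP) / CAPACITY_MW
    return per_mw * capacity_mw


if __name__ == "__main__":
    for label, mw in [("100MW", 100), ("1GW", 1_000), ("30GW", 30_000)]:
        print(f"{label}: £{scaled_rates(mw) / 1e6:,.0f}m per year")
```

On a strictly linear basis the 1GW and 30GW figures come out somewhat below the rounded £100m and £3bn used above, so those numbers read as generous upper estimates, which only strengthens the rounding-error argument.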
Tax rates that high, combined with the extremely high price of electricity in the UK, would make it completely unfeasible to operate them in the UK. It's a similar story with jobs. Datacentres are famously light on permanent staff - the whole point is that they're highly automated, so even a large 100MW site might employ only a few dozen people once it's running. The construction phase is more labour intensive, but temporary, and much of the capex (the chips especially) is spent overseas rather than in the UK. Even on generous assumptions the direct contribution to a ~£2.8tn economy is a rounding error. The final and perhaps most plausible sounding argument is that in the event of political instability it would give us control over AI usage - which is and will be a growing national priority. There are really two versions of this argument. The cruder one is outright seizure, which I'll come to. The more serious one is that in a global compute crunch, having the datacentres physically here means we won't be left at the back of the queue. But this doesn't survive contact either. If a hyperscaler or a frontier lab owns the racks, a datacentre in Slough serves their global demand - not ours. You can't compel a private operator to give UK users preferential access just because the building sits on UK soil. Location buys you almost nothing. The real leverage here is to contractually lock in the compute - which is something the UK government could do, regardless of where the datacentre is. Onto the cruder version, then. I've even heard certain people suggest that in the event of major turbulence in the world the state could seize control of them. The issue with this is multifaceted - but I think has three main failings. Firstly, this is not a steelworks or power plant. The underlying value is not from the datacentre, it's from the models running on the datacentre. If we assume AI model development continues, the value of a 'seized' datacentre decays rapidly. 
Imagine the UK government had seized control of a frontier lab's datacentre at the start of 2025. They'd have access to GPT-4o, or Sonnet 3.7. These models are now outclassed by open weight models that you can run on a relatively powerful laptop. They have virtually no value. Secondly, it completely underestimates the supply chain that modern software runs on. It's highly likely that if the geopolitics had got so bad that HM Government was nationalising frontier lab datacentres, the frontier labs would remotely wipe the servers before they could be "seized". And that's not to mention that models have loads of supporting software and operational infrastructure that is not colocated with the models themselves. The concept of the SAS seizing servers running frontier models before they can be wiped in the dead of night is probably best kept to Tom Clancy novels - not government policy. Finally, if we are in some alternate reality where the UK/Europe has been cut off from frontier models, we are almost certainly also cut off from most/all cloud services from big tech, which means no (or much reduced) email, video conferencing, card payments etc. Not being able to run Claude is probably the least of society's worries. By no means am I suggesting that AI datacentres shouldn't be built in the UK - they should - and we should reform the planning system to make it easier to build them. But it's important to get this in perspective. Modern information societies are a huge tangled web of globally interconnected pieces of software. Every day you browse the internet you are connecting to thousands of servers located in dozens of countries. Each one of those servers is sending your requests to various other providers - to store and process data. There are genuine requirements for data sovereignty. It may be preferable to host sensitive health data only in the UK, for example. 
But that's a simple regulation problem (if desired) - require UK-based datacentres for this type of data, including AI usage. But this is a tiny sliver of total AI demand. And the world is too complicated to dream in this "Blitz spirit" self sufficiency era, especially when it comes to digital services. The UK in my opinion has many structural advantages for harnessing the economic power of AI. All of the major frontier labs have significant - and growing - labs and offices in London. We have world class researchers and institutions on the cutting edge of AI. And the UK takes the majority of European tech funding. In my opinion, we need to lean into those strengths and ensure we continue to attract and grow these companies and talent. Not worrying about where exactly we should put huge sheds.

[1] Datacentres are measured by the amount of servers they can power, in watts (or megawatts (MW), millions of watts, and gigawatts (GW), billions of watts).

Sean Goedecke Yesterday

Anti-AI nostalgia and the cult of the past

Programmers were better back in the day, weren’t they? Back when we had real programmers. Not just people who got paid to write code, but people who lived it, who were obsessed with their craft, and whose code was a lively expression of themselves. Hackers were hackers in those days before money took over the industry. Don’t even get me started on LLMs. Could there be a better example of today’s degenerate spirit? A machine to mass-produce software (not good software, just barely good enough), so that the weak minds that dominate the industry can indulge their obsession with quantity : of slop code, of features, and ultimately of money, which is the only way they can understand value. If they weren’t destroying our way of life, they would be pitiable. All of them together don’t have a fraction of the spiritual integrity of someone like Mel . But as it is, we must band together to crush them and drive them from our industry like the parasites they are. Okay, that’s not actually what I believe. But there sure are a lot of posts 1 and comments on the internet that sound a bit like the paragraph above. Here are some older quotes that might sound similar: …the third collapse, in which power tends to pass into the hands of the lowest of the traditional castes, the caste of the beasts of burden and the standardized individuals. The result of this transfer of power was a reduction of horizon and value to the plane of matter, the machine, and the reign of quantity. 2 Usura rusteth the chisel \ It rusteth the craft and the craftsman \ It gnaweth the thread in the loom 3 The actual accomplishments of the past will nevertheless remain accomplishments, while the artistic stammerings of the painting, music, sculpture, and architecture produced by these types of charlatans will one day be nothing but proof of the magnitude of a nation’s downfall. 4 These are all from the writings (or speeches) of famous fascists: Julius Evola, Ezra Pound, and Hitler himself. 
Mussolini’s Doctrine of Fascism begins by defining fascism as a “spiritual attitude”, which the fascist man adopts in order to regain the mysterious qualities that were lost by the transition to modern life. In his classic Ur-Fascism , Umberto Eco’s first two defining features of fascism are the “cult of tradition” and the “rejection of modernism”. So when someone tells me that the industry has lost its way and we must deny the corrupting influence of modern technology in order to retvrn to the time of virile real programmers (who understood and appreciated the spiritual dimension of programming), I get suspicious. It’s strange to describe anti-AI sentiment as potentially fascist, since a very popular argument is that LLMs themselves are an inherently fascist tool. Surely both sides of the debate can’t be fascist? I do think that the structure of fascist arguments is generally persuasive , and that many avowedly anti-fascist groups do sometimes fall into this trap: describing the world as a struggle between the spiritual power of the macho, traditional man and the corrupting influence of degenerate (often foreign) capital. For instance, I am a big fan of Lord of the Rings. I’ve read the series and watched the films multiple times, and even made a failed attempt to learn Elvish as a kid. But it’s hard to deny that fascists absolutely love Lord of the Rings. “Marble statue of a Roman emperor” might be the most popular avatar for fascists on the internet, but Aragorn is the second most popular. Neo-fascist movements in Italy explicitly take up Lord of the Rings as a foundational text. Why? Because the core conflict in the text is between the traditional, nostalgic heroism of the Shire and Gondor, and the corrupting modern industrial (partly foreign ) influence of Saruman and Sauron 5 . I don’t think Lord of the Rings (or anti-AI rhetoric) is intrinsically fascist. 
In fact, the surface-level reading of the text is anti-fascist: the plucky people of the West banding together to fight Sauron’s command-and-control totalitarian society. But I can see why fascists love it. One common historical touch-point for anti-AI folks is the Luddites, who were a violent conservative labor movement in early 1800s England. Anti-AI blogs adopt Luddite language like “smashing frames”, and positively cite the Luddites as “the go-to enemies of fascism since its inception”. I’ve written at length about what we can learn from the Luddites in Luddites and burning down AI datacenters , but one point I think is under-emphasized by the (generally pro-Luddite) books is that the Luddites were a little bit fascist themselves . Brian Merchant’s Blood in the Machine is the most popular recent book on the Luddites. I enjoyed it, but Merchant’s attempts to paint the Luddites as a friendly, left-wing, proto-feminist movement 6 seemed really unconvincing to me. From the writings of the Luddites, it’s clear that they were interested in protecting the rights of their all-male elite guild fraternity. Here’s one Luddite threat to a workshop that explicitly includes a threat against the female workers 7 : We think it quite inconsistent with our duty as men, as husbands and as fathers to suffer ourselves to be ruined any longer by a set of vagabond strumpets and those gibbet-deserving rascals that are looking over them. We will lead them to their satisfaction. We sincerely hope, gentlemen, that you will discharge the bitches and take men into your employ again, or they must take what they get. These were fundamentally conservative people who felt (correctly) that modernity had deprived them of their elite status, handing it instead to lower-paid inferiors: women, vagabonds, and foreigners. The Luddites were obviously not fascists 8 . 
However, the basic ingredients were there: wounded pride, a masculine elite identity, hatred of modern economics, and violence aimed at restoring their previous position in society. The currents that produced Luddism are the same currents that guided so many unhappy people towards fascism. When things are looking grim for an elite group, they often turn towards any movement that promises a return to an idealized past. If my blog has themes, one of them is surely that many software engineers labor under a delusion that their job is to be excellent at their craft. Of course, wanting to be an excellent programmer is not a delusion; it is a completely legitimate value to hold, and a legitimate purpose to pursue. It’s just not what you’re paid to do at work. Your job , unfortunately, is producing shareholder value . This delusion has been punctured by the end of ZIRP , and again more recently by the rise of AI coding. In this environment, I worry that some software engineers will form exactly the kind of disillusioned elite that was the audience for Ezra Pound’s poems about “usury” or the Luddites’ campaign against unapprenticed (often female) textile workers. I worry that AI, and the companies that build AI, are becoming an enemy against which anything is permitted: an enemy which in Umberto Eco’s words is “at the same time too strong and too weak”, unable to reason and yet powerful enough to drastically reshape the global labor market for the worse. The enemy of fascism is nuance. Fascism presents a good, clean, rousing story about a spiritual conflict between right and wrong. It is anathema to fascism to stop and muddy the waters a bit: in this case, to explore the ways in which LLMs, like any transformative technology, can both support and endanger traditional values. 
In The left-wing case for AI I wrote about how AI is being used right now as a disability aid, and many disabled readers wrote in to share their positive experiences with LLMs, and often how alienated they feel by the anti-AI mainstream on the left. I recently got an email describing how there’s a sudden flood of accessibility software for blind people 9 that’s actually built by blind people , who can now iterate with a LLM to get a product that meets their needs. Framing AI as an ontological evil erases experiences like these. Being anti-AI is not inherently fascist. Many of the anti-AI posts I’ve quoted are thoughtful, sensitive pieces exploring how the author thinks about one of the biggest changes to our industry. I still think the world needs more articles like that, not less, but the more of them I read, the more I recognize the tropes: spiritually pure lovers of the craft, degenerate peddlers of corrupt modernism, a need to return to the traditional ways of the hacker, and a lament for the (potentially) waning power of an elite fraternity of programmers. I know I’m tiptoeing around the worst argument in the world . It isn’t a refutation of anti-LLM arguments to say that they are structurally similar in some ways to fascist arguments, any more than it’s a devastating critique to say the same thing about Lord of the Rings. Sometimes it is good to try and halt the march of progress! Some of our past traditions really were purer and more spiritually robust! It just bothers me, that’s all. I used to read The Story of Mel with unalloyed pleasure. Now it makes me nervous. If you believe you’re fighting the embodiment of fascism , or for the idea of value itself , what tactics are off-limits? What positions might you eventually come to accept? It feels wrong to directly associate my caricature with any actual posts, but it also feels wrong to make a blanket assertion without examples. 
Just so you know what I’m talking about, here are some posts that have elements of this attitude. I like some of these posts and dislike others. ↩
Page 329 of my copy of Julius Evola’s Revolt Against the Modern World. ↩
Ezra Pound, Canto XLV. “Usura” should be read as “usury”, or today we could gloss it as “capitalism”: all Pound’s examples of great art were from the pre-capitalist patronage era of art. ↩
Adolf Hitler, from his speech at the 1933 Party Congress in Nuremberg. ↩
Of course, there’s also historically been a strong pro-technology current in fascist thinking (even specifically Italian fascist thinking). ↩
Page 134 of Blood in the Machine has a brief argument that Luddism was feminist because the (exclusively male) artisans’ wives would provide food for their meetings. No, really. ↩
From Kevin Binfield’s Writings of the Luddites, page 40. I’ve taken the liberty of re-rendering it in modern spelling and grammar. ↩
Aside from being too early, they didn’t have any connection to the state apparatus of power (in fact, they were ultimately crushed by it) and they famously lacked a singular leader. ↩
The example cited was BlindRSS. ↩

0 views
iDiallo Yesterday

Now that your newsletter is AI-generated, I've Unsubscribed

I've remained subscribed to some newsletters for over 20 years. The authors managed to keep my attention all that time. But then, one day, they decided to switch to an AI-generated newsletter without making any announcement. After a couple of weeks of blue high-tech image thumbnails, I simply hit unsubscribe.

Here's what happened: a person earned my trust. He maintained that trust for all those years. But then he thought the best way to improve was to take himself out of the equation. If you're just going to present me with prompt-generated content, I hate to break it to you but I have access to ChatGPT, and I can do that myself. The reason the human voice matters to me is that there's real experience behind the words.

The oldest newsletter in my inbox is from when I was just 12 years old. It was from a French writer I used to read. After a decade of following him, the emails stopped coming. I was only reminded a few years later, when the emails started coming back. I didn't jump on it immediately. I didn't even remember who it was. But when I read one at random, the words were different, the tone was nostalgic, and the name was unfamiliar. I dug deeper and found that the author's son had taken over the newsletter.

That was my cue to unsubscribe. But he hadn't used AI to replace his father's voice. He didn't use any tricks to garner clicks. Instead, he announced that his father had passed away and that he would share some stories. I remained subscribed until the last story was released.

I rarely sign up for any newsletter. If I do, it's intentional because I'm interested in what the author has to say. It's not much deeper than that. There is a big difference between a newsletter written by a person, one that breathes and wanders and sometimes takes its time, and the rapid-fire, mechanical hum of AI-generated content. One feels like someone is thinking with you. The other feels like a monetization strategy.

0 views
neilzone Yesterday

Why are there no good tablets at the moment?

A friend was looking for a new tablet, and they asked me for a recommendation. And… I just don’t have one.

The only good tablet, because Android can be replaced with GrapheneOS, was the Google Pixel Tablet, and that is no longer available. Secondhand prices are sky high. That was my go-to recommendation for a while. But it looks like Google has abandoned this project too.

Amazon’s range of FireOS tablets is, IMHO, bloated with crapware which one cannot easily remove. Even the Fire-Tools scripts only get one so far. I can’t recommend one.

There are some fun-looking “tablet computers”, but they are all expensive.

A secondhand Surface Go, if one wants a Linux-based tablet, is readily available and pretty cheap, but honestly not what most people will want. And, while I like it as a cheap, touchscreen, Linux machine, it is not particularly powerful, which can be frustrating. And getting the camera working is a nuisance.

I guess that there are some iPads, if one is accepting of Apple / iOS. Again, that wouldn’t be my choice, but I can see why some people like them.

Why is there no good (non-Apple) tablet at the moment?

0 views
Kaushik Gopal 2 days ago

OpenCode power user tips

In this post, I’d like to talk about some power user tips for OpenCode - an open source, model agnostic harness that more people should be using. Hopefully some of the advanced use cases convince you to give OpenCode (and OpenChamber) a shot.

intermediate to advanced tips only
I am specifically choosing to talk about some advanced tips in this post. If you’ve never used an agent harness or are looking to learn how to use OpenCode, this post can be useful but reader beware.

While (Ctrl + P) will list out all the possible commands (and is helpful), OpenCode has the concept of a “leader” key (which defaults to ). The leader key allows you to execute targeted useful commands more quickly and there’s a slew of useful ones pre-defined 1 .

People reach for whole terminals and extra tooling to juggle between agent sessions. I too had an overly customized tmux setup that looked like this:

OpenCode simplifies this. Just hit and you view current sessions and can instantly switch to that session by just selecting it from the list. The ability to quickly rename a session from this view is a godsend for me and what lets me be organized.

session directory filtering
You can pass a flag to when launching it, which filters the session list to just this workspace/directory by default. You can alternatively not pass that flag, and the session list will show all sessions.

Forking takes the session you’re in and spawns a new one. You branch off into a separate conversation while the main agent keeps grinding on whatever you left it doing. I love this feature and even cobbled my own version with tmux long before most harnesses shipped it. Claude Code, Codex and other harnesses have caught up and support this feature. But OpenCode’s UX is the smoothest. You simply type in your chat. It gives you the option to fork the current chat or from a previous point in the message. You can then rename the forked session right from the list UI, and jump back and forth.
The easy session switching again comes in handy here. Need to rewind to an earlier point in the same conversation? In OpenCode, there’s no escape-escape dance. leader g shows you a timeline and you can revert the conversation instantly, fork a new session from there, or just copy the message text. Probably one of the main reasons I find it hard switching away from OpenCode. I can bounce between GPT-5.5, Kimi K2.6, and Opus by just hitting 2 . change model & reasoning + switches the model on the fly. changes the reasoning type. I see a future where we will have smaller models we can run locally. OpenCode can point to that ollama model you have running on your own machine too. Click here if you’re curious about my model choices. Not everyone realizes this but OpenCode ships with LSP servers built-in . This means the coding agents inside OpenCode understand how to navigate different programming languages better. You’ll find less file search and grepping. Anthropic even recommends LSP server integration as an advanced move for making harnesses behave in large codebases. OpenCode gives you much of that for free. The other reason I swear by OpenCode: hit to cycle through custom agents. Here’s a few I use a lot: view the subagent work When an agent fans work out to subagents, + pulls up the subagent view so you can watch them work. Like others, you can use OpenCode for scripting and one-shot reviews: So up until now, I’ve mostly talked about features in the context of the TUI. My good friend YY recently introduced me to OpenChamber and it’s changed a lot of things for me. OpenChamber is an OpenCode GUI wrapper. OpenCode already has a web client btw. But OpenChamber has a lot of nice bells and whistles. But here’s the kicker, it’s using your same OpenCode server. In a previous post I dug into OpenCode’s server-client architecture: you run OpenCode as a server and connect multiple clients to it. 
A client can be a terminal tab, your phone, a desktop, a browser — each an isolated session pointed at the same server, fully synced. OpenChamber is just another client, but a super powered GUI one. This feature has taken the world by storm; especially since Codex introduced their implementation. OpenChamber gives you this feature for free with a super nice UX. One button click and either using or internally, it opens a secure 3 tunnel that you can connect your phone or another client to. So now, your phone controls OpenChamber and by proxy OpenCode exactly as you would from your computer. This was possible with OpenCode and tailscale too (as I mentioned in my previous post) but OpenChamber’s UX and secure tunnel approach makes this fluid. I almost never take my work laptop with me, when I’m getting out of the house now. Just speaking to my phone and a browser tab that has OpenChamber open. The other OpenChamber feature I lean on: multi-run. You have a prompt and want to try it across several models at once. I think Cursor was the first to introduce this feature. OpenChamber provides a super nice UI for this. This is how I’ve been kicking the tires on Opus 4.8 and updating my model choices . There’s just one caveat to be aware of. OpenChamber by default probes for a running OpenCode server. If it doesn’t find an OpenCode server there, it will silently spawn its own. So if you truly want all your sessions in sync, you should start your OpenCode server on port first, then open OpenChamber regularly and it’ll attach to the one you already have. I have a handy shell alias to just start a background OpenCode server now like so: If you didn’t read this tip in time, and need to kill previous OpenCode server instances, I suggest the handy procs cli command. There’s a lot more to both OpenCode and OpenChamber, but this is the stuff I reach for daily. 
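The probe-then-spawn caveat is easier to picture with a tiny sketch. To be clear, this is not OpenChamber's actual code, just a generic Python illustration of the pattern: `is_port_open`, `choose_action`, and the host/port arguments are hypothetical names I made up, and the port is whatever your own OpenCode server listens on.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_action(host: str, port: int) -> str:
    """Hypothetical client decision: attach to a running server, else spawn one."""
    return "attach" if is_port_open(host, port) else "spawn"
```

The takeaway is the ordering: if the server is already listening when the client probes, the client attaches; if not, it quietly starts its own and your sessions end up split across two servers.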
The bit that’s stuck with me most is the one-server, many-clients setup. Run a single OpenCode server and point everything at it: the TUI, OpenChamber, your phone. Steal whatever helps here, and if there’s a tip I’m sleeping on, send it my way.

OpenChamber v1.12.0 tunnel bug
Heads up: OpenChamber v1.12.0 added a headless web app mode, and remote instance switching now changes the OpenChamber API endpoint without loading the full remote UI. This seems to have busted the remote mobile tunnel setup I describe above. :/ The developer is responsive and working on a fix 🤞. Until then, I recommend sticking to v1.11.7, which you can download manually.

The custom agents I use a lot:
red-team: think differently from the implementer with an independent adversarial lens and hunt for failure modes.
ghostwriter: drafts messages, posts with a less AI tropey voice.
brainstormer: custom agent that’s explicitly tuned to help me brainstorm ideas, plans etc.
pr-reviewer: strict reviewer that ignores past conversation and reviews with fresh eyes.
kimi-coder: a coding agent guardrailed to Kimi: fast, cheap implementation.
agent-kombat: see my agent-kombat post. I have it wired into a custom agent for quick use.

You can also bind commands that don’t have a predefined key. As an example, I bind the “Exit the app” command to so I can quit OpenCode quickly. ↩︎
yes yes, you’re probably nuking your prompt/KV cache, but you shouldn’t have long running conversations anyway. ↩︎
one-time + TTL + revocable connect link ↩︎

0 views
The Coder Cafe 2 days ago

Services in Space

☕ Welcome to Lattes & Stories! Today, I’m taking you back to 2010, to the worst job interview of my life and the internship that followed at a space company, working on an architecture that was about to quietly reshape the entire software industry. I was 22, my English was terrible, and I could feel something important was happening. Get cozy, grab a coffee, and let’s begin! 🔔 If you only want to receive notifications for the traditional Concepts posts, you can configure your notifications in your settings . Interviews December 2010. I was finishing my last year at a French engineering school. I’m not sure if it’s the same in all countries, but in France, most engineering degrees require a 6-month internship to graduate, and I was looking for mine. I was searching for a good internship when, at some point, an ex-student who graduated the year before me sent a message to the school mailing list about a company called Ariba 1 that was looking for interns. San Francisco-based . I read the email and thought: “ Why not? ” With a friend of mine, we decided to apply and got an interview with their tech team. Let me tell you something first. At that time, saying my English wasn’t great was an understatement . I could read it reasonably well. Speaking was another matter entirely. And this was going to be my first job interview in English. To prepare, I had written a long document with answers to the most common interview questions, something I could read from during the call. My interview was scheduled on December 21st. The pressure was starting to build. The day came, I joined the call, and what followed was not one of the worst interviews of my life. It was the worst interview of my life. 
“Hello?” “Hi, is this Teiva?” “Yes” “Hi Teiva, this is … [Two minutes of him talking without me understanding a single word] ” Do you know how long two minutes can feel when you’re supposed to be having an important interview and you realize your English is too bad to understand anything? The longest two minutes of my life. At some point, he stopped, and I assumed it was my turn to speak. I started reading the speech I had prepared on my screen. Except I couldn’t read it. The stress had generated enough heat to fog up my glasses completely . I took them off, but I’m nearsighted, so I had to put my face about 10 centimeters from the screen just to see the words. Take a moment to picture that… The interview continued, and it was an absolute disaster . That same night, I met my friend who’d had the same interview. We had probably one of the biggest laughs of our lives when he told me he’d made the interviewer repeat even the very first “ Hi, is this Marc? ” sentence. Anyway. In my head, it was time to move on. In parallel, I had applied to another company and ended up getting a yes there (the interview was in French). Moving forward to the end of January. My internship was starting the following week. Everything was settled for my move to Toulouse . I was at the supermarket, standing in line at the checkout, when I received an email: To this day, I still don’t know what happened. I swear this isn’t false modesty, but my performance during that interview had been frightful to say the least. I was thrilled, but I had already signed with the other company. The timing was too tight, and I didn’t feel comfortable walking away from a commitment. I ended up turning Ariba down . The other company, the one I ended up doing my internship with, was Astrium , a space company known today as the space division of Airbus Defense & Space . The domain was really interesting. 
Astrium’s team was building software for spacecraft control centers, the ground systems responsible for monitoring and commanding satellites in orbit. Think sending telecommands up to the satellite, receiving telemetry back down, and making sure nothing goes wrong. The title of the internship? Service-Oriented Architecture (SOA) for spacecraft control centers.

For those who may never have heard of SOA, let me give you a quick historical introduction. In the old days of IT, the unit of deployment was the application. When we bought a new application, there was no strong standardization for data exchange; nothing like the REST or gRPC interfaces we take for granted today. We could rely on complex standards like CORBA, which was designed to exchange data across applications, or we could build our own communication protocol on top of TCP, for example. Either way, integrating with a new application was painful.

From that pain, an idea started to take shape: what if the granularity switched from the application to the service? What if a standard contract could act as a middle layer between two applications, reducing coupling and making the whole system easier to evolve?

Does that sound familiar? It has strong similarities with microservices as we know them today. The main differences between SOA and microservices, in my opinion:

SOA came with a design-time registry for service discovery, such as UDDI, and other governance layers that simply don’t exist in microservices today.
The granularity was different. There was no concept of micro. Containers weren’t yet popular, and applications were harder to scale, so services were mostly macro in scope.

It’s hard to fully convey, but I was already fascinated by service architecture back then. To me, it felt like a revolution, and joining a company that was at the very beginning of its IT transformation and trying to make the case for why services mattered was a dream internship.
The first part was mostly about that: exploring SOA, building transferable knowledge, and convincing people about the benefits.

Astrium wasn’t the only one thinking along these lines. There’s an international consortium of the major space agencies of the world, in charge of defining standards for space data systems, called the Consultative Committee for Space Data Systems (CCSDS). Founded in 1982, it brings together NASA, ESA, JAXA, CNES, and around 26 other agencies. The motivation was easy to understand: most existing spacecraft control center architectures were closed, monolithic, and built with no thought for reuse. The result was a lack of interoperability between solutions, skyrocketing deployment costs, and systems that were nearly impossible to evolve.

The CCSDS had been working on exactly this problem. They defined various blueprints, such as the Mission Operations Monitor & Control Services, a set of standard service interfaces for spacecraft control centers, including:

An action service to submit executable tasks (e.g., spacecraft telecommands)
An alert service to emit notifications of operationally significant events or anomalies
A parameter service to subscribe to parameter value reports from a remote system

It was a bottom-up approach: the CCSDS defined the contracts, and to be part of the ecosystem, your service had to respect a standard service interface. As a service provider, if your service matched the contract defined by the CCSDS, it could plug right in.

And one of the main benefits was this. Back then, buying a software component for your ground segment meant being tied to it for years, and switching was painful because nothing spoke the same language. A standard interface changed that: if you respect the service interface, you can easily switch to another CCSDS-compatible service, for example, because a competitor is less expensive. That may not sound impressive today, but back then, for a company like Astrium, this was a genuine win.
Better interoperability, less vendor lock-in, and for the first time, a path toward a more open ground segment.

The Service Revolution I Was Sitting Inside

A significant part of my internship was about studying SOA, exploring those standards, and communicating why a service-oriented architecture was the right direction. The department wasn’t fully sold on CCSDS. It was a new standard, and it would take years before fully functional services were built on top of it. But they were sold on SOA, so I also spent time mapping the existing application landscape and proposing how the concept of services could be integrated.

What I didn’t fully realize at the time was how much was happening around me. Indeed, the 2000s and early 2010s were marked by three events that made, in my opinion, service-based architecture popular:

Around 2002, Jeff Bezos issued a famous internal mandate: all Amazon teams would expose their data and functionality through service interfaces. No other form of interprocess communication was allowed. No direct linking, no shared-memory model, no back-doors whatsoever.

Later, companies embraced a concept called the Enterprise Service Bus, a middleware that acted as a central hub to expose cross-domain services and avoid direct coupling between applications. Was it really helpful? To some degree. But the main problem was that most SOA discussions became: “What kind of ESB should we buy?” instead of focusing on the core concept: the service. A critical turning point came with a post called SOA is dead; long live services by an analyst named Anne Thomas Manes. She pulled the conversation away from technology choices and back to what actually mattered.

On March 20, 2013, Docker was born and became the engine that pushed the movement even further. As I said, applications were hard to scale before containers. With Docker making containerization practical and accessible, it became easy to deploy small, independent units of functionality, which played a huge role in what we know today as microservices.
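Stripped of ESBs and containers, the service idea that runs through this story is small enough to sketch. The Python below is only an illustration under my own assumptions, not the CCSDS specification: `ActionRequest`, `ActionService`, both vendor classes, and `send_telecommand` are invented names, loosely modeled on the action service described earlier.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass(frozen=True)
class ActionRequest:
    """A task to execute, e.g. a spacecraft telecommand (fields are illustrative)."""
    name: str
    arguments: dict = field(default_factory=dict)


class ActionService(Protocol):
    """Hypothetical contract: any provider implementing this method can plug in."""

    def submit_action(self, request: ActionRequest) -> str:
        """Submit a task and return an acknowledgement string."""
        ...


class VendorAActionService:
    """One provider of the contract (made-up vendor)."""

    def submit_action(self, request: ActionRequest) -> str:
        return f"vendor-a accepted {request.name}"


class VendorBActionService:
    """A competing provider with the same interface (made-up vendor)."""

    def submit_action(self, request: ActionRequest) -> str:
        return f"vendor-b accepted {request.name}"


def send_telecommand(service: ActionService, name: str) -> str:
    """Consumer code depends only on the contract, never on a vendor class."""
    return service.submit_action(ActionRequest(name=name))
```

Swapping `VendorAActionService()` for `VendorBActionService()` in a call to `send_telecommand` requires no change on the consumer side, which is the vendor lock-in argument in miniature.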
SOA as a brand died, but the idea remained. The core concept of the service remained, lighter, smaller, stripped of the governance overhead that had weighed it down, and became one of the most common architectures we know today.

With hindsight, did I regret turning Ariba down? While I sometimes think my life could have been completely different if I had accepted, I don’t regret it at all. I really enjoyed my internship at Astrium. It was a great way to explore an entire business domain and practice something I hadn’t expected: convincing people, building mental models, and making an abstract idea feel concrete and worth caring about. I would have loved to stay, but there were no open positions in the area where I had worked.

There was just one thing left. My last presentation to the company was in English. Six months after that catastrophic interview. You can imagine that over those six months, I had practiced hard and made real progress, right? Absolutely not. It was bad. Not as terrible as the interview since this time fog didn’t impair my vision, but still. My supervisor, who had sat through all my presentations, put it simply: “Your presentation? It was way better in French.”

Not all stories should have a happy ending after all.

SOA is dead; long live services
Stevey’s Google Platforms Rant
CCSDS Recommended Standards

Ariba was acquired by SAP in 2012.

0 views
fLaMEd fury 2 days ago

Link Dump: May 2026

What’s going on, Internet? In true fLaMEd style, I missed the April update, so here are all the bookmarks from April and May 2026. Want more? Check out all my bookmarks at /bookmarks/ and subscribe to the bookmarks feed.

Prepping for the endgame of the open web - The History of the Web
Jay’s been thinking about this longer than most. The open web has survived worse, but it still needs us to show up.

Attenuating the Web - The Darth Mall
An interesting pushback: RSS readers strip so much of what makes a website actually worth visiting.

The conditionally open web
Cory puts into words something I keep circling back to. The open web was never really open, just conditionally so.

My Quest to be the Scrobble King
Reaching back to scrobbling to fix what streaming broke about music discovery.

Ctrl-ZINE Issue.24
Stoked my flossing piece landed in this one alongside ~loghead’s proper smol web rallying cry. Issue 24 is a good one.

Joyful web design
Treating playfulness on the web as the point, not a frivolous extra you tack on later.

Who knows that you blog?
That weird gap between blogging publicly and never bringing it up with people you actually know.

Have a Fucking Website
“The internet was built on websites that linked to one another”, don’t rent your space inside the walled gardens.

No, I Won’t Download Your App. The Web Version is A-OK. | Sid’s Blog
I will avoid your app if I can.

Own Your Web – Issue 18: Curators
Is curation the personal web’s superpower now that half the web is AI-generated, or has it always been? 😃

How to Surf the Web in 2025, and Why You Should
Algorithmic scrolling killed surfing, but David Cain reckons the old web is still there if you go looking.

A Secret Web
The indie web isn’t secret, just hidden by commercial search. Benjamin Hollon on the tools we already have to find it.

the web as a space to be explored · roytang.net
The web isn’t dying. Roy Tang reckons the indie web is still alive and explorable.

Join the Inclusive Front
Sara Joy’s manifesto for web folks who reckon building inclusively is just doing the job properly.

Your Ai Hate Is Showing - Matt’s Blog
Blanket AI-hate misses the target. The problem is the corporations weaponising it, not the tools.

The Joys of a Small Social Feed
How deliberately following a small number of people on Mastodon leads to a more peaceful experience. Has me contemplating my own following count.

Why I Still Like the Internet
Gordon on how blogs are quietly winning again.

The Blogger’s Manifesto
Eight principles for blogging that go against the “build an audience” playbook. Staying small and honest is the point.

How to Hate AI
There’s a lot of AI hate going around these days, and Steve’s take is where I think it should actually be aimed. AI is out of the bag. It’s happening. Rather than directing hate at people who are curious, learning, and already using the tools, we could focus that energy on learning, understanding, and educating on the best and safest ways to use them.

Stratechery 2 days ago

The Nvidia AI PC, Project Solara, Microsoft AI

Good morning,

I don’t normally give away my interview subjects ahead of time, but I’m going to make an exception this week given the subject and the below Update. I am writing this in San Francisco where I interviewed Microsoft CEO Satya Nadella after his Build developer conference keynote; normally I would want to publish that immediately so that you have the full context of my analysis. In this case, however, I came to the opinions below during the keynote, and before the interview, so for that reason (and a few logistical ones) I wanted to articulate them first (before you see my questions), and follow up with Nadella’s view on them (and a number of other topics) afterwards.

So with that noted, on to the Update:

From CNBC:

Nvidia has emerged as the world’s most valuable company by dominating the market for artificial intelligence chips in the data center. Now the company is expanding its prowess to chips that will serve as the main processor for personal computers, entering an arena that’s long been ruled by Intel, Advanced Micro Devices, Qualcomm and Apple. During a keynote address at Taiwan’s Computex conference on Monday, Nvidia CEO Jensen Huang unveiled a new PC processor made alongside Microsoft. The RTX Spark superchip, which Huang also referred to as the N1X, debuts in the fall on a fresh line of Windows PCs from Microsoft, Dell, HP, ASUS, Lenovo and MSI.

I’m actually starting in Taipei on Sunday, where Huang introduced the long-rumored Nvidia PC chip; from Tom’s Hardware:

At full strength, this chip offers up to 20 Arm CPU cores, a Blackwell GPU with 6,144 CUDA cores, 128GB of LPDDR5X RAM, and up to 300 GB/s of memory bandwidth. That powerful CPU and GPU, connected over NVLink C2C, and the large memory pool give AI agents and 120-billion-parameter models plenty of power and space for long-running tasks with context lengths stretching to a million tokens, according to Nvidia.
We don’t have any benchmarks yet, but the RTX Spark appears to be broadly similar to the DGX Spark; that’s a decent chip that excels at prefill, but is slower than an M5 Max at decode (thanks to lower memory bandwidth), and significantly slower at CPU tasks. Huang appeared during the keynote via live video to discuss the chip. Satya Nadella: Suddenly, this concept of unmetered intelligence right at the edge is so hot again. So maybe you want to talk a little bit about this: you have thought about this, talked about this, and now, of course, with RTX Spark really delivered, I think, what’s a breakthrough system for AI to be much more ubiquitous. But maybe, Jensen, you can just share a little bit your vision around where you see this going. Jensen Huang: Well, this all started about three years ago between a conversation between you and I. And we were talking about how we could build a new class of PCs that’s incredible for designers and creators. And it would be incredible for artificial intelligence. And it would be one of these systems that has the processing capability, but also the software stack that’s integrated into the world’s design packages and creator packages. And, of course, all the things that we’re doing with AI. And here we are, three years later, we built an incredible new chip. And this system is supported by all of this new software that you created for Windows. And we now have the ability to have essentially an autonomous agent running on the PC. This clip explains why I find this chip specifically, and AI PCs generally, pretty underwhelming. Three years ago we were still in the ChatGPT era of AI, and I was very excited about the possibility of local inference. Then came the reasoning era, blowing up KV cache (which increases the need for more memory) and emphasizing the importance of decode (to generate that many more tokens). Now we’re in the agentic era, where CPU performance is incredibly important. 
To that end, the ideal setup for a local agent is strong local CPU performance and calling out to the cloud for inference. The RTX Spark, however, spends tons of die space on GPU cores that are inferior to the cloud (because of memory size and bandwidth if nothing else) at the expense of CPU. It’s a suitable chip if you just want a chatbot circa 2023; it’s hard to see it being worth the price — or the software compromises that are the reality of Windows on ARM — in 2026. Jump ahead to the Build keynote, which I found very underwhelming to start. Nadella opened with a brief overview of the AI stack, then started talking about Windows, and I was honestly pretty surprised at the lack of vision and enthusiasm. That’s when it occurred to me: I think that Nadella agrees with me! Sure, some local inference is nice, but that’s not where the AI that matters is going to be located. Nadella, keep in mind, has no real loyalty to Windows; indeed, I credit him with The End of Windows . Specifically, Nadella didn’t end Windows as a product, but he ended its run as the organizing principle around which the entire company operated, focusing on software that ran everywhere and a cloud that ran everything. That leads to a surprising takeaway, and the most interesting part of the Build keynote: what if Microsoft is actually well positioned to get back into AI devices? From GeekWire : A team inside Microsoft has been quietly building a platform for devices that run AI agents instead of apps, based on Android instead of Windows, with two working hardware designs so far, and an initial set of big-name companies lined up to run pilots. The platform, dubbed “Project Solara,” is Microsoft’s bet that AI will open up entirely new scenarios for computing — using agents to avoid the constraints of traditional software, and off‑the‑shelf components to develop new devices quickly and inexpensively. 
Project Solara is, to be clear, vaporware at this point, although the company did show real devices and has signed up Qualcomm and MediaTek as chip partners. It is also extremely compelling. Here’s how Nadella introduced it: So far, we’ve talked about the edge and the cloud. The current form factors, right? I mean, when I saw that Jensen picture from the weekend where he had all the desktops, I felt like, man, I’m back in the 90s, right? Because it was so cool to see the lineup of all the machines that I loved and I grew up with back yet again with new functionality, right? It’s the same form factor, but unbelievable new functionality because of the onboard AI capability, right? So that’s sort of what we’ve seen with the laptop, the desktop, and of course with the cloud. But it also, you know, sets up that next question: if you have that capability, which is new function, and you can put it into existing form factors, can you even purpose-build new form factors for the new function? Can you build a new platform even for the agent era? And that is the motivation behind Project Solara, which we’re introducing today. First off, note the framing: the PC is old tech with agents; what about new tech uniquely enabled by agents? And note the classic Microsoft hook: could that new tech sit on top of a new platform? Corporate Vice President Steve Bathiche, the head of Microsoft’s Applied Sciences Group, explained the vision: Before I talk about those awesome new devices you just saw, let me start with the why. Back at Build 2023, I talked about the outside AI application structure, where AI moves from operating within the application frame to operating globally, working across multiple apps and services to connect, coordinate, and maintain context across entire workflows, devices, and time scales. 
What if there were an ecosystem of devices specifically designed for that new type of application structure, for those types of agents, for that transformational interaction technology? That is the impetus behind Project Solara. But with so many possible forms, which one do you pick? What is the next device? You see, the big aha for us is that it’s not about choosing one specific form factor. It is about creating a system that extends your agent across a constellation of devices. The next computer is not one device. It is all these devices working together as one system, with agents showing up closer to where and when you need them. There was one brief moment in the promotional video that preceded Bathiche’s appearance that made the concept click for me: The problem with wearable devices is the interaction model: they are only useful when you are interacting with them, when the human is in the loop, but being in the loop with a wearable is annoying and inefficient. What is being demonstrated here, however, is a brief interaction, and then an agent doing work in the background. In other words, the usefulness happens in the cloud without the human needing to be involved, because an agent is doing the work. That’s what I find compelling. On one hand, you can make the case that of course Microsoft would be interested in a device model that uses the cloud as a platform, given that Microsoft doesn’t control a mobile device like an iPhone. What occurs to me, however, is that even if Microsoft doesn’t succeed with Project Solara, this model — where the cloud is the hub and multiple devices are the spoke, instead of the phone being in the center — is clearly a better one for agents. Agents work best in the cloud, and across apps and devices; yes, the phone might be one of those devices, but when it comes to agents it shouldn’t be the hub. Again, this is vaporware, and very much in Microsoft’s interest, so take Project Solara with the appropriate grain of salt. 
It’s a vision of the future, however, that does make a lot of sense, particularly in an enterprise scenario where all of the context and compute is already in the cloud (and Project Solara is focused on enterprise, not consumer). It’s also something completely different from the past, and fits my thesis that, in the age of AI, thin is in . From GeekWire : Microsoft has based much of its AI business on models from OpenAI, before expanding more recently to Anthropic. On Tuesday, the company showed how it plans to rely less on both. At the Build developer conference, the Microsoft AI Superintelligence Team unveiled a family of seven models built from scratch. It’s part of an ongoing effort by the company to build credible in-house alternatives to models from partners and rivals with competing allegiances… The flagship of the seven newly announced MAI models is MAI-Thinking-1, a reasoning model that Microsoft says draws even with Anthropic’s Claude Sonnet 4.6 in blind human testing, and matches the more capable Claude Opus 4.6 on a widely used coding benchmark. [CEO of Microsoft AI Mustafa] Suleyman stressed that MAI-Thinking-1 was trained from the ground up with no distillation from other companies’ models, looking to appeal to enterprises that care about clean data lineage. These models seem pretty decent, all things considered, but what was interesting to me was the framing: Microsoft emphasized that enterprises could take these models and make them their own. Suleyman said: This is what owning the full stack end-to-end looks like. It’s the foundation of Microsoft Frontier Tuning, it lets you customize the MAI models using our full stack hill climbing machine right where you want it. And it means that the disciplined and very relentless engineering that has gone into building our models is now available to all of you on a platform that you can trust, working on your behalf to create custom agents that you will control. 
So the really big thing, of course, that’s happened in the last year is these RLEs, reinforcement learning environments, these unique training gyms for your AIs. They create company and task-specific agents adapted only to you, built on MAI models. So for example, within Microsoft, we use our RLEs combined with our MAI models to climb towards the best agentic use cases on Excel. Our MAI-tuned model is now on par with GPT 5.4 on public and private benchmarks, whilst at the same time being 10 times more efficient on cost, and many other early adopters are seeing similar results. When we’ve tuned our models on McKinsey’s tasks, MAI delivered the highest win rate, even outperforming GPT 5.5, and again delivering 10x greater efficiency on cost. So to us, this is the advantage of very carefully calibrated frontier tuning. And importantly, unlike with some of the other companies, with MAI, you don’t rent intelligence from a shared model that learns from everybody. Only you keep the benefits of your hard-earned workflows, know-how, knowledge, and your own institutional data. Only you get to control the resulting model. And so with us, the RLEs and the models that you build inside of them, they become your moat. I really think this is distinct. It marks a new era in AI that we’re all very, very excited about. This has shades of AWS’s Nova Forge offering , which lets enterprises add their data at a checkpoint in pre-training; it’s a little different in that it’s more focused on reinforcement learning, but those lines are getting blurred. The concept is that enterprises get to have their own model for their own data, without sharing it with the frontier labs that want to eat their lunch, and it’s a concept that is certainly appealing in theory; the real test will be to see if enterprises that choose this route aren’t penalized by not being on the cutting edge of functionality. 
Then again, helping cautious enterprises embrace the future on their terms, without necessarily having to win on pure performance, is exactly how Microsoft has long maintained its position.

DHH 2 days ago

A pond of interesting problems

The great joy of having built a successful business that employs a broad team of talented people is that I get to fish for exactly the kind of problems that most interest me, most of the time. Usually, this coincides well with the needs of the business. When we moved out of the cloud, I spent months getting Kamal off the ground, so we didn't have to get mired in the complexity of Kubernetes. Fun problem to solve! And of course, the origin story of Ruby on Rails is that Basecamp gave birth to it all back in 2003. Because I simply wanted Ruby to work well for the web, and we needed a platform to build the business. But sometimes it's also a bit further afield. We had our big clash with Apple over the App Store's monopoly abuses back in 2020, but it wasn't until 2024 that I severed our exclusivity with the Mac on the engineering side by moving to Linux, and ultimately building Omarchy. I don't always get to choose, of course. There are occasionally urgent problems that just need our, and therefore my, full attention as a company, or humdrum issues that I just happen to be best qualified to tackle. But this is increasingly rare because of all those great people we've managed to assemble at 37signals. And that's how it should be! Building a successful business should yield dividends beyond just the financial ones. It should afford you more opportunity to press your comparative advantage, so you spend most of your time on the projects that stimulate a little Call of the Wild. Never to the point of being too good for anything, mind you. Taking out the trash is still everyone's job some of the time. But mostly, I want to be sitting by the pond of interesting problems, fishing for the ones that catch my eye and hook my motivation.  Who could wish to retire from that?


How Other Link Checkers Do Recursion

After I published Five Years of Trying to Add Recursion to lychee , one reply I got was a very fair question: If recursion is so hard, how do other link checkers do it? Plenty of them already crawl websites! This sent me down a rabbit hole of reading the code of other link checkers. The key takeaway is: they didn’t find a clever trick we missed. They were built as crawlers from the very first commit, and I initially built lychee as a stream. I went and read the source of the recursive checkers we list in lychee’s README : muffet (Go), LinkChecker (Python), linkinator (TypeScript), and broken-link-checker (JavaScript). This post is a teardown of how each one actually handles recursion, what it costs them, and what it means for lychee. If you haven’t read the first post , the summary is that lychee was architected as a one-shot, unidirectional pipeline ( ). Recursion needs a cycle (responses create new inputs), and cycles in an async, channel-based pipeline are where the dragons live . 🐲 Five years and four attempts later, the pieces we’ll need to do it properly only just landed. DAGs vs. cycles Every recursive checker I looked at is built from the same three parts: Diagrammatically, lychee is different from the others: Crawlers have a back-edge baked in. Our pipeline doesn’t, and every one of my failed attempts was an effort to bend that back-edge into a graph that was never designed for it. Let’s look at that graph design more closely: Note that the visited check happens in the enqueue step, atomically with the mark, before the worker ever touches the network. That ordering is the entire fix to the deduplication race that haunted lychee’s attempts 1–4, where the cache was written after checking. Each tool uses a variation on it. muffet (Go): a WaitGroup and a Set muffet is closest in spirit to lychee: a fast, single-binary, concurrent website checker. The dedup + scheduling decision lives in one method ( ): is a (a mutex-guarded ). 
returns whether the URL was already present, so a page is only scheduled the first time it’s seen. Dedup happens at enqueue, synchronized by the set’s mutex. This is basically a line-by-line translation of the diagram above. Checking a page fetches all of its links concurrently, and feeds qualifying ones back into , the back-edge: How muffet knows it’s done muffet’s answer to termination is a little built around a ( ): Every scheduled page increments the group; every completed page decrements it; returns when the count hits zero. The whole crawl bootstraps with a single before , so the counter is positive before anyone waits on it. This is the same counter I tried (and failed with) in Attempt 1 and Attempt 4 . The difference is the invariant: is only ever called from inside an already-running daemon that holds the count above zero (or from the bootstrap). There is no window where the counter briefly reads zero while work is still pending. Go’s enforces this invariant so naturally that it doesn’t feel like distributed termination detection at all, but that’s exactly what it is. It’s the moral equivalent of the primitive Kait contributed to lychee in 2026 . Where the tradeoffs are Concurrency isn’t bounded by the daemon manager. does for every task, spawning unbounded goroutines. The actual limiting happens downstream in a (a buffered-channel counting semaphore) and a per-host throttler pool. muffet separates “the frontier” from “the rate limiter,” which is exactly the separation lychee lacked when it tried to use one bounded channel as both in the past. Cheap goroutines do a lot of heavy lifting. Spawning a goroutine per link is “fine” in Go. The equivalent in Rust ( per link, each needing state) is what pushed me toward and the ownership pain I wrote about . On extensibility, muffet is a focused CLI, not a library. There’s no plugin surface; you get what the flags give you. 
lychee deliberately ships as a reusable crate, which raises the bar, since every architectural choice has to uphold the standards of a public API. On scalability, unbounded goroutines plus an in-memory visited set scale comfortably to large sites, but there’s no disk-backed frontier, so a truly enormous crawl is bounded by RAM. Same as lychee. Takeaways: muffet LinkChecker (Python): a joinable unbounded queue LinkChecker has existed since the year 2000. It’s a synchronous, thread-pool crawler. Its frontier is a hand-written ( ), a clone of Python’s with / . Look at the very first design comment: It’s explicit about the exact deadlock that bit me. That comment is our Attempt 4 backpressure deadlock , called out and designed around. lychee tried to push discovered URLs into a bounded channel; when it filled, the response handler blocked, no responses drained, no slots freed. Deadlock. 💥 LinkChecker’s answer is brutalist in nature: the frontier is unbounded . Backpressure is enforced elsewhere (a fixed thread count and per-host throttling), never by blocking a producer that is also a consumer. Termination by counter, done right blocks until hits zero ( ): Again: a counter. But the increment in and the decrement in are both inside the queue’s lock, and a worker calls only after fully processing an item including enqueuing its children . So children are counted before the parent is marked done, with no premature zero. It’s semantics implemented with a mutex and a condition variable. Deduplication, before the request LinkChecker writes the URL into its result cache at enqueue time ( ): That sentinel is a “fix” that’s missing in lychee’s attempts. By the time any worker thread checks the URL, the cache already says “mine,” so concurrent discovery from another page is a no-op. Per-host politeness and termination guards The ( ) throttles per host: and calls so a stuck crawl can’t hang forever. Where the tradeoffs are Blocking threads instead of async. 
Each of the (default 10–100) threads does blocking I/O via . Simple and battle-tested, but the concurrency ceiling is the thread count, and each thread carries a full stack. lychee’s Tokio model reaches thousands of concurrent in-flight requests on a handful of OS threads; LinkChecker can’t, and doesn’t try. The unbounded frontier trades a deadlock for unbounded memory. The explicit “no max size” decision means RAM growth on huge sites. There’s a cap and a periodic to mitigate it. Extensibility is excellent. LinkChecker has a real plugin system ( : anchor checks, SSL, virus scanning, and more) and many output loggers. This is the most extensible of the bunch, and it pays for that with a large, mature, somewhat old-fashioned codebase. On scalability, it’s GIL-bound and thread-limited, so raw throughput is the lowest here, but correctness and feature coverage are high. Takeaways: LinkChecker linkinator (TypeScript): Single-Threaded linkinator is a Node.js checker, and it benefits from something neither Go nor Rust provides: a single-threaded event loop . Check-and-insert into the visited set is atomic for free , because no two callbacks run simultaneously. The frontier is a concurrency-limited (a p-queue-style structure). Termination is one line in ( ): is the library’s termination detection: it resolves when the queue is empty and no task is in flight. Same idea as muffet’s and LinkChecker’s , just expressed as a promise and backed by a single-threaded runtime, so no Mutex is needed to protect the visited set. The back-edge and the race-free dedup When crawling, GETs the page, extracts links, and for each new URL re-enters the queue ( ): Because JavaScript is single-threaded, the entire thing executes without interruption. In Rust or Go, that’s a critical section you must guard with a mutex (and get the ordering right); in Node it’s just three statements. This is the single biggest reason recursion is easier in Node than in Rust. It’s just a language feature. 
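For contrast with the single-threaded case, here is a minimal Python sketch of the guarded critical section that a multi-threaded runtime needs. The `VisitedSet` class, its `claim` method, and the `discover` helper are hypothetical names for illustration; none of this is code from linkinator, muffet, or lychee. The point is only that the check and the insert sit under the same lock, so exactly one discoverer of a URL gets to schedule it.

```python
import threading


class VisitedSet:
    """Hypothetical visited set: check and mark happen under one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._seen = set()

    def claim(self, url):
        """Return True only for the first caller that presents this URL."""
        with self._lock:
            if url in self._seen:
                return False
            self._seen.add(url)
            return True


def discover(visited, urls, scheduled):
    # Every simulated page reports the same links; only the first claim
    # of each URL is allowed to schedule it.
    for url in urls:
        if visited.claim(url):
            scheduled.append(url)


visited = VisitedSet()
scheduled = []
links = [f"https://example.com/page/{i}" for i in range(50)]

# Eight "pages" discover the same 50 links at the same time.
threads = [
    threading.Thread(target=discover, args=(visited, links, scheduled))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

If the membership test and the `add` were two separately locked steps, two threads could both pass the test before either inserted, which is the kind of ordering mistake the event loop rules out by construction.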
linkinator also keeps a of keys, and a map so it can wait on an in-flight check and still report a duplicate broken link against every parent that references it. Those reuse-operations are themselves pushed onto the same queue, so correctly waits for them too. HEAD vs GET linkinator uses for leaf links but when it needs to crawl, because recursion needs the response body to find more links : This is precisely lychee’s remaining open problem : you can only recurse into pages you fetched with a body. linkinator just always GETs when crawling; lychee plans to reuse the body it already has in cache from the check it just performed. Where the tradeoffs are Single-threaded is both a blessing and a ceiling. No data races, trivially correct dedup, but HTML parsing is CPU work that blocks the one event loop. For thousands of pages, you’re bound by a single core. lychee’s multi-threaded runtime parses and checks in parallel. It suffers from in-memory result inflation. The source explicitly comments on “massive result inflation for heavily interlinked sites”: the array, , and all grow with the crawl. Fine for a docs site, heavy for a giant one. Rate limiting is reactive, not proactive. There’s a that backs off per host on a with , but no general per-host concurrency cap like lychee’s . linkinator can hammer a host until it complains; lychee now paces before the complaint. For extensibility, it’s an ( , , and so on), so it’s embeddable and scriptable, which is nice. It’s a library first, like lychee. Takeaways: linkinator broken-link-checker (JavaScript): event-driven, using two queues broken-link-checker (BLC) takes the event-driven model furthest. It’s built on , a queue with (concurrency) and , and it nests two of them: a site-level queue feeding a page-level . The frontier and dedup live in ( ). 
Visited pages are tracked in a , written at enqueue time: Recursion is governed by a filter that decides whether a discovered link becomes a crawled page: Termination by event cascade BLC has no counter and no . It rides the queue’s drain events. When the page-level queue empties it fires , which makes emit and call the site queue’s callback; when the site queue drains, it fires . That’s the public : That’s their termination detection, expressed as “the request queue reported empty.” And in classic Node.js fashion, the callback is what actually tells the site queue to free up a slot for another site. So the termination of one site is what allows another to start, and the termination of the whole crawl is what allows the process to exit. It’s a cascade of events that propagates from the page queue to the site queue to the process. Where the tradeoffs are It’s the best web citizen of the bunch. robots.txt is honored ( , ), is respected, and plus are first-class. This is a crawler that’s polite by default. Event cascades are powerful but fiddly. Termination is spread across half a dozen event handlers and two nested queues. It works, but the control flow is much harder to follow than . This is the JS cousin of the “leaky abstraction” problem I described, where recursion-awareness ends up sprinkled across many handlers. It’s single-threaded, the same ceiling as linkinator, plus the in-memory per site. On maturity versus momentum, it’s very widely used (it powers a lot of tooling), but development has slowed. The architecture is still sound and worth studying. Takeaways: broken-link-checker A note on markdown-link-check and the “industrial” crawlers Our README marks markdown-link-check as supporting recursion, but there’s some nuance there: it recurses over Markdown files , not by spidering a live website. There’s no HTTP frontier and no termination problem in the sense above. Worth a mention so the comparison is honest, not worth a teardown. 
If you want to see the pattern at full industrial scale, look at Scrapy (Python/Twisted) or Colly (Go). Both use the same approach: a scheduler (frontier) with a pluggable, optionally disk-backed queue, a dupefilter (often a Bloom filter rather than a ), a bounded downloader pool, and explicit “engine idle → close spider” termination. They solve exactly the problems lychee struggled with ( distributed termination detection , backpressure, dedup), just with years of dedicated crawler engineering behind them. The takeaway isn’t “lychee should be Scrapy”: it’s that crawling is a well-trodden architecture, and lychee is simply standing on a different one right now. Side-by-side Tool Lang / runtime Concurrency model Frontier “Done?” signal Dedup point Per-host limiting muffet Go, goroutines goroutine pool + semaphore + host throttler mutex-guarded set + daemon channel visited set at enqueue host throttler pool LinkChecker Python, threads fixed blocking thread pool unbounded joinable-queue counter ( ) result cache at (req/s) linkinator Node, event loop single-thread + p-queue ( ) p-queue at enqueue (race-free) reactive broken-link-checker Node, event loop ( ) nested request queues queue-drain events at enqueue + lychee (2026) Rust, Tokio tasks + channels + per-host pool lychee in 2026 finally has a column-for-column match. The is muffet’s and LinkChecker’s . The is BLC’s / and LinkChecker’s . The per-URI mutex is everyone’s enqueue-time dedup. So Why Couldn’t We Just Copy Them? Three reasons, in increasing order of how much they’re actually lychee’s fault. They started as crawlers; lychee started as a stream. Every tool above has a back-edge in its core data structure. lychee’s core was a DAG optimized for the 99% case (a list of files/URLs, checked once, fast). Retrofitting a cycle onto a pipeline is much harder than having one from the start. The problem is architectural in nature. The frontier and the rate-limiter must be different objects. 
muffet (set + semaphore), LinkChecker (unbounded queue + thread count), linkinator (p-queue + delayCache), BLC (request queue + maxSockets) all keep “what to do next” separate from “how fast to go.” lychee’s early attempts tried to make one bounded channel serve both roles, and a cycle through a bounded channel deadlocks. The fix (lychee’s plus a over an unbounded work source) is the same separation we’re aiming for now. Single-threaded runtimes get dedup for free. Both Node tools dedup with a plain and zero locking, because the event loop serializes access. Go and Python pay a mutex. Rust pays a mutex and fights the borrow checker about who owns the shared state across . That’s the ~30% “Rust tax” I estimated last time : not the algorithm, but the friction of expressing shared mutable frontier state under . None of this is a knock on lychee’s design. A unidirectional stream is the right call for the common, non-recursive case: it’s why lychee is fast and why the 30% channel regression from Attempt 2 was a dealbreaker. The other tools pay for their back-edge on every run, recursive or not. lychee refused to, and that principle is exactly why recursion took five years and why, when it lands, it won’t slow down the path everyone actually uses. I believe that we can have our cake and eat it too: a crawler architecture that supports recursion without sacrificing the speed of a one-shot pipeline. But it’s a harder problem than just “copy what they do,” because most link checkers didn’t start with uncompromising performance as their top goal. 
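The second reason above, keeping "what to do next" apart from "how fast to go", can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any of the tools is implemented: `SITE` is a made-up in-memory link graph standing in for HTTP responses, and `enqueue`, `fetch`, `worker`, and `MAX_IN_FLIGHT` are hypothetical names. The frontier is an unbounded `queue.Queue`, so a worker that is both consumer and producer never blocks while feeding links back; a separate semaphore is the only thing that throttles requests.

```python
import queue
import threading

# Hypothetical in-memory "site" standing in for real HTTP responses.
SITE = {
    "/": ["/a", "/b", "/c"],
    "/a": ["/b", "/d"],
    "/b": ["/", "/e"],
    "/c": ["/a"],
    "/d": [],
    "/e": ["/c"],
}

MAX_IN_FLIGHT = 2
frontier = queue.Queue()                      # maxsize=0: unbounded, put() never blocks
limiter = threading.Semaphore(MAX_IN_FLIGHT)  # the only object that limits request rate

visited = set()
visited_lock = threading.Lock()
stats = {"in_flight": 0, "peak": 0}
stats_lock = threading.Lock()


def enqueue(url):
    """Claim the URL at enqueue time, then schedule it."""
    with visited_lock:
        if url in visited:
            return
        visited.add(url)
    frontier.put(url)


def fetch(url):
    """Stand-in for a request; concurrency is bounded by the semaphore alone."""
    with limiter:
        with stats_lock:
            stats["in_flight"] += 1
            stats["peak"] = max(stats["peak"], stats["in_flight"])
        links = SITE.get(url, [])
        with stats_lock:
            stats["in_flight"] -= 1
    return links


def worker():
    while True:
        url = frontier.get()
        if url is None:           # sentinel: the crawl is over
            frontier.task_done()
            return
        for link in fetch(url):
            enqueue(link)         # the back-edge: never blocks on an unbounded queue
        frontier.task_done()      # only after the children are enqueued


workers = [threading.Thread(target=worker) for _ in range(4)]
for t in workers:
    t.start()
enqueue("/")
frontier.join()                   # returns once every scheduled page is done
for _ in workers:
    frontier.put(None)
for t in workers:
    t.join()
```

Giving the queue a small `maxsize` would recreate the failure mode described above: every worker could end up blocked in `put()` with nobody left to drain the queue.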
Key takeaways

So when someone asks “how do other link checkers do recursion?”, the real answer is: they made it a part of the architecture from the beginning, and they leaned on a runtime (providing conveniences like a , a joinable queue, an idle promise) that solved termination without solving “distributed termination detection.” Thanks to the maintainers of muffet, LinkChecker, linkinator, and broken-link-checker: reading your source is the clearest way to learn about crawler architecture out there and we’re all in this together, just with a different set of tradeoffs.

The three parts every recursive checker is built from:
A mutable work queue (let’s call it “frontier”), not a fixed input stream. Discovered URLs go back into the same queue they came from.
A visited set that’s updated at enqueue time (before the request completes), so two pages discovering the same link can’t both submit it.
A primitive that answers “is everything done?”: a , a joinable-queue counter, an promise, or a queue-drain event.

Takeaways: muffet
muffet’s termination is a , full stop. It’s the design lychee converged on after five years; muffet got it for free from Go’s standard library on day one.
The frontier and the concurrency limiter are separate things. A mutex-guarded set is the frontier; a semaphore plus host throttler bounds concurrency. Conflating them is what deadlocked lychee.
Goroutines hide the cost that Rust makes you pay explicitly. The same per-task model that’s trivial in Go is where Rust’s /ownership friction shows up.

Takeaways: LinkChecker
The unbounded frontier is a deliberate anti-deadlock choice, documented in a one-line comment. It describes the exact problem we hit in lychee in attempt 4.
Dedup at time (a placeholder in the cache) is their synchronization mechanism. The cache must claim the URL before the request, not after.
Threads buy simplicity at the cost of throughput. A blocking thread pool is the easiest correct model… and the slowest one.

Takeaways: linkinator
 is the termination mechanism. Simple and provided by the JS runtime.
A single-threaded event loop makes request deduplication pretty much free. This is the biggest structural reason recursion is easier in that case.
Reactive 429 backoff is not the same as proactive per-host pacing. lychee’s aims higher, at the cost of more machinery.
Termination is a cascade of queue-drain events, not a counter. Same idea, different syntax. Politeness is built in. robots.txt, , and make it the most server-friendly recursive checker by default. Event-driven control flow is the cost. Distributing recursion logic across many handlers is exactly the kind of spread-out complexity that makes the feature hard to reason about. There is no secret sauce. Every recursive checker is a worklist plus a visited set plus a quiescence detector. The “trick” is being shaped like a crawler from commit one. Termination is always the same idea wearing different clothes: (muffet), joinable-queue counter (LinkChecker), (linkinator), queue-drain events (BLC), (lychee 2026). All of them are distributed termination detection. Dedup belongs at enqueue, before the request. Marking a URL visited after checking it (what lychee did for four attempts) is the bug. Everyone else claims the URL the moment it enters the frontier. Separate the frontier from the rate limiter. A bounded channel that is both your queue and your backpressure will deadlock the instant you add a cycle. There is no free lunch. Node’s single thread makes dedup trivial at the cost of performance; Go’s goroutines and make termination trivial at the cost of a runtime; Rust gives you neither for free but hands you a compiler that refuses to let the races compile and you can get the network card to glow if you know exactly what you are doing.
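The "worklist plus visited set plus quiescence detector" recipe fits in a few lines. This is a rough sketch of the joinable-queue flavor, not any tool's actual implementation: `SITE` is a made-up in-memory link graph with a cycle, and `crawl` is a hypothetical name. URLs are claimed in `visited` at enqueue time, and `Queue.join()` plays the role of the "is everything done?" primitive.

```python
import asyncio

# Hypothetical site: page -> links found on it (note the /a <-> /b cycle).
SITE = {
    "/a": ["/b", "/c"],
    "/b": ["/a", "/c"],
    "/c": [],
}


async def crawl(start, workers=2):
    frontier = asyncio.Queue()       # unbounded worklist
    visited = {start}                # claimed at enqueue time, not after the check
    frontier.put_nowait(start)
    order = []

    async def worker():
        while True:
            url = await frontier.get()
            try:
                await asyncio.sleep(0)           # stand-in for fetching the page
                order.append(url)
                for link in SITE.get(url, []):
                    if link not in visited:      # dedup before the request exists
                        visited.add(link)
                        frontier.put_nowait(link)
            finally:
                # Children were enqueued above, so the pending count
                # cannot reach zero while work is still being discovered.
                frontier.task_done()

    tasks = [asyncio.create_task(worker()) for _ in range(workers)]
    await frontier.join()            # quiescence: every put matched by a task_done
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return order
```

The single-threaded event loop is what lets `visited` be a plain set here; a multi-threaded version would need a lock around the check-and-add.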

Max Bernstein 2 days ago

A survey of inlining heuristics

Compilers, especially method just-in-time compilers, operate on one function at a time. It is a natural code unit size, especially for a dynamic language JIT: at a given point in time, what more information can you gather about other parts of a running, changing system? I don’t have any data to back this up—maybe I should go gather some—but on average, methods are small. Especially in languages such as Ruby that use method dispatch for everything, even instance variable (attribute, field, …) lookups, they are small . And everywhere. This makes the compiler sad. If we are to continue to anthropomorphize them, compilers like having more context so they can optimize better. Consider the following silly-looking example that is actually representative of a surprising amount of real-world code: Right now, in the method, I count 8 different method calls: (Technically more, but the ivar lookups (including !), addition, and subtraction are generally specialized and don’t push a frame, even in the interpreter.) Furthermore, there are at least two heap allocations: one for each instance. Last, there is a bunch of memory traffic to and from instances. This all is a huge bummer! What should be a simple math operation is now overwhelmed with a bunch of other stuff. is certainly not a zero-cost abstraction. Even if we had a bunch of other optimizations such as load-store elimination or escape analysis, they would not be able to do much: pretty much everything escapes and is effectful. That is, unless we inline . Inlining is the lever that enables a bunch of other optimization passes to kick in. I wrote about the design and implementation of Cinder’s inliner ( FB link , personal blog link ) a couple of years ago. I wrote about arguably the simplest part, which is copying the callee body into the caller. It took me at least a week to get working. Probably closer to months if you consider all the plumbing through the rest of the JIT. 
In February during a small hackathon, I watched my colleague k0kubun prototype that bit of the inliner inside ZJIT in about 30 minutes. There is more to do when pretty much every part of the VM is observable from the guest language: both Python and Ruby allow inspecting the state of the locals, the call stack, etc from user code. Sampling profilers also expect some amount of breadcrumbs to work with to inspect the stack. So there’s some more machinery still required to pretend like the callee function was not inlined. I talk about this a little bit in the Cinder blog post. Even so, all of that can probably be designed and wired together in a couple of months. Then you will find yourself tuning the inliner for the next 10 years. This is much harder. The thing that makes inlining difficult, especially in a method JIT, is that you are trying to make an entire (dynamic!) system faster but you are only looking through a microscope and only capable of local reasoning 1 . Whereas other optimizations such as strength reduction, inline caches, and value numbering are an un-alloyed good for the generated code, inlining can have negative effects . It is also perhaps the first optimization people add that has non-local impact. If you inline wrong, your code size might blow up. This might thrash your CPU’s caches. Bummer, but happens to the best of us. But also, if you inline wrong, you might get in the way of other helpful optimizations: if you hit some size limit after inlining method A, you might never get to inline B, which is the key to unlocking the performance of the method you are trying to optimize. Last, inlining might hurt compile time. In situations where latency is paramount (think: interactive client JavaScript), adding tons more code into the fray might add noticeable hiccups, even if the long-term throughput improves. As always, in-band compilation is a trade-off because any time you spend compiling, you are not executing code . 
You have to write your compiler to reason about all of this stuff. So you have heuristics. For example, here is Michael Pollan’s inliner heuristic: Inline methods. Mostly small. Not too many. I did a survey of a bunch of compilers, mostly JIT compilers, to see what their inlining heuristics look like. I also read (skimmed) some papers to see what those folks had to say. I wonder if they agree. This post was a long time coming. I started working on it about five years ago but then when I quit working at Facebook I accidentally left behind all of the inliner research I did for Cinder’s inliner. So then I kind of just thought about it aimlessly for a while before redoing it this year. Anyway, here’s wonderwall. Spoiler alert: all in all, people tend to look at: And also have different interesting ways to pipe in profile information. Last, some newer papers do some wild stuff: Another thing to consider in inlining is how you gather and interpret profiles. When you compile a function, you tend to specialize it based on the input it has historically been given. For a monomorphic input, maybe you guard that the type is still the same and otherwise jump into the interpreter. For a polymorphic input, maybe you check the top K (~4) common cases and otherwise jump into the interpreter. Fine. But sometimes you can be compiling a polymorphic method that is actually monomorphic in its caller . That is, might only ever pass one kind of input to , but other callers pass all kinds of stuff. Here is a bit of a silly example to show what I mean: Just kidding, not so silly at all. It’s a super common pattern in Rails . It makes polymorphic in even though for many of its callers, it may well be monomorphic (or even a constant). In order to plumb this information through to the compiler, you have to figure out this call context relationship. There are a couple of common ways to do it. 
YJIT, for example, though it does not inline, splits methods based on the types of the arguments going in. This means that it clones the compiled code, generating a new version for each context. This does not give call context (“A calls B”) but gives type context (“B is called with integers, B’ is called with strings”). A compiler could do type-based splitting in the interpreter or a baseline tier. If you don’t fancy duplicating the code, you can instead duplicate the profiles. You could either do this using type context (as above) or using call context. SpiderMonkey, for example, does “trial inlining” that allows callers to pass down a bit of memory for potential inline candidate callees to record their inline caches. Instead of each function holding its own ICScript, the caller allocates a unique ICScript for that potential-inline call-site. This gives each callee function (at least?) one level of call context. Later, when inlining the callee into the caller, we don’t have other callers’ type information polluting the IR builder (or whatever reads the profiles). JavaScriptCore handles this by inlining bytecode into other bytecode. This is a gnarly transformation but gives the interpreter, even (!) access to call context. On tier-up to the compiler, all the inlining decisions have been made already. HotSpot handles this with multiple tiers. The interpreter tiers up to the client compiler, C1. C1 profiles branch and call targets in compiled code. C1 may eventually recompile based on this new information. C1 may eventually tier up to C2, which copies C1 inlining decisions. This way, we get call context in profiles via inlining. One last thing you could do is just trust your type inference and branch folding in the optimizer. You could inline and do polymorphic specialization in the callee when building the IR, then hope that your branch pruning monomorphizes the inlined callee. 
It’s a little wasteful because the polymorphic code is built “for nothing”, but it might work fine? Okay, onto the collected notes and half-baked commentary. Here’s a survey of a bunch of JIT compilers and how they reason about inlining heuristics. But before we get into that, thanks to Iain Ireland, CF Bolz-Tereick, and Ian Rogers for feedback on this blog post! What follows is mostly a “bits and bobbles” section a la Phil Zucker. We’ll start with Cinder, because when I wrote Cinder’s inliner I added only the simplest heuristics, mostly “don’t inline” signals. Over time, after I left, people tuned it a bit more. The inliner starts from the caller CFG, walking it to find suitable inlining candidates. Inlining candidates are only for call targets that are known (in Cinder’s case, only for monomorphic call targets) and pass some checks. The callee is only known by its function object, which includes its bytecode. There is no IR available for the callee until we decide to inline. Most of the “can’t handle this” checks are related to argument handling. Python has a pretty complex calling convention, so if the caller/callee have not agreed on how the arguments should be passed through, the inliner doesn’t care to try and figure it out on its own. That is the responsibility of other parts of the compiler. Things in this function could be considered “TODO”. Failures are logged so they can be analyzed. If the Cinder team determines that there is some very frequent case they should handle, they will find out from the logs. The inliner collects all candidate call instructions in one pass over the CFG. It loads the configurable “cost limit” from the options struct. Then it does one pass over the inlining candidates vector, inlining until it (maybe) hits the cost limit. It does some graph maintenance work after inlining these calls, but that’s it. 
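The collect-then-inline loop described above is roughly the following shape. This is a sketch in Python, not Cinder's code: `Call`, `callee_size`, and `inline_greedy` are hypothetical names, the cost metric is a stand-in, and stopping at the first candidate that would exceed the limit (rather than skipping it and continuing) is my assumption.

```python
from dataclasses import dataclass


@dataclass
class Call:
    callee: str
    callee_size: int      # hypothetical cost metric, e.g. bytecode length
    monomorphic: bool     # only known, monomorphic targets are candidates


def collect_candidates(calls):
    """One pass over the call instructions, keeping only inlinable ones."""
    return [call for call in calls if call.monomorphic]


def inline_greedy(calls, cost_limit):
    """One pass over the candidates, inlining until the cost limit is hit."""
    inlined = []
    cost = 0
    for call in collect_candidates(calls):
        if cost + call.callee_size > cost_limit:
            break         # assumption: stop here; later candidates are never tried
        cost += call.callee_size
        inlined.append(call.callee)
    return inlined
```

With a limit of 100 and candidates of size 80, 30, and 10 in that order, only the first is inlined. That is the "inlining A means you never get to B" failure mode in miniature: the order of the candidate list matters as much as the limit.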
This approach gets a surprising amount of utility for being so simple: it inlines constants (quite a few methods look like ), small methods, and (at least, as far as I can remember) shrinks the compiled code size. All for very little compile time overhead. There’s one other “standalone” Python JIT out there, PyPy. So we should look at that too. There are two inliners in PyPy. One is inside the RPython to C translation pipeline, which acts more like an ahead-of-time compiler 2 . Then there is the tracing JIT bit, which has its own optimizer and heuristics. We’re going to look at the latter. I talked to CF Bolz-Tereick about the inliner and their comment was that PyPy’s inlining heuristic is “yes”. There are a couple of exceptions, such as not inlining recursive functions or functions with loops. But the basic idea of tracing includes tracing through call instructions, which naturally means that you are “inlining”. PyPy also does this neat thing where they treat frame pushes like normal allocation. Frame pushes, frame reads, and frame writes get written to the trace like normal object memory traffic and can get optimized away like other field reads and writes. This means that they can “just” use DCE to eliminate frame pushes and pops, whereas Cinder has some complicated mechanism to do it (which is my fault). TODO get more details here V8 is a JS engine and it has over the years had many execution approaches. We’ll look at three of them since they all have or had their place in the history: They also each inline at different times in the pipeline, which made for a fun time trying to understand the different codebases. 
Inlining happens during Hydrogen graph building Don’t store function bytecode of all functions; need to re-parse callee text source to inline Heuristics https://github.com/tekknolagi/v8/blob/a969ab67f8e1e7475d9b26468225c3a772890c64/src/crankshaft/hydrogen.cc#L7807 https://docs.google.com/document/d/1VoYBhpDhJC4VlqMXCKvae-8IGuheBGxy32EOgC2LnT8/edit https://github.com/v8/v8/blob/036842f4841326130a40adfcff38f85a9b4cd30a/src/compiler/js-inlining-heuristic.h#L14 When optimizing, add call instructions to the inline candidates list: https://github.com/v8/v8/blob/1a391f98cc7a9196369f2d6cab7df35ffbe92c08/src/maglev/maglev-graph-optimizer.cc#L1271 https://github.com/v8/v8/blob/036842f4841326130a40adfcff38f85a9b4cd30a/src/maglev/maglev-inlining.h#L36 Unlike for example Cinder, Maglev looks like it does not have a lot of restrictions about what can get inlined into what, so its “can inline” signal is about budget. Actually two budgets: small budget and normal budget. Then its inlining loop is a greedy walk of the to-inline queue checking candidate sizes. It runs this loop (which drains the queue) interleaved with the optimizer (which populates the queue). Confusingly, though, the optimizer also calls another function called which checks if it legally can inline: appears unused? / dead declaration? maybe src/maglev/maglev-graph-builder.cc is just not working on github search also unused / dead declaration same JavaScriptCore is funky! Unlike these other compilers that do inlining in their neat little SSA IRs, JSC inlines at the bytecode level 4 . This is their way of making sure that they get at least one level of call context into their interpreter inline caches, which will eventually give better information to the compiler. JSC only inlines based on bytecode profile information, and only inlines bytecode?? 
TODO find better sources for bytecode inlining SpiderMonkey has another way of getting that call context without doing bytecode inlining: they add call context to their inline caches. Methods can pass down an ICScript to their callees where the callee writes its inline cache information. Then, when compiling, the callee is more likely to be monomorphized. https://github.com/mozilla-firefox/firefox/blob/438a3ce10eb77fb50d968463b7741117aec5bb4a/js/src/wasm/WasmHeuristics.h#L213 SpiderMonkey ICScript https://fitzgen.com/2025/11/19/inliner.html Plan: run in interpreter; tier up to C1; profile call targets; inline in C1; profile branch counts; tier up to C2, which copies C1 inlining decisions in bytecode parser https://github.com/openjdk/jdk/blob/a05d5d2514c835f2bfeaf7a8c7df0ac241f0177f/src/hotspot/share/opto/bytecodeInfo.cpp#L116 https://github.com/openjdk/jdk/blob/497dca2549a9829530670576115bf4b8fab386b3/src/hotspot/share/opto/bytecodeInfo.cpp#L197 https://github.com/openjdk/jdk/blob/497dca2549a9829530670576115bf4b8fab386b3/src/hotspot/share/opto/parse.hpp#L42 https://github.com/openjdk/jdk/blob/497dca2549a9829530670576115bf4b8fab386b3/src/hotspot/share/opto/doCall.cpp#L185 Not too small Walk up the call stack to figure out what to compile Handling the right thing to inline: def foo(a) = a.each {|x| x } want to compile , inline each, inline block, not compile block separately (probably) https://bernsteinbear.com/assets/img/design-hotspot-client-compiler.pdf https://github.com/openjdk/jdk/blob/d854a04231a437a6af36ae65780961f40f336343/src/hotspot/share/c1/c1_GraphBuilder.cpp#L755 https://github.com/openjdk/jdk/blob/d854a04231a437a6af36ae65780961f40f336343/src/hotspot/share/c1/c1_GraphBuilder.cpp#L3854 heuristics: TruffleRuby uses weighted compile queue Graal https://ieeexplore.ieee.org/document/8661171 https://github.com/dotnet/runtime/blob/2d638dc1179164a08d9387cbe6354fe2b7e4d823/docs/design/coreclr/jit/inlining-plans.md 
https://github.com/dotnet/runtime/blob/0b3f3ab1ecf4de06459e5f0e2b7cb3baf70ef981/src/coreclr/jit/inline.def#L94 https://github.com/dotnet/runtime/blob/0b3f3ab1ecf4de06459e5f0e2b7cb3baf70ef981/src/coreclr/jit/inlinepolicy.cpp https://github.com/dotnet/runtime/blob/0b3f3ab1ecf4de06459e5f0e2b7cb3baf70ef981/docs/design/coreclr/jit/inline-size-estimates.md?plain=1#L5 https://github.com/dotnet/runtime/blob/0b3f3ab1ecf4de06459e5f0e2b7cb3baf70ef981/src/coreclr/jit/fginline.cpp https://github.com/dotnet/runtime/issues/10303 https://github.com/AndyAyersMS/PerformanceExplorer/blob/master/notes/notes-aug-2016.md https://github.com/dart-lang/sdk/blob/391212f3da8cc0790fc532d367549042216bd5ca/runtime/vm/compiler/backend/inliner.cc#L49 https://github.com/dart-lang/sdk/blob/391212f3da8cc0790fc532d367549042216bd5ca/runtime/vm/compiler/backend/inliner.cc#L1023 https://web.archive.org/web/20170830093403id_/https://link.springer.com/content/pdf/10.1007/978-3-540-78791-4_5.pdf An adaptive strategy for inline substitution (PDF) tracelet based https://github.com/facebook/hhvm/blob/eeba7ad1ffa372a9b8cc9d1ec7f5295d45627009/hphp/runtime/vm/jit/inlining-decider.h#L89 https://github.com/LineageOS/android_art/blob/8ce603e0c68899bdfbc9cd4c50dcc65bbf777982/compiler/optimizing/inliner.h https://github.com/JikesRVM/JikesRVM/blob/5072f19761115d987b6ee162f49a03522d36c697/rvm/src/org/jikesrvm/compilers/opt/inlining/DefaultInlineOracle.java#L55 Partial inlining Understanding and Exploiting Optimal Function Inlining (PDF) machine learning Automatic construction of inlining heuristics using machine learning Machine-Learning-Based Optimization Heuristics in Dynamic Compilers (PDF) Guiding Inlining Decisions Using Post-Inlining Transformations (PDF) U Can’t Inline This! 
(PDF) Towards better inlining decisions using inlining trials RhizomeRuby inlining An Optimization-Driven Incremental Inline Substitution Algorithm for Just-in-Time Compilers (PDF) Automatic Tuning of Inlining Heuristics (PDF) Inlining-Benefit Prediction with Interprocedural Partial Escape Analysis (PDF) Inlining of Virtual Methods (PDF) A Study of Type Analysis for Speculative Method Inlining in a JIT Environment (PDF) A Comparative Study of Static and Profile-Based Heuristics for Inlining (PDF) clusters from Custom benefit-driven inliner in Falcon JIT (PDF) https://github.com/oracle/graal/blob/5dde777cba22a99ebe3f19745d03ddfbc35c563c/compiler/src/jdk.graal.compiler/src/jdk/graal/compiler/phases/common/inlining/policy/GreedyInliningPolicy.java https://github.com/oracle/graal/blob/5dde777cba22a99ebe3f19745d03ddfbc35c563c/compiler/src/jdk.graal.compiler/src/jdk/graal/compiler/phases/common/inlining/InliningPhase.java https://github.com/oracle/graal/blob/5dde777cba22a99ebe3f19745d03ddfbc35c563c/compiler/src/jdk.graal.compiler/src/jdk/graal/compiler/phases/common/inlining/info/elem/InlineableGraph.java#L148 There are some newer papers, especially in Java land, that try to do a lot of analysis ahead-of-time and bundle the resulting information in .class files. Then the JIT can read it and see more than local context. Or, if you are an AOT compiler, you can probably do a lot more whole system reasoning—both for time budget reasons and also because you can see more functions at once.  ↩ Check it out if you like. I stumbled across it by accident.  ↩ See also “Turbolev”, which seems to merge Maglev (CFG) with Turbofan (Sea of Nodes)… somehow.  ↩ Potentially a misunderstanding based on a private conversation. 
I’m working on tracking down the implementation…  ↩

Profiles of call target
Cumulative caller size (increasing as callees get inlined)
Callee size
Inline depth
Number of inlined calls at a certain depth
If recursion is present
Callee/caller call count ratio (if callee only called less than K% of calls to caller, don’t inline callee)
Callee stack usage
Polymorphism in callee
What mode the compiler is in (baseline vs more aggressive)
If the callee looks like it always raises/throws

Train neural networks to make inlining decisions
Let inlining drive the entire optimization pipeline, treating it as a search heuristic over a BFS walk of the call graph
Use AOT-gathered information to aid in JIT heuristics

Hydrogen was the first real SSA IR and it looks very familiar to me, having worked on Cinder and now ZJIT. It is now defunct.
Turbofan was the replacement, going full Sea of Nodes. In the grand scheme of things it is a pretty fast compiler, but it does not hold back from doing some expensive rewrites. This was recently rewritten from Sea of Nodes to a more traditional CFG and nicknamed Turboshaft.
Maglev is meant to coexist alongside Turbofan, preferring to speculate a little more eagerly and do fewer incremental rewrites in the name of compile time. 
3 https://github.com/tekknolagi/v8/blob/a969ab67f8e1e7475d9b26468225c3a772890c64/src/crankshaft/hydrogen.cc#L9236 something about native context check callee AST size against configurable limit check inlining depth against configurable limit don’t inline recursive functions check current cumulative method size (as tracked by AST node count) against configurable limit Find candidates https://github.com/v8/v8/blob/036842f4841326130a40adfcff38f85a9b4cd30a/src/compiler/js-inlining-heuristic.cc#L134 Can inline https://github.com/v8/v8/blob/036842f4841326130a40adfcff38f85a9b4cd30a/src/compiler/js-inlining-heuristic.cc#L75 Force inline small functions https://github.com/v8/v8/blob/036842f4841326130a40adfcff38f85a9b4cd30a/src/compiler/js-inlining-heuristic.cc#L309 Loop over sorted (by comparator) list https://github.com/v8/v8/blob/036842f4841326130a40adfcff38f85a9b4cd30a/src/compiler/js-inlining-heuristic.cc#L847 skip recursion https://github.com/v8/v8/blob/1a391f98cc7a9196369f2d6cab7df35ffbe92c08/src/objects/shared-function-info-inl.h#L421 not called enough (min call frequency) bytecode too big Bytecode inlining https://github.com/WebKit/WebKit/blob/709c3895afd71e0836f8c8be7393e44d41fab7e1/Source/JavaScriptCore/bytecode/CodeBlock.cpp#L2453 DFG https://github.com/WebKit/WebKit/blob/709c3895afd71e0836f8c8be7393e44d41fab7e1/Source/JavaScriptCore/dfg/DFGCapabilities.cpp#L76 https://github.com/WebKit/WebKit/blob/917854a9c245b87b333e23ed4b195505d574a333/Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp#L1703 https://github.com/WebKit/WebKit/blob/917854a9c245b87b333e23ed4b195505d574a333/Source/JavaScriptCore/bytecode/CallLinkStatus.cpp#L294 https://github.com/WebKit/WebKit/blob/d919344236c47b610930636d3310f00380624d43/Source/JavaScriptCore/bytecode/InlineCallFrame.h skip callees with exception handlers (unless explicitly allowed with a CLI flag) skip synchronized callees (unless explicitly allowed with a CLI flag) skip classes with unlinked callees skip uninitialized classes max 
inline level (default 9) max recursive inline level (default 1) callee bytecode size (max for top level is 35 bytecodes, but falls off by 10% per inline level) callee stack usage (max of 10 slots) max total method size (default 8000 bytecodes)
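To get a feel for how the numeric limits in that last list interact, here is a small sketch that treats them as independent checks. The constants are the defaults quoted in the notes. Everything else is my assumption: the function names, the convention that the top-level method is level 0, the use of strict `>` comparisons, and reading "falls off by 10% per inline level" as compounding (multiply by 0.9 per level).

```python
MAX_INLINE_LEVEL = 9             # max inline level
MAX_RECURSIVE_INLINE_LEVEL = 1   # max recursive inline level
MAX_CALLEE_SIZE = 35             # bytecodes allowed at the top level
FALLOFF_PER_LEVEL = 0.9          # assumption: the 10% falloff compounds
MAX_STACK_SLOTS = 10             # callee stack usage
MAX_TOTAL_SIZE = 8000            # max total method size, in bytecodes


def callee_size_limit(level):
    """Size budget for a callee inlined at `level` (0 = top-level method)."""
    return MAX_CALLEE_SIZE * FALLOFF_PER_LEVEL ** level


def should_inline(callee_size, callee_stack, level, recursive_level, total_size):
    """Apply each limit in turn; any single failure rejects the inline."""
    if level > MAX_INLINE_LEVEL:
        return False
    if recursive_level > MAX_RECURSIVE_INLINE_LEVEL:
        return False
    if callee_size > callee_size_limit(level):
        return False
    if callee_stack > MAX_STACK_SLOTS:
        return False
    if total_size + callee_size > MAX_TOTAL_SIZE:
        return False
    return True
```

Under these assumptions a 35-bytecode callee is accepted at the top level but rejected one level down, where the budget has shrunk to about 31.5 bytecodes. By level 9 the budget is under 14 bytecodes, so deep inline chains end up limited to very small methods.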
