Posts in Go (20 found)
Stratechery Yesterday

The Deployment Company, Back to the 70s, Apple and Intel

Listen to this post: Good morning, President Trump is on the way to China, and Sharp China is your go-to podcast for understanding what happens next. Add it to your podcast player now in anticipation of the next few episodes breaking down the trip. On to the Update: From Reuters : OpenAI said on Monday it is setting up a new company with more than $4 billion in initial investment to help organizations build and deploy artificial intelligence systems, and will acquire an AI consulting firm, Tomoro, to quickly scale up the unit. After its early models saw strong resonance with consumers, OpenAI has been working aggressively to sign corporate contracts and establish a large presence in the business world where its AI will see large-scale deployment. The venture, which will be majority owned and controlled by OpenAI, also comes as rival Anthropic enjoys strong success in its enterprise AI push with its Claude family of models seeing rapid adoption among businesses. The new firm, called OpenAI Deployment Company, will help the ChatGPT maker embed engineers specializing in frontier AI deployment into organizations that will then work closely with various teams to identify where AI can make the biggest impact, OpenAI said. Its acquisition of Tomoro, a consulting firm that helps enterprises deploy AI, will bring around 150 experienced AI engineers and “deployment specialists” to the new unit from day one. Tomoro was formed in 2023 in alliance with OpenAI, and counts companies such as Mattel, Red Bull, Tesco and Virgin Atlantic as its clients, according to its website. That was on Monday; on Tuesday, from The Information : Google plans to hire hundreds of engineers to help customers start using its business-focused AI products, according to a person familiar with the situation. Google’s new “forward deployed engineers” will form a new team within Google Cloud, the unit’s chief, Thomas Kurian, said on LinkedIn on Tuesday, without disclosing the size of the effort. 
Matt Renner, Google Cloud’s chief revenue officer, said in a separate post that the move would help Google “show up for our customers with more technical resources (vs just an ocean of salespeople).” The announcement is one of several in the industry in recent weeks as tech companies are deploying armies of humans—often described as “forward deployed engineers”—and partnerships with consulting companies to get customers using AI-driven technology intended to automate work. On Monday, OpenAI launched the “OpenAI Deployment Company” in partnership with consulting and investment firms. Last week, Anthropic announced the creation of a joint venture with private equity firms to sell its AI to the PE firms’ customers. It is, needless to say, tempting to drop some snark about AGI apparently not being good enough to deploy AI, but instead I’m going to go with “as predicted”. In 2024’s Enterprise Philosophy and the First Wave of AI , I made the case that the proper analogy for AI in the enterprise was not SaaS, but rather the first wave of computing in the 1970s. Agents aren’t copilots; they are replacements. They do work in place of humans — think call centers and the like, to start — and they have all of the advantages of software: always available, and scalable up-and-down with demand…Benioff isn’t talking about making employees more productive, but rather companies; the verb that applies to employees is “augmented”, which sounds much nicer than “replaced”; the ultimate goal is stated as well: business results. That right there is tech’s third philosophy: improving the bottom line for large enterprises. Notice how well this framing applies to the mainframe wave of computing: accounting and ERP software made companies more productive and drove positive business results; the employees that were “augmented” were managers who got far more accurate reports much more quickly, while the employees who used to do that work were replaced. 
Critically, the decision about whether or not to make this change did not depend on rank-and-file employees changing how they worked, but for executives to decide to take the plunge. Specifically, I don’t think that the Deployment Company is going in to help employees use chatbots; that’s even more clearly the case with the PE firms that both OpenAI and Anthropic are doing deals with. I expect there to be an ever-increasing number of deals where PE buys software firms with reliable cash flows and conducts significant layoffs, forcing AI to pick up the slack, solving stock-based compensation issues in the process. I don’t know if the mandate for the Deployment Company is going to be quite so harsh, but I assume this is a company that is hired by the executive suite to fundamentally rethink business processes in a way that hasn’t been done since the mainframe: Most historically-driven AI analogies usually come from the Internet, and understandably so: that was both an epochal change and also much fresher in our collective memories. My core contention here, however, is that AI truly is a new way of computing, and that means the better analogies are to computing itself. Transformers are the transistor, and mainframes are today’s models. The GUI is, arguably, still TBD. To the extent that is right, then, the biggest opportunity is in top-down enterprise implementations. The enterprise philosophy is older than the two consumer philosophies I wrote about previously: its motivation is not the user, but the buyer, who wants to increase revenue and cut costs, and will be brutally rational about how to achieve that (including running expected value calculations on agents making mistakes). That will be the only way to justify the compute necessary to scale out agentic capabilities, and to do the years of work necessary to get data in a state where humans can be replaced. The bottom line benefits — the essence of enterprise philosophy — will compel just that. 
What I wonder is how much of the work ends up reworking data; that, as I noted in that article, is why I was bullish on Palantir: That leaves the data piece, and while Benioff bragged about all of the data that Salesforce had, it doesn’t have everything, and what it does have is scattered across the phalanx of applications and storage layers that make up the Salesforce Platform. Indeed, Microsoft faces the same problem: while their Copilot vision includes APIs for 3rd-party “agents” — in this case, data from other companies — the reality is that an effective Agent — i.e. a worker replacement — needs access to everything in a way that it can reason over. The ability of large language models to handle unstructured data is revolutionary, but the fact remains that better data still results in better output; explicit step-by-step reasoning data, for example, is a big part of how o1 works. To that end, the company I am most intrigued by, for what I think will be the first wave of AI, is Palantir… That integration looks like this illustration from the company’s webpage for Foundry, what they call “The Ontology-Powered Operating System for the Modern Enterprise”: What is notable about this illustration is just how deeply Palantir needs to get into an enterprise’s operations to achieve its goals. This isn’t a consumery-SaaS application that your team leader puts on their credit card; it is SOFTWARE of the sort that Salesforce sought to move beyond. Google’s Kurian, by the way, did dismiss any sort of Palantir comparison in a Stratechery Interview last month: This all makes perfect sense, particularly this bit about the Knowledge Catalog definitely fits how I’ve been thinking. I wrote about this a few years ago about this importance of this whole layer and understanding it, it’s a bit of a big lift to get this in place. You have some sort of analog, say, with like a Palantir that’s putting in like their ontology thing. 
They have FDEs out on the site, multi-month projects doing this. You have OpenAI talking about Frontier, their agent layer, and they’re partnering with all the tech consultancies to build this out. Is this going to entail a lot of boots on the ground to get this graph working and functional in a way that your agents can operate effectively across it? TK: We’re not competing with Palantir, we’re not building a semantic dictionary or an ontology. What we’re doing is, today I’ll give you the closest analogy. TK: Today when you use a model, let’s say you use Gemini, and you ask a question, Gemini goes through reasoning, and then it shows you a citation. A citation is, “How did I answer the question and what’s the source I derived from?” Now imagine that citation was a query that needed to go to a folder in, for example, a storage system because there’s some documents there and a database because, for example, in a part number, just think about there’s a part number document that lists all the part numbers and sits in a drive and then that part number you need to fetch out to say it’s the modem that the guy is coming to repair, and that’s mapped to a table in a database. 
So what the graph does, we use Gemini, so we don’t need humans, we use Gemini to say, “Hey, go and read all these documents in these drives and extract the information from it and then match that to the database table that has the reference to the part number”, and so then when Gemini turns around and says, “I got this query about how much inventory of modems they are”, the first thing it does is it says, “Okay, go to the Knowledge Catalog and it says modem is part number one, two, three, four, five”, and then it says, “By the way the table in the database that has the inventory information about this part number is this table, here’s a SQL”, it then makes the quality of what we generate higher and then when it answers the question it shows back — back to your, “Trust my data”, it shows a grounding citation saying, “That’s where we got it from.” Well, so much for not needing humans! I joke, mostly — Kurian was referring to not needing a Palantir-like ontology, not necessarily dismissing the need for FDEs — but it sure is interesting how AI is creating the need for new kinds of jobs. It’s almost as if the world is more dynamic, and pure intelligence, unadulterated by what already exists and the burden of reflexivity, is more static, than the most pessimistic prognosticators may have anticipated. More prosaically, OpenAI and Anthropic need the revenue, enterprises need the imagination, and Google needs to stay in the game. From the Wall Street Journal : Apple and Intel have reached a preliminary agreement for Intel to manufacture some of the chips that power Apple devices, according to people familiar with the matter. Intensive talks between the two companies have been ongoing for more than a year, and they hammered out a formal deal in recent months, these people said. Bloomberg News previously reported the talks. It’s still unclear which Apple products Intel would make chips for, these people said. 
Apple ships more than 200 million iPhones a year as well as millions of iPads and Mac computers. Ming-Chi Kuo reported on X late last year that Intel would make Apple’s most basic M processor on its 18A process; he didn’t specify which generation. Regardless, while the Wall Street Journal cites Trump administration pressure, and an earlier Bloomberg article Apple’s concentration risk on TSMC and Taiwan, the most obvious reason for a deal — assuming it exists — is economic. Specifically, Apple has for two quarters running said it can’t satisfy demand because it can’t get enough capacity at TSMC. CEO Tim Cook referenced this point multiple times on the last earnings call , but I think this was the most important articulation: The constraint in the March quarter and the June quarter, the primary constraint is the availability of the advanced nodes our SoCs are produced on, not memory. And so I don’t want to predict for supply and demand to match because if I look at it realistically, I think on the Mac mini and the Mac Studio, I believe it will take several months to reach supply-demand balance. And so we’re not at the point where we’re saying this is going to end anytime soon. And it’s not because of a problem per se other than we just undercalled the demand. And there are lead times to this, as you well understand, and it takes a while to correct that. And the primary constraint from a product point of view, or the majority of it for this quarter, for the June quarter will be on the Mac. And it’s Mac mini, Mac Studio and the MacBook Neo. It’s all of those. Cook talked about lead times last quarter as well, and the important thing to note is that while it does take five months or so to make new chips, assuming Apple realized it needed more iPhone 17 Pro chips right away, those new A19 Pro lines only started producing chips partway through last quarter (which is why iPhone 17 Pro sales weren’t as high as they could be). 
Critically, however, what seems likely is that Apple took capacity away from the Mac to make more iPhone chips, and now doesn’t have enough chips for the Mini and Studio either. The long-and-short of it is this: Apple doesn’t have flexible access to TSMC capacity anymore, because so much of that capacity is going to AI in particular, and it’s costing Apple meaningful money across multiple product lines. This was always the thing that would bring companies to Intel; I wrote in TSMC Risk : Becoming a meaningful customer of Samsung or Intel is very risky: it takes years to get a chip working on a new process, which hardly seems worth it if that process might not be as good, and if the company offering the process definitely isn’t as customer service-centric as TSMC. I understand why everyone sticks with TSMC. The reality that hyperscalers and fabless chip companies need to wake up to, however, is that avoiding the risk of working with someone other than TSMC incurs new risks that are both harder to see and also much more substantial. Except again, we can see the harms already: foregone revenue today as demand outstrips supply. Today’s shortages, however, may prove to be peanuts: if AI has the potential these companies claim it does, future foregone revenue at the end of the decade is going to cost exponentially more — surely a lot more than whatever expense is necessary to make Samsung and/or Intel into viable competitors for TSMC. This, incidentally, is how the geographic risk issue will be fixed, if it ever is. It’s hard to get companies to pay for insurance for geopolitical risks that may never materialize. What is much more likely is that TSMC’s customers realize that their biggest risk isn’t that TSMC gets blown up by China, but that TSMC’s monopoly and reasonable reluctance to risk a rate of investment that matches the rest of the industry means that the rest of the industry fails to fully capture the value of AI. We’re already here (reportedly). 
TSMC’s failure to invest aggressively enough over the last several years will, in the end, give Intel the single most important thing it needs to become a viable competitor: the customer who did more than any other to make TSMC into the leader in the first place.

Anton Zhiyanov 1 weeks ago

Solod v0.1: Go ergonomics, practical stdlib, native C interop

Solod ( So ) is a system-level language with Go syntax and zero runtime. It's designed for two main audiences:

• Go developers who want low-level control and zero-cost C interop, without having to learn a new language or standard library.
• C developers who like Go's style.

The initial version (let's call it v0) was focused on picking a subset of Go and translating it to C. The next logical step was to port Go's standard library and make it easier to interop with C. That's what the v0.1 release I'm presenting today is all about.

Standard library • SQLite bindings • Persistent map • Store and retrieve • Command-line interface • Performance • Wrapping up

Solod v0.1 ships with the following stdlib packages ported from Go:

• , , and — Abstractions and types for general-purpose I/O.
• , , , and — Common byte and text operations.
• and — Generic heap-allocated data structures.
• and — Generating random data.
• , , and — Working with the command line and files.
• — Structured logging.
• — Measuring and displaying time.

And a couple of its own packages:

• — Memory allocation with a pluggable allocator interface.
• — Low-level C interop helpers.

Stdlib documentation

In the following sections, I'll demonstrate some of the v0.1 features using a simple example: a persistent key-value store backed by SQLite.

Since So doesn't provide yet, we'll call SQLite directly through its C API. To do this, let's import the necessary headers with the directive and generate extern declarations using the sobind tool:

The directive is required for constants ( ) and types ( ). As for functions ( ), we can just declare them without a body — the transpiler will treat them as extern declarations even without .

With the SQLite API in place, let's implement a key-value type that wraps the database connection:

Add a constructor that connects to an SQLite database and creates a table to store the items:

As you can see, this So code looks a lot like regular Go code. However, there are some key differences:

• When compiled, the code is first translated to plain C, then compiled into a native binary using GCC or Clang.
• Unlike Go, there is no runtime (no automatic heap memory allocation, no garbage collection, no goroutine scheduler).
• There is no overhead when calling C functions, unlike Go's Cgo.
• The interop syntax is a bit cleaner. For example, Go's ( in the call) automatically decays to C's .

First, let's implement the method:

No surprises here, just a bunch of SQLite API calls. The method is more interesting:

The pointer returned by is managed by SQLite. It becomes invalid after calling (which does before returning). Because of this, we need to allocate a copy of the returned value, using in this case. So's approach to memory allocation is similar to Zig's — all heap allocations must be done explicitly by providing a specific instance of the interface.

The caller, of course, must free the allocated string:

Here, is a specific allocator that uses libc's and . Alternatively, we could use or any other implementation of the interface:

With the type in place, let's create a simple CLI using the package:

Then add command routing:

Again, no surprises here — the package works just as it does in Go.

Solod isn't trying to outperform hand-tuned C. Still, performance matters: the code is benchmarked and optimized to run reasonably fast. Since So compiles to plain C and then to native code with full optimizations, the results are sometimes better than Go's. Here are some highlights from the benchmarks:

• Buffered I/O is 3x faster than Go.
• String and byte operations are up to 2.5x faster.
• Maps are 1.5x faster for modifications.
• Integer formatting is 2x faster.

There're no GC pauses and no Cgo bridge cost when calling C libraries. The tradeoff is that you have to handle memory yourself, but as the SQLite example above shows, So's allocator interface makes that pretty manageable.

Solod vs. Go benchmarks

Solod is still in its early days, but with the v0.1 release, it's ready for hobby projects. The already-ported parts of the Go standard library make it easy to write command-line tools (check out the , , , and examples ). Plus, with native C interop, you can build just about anything else you need. The next release (v0.2) will likely focus on networking, concurrency, or both — along with more stdlib packages. If you're interested, take a look at So's readme — it has all the information you need to get started. Or try So online without installing anything.

Manuel Moreale 1 weeks ago

Hyde Stevenson

This week on the People and Blogs series we have an interview with Hyde Stevenson, whose blog can be found at lazybea.rs . Tired of RSS? Read this in your browser or sign up for the newsletter . People and Blogs is supported by the "One a Month" club members. If you enjoy P&B, consider becoming one for as little as 1 dollar a month. Hyde Stevenson is a nickname I've been using online for years. It's a mix from Dr Jekyll and Mr Hyde, and its author Robert Louis Stevenson. Privacy is important to me, so I generally avoid using my real name. My parents are from Serbia, but I was born in Paris. I lived in London, and, now, I live in southern Europe. More vitamin D was needed in my life. I had two passions as a kid: sport, and computers. Sport has always been a big part of my life. When I was a kid, all my friends played football, but I was always more into basketball. I don't mind watching a good football game, but that's where it ends. But, basketball is another thing. I'm a big Nikola Jokic fan, and I haven't missed a Denver game for the last four years. When we were kids, we all dreamt about the NBA. There weren't many games available to watch. We had one guy who ordered games on tape direct from the US. Then, we shared, and copied them. Basketball was our life. We played at school, after the school, the weekends. We were chasing the best playgrounds to compete with other players. It was great. It was the end of the 80s. Bird, Magic, Jordan, the Pistons Bad Boys, and also Yugoslavian players like Vlade Divac and Dražen Petrovic. The Dream Team too, the real one. I'll always wonder what might've happened if the war in the Balkans hadn't happened and the USA and Yugoslavia had played each other in the Olympics final. That love for the game made me play at a semi-pro level. But, a bad coach put me off the courts. I was young and didn't understand why I couldn't play more when I knew I had the level. 
I remember one shooting training where I got 46/50 on 3pts, and the guy behind me got 36/50. Did the coach say something to me? Nope. That was enough, and I took a break from the game for a few years to pursue another passion: boxing. My love of boxing probably stems from those nights when my father would wake me up at 4am to watch Mike Tyson's fights. I've always loved boxing. My father's mate's nephew was a boxer. He invited me to train at his gym. And I got hooked. Sad story about this young man. He went pro, but after a bar fight, I heard he was murdered out for revenge by someone involved in that brawl. I also had a great group of friends, and we trained grappling, and MMA for four or five years. A good friend trained us grappling. Today, he trains fighters who fought in the UFC, and got lucky to meet many MMA fighters like Jon Jones . Another one, Guillaume Kerner trained us Thai boxing. Guillaume was one of the first western European Thai boxer who won a World Title in Thailand. You can check some highlights of his career . That was before I moved to London. When I got back in France, I was training exclusively in boxing until 2021, when I moved abroad. Since I relocated, I've really missed the camaraderie of the boxing club. I'm lucky enough to have a garage where I've hung a punching bag and can keep training. For those interested, I started last year a #50kPushUps challenge . The goal is to make 50,000 push-ups in one year. I could write many anecdotes about people I met, but I want also to share my other passion: computers. When I meet people, the first thing they say to me is that I don't look like a computer guy. Stereotypes... 🤷 My passion probably started when one night my father brought home the VCS, the Video Computer System, later renamed the Atari 2600. It's not a computer, but that's where it all started. Later, I asked if I could have a computer, and they offered me the Amstrad CPC464 with its 64Kb RAM, and cassette deck. 
Later, my grandmother offered me the updated version the CPC6128 with the same RAM, but with a 3-inch floppy disk. After that I had many other ones. I started to build them. I tried my first Linux distro in 1995. It was a Debian. Today, my main distribution is still Debian, even if I tried, and used many others. I've tried probably many window managers over the years. But, for the last 15 years more or less, I've been using only awesomewm , a tiling window manager, light, and customizable if you know Lua a bit. I could write a lot about Linux, but I don't think it'd be of much interest to our readers. What I can say is that my love for computers is what got me to where I am today in my career. My first blog was about Debian, the GNU/Linux distribution. It was in 2001, and it was called debianworld.org. I used to write how-tos, and articles about Linux. I used the blog to post English to French translation of the Debian Weekly News, but also the Securing Debian Manual , and some part of the Advanced Bash scripting guide . Then in 2014, after a long summer, I found out I got cyber squatted. And, just like this it was gone. Then, for five years, I didn't set up anything online until 2019. I met a colleague that asked me if I participated in any conferences, or if I had a blog. That's when I wanted to have a personal place online again. I love bears, that's why I chose that domain name. And, lazy, because I am sometimes. About the theme, it took me some time to create it, and be happy with the final result. But, then, it didn't really change. It depends. First, I need a topic, or an idea. Sometimes a blog post, a news, a new tool, or basically anything can inspire me to write directly a post. But, often, I like to go through my Zettelkasten. Every morning, I use this keybinding -0. That opens a random note. If it doesn't sparkle anything, I hit the same keys again. A "new" note appears, and, sometimes, a discussion starts. 
I will add more content, or argue with previous thoughts. That's how some drafts start. English not being my mother tongue, I read the different parts multiple times to be sure they make sense. My goal is to make simple sentences, but ones that connect with everyone. Once done, I check if some grammar hasn't been forgotten by my LSP. Then, a script will sync the content to my blog, and post it also on Mastodon.

I don't. I just need my laptop, a terminal, and a coffee. That's all. Maybe the physical space could help some people. Maybe if I had a seaside view, it could impact my creativity 😅.

Previously, for other projects, I used Drupal, then WordPress. But, for this one, I wanted something easy to maintain. No database or plugin updates. Something simple. That's why I went for an SSG, a Static Site Generator. I chose Hugo , and I've been happy with it for years. There is some JavaScript from Carl Schwan's post to add Mastodon comments on the blog. So far it works well. Everything is hosted on a dedicated server. All posts have been written in Neovim, my go-to editor, on a Tuxedo laptop. My local repository has a backup on a Synology DS1812+ NAS, which also has a remote backup. That repository is pushed to a private Codeberg repository too. The domain name was purchased at Unlimited.rs , a registrar in Serbia. Originally, the name of the blog was lazybear.io, but after the announcement that it will disappear in the future, I switched to a Serbian one. For other projects, I also use Porkbun, which I love.

I don't think so. A few of my friends suggested that I should specialize and monetize it, but that was never its goal. It's my little corner on the web where I can do whatever I want. I can tweak it as I want, try new things, post photos the way I want, without having to follow a specific format. It was always meant to be my place to experiment. I don't track visitors, I don't care about numbers.
Now, and then, I get some emails, and I like the discussions I get there. Keep them coming 🙌 The domain name is around €24 per year. The dedicated server around €30 per month, but I use it for other things too. It doesn't generate any money. I could add a Ko-fi account, and maybe I will... just in case. 😇 If people want to monetize it, I don't see any issue with that. Everyone is free to do whatever they want. Ok, I have a couple of them! And, two French photographers: I also have a list of blogs I enjoy, and follow . Yeah start a blog, value your privacy, and send an email to Manuel so we can find more about you. Now that you're done reading the interview, go check the blog and subscribe to the RSS feed . If you're looking for more content, go read one of the previous 139 interviews .

qouteall notes 1 weeks ago

Rust Async Traps

In Rust, if you call an async function, it returns a future. But the future is just data by default. If you don't await it or spawn it, its async code won't run. The word "future" has a very different meaning in Java. In Java, when you obtain a , the task should already be running.

An async runtime schedules async tasks on threads. When an async task suspends, the thread can run other async tasks. But this requires the async task to suspend cooperatively ( ). An async task can keep running without for a long time, and the async runtime cannot force-suspend it. A scheduler thread is then kept occupied. This is called blocking the scheduler thread . When a scheduler thread is blocked, overall concurrency and performance drop, and it may cause deadlock.

Normal sleep and normal locking block the thread using OS functionality. When a thread is blocked by the OS, the async runtime doesn't know about it. In Tokio, use for mutex and and sleep. They pause cooperatively and avoid that issue. That issue is not limited to locking and sleep. It also involves networking and all kinds of IO. So Tokio provides its own set of IO functionality, and you have to use it to get maximum performance with Tokio. Also, heavy computation without a point is also blocking. The async runtime cannot force-suspend heavy computation if it doesn't cooperatively .

Tokio also supports an "escape hatch". A task spawned by runs in another thread pool and won't block the normal scheduler threads. Code that does non-async blocking or heavy compute work should be run in .

See also: How to deadlock Tokio application in Rust with just a single mutex , Why do I get a deadlock when using Tokio with a std::sync::Mutex?

In Rust, a future can be dropped. When it's dropped, its async code stops executing at an await point. This is called cancellation. It's an implicit exit mechanism. Its control flow is not obvious in code. Note that it cancels the future, not the IO.
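The two behaviours described so far, that a future does nothing until it is polled and that dropping it stops the code at an await point, can be observed without any runtime. The following is a minimal sketch using only the standard library. `NoopWake`, `YieldOnce`, `Guard`, `task` and `run_demo` are hypothetical names made up for this illustration, and polling by hand stands in for what an async runtime would normally do.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A waker that does nothing; enough to poll a future by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// A future that is Pending on the first poll and Ready on the second.
struct YieldOnce(bool);
impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

type Log = Rc<RefCell<Vec<&'static str>>>;

// Records in the log when it is dropped, so cleanup on cancellation is visible.
struct Guard(Log);
impl Drop for Guard {
    fn drop(&mut self) {
        self.0.borrow_mut().push("guard dropped");
    }
}

async fn task(log: Log) {
    let _guard = Guard(log.clone());
    log.borrow_mut().push("started");
    YieldOnce(false).await; // suspension point
    log.borrow_mut().push("finished"); // never reached in this demo
}

// Returns log snapshots: after creating, after one poll, after dropping.
fn run_demo() -> [Vec<&'static str>; 3] {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    // Creating the future runs none of its body.
    let mut fut = Box::pin(task(log.clone()));
    let after_create = log.borrow().clone();

    // One poll runs the body up to the first await point.
    let first = fut.as_mut().poll(&mut cx);
    assert!(first.is_pending());
    let after_poll = log.borrow().clone();

    // Dropping the suspended future cancels it: destructors run,
    // but the code after the await point does not.
    drop(fut);
    let after_drop = log.borrow().clone();

    [after_create, after_poll, after_drop]
}

fn main() {
    let [after_create, after_poll, after_drop] = run_demo();
    println!("after create: {after_create:?}"); // []
    println!("after poll:   {after_poll:?}"); // ["started"]
    println!("after drop:   {after_drop:?}"); // ["started", "guard dropped"]
}
```

Nothing in the log marks the cancellation itself; only the explicit `Guard` makes it visible. A drop guard like this is one way to add logging for cancelled futures.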
Cancelling a future just stops the async code from running (and drops related data). Already-completed IO operations won't be cancelled. (The written files won't be magically rolled back. The sent packets won't be magically withdrawn.)

Cancellation is not the only implicit exit mechanism. Panic is another implicit exit mechanism. And in languages that have exceptions (Java, JS, Python, etc.), an exception is another implicit exit mechanism. However, exceptions and panics are often logged, but future cancellation is often not logged . Although panic is implicit control flow, it's often explicit in logs. It's easy to debug because it's visible in the log. But a future cancellation logs nothing by default. Debugging a future cancellation issue is much harder than debugging a panic.

The cancellation "catch": normally when the parent future is cancelled, the inner futures are also cancelled. It propagates from outside to inside. The can stop that propagation. Although is , dropping it won't cancel the spawned task. So if you want to avoid cancellation, wrap it in (and don't call ).

In Golang, there is panic, but there is no implicit cancellation. All cancellation needs to be explicit. (However, managing context cancellation in Golang still has traps, just different from those in async Rust.)

Two examples of cancellation issues: Alan tries to cache requests, which doesn't always happen , Barbara gets burned by select

See also: Dealing with cancel safety in async Rust , Cancelling async Rust

There is another kind of "cancel": doesn't drop the future but does not the future. This is also dangerous. Elaborated below.

Tokio documentation about cancellation safety: 1 , 2

Note again that "cancel" just drops the Rust future (and un-tracks it in the async runtime). It doesn't cancel the IO operation. With epoll, the buffer can be put directly inside the future, with no extra allocation. If the Rust future is dropped, it just doesn't do the IO after being notified.
With io_uring, dropping the future doesn't cancel the kernel's IO process. So putting the buffer into the future is not memory-safe on cancellation with io_uring (the kernel will write into freed memory). There are two solutions (listed in the notes at the end of this post). See also: Notes on io-uring

As previously mentioned, dropping a future cancels it. There is another kind of "cancellation": just not polling the future, without dropping it. It's also dangerous. It may cause deadlock or weird delays.

In select! you can pass ownership of a future, but you can also pass a borrow of a future. When a borrow is passed, one dangerous case can happen. If the select goes into one branch, the futures of the other branches are dropped. If you pass a borrow of a future, the borrow itself is dropped, but the borrowed future is not dropped. However, the borrowed future will not be polled again while the selected branch runs (you can explicitly await it after the select!, but it is not polled before that). This creates a temporarily un-polled future.

This is dangerous when an async lock is involved. After acquiring the lock, the future holds the lock. If the future holding the lock is dropped, it releases the lock. But if the future holds the lock and is neither dropped nor polled, it's likely to deadlock. This is the mechanism behind futurelock.

When using a buffered stream, some futures in the buffer may be temporarily un-polled. This can cause weird delays or deadlock. https://tmandry.gitlab.io/blog/posts/for-await-buffered-streams/ https://without.boats/blog/poll-progress/

Rust currently has no in-place initialization. Heap-allocating a value requires first creating it on the stack and then moving it to the heap. In release mode, this can be optimized into initializing directly on the heap. But in debug mode it still involves creating the value on the stack. Some futures may be very large. Creating a large future on the stack can cause a stack overflow. Sometimes it overflows the stack in debug mode but not in release mode, because in release mode it writes directly to the heap.
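How large a future gets can be checked directly with std::mem::size_of_val. The sketch below is an illustration under stated assumptions, not a benchmark: big, parent_inline, parent_boxed and future_sizes are hypothetical names, and the 4096-byte buffer is an arbitrary size. It compares a parent future that embeds its child inline with one that boxes the child before awaiting it.

```rust
use std::mem::size_of_val;

/// Keeps a 4 KiB array alive across an await, so the array is stored inside the future.
async fn big(seed: u8) -> u8 {
    let buf = [seed; 4096];
    std::future::ready(()).await; // any await point; `buf` is still needed afterwards
    buf[4095]
}

/// Awaits the child inline: the child future is embedded in the parent future.
async fn parent_inline() -> u8 {
    big(1).await
}

/// Boxes the child first: the parent only stores a pointer to the heap allocation.
async fn parent_boxed() -> u8 {
    Box::pin(big(1)).await
}

/// Returns (size of the inline parent, size of the boxed parent) in bytes.
/// The futures are only measured here, never polled.
fn future_sizes() -> (usize, usize) {
    (size_of_val(&parent_inline()), size_of_val(&parent_boxed()))
}
```

The exact numbers depend on the compiler version, so only the relationship matters: the inline parent is at least as large as the buffer, while the boxed parent is far smaller because the buffer lives on the heap.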
On Windows the default stack size is smaller, so a stack overflow is more likely. There is currently some inefficiency in future size. See Async Future Memory Optimisation. Ways to reduce future size are listed in the notes at the end of this post.

It will print

All of them execute on the main thread. There is no parallelism. Parallelism can be enabled by spawning tasks (for example with tokio::spawn). But without spawning there is no parallelism by default. This is different in Golang. In Golang, goroutines are parallel.

Async-sync-async sandwich: an async function calls a sync function that blocks on another async function. The async-to-sync call blocks the scheduler thread. It's very prone to deadlock.

Tokio does multi-threaded work-stealing scheduling. Its purpose is very similar to OS scheduling. And an async task's purpose is very similar to an OS thread's. The duality of the two: as long as the data is owned by a thread, it's data-race free. The correspondence: as long as the data is owned by an async task, it's data-race free.

Tokio requires the future to be Send. This can create some trouble. It requires Send because Tokio does work stealing: an async task on one thread could later be scheduled onto another thread. However, if an async task is analogous to a thread, then ensuring that the data is owned by the async task would also achieve data-race freedom, even if the data is not Send. But Rust doesn't check the "async task boundary". An async task can pass data out. Then the data is no longer owned by the async task. There is no language mechanism that ensures the data stays within an async task. So you still have to satisfy Send even for data that's only used within one async task.

The Send constraint can be avoided with thread-per-core async runtimes.

Using multiple async runtimes together is possible but hard and error-prone. And there are many async-runtime-specific types. So async runtimes are naturally exclusive. That's why Tokio has a monopoly. In Golang you can only use the one official goroutine scheduler.
In Rust, although Tokio has a monopoly, you can choose to use other async runtimes.

This trap is not Rust-specific. A thread pool often has a thread count limit, which limits concurrency. But in async, there is no concurrency limit by default. This is good for a high-performance web server. But it has downsides (see the scraper example in the notes at the end of this post). One solution is to add a semaphore to limit concurrency.

Structured concurrency forces all concurrent tasks to be scoped. The tasks then form a tree-shaped structure. With structured concurrency, tasks can borrow data from the parent. There is no need to make the future 'static. There is no need to wrap things in Arc. The tree shape is free of cycles, so awaiting child tasks alone cannot deadlock (but it can deadlock if other kinds of waits are involved).

But there are cases that structured concurrency cannot handle. One is background tasks. For example, a web server provides a RESTful API that launches a background task. The background task keeps running after the request that launched it finishes.

See also: The bane of my existence: Supporting both async and sync code in Rust; Why async Rust?; Async Rust can be a pleasure to work with (without Send + Sync + 'static); Making Async Rust Reliable - Tyler Mandry; FuturesUnordered and the order of futures

The "fully owned" here means not just ownership in Rust semantics. The Rc has internal data structures. The "fully owned" applies to these internal data structures. One async task fully owning the Rc means the internal data structure (which contains the reference count) is only accessible from one async task.

select!: When one branch is selected, the futures of the other branches are cancelled.
JoinHandle::abort(): Explicitly cancels a task.
tokio::time::timeout: When the timeout is reached but the future hasn't finished, it's cancelled.

In epoll, the OS notifies the app that an IO can be done, then the app makes another system call to do the IO. It involves context switching from kernel to app (receive notification), then to kernel (do the IO syscall), then to app (finishing IO).
The app can choose not to do the IO after receiving the notification. This works well with Rust future cancellation.

In io_uring, the OS finishes the IO directly (writes to the buffer) and then tells the app. It's just one context switch from kernel to app (faster than epoll's kernel-to-app-to-kernel-to-app). The IO is fully done by the kernel. The app cannot choose to "receive the notification but not do the IO". When the app receives the notification, the IO has already been done. This doesn't work well with Rust async cancellation.

The two solutions for io_uring:
Make the future non-cancellable. Rust doesn't yet have linear types (must-move types), so this cannot be guaranteed by the language.
Make the buffer heap-allocated. When the future is dropped, the buffer can still exist, and the kernel can write to it without violating memory safety.

How to reduce future size:
Avoid creating a large in-place buffer (for example, a big array held in a local variable). The buffer will be stored directly in the future.
When calling another async function, first box that future and then await it. If not boxed, the sub-future is put directly inside the parent future.

Making async code call sync code is easy, but risks blocking the scheduler thread, as mentioned previously.
Making sync code call async code is not easy. It requires using the async runtime's API. But it's less risky.

Downsides of unlimited concurrency:
For a scraper, if concurrency is too high, it may use too much memory and then OOM.
If it sends too many concurrent requests to a remote server, it may trigger a rate limit, and then most requests fail.
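The un-polled future hazard ("futurelock") described earlier can be reproduced without any runtime. The following is a toy sketch under stated assumptions: ToyLock, ToyGuard, Acquire, YieldOnce and the two async functions are hypothetical names invented for this illustration, the lock is single-threaded, and polling is done by hand. A real async mutex such as Tokio's registers wakers instead of returning Pending without arranging a wake-up.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy single-threaded async lock (illustration only).
struct ToyLock {
    held: Cell<bool>,
}

/// Releases the lock when dropped.
struct ToyGuard<'a>(&'a ToyLock);
impl Drop for ToyGuard<'_> {
    fn drop(&mut self) {
        self.0.held.set(false);
    }
}

/// Future that completes once the lock is free. It never registers a waker,
/// which is only acceptable because this demo polls by hand.
struct Acquire<'a>(&'a ToyLock);
impl<'a> Future for Acquire<'a> {
    type Output = ToyGuard<'a>;
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.0.held.replace(true) {
            Poll::Pending // somebody else already holds the lock
        } else {
            Poll::Ready(ToyGuard(self.0))
        }
    }
}

/// Suspends exactly once.
struct YieldOnce(bool);
impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

async fn hold_lock_across_await(lock: &ToyLock) {
    let _guard = Acquire(lock).await;
    YieldOnce(false).await; // suspended here while still holding the guard
}

async fn needs_lock(lock: &ToyLock) -> &'static str {
    let _guard = Acquire(lock).await;
    "acquired"
}

/// Returns (b stayed stuck while a was un-polled, b finished once a was dropped).
fn futurelock_demo() -> (bool, bool) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let lock = ToyLock { held: Cell::new(false) };

    let mut a = Box::pin(hold_lock_across_await(&lock));
    let mut b = Box::pin(needs_lock(&lock));

    // `a` takes the lock and suspends. From here on it is neither polled nor dropped.
    assert!(a.as_mut().poll(&mut cx).is_pending());

    // However often `b` is polled, it cannot make progress.
    let stuck = (0..1000).all(|_| b.as_mut().poll(&mut cx).is_pending());

    // Dropping `a` releases the guard, and `b` completes on its next poll.
    drop(a);
    let unblocked = b.as_mut().poll(&mut cx).is_ready();
    (stuck, unblocked)
}
```

In this sketch the holder of the lock is alive but starved of polls, which is the state a borrowed future is left in while another select! branch runs. Dropping it (real cancellation) would have been harmless; keeping it un-polled is what blocks the other task.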

Langur Monkey 2 weeks ago

Local TTS is getting very capable and accessible

Around 2007 I spent half a year in the University of Aberdeen working on my final year project involving NLP . The project consisted of an interactive game that was controlled by language input. It also had to produce speech. At that time, we managed to partner with a group at La Salle University that were working on a TTS system for Catalan. It was a closed system that was accessible via a web API, but it was far too slow for real time use. I ended up preprocessing the audio of all dialog in the project. At that time, I was amazed that a computer could so easily convert text to an understandable audio file. The voice was very robotic, and the results were hit or miss, but it worked . Fast forward to today, TTS systems are everywhere. Several groups have released low-parameter TTS models that run very well on consumer hardware. I have been using the lightweight Kitten TTS for a while with fantastic results. The models are so lightweight that some websites are heavier than entire Kitten TTS models: Projects like streamline and trivialize Kitten TTS inference. I have a shell script in one of my directories that does everything in a single command: This clones the project, pulls dependencies and models, and plays the audio. It is quite fast, especially when using cached data. Kitten TTS produces acceptable results, though the output usually lacks emotion and nuance. For simple use cases (reading notifications, generating voiceovers for scripts) it’s more than sufficient. Qwen3-TTS , which I’ve been recently testing, represents a step-up in quality. It’s extremely good, and local inference is practical even on modest hardware given the model sizes. It offers three interesting variants: The voice design models are particularly clever: you describe the voice you want alongside the text to convert. Want a deep, gravelly voice with a Scottish accent? Or an excited teenager talking about a video game? Just describe it. 
It’s remarkable that you can run this locally so easily. However, as far as I know there’s no off-the-shelf CLI tool that handles dependencies, downloads the model, and runs inference out of the box. That’s why I created QwenSay . With it, you can clone the repository and convert text to speech locally from your terminal without wrestling with dependencies or writing any code. Here’s how it works. First, set it up: Now, you are ready to convert your text to speech with Qwen3-TTS: This uses the default 1.7B voice design model. You can also specify the model with . There are many other CLI arguments that you can use to tune your output. Check out the repository documentation for more details. Whether you’re building accessibility features, creating voiceovers for projects, or just experimenting, this is worth a try. I’ve made QwenSay my go-to TTS tool because it produces high-quality results and is genuinely fast.

Stratechery 2 weeks ago

An Interview with OpenAI CEO Sam Altman and AWS CEO Matt Garman About Bedrock Managed Agents

Good morning, As I noted yesterday, today’s Stratechery Interview is early in terms of my timing — Tuesday instead of Thursday — and late in terms of delivery — 1pm Eastern instead of 6am — because the topic was embargoed. That embargo created a bit of a weird situation for me over the last several days: So here we are. I think the Microsoft-OpenAI deal makes a lot of sense for both sides. Here are the bullet points of the new arrangement from Microsoft’s post : I think the most important point is the last one. Azure had a real competitive advantage thanks to being the only hyperscaler able to offer OpenAI models, but this also hindered OpenAI, particularly once it became clear that many enterprises cared first and foremost about accessing models on their current cloud of choice; I’ve been noting for a while that this was a real competitive advantage for Anthropic . In other words, Azure’s exclusivity was actively damaging Microsoft’s investment in OpenAI, and given Anthropic’s rapid growth this year, Microsoft needed to tend to their investment, even if it diminished Azure’s differentiation. OpenAI, meanwhile, clearly sees AWS as a massive opportunity — so much so that they are forgoing Azure-related revenue for the next few years (which, per the previous point, will help Azure management feel better about losing their exclusivity; their PnL is going to look a lot better without paying a revenue share to OpenAI). OpenAI is also releasing Microsoft from the AGI clause ; now the agreement between the two companies will run through 2032 no matter what. What does seem clear is that OpenAI’s focus is going to be on AWS, and the greatest evidence in that regard is the topic of this interview: Bedrock Managed Agents, powered by OpenAI. The easiest way to think about this offering is Codex in AWS; a lot of what makes Codex work is the fact that it is local, which gives you a lot of complexity, particularly in terms of security, for free. 
It’s another thing entirely to figure out how to make agents work across an organization, and the goal of this offering is to make these workflows much more accessible for organizations who already have most of their data in AWS. To that end, in this interview, we discuss how AWS created the entire cloud category, and the impact it had on startups, and how AI is both similar and different to that previous paradigm shift. Then we discuss Bedrock Managed Agents, what it is, and how it differs from Amazon’s existing AgentCore offering. We also touch on Trainium and why chips won’t matter to most AI users, and why partnering makes sense relative to Google’s focus on full integration. As a reminder, all Stratechery content, including interviews, is available as a podcast; click the link at the top of this email to add Stratechery to your podcast player. On to the Interview: This interview is lightly edited for clarity. Matt Garman and Sam Altman — well Matt, welcome to Stratechery — and Sam, welcome back [I previously interviewed Altman in October 2025 , March 2025 , and February 2023 ]. Sam Altman: Thank you. Matt Garman: Thank you, thanks for having me. So Matt, this is your first time on Stratechery. Alas, I think that Sam’s presence is going to preclude the usual getting to know you section. Besides, he doesn’t want to hear us reminisce about our times at Kellogg Business School, but it is good to have a fellow alumnus on the podcast. MG: Yeah, I’m happy to be here. I’ll come back another time and we can do a little deeper dive. That’d be great. You’ve been working on AWS since you were an intern, and you’re now in charge of the entire organization during this AI wave. What aspects of building the AI business are the same as building the original commodity compute business, for lack of a better term, and what aspects are really different? 
MG: I think that the parts that are the same are that I see that same excitement and builders out there being able to do things that they were never able to do before, and one of the cool things is when we first started AWS, is developers all of a sudden could get their hands on infrastructure that was only available to the largest companies who had millions of dollars to go build data centers. With a credit card and a couple of dollars, they could spin up applications and it really exploded what was possible for people building out there on the Internet. We kind of took the idea that people could build whatever they want and we weren’t going to presuppose what they should do and that the creativity of the world out there was, if we could put powerful tools in front of them, they’d build interesting and amazing things. I think this is as much, if not more, transformational to what it’s enabling builders out there to do. As you think about what’s possible, you don’t have to have gone to school and learned for 10 years to code in order to go build an application, you don’t have to have huge teams of hundreds of people and months and months and months of time to go build things. You can build things with small teams, you can build it fast and you can iterate quickly, and AI is unlocking all sorts of innovation across every different aspect of the world. I think in many ways that’s very similar, and it’s super exciting to see what it’s enabling from the customer base out there. There was a bit, though, when AWS came along, you were the only one , so you get all the upsides and downsides and everything sort of for free. Is there a bit where it felt like in the AWS era, there’s a lot about commodity compute, making it fungible, elastic, cheap — in AI, particularly in training, it feels like the winning abstraction was more about these really vertically integrated super clusters, really advanced networking, and really tight linkages between software and hardware. 
Was that sort of a surprise for you, where you’re coming at it now — instead of fresh, “We’re the only ones here, we had a particular way of looking at large-scale compute”, and at least for the first few years of AI, it maybe didn’t perfectly align? MG: I don’t know that it was different for us. I think for what was different though, is just the incredible rapid scale of adoption, and I think that that’s probably surprised everybody. Sam, you can weigh in different if you disagree, but just the speed of adoption and how fast people have grabbed onto the capabilities there, I think has surprised everyone. It’s different if you go to the, when we started cloud computing, it took us a really long time to explain why a bookseller would provide your compute power, that was a lot of explanation to explain what cloud computing was. There was a lot of hard work that people forget, but back in 2006, it wasn’t a given that that’s just how the world’s computing would move to and so there was a lot of kind of hard work there. Do you think you had to do a bit of explaining now though, because lots of people were anchoring on the training era and you’re like, “We’re thinking about the inference era “, and that’s going to be something different, maybe you still had to get those explanatory powers going again? MG: You do, but it’s just how quickly people understand what you’re talking about is just totally different. So I think yes, I think if you move from where people are saying, “That does seem kind of cool, and it’s really neat that I have this intelligent chatbot that I can talk to”, going to, “I can actually do work in your enterprise”, has been a little bit of an education, but it’s also been relatively quick in the scope of how fast technology moves. 
We’re going to get to the product that we’re here for very quickly, I promise, but Sam — from the startup ecosystem perspective, when you look back, obviously AWS, transformational , completely changed where the barrier was, now anyone can get started. You have seeds, you have angel investors, and it sort of moves back the barrier where the cutoff point, you don’t have to get servers on a PowerPoint, you can build an app and then go to your Series A or whatever it might be. What, though, is different or the same compared to what that enabled versus the world today from your perspective? SA: I think there have been four great moments for platform enablement of startups at mass scale: there was the Internet, there was cloud, there was mobile, and then there was AI. The first one of those that I was kind of like an adult for was the cloud and in the early days of YC [Combinator] — it’s like hard to overstate what a change this meant for startups. Before, you had these startups that were like renting colo[cation] space and putting together servers and putting stuff in there and it was this like massively complex thing, and you had to like raise all this money. Then all of a sudden, even though the cloud happened like right after YC got started, I guess it was the year after. I was just going to ask that — is it really at the end of the day, they’re really hand-in-hand more than you realized at the time? SA: They felt incredibly hand-in-hand at the time, it felt like YC was, you know, surfing this wave of the cloud from the very beginning because there were some early pre-AWS examples. You don’t need to put that much money into a startup to get something off the ground if AWS exists compared to what it might’ve been before. SA: It was this huge enabling change and it was part of why YC sounded so crazy at the time. 
People were like, “Well, there’s no way you can fund a startup with a few tens of thousands of dollars, it’s impossible, the server costs more than that”, so it was this complete change to what startups could do with small amounts of capital. Startups generally win when there is a big platform shift and you can do things with a faster cycle time and much less capital than before, that’s a classic way startups can beat big companies, and at the beginning of my career, I really witnessed that happen with the cloud, it actually feels quite directionally similar now watching what companies are doing building on AI, but as Matt was saying, the speed of it is crazy. Is there a bit where the incumbents, the large companies, are adopting this way faster than they were the cloud? SA: There’s definitely more of that, but I also mean just the rate that revenue is scaling in at startups — I spoke at YC recently and I kind of asked at the end, “What are the expectations for revenue for a good company at the end of YC?”, and they’re like, “Well it’s kind of changing every month, maybe we’d have a different answer at the beginning of the batch versus the end of the batch”, and this never used to happen before. Just the rate at which people are able to build scaled business on this new platform is unlike anything I’ve seen before. You were the cloud of choice for basically all startups, a huge advantage to that whole era, Matt. What makes you the cloud of choice today? Because you think about a lot of people building on the OpenAI API, or is that something you felt, “Actually we’re coming at this market from a very different perspective, we have a huge installed base who’s begging us to get AI things, and we have less visibility into this whole cohort that Sam’s talking about”? MG: I think there’s a couple of things. One is, is we’re quite excited about our partnership, and I think it’s going to be really meaningful to a bunch of startups out there. 
But today, even if you go and you talk to startups, the vast majority of scaling startups are still scaling on AWS today, and there’s a whole bunch of reasons for that. The scale is there, the availability is there, the security is there, the reliability is there, that kind of partner ecosystem of other ISVs are in AWS, the customers are in AWS. (laughing) Everyone’s used the AWS panel whether they wanted to or not, so they’re used to it. MG: And we help them. We spend a ton of time enabling startups, whether it’s with credits, but it’s not just with credits, it’s advice on how to set up your systems, how to think about go-to-market, a bunch of those things that are, I think, are really appreciated by a bunch of the startups, we invest a lot of time and effort to make sure because we really feel like the startups are the lifeblood of AWS. They were from the beginning, like when Sam was talking about it, but they remain today, and I still go once a quarter out to Silicon Valley or other places to meet directly with startups to hear what they’re doing, to make sure that what we’re building is landing with them. So there is more competition today than there was 20 years ago for that startup attention, and it’s just as important for us as it’s ever been and we spend a ton of time to make sure that we’re meeting the needs of those startups. Is it fair to say people building directly on the OpenAI API, as opposed to say the Azure version of it, are more likely to have a stack of AWS for regular compute and then OpenAI for their AI? MG: I think that’s a very common pattern that a lot of startups have today, absolutely. Well that brings us to today’s announcement: Bedrock Managed Agents, powered by OpenAI, I think I got that right. 
The pitch, as I understand it, is not simply OpenAI models are available in AWS — I don’t think that’s allowed — it’s that OpenAI’s frontier models are being packaged inside an AWS-native agent runtime, identity, permission state, logging, governance, and deployment. Sam, is that the right way to articulate it? SA: Yeah, that was pretty good. Thank you. What is this? Now explain it in English. SA: I think the next phase of AI is going from you supply some text to an agent and get more text back, or even you supply a bunch of code and get more code back, to we are going to have these agents running inside of a company doing all different kinds of work. Virtual co-workers is kind of my least bad of the ways I’ve heard this described, but no one has quite figured out the right language for this, and we are packaging a new product that we’re working on together to help enable companies that want to build these sorts of stateful agents and make them available. Again, I think we don’t know exactly how the world’s going to talk about these, use these, but if you look at what’s happening [with Codex], I think there’s a great example of where we can see this all going. How important is the harness , the runtime around the model, the tools, state — to your point, a very important word to you — memory, permissions, evals, to making agents actually work? SA: Hard to overstate how critical it is. I no longer think of the harness and the model as these entirely separable things, like my experience of using these, I am very aware of the fact that I don’t always know when I fire something off in Codex and it does an amazing thing for me. I don’t know how much credit — Was it that the model is amazing or the harness was amazing? SA: Yeah, exactly. To what extent is the harness developed in conjunction with the model? Where does that integration happen? Is it in post-training? Is it in the prompt? What makes this integration work? SA: Both of those. 
It’s not really part of the pre-training process but I would say you can look at it — there’s a more interesting thing here which is the fact that we’ve seen examples of this many times in the past of where things that we thought were very separable get baked in more and more and more. Like the way we initially thought about tool-calling, which is now a critical part of how we use these models, was not something that we thought about deeply integrating into the training process and over time we’ve done more and more of that. I would also suspect that model and harness come together more over time and I would for that matter, I would expect that pre-training and post-training eventually come together more over time as well. It’s such a cliché to say, but I’ll do it anyway, because I think it’s very, very true — we’re so early in the paradigm of all of this, this is still like the Homebrew Computer Club days of how much this is like really matured as an industry. This is why I think so interesting, I wrote about this a few weeks ago , in any value chain, ultimately a point of integration emerges that that’s where it’s really important, these two pieces have to go together to make it work. And over time, that’s obviously where a lot of value collects — my thesis then is that this harness-model integration is the key point. It’s to your interest, but it sounds like you agree. SA: It is to my interest, I do agree, but I also would say even more broadly, what you care about is that you go type into Codex what you want to happen and that it happens. You don’t care about the implementation details. SA: I don’t think you do. There have been so many examples as we’ve been figuring all of this out where we had to do something at the level of the system prompt, that later we didn’t. 
The general observation here is as the models get smarter, you have more flexibility to get them to behave in the ways you want which sounds like an obvious statement, but it is— It’s easier to tell a 10-year-old what to do than a 5-year-old. SA: When I think back to what we had to do to get any drop of utility squeezed out of these models back in the GPT-3 days that now you never would have to, because of course the model just understands and does it well out of the box, that trend may keep going much further. MG: I was just going to add to that — I completely agree with that and I think when you talk to customers who have ideas exactly what they want these systems to do, previous to this kind of joint collaboration that we worked on together, is that customers were kind of forced to pull that together themselves, right? They wanted these models and agents to remember that they work together well and they wanted to integrate into their existing systems, and it’s not just third-party tools, it’s their own tools. They want them to learn about their own data, their own applications, and their own operating environment and all of that kind of integration today, at least, is left to every single customer to do on their own. So part of this joint collaboration that we were leaning into together is co-building a new type of product that actually brings those things much closer together so that customers can much more easily go accomplish these things that they want to do, where identity is already kind of built into that product, where the ability to go authenticate to your database all happens inside of your AWS VPC [ Virtual Private Cloud ]. You can do a bunch of these things that would be possible to do if we were kind of at the OpenAI APIs and AWS over here, but by building this thing together, we make it much easier for customers to much more rapidly get to value and go accomplish the thing they want to do inside of their enterprise environment. 
So you think that you can build a functional agent in a generic harness, it’s just way more difficult? You’re making it easier? Or is there a bit where actually there might not even be stuff you can do if you don’t have them tied together? SA: To go back to your earlier analogy, pre-AWS days, you could do a lot if you were willing to go stand in a cage and buy a bunch of servers and figure out how to connect them and hire your own network engineer, and you could make a lot of things happen and then all of a sudden as soon as you could just like log into an AWS control panel and click, “I need another S3 instance”, or whatever, you could make a lot more things happen because the activation energy, the amount of work that required for the basics, got way better so you can do a lot with the models today. Yet every time I watch someone use our models or try to set up some of this work Matt was saying, I am torn between being happy they’re so impressed and feel like this is a magical technology and pulling my hair out at how much pain and suffering they’re going through to get anything to work at all, and that’s not just true of developers building these products, even using ChatGPT and watching people copy and paste things from here to there and try to have this complicated set of prompts — I know that’s going to go away, and I’m thrilled. It’s still so early, and so bad. Just don’t take away your integration with BBEdit , that’s all I ask, my number one favorite feature of the ChatGPT app. (laughing) Thank you. 
SA: A) This stuff is just way too hard to do, and we think if we can make it way easier it’ll bring way more value to developers and businesses, but B) there are a lot of things that you just can’t reliably get to work at all and I think through our joint collaboration not only will it be a story of ease of use and not having to go build out your own colo or whatever, we are going to jointly figure out a lot of new things to build where people will be able to build products and services that just can’t be done even with a lot of pain and suffering. I actually want to come back to that point about things to be built. But just to go back to Codex real quick — Codex is a harness and model, it runs locally. Why is it easier to get agents to work locally right now? SA: Actually, we started with it running in the cloud, and I think eventually you do want it to run in the cloud. For sure. I’m walking through the transition to this offering, which is in the cloud. But why did you go back to local? SA: You have your whole environment there, your computer’s set up, your data is there, you don’t have to like think about — it was just easier to get to work, even though it’s not the end state. But getting to a world where agents do run in the cloud and when you — if you have a very intensive thing, or you need to close your computer or whatever, you can hand stuff off to working on the cloud, I think is clearly going to be great. But the ease of use that we were able to deliver clearly in the short term, it won out to have it using your local environment. 
There’s one way that I think about it, is like you have the old school security model, which is like the castle-and-moat sort of thing, and you’re moving to a new security model of zero trust and everything having the appropriate permission structure and authenticating and all those bits and pieces, and it feels like to me one way to frame running locally, it’s like your self-imposed castle-and-moat, everything’s on there, I just assume it’s all fine and easy to do. And a way I’m thinking about this, and Matt, let me know if that resonates with you, is to get all those pieces to actually function in a production environment you just can’t even have that all locally, you have to be operating this environment from the get-go, is that a right way to think about it? MG: I don’t know that there’s any computing environment that’s gotten rid of a client, there are just benefits of operating locally. There’s a reason that most of your iPhone apps also have a local component, whether it’s connectivity or latency or just local compute or access to files and applications. The local client does have a particular — as Sam said, it’s easy, it works really well, it’s constrained, though, there’s limits to it. You can’t scale out your local laptop, you have what you have and once you start getting in an enterprise contract, sharing between two people gets to be a little bit harder — thinking about permissions, thinking about security boundaries gets to be a little bit harder. So there’s a number of those pieces where I think that, I wouldn’t say that having the local environment is a bad thing, it’s just a different thing, and I think that you’re eventually going to want to have that bridge across both. 
That’s my question, because you have in the cloud era, you had containers that helped you converge local and production environments, but it kind of feels like in this case if you have to deal with agents, to your point, say I was like a virtual co-worker or whatever it might be, if they have their own identity and they have their own permissions and all those sorts of things, to even build them you need to be in the right environment as you’re going to deploy it, it would seem that way to me. SA: I think there is so much to figure out here. Just to give one example, if you’re an employee at a company, do you want to have one account for when you use some service, and then should your agent just use your account, or should your agent use a different account so that the server can tell which is which? Or what if you want lots of agents? SA: Exactly. I suspect that what we actually want is something we haven’t figured out yet, and maybe it’s that when Ben’s agent is logging in as Ben, it uses Ben’s account but it notes that it’s an agent and not the real Ben. We don’t even have a primitive to think about that, but we may quickly need to figure that out and my sense is there are going to be 50 other things like that where as we have agents join the workforce and act with increasing levels of autonomy and complexity of tasks, a lot of the mental models that we have for how software works and how access control and permissions work inside of a company or on the broader Internet, those are all just going to have to evolve. How do you think about, Matt, in terms of security and access policies and whatnot for agents? MG: Yeah, I do think that that’s where when you move more of these workloads into the cloud that you can have as a central organization, more controls over some of the security pieces of it. 
And I do think, when we talk to customers all of the time, it is what they worry about, which is, “I love the promise of what I can do with some of these really powerful models and agents, how do I make sure that I don’t have a company-ending event where I screw it up?”, and there’s the worry out there. I think we can help with that because these are solvable problems, they are, and I think, giving some customers confidence, “Well, it operates inside of this VPC”, and you can at least then control that boundary and know what it has access to, or it goes through this gateway, and you can give it permissions, much like you give it a role inside of the rest of your environment. These are constructs that over the last 20 years, we’ve built up a really rich set of capabilities, so that it’s not just Y Combinator startups, but it’s global banks and healthcare agencies and everybody in the world and government agencies that can use AWS and having built up all of that security structure around it, I think can help us further accelerate how they take advantage of this technology and kind of have these safeguards to run fast. I think a lot of times when you’re in a company, particularly companies that are in risk-averse environments, having those safety guardrails where they say, “If it operates inside of the sandbox, I am excited to go fast”, can actually help many of our customers start to use these technologies for a much broader set of things. A lot of these capabilities you’re talking about that you’ve developed over 20 years and you’re trying to put it in place for agents are exposed today through AgentCore . So what is the relationship between Bedrock Managed Agents powered by OpenAI and Bedrock AgentCore? MG: A lot of what we’ve built together is building on the building blocks of AgentCore in order to kind of pull some of these pieces together. So there’s like a super set that sits on top of that? 
MG: The AWS team and the OpenAI team used AgentCore components together with the OpenAI models and a bunch of those pieces to go and co-build this product together. AgentCore is kind of our set of primitives where, just like with AWS, if you want to go and build your own agentic workflows, you can do that. You can have a memory component, you can have a safe execution environment, you can have a permissioning capability, and you can go and configure all of those and we have customers running those in production today that are doing really cool things. But not with OpenAI. MG: But not with OpenAI, they have to use different models today, that’s true. Actually, that’s not true, we have people doing it with OpenAI. Oh, just calling to another cloud or whatever. MG: They just call directly to the OpenAI model. So we actually absolutely have people doing it with OpenAI today, not natively inside of Bedrock, but they’re still using that. And it’s an open ecosystem where you can pull different capabilities to go build whatever you want and my bet is that people will continue to do that. We have builders out there that love to, to Sam’s analogy, love to continue to build computers at home today, even though you don’t have to do that, and even though people like to build and we think that people for a long time will build their own agents, but the vast majority of them are going to want an easier way to do it where they don’t want to have to go configure all of those pieces themselves and that’s part of what we’ve launched in this collaboration together. Just to be super clear, you talk about this managed experience with Bedrock Managed Agents, you can also use AgentCore and pull from a model, whether on AWS or somewhere else. And just to make clear, Sam, this is a question for you, this is the distinction between OpenAI on say, Azure, where that’s just you have direct access to the API, and that is distinct from this managed service on Amazon. Is that correct? 
SA: Correct, yep. And you feel very good about that, that’s scoped correctly in all terms, it’s not going to be an issue going forward? SA: Yeah, I think things will evolve over time, but I feel very good about this as a way to start. Is this going to be an exclusive offering for AWS? Or do you anticipate having this sort of managed agent service on other clouds? SA: Yeah, we’re doing this exclusively with Amazon, we’re excited about it. How much of the exclusive is, “Look, we’re using all Amazon’s APIs, of course it’s only on Amazon”, or is this the overall idea of a managed experience, it’s not just a “We’re using Amazon APIs”, it’s, “Right now this is going to be on Amazon”? SA: Spiritually, we want to do this as a joint effort between our companies. Got it. The PR does say something, and this goes back to the point you mentioned, Matt, earlier about you could call out to other APIs and glue this all together yourself. In this case, the customer data stays within AWS, so what exactly does OpenAI see, what does that mean? MG: That’s right. So the whole thing kind of stays within your VPC and so data is protected inside of the Bedrock environment. Got it. And this is going to be running on OpenAI models through Bedrock, and these are going to be on Trainium ? MG: They’ll be through a mix of different – some of it will be on Trainium, some of it will be on GPUs. Is that just a function of timing? Because I think as part of your announcement a couple of months ago — MG: Some of it’s timing and capabilities, I think we’ll kind of be mixing in the different components of building the system together, using the right infrastructure for the right parts of it. But over time, more and more of it will be on Trainium. SA: We are quite excited to get these models running on Trainium. I can imagine. One quick question, just a general question about Trainium, Matt. Trainium, is it fair to think, and this is the way I’m thinking about it, so I want to make sure I have it right. 
Trainium — very unfortunately named, because it’s really going to be about inference going forward — the number one manifestation will be through managed services like a Bedrock, where the customer doesn’t even necessarily know what compute they’re using, is that a fair way to think about it? MG: Number one, I take responsibility for bad naming across all AWS services. Look, I have a word-of-mouth site named Stratechery, so I have all sympathy for bad naming. SA: I think Trainium is a cool word. MG: It is a cool word. It is a cool word, it just feels like it’s an inference chip, not a training chip. MG: It is. But, yeah, naming aside, it is useful for both training and inference. And look, it’s a chip that we’re incredibly excited about, and both in the current generations as well as ongoing, we think that’s going to be a huge business and a real enabler for a lot of the things that we do together. I think just with GPUs, by the way, you’re going to interact with a lot of these accelerator chips through abstractions. So the vast majority of customers don’t interact with GPUs either, except through maybe like in their laptop or something like that, for graphics. But when you’re talking to OpenAI, even if they’re running on GPUs, you’re not talking to the GPUs, if you’re talking to Claude, you’re through GPUs or Trainium or TPUs, you’re not talking to any of those chips, you’re talking to the interface. And the vast majority of inference out there is being done on one of a handful of models. And so whether it’s 5, 10, 20, 100, it’s not millions of people that are programming to those things directly, and that’s gonna be true going forward just because these systems are so complex, they’re very large. If you’re going to go train a model, not that many people have enough money to go train a model, not that many people have the expertise to actually manage it. 
They’re very complicated systems, and the OpenAI team is incredible in their ability to squeeze value out of a very large compute cluster. But not that many people have the team that can do that, independent of what the chip happens to be, and so I think that that’s going to be true for all accelerator chips, honestly. SA: Ben, I increasingly think of what we have to do as a company is to be a token factory. But what the customer cares about is that we can deliver the best unit of intelligence at the lowest price and as much of it as they want, with as much capacity as they want. Do you think we stick with pricing as far as — pricing is based on tokens, does that make sense in the long run? SA: No. And in fact, like there was an interesting example of this with our model that just came out , 5.5. where the per-token cost is much higher than 5.4, but it requires a hugely fewer number of tokens to get the same answer, and you actually don’t care about how many tokens the answer takes, you just want the piece of work done, and you want again a price and an amount of capacity you can have for that. So maybe I was wrong to say “token factory”, but we’re like an intelligence factory or something. We just want as many units of intelligence for the lowest price and whether that is a bigger model running fewer tokens, a smaller model running lots of tokens, whether a GPU or Trainium or something else, whether we do any of the other kind of number of things we could do about that creatively, I don’t think customers care. In fact, they don’t really interact with that. When you go put something into Codex or when you go build a new kind of agent in the SRE [ Stateful Runtime Environment ], you should never have to think about that and you should just be astonished at how much you get for how little cost. Is the reduced token usage is that model, or is that harness? SA: That’s mostly model, it’s a little bit harness. Got it. 
Do you anticipate Matt, by the way, I asked Sam the exclusive question, do you anticipate offering a similar managed service for other models? MG: We’re focused on doing this with OpenAI right now. We’re very excited about what we’re doing together, and the fullness of time is a long time. The fullness of time is a long time, I’ll let you stick with that one. It’s fine, I had to ask the question. I do have a question as far as customers, Sam, to your point, both your input on this, I’m curious — when people are actually in production, where does OpenAI’s responsibility end and AWS’s begin? It sounds to me, if all the data is on AWS and it’s staying there, and they’re operating at a higher level, this is ultimately AWS’s responsibility? Is that the right way — am I thinking about that correctly from a consumer perspective? MG: Yeah, I think that’s right. When you’re going to call somebody, you’ll call AWS support to help you out, and it’s part of your AWS environment and you build it together and your AWS account reps are going to help you there. And we’ll bring in, when we’re building it, we’ll bring in our OpenAI colleagues to help you figure out how to best take advantage of this or whatever. At some point, if we run into a bug that we need their help with, we’ll escalate over to them, but AWS will be that frontline support that you kind of interact with. Where do you see the scale of this business, Sam, relative to your core API business? SA: I hope it’s going to be huge, we’re putting a lot of effort into this, we’re committing to buy a lot of compute, I believe there will be a lot of revenue there to support this. The increasing framework that I’ve had is that at a low enough price, demand for intelligence is essentially uncapped. So is it very elastic in that regard? You decrease price, demand goes up? 
SA: It’s certainly that, but again, you can decrease the price of water and maybe you’ll drink a little more water, maybe you’ll shower twice a day instead of once a day, there’s some elasticity there but at some point you’re like, “You know what, I have enough water”. Also you will buy water no matter how much it costs if you have to. SA: Other utilities, if electricity is cheaper you’ll certainly use more of it, but if you think about intelligence as a utility, there’s no other utility I know of that I’m just like, “I just want more, I’ll just use more as long as the price is low enough, I’ll just use more”. MG: I will say actually and interestingly it’s largely been true of compute power where if you think about the cost of a compute cycle today versus what it was 30 years ago, like I don’t even know how many orders of magnitude cheaper, and there’s more compute being sold today than ever. Right. People don’t really think about the cost of compute at least until they’re at extremely high levels it’s a material level, but by and large strategically speaking it’s just assumed you have compute. What’s the runway to getting there with AI where it’s not the number one thought process, “How much am I spending here?”. SA: I don’t think that is the number one thought process. Right now we have way more customers asking us, “No matter what the price is, can you give me more? I just need more capacity, I’ll pay you extra”, than we have arguing with us about the price. But I do think we are going to continue to bring the price down crazily dramatically, now maybe the more we do that the amount of wealth that wants to flow and just goes up more and more and more. 
But I am confident we will continue to be able to reduce the cost of today’s level of intelligence quite dramatically — one thing that has somewhat surprised me is how much, and I don’t know if this is going to stay the case or not, but at least today how much of the total market demand is at the absolute frontier. Right, there’s a lot of questions about that. It’s very expensive to serve the front end, people can just get the previous one, but you’re saying people just want to be on the front end no matter what? SA: So far they do. MG: And I think that’s a good signal that you’re not anywhere close to where we want to be and that there’s so much more demand, and I really do think it’s like if you go 40 years ago to compute demand, a computer was crazy expensive, and now it’s dwarfed by the power that’s in everybody’s cell phone and we sell billions more of those things. I do think that that’s what’s going to happen to the AI world where today you’re pushing, everybody wants to use the frontier because that’s what you need in order to get a lot of useful work, and everyone’s so excited about the capabilities out there. I think over time, you will have a mix of models, by the way, where you will have some smaller models that are able to do stuff that even the latest OpenAI models aren’t able to do yet, but they will be smaller and cheaper and faster over time, and you’ll have the super big ones that are going to go try to cure cancer and other things like that. But I think we’re still at just the early stages of what’s possible and when you see this much demand and this much growth when you’re at the early stages of what’s possible, it’s exciting for what the future holds. Is there a bit of a cynical view here where, Sam, you had a bunch of customers that are like, “We’d love to use OpenAI models, but all our stuff’s in AWS, we’re not moving”. 
And Matt, you’re like, “Look, all our stuff’s in AWS, can you please go get OpenAI models?”, and this is just satisfying that need — and it turns out, because AWS is the biggest, that was an astronomical amount of need. Is that just the easiest answer? Or is there a bit here, too, where you actually think you can deliver something highly differentiated that will also draw new customers for each of you? SA: We’re clearly thrilled to get access to AWS customers, and so many people love AWS. Yeah, that is a true statement. MG: That part is definitely true. (laughing) Right. MG: And vice-versa, our customers are very excited to get access to OpenAI technology. SA: But I do think there is something incredible and new to build together, and I am hopeful that when people look back on this in a year, the most important thing people will talk about is not like, “Oh, finally, you can get access to these models via AWS”, or whatever, but it’ll be like, “Wow, we didn’t realize how important this new product was”. I think we are close at a model and harness and capability level to just a completely new kind of computing and that will feel very different than the existing ways people have thought about, “I need an API to this model”, or whatever. MG: I couldn’t agree more, that’s exactly it. The first part is great and is nice and the second part is, I think, what we all get super excited about. To that point, I mentioned I want to come back to this earlier, but I have a theory, which may or may not be correct, I’m curious your guys’ point about this, about stuff to be built. Specifically, there may end up being this real middleware or middle layer of where you have all these different databases and SaaS apps and all these bits and pieces of data in an organization that can stretch across things, you have this agent layer/harness or with the harness, I guess, sitting on top, and there’s something to be built in the middle and OpenAI Frontier gets at this a little bit. 
Is this part of this? Or is this something to be built? Or am I totally off base and we don’t need that at all? SA: You are totally right that we need something there. When I’ve been talking to customers recently, like large enterprises, they’re like, “I want some sort of agent runtime environment, I want a management layer where I can connect my data to agents and also make sure that I understand where I’m spending on tokens and not and have some sort of oversight there, and I want some sort of workspace” — hopefully it’ll be Codex — “something like that for my employees”, and that package of what people are asking for is getting remarkably consistent, but there is work to go off and now go build all that offering. It feels like there’s like almost a double agent layer that’s necessary. There’s like the agent layer to maintain the middle layer that is constantly spelunking down in all these data sources and then there’s the actual user interface layer that is where people are actually interacting with. Does that sort of fit with where we’re going or is that off base? SA: On both of those, I agree that that’s a picture of how the world looks today. As the models get really smart, I don’t think we know exactly what the architecture of the future is going to look like. Right now people do, at this sort of call it user agent layer, want to interact with multiple agents and we make it so that you can build agents for this thing and that thing and they can talk together and whatever else and then at the company management layer, people have all these controls about how you help the AI go spelunk and files in file systems. And at some point you realize that you’re just holding on to the past for no reason at all, this should just be in the model. SA: That’s what I was going to say. At some point, you may say, “Actually, we have such incredible capabilities, let’s re-architect the whole thing”. MG: Yeah, I agree. 
And I think there’s something different, and I’m not sure we all know what it is yet, but that’s part of the beauty also, is you get customers using and building and you can learn from them and figure out how you can make that easier, faster, better for them. Sam, this is the second time we’ve done one of these product launch interviews, last time it was with Kevin Scott and New Bing — you were pretty confident about the threat you posed to Google then, how well do you think that worked out? SA: I think we have done better than I expected. ChatGPT is, I think, the first really large-scale new consumer product since Facebook. Is that actually the answer, you’ve done better than you expected, but it manifested mostly through ChatGPT as opposed to other areas? SA: No, I think we’ve also done quite well on the API, particularly on Codex, but that was not what I was thinking at the time. At the time, I was thinking maybe these new kinds of language interfaces are going to change the way people find information on the Internet and you know — Google, also just absolutely phenomenal company, I think in many ways Google is still underrated just in terms of the breadth and depth of what they do, but I am happy with how ChatGPT has performed relatively. I actually have a Google question for you Matt, in a similar way. Google was just up there this week, Thomas Kurian talking about their fully integrated stack, all the way up and down from model to chip to agent layer, all that sort of thing. You’re here with another company executive, definitionally not fully integrated within Amazon, but is there a bit where everyone was critical of you not having a frontier edge model — now that we’re in this sort of inference area, you’re used to serving a lot of companies. Did you maybe end up in a better spot by being neutral in a way? Was that on purpose or did you accidentally end up in a great place that you didn’t realize it was going to be? MG: A little bit on purpose. 
We, since we started AWS, we have always embraced our partners as a key part of us supporting our end customers. Since the very beginning, it’s been an incredibly important part of our strategy is to lean in with partners and maybe different than some others, we view our success is if the partners are successful and they’re building on top of us or together with us, and if they’re successful, then we’re successful, that’s awesome. We view it as that’s growing the pie together, then that’s a win, and it’s not necessarily how others view the world. Sometimes they say, “I have to own everything”, and that’s okay, that’s a view that people have. But I think that choice is important, and that way the best products win. And by the way, you can have first-party products in that world, you can have lots of third-party products in that world, but our view is we want the customers to be able to pick the best thing for them. And if the best thing is your own stuff that you’re building, awesome. For us, if the best thing is what our partners are building, but it’s on top of us, we view that as a win as well, it’s because it’s the best thing for our customers. We’ve long thought that, and it’s actually how we built the Bedrock platform in the AI world. We want to support a broad set of models, we want to support a broad set of capabilities, and it’s true, it’s been true across from databases to compute platforms to other things like that. So I think it’s been an intentional strategy, I think it’s a strategy that customers appreciate because they like that, and we’re excited to continue to lean into it. Yeah, it’s interesting. There’s the balance between software, platform, infrastructure, and everyone says they’ll serve everyone. But it does feel like you go way back when AWS started, it’s like you start with the I [Infrastructure], and that gives you almost – that gives you the greatest flexibility, it feels like, from my perspective, to meet Sam in the middle. 
Sam’s got a great S [Software], you guys are building a P [Platform] together, I guess is the way to put it. MG: That’s right. It does make it hard where you say, “We have one S3”, there’s not other S3 offerings, that part is true. So some of those core components are, like you said, at the infrastructure layer, we do lean in pretty heavily on the stuff that we build. But as you move up that stack, I think there’s a broader set of capabilities and if you view the world that — in no world do I think any one company is going to own every application and as you get further down the stack, when you get to kind of the models and services layer, there’s fewer of those and you get down the infrastructure, there’s even fewer of those and our view is kind of embracing that whole set of partners is great for us end customers. Sam, any final words? SA: I think that was very well put. I really do think there’s a potential at a new generation of the kinds of products that developers can now build and given how steep we expect model capability progress to be over the next year, the fact that we’re going to go on this journey together and try to really build a platform to enable it, is coming at a good time, and I think people are going to love it. Very good. Matt, Sam, thanks for coming on Stratechery. MG: Awesome. Thanks for having us. SA: Thank you. This Daily Update Interview is also available as a podcast. To receive it in your podcast player, visit Stratechery . The Daily Update is intended for a single recipient, but occasional forwarding is totally fine! If you would like to order multiple subscriptions for your team with a group discount (minimum 5), please contact me directly. Thanks for being a supporter, and have a great day! 
Last Friday I conducted the following interview with OpenAI CEO Sam Altman and AWS CEO Matt Garman about Bedrock Managed Agents, powered by OpenAI ; naturally, one of my questions was about how this fit in with OpenAI’s deal with Microsoft giving Azure exclusive access to OpenAI models. Late Sunday I heard through the grapevine that Microsoft would announce something Monday morning; I wondered if it might be a preemptive lawsuit! On Monday Microsoft and OpenAI announced they had amended their agreement , allowing OpenAI to serve its products on other cloud providers, including AWS. Microsoft remains OpenAI’s primary cloud partner, and OpenAI products will ship first on Azure, unless Microsoft cannot and chooses not to support the necessary capabilities. OpenAI can now serve all its products to customers across any cloud provider. Microsoft will continue to have a license to OpenAI IP for models and products through 2032. Microsoft’s license will now be non-exclusive. Microsoft will no longer pay a revenue share to OpenAI. Revenue share payments from OpenAI to Microsoft continue through 2030, independent of OpenAI’s technology progress, at the same percentage but subject to a total cap. Microsoft continues to participate directly in OpenAI’s growth as a major shareholder.

Unsung 2 weeks ago

Abort, Retry, No, Thanks

If there was one go-to example of an impenetrable error message in the 1980s, it must have been this – popping up, for example, if your disk drive was dirty: On some technical level, the options made sense: “Abort” would stop whatever you were doing, “Retry” would try to repeat the action, and “Ignore” would proceed as if there was no error. But in the heat of a moment, or seeing it for the first time, this was a puzzling choice to be asked to make. Not only were the words weighted improperly (the seemingly most innocuous action here, “Ignore,” was actually the only one that could do actual lasting damage); it also wasn’t entirely clear what’s the safe thing to do to get out of the situation . (The redesign of “Abort, Retry, Ignore” was “Abort, Retry, Fail,” and it wasn’t really a huge improvement.) Last night, I installed Google Photos on my iPhone, and the first message that greeted me was this: This is really a matryoshka doll of bad dialog presentation. First: any buttons in a dialog should be labeled with enough information to keep me going . Here, both have generic labels, so now I need to pay attention. Second: Even after reading, I have no idea what is the choice I’m making. I see the pathway marked “yes, keep it the way I had it” and, sure – this would be generally what I want from any given computer on any given Sunday. But what’s the actual alternative? But the third, and most important one, is this: this dialog has no safe escape hatch. By now, in UX design, we established quite a few canonical escape hatches: a Cancel button, a × close box, a “No, thanks” link, a press of an Escape key. But you can’t × this dialog out. The main button seems positive, but it also feels like I’m taking an action with consequences, and I don’t want to deal with that. There is a “No, thanks,” but it doesn’t feel like the other “No, thankses” I have seen – it’s juxtaposed with copy that makes it seem… a dangerous thing to choose. And this last bit makes it a pretty serious design offense, because you are now messing with foundational stuff. 
You need to protect those escape hatches for the future; the moment you introduce hesitation into the mix and taint “No, thanks” as a concept , really bad things will start happening all across your product. In real life, fire doors have to open outwards when pushed with body weight, aircraft stick shakers are impossible to ignore, and anti-lock braking systems do smart things even after your brain turns off its smart parts. I know seeing a dialog like this would never happen in a moment of true panic, but sometimes I think of the user in their most absent-minded moment: trying to get their kids to hurry up for school, on hold with an annoying cable provider, with a cat looking like it’s about to jump up directly into a running toaster. A dialog on their phone pops up. If that dialog absolutely has to happen, what is the escape hatch it can offer so they can dismiss it safely if they cannot think about it at all ? This Google Photos screen needs a lot more rethinking and rewriting, but in its current incarnation, it desperately needs a clear and trustworthy escape hatch I can tap absentmindedly, just so I can get to my photos. #errors #google #onboarding #writing

Stratechery 3 weeks ago

An Interview with Google Cloud CEO Thomas Kurian About the Agentic Moment

Good morning, This week’s Stratechery Interview is with Google Cloud CEO Thomas Kurian. Kurian joined Google to lead the company’s cloud division in 2018; prior to that he was President of Product Development at Oracle, where he worked for 22 years. I previously spoke to Kurian in March 2021, April 2024, and April 2025. The occasion for these interviews, at least for the last three years, is Kurian’s annual keynote at Google Cloud Next. You can watch the keynote here, and read the blog about Google’s announcements here. I spoke to Kurian a week ago, on April 15, and at that time only had access to the afore-linked blog post. With regards to the keynote, which I have since watched, I thought it was a powerful opening: Kurian returned to last year’s theme, about a unified architecture, but emphasized that the use cases were no longer theoretical or pilots but running at scale for real users. He also emphasized — in a foreshadowing of a point we discussed below — that Google itself was running on the same infrastructure as Google Cloud. Google CEO Sundar Pichai, meanwhile, talked about Google’s capex investment, and that (1) half of it was going towards Google Cloud, and (2) that Google Cloud was running the same stack as Google itself. I sense a theme! Pichai also emphasized security, a point that Kurian was also careful to raise in our talk, before discussing the shift to agents. To that end, in this interview — which again, was conducted before the keynote — we discuss agents. Specifically, I wanted to get Kurian’s take on the quality of Gemini’s harness (unsurprisingly, he thinks it’s great). Google has an integration advantage, but is it paying off in such a large company? I was also curious about how Google thinks about TPUs specifically and the cloud business generally in terms of balancing its internal needs with external customers like Anthropic. 
We also talk about the software ecosystem, why Google still believes in partnerships, and why the company was ready to seize the AI moment (hint: it’s because of Kurian). As a reminder, all Stratechery content, including interviews, is available as a podcast; click the link at the top of this email to add Stratechery to your podcast player. On to the Interview: This interview is lightly edited for clarity. Thomas Kurian , welcome back to Stratechery. I promise I have recording turned on this year — in fact, I have two recordings turned on. TK: Thank you so much, Ben. Good to see you, thanks for taking the time. Well, I look forward to talking to you. It’s good to talk to you for multiple interviews, much better than talking to you multiple times in one interview, so we’re already doing better this year. But like last year, we are recording before your Google Next keynote . We’re actually quite a bit ahead, I think we’re several days ahead, but this podcast won’t be released until after the keynote. Therefore, I’m going to ask the exact same question I asked last year. Specifically, I like watching keynotes, not for the announcements, but for the framing that happens up front. Last year, that framing was infrastructure, [Google CEO] Sundar Pichai actually delivered that at the opening, then you came in and talked about that, and that was the context for everything that you talked about. What is the framing this year? TK: The framing this year is that as AI models have become more sophisticated, we see customers evolving the use of AI models from being used to answer questions in a chatbot-like fashion, to actually automating tasks on their behalf, and to automate process flows within the organization. By automating process flows, you both get efficiency improvements, productivity improvements, frankly, you can also change the way that you introduce new products and services to market, for example. 
In order to do that well, the technology, what you need is a world-class agent platform and to underpin the agent platform, you need world-class infrastructure. You need the way that the agents interact with your company’s data and your business — so you need capabilities to help an agent really understand the company’s business information and context. I think, as you’ve seen in the press, AI and cyber have become very contextual now, there’s a lot of concerns that AI will accelerate the speed of cyber attacks on people’s systems, and so we’re going to be talking about how we’re bringing AI and our cyber technology together to protect, including the integration of Wiz , and then we’re introducing Gemini Enterprise and our agent platform to customers. That’s sort of the theme of what we’re talking about. You mentioned agents last year, everyone was talking about them to a degree, what has really changed from last year to this year that makes this different? I read your whole blog post, it’s very long, and I think the word “agent” may appear in every single paragraph. TK: There’s three or four big things that have changed. The first is capabilities of models — Gemini is able to reason much more effectively as new versions of Gemini have come out. Second, they’re able to maintain long-running memory, which you require if you have an agent that’s automating tasks over many, many steps, it has to maintain a lot of state in memory. Third, their interaction with tools and the rest of the world, there have been good abstractions, skills, tools, MCPs [ Model Context Protocol ], as they’re called, they’re all abstractions for how an agent reasons and interacts with the rest of a company’s systems. 
All of them have advanced and so the core capabilities of the models themselves have gotten a lot better, the capability and the ability to use tools and interact with the rest of the world has become a lot better, the abstractions that the world exposes to the model have improved and so now you have models that have these capabilities to do these very complex tasks. That all makes sense and certainly tracks. A lot of these announcements, though, as I was going through them, a lot was about the infrastructure around agents, which makes sense — the orchestration, registry, identity, security, all these bits and pieces. All of this is clearly necessary for large enterprises, something they’re going to worry about and ask about. But the agents have to actually work; do Gemini agents actually work? Because there’s a lot of talk, you know, Gemini was the belle of the ball four months ago, but over the last little bit, it’s been mostly a lot about Anthropic and Claude, Codex, a lot of talk about that, and Gemini, not much talk. What’s your feeling about your actual capabilities, not just agents in general? TK: I’ve always said when people ask us about it, I always say, “Let our customers talk about it, rather than we talk about it”, I think you’re going to hear from 500 customers telling their stories at Next. Even people building agents, we have a whole range of them, from Citigroup to Bosch to eBay to Virgin Voyages to Walmart, there’s a whole range of them, Food and Drug Administration, etc., Comcast, Unilever, all of them are going to be talking about specific business problems they had. For example, for Citi, they’ll be talking about a new wealth advisor, Investment Management, where they’re using our agents to research a person’s investment priorities. 
So a person says, “Here’s my priorities for investment, my kids are going to school, I need this kind of cash flow in order to fund it”, and then it researches your financial portfolio and interacts with you to give you recommendations. If you look at Comcast, they’re using us for all of the work that they do for consumer services — this is repair, scheduling appointments, dispatching field technicians, there’s very complex flows that have many, many steps and interact with you with a lot of complex systems. If you look at some of these flows, they require all of the capabilities I talked about. So as an example, I want the capability to call a set of tools, and those tools may be I want to book an appointment, so I need calendar, I need to look up, if I’m dispatching a technician, I need to look up spare parts so I need to pull up from my inventory that spare parts inventory, I need to schedule that to be available at the same time as the person who’s going out, I need to update my inventory that have taken something out of it. I mean, these are very, very complex steps. What’s interesting about all these complex steps and going through all these bits and pieces, it sounds like you’re saying that almost the more constraints there are, the more things you’re bumping up into, is that actually a better environment for instituting these sort of flows just because what you need to do is clearly defined? TK: Just being perfectly frank, Ben, having constraints requires the model to be even more intelligent. Just as an example, the number of variants in a process flow that’s complicated many, many steps, the number of different idiosyncratic situations that you may encounter are large so you cannot a priori program every one of them. You need to teach the model to use, for example, to be able to spin up a virtual machine and use a tool in the virtual machine to generate code to deal with some of these situations. 
So the most sophisticated thing is where you can give the model a high level set of instructions and have it goal seek an outcome. So you say, “I need to schedule this appointment”, and it turns out there may be 19 different conditions that occur when you’re trying to schedule an appointment and as part of that, you can’t a priori tell the model every single possible condition deterministically. So you need to teach the model, “Okay, the user did not tell you what to do, but the goal was to schedule an appointment, so here is how you generate code to then create a collection of things that can interact with the model and understand what to do”. This is very interesting, you’re walking through this process, this makes a lot of sense. How do you have that conversation with DeepMind? You’re connecting the, “This is the workflow that is needing to happen, these are what we need the model to do, this is where it does well, where it doesn’t”, what’s the working relationship there? TK: We have a harness in which all these flows journeys, for example, as we see them with customers, we put them into the harness and they get into the reinforcement loop for Gemini. How tight is that process? TK: Very tight. We have people sitting next to [DeepMind CEO] Demis’ [Hassabis] team, in fact I just came from a meeting with them, that loop is what allows us — we are in a unique position in the market. We’re unique in three different ways, we’re unique because we have the whole stack of AI technology. In order to do agents well, you need to have a model that takes all these journeys and puts it into the harness that handles the improvement, as we call it, hill climbing, literally every hour of every day, and the complexity of the journeys we see are in some ways much more complicated because in companies, you have many different systems, different conditions, different flows, you may not see that in other domains, like in a pure consumer domain. 
In order to do these well, you also need, for example, models need to spin up compute, models need to now hold on to tokens for longer because they need to hold, for example, a KV cache that holds memory about what’s happening during the transaction flow. Having awesome infrastructure, both classical, what we call classical compute machines, and TPUs gives us real strength there. Third, as you walk through these, one of the things you find is a lot of the systems these models interact with are things like databases, enterprise applications. So understanding the context of these, like for example, “How much inventory do you have?”, defining “What is inventory?”, “What part are you talking about?”, “What part number are you talking about?”, those things require you to have technology that understands the business graph and the dictionary of all the objects and the sources of information in your company. Our strength in data processing gives us some technology that we’re going to be talking about next week around something we call Knowledge Catalog, think of it as your global dictionary for all information within the company, that’s a unique strength. And then obviously you don’t want information that’s critical to your company exposed on the Internet, you don’t want your model to get attacked because now it’s handling very complex process flows, you don’t want it hijacked, and so all the anxiety around cyber, we have very specific tools on, so our differentiation is all these pieces working together. That makes sense, the integration is a big part of your pitch. 
At the same time, you’re also a big, sprawling company and I think there’s maybe a perception, that I maybe hold, that some of the frontier labs are much more focused, they’re much more top-down about, “This is how our harness is going to work, the way it’s going to use tooling”, and all the things you’re talking about having this feedback flow back in sounds great unless there’s so many different takes on the way it should work and then you have your own internal customers as well. How do you balance having a point of view versus getting stuck in the muck? TK: Every product that Google has is on the same Gemini version, on the same day, on the same hour, every one of us is using the same harness. And you feel good that that harness is where it needs to be — it’s not getting pulled in 50 million directions thanks to all your customers and Google’s workloads? TK: Absolutely not, we are very focused on working with Demis and [DeepMind CTO] Koray [Kavukcuoglu] who lead our team to make sure they see the sophistication of these scenarios and we work literally side-by-side, hour-to-hour with them. There’s been a lot of speculation on are we distracted the company… I don’t think you’re distracted, I think it’s more just a matter of it’s a classic big company versus small company bit. Like a startup comes in and you have a very clear point of view and you don’t have all the enterprise stuff, you don’t have all this protecting the data, or permissions and all those structures, and yet that stuff sort of gets pulled along because there’s such demand to use your product that works really well and then over here it’s like, “Hey, we have everything protected and we have all these things around it”, but does the core product actually deliver? TK: The core product is being used by lots of people. The proof of that — we generate 16 billion tokens a minute, up from 10 just last December or January. Well, your financial results certainly showed that as well. 
There’s a bit where you’re doing so well, I have to be a little hard on you here. TK: A lot of people told us we were dead in 2023 — we’re still living. I think you’re doing more than living, you’re doing very well. TK: And so we never say anything negative about anybody else, our results prove for themselves. I always say, let our customers tell the story, they’re doing amazing things with Gemini in companies, enterprise, and they see the value of what we’re delivering for them. You mentioned that everyone in Google is on the same version of Gemini, using the same harness. Does that also apply to all this infrastructure around agents you’re doing, around sort of identity and security? TK: Yeah, in the enterprise, the way that all the infrastructure works is we have configurable mechanisms. Like for example, when you configure an agent, a very simple thing is you want to configure the agent with a different identity from a person, just a very simple example so that you can track, “Who did this transaction? Was it the human or the agent?”, because there’s issues like liability. You may want to revoke permissions for the agent at a certain point in time, you want to allow it to only do certain tasks and not everything that the human does so there are controls you want to put around an individual agent and a collection of things that’s separate from the person. As we bring agents to consumers as part of our Gemini app, very similar concepts want to be exposed, and so the architecture that we use allows us to have those things. The sources of that may be different. In the consumer world, they may use the Google login account, in the enterprise world, they may use a directory to store it, but that’s just an abstraction of our technology to the rest of the world. We’ve been talking a lot about Gemini agents and the whole Gemini platform, but you also have just the broader Google Cloud platform. 
One of your major tenants is a company I was just sort of referring obliquely to, which is Anthropic, they’re doing a lot of inference on TPUs in particular. If Anthropic wins deals at the expense of Gemini, is that still a win? TK: We sell different parts of our stack. One of the things people don’t realize is we monetize many different parts of the stack in different ways. Like Anthropic, there’s a lot of labs that use our stack — in fact, most of the large AI labs use our stack. So if somebody uses TPUs to either to train their model or to use it for inference, we’re monetizing that part of the stack, that gives us resources to then fund our R&D and other investments. Some of the labs use our TPU and our Gemini model, others may use our TPU and then buy our cybersecurity protection for their models. So as a platform player, we have to allow our technology to be monetized in as many ways as possible and we don’t see it as a zero sum. Sometimes, though, if you have the SaaS layer and the platform layer and the infrastructure, is there one that is the most important? On one hand, SaaS has the highest margins, it kind of decreases going down. On the other hand, that infrastructure needs to be used, you’re spending a lot of money on it, you want full utilization. How do you think about that in terms of what’s the most important? I know they’re all important, but how do you think about that tradeoff? TK: If we were making TPUs just for ourselves, we would have lower volume than we do as a general purpose TPU supplier, which means there would be times of day that we would not be using those TPUs. Do you follow me? Like if you think how chat systems work, they’re very diurnal in nature, because you ask questions when you’re awake and we have a great search business and we have a great Gemini app business, but there would be a certain diurnalty to it during the daytime, there’d be a lot of questions, what about in the evening? 
Because we sell TPUs in the market, we’re able to offer it at spot to the rest of the world because we have such a large business. We’re able to also get manufacturing, better terms with suppliers and other things because of a real volume player, and that in turn lowers our cost of goods sold. So there are many more dynamics. The company is very focused on ensuring we win every part of this, not just one part of it. Gemini is obviously a super important initiative for us, and you’ll see the big announcements are around— For sure, it’s almost all Gemini. TK: But I wouldn’t assume that if we do that, the only way to do that is to offer our chips along with our model. We see a strong business offering our chips to many other people and you’ll see all of this is what’s accelerating our differentiation, and you see it in our financial results. Your financials are incredible, your revenues up, margins are up hugely, I’ve been posting that chart of them for a long time, last quarter was amazing . I do have to ask about TPUs, though. You talk about selling our TPU chips, to date that has meant TPU instances on GCP, but now there’s talk about actually selling TPU chips, what’s the status of that? What’s the official word, can I go buy a TPU? TK: I’ll explain a little bit what we see. So let me talk briefly about what the announcements we’re making, what the product is being used for, and then how we bring some of it to market. TK: We’re introducing two big new TPUs next week. One is TPU 8t, which “t” stands for training, it’s more optimized for training, think of it as 9,600 TPU chips, a single pod, as we call it, it has three times better performance than the current generation, which is already the leading one in the market. Then there’s 8i, which is “i” for inference, it’s 1,152 chips, three times the SRAM, and it has a new thing called the Collectives Engine, which gives you super efficient calculation performance for inference. 
Now, along with that, we are introducing Nvidia VR200, we’re also introducing more ARM capability for classical compute, because people who use models increasingly need to spin up a VM in order to do tasks, and that VMs we see interest in. We’re introducing not just new compute families, but also new storage, there are two new storage offerings. There’s one, the fastest Lustre solution in the market, it’s 10 terabits per second, that’s just to give you a sense, it’s like five times number two. We’re also introducing a new thing for ultra low latency — when you do inference, you want super low latency in accessing storage, we call it Rapid Storage, it can give you 15 terabits per second with ultra low latency, like microsecond latency. So why are we introducing all this stuff? TPUs, definitely a big market is the AI labs, but we’re seeing interest from new segments of the market. So a big new segment is financial services and when I say financial services, capital markets, and the reason is that today, if you’re a trading firm, a capital markets firm, you spend a lot of time running algorithmic trading and algorithmic trading is running numerical algorithms on traditional Intel type cores, x86 cores. Now what they find is that models can do inferencing and the inference performance is actually better than traditional numerical computing. So that’s one new segment, the second segment is high performance compute. We see a ton of people wanting to do energy modeling, computational fluid dynamics, solid state, there’s a whole bunch of parameters there too. What’s interesting about those is, you will see at our event, Citadel Securities for example, talk in the keynote about how they’re using TPU. Citadel, as you know, is a large capital markets firm. Department of Energy, they have a mission called Genesis , which is the new national lab mission on changing the energy infrastructure for the United States. 
There’s a big Brazilian utility, the largest in Brazil, Axia, all of them are examples of people who are part of just the keynote talking about how they use TPUs. When we look at that, there’s a couple of different things we see. Capital markets firms say, “Hey, if we’re going to replace our algorithmic trading solution, you have to bring TPU to where the venue is”. Right, because they care about the latency of going to a data center, that’s why they’re all in New Jersey. TK: Secondly, if you’re a national lab, you have so much data you’ve collected over the last X number of years with your experiments — saying you have to bring all that data to the cloud to reason on it doesn’t make sense, so you will see us putting TPU in other people’s venues, and when we do that, we’re introducing new ways of people also procuring it. When I say procuring it, you buy it as a system, you don’t have to buy it just as a cloud source. How does this new way of selling, which is almost like a third way, so you have in Google’s data centers, you have bringing TPUs to customers, but then you have a deal like last week where between Anthropic and Broadcom and Google, this is going in their data centers. There’s these sort of renegade data centers that have access to power, maybe they were doing Bitcoin or whatever it might be, there’s been a big push to get TPUs into those. Where does that fit into this? TK: I would not assume everything you read in the press is true. Well, the Anthropic announcement was definitely a big announcement. TK: Just to be honest with you, we have a flavor that runs in the cloud and a flavor that runs in third-party data centers. The technology, the machines are identical. My question here is, where is that coming from? Is that part of your TSMC allocation? Is that Broadcom’s? Because no one can get enough compute, so ultimately that goes all the way back to the root. TK: The chips are all part of our global — TPU is a Google chip, as you know. 
So it’s part of global allocation, Broadcom partner who manufactures the TPUs with us and so it’s just part of the overall business. The new thing we’re talking about is just that you can run TPU in other venues. Makes sense. Will we ever have enough compute? Last year you said, “I think we’re going to resolve it shortly”, it doesn’t seem very resolved, what’s the status there? TK: We’ve worked super hard as an organization, our team that’s done our compute infrastructure, our global data centers, machines, all that, they’ve done an amazing job, there’s always a shortage, there’s never enough. But it doesn’t mean that we’re not — we would not be growing at the rate we are if we didn’t have enough compute. And so there’s more that we want, but there’s also the reality of our teams have done an amazing job, and our customers who are using it will tell you they’re seeing the benefits of the hard work our teams have done. There’s potential customers in the market, maybe current customers, who may be willing to pay basically any price for compute at this point. How do you think about the short term, “Wow we can actually just make a lot of money right now”, versus, “We need to invest in our products” — you had Microsoft, who I’m not going to ask you to comment on, but last quarter they’re like, “Yeah, we allocated less to Azure because we had our own internal workloads”. These are real trade-offs that you need to think about, how do you think about that in terms of GCP? TK: We run a balanced portfolio, we want to grow different parts of our business, we sit down as an executive team and also with Sundar and work through how we’re going to balance the different parts of our portfolio. We see, broad brush, three to four buckets of things. 
One bucket of things is where we want to grow Gemini as a business, our core Gemini business is doing super well, 16 billion tokens a minute, up 40% since last quarter, even this product called Gemini Enterprise, which is our core agent platform, has grown 40% sequentially quarter-over-quarter. So that part of the business, we’re committed to making it super successful, it’s a priority for us. Second segment of the business is where Gemini is being used inside of some of our core products, so I’ll give you an example. We’ve introduced Gemini inside our threat intelligence tools. Why is that? Because we have real expertise at Google scanning the dark web to identify threats, the problem is there’s so many of them, an average organization doesn’t know which of those many threats apply to them. So we use Gemini to process and prioritize which threats might affect you, it’s 98% accurate and has processed 3.9 million threats in the last year, so that’s an example of Gemini being used as an embedded capability. Right. The whole SaaS, PaaS, IaaS — the SaaS bit is still important. TK: There’s that capability, there’s people who want to use Gemini to reason on data in our analytics infrastructure so there’s a second big set where Gemini is an embedded capability and that in turn depends on chips and TPUs and GPUs. And the third one is offering our compute platform to people. We balance across those because we want all of them to be successful. By bringing hardware or our machines to other people’s venues, we’re broadening our TAM, total addressable market; in that part of the business we also see a different cash flow model than if you were putting CapEx, so there’s a lot of different parameters we have to balance. All those ones you listed for you to make trade-offs on, but then you also have to get in a meeting with Sundar and the other leaders of Google to make trade-offs with DeepMind and their R&D and with the consumer products. What are those meetings like? 
TK: We have a regular set of cadence of meetings and we balance the different priorities and we want to be successful on many different dimensions. I wouldn’t assume all of these dimensions are zero sum. Like, for example, when we offer our product in other venues, we drive cash flow in a different way than putting CapEx — so to some extent, that changes the boundary of how we offer our capital boundary as a company also. So I think there’s a general view of there’s a compute shortage, and if you give one, you will have to take from another, I think that’s an overly simplistic view of it, having been in this for long enough and having been, my team does both parts. We are responsible for delivering all the infrastructure for Alphabet, and they’ve done an amazing job doing that, and I’m also responsible for running the cloud business, and you can tell that our differentiation, I come back to this, it would be a different problem if you didn’t have demand. You can, and whenever I ask us to prove that you’ve got demand, I always say, “Look at our results”. Well that’s been the biggest change even since January where there was still some sort of latent skepticism about, “Is all this CapEx worth it?”, feels like those questions have been completely erased at this point. Speaking of markets in the last couple months, all these SaaS companies are getting killed in the market, you have a big SaaS business, you’re definitely not getting killed in the market, why are you escaping it? TK: I think we have transitioned. The core fundamentals is finding, and this is the way we approach our product portfolio, I’ll give you a very simple example — 2023, we said, “Hey, at 2022, we said, we’re not just going to build a secure cloud, we’re also going to start offering cybersecurity products”. When we entered the market and then we looked at what other things people — the value of cyber is driven by two dimensions. 
Dimension one, “What is it protecting?”, because it has to protect high value things, and the other element is, “How good is it at protecting?”, “What’s the technology that it’s going to use to protect?”. So we said, “There are only two valuable places to protect, there’s either the endpoint”, which is your desktop on which apps run, other people are doing a good job there, the rest of the world is moving all their applications and data to the cloud, let’s protect that. Second, we said AI is going to find vulnerabilities because at the end of the day, finding vulnerabilities is a question of a model really understanding code, and if you can find vulnerabilities at a much more accelerated rate, people need to fix vulnerabilities at an incredibly aggressive, fast rate, and so we started a set of work back then and we said to ensure that we have the leading product portfolio, let’s acquire Wiz. We’re now working on, you’ll see a number of announcements, there’s the Threat Intelligence Agent that allows us to, you know, understand the threat landscape and use Gemini to prioritize what you should pay attention to where a lot of people are using Gemini to actually scan their code, and then we’re introducing three new Gemini-powered agents with Wiz, one called Red Agent — think of it as continuous red-teaming of your infrastructure, a Blue Agent that says, “Okay, I looked at what’s happening with the Red team and I know what you need to go fix”, and a Green Agent that says, “I’ll fix it for you”, and that’s going to cut the cycle time. Like our Threat Intelligence Agent, you will see reference customers from Chicago Mercantile Exchange, there’s a whole bunch of them talking next week, about how it takes an investigation that just takes 30 minutes and does it in 30 seconds, that allows you to get response. 
Now, this is an example of when we started, people said, “Why would a hyperscaler want to become a cyber company?”, and we were like, “It’s not about being a hyperscaler, it’s about solving that problem at the intersection of — AI is going to accelerate cyber threats and you cannot do repair the old way”. Yep, it really answers the question that people had when you acquired Wiz, which is, “Why do you need to buy it, why can’t you just build it?”. It’s like, “Well, in two years, it’s going to be too late”. That’s, I think, also felt very tangibly right now. TK: Today, we are where we are because we made that bet. TK: So when people ask, “Why are you guys growing even in sectors that may be struggling?”, it’s because we have differentiation and we made those decisions early. That makes sense. One of the interesting product announcements this year is this cross-cloud lakehouse which lets customers leave their data in AWS and Azure while still being queryable by your services instantly. Is this the final admission that even if enterprises love your AI and love Gemini, they’re not going to shift all their workloads if they’re already on other clouds? Lots of your products have been about that in the past — even Wiz is about that to a certain extent — but is that just the reality? There’s not going to be a huge amount of spillover as far as pulling things from other clouds to Google. TK: If you use BigQuery today, you don’t have to move your transactional applications to BigQuery. If you’re using Gemini today, you can keep your applications in another cloud and use Gemini to reason on it. The problem we were trying to solve is a very specific problem. Today, when people talk about lakehouses, they say, “We have a multi-cloud lakehouse”. What they really mean is their lakehouse can be run on any cloud, but when it’s running on a particular cloud, you can only access the data in that cloud. 
And then people say, “That’s crazy, because I’ve got data in a SaaS app like Salesforce”, “I’ve got data in an ERP system”, “I’ve got data in Azure and Amazon, and I’d like to use analysis across all this”, one choice to customers is copy all that data out, that’s expensive for them because of the egress tax that everybody imposes. So we said, “Keep your data there, we can still give you world-class analysis”, and so it’s solving that custody. The customer has a problem, they want to do analysis, there are four things we’re giving them. Keep your data where it is, no matter how many clouds. We’re not talking about a single cloud lakehouse, we’re talking about across all the clouds and across all your SaaS apps, we can do analysis, one. Two, people said, “How fast can you run?”, the proof that we’re going to show is we’re 2x better in price performance than the market leader, right out of the gate. The third one, people said, “I’m not an expert on writing Python and Spark, can you give me essentially vibe coding for Python and Spark?” — yes, you’ll see us introduce an agent manager to generate Python and Spark code using Gemini. And then the last one people said today, Ben, if you ask a question, I was using that example of field service, I’m running a query on, “How much inventory do I have in parts?”, before I send the technician — that information sits inside an application in a set of tables in a database, most organizations have thousands of databases, teaching the model which system has what information, and the notion of part is split across 10 different tables in this particular database, you need a system that builds that semantic graph of all the information in your company. Right, this is the Knowledge Catalog. TK: That’s the catalog, and that gives you super good accuracy when you’re researching information. So we put all this together and back to, we’ve always been super pragmatic. 
I always say enterprises have certain problems that they see independent of a cloud. For example, security — they don’t want to buy three different security tools from three different hyperscalers. Analytics — they don’t want to buy three different analytic tools from three different hyperscalers. Others have chosen to say, “My stuff only works with my cloud”, that’s why enterprises often choose us, because we work across all the clouds and all the security environments you have and you can keep stuff wherever you are and use Gemini to access and automate stuff for you, so all that is just part of listening to customers. This all makes perfect sense, particularly this bit about the Knowledge Catalog definitely fits how I’ve been thinking. I wrote about this a few years ago about this importance of this whole layer and understanding it, it’s a bit of a big lift to get this in place. You have some sort of analog, say, with like a Palantir that’s putting in like their ontology thing . They have FDEs out on the site, multi-month projects doing this. You have OpenAI talking about Frontier , their agent layer, and they’re partnering with all the tech consultancies to build this out. Is this going to entail a lot of boots on the ground to get this graph working and functional in a way that your agents can operate effectively across it? TK: We’re not competing with Palantir, we’re not building a semantic dictionary or an ontology. What we’re doing is, today I’ll give you the closest analogy. TK: Today when you use a model, let’s say you use Gemini, and you ask a question, Gemini goes through reasoning, and then it shows you a citation. 
A citation is, “How did I answer the question and what’s the source I derived from?” Now imagine that citation was a query that needed to go to a folder in, for example, a storage system because there’s some documents there and a database because, for example, in a part number, just think about there’s a part number document that lists all the part numbers and sits in a drive and then that part number you need to fetch out to say it’s the modem that the guy is coming to repair, and that’s mapped to a table in a database. So what the graph does, we use Gemini, so we don’t need humans, we use Gemini to say, “Hey, go and read all these documents in these drives and extract the information from it and then match that to the database table that has the reference to the part number”, and so then when Gemini turns around and says, “I got this query about how much inventory of modems they are”, the first thing it does is it says, “Okay, go to the Knowledge Catalog and it says modem is part number one, two, three, four, five”, and then it says, “By the way the table in the database that has the inventory information about this part number is this table, here’s a SQL”, it then makes the quality of what we generate higher and then when it answers the question it shows back — back to your, “Trust my data”, it shows a grounding citation saying, “That’s where we got it from.” What do you need from everyone in the ecosystem if this is going to work, all these SaaS applications and across all these entities, not just what’s in your databases, but what’s in a SAP database or whatever it might be. How do you get them on board so you can understand their data and build this Knowledge Catalog? 
TK: Really easy, the first thing is to use the lakehouse we support a standard format, industry is very standardized on it, it’s called Iceberg , so anybody who supports Iceberg we can talk to it and so that’s pretty much the whole world right now, so we don’t need them to do anything special to make it work. Second, all of these business systems have API specifications, and our Catalog can learn off of those API specifications, we just teach Gemini to process those, and so we can build a catalog pretty quickly. There are reports that OpenAI on Amazon Bedrock has been massively popular. Are we going to get OpenAI on Vertex? TK: We would love to have them. We are announcing a variety of third-party models on Vertex, including Anthropic, including open source, we’re open to any model provider on Vertex. I believe you. That’s going to be great, when and if it happens. Just one last question. We’ve talked in this interview series previously about how I think, and this is before your time, it’s not your fault, that Google Cloud missed the boat in terms of being a point of integration for the Silicon Valley enterprise ecosystem. I think last year I asked you if AI represented a new opportunity to do that. However, is there a bit where the models, and you’re in this game because you have one of the leading models, is just going to eat everything and is going to gradually expand to do the jobs and everyone else is just going to be a system of record? It’s going to be all one interface, that the integration, such that it is, is all under the surface, it’s not necessarily tying things together in user space. Is Gemini going to be all the user needs in the long run? TK: We don’t see it that way. 
In fact, one announcement you’ll see us make next week is how many third-party SaaS and ISV [independent software vendors] vendors are embedding Gemini not just as a model, but as an agent platform, because they want to build agents and our agent platform, you can use to build agents, not just our own agents, but they can use it and there’s a lot of independent software vendors embedding those agents. And do they see you as like, “Hey, you’re another established guy, let’s go with you because we don’t know what these other folks are up to, they want to eat all of us”? TK: It’s also the capabilities. The differentiation, I would say, is just think about you’re a bank or an insurance company, and think about you’re a SaaS vendor selling to them or an independent software vendor, there’s a number of things around identity, policy management. For example, if you’re a bank and you have documentation about a person and their credit, you cannot have that egress the bank’s boundary, so we have a gateway that protects against that, that’s part of our agent platform. You want to have auditability on the agent to say which agent did what task on what system when, that’s built into the platform. You want to have a registry where you expose all your skills so that people are not duplicate building all these things, we have a registry that does that. This is sort of the bit we started with at the beginning, it’s not just going to benefit your agents it’s going to benefit all agents, that’s sort of the pitch. TK: So one of the things that people like is the fact that we built all that plumbing for them, and so they don’t have to invest in it, they can focus on the value add that they have on their agent side. 
Additionally, for companies in this broader ecosystem, the cost of agents — and it becomes part of their bill of materials, if you will, the cost of goods sold — the fact that we have these super efficient chips that run inference with such efficiency eventually translates into cost efficiency for a third party that’s building on top of us. You can see that all of those benefits, we’re taking away all that complexity for these guys, so we definitely don’t see that all the ecosystem is going to die, we definitely don’t see that, we see us facilitating that ecosystem. You’ll see us announcing a number of things, including a substantial investment in dollars to accelerate the partner ecosystem around our platform. Thomas Kurian, great to talk to you again. TK: Thanks so much, Ben. And just in closing, the work that we announce every year at Next is a testament to all those customers and partners who gave us a shot to work with them. You’ll see them telling their story, and it’s a testament to all those people at our organization that made a bet to solve a technical problem a different way, or to bring our technology — we’ve hugely expanded our go-to-market organization, and doing all that with growing top line and operating income at the same time is a testament to the demand we see for our products and services. I mean, six, seven years ago, people used to tell us, “You have no shot in the market”, I think we are now truly uniquely positioned. Name one other player that has the stack of technology to do AI, when I look forward, I think there’s no question in people’s minds that the central problem that companies need to solve and technology providers need to solve is how good is the capability you offer for AI. We’re the only ones with chips, models, the context to feed the models from all of the data infrastructure, the cyber tools, and then a world-class agent platform. I would also add, you’re actually an enterprise company now. 
The things you talked about, pragmatism, listening to customers, all these pieces, GCP did not have at all a decade ago — there’s a bit where Wiz was ahead of its time, for sure, being forward-looking, but there’s a bit where the organization is ready for this moment in a way I don’t think it would have been previously. I find it very impressive. TK: We are very proud of the team. Also for Alphabet, to do AI well, you have to do a couple of things. One, see the breadth of problems that we see, we see all of the consumer problems, we see the enterprise problems, we see the problems that search sees, we see the problems that YouTube needs, we see all those that we’re solving with AI, that gives us a breadth of capability that the model needs to solve, that over time is a real strength because the diversity of problems we’re solving. Second, in order to do AI well, you have to invest, and in order to invest, you need to monetize in as many different ways as possible. I think we are very confident that our team, we do not have any hubris, but we are confident in where we stand. I think it’s very impressive. I look forward to your keynote. TK: Thanks so much Ben, it’s a privilege to talk to you every year and it’s great that you took the time to speak with me. And it’s all recorded, I can promise you that!

Ahmad Alfy 3 weeks ago

Stop Hardcoding Your Timeouts

A developer rant about tools built for one kind of internet

Recently, I’ve been losing my mind to hardcoded timeouts. Silent, arbitrary, unconfigurable time limits baked into tools by developers who apparently have never had to wait more than 200ms for anything in their lives. Let me tell you about my week. Now that coding agents are everywhere, everyone is using skills. The popular way to add them is through packages developed by vercel-labs, and the go-to collection is awesome-copilot, a curated set of skills sitting at 30K+ stars at the time of writing. Except I can’t use it. The repository is too big, and the installer just chokes and dies. There has been an open issue about this since February (#278 on the vercel-labs/skills repo) and no one has responded. I’d be happy to send a PR and fix it myself. I just need someone to acknowledge it exists. Is there a configuration option? A flag? An environment variable? No, there is nothing. The workaround I found? Clone the repo manually first, then install from the local copy. It works, mostly. Except now points to a path on my machine. My colleagues cannot use it. I also have to update my copy every time I want to update my skills. One workaround creates a lot of other problems. Then came Docker Gordon, the AI-powered debugging assistant baked into Docker. Useful concept. I was stepping through a container build issue, the kind that requires iteration: tweak, rebuild, inspect, repeat. I had never used Gordon, but when the error manifested itself, it came with a suggestion to try Gordon and so I did. Except Gordon has a hard limit: if your container doesn’t finish building within two minutes, it gives up. The session dies. You start over. A two-minute build might sound like plenty if you’re in a fast environment with warm caches and pulled base images. But if you’re pulling a fresh base image over a slower connection? Debugging a multi-stage build with several heavy layers? Forget it. Gordon has already moved on. 
There is no way to configure this. No env var. No flag. Nothing. The tool just assumes that two minutes is forever, and if you need more, that’s your problem. Developers often work on fast machines, in offices or homes with gigabit connections, in cities with world-class infrastructure. They build tools with timeout defaults that reflect their own experience. And then they ship those tools to the whole world, with no knobs to turn.

The thing is, timeouts need to exist. Infinite waits are bad. Hanging processes are bad. I’m not arguing against timeouts. I’m arguing against unconfigurable timeouts. Against the implicit message that says: if you can’t do this in 60 seconds, your environment is wrong, not my assumption. A timeout should be:

- A safe default for the common case
- Clearly documented so users know it exists
- Overridable via a flag, an environment variable, a config file, something

This isn’t hard. It’s respect for your users.

I’m writing this from Cairo. My internet is decent, better than many places in the world. But it’s not 1 Gbps symmetric fiber. It’s not co-located next to an npm registry mirror. A of a large repo takes time. Pulling a Docker image takes time. These are not failures. They are physics. When your tool dies silently after 60 seconds without any way to change that limit, you haven’t built a tool for the world. You’ve built a tool for your office.

And this matters more than most developers acknowledge. The global developer community isn’t located in San Francisco or Amsterdam or London. It’s in Lagos, in Karachi, in Cairo. It’s people on 4G connections, on shared broadband, on connections that have real latency because the nearest CDN edge is 50ms away instead of 5. When you assume a fast connection, you’re not making a neutral technical decision. You’re making a statement about whose experience matters.

I don’t think anyone is doing this maliciously. I think it’s a blind spot. Your internet is fast, so a 60-second timeout feels generous. Your machines are powerful, so a 2-minute build window seems like plenty. But please: before you ship a timeout, ask yourself:

- What if the user is on a slower connection?
- What if their repo is larger than mine?
- What if they’re debugging something slow, and that’s the whole point?

And then add a config option. One environment variable. One flag. That’s all it takes to go from “this tool doesn’t work for me” to “this tool works for me.” As Bruce Lawson once said: it’s the World Wide Web, not the Wealthy Western Web. The web and the tools we build on top of it are for everyone. Let’s start acting like it.

Stratechery 3 weeks ago

Tim Cook’s Impeccable Timing

It’s the nature of business that the eulogy for a chief executive doesn’t happen when they die, but when they retire, or, in the case of Apple CEO Tim Cook, announce that they will step up to the role of Executive Chairman on September 1. The one morbid exception is when a CEO dies on the job — or quits because they are dying — and the truth of the matter is that that is where any honest recounting of Cook’s incredibly successful tenure as Apple CEO, particularly from a financial perspective, has to begin. The numbers, to be clear, are extraordinary. Cook became CEO of Apple on August 24, 2011, and in the intervening 15 years revenue has increased 303%, profit 354%, and the value of Apple has gone from $297 billion to $4 trillion, a staggering 1,251% increase. The reason for Cook’s accession in 2011 became clear a mere six weeks later, when Steve Jobs passed away from cancer on October 5, 2011. Jobs’ death isn’t the reason Cook was chosen — Cook had already served as interim CEO while Jobs underwent treatment in 2009 — but I think the timing played a major role in making Cook arguably the greatest non-founder CEO of all time. Peter Thiel introduced the concept of Zero To One thusly: When we think about the future, we hope for a future of progress. That progress can take one of two forms. Horizontal or extensive progress means copying things that work — going from 1 to n. Horizontal progress is easy to imagine because we already know what it looks like. Vertical or intensive progress means doing new things — going from 0 to 1. Vertical progress is harder to imagine because it requires doing something nobody else has ever done. If you take one typewriter and build 100, you have made horizontal progress. If you have a typewriter and build a word processor, you have made vertical progress. 
Steve Jobs made 0 to 1 products, as he reminded the audience in the introduction to his most famous keynote : Every once in a while, a revolutionary product comes along that changes everything. First of all, one’s very fortunate if one gets to work on one of these in your career. Apple’s been very fortunate: it’s been able to introduce a few of these into the world. In 1984, we introduced the Macintosh. It didn’t just change Apple, it changed the whole computer industry. In 2001, we introduced the first iPod. It didn’t just change the way we all listen to music, it changed the entire music industry. Well, today we’re introducing three revolutionary products of this class. The first one: a widescreen iPod with touch controls. The second: a revolutionary mobile phone. And the third is a breakthrough Internet communications device. Three things…are you getting it? These are not three separate devices. This is one device, and we are calling it iPhone. Steve Jobs would, three years later, also introduce the iPad, which makes four distinct product categories if you’re counting. Perhaps the most important 0 to 1 product Jobs created, however, was Apple itself, which raises the question: what makes Apple Apple? “What Makes Apple Apple” isn’t a new question; it was the central question of Apple University, the internal training program the company launched in 2008. Apple University was hailed on the outside as a Steve Jobs creation, but while I’m sure he green lit the concept, it was clear to me as an intern on the Apple University team in 2010, that the program’s driving force was Tim Cook. The core of the program, at least when I was there, was what became known as The Cook Doctrine : We believe that we’re on the face of the Earth to make great products, and that’s not changing. We’re constantly focusing on innovating. We believe in the simple, not the complex. 
We believe that we need to own and control the primary technologies behind the products we make, and participate only in markets where we can make a significant contribution. We believe in saying no to thousands of projects so that we can really focus on the few that are truly important and meaningful to us. We believe in deep collaboration and cross-pollination of our groups, which allow us to innovate in a way that others cannot. And frankly, we don’t settle for anything less than excellence in every group in the company, and we have the self-honesty to admit when we’re wrong and the courage to change. And I think, regardless of who is in what job, those values are so embedded in this company that Apple will do extremely well. Cook explained this on Apple’s January 2009 earnings call , during Jobs’ first leave of absence, in response to a question about how Apple would fare without its founder. It’s a brilliant statement, but it is — as the last paragraph makes clear — ultimately about maintaining, nurturing, and growing what Jobs built. That is why I started this Article by highlighting the timing of Cook’s ascent to the CEO role. The challenge for CEOs following iconic founders is that the person who took the company from 0 to 1 usually sticks around for 2, 3, 4, etc.; by the time they step down the only way forward is often down. Jobs, however, by virtue of leaving the world too soon, left Apple only a few years after its most important 0 to 1 product ever, meaning it was Cook who was in charge of growing and expanding Apple’s most revolutionary device yet. Cook, to be clear, managed this brilliantly. Under his watch the iPhone not only got better every year, but expanded its market to every carrier in basically every country, and expanded the line from one model in two colors to five models in a plethora of colors sold at the scale of hundreds of millions of units a year. Cook was, without question, an operational genius. 
Moreover, this was clearly the case even before he scaled the iPhone to unimaginable scale. When Cook joined Apple in 1998 the company’s operations — centered on Apple’s own factories and warehouses — were a massive drag on the company; Cook methodically shut them down and shifted Apple’s manufacturing base to China, creating a just-in-time supply chain that year-after-year coordinated a worldwide network of suppliers to deliver Apple’s ever-expanding product line to customers’ doorsteps and a fleet of beautiful and brand-expanding stores. There was not, under Cook’s leadership, a single significant product issue or recall. Cook also oversaw the introduction of major new products, most notably AirPods and Apple Watch; the “Wearables, Home, and Accessories” category delivered $35.4 billion in revenue last year, which would rank 128 on the Fortune 500. Still, both products are derivative of the iPhone; Cook’s signature 0 to 1 product, the Apple Vision Pro, is more of a 0.5. Cook’s more momentous contribution to Apple’s top line was the elevation of Services. The Google search deal actually originated in 2002 with an agreement to make Google the default search service for Safari on the Mac, and was extended to the iPhone in 2007; Google’s motivation was to ensure that Apple never competed for their core business , and Cook was happy to take an ever increasing amount of pure profit. The App Store also predated Cook; Steve Jobs said during the App Store’s introduction that “we keep 30 [percent] to pay for running the App Store”, and called it “the best deal going to distribute applications to mobile platforms”. It’s important to note that, in 2008, this was true! The App Store really was a great deal. Three years later, in a July 28, 2011 email — less than a month before Cook officially became CEO — Phil Schiller wondered if Apple should lower its take once they were making $1 billion a year in profit from the App Store. 
John Gruber, writing on Daring Fireball in 2021 , wondered what might have been had Cook followed Schiller’s advice: In my imagination, a world where Apple had used Phil Schiller’s memo above as a game plan for the App Store over the last decade is a better place for everyone today: developers for sure, but also users, and, yes, Apple itself. I’ve often said that Apple’s priorities are consistent: Apple’s own needs first, users’ second, developers’ third. Apple, for obvious reasons, does not like to talk about the Apple-first part of those priorities, but Cook made explicit during his testimony during the Epic trial that when user and developer needs conflict, Apple sides with users. (Hence App Tracking Transparency, for example.) These priorities are as they should be. I’m not complaining about their order. But putting developer needs third doesn’t mean they should be neglected or overlooked. A large base of developers who are experts on developing and designing for Apple’s proprietary platforms is an incredible asset. Making those developers happy — happy enough to keep them wanting to work and focus on Apple’s platforms — is good for Apple itself. I want to agree with Gruber — I was criticizing Apple’s App Store policies within weeks of starting Stratechery , years before it became a major issue — but from a shareholder perspective, i.e. Cook’s ultimate bosses, it’s hard to argue with Apple’s uncompromising approach. Last year Apple Services generated 26% of Apple’s revenue and 41% of the company’s profit; more importantly, Services continues to grow year-over-year, even as iPhone growth has slowed from the go-go years. Another way to frame the Services question is to say that Gruber is concerned about the long-term importance of something that is somewhat ineffable — developer willingness and desire to support Apple’s platforms — which is, at least in Gruber’s mind, essential for Apple’s long-term health. 
Cook, in this critique, prioritized Apple’s financial results and shareholder returns over what was best for Apple in the long run. This isn’t the only part of Apple’s business where this critique has validity. Cook’s greatest triumph was, as I noted above, completely overhauling and subsequently scaling Apple’s operations, which first and foremost meant developing a heavy dependence on China. This dependence was not inevitable: Patrick McGee explained in Apple In China , which I consider one of the all-time great books about the tech industry, how Apple made China into the manufacturing behemoth it became. McGee added in a Stratechery Interview : Let me just refer back to something that you wrote I think a few months ago when you called the last 20, 25 years, like the golden age for companies like Apple and Silicon Valley focused on software and Chinese taking care of the hardware manufacturing. That is a perfect partnership, and if we were living in a simulation and it ended tomorrow, you’d give props for Apple to taking advantage of the situation better than anybody else. The problem is we’re probably not living in the simulation and things go on, and I’ve got this rather disquieting conclusion where, look, Apple’s still really good probably, they’re not as good as they once were under Jony Ive, but they’re still good at industrial design and product design, but they don’t do any operations in our own country. That’s all dependent on China. You’ve called this in fact the biggest violation of the Tim Cook doctrine to own and control your destiny, but the Chinese aren’t just doing the operations anymore, they also have industrial design, product design, manufacturing design. 
It really is ironic: Tim Cook built what is arguably Apple’s most important technology — its ability to build the world’s best personal computer products at astronomical scale — and did so in a way that leaves Apple more vulnerable than anyone to the deteriorating relationship between the United States and China. China was certainly good for the bottom line, but was it good for Apple’s long-run sustainability? This same critique — of favoring a financially optimal strategy over long-term sustainability — may also one day be levied on the biggest question Cook leaves his successor: what impact will AI have on Apple? Apple has, to date, avoided spending hundreds of billions of dollars on the AI buildout, and there is one potential future where the company profits from AI by selling the devices everyone uses to access commoditized models; there is another future where AI becomes the means by which Apple’s 50 Years of Integration is finally disrupted by companies that actually invested in the technology of the future. If Tim Cook’s timing was fortunate in terms of when in Apple’s lifecycle he took the reins, then I would call his timing in terms of when in Apple’s lifecycle he is stepping down as being prudent, both for his legacy and for Apple’s future. Apple is, in terms of its traditional business model, in a better place than it has ever been. The iPhone line is fantastic, and selling at a record pace; the Mac, meanwhile, is poised to massively expand its market share as Apple Silicon — another Jobs initiative, appropriately invested in and nurtured by Cook — makes the Mac the computer of choice for both the high end (thanks to Apple Silicon’s performance and unified memory architecture) and the low end (the iPhone chip-based MacBook Neo significantly expands Apple’s addressable market). Meanwhile, the Services business continues to grow. Cook is stepping down after Apple’s best-ever quarter, a milestone that very much captures his tenure, for better and for worse. 
At the same time, the AI question looms — and it suggests that Something Is Rotten in the State of Cupertino . The new Siri still hasn’t launched, and when it does, it will be with Google’s technology at the core. That was, as I wrote in an Update , a momentous decision for Apple’s future: Apple’s plans are a bit like the alcoholic who admits that they have a drinking problem, but promises to limit their intake to social occasions. Namely, how exactly does Apple plan on replacing Gemini with its own models when (1) Google has more talent, (2) Google spends far more on infrastructure, and (3) Gemini will be continually increasing from the current level, where it is far ahead of Apple’s efforts? Moreover, there is now a new factor working against Apple: if this white-labeling effort works, then the bar for “good enough” will be much higher than it is currently. Will Apple, after all of the trouble they are going through to fix Siri, actually be willing to tear out a model that works so that they can once again roll their own solution, particularly when that solution hasn’t faced the market pressure of actually working, while Gemini has? In short, I think Apple has made a good decision here for short term reasons, but I don’t think it’s a short-term decision: I strongly suspect that Apple, whether it has admitted it to itself or not, has just committed itself to depending on 3rd-parties for AI for the long run. As I noted above and in that Update, this decision may work out; if it doesn’t, however, the sting will be felt long after Cook is gone. To that end, I certainly hope that John Ternus, the new CEO, was heavily involved in the decision; truthfully, he should have made it. To that end, it’s right that Cook is stepping down now. Jobs might have been responsible for taking Apple from 0 to 1, but it was Cook that took Apple from 1 to $436 billion in revenue and $118 billion in profit last year. 
It’s a testament to his capabilities and execution that Apple didn’t suffer any sort of post-founder hangover; only time will tell if, along the way, Cook created the conditions for a crash out, by virtue of he himself forgetting The Cook Doctrine and what makes Apple Apple.

Martin Alderson 3 weeks ago

Figma's woes compound with Claude Design

I think Figma is increasingly becoming a go-to case study in the victims of the so-called "SaaSpocalypse". And Claude Design's launch last week just adds a whole new dimension of pain. Firstly, I should say that I love(d?) the Figma product. It's hard to understand now what a big deal Figma's initial product was when it launched in the mid 2010s. The initial product ushered in a whole new category of SaaS - using the nascent WebGL and asm.js technologies to allow designers to design entirely in browser. It used to be the running joke that an app like Photoshop would never run in the browser, but Figma proved it wrong. It quickly overtook Sketch as the de facto design tool in the market, first for UI/UX wireframing and prototyping, but increasingly for everything graphic design. As it was based in the browser, it was a revelation from the developer side to be able to open UI/UX files if you weren't on a Mac (Sketch is Mac only). It was also brilliant to be able to leave comments on the design and collaborate with the designer(s) to iterate on designs really quickly. The collaborative features (without requiring anyone to download any software) quickly meant it got adoption outside of pure design roles - PMs and executives could finally collaborate in real time on the product they were building, without having to (at best) send back revisions and notes from badly screenshotted files that tended to be out of date by the time they were received. I'll skip over the rest of the history, including a no doubt distracting takeover attempt by Adobe that was later blocked on competition grounds. But (of course) LLMs happened and suddenly one of the most forward-looking SaaS companies became very vulnerable to disruption itself. One completely unexpected development that others and I noticed (and which I wrote up a few months ago in How to make great looking reports with Claude Code) was that LLMs started to get fairly "good" at design. 
By good I do not mean as good as a talented designer, clearly it's nowhere near that - currently. But like many things, not everything requires a great designer. Even if you use a great design team to build out your core product experience (and many do not ), there's an awful lot of design 'resource' required for auxiliary parts of the product, reports, proposals etc. It's not stuff that tends to get designers excited but can sap an awful lot of time going back and forth on a pitch deck. And this is exactly why I think Figma is almost uniquely vulnerable. The way it managed to expand into organisations by getting uptake with non-designers becomes a liability if those non-designers can get an AI agent to do the design for them. Looking at Figma's S1 (which is somewhat out of date by now, but is the only reported breakdown I can find) corroborates this potential weakness. Only 33% of Figma's userbase in Q1 2025 was designers, with developers making up 30% and other non-design roles making up 37%. A lot of Figma's continued expansion depended on this part of their userbase. A lot of their recent product development has been to enable further expansion in organisations - "Dev Mode" for developers (which now looks incredibly quaint against LLMs), Slides (to compete against PowerPoint and other presentation tools) and Sites (a WebFlow-esque site builder) all are about expanding their TAM out of "pure" design. The real surprise for me though was how basic their "flagship" AI design product Figma Make is. It really does feel like something that someone put together in an internal AI hackathon one weekend and it never progressed beyond that. Given how much Figma managed to push the envelope on web technology I found this surprising - perhaps they were caught off guard with how quickly LLMs' design prowess improved, or there were internal disagreements about the role AI should or will play in design. Regardless, it's an incredibly underwhelming product as it stands. 
If things weren't bad enough, Anthropic themselves launched Claude Design which is a pretty direct competitor to Figma in many ways. While it's nowhere near functional and polished enough to replace Figma's core design product, I expect it will get significant traction outside of that. The ability for it to grab a design system from your existing assets in one click is very powerful - and allows you to then pull together prototypes, presentations or reports in your corporate design style that look and feel far better than anything a non-designer could do themselves. And I thought it was extremely telling that unlike a lot of the other Anthropic product launches that have touched design - Figma did not provide a testimonial on it (understandably). Canva did , which I found extremely odd (they are in my eyes even more vulnerable to this product than Figma). I think this really underlines two major weaknesses in many SaaS companies' AI strategies: Firstly, it's very difficult to compete on AI against the company that is providing your AI inference. A quick check on Figma Make suggests that Figma (at least on my account) is indeed using Sonnet 4.5 for its inference - though I have seen it use Gemini in the past: At this point Figma is effectively funding a competitor - and the more AI usage Figma has - the more money they send over to Anthropic for the tokens they use. Even worse, Sonnet 4.5 is miles behind what Anthropic uses on Claude Design (Opus 4.7, which has vastly improved vision capabilities [1] ), so the results a user gets on Make vs Claude Design are almost certainly going to underwhelm. Also, unlike most/all SaaS costs, inference (especially with these frontier models) is expensive . As Cursor found out, the frontier labs can charge a lot less to end users than API customers like Figma. When you are potentially looking at a shrinking userbase, it's far from ideal to have very expensive variable costs that start pulling your profitability down. 
Secondly, it really underlines to me how incredibly efficient headcount-wise companies can build products now. Figma has close to 2,000 employees - not all working on product engineering of course. I really doubt Anthropic even needed 10 to build Claude Design. Indeed the entirety of Anthropic is around 2,500 people. It's also worth noting that a lot of the things that would traditionally lock a company like Figma in stop working as well in an agent-first world. Multiplayer matters less when your collaborator is an agent iterating on a prompt. Plugin ecosystems matter less when you can just ask for the functionality directly. Design system tooling is the whole point of Claude Design. Enterprise SSO - Claude already has that. Most of the moats that protect a mature SaaS company are moats against other SaaS companies, not against the thing providing their inference. I might be wrong about how bad this gets for Figma specifically. Companies with strong brands, great distribution and genuinely talented teams can often adapt faster than outsiders expect, and I'd rather be long Figma than most of its competitors. But the structural point is harder to wriggle out of. Figma has ~2,000 employees. Anthropic has ~2,500 total and I doubt Claude Design took more than a handful to build. Figma now needs to out-execute a competitor whose inference is ~free to them, whose marginal cost to ship is roughly zero, and who employs fewer people on the competing product than Figma has on a single pod. That's a very hard position to pivot out of. This feels like a preview of where SaaS economics are heading. The companies that built big orgs on the assumption of steady seat expansion are going to find themselves competing with products built by tiny teams inside the frontier labs. Figma just happens to be the first big public name where one of their primary inference suppliers has started competing against them. 
Both GPT 5.4 and Opus 4.7 can now "see" screenshots at much higher resolution - Opus 4.7 jumped from 1568px / 1.15MP to 2576px / 3.75MP. Resolution isn't the whole story (scaffolding and post-training matter a lot too) but it meaningfully helps with small-element detection and layout judgement. If you've ever pasted a screenshot of something broken and the model told you it looks great, the previous lack of resolution is one of the reasons why.

Stratechery 4 weeks ago

An Interview with F1 Driver and Venture Capitalist Nico Rosberg About the Drive to Win

Listen to this post: Good morning, This week’s Stratechery Interview is with F1 driver-turned-venture capitalist Nico Rosberg. Rosberg started his F1 career in 2005, and retired after winning the world championship in 2016; Rosberg spent his last four years as teammates on Mercedes with his childhood friend Lewis Hamilton in one of the most intense teammate rivalries in F1 history. Over the last several years, however, Rosberg has reinvented himself as a venture capitalist, founding Rosberg Ventures, with a specific focus on leveraging his F1 background to build connections between European money and Silicon Valley startups in one direction, and startup products and German businesses in the other. In this interview we cover all aspects of Rosberg’s journey, from having a steering wheel in his crib, to pioneering the use of sports psychology in F1, to his decision to retire on top of the world. Then, we discuss how F1 builds connections, the similarities between founders and drivers, and how he realized he could leverage that in a new competition: winning as an investor. What I found particularly interesting is how Rosberg’s background and history seem so varied and unconnected on the surface, yet are clearly linked by a consistent ethos of maximizing opportunity in the service of winning. As a reminder, all Stratechery content, including interviews, is available as a podcast; click the link at the top of this email to add Stratechery to your podcast player. On to the Interview: This interview is lightly edited for clarity. Nico Rosberg, welcome to Stratechery. Nico Rosberg: Thank you very much, Ben, it’s really an honor to be on the show. I always hear so much about your show, especially when I’m in the Bay Area. Well, I don’t normally interview venture capitalists on Stratechery, but you are no normal venture capitalist, which you use to your advantage. 
I want to ask you about that, but needless to say, that made this an easy exception to make, particularly since I’m a big Formula 1 fan. To that end, I always start my interviews talking about the subject’s background; we may spend a bit more time on yours if that’s okay with you, it’s pretty fascinating. NR: I understand. With pleasure. Okay, good. Well, you were born in 1985 in West Germany to a German mother and a Finnish father. Your father Keke was the 1982 Formula 1 world champion. Was there a steering wheel in your crib when you came home from the hospital? NR: There was actually, yes. (Laughing) Oh, that’s funny. NR: On my Facebook page you would see photos of me in a go-kart when I’m like three years old with a helmet on and everything, so yeah, it was an early discovery of that passion. I’m interested in that because obviously your father was tremendously successful. Was he immediately all in on, “You have to do what I did”, or was there ultimately a bit of humoring you, “You can come along and try this but I’m not sure you could ever measure up to what I did”? NR: There was a go-kart track near our house and he was going there with his friends even before I was born, and then when I was six, seven years old, we just gave it a go, I enjoyed it, and I looked pretty fast also. So then he was like, “Maybe this can become a father-son hobby”, it just went from there and then you start doing a race here, a race there, I started winning the races kind of immediately and so that hooked me even more; when you win, of course, it’s amazing, it’s an amazing motivation. So that’s how we just kind of got going and it became an amazing father-son hobby to share. We spent a lot of time with each other, we traveled in a motorhome to the races, so it was really lovely. There definitely is a bit to driving a car very fast. 
On one hand, of course, you started early, and you see the history of Formula 1 drivers, they start early, but you took to it right away. It’s definitely like father, like son in that regard. NR: Indeed. I think as in every sport — you also see it with golf or tennis — you have to start pretty early now; it just gives you a head start in practicing those skills. And I think, yeah, I guess I inherited some of those genes from my father because we need to be very good at hand-eye coordination, that’s super important. We also need to be very good at processing things very quickly because we have things coming at us at 220 miles an hour, our eyes are flickering left and right all the time, just taking in all the inputs that we’re seeing and also feeling, so I think that also probably has to be a strength of ours. There’s a lot of stuff in your background about your parents really pushing you in terms of academics, learning lots of languages, all that sort of thing. Was that unique to you? It always strikes me that Formula 1 drivers all come across as very intelligent. And to your point, there’s such a high degree of information processing happening; is that the norm, generally speaking? NR: I think you probably need to be a bit more street smart, at least, to be a successful F1 driver than maybe in some other sports, because we depend so much on this high technology car, and if we’re not able to understand the car, set it up properly, be at least street smart about all these things, then it doesn’t matter how talented you are, you’ll never be able to go fast. So probably I would say that in our sport, yeah, that comes a little bit more to the fore than maybe in other sports. But in my case, actually, it was the contrary of my parents pushing me at school: my mom and my dad would usually come in late at night and say, “OK, stop now”, because I was always very hard working at school. 
Somehow we had a group of friends, everybody wanted to achieve, and I wanted to achieve as well, and I had to catch up because I was missing half the week every other week because I was racing. So my parents were actually more telling me to stop now because I was trying to make too much of an effort to catch up. Interesting, because a bit I want to get to here is you’ve had such a widely varying career, even since you finished racing, you finished relatively young, and so that has been a theme for you all along; it’s like you were born with the steering wheel in your crib, but you’re interested in more than that. NR: Yeah, I really always enjoyed the academic side. In fact, if I wasn’t going to make it as a driver, I already had a place reserved for me in Imperial College in London to study aeronautics, that was my plan B of how to get into F1, which would have been as an aerodynamicist. Right, design the car instead of driving it. NR: I don’t know if I would have gotten there in the end, but I think I had a good shot, so my plan B was already set. You’re most famous for your rivalry with Lewis Hamilton, but as I understand it you actually met him quite young; you were teammates in karting as well? NR: It’s a pretty crazy story because the McLaren Formula One team wanted to set up a little go-kart team at the time, and the two rising star drivers at the time were Lewis Hamilton from Great Britain and myself down south, and so they actually funded our two go-karting seasons. And so it was just the two of us driving for the McLaren Mercedes go-karting team and we were winning all the races and championships. Unfortunately for me, more often than not, it was Lewis winning and I was second, but there we go. 
So it’s incredible because we were best friends at the time and we were 13 years old and we were on holiday together all the time and dreaming, “Imagine what it would be like in 15 years to be in the F1 team together, winning races and championships?”, and it was impossible to achieve that dream, just seemed so far away. And yet really 15 years later, we’re in the Mercedes F1 team as teammates fighting for races and championships, so it’s a pretty incredible story. I mean, why did it seem even that impossible, though? I mean, your dad was an F1 driver, you’ve been racing in karts. What makes F1 feel so far away? NR: Well, come on, you can imagine if you’re 13 year old and you’re playing in your regional tennis camp in the middle of nowhere that you look at the television and you see [Jannik] Sinner and [Carlos] Alcaraz fighting for the Monaco Masters that’s going to look like extremely impossible and far away. Right, but there wasn’t a bit of total self-belief that, “I’m going to be there, there’s no question”? NR: Well maybe Lewis is a little bit more like that, I’m more sensitive, more insecure, less self-belief, so I never actually really believed of myself that I could get there and be good enough, which has pros and cons to think like that, because it also is an incredibly strong motivator. When you don’t have that self-confidence, you just fight so hard to prepare to the best of your abilities all the time. So it has pros and cons, and it was nice to see that, of course, someone like me that did not believe until the very last corner, I was still able to actually win in the end, so that was reassuring. I’m curious about this mindset bit, because this has been an area that you’ve actually talked a lot about. In 2007, you stopped working with your father as closely as you were, went to work with a sports psychologist. At what point was it clear to you that this mental aspect is going to be super important to your success? 
NR: That became clear to me in my first year of F1 because it was mentally just an enormous struggle. We had a bad car, so we’re either breaking down or finishing well out of the points all the time and it was a really rough start to my career. And this is with Williams at the time? NR: Yeah, with Williams. At times it was almost as if like, “Oof, I might not get taken on for the second year”, because it was such a rough start. So mentally, it was incredibly hard because my dream is at stake, my dream is to be an F1 driver, to win races, so that was difficult. So I decided that, “I’m spending four hours a day on training my body, why am I not training my brain? There must be solutions out there to improve my mental state”. So I sought out help, and I found a psychologist/philosopher and this was incredible for my life, for my performance, I worked 10 years with him. In the winter, two hours every two days, so it was like an incredible effort, it was harder than the physical training was actually the mental training. It was a combination of learning to meditate, learning to visualize, to learning the power of repetition, and also learning to understand myself better. “Why am I scared?”, “Why am I anxious, jealous?”, because then you cannot switch those emotions off very easily or almost not at all. But when you understand why they’re there, you can really adapt your reaction and that has a snowball effect, because when you react in a much better and more appropriate way, it has an enormous snowball effect on your life so it’s these kind of learnings that really helped me so much. Was this pretty novel for an F1 driver to seek this out and do this sort of training at the time? NR: Yeah, it’s a bit like in the startup world. 
Founders are not really allowed to admit that they’re scared of failing or that they’re working with a brain doctor, as some liked to call it at the time in F1, so it was not something that I could really tell anybody about because it would look weak in a way, but actually it became my superpower to go through that process. And there’s a little bit more acceptance now, there’s been a couple of other drivers talking about it. I think even Lando Norris, the world champion last year, he sought help in the middle of last year as he was struggling mentally, clearly, and his championship was slipping away from him, and he went out and sought help and made enormous progress, and that’s what got him the world championship in the end so that was great to see. Lando’s always interesting because he seems to wear his insecurities on his sleeve, they just come through sort of so tangibly. Did you feel a lot of sympathy for his sort of struggles and working through that? NR: Yeah, totally. That’s the state of mind that I can very much relate with, and that’s what people love also because he’s very authentic, so that’s really appreciated. At the same time I wrote Lando a direct message on Instagram and he never replied, but at least I wanted to see if maybe he would read it, because I’ve been through what he’s been through, and one of the obvious things that I would change if I was Lando, and he did change it a little bit, is to not always talk about the glass half empty. Even when he was on pole position he almost only spoke about that one corner where he messed up rather than like, “Hey, that was almost the best lap of my life”. I mean, both are right. “Hey, that was almost the best lap of my entire life”, that would be correct or, “Ah, damn, I messed that last corner up so bad”, that would also be correct. You know? 
And he just says, “I messed that last corner up”, and, “I need to get my stuff together”, and that’s just unnecessary because it’s repetition, and it really ingrains itself in your mind; if you say, “I make mistakes always”, you’re really going to believe that you make mistakes always. So that’s something that he could quite easily just adapt, even if he keeps on thinking that, but don’t say it, and don’t say it out to the whole world, because that’s a whole tsunami that you’re setting off there repeatedly, which is not going to be beneficial to your performance. You’ve talked about talking to founders and not being able to show weaknesses. Have there been any examples in the time that you’ve been an investor talking to different companies, where you’ve identified someone and been like, “Look, you’re kind of a Lando Norris here” — maybe that’s not the words that you used — but, “Let me talk to you about your mindset and how you can shift that”, has that come in handy yet? NR: I really enjoy that because founders are really very similar to high performance athletes. They’re extremely competitive, their drive is unbelievable, they’re very courageous also, because you have to be so damn brave to bet the company over and over as you’re innovating and pivoting, so there’s great similarities, and that’s why I really enjoy speaking to founders. Just now in the Bay Area, that’s very often the topic that I speak to founders about and they enjoy that as well, to discuss that kind of topic mentally, how they approach that and everything, and so that’s really enjoyable. I think I can really add value, as well as learning for myself, by adding from my experience. 
The more founders that you talk to, is there a bit where — if you go back to F1, it’s very visible who’s the best, like it’s very measurable in a certain sense, but it’s interesting at F1 because sometimes you could have a great driver who doesn’t have a great car, and yet people will still say, “That person is excellent, they’re just limited by their circumstances”. Do you get a similar sense in being in tech, dealing with founders, and being able to separate the circumstances from the person and saying, “There’s something there even if the circumstances aren’t allowing it to show”? NR: That’s one very, very important ingredient for a successful founder, because actually it will be often many, many years until there’s any validation as to what he’s building or she’s building and the best founders have to be extremely resilient and not feel the need to bow to consensus thinking of people around them or of their board or whatever. They are the visionary and they have to believe with such high conviction in their idea, in what they’re building and see it through. Because if it was obvious, then everybody would be building it, and most of the time, they’re creating something that’s just not obvious to sometimes anybody except for themselves in the early stages, so that’s absolutely a very important trait. However, in combination with an extreme curiosity and desire to learn and remain open to new ideas and everything, so it’s a balance that has to be found. And again, that’s pretty rare to find both attributes within a founder, but usually that’s the case. Is that tension between the sort of insecurity and confidence and uncertainty and curiosity? Is that what you’re zoomed in on, what you’re looking for? NR: Yeah, totally. Because sometimes it’s like it opposes each other. Right, it’s a paradox. 
NR: Someone who’s very self-confident in their idea will be completely arrogant and just so sure that their way is the right way and that’s it, and then they will not be very curious, so that’s why you don’t find it in every person and it’s important. I think these two character traits are very, very important. Continuing with the background, you have a YouTube channel that has 1.46 million subscribers, you haven’t posted on it for a while, but there used to be a whole host of videos. But I went back, scrolled all the way to the bottom, and the original upload was in 2011. A lot of people didn’t know what YouTube was at that point or barely did, how did you find YouTube and why did you start posting videos? NR: As an athlete, there was an opportunity that suddenly came in those years, which was to connect closer with those out there that were supporting me. Were you the first one to really do that? NR: No, not the first, but I joined some of the early movers and it was amazing to see how you could directly connect with your fanbase, and there was also the belief that, of course, with time, Formula 1 is also about marketing and that can give you an edge over some other drivers. If you build a big following, a big brand for yourself, and you become highly relevant to brands for sponsorship, etc., then a team might choose you over someone who just drives fast. So there’s also that element that to be a successful F1 driver, usually it helps to really try and excel in every single domain that may be relevant and that domain plays a role, as well as working well with the media, because the media is so powerful and that’s a game you also need to try and nail. I’m curious about the sponsorship angle. F1 obviously has huge amounts of sponsorships, it’s an amazing sport where people will willingly wear gear with a bunch of sponsorships on it — I guess all racing is sort of like this. 
But right now, now that tech is huge and F1 is huge, there’s a lot of tech sponsorships of F1 and I’m just sort of curious: I’m in tech, but generally a lot of these companies are enterprise companies, a lot of B2B things, and this whole world of sponsorships and what goes on around that is somewhat foreign to me. I’m just a blogger here in Wisconsin, before that in Taiwan; what is in that game and how involved are the drivers? Is that a huge thing? You have to go out and actually help win these sponsorships too? Or you should show up to a bunch of events? I’m just curious, how does that world work? NR: So a few things here. First of all, because of Netflix, the sponsorship fees that the teams are now requesting are like 2-3x what they were just six, seven years ago. Is that just because it’s more popular or because their logos also show up on Netflix? NR: Because it’s so much more popular and because it’s now become relevant in the US. So the whole tech industry has become interested and you’ll see most companies are now also sponsoring. I mean, look at just the Mercedes team, of course, but look at the Audi team also. They have Revolut, so the bank that’s come out of the startup ecosystem, ElevenLabs, the global voice AI leader, all of these companies. In fact, because I’m so deeply connected now with Silicon Valley, I am more and more also kind of casually supporting some of these tech companies with sponsorships in F1. I’m just presenting one dev tools company, multi-billion dollar, with an opportunity to sponsor a team this week, I’m just sending that through. Because the sponsorship fees have increased so much, a team like Mercedes has $400 million in annual sponsorship revenue. $400 million! That’s so crazy. 
And then you add their share of TV revenues on top, so they get to beyond like $600 million in annual revenue, and because they inserted budget caps in F1, they don’t spend more than $300 million, even including driver salaries and everything. So they are so hugely profitable, these F1 teams, or especially the successful ones and that’s why the CrowdStrike founder now, George Kurtz, he just bought 5% of the Mercedes F1 team. And that stake, I mean, the Mercedes F1 team was valued at $6 billion, unbelievable. So he paid $300 million for a 5% share. Do you feel like you were 10 years too early? NR: I missed that train, because I think with a bit of effort probably at some point I could have had a nice little share in an F1 team somewhere, but I completely missed the train. It’s incredible how this sport has become really a business case now, and these F1 teams have become investable assets, which never used to be the case, so it’s quite phenomenal. So these sponsors, we drivers spend a lot of time with these companies then, they invite all of their customers, I do dinner with them then even during a race weekend or the next morning for breakfast. Monaco Grand Prix, I’m at the Hotel de Paris having breakfast with one of the sponsors, so the drivers do spend a lot of time with those sponsors. And apart from that, the sponsors want visibility because visibility for their logo is just an amazing credibility stamp, and also they want to bring and host people at the races, so that’s what it’s about and I think it works amazingly well. I was talking to Michael Cannon-Brookes, Atlassian is now sponsoring Williams, and this idea of you actually have 24, or this year 22, around the world, pre-planned, clear places to meet customers and bring them there. He’s like, “It makes scheduling very easy, it’s very straightforward”. 
NR: And for someone like Atlassian the customers are there anyways in the paddock, because the C-levels of all big companies are always there. To make deals in the paddock is incredible, an incredible opportunity and even I myself, so I do work for Mercedes F1 and they don’t actually pay me in Euros, they actually pay me most of the time with tickets for the F1 races, because I too, I love to host the VC community at the races, it’s such a great way to get to know people, build friendships and of course, yeah, it’s very important for me to really build relationships in this ecosystem. That’s super interesting. Speaking of Mercedes, when Mercedes rejoined F1, acquired Brawn , you were the first driver alongside Michael Schumacher, who was then replaced by Lewis Hamilton — two pretty impressive names to have as teammates to say the least. The rivalries between teammates is the stuff of lore in Formula 1 but is it actually underrated how intense that is? NR: So the norm in F1 is always that a team has a number one driver and a number two driver and that’s clearly kind of set in stone, and that’s the way you go racing. It’s very unusual that a team has two number one drivers, the most legendary such pairing was Ayrton Senna and Alain Prost at McLaren, and that ended in total disaster after only two years. They were crashing, then one guy quit, and it was just a total mess. It’s okay and not too bad as long as you’re racing for like fifth and sixth and seventh place — but as soon as you have the best car and you as teammates are fighting for every single race win, it just becomes so hard because you’re always going to push the boundaries and go into those gray areas because there’s a championship at stake and that’s your childhood dream and that’s what then happened between Lewis and I also. 
It kind of just spiraled from one going a little bit too far, then the other one paying back and then back again and then crashing and it just became very, very tense and difficult to manage. It was a very uncomfortable environment to be in because not only are you kind of enemies within the team, but also the whole team as such cannot really take a side anymore and they need to stay neutral, so they can’t really support you either anymore, so it’s a complicated dynamic. Well, you lasted longer than Prost and Senna, because I think you made it three years with Lewis Hamilton. Is that right? NR: Four, actually. We would have kept going, I had another contract for a few more years so it was kind of borderline manageable, but only after Toto Wolff made us sign a contract whereby it didn’t matter who was at fault, but if ever we crashed together, then we would have to split the bill, the repair bill, 50-50, and my most expensive one was $360,000 and after that, I made sure to leave extra space when Lewis was anywhere close. (laughing) That’s amazing. Why did you decide to retire? I mean, you finally win, you overcome Lewis, and then you’re done at 31. NR: I gave it a thousand percent, really, much more than any that I thought I could give. Total life commitment, insane intensity, the whole thing, mentally, physically and I achieved my dream, I achieved my dream in the best possible way, I beat the greatest of all time, I won that Formula 1 World Championship with Mercedes , the legendary car brand, it’s not possible for me to do better. I had a young family at home, a child at home, baby at home so it just felt like the right moment for the most beautiful exit possible for me that would carry me for the rest of my life. So it was a bit of a rational decision in that way and I just felt that was what I wanted to try and do. Of course, it was scary because when you make such a decision, you don’t really know how it’s going to go and how you’re going to feel. 
But now in hindsight, for me personally, it was really the best thing I could do and a great decision, which I’m very lucky to have been able to exit in that way. And a lot of founders listening, because I know you’re very popular with founders, also your podcast, they will be able to relate, it’s kind of the $10 billion or $50 billion dollar exit. NR: Once you put your life into it and you’ve created an enormous success and change people’s lives and then you go out on a high, I think that was my dream to do it that way. You made a lot of changes before that last year, too. But then there’s all these stories of that last year where you won the title, focusing on things like jet lag or like your nutrition and all those bits and pieces. Was that just like, “I have to figure something else to finally get over the hump”? NR: I tried to perfect every single possible marginal gain possible, that was really what I was about, and it went from working with a Professor of Sleep at Harvard , and who now has created a startup based on what we were working together at the time called Timeshifter , actually, which is a nice anecdote. And so there, for example, the secret was eliminating jet lag for the whole year because jet lag is a disaster. As an athlete, the difference between 99% focus and 100% focus is the difference between coming first and second, and jet lag just destroys you, and we’re traveling from continent to continent all the time. I managed to do a whole season with absolutely 0.0 jet lag, and it’s pretty simple. Of course, it takes a lot of discipline, but pretty simple. The secret was one-and-a-half hours maximum of time shift per day and then blackout glasses in the evening, two hours before needing to go to sleep and then also immediately upon waking up, 10,000 lux, like a light, you know, which you’re staring into, which you also see with Bryan Johnson , he does that and then, yeah, I mean, as long as I followed that, it was incredible. 
So I eliminated jet lag from my whole life for that year and every detail I worked on in that way, you know, really everything. So you see my helmet was black and it was bare carbon because I realized that the helmet was 80 grams and every gram counts in our sport so I took the paint off my helmet, just every single detail. I really tried to work on every single marginal gain possible. This sounds absolutely hellish with family and little kids at home. I can see why, once you accomplished it, you were done. NR: Yeah, of course. I mean with a little baby at home it required a great commitment also from my wife Vivian at the time, and great support, and she did that awesomely so I’m very, very grateful for that. Now you’re sitting here as an investor, but we’re a decade on from when you retired, what was the path to get to where you are now and to realize that, “This is what I want to do with the rest of my life”? NR: Seven years after retiring was first of all, just trying everything and nothing, trying to figure out what could be next in my life. And it’s hard because as an athlete, you are like CEO, you know, you’re top of the company, and you feel like being the king and then after your sports career you drop to zero. There’s nothing there and you cannot use your skill that you learned for something new, it’s just gone. And it’s very hard to accept that you really start from zero and you don’t even know if you’re going to have success in something new or not. So I tried a lot of things and now finally I’ve landed on what I really enjoy doing and it’s being fully into the venture capital ecosystem building my own VC firm, Rosberg Ventures, out of Europe, investing a lot in the USA or even primarily in the USA. So super exciting and yeah, and I hit the ground running and I’m able to win also pretty quickly, which is what is really motivating. What made you realize there was this opportunity? 
If you sort of zoom out, this idea that there’s money in Europe, there’s opportunity in the U.S., someone needs to connect those two things together. But was there a specific conversation or something that came along that’s like, “Oh, I could actually do this and be good at it”? NR: Well more than money in Europe it was money in my bank account which was just sitting there. That makes sense. NR: And I was like, “What am I going to do with that?”, because it’s really really hard to invest capital across generations in a smart way. It’s like super, super difficult, as most people will know that or many people know. The way led to the Yale Endowment — everybody who’s interested in finance has once looked at the Yale Endowment because David Swensen is the gold standard for investing capital across generations. And my light bulb moment was then seeing that David Swensen had by then put 20% of the Yale endowment into venture capital, 20%, that’s $8 billion, and it was by far his best performing asset class with 21% yearly performance, 21% IRR. So that was my light bulb moment because I said, “Wow, I love startup anyways”, but I didn’t know you could make an asset class out of this, “Let me try and replicate what David Swensen did”, and I believe that with time because I have my unique angles, including F1, that with time, I can also build the right access by adding value into the ecosystem and everything to kind of replicate the approach that David Swensen took to the asset class. And that’s where we are now, we actually made it work. What are those unique angles? I think that sort of ties this together. You have the F1 background, you’re European. NR: So the unique angle, of course I have the F1 platform, which is a really unique advantage to be able to meet people from the VC ecosystem, make friendships, get insights. Appear on this podcast. NR: (laughing) I’m very, very lucky in that sense. But that’s something you seem to think about very strategically. 
Like, “This is an advantage that I have, I’m going to exploit this and push this”. Is this part of the thesis up front, particularly once you started? NR: Well, first of all, I really enjoy welcoming this incredible community to my sport, it’s amazing for me to be able to showcase my sport in a way. So this is where you did better from Drive to Survive in the end, because even if you sort of missed that era, now suddenly everyone’s interested in F1. NR: Oh yeah, definitely, I would not be here today if it wasn’t for Drive to Survive because that’s what has really engaged the whole tech community in my sport. It’s lovely to be able to invite people, bring them up close, show them what my sport is about, and see how excited everybody is and to share that with them is really amazing, so I enjoy that. And it’s a great opportunity to, as I said, build friendships and get insights, but then also to add value. How does that start? First of all, of course, curating the group that I invite. I invite the founder and then I invite the CIO of a big company and they then actually have a very valuable exchange. The CIO happens to be looking for the product that the founder is building, the founder obviously needs to go to market, so there’s a great way for me to build connections, and that’s how you start adding value. And beyond that, what we do is also we bring U.S. innovation to the German large corporates, we help with that. So Germany is your specific focus in particular in Europe. NR: Because I’m German, and because of my history and everything, I’m very well connected in Germany to all the C-levels in the large corporates. Does this even go back to like not just growing up in Germany, but also working for Mercedes, being the driver who’s interacting with all this? NR: Yeah, of course. All these large caps have been sponsors in F1, they’re all in the paddock, so I know them very well, and they’re all in desperate need of transformation now. 
Of course, there’s AI, there’s sustainability, there’s all these points and they’re not exactly the fastest, the German companies. They’re a little bit — many of them are real legacy businesses, who are not necessarily known to being the most brave when it comes to adopting new innovation and things like that. And are these generally like just regular companies, like manufacturer companies, things like that? NR: It goes all the way to the car manufacturers, whether it’s BMW or Mercedes and we have found a unique positioning where we’re able to support, just selectively, with bringing their attention to a couple of products that are just being built in the US in the startup ecosystem, whether it was vibe coding or it’s even legal tech, all these different things, and we can bring their attention to some of these innovations and really add value by creating these connections. So this is one of the secret sauces to Rosberg Ventures and to adding value, which works very well, and we’re hosting dinners with some of the C-levels and inviting some of the startups, etc., and it works very well. So you recently announced a new fund, $200 million assets under management . How did you grow your network on the asset side? Is that mostly then German money that’s coming back to the U.S. and you’re completing the cycle there? NR: Mainly German, so it’s German capital because the Europeans really lack connectivity, I realized that the Europeans lack access to U.S. venture capital and they know of the importance and the value that’s being created there, but they don’t have the access and they really kind of miss the boat on that, so it’s not too easy to convince them that, “Hey let’s join forces and partner up here, and let’s invest in the best opportunities in the U.S.”. 
So that’s been working very well and my way to raise or to convince these families is really going via the principal who I may know from F1 or whatever and then I say — I don’t even say too much like what I’m building because you don’t want to sell straight away — it’s more like, “Hey can you introduce me to your family office? I would love to just have a conversation with them”, and then the introduction, and I speak to them, I explain what we’re doing, and it’s just an obvious one. We’re kind of indexing the top 10 VC funds in the U.S., and also the top 10 growth stage companies, startups in the U.S., and indexing those and it’s kind of a no-brainer then, that’s how we’ve been able to raise capital very, very quickly. That makes sense. So everyone sees the opportunity, it’s not clear to get the capital in, you go in first sort of as like a seed investor with your own money, and that sort of starts that virtuous cycle, and that makes sense and then they get access to the German market in the long run. You’re bringing a unique angle and it’s just all about deal flow, I think it’s pretty compelling. Why is it so hard to do business in Europe? Has everyone just given up on having a big startup ecosystem there and, “Let’s just get our money into the U.S.”? NR: So you mean the startup ecosystem in Europe? NR: There are flashes of real hope at the moment. Vibe coding was pioneered in Europe, the vibe coding for prosumers, that’s Lovable out of Sweden, and there’s many other examples. I mean, ElevenLabs, the global leader in voice AI, European, and many, many more examples. So there are flashes of real hope. But of course, we lack the breadth in the whole ecosystem and that’s as a result of a few things. It’s a bit of a chicken-and-egg. 
One, of course, it’s much harder to scale in Europe because of the geographical limitations, it’s so hard to go from Germany to France, different language, different regulatory framework, it’s just a huge friction there in the go-to market, so that’s one challenge. And then historically also, there’s been quite a lag in the distributions and liquidity in that asset class in Europe and so therefore, funding is not as ample as in the U.S. So it’s kind of a chicken-and-egg there also. But I think Europe is really working on trying to introduce one regulatory framework across the entire Europe, across all countries for startups, so that’s in the plan, so a lot is happening, and let’s see if Europe can develop more and more such promising companies. How have you managed this shift? You started out sort of a fund of funds sort of model, then you mentioned you’re doing more direct investing. Is that just a natural evolution of getting more access, having more assets under management? Or what was that explicit goal and strategy that you were seeking to pursue? NR: Well I think the holy grail in venture capital is to invest directly in the startups and the fund of fund was the natural starting point from an asset class point of view, also from copying and being inspired by what Yale did, and then from there the fund of fund is like a Trojan horse because it gets you positioned well into the market where you see everything and then it really helps to identify which are the breakout startups, which are the most promising with the generational founders. So it really helps to create a short list and also to create those connections and to build those opportunities to actually invest directly in the startups. We met in San Francisco a couple of months ago, you had just met with Dreamer, I actually met with them the next day, they launched and were immediately acquired by Meta, was that your first exit of a direct investment? 
NR: So this is an important point that I don’t just like try and support the companies that I’ve backed. So in this case, this was the CTO of Stripe, the ex-CTO of Stripe, who was my friend, David Singleton , he built this together with Hugo [Barra] , who used to have a senior role at Facebook. Yep, I knew him when he was at Xiaomi , he was at Google, he was at Meta, he’s been all over the place. NR: Everywhere, it’s an incredibly promising founding team, and so I was just trying to support. And they happened to say that Stratechery, that they were the biggest fans in the world of you and Stratechery, so I was like, “Okay, well, that’s easy, I just met Ben yesterday, so I can make the connection there”. Yeah, it’s a pity how that went — I mean, pity because also from our point of view, I was so excited about that product, actually, it was vibe coding AI agents. Yep, it’s very compelling. I was looking forward to writing about it, they got snapped up before I could even get there. NR: I was looking forward to really using it at scale, but, yeah, now it’s bought by Meta and let’s see what Meta does with it, but it will certainly be, I’m sure, very promising what they build with that. As you’ve made this transition and levered up into tech and going from fund of funds to direct investment, it’s a time of great upheaval in tech , given AI. Theoretically, this should mean more startup opportunities. On the other hand, the frontier lab models might just eat everything. How are you thinking about that as an investor? Is it like, “I’m finally getting to the stage where I can get into startups, and now I’m not sure that I want to”? Or are you optimistic? NR: I’m very optimistic. I’m very optimistic because AI, the value creation within this wave of AI is going to be something like we’ve never seen before, and I do think there’s a lot of opportunities beyond just the frontier labs to capture market share, create new markets. 
But at the same time, you do need to be careful because we see the legal tech. Legal tech is a really big new market that’s being created there with a leader like Harvey and Legora, the two leaders, and then now Anthropic came out with a product which kind of starts to threaten their position a little bit. And Anthropic has been doing that for every sector, it feels like almost, so that is a little bit of a concern. It does feel like a safer place at the moment to be invested in frontier labs and neo labs, that does seem the more safe place to be. But nevertheless, I think there’s like, for example, ElevenLabs, voice AI, it’s very defensible what they’re building. They are a frontier lab themselves, by the way, because they build their own models. But still, voice probably is going to commoditize, the research, as in many cases and there it’s then going to be about the platform, distribution, products. And there, ElevenLabs is doing an excellent job. So it does look at the moment like they’re going to be able to really win and sustain any potential threat from these frontier labs so there are examples where beyond the frontier labs, many, many examples where they can be success stories, so it’s an exciting time. You mentioned platform and distribution, and this sort of seems to be a theme: you’ve thought about the F1 reputation and background, “I can leverage that, I know these sort of companies, I can leverage that”, you saw YouTube early on, you were on that, you’re here on this interview. Is that why you still do Sky Sports? Everyone’s favorite commentator, is that you love to commentate, does that keep Nico Rosberg sort of front and center? NR: You’re right. I do enjoy staying connected with the sport, but there’s the second reason that it’s really helpful for me to stay kind of relevant and it does help me also with relevance, even in the tech ecosystem. 
Because, of course, if then some people enjoy watching me and things like that, it’s easier to connect with them in future, even in the tech ecosystem. So that is twofold. We talked before, you were born with sort of steering wheel in your crib, in some respects, a advantageous background. But what I see as an overall theme is pretty consistently you identifying and leveraging your advantages and like what we just articulated is a good example. So now you’re in the investing world, totally separate, but figuring out what you have, how to work with it and build towards that. Is that the overarching sort of theme that you see in your life? What still drives you, is it that bit about being a little bit insecure and wanting to prove yourself and being super competitive? Is that just like you can’t turn that off and that’s what that’s why you’re still here? NR: I’m a super extreme competitor, I need to compete, I want to win, and I have now chosen venture capital as my space to try and win more and more in future. And I think, yeah, this is what I’m carrying over from the sport. I was very methodical about how do I get that win, in sports, every detail. I worked on every single detail possible to put all the pieces together to be the best that I could be and to get to that win eventually and I think that’s something that I’m now replicating in the world of venture capital, trying to optimize for everything and put everything together to be able to win more and more. How do you think about that with your kids, just out of curiosity? Your daughter sort of popped into the background on the call here. NR: So with my kids, because I went through such an extreme intensity in my sporting career, I, with them, am more focused on well-being rather than pushing them towards some success. But at the same time, you just credited your massive drive and competitiveness with your success. 
NR: Exactly, yeah, but wellbeing and happiness is what I put at a higher level for my kids and that doesn’t necessarily have to be success. So I’m very eager to push to try and help them discover their real passions, and we’re getting there. So my daughter, I put her in a go-kart two weeks ago, she drove slower than I could walk, so I could walk faster, and she ended up crying, so I hope she doesn’t listen to this one day, but I don’t see which one it is either, so we’re fine because I have two daughters. So it was clear that this is not her passion, and then we will never go again. But I can see that her passion is music, guitar, singing and so there I do nudge her towards more lessons, guitar lessons, drum lessons, without overdoing it, because I see that that’s her natural passion, you know? So that’s the approach I’m taking, but definitely really focused on happiness and well-being. So you mentioned you’re on holiday in Ibiza. I understand you have an ice cream shop there , is that right? NR: So yeah, with my wife, because she’s an interior designer, so she’s super creative and for some reason, we both of us, we love ice cream and we’ve been coming to Ibiza all our life, and there’s never been a nice ice cream place. So just as a hobby, we just said, “Hey, why don’t we open one ourselves?” — our friend, our common friend, he likes to make ice cream, so we do that, and it’s become a huge business. We have now a chain here in Ibiza, and very successful, and it’s the number one ice cream place. So Ben, next time you’re in Ibiza, ice cream is on us. (laughing) Sounds like a deal. You have an interesting life in terms of you learn five languages growing up, you have parents from different countries. Obviously, as part of being an F1 driver, you’re all over the world. You’re doing this connection between Germany in particular and Silicon Valley. 
Do you feel like, you talk about eras and riding them and starting and beginning in terms of F1 — do you feel that era, you’re like the pinnacle of like globalized civilization? Do you feel that that is an era that is going to persist past you, or do you feel that sort of cracking and changing? NR: This is related to the sport or? Just in general, just given you are like an international man of mystery, although maybe not that mysterious, but it’s like your superpower is connecting and linking all these disparate pieces together and seeing the ability to sort of build through them. And I’m wondering, is that something, an opportunity, that you think is going to persist given the way the world is going? NR: Well, I’m very optimistic in that sense, I’m very optimistic. And I see a long road ahead. And I think it’s an amazing time for venture capital now, it’s incredible, a time that we’ve never seen something like that before, the speed of innovation, and there may be my F1 speed also helps me, it doesn’t scare me at the moment because I’m used to driving 220 miles an hour. So maybe I’m one of the only people in the world where I’m not getting scared by the speed of innovation that we’re seeing in the startup ecosystem, because I’m quite used to speed. You actually focused a lot on e-mobility and electric vehicles. I do have to ask you, how are you feeling about the current F1 regulations , this 50-50 split? A lot of complaints that driver’s skills being taken away. What’s your view? NR: I saw a message from Toto actually recently, and he said, the F1 driver job might be the very last place that AI is going to endanger that job. Because it’s very, very hard for AI to try and replicate what we are doing in that racing car at the edge of physics. But has it been diminished a little bit if you’re going around a curve or you’re on a straight and your car’s just slowing down on its own? 
NR: No, I understand, F1 has tried to stay technologically relevant so they have gone full hybrid which is one of the most efficient powertrains in the world, the way they’ve done it, but of course yeah it’s a little bit to the detriment of racing on the edge, because now they’re going through a high speed corner towards the end of the straight and they actually downshift on the straight after the corner which is unheard of in the sport. But to be honest I’m quite easygoing about that because I like to really focus on just, “Is the racing exciting?”, “Is there good battles?”, “Is it unpredictable?”, “Is there rivalries?”, and as long as that’s happening, I think all fans will kind of forget about these regulations and will just enjoy the sport once again and be super excited. I think the season is shaping up really nicely. We have this super underdog, this 19-year-old who was really having a struggle last year, who suddenly has come to life and is showing his real talent and is dominating the championship so far, 19 years old, he’s still like a child, it’s incredible, Kimi Antonelli , Italian guy, driving for Mercedes. So it’s so exciting to see him in front and now everybody else trying to catch up to him, I think it’s great. You are associated with Mercedes, they are doing very well, I am a Kimi fan, my kids got a picture with him last year, so he’s by default who we’re cheering, for sure. But who do you cheer for in F1? NR: I do cheer for Kimi as well now because he used to be my driver in go-karting as well, so I know him since he’s 12 years old, and he is a generational talent of the level of [Max] Verstappen, Hamilton. His talent is exceptional and he’s so humble and authentic and nice guy also, so you can only cheer for him. It’s such a challenge that he’s facing, being a driver of the Mercedes team, leading the championship all of a sudden, an incredible challenge, and I can so relate because I was in that position and it’s so hard. 
It is so hard what he’s getting himself into now for the rest of the year. I’ve been writing him also and I said, just without telling him what he should do, I just told him like what I did and what worked for me, I’ve been writing him. And one thing, for example, was just really take it race by race, don’t think about the end of the season, don’t think about championship, just race by race, try and optimize for the next race, go in to win, and that’s it and then the rest will just see how it goes. Are you surprised it’s been a decade and Lewis [Hamilton] is still in F1? NR: I am quite surprised, because that’s a long time, and we weren’t exactly young at the time. So when I stopped 10 years ago, he was already almost 32 and he’s still going now, which is incredible and huge respect, respect for him to keep going, keep grinding, keep the motivation. Still seems as motivated as ever, driving really well again this year, he’ll definitely win some races this year, I think he’ll win some, so he’s doing really well. And every win that Lewis gets is another notch on your belt, right? NR: (laughing) That’s a little bit of an egotistical view to it, which sometimes I do think about. Yes, the better my success looks, which is nice, yeah. You won one, you beat Lewis. It’s a championship, if you’re going to win one, that’s about as good as it gets. But, hey, you didn’t stop there, it’s super impressive what you built, very interesting to learn more and I look forward in 10 years when Nico Rosberg is the champion VC investor. What is it, the Midas list ? Are you gunning for number one? NR: Yeah, sure, Midas List, that’s gonna be a hard one, but those kind of targets, at some point, yes. Nico Rosberg, great to talk to you. NR: Thank you very much. This Daily Update Interview is also available as a podcast. To receive it in your podcast player, visit Stratechery . The Daily Update is intended for a single recipient, but occasional forwarding is totally fine! 
If you would like to order multiple subscriptions for your team with a group discount (minimum 5), please contact me directly. Thanks for being a supporter, and have a great day!

Playtank 1 months ago

Analogue Prototyping

There is a lot to say about prototyping . Chris Hecker talked about advanced prototyping at GDC 2006, and provided a hierarchy of priorities that goes like this: Analogue prototyping comes in right away at Step 1: Don’t . By not launching straight into your game engine, you can save giant heaps of time between hypothesis and implementation. You can also figure out what kinds of references will be relevant before you reach Step 4: Gather References . There’s another side to analogue prototyping as well. In the book Challenges for Game Designers , Brenda Romero says: “A painter gets better by making lots of paintings; sculptors hone their craft by making sculptures; and game designers improve their skills by designing lots of games. […] Unfortunately, designing a complete video game (and implementing it, and then seeing all the things you did right and wrong) can take years, and we’d all like to improve at a faster rate than that.” Brenda Romero Using cards, dice, and paper leads to some of the fastest prototyping possible. It can be just ten minutes between idea and test, fitting really well into those two days of Step 2: Just Do It . Of course, it can also take weeks and require countless iterations, but that’s part of the game designer’s job after all. This post focuses on what to gain from analogue prototypes of digital games, and the practical process involved. It’s also unusually full of real work, since this is something I’ve done quite a bit for my personal projects and is therefore not under NDA. If you’re curious about something or need to tell me I’m wrong, don’t hesitate to comment or e-mail me at [email protected] . Why you should care about analogue prototyping when all you want to do is the next amazing digital game may seem like a mystery. A detour that leads to having your fingers glued together and a bunch of leftover paper clippings you can’t use for anything. 
In Chris Hecker’s talk, the first suggestion is that you should cheat before you put too much time into anything else. Since you will be cutting and gluing and sleeving, and some of that work takes time, this counts double with analogue prototypes. The easiest way to cheat is to use proxies. If you have a collection of boardgames, this is easy. You can also go out and buy some used games cheap or ask friends if they have some lying around that they don’t use. Perhaps that worn copy of Monopoly that almost caused a family breakup can finally get some table time again, in a different form. Aesthetics matter. If you want to take shortcuts with how a game feels to play, getting something that looks the part can be a shortcut. Go to your local Dollar Store or second hand shop and pick up some plastic toys or a game with miniatures that are similar to what you are after. They can merely be there to act as center pieces for your prototype. The easiest and most efficient reference board that exists is a standard chessboard. Square grid with a manageable size. You can also use a Go board, with the extra benefit that the Go beads also make for excellent proxy components. Beyond those two, you can really use any other board game board too. Just make sure to remember where you got it from if you want to play those games in the future. Or you can even pick up games with missing parts at yard sales, usually super cheap, and scavenge proxy parts from those. For some types of games, finding a good real-world map, perhaps even a tourist map or subway map, can be an excellent shortcut. Not just for wargames, but for anything with a spatial component. The guide map from a theme park or museum works, too. Packs of 52 standard playing cards are fantastic proxies. You can use suits, ladders, make face cards have a different meaning, and much more. Countless prototypes have used these excellent decks to handle anything from combat resolution to hidden information. 
It’s also possible to go even further, and make your own game use regular playing cards and the known poker combos as a feature. Balatro comes to mind. Many families have a Yatzy set lying around, providing you with a small handful of six-sided standard dice. You can do a lot with just this simple straightforward randomisation element. But don’t limit yourself to just six-sided dice, if you don’t have to. Get yourself a set of Dungeons & Dragons polyhedrals and you’ll have four-, eight-, ten-, twelve- and twenty-sided dice rounding out your randomisation armory. HeroScape, a fantasy wargame, deserves an honorable mention because of its diversity. You can build all manner of strange scenery from just a core HeroScape set and use it effectively to represent almost anything. The same goes for Lego. The main issue with these kinds of proxies is that they can take a lot of space. Particularly HeroScape, since it has a predefined scale. With Lego, you just need to figure out a scale and stick to it. If there’s a game the people you will play with are especially familiar with, you can skip over having to design one of your systems by substituting a mechanic from a game you already know. Say, if you know that you will want to have statistics in your game, you can copy the traditional lineup of six abilities from Dungeons & Dragons, as well as their scale, to get started. Even if you know that you will want a different lineup later, this means you can test elements that are more unique to your game faster. An effective way to minimise cut-and-paste time is to print your cards very small. Preferably so all of them fit on a single piece of paper. They will be a bit trickier to shuffle this way, but that’s rarely an issue in testing. This way, you need less paper and you can cut everything faster. Going from eight cards per sheet to 32 is a pretty big difference. Just avoid miniaturizing to the point that you need a magnifying glass.
There’s no need to get fancy with real cardstock. Here are some things you can use. I usually just keep any interesting sheets from deliveries I receive. Say, the sturdy sheet of paper used in a plastic sleeve to make sure a comic book doesn’t bend in the mail. Perfect for gluing counters. There are three things you need to consider for paper: size, weight, and texture. For size, since I’m in Europe, I use the standardized A-sizes. A0 is a giant piece of paper, A1 is half as big, A2 half as big again, and so on. The standard office paper format is A4, roughly equivalent to U.S. Letter. This can easily be folded into A5 pamphlets. I also keep A3 papers around (twice the size of A4), but those I use to draw on. Not for printing. I don’t have a big enough home to fit a floor printer. The next thing is paper weight, measured in grams per square meter (GSM). Most home printers can’t handle heavier paper than 120-200 GSM. I always keep standard paper (80 GSM) around, and some heavier papers too. If I print counters or cards I sometimes use the sturdier stock. For reference, Magic cards are printed on 300 GSM black core paper stock. The black core is so you can’t see through the card and is taken directly from the gambling circuit. Lastly, the paper’s texture. If you want to work a little on the presentation, it can be nice to find paper canvas, or other sturdier variants. I’ve found that glossy photo paper is almost entirely useless in my own printer, however, always smearing or distorting the print. So when I buy any higher-GSM paper I try to find paper with coarser texture. There are many different kinds of cardboard, and you should try to keep as many around as possible. Some can be good for gluing boards or counters onto, while others can help make your prototype sturdier. This isn’t as important as paper, but gets used frequently enough that it felt worth mentioning. There will be a lot of rambling about cards later, and how to use them. 
For now, I only refer to loose cards you can use to prop up your thin paper printouts. These are not strictly necessary, but make shuffling easier. I don’t play much Magic: The Gathering anymore, but I still have lots and lots of leftover Magic cards, so those are the ones that get used as backing in most of my prototypes. You can cheaply buy colored wooden cubes as well as glass and plastic beads in bulk. It’s not always obvious what you may need, so keeping some different types around can be helpful. More specific pieces, like coins or pawns, can also be useful but unless these components provide unique affordances the kinds of components you have access to is rarely important. It’s usually enough to be able to move them around and separate them into groups. Storage is another thing that needs solving. If you mostly print paper and iterate on rules, a binder can be quite helpful. Especially paired with plastic sleeves so you can group iterations of your rules together and store them easily. If you also need to transport your prototypes, the kinds of storage boxes you find in office supply stores will have you sorted. You can push your analogue prototyping really far and build a whole workshop. A 3D printer for making scenery and miniatures, a laser cutter for custom MDF components, and a big floor-sized professional printer that takes over a whole room. If you have the space and the resources for that, you do you, but let’s focus on the smallest possible toolbox for making analogue prototypes. If you want to buy a printer, you just need to be aware that all of them have the same problems of losing connections and failing to print still to this day. Those same problems that have plagued printers since forever. I use a laser color printer with duplex (double-sided) printing support and the ability to print slightly heavier paper, up to 220 GSM. This has been more than enough for my needs. Specifically the duplex feature helps a lot if you want to print rulebooks. 
Having a good store of pencils and pens, including alcohol- and water-based markers, is more than enough. You can go deeper into the pen rabbit hole by looking at Niklas Wistedt’s spectacular tutorial on how to draw dungeon maps: it’ll have you covered in the pen and pencil department. Some tools you keep around to hold piles of paper or cards together. Paper clips are extra handy, because they can also be used as improvised sliders pointing at health numbers or other variables. Rubber bands are handy for keeping decks of cards together inside a box and for transportation. Without decent scissors on hand, almost every paper-based activity will be a futile effort. Just beware that cutting things out by hand takes more time than you think. If you have a game with many cards, you may have to put on a couple of episodes of your favorite show as you cut them out. If you need more precision than scissors can provide, the next rung on the cutting ladder is to get a proper cutting mat, a steel ruler, and a set of good sharp knives. These can be craft scalpels, metal handles with interchangeable blades (Americans insist on calling these “x-acto knives”), or carpet knives. Once you have rules and test documents printed, you’ll quickly disappear under a veritable ocean of paper. Though smaller sheaves can be pinned together with a paper clip, staplers are even better. A standard small office stapler is enough. But if you want to staple booklets and not just sheaves, it can be worth it to get a long-reach stapler capable of punching 20 sheets or more. Attaching paper to other paper can be done in more ways than with clips or staples. Sometimes you want to use glue or adhesive tape. Keeping a standard gluestick and a can of spray glue around is perfect. Regular tape and double-sided tape are also great for many things, even if the main use for tape can just be to make larger scale maps out of individual pieces of paper.
As mentioned previously, it can take some time to cut out all the cards you want to print. You can cut this time down to a fraction, metaphorically and physically, by getting a paper guillotine. These can usually take a few sheets at a time and will give you clean cuts along identified lines. Yelling “vive la France” when you drop the blade is optional. Lastly, a more decadent piece of machinery that isn’t strictly needed is a paper laminator. These will heat up a plastic pocket and melt the edges together to provide the paper with a plastic surface. It makes the paper much sturdier and has the added benefit of allowing you to use dry erase markers to make notes and adjustments right on the sheet itself. There is a lot of software out there that can be used to make cards, boards, illustrations, and whatever else you may need. The following is merely a list of what I personally use. Since you will often want to test things at different sizes, vector graphics are generally more useful for board game prototyping than pixel graphics. This is by no means a hard rule, but the resolution of pixel images tends to limit how large you can scale them, while vector graphics have no such limitations. My go-to for vector graphics is Illustrator, but there are free alternatives like Affinity available as well. My other go-to piece of software for analogue shenanigans is InDesign, another Adobe program that can also be replaced by Affinity. I’m just personally so stuck in the Adobe ecosystem, after decades of regular use, that it’s too late for me to switch. You can’t teach an old dog new tricks, as the saying goes. InDesign is great for multiple reasons. Not least of all its ability to use comma-separated value (CSV) files to populate unique pages or cards with data. A feature called Data Merge. Speaking of spreadsheets, all system designers have a lovely relationship to their tool of choice.
This can be Microsoft Excel, OpenOffice Calc, or Google Sheets, but the many convenient features of spreadsheets are a huge part of our bread and butter. I don’t even want to know how many sheets I create in an average year. Very broadly speaking, when making an analogue prototype, I will make use of spreadsheets for these reasons: The fantastic Tabletop Simulator is not just a great place to play tabletop games, it’s also a great place to test your own games. Renowned board game designer Cole Wehrle has recorded some workshops for people interested in this specific adventure, and let’s just say that once you have this up and running it will make it a lot easier to test your game. Especially if the members of your team don’t all live in the same city. Its biggest strength is how quickly you can update new versions for anyone with a module already installed. If you share your module through Steam Workshop, it’s even easier. For most analogue prototypes, this isn’t doable, simply because of NDAs and rights issues. So much stuff! Let’s put it all together. The way I’ve talked about this, there are really six steps to the process of making an analogue prototype: This is more important than you may think. An analogue prototype can easily become a design detour. Because of this, your goal needs to formulate why you are making this analogue prototype. “Test if it’s fun with infinitely respawning enemies” could be a goal. “See what works best: party or individual character” could be another one. But it can also be a lot narrower, for example designed to test the gold economy in your game. Perhaps even to balance it. The point is that you need a goal, and you need to stick to it and cut everything out that doesn’t serve that goal. If you need to test how travelling works on the map, you probably don’t need a full-fledged combat system, for example. Facts are the smallest units of decision in your game’s design.
Stuff that every decision maker on your team has agreed on and that can therefore safely inform your analogue prototype. This can be super broad, like “the player plays a hamster,” or it can be more specific, like “the player character always has exactly one weapon.” You need these facts to keep your prototype grounded, but you don’t necessarily need to refer to them all at once. Pick the ones that are most important to your goal. With a goal and some facts, you need to figure out what systems you will use. Try to narrow it down more than you may think. Don’t make a “combat system,” but rather one “attack system” and another “defense system.” The reason for this is that what you are after is the resource exchanges that come from this, and the dynamics of the interactions. The attack system may take player choices as input and dish out damage as output, while the defense system may accept armor and damage input and send health loss as output. You can refer to the examples of building blocks in this post for inspiration. This is where we come to the biggest strength of analogue prototyping: real humans provide a lot more nuance and depth than any prototype can do on its own. Analogue or digital. One player can take on the role of referee or game master, similar to how it would work in a tabletop role-playing game . In many wargames of the past, this was called an umpire. Someone who would know all the rules and act as a channel between the players and the systems. If you have built a particularly complicated analogue prototype, a good way to test it can be to act as a referee and then simply ask players what they want to do instead of teaching them the details of the rules. Players can play each other’s opponents, representing different factions, interest groups, or feature sets via their analogue mechanics. If you built an analogue prototype of StarCraft , you’d probably do it this way, with three players taking on one faction each. 
One player can play the enemies, while another plays the economy system, or the spawning system. The goal here is to put one player in charge of the decisions made within the related system. If someone wants to trade their stock for a new space ship, and this isn’t covered by the rules, the economy system player can decide on the exchange rate and the spawning system player can say that this spawns a patrol of rival ships. Just take ample notes, so you don’t forget the nuances that come out of this process. There are many different ways to use the components you collected previously. Some of them may not be intuitive at all. The humble die: perhaps the most useful component in your toolbox. Just look at the following list and be amazed: People have been using playing cards for leisure activities since at least medieval times. Just as for dice, you’ll see why right here, and perhaps these things will fit your needs better than dice: Humans are spatial beings that think in three dimensions. Even such a simple thing as a square grid where you put miniatures will create relationships of behind, in front of, far away from, close to, etc. Not all analogue prototypes need this, but if you do need it, here are some alternatives to explore: With the fast iterations of analogue prototypes, you can usually just change a word or an image somewhere and print a new page. This means you may have many copies of the same page after a while. To prepare for this situation, make sure to have a system for versioning. It doesn’t have to be too involved, especially if you’re the only designer working on this prototype, but you need to do something. I usually just iterate a number in the corner of each page. The 3 becomes a 4. I may also write the date, if that seems necessary.
I may also add a colored dot (usually red) to pages that have been deprecated, since just the number itself won’t say much and you may end up referring to the wrong pages if you don’t have an indicator like this. Step 1: Don’t : Steal it, fake it, or rehash stuff you have already made before you start a new prototype. Step 2: Just Do It : If it takes less than two days, just do it. As the saying goes, it’s easier to ask for forgiveness than for permission. Step 3: Fail Early : When something feels like a dud even at an early stage, you can assume that it is in fact a dud. There’s nothing wrong about abandoning a prototype. In fact, learning to kill things early is a skill. Step 4: Gather References : Prototypes can only really help with small problems. Big problems, you must break apart and figure out. Collect references. White papers, mockup screenshots, music, asset store packs, and so on. Anything that helps you understand the problem space. The same psychology applies . Rewards, risk-taking, information overload. Many of our intrinsic and extrinsic motivators are triggered the same by boardgames as by digital games. The distance is not nearly as far as we may tell ourselves. Players can represent complex systems . A player has all the complexity of a living breathing human, making odd decisions and concocting strange plans. This lets you use players as representations of systems, from enemy behaviors to narrative. Analogue games are “pure” systems . If you can’t make sense of your mechanic in its naked form, you can probably not expect your players to make sense of it either. Similar affordances . Generating random numbers with dice, shuffling cards, moving things around a limited space; analogue gaming is always extremely close to digital gaming, even to the point that we use similar verbs and parlance. Holism . Probably the best part of the analogue format is that you can actually represent everything in your game in one way or another. 
It doesn’t have to be a big complex system, as long as you provide something to act as that system’s output.

Listing all the actions, components, elements, etc., that are relevant. Just getting things into a list can show you if something is realistic or not.
Cross-matrices for fleshing out a game’s state-space. If I know the features I want, and the terrains that exist, a cross-matrix can explore what those mean: a feature-terrain matrix.
Notes on playtests. How many players played, what happened, who won and why, etc.
Calculators of various kinds, incorporating more spreadsheet scripting. Can be used to check probabilities, damage variation, feature dominance, etc.
Session logging. If I want to be more detailed, I can log each action from a whole session and see if there are things that can be added or removed.

Set a Goal
Identify Facts
Systemify the Facts
Consider the roles of Players
Tie it together with Components

Types of dice: you can use any number of sides, and make use of the corresponding probabilities. Dividing one by the number of sides gives you the probability of any single result. So, 1/6 ≈ 0.1667 means there’s a ~17% chance to roll any single side on a six-sided die. Use the dice that best represent the percentage chances you have in mind.
Singles: rolling a single die and reading the result. Pretty straightforward.
Sums: rolling two or more dice and adding the results together.
Pools: rolling a handful of dice and checking for specific individual results or adding them together.
Buckets: rolling a lot of dice and checking for specific results. The only reason buckets of dice are separated from dice pools here is because they have a different “feel” to them; they are functionally identical.
Add/Subtract: add or subtract one die from the result of another, or use mathematical modifiers to add or subtract from another result.
X- or X+: require specific results per die.
In these cases X- would mean “X or lower,” and X+ would mean “X or higher.” Patterns : like Yatzy, or what the first The Witcher called “Dice Poker:” you want doubles, triples, full houses, etc. Reroll : allowing rerolls of some or all of the dice you just rolled. Makes the rolling take longer but also provides increased chances of reaching the right result. Some games allow rerolling in realtime and then use other time elements to restrict play. So you can frantically keep trying to get that 6, but if an hourglass runs out first you lose. Spin : spinning the die to the specific side you want. Trigger : if you roll a specific result, something special happens. It could be the natural 20 that causes a critical hit in Dungeons & Dragons , or it can be that a roll of 10 means you roll another ten-sided die and add it to your result. Hide : you roll or you set your result under a cupped hand or physical cup, hiding the result until everyone reveals at the same time or the game rules require it. Statistics : common sense may say that you can’t possibly roll a fifth one after the first four, but in reality you can. Dice are truly random. Shuffle : shuffling cards is a great way to randomise outcomes. This can be done in many different ways, as well, where you shuffle a “bomb” into half of the pile and then shuffle the other half to place on top, for example. There are many ways to mix up how to shuffle a deck of cards. Uniqueness : each card can only be drawn once, which means that you can make each card in a deck unique and you can affect the mathematics of probability by adding multiple copies of the same card. Just like the board game Maria uses standard playing cards but in different numbers. Front and back : the face and back of the cards can have different print on them, or the back can just inform you what kind of card it is so you can shuffle them together in setup. 
Of course, the fact that you can hide the faces for other players is also what makes bluffing in poker interesting. Turn, sideways : what Magic calls “tapping” and other games may call exhausting or something else. Some cards can be turned sideways (in landscape mode instead of portrait mode) by default. Turn, over : flipping a card to its other side can serve to show you new information or to hide its face from everyone around the table. It can represent a card being exhausted, or injured, or other state changes like a person transforming into a werewolf. Over/under : cards can be placed physically over or under other cards, to show various kinds of relationships. An item equipped by a character, or a condition suffered by an army, for example. Card grids : cards can be placed in a grid to generate a board, or to act as a sheet selection for a character. One card could be your character class, another could be a choice of quest, etc. It’s a neat way to test combinations. Hide cards : if you want to get really physical, you can hide cards on your person, under boards, and so on. This was one way you could play Killer , by hiding notes your opponents would find. Card text : if you print your own cards, you can have any text you want on them. Reminders, rules exceptions, etc. Deck composition : how you put decks together will affect how the game plays, and predesigning decks for different tests can be very effective. Perhaps you remove all the goblins in one playtest and have only goblins in another. Deck building : decks can also be constructed through play, similarly to how Slay the Spire works. A style of mechanic where you can start small and then grow in complexity throughout a session. Stats : cards can be in different states. On the table, in your hand, available from an open tableau, shuffled into a deck, discarded to a discard pile, and even removed from the game due to in-game effects. 
Semantics: something that Magic: The Gathering’s designer, Richard Garfield, was particularly good at was figuring out interesting names for the things you were doing. You don’t just play a card, you’re casting a spell. It’s not a discard pile, it’s your graveyard. These kinds of semantics can be strong nods back to the digital game you are making, or they can serve a more thematic purpose.
Statistics: with every card you draw, the deck shrinks, increasing the chances of drawing the specific card you may want. You are guaranteed to draw every card if you go through a whole deck, which is one of the biggest strengths of decks of cards.
Node or point maps: picture a corkboard with pins and red thread, or just simple circular nodes with lines between them. You can draw this easily on a large sheet of paper and just write simple names next to each circle to provide context.
Sector maps: one step above the node or point map is the sector map, where regions share proximity. Grand strategy games have maps like this, where provinces share borders. Another example is more abstract role-playing games, where a house’s interior is maybe divided into two sectors and the whole exterior area around it is another sector. It’s excellent for broad-stroke maps.
Square grids: if you want a grid, the square grid is probably the most intuitive. But it also has a mathematical problem: a diagonal step covers roughly 1.4 times the distance of a cardinal step, and as much ground as two cardinal steps. This means you need to either not allow diagonals or allow them and account for the problems that will emerge.
Hexagon grids: these are more accurate and classic wargame fare, but they will also often force you to adapt your art to the grid in ways that are not as intuitive as with a square grid.
Freeform: finally, you can just take any satellite image or nicely drawn map, perhaps an overhead screenshot from a level you’ve made, and use it as a map in a freeform capacity.
This may force you to use a tape measure or some other way to measure distances, but if the distances are not important that matters a lot less, for example if your game shares sensibilities with Marvel’s Midnight Suns.

David Bushell 1 month ago

No-stack web development

This year I’ve been asked more than ever before what web development “stack” I use. I always respond: none. We shouldn’t have a go-to stack! Let me explain why. My understanding is that a “stack” is a choice of software used to build a website. That includes language and tooling, libraries and frameworks , and heaven forbid: subscription services. Text editors aren’t always considered part of the stack but integration is a major factor. Web dev stacks often manifest as used to install hundreds of megs of JavaScript, Blazing Fast ™ Rust binaries, and never ending supply chain attacks . A stack is also technical debt, non-transferable knowledge, accelerated obsolescence, and vendor lock-in. That means fragility and overall unnecessary complication. Popular stacks inevitably turn into cargo cults that build in spite of the web, not for it. Let’s break that down. If you have a go-to stack, you’ve prescribed a solution before you’ve diagnosed a problem. You’ve automatically opted in to technical baggage that you must carry the entire project. Project doesn’t fit the stack? Tough; shoehorn it to fit. Stacks are opinionated by design. To facilitate their opinions, they abstract away from web fundamentals. It takes all of five minutes for a tech-savvy person to learn JSON . It takes far, far longer to learn Webpack JSON . The latter becomes useless knowledge once you’ve moved on to better things. Brain space is expensive. Other standards like CSS are never truly mastered but learning an abstraction like Tailwind will severely limit your understanding. Stacks are a collection of move-fast-and-break churnware; fleeting software that updates with incompatible changes, or deprecates entirely in favour of yet another Rust refactor. A basic HTML document written 20 years ago remains compatible today. A codebase built upon a stack 20 months ago might refuse to play. The cost of re-stacking is usually unbearable. 
Stack-as-a-service is the endgame where websites become hopelessly trapped. Now you’re paying for a service that can’t fix errors. You’ve sacrificed long-term stability and freedom for “developer experience”. I’m not saying you should code artisanal organic free-range websites. I’m saying be aware of the true costs associated with a stack. Don’t prescribe a solution before you’ve diagnosed a problem. Choose the right tool for each job only once the impact is known. Satisfy specific goals of the website, not temporary development goals. Don’t ask a developer what their stack is without asking what problem they’re solving. Be wary of those who promote or mandate a default stack. Be doubtful of those selling a stack. When you develop for a stack, you risk trading the stability of the open web platform, that is to say: decades of broad backwards compatibility, for GitHub’s flavour of the month. The web platform does not require build toolchains. Always default to, and regress to, the fundamentals of CSS, HTML, and JavaScript. Those core standards are the web stack. Yes, you’ll probably benefit from more tools. Choose them wisely. Good tools are intuitive by being based on standards; they can be introduced and replaced with minimal pain. My only absolute advice: do not continue legacy frameworks like React. If that triggers an emotional reaction: you need a stack intervention! It may be difficult to accept but Facebook never was your stack; it’s time to move on. Use the tool, don’t become the tool. Edit: forgot to say: for personal projects, the gloves are off. Go nuts! Be the churn. Learn new tools and even code your own stack. If you’re the sole maintainer the freedom to make your own mistakes can be a learning exercise in itself. Thanks for reading!


watgo - a WebAssembly Toolkit for Go

I'm happy to announce the general availability of watgo - the WebAssembly Toolkit for Go. This project is similar to wabt (C++) or wasm-tools (Rust), but in pure, zero-dependency Go. watgo comes with a CLI and a Go API to parse WAT (WebAssembly Text), validate it, and encode it into WASM binaries; it also supports decoding WASM from its binary format. At the center of it all is wasmir - a semantic representation of a WebAssembly module that users can examine (and manipulate). This diagram shows the functionalities provided by watgo: watgo comes with a CLI, which you can install by issuing this command: The CLI aims to be compatible with wasm-tools [1], and I've already switched my wasm-wat-samples projects to use it; e.g. a command to parse a WAT file, validate it and encode it into binary format: wasmir semantically represents a WASM module with an API that's easy to work with. Here's an example of using watgo to parse a simple WAT program and do some analysis: One important note: the WAT format supports several syntactic niceties that are flattened / canonicalized when lowered to wasmir. For example, all folded instructions are lowered to unfolded ones (linear form), function & type names are resolved to numeric indices, etc. This matches the validation and execution semantics of WASM and its binary representation. These syntactic details are present in watgo in the textformat package (which parses WAT into an AST) and are removed when this is lowered to wasmir. The textformat package is kept internal at this time, but in the future I may consider exposing it publicly - if there's interest. Even though it's still early days for watgo, I'm reasonably confident in its correctness due to a strategy of very heavy testing right from the start. WebAssembly comes with a large official test suite, which is perfect for end-to-end testing of new implementations.
The core test suite includes almost 200K lines of WAT files that carry several modules with expected execution semantics and a variety of error scenarios exercised. These live in specially designed .wast files and leverage a custom spec interpreter. watgo hijacks this approach by using the official test suite for its own testing. A custom harness parses .wast files and uses watgo to convert the WAT in them to binary WASM, which is then executed by Node.js [2] ; this harness is a significant effort in itself, but it's very much worth it - the result is excellent testing coverage. watgo passes the entire WASM spec core test suite. Similarly, we leverage wabt's interp test suite which also includes end-to-end tests, using a simpler Node-based harness to test them against watgo. Finally, I maintain a collection of realistic program samples written in WAT in the wasm-wat-samples repository ; these are also used by watgo to test itself. Parse: a parser from WAT to wasmir Validate: uses the official WebAssembly validation semantics to check that the module is well formed and safe Encode: emits wasmir into WASM binary representation Decode: read WASM binary representation into wasmir
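To make the "binary format" side of the encode/decode steps a bit more concrete, here is a minimal sketch of the very first thing any WASM decoder has to check: the 8-byte module preamble (the `\0asm` magic followed by a little-endian version number). This does not use watgo's API, which is not shown in this excerpt; the function name `wasmVersion` is hypothetical and only the standard library is used.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"log"
)

// wasmMagic is the 4-byte preamble that starts every WebAssembly binary.
var wasmMagic = []byte{0x00, 'a', 's', 'm'}

// wasmVersion (a hypothetical helper, not part of watgo) validates the
// 8-byte module header and returns the binary format version.
func wasmVersion(b []byte) (uint32, error) {
	if len(b) < 8 {
		return 0, errors.New("too short for a WASM header")
	}
	if !bytes.Equal(b[:4], wasmMagic) {
		return 0, errors.New("missing \\0asm magic")
	}
	// The version field is a 4-byte little-endian integer.
	return binary.LittleEndian.Uint32(b[4:8]), nil
}

func main() {
	// The smallest module, "(module)" in WAT, encodes to just the header.
	empty := []byte{0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00}
	v, err := wasmVersion(empty)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("WASM binary version:", v) // prints "WASM binary version: 1"
}
```

A real decoder continues from here by reading the sections that follow the header; a toolkit like the one described above exists to handle that part.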

Filippo Valsorda 1 month ago

A Cryptography Engineer’s Perspective on Quantum Computing Timelines

My position on the urgency of rolling out quantum-resistant cryptography has changed compared to just a few months ago. You might have heard this privately from me in the past weeks, but it’s time to signal and justify this change of mind publicly.

There had been rumors for a while of expected and unexpected progress towards cryptographically-relevant quantum computers, but over the last week we got two public instances of it. First, Google published a paper revising down dramatically the estimated number of logical qubits and gates required to break 256-bit elliptic curves like NIST P-256 and secp256k1, which makes the attack doable in minutes on fast-clock architectures like superconducting qubits. They weirdly 1 frame it around cryptocurrencies and mempools and salvaged goods or something, but the far more important implications are practical WebPKI MitM attacks. Shortly after, a different paper came out from Oratomic showing 256-bit elliptic curves can be broken in as few as 10,000 physical qubits if you have non-local connectivity, like neutral atoms seem to offer, thanks to better error correction. This attack would be slower, but even a single broken key per month can be catastrophic. They have this excellent graph on page 2 (Babbush et al. is the Google paper, which they presumably had preview access to):

Overall, it looks like everything is moving: the hardware is getting better, the algorithms are getting cheaper, the requirements for error correction are getting lower. I’ll be honest, I don’t actually know what all the physics in those papers means. That’s not my job and not my expertise. My job includes risk assessment on behalf of the users that entrusted me with their safety. What I know is what at least some actual experts are telling us. Heather Adkins and Sophie Schmieg are telling us that “quantum frontiers may be closer than they appear” and that 2029 is their deadline. 
That’s in 33 months, and no one had set such an aggressive timeline until this month. Scott Aaronson tells us that the “clearest warning that [he] can offer in public right now about the urgency of migrating to post-quantum cryptosystems” is a vague parallel with how nuclear fission research stopped happening in public between 1939 and 1940. The timelines presented at RWPQC 2026, just a few weeks ago, were much tighter than a couple years ago, and are already partially obsolete. The joke used to be that quantum computers have been 10 years out for 30 years now. Well, not true anymore, the timelines have started progressing. If you are thinking “well, this could be bad, or it could be nothing!” I need you to recognize how immediately dispositive that is. The bet is not “are you 100% sure a CRQC will exist in 2030?”, the bet is “are you 100% sure a CRQC will NOT exist in 2030?” I simply don’t see how a non-expert can look at what the experts are saying, and decide “I know better, there is in fact < 1% chance.” Remember that you are betting with your users’ lives. 2 Put another way, even if the most likely outcome was no CRQC in our lifetimes, that would be completely irrelevant, because our users don’t want just better-than-even odds 3 of being secure. Sure, papers about an abacus and a dog are funny and can make you look smart and contrarian on forums. But that’s not the job, and those arguments betray a lack of expertise . As Scott Aaronson said : Once you understand quantum fault-tolerance, asking “so when are you going to factor 35 with Shor’s algorithm?” becomes sort of like asking the Manhattan Project physicists in 1943, “so when are you going to produce at least a small nuclear explosion?” The job is not to be skeptical of things we’re not experts in, the job is to mitigate credible threats, and there are credible experts that are telling us about an imminent threat. 
In summary, it might be that in 10 years the predictions will turn out to be wrong, but at this point they might also be right soon, and that risk is now unacceptable. Concretely, what does this mean? It means we need to ship. Regrettably, we’ve got to roll out what we have. 4 That means large ML-DSA signatures shoved in places designed for small ECDSA signatures, like X.509, with the exception of Merkle Tree Certificates for the WebPKI, which is thankfully far enough along . This is not the article I wanted to write. I’ve had a pending draft for months now explaining we should ship PQ key exchange now, but take the time we still have to adapt protocols to larger signatures, because they were all designed with the assumption that signatures are cheap. That other article is now wrong, alas: we don’t have the time if we need to be finished by 2029 instead of 2035. For key exchange, the migration to ML-KEM is going well enough but: Any non-PQ key exchange should now be considered a potential active compromise, worthy of warning the user like OpenSSH does , because it’s very hard to make sure all secrets transmitted over the connection or encrypted in the file have a shorter shelf life than three years. We need to forget about non-interactive key exchanges (NIKEs) for a while; we only have KEMs (which are only unidirectionally authenticated without interactivity) in the PQ toolkit. It makes no more sense to deploy new schemes that are not post-quantum . I know, pairings were nice. I know, everything PQ is annoyingly large. I know, we had basically just figured out how to do ECDSA over P-256 safely. I know, there might not be practical PQ equivalents for threshold signatures or identity-based encryption. Trust me, I know it stings. But it is what it is. Hybrid classic + post-quantum authentication makes no sense to me anymore and will only slow us down; we should go straight to pure ML-DSA-44. 
6 Hybrid key exchange is reasonably easy, with ephemeral keys that don’t even need a type or wire format for the composite private key, and a couple years ago it made sense to take the hedge. Authentication is not like that, and even with draft-ietf-lamps-pq-composite-sigs-15 with its 18 composite key types nearing publication, we’d waste precious time collectively figuring out how to treat these composite keys and how to expose them to users. It’s also been two years since Kyber hybrids and we’ve gained significant confidence in the Module-Lattice schemes. Hybrid signatures cost time and complexity budget, 5 and the only benefit is protection if ML-DSA is classically broken before the CRQCs come , which looks like the wrong tradeoff at this point. In symmetric encryption , we don’t need to do anything, thankfully. There is a common misconception that protection from Grover requires 256-bit keys, but that is based on an exceedingly simplified understanding of the algorithm . A more accurate characterization is that with a circuit depth of 2⁶⁴ logical gates (the approximate number of gates that current classical computing architectures can perform serially in a decade) running Grover on a 128-bit key space would require a circuit size of 2¹⁰⁶. There’s been no progress on this that I am aware of, and indeed there are old proofs that Grover is optimal and its quantum speedup doesn’t parallelize . Unnecessary 256-bit key requirements are harmful when bundled with the actually urgent PQ requirements, because they muddle the interoperability targets and they risk slowing down the rollout of asymmetric PQ cryptography. In my corner of the world, we’ll have to start thinking about what it means for half the cryptography packages in the Go standard library to be suddenly insecure, and how to balance the risk of downgrade attacks and backwards compatibility. 
It’s the first time in our careers we’ve faced anything like this: SHA-1 to SHA-256 was not nearly this disruptive, 7 and even that took forever with the occasional unexpected downgrade attack. Trusted Execution Environments (TEEs) like Intel SGX and AMD SEV-SNP and in general hardware attestation are just f***d. All their keys and roots are not PQ and I heard of no progress in rolling out PQ ones, which at hardware speeds means we are forced to accept they might not make it, and can’t be relied upon. I had to reassess a whole project because of this, and I will probably downgrade them to barely “defense in depth” in my toolkit. Ecosystems with cryptographic identities (like atproto and, yes, cryptocurrencies) need to start migrating very soon, because if the CRQCs come before they are done , they will have to make extremely hard decisions, picking between letting users be compromised and bricking them. File encryption is especially vulnerable to store-now-decrypt-later attacks, so we’ll probably have to start warning and then erroring out on non-PQ age recipient types soon. It’s unfortunately only been a few months since we even added PQ recipients, in version 1.3.0 . 8 Finally, this week I started teaching a PhD course in cryptography at the University of Bologna, and I’m going to mention RSA, ECDSA, and ECDH only as legacy algorithms, because that’s how those students will encounter them in their careers. I know, it feels weird. But it is what it is. For more willing-or-not PQ migration, follow me on Bluesky at @filippo.abyssdomain.expert or on Mastodon at @[email protected] . Traveling back from an excellent AtmosphereConf 2026 , I saw my first aurora, from the north-facing window of a Boeing 747. My work is made possible by Geomys , an organization of professional Go maintainers, which is funded by Ava Labs , Teleport , Tailscale , and Sentry . 
Through our retainer contracts they ensure the sustainability and reliability of our open source maintenance work and get a direct line to my expertise and that of the other Geomys maintainers. (Learn more in the Geomys announcement .) Here are a few words from some of them! Teleport — For the past five years, attacks and compromises have been shifting from traditional malware and security breaches to identifying and compromising valid user accounts and credentials with social engineering, credential theft, or phishing. Teleport Identity is designed to eliminate weak access patterns through access monitoring, minimize attack surface with access requests, and purge unused permissions via mandatory access reviews. Ava Labs — We at Ava Labs , maintainer of AvalancheGo (the most widely used client for interacting with the Avalanche Network ), believe the sustainable maintenance and development of open source cryptographic protocols is critical to the broad adoption of blockchain technology. We are proud to support this necessary and impactful work through our ongoing sponsorship of Filippo and his team. The whole paper is a bit goofy: it has a zero-knowledge proof for a quantum circuit that will certainly be rederived and improved upon before the actual hardware to run it on will exist. They seem to believe this is about responsible disclosure, so I assume this is just physicists not being experts in our field in the same way we are not experts in theirs.  ↩ “You” is doing a lot of work in this sentence, but the audience for this post is a bit unusual for me: I’m addressing my colleagues and the decision-makers that gate action on deployment of post-quantum cryptography.  ↩ I had a reviewer object to an attacker probability of success of 1/536,870,912 (0.0000002%, 2⁻²⁹) after 2⁶⁴ work, correctly so, because in cryptography we usually target 2⁻³².  ↩ Why trust the new stuff, though? There are two parts to it: the math and the implementation. 
The math is also not my job, so I again defer to experts like Sophie Schmieg, who tells us that she is very confident in lattices, and the NSA, who approved ML-KEM and ML-DSA at the Top Secret level for all national security purposes. It is also older than elliptic curve cryptography was when it first got deployed. (“Doesn’t the NSA lie to break our encryption?” No, the NSA has never intentionally jeopardized US national security with a non-NOBUS backdoor, and there is no way for ML-KEM and ML-DSA to hide a NOBUS backdoor.) On the implementation side, I am actually very qualified to have an opinion, having made cryptography implementation and testing my niche. ML-KEM and ML-DSA are a lot easier to implement securely than their classical alternatives, and with the better testing infrastructure we have now I expect to see exceedingly few bugs in their implementations.  ↩ One small exception is that if you already have the ability to convey multiple signatures from multiple public keys in your protocol, it can make sense to do “poor man’s hybrid signatures” by just requiring 2-of-2 signatures from one classical public key and one pure PQ key. Some of the tlog ecosystem might pick this route, but that’s only because the cost is significantly lowered by the existing support for nested n-of-m signing groups.  ↩ Why ML-DSA-44 when we usually use ML-KEM-768 instead of ML-KEM-512? Because ML-KEM-512 is Level 1, while ML-DSA-44 is Level 2, so it already has a bit of margin against minor cryptanalytic improvements.  ↩ Because SHA-256 is a better plug-in replacement for SHA-1, because SHA-1 was a much smaller surface than all of RSA and ECC, and because SHA-1 was not that broken: it still retained preimage resistance and could still be used in HMAC and HKDF.  
↩ The delay was in large part due to my unfortunate decision of blocking on the availability of HPKE hybrid recipients, which blocked on the CFRG, which took almost two years to select a stable label string for X-Wing (January 2024) with ML-KEM (August 2024), despite making precisely no changes to the designs. The IETF should have an internal post-mortem on this, but I doubt we’ll see one.  ↩


Stamp It! All Programs Must Report Their Version

Recently, during a production incident response, I guessed the root cause of an outage correctly within less than an hour (cool!) and submitted a fix just to rule it out, only to then spend many hours fumbling in the dark because we lacked visibility into version numbers and rollouts… 😞 This experience made me think about software versioning again, or more specifically about build info (build versioning, version stamping, however you want to call it) and version reporting. I realized that for the i3 window manager, I had solved this problem well over a decade ago, so it was really unexpected that the problem was decidedly not solved at work. In this article, I’ll explain how 3 simple steps (Stamp it! Plumb it! Report it!) are sufficient to save you hours of delays and stress during incident response. Every household appliance has incredibly detailed versioning! Consider this dishwasher: (Thank you Feuermurmel for sending me this lovely example!) I observed a couple household appliance repairs and am under the impression that if a repair person cannot identify the appliance, they would most likely refuse to even touch it. So why are our standards so low in computers, in comparison? Sure, consumer products are typically versioned somehow and that’s typically good enough (except for, say, USB 3.2 Gen 1×2!). But recently, I have encountered too many developer builds that were not adequately versioned! Unlike a physical household appliance with a stamped metal plate, software is constantly updated and runs in places and structures we often cannot even see. Let’s dig into what we need to increase our versioning standard! Usually, software has a name and some version number of varying granularity: All of these identify the Chrome browser on my computer, but each at different granularity. All are correct and useful, depending on the context. 
Here’s an example for each: After creating the i3 window manager , I quickly learned that for user support, it is very valuable for programs to clearly identify themselves. Let me illustrate with the following case study. When running , you will see output like this: Each word was carefully deliberated and placed. Let me dissect: When doing user support, there are a couple of questions that are conceptually easy to ask the affected user and produce very valuable answers for the developer: Based on my experiences with asking these questions many times, I noticed a few patterns in how these debugging sessions went. In response, I introduced another way for i3 to report its version in i3 v4.3 (released in September 2012): a flag! Now I could ask users a small variation of the first question: What is the output of ? Note how this also transfers well over spoken word, for example at a computer meetup: Michael: Which version are you using? User: How can I check? Michael: Run this command: User: It says 4.24. Michael: Good, that is recent enough to include the bug fix. Now, we need more version info! Run please and tell me what you see. When you run , it does not just report the version of the i3 program you called, it also connects to the running i3 window manager process in your X11 session using its IPC (interprocess communication) interface and reports the running i3 process’s version, alongside other key details that are helpful to show the user, like which configuration file is loaded and when it was last changed: This might look like a lot of detail on first glance, but let me spell out why this output is such a valuable debugging tool: Connecting to i3 via the IPC interface is an interesting test in and of itself. If a user sees output, that implies they will also be able to run debugging commands like (for example) to capture the full layout state. 
During a debugging session, running is an easy check to see if the version you just built is actually effective (see the line). Showing the full path to the loaded config file will make it obvious if the user has been editing the wrong file. If the path alone is not sufficient, the modification time (displayed both absolute and relative) will flag editing the wrong file. I use NixOS, BTW, so I automatically get a stable identifier ( ) for the specific build of i3. To see the build recipe (“derivation” in Nix terminology) which produced this Nix store output ( ), I can run : Unfortunately, I am not aware of a way to go from the derivation to the source, but at least one can check that a certain source results in an identical derivation. The versioning I have described so far is sufficient for most users, who will not be interested in tracking intermediate versions of software, but only the released versions. But what about developers, or any kind of user who needs more precision? When building i3 from git, it reports the git revision it was built from, using : A modified working copy gets represented by a after the revision: Reporting the git revision (or VCS revision, generally speaking) is the most useful choice. This way, we catch the following common mistakes: As we have seen above, the single most useful piece of version information is the VCS revision. We can fetch all other details (version numbers, dates, authors, …) from the VCS repository. Now, let’s demonstrate the best case scenario by looking at how Go does it! Go has become my favorite programming language over the years, in big part because of the good taste and style of the Go developers, and of course also because of the high-quality tooling: I strive to respect everybody’s personal preferences, so I usually steer clear of debates about which is the best programming language, text editor or operating system. 
However, recently I was asked a couple of times why I like and use a lot of Go, so here is a coherent article to fill in the blanks of my ad-hoc in-person ramblings :-). Read more → Therefore, I am pleased to say that Go implements the gold standard with regard to software versioning: it stamps VCS buildinfo by default! 🥳 This was introduced in Go 1.18 (March 2022) : Additionally, the go command embeds information about the build, including build and tool tags (set with -tags), compiler, assembler, and linker flags (like -gcflags), whether cgo was enabled, and if it was, the values of the cgo environment variables (like CGO_CFLAGS). Both VCS and build information may be read together with module information using or runtime/debug.ReadBuildInfo (for the currently running binary) or the new debug/buildinfo package. Note: Before Go 1.18, the standard approach was to use or similar explicit injection. This setup works (and can still be seen in many places) but requires making changes to the application code, whereas the Go 1.18+ stamping requires no extra steps. What does this mean in practice? Here is a diagram for the common case: building from git: This covers most of my hobby projects! Many tools I just , or if I want to easily copy them around to other computers. Although, I am managing more and more of my software in NixOS. When I find a program that is not yet fully managed, I can use and the tool to identify it: It’s very cool that Go does the right thing by default! Systems that consist of 100% Go software (like my gokrazy Go appliance platform ) are fully stamped! For example, the gokrazy web interface shows me exactly which version and dependencies went into the build on my scan2drive appliance . 
Despite being fully stamped, note that gokrazy only shows the module versions, and no VCS buildinfo, because it currently suffers from the same gap as Nix: For the gokrazy packer, which follows a rolling release model (no version numbers), I ended up with a few lines of Go code (see below) to display a git revision, no matter if you installed the packer from a Go module or from a git working copy. The code either displays (the easy case; built from git) or extracts the revision from the Go module version of the main module ( ): What are the other cases? These examples illustrate the scenarios I usually deal with: This is what it looks like in practice: But a version built from git has the full revision available (→ you can tell them apart): When packaging Go software with Nix, it’s easy to lose Go VCS revision stamping: So the fundamental tension here is between reproducibility and VCS stamping. Luckily, there is a solution that works for both: I created the Nix overlay module that you can import to get working Go VCS revision stamping by default for your Nix expressions! Tip: If you are not a Nix user, feel free to skip over this section. I included it in this article so that you have a full example of making VCS stamping work in the most complicated environments. Packaging Go software in Nix is pleasantly straightforward. For example, the Go Protobuf generator plugin is packaged in Nix with <30 lines: official nixpkgs package.nix . You call , supply as the result from and add a few lines of metadata. But getting developer builds fully stamped is not straightforward at all! When packaging my own software, I want to package individual revisions (developer builds), not just released versions. I use the same , or if I need the latest Go version. Instead of using , I provide my sources using Flakes, usually also from GitHub or from another Git repository. 
For example, I package like so: The comes from my : Go stamps all builds, but it does not have much to stamp here: Here’s a full example of gokrazy/bull: To fix VCS stamping, add my overlay to your : (If you are using , like I am, you need to apply the overlay in both places.) After rebuilding, your Go binaries should newly be stamped with buildinfo: Nice! 🥳 But… how does it work? When does it apply? How do you know how to fix your config? I’ll show you the full diagram first, and then explain how to read it: There are 3 relevant parts of the Nix stack that you can end up in, depending on what you write into your files: For the purpose of VCS revision stamping, you should: Hence, we will stick to the left-most column: fetchers. Unfortunately, by default, with fetchers, the VCS revision information, which is stored in a Nix attrset (in-memory, during the build process), does not make it into the Nix store, hence, when the Nix derivation is evaluated and Go compiles the source code, Go does not see any VCS revision. My Nix overlay module fixes this, and enabling the overlay is how you end up in the left-most lane of the above diagram: the happy path, where your Go binaries are now stamped! How does the overlay work? It functions as an adapter between Nix and Go: So the overlay implements 3 steps to get Go to stamp the correct info: For the full source, see . See Go issue #77020 and Go issue #64162 for a cleaner approach to fixing this gap: allowing package managers to invoke the Go tool with the correct VCS information injected. This would allow Nix (or also gokrazy) to pass along buildinfo cleanly, without the need for workarounds like my adapter . At the time of writing, issue #77020 does not seem to have much traction and is still open. My argument is simple: Stamping the VCS revision is conceptually easy, but very important! 
For example, if the production system from the incident I mentioned had reported its version, we would have saved multiple hours of mitigation time! Unfortunately, many environments only identify the build output (useful, but orthogonal), but do not plumb the VCS revision (much more useful!), or at least not by default.

Your action plan to fix it is just 3 simple steps:

Stamp it! Include the source VCS revision in your programs. This is not a new idea: i3 builds have included their revision since 2012!
Plumb it! When building / packaging, ensure the VCS revision does not get lost. My “VCS rev with NixOS” case study section above illustrates several reasons why the VCS rev could get lost, which paths can work and how to fix the missing plumbing.
Report it! Make your software print its VCS revision on every relevant surface, for example:
    Executable programs: Report the VCS revision when run with . For Go programs, you can always use .
    Services and batch jobs: Include the VCS revision in the startup logs.
    Outgoing HTTP requests: Include the VCS revision in the
    HTTP responses: Include the VCS revision in a header (internally)
    Remote Procedure Calls (RPCs): Include the revision in RPC metadata
    User Interfaces: Expose the revision somewhere visible for debugging.

Implementing “version observability” throughout your system is a one-day high-ROI project. With my Nix example, you saw how the VCS revision is available throughout the stack, but can get lost in the middle. Hopefully my resources help you quickly fix your stack(s), too:

My overlay for Nix / NixOS
My repository is a community resource to collect examples (as markdown content) and includes a Go module with a few helpers to make version reporting trivial.

Now go stamp your programs and data transfers! 🚀

Chrome 146.0.7680.80
Chrome f08938029c887ea624da7a1717059788ed95034d-refs/branch-heads/7680_65@{#34}

“This works in Chrome for me, did you test in Firefox?”
“Chrome 146 contains broken middle-click-to-paste-and-navigate”
“I run Chrome 146.0.7680.80 and cannot reproduce your issue”
“Apply this patch on top of Chrome f08938029c887ea624da7a1717059788ed95034d-refs/branch-heads/7680_65@{#34} and follow these steps to reproduce: […]”

: I could have shortened this to or maybe , but I figured it would be helpful to be explicit because is such a short name. Users might mumble aloud “What’s an i-3-4-2-4?”, but when putting “version” in there, the implication is that i3 is some computer thing (→ a computer program) that exists in version 4.24.
is the release date so that you can immediately tell if “ ” is recent.
signals when the project was started and who is the main person behind it.
gives credit to the many people who helped. i3 was never a one-person project; it was always a group effort.

Question: “Which version of i3 are you using?” Since i3 is not a typical program that runs in a window (but a window manager / desktop environment), there is no Help → About menu option. Instead, we started asking: What is the output of ?

Question: “Are you reporting a new issue or a preexisting issue? To confirm, can you try going back to the version of i3 you used previously?” The technical terms for “going back” are downgrade, rollback or revert. Depending on the Linux distribution, this is either trivial or a nightmare. With NixOS, it’s trivial: you just boot into an older system “generation” by selecting that version in the bootloader. Or you revert in git, if your configs are version-controlled. With imperative Linux distributions like Debian Linux or Arch Linux, if you did not take a file system-level snapshot, there is no easy and reliable way to go back after upgrading your system. If you are lucky, you can just the older version of i3. But you might run into dependency conflicts (“version hell”). I know that it is possible to run older versions of Debian using snapshot.debian.org, but it is just not very practical, at least when I last tried.

Can you check if the issue is still present in the latest i3 development version? Of course, I could also try reproducing the user issue with the latest release version, and then one additional time on the latest development version. But this way, the verification step moves to the affected user, which is good because it filters for highly-motivated bug reporters (higher chance the bug report actually results in a fix!) and it makes the user reproduce the bug twice, figuring out if it’s a flaky issue, hard-to-reproduce, if the reproduction instructions are correct, etc.

A natural follow-up question: “Does this code change make the issue go away?” This is easy to test for the affected user who now has a development environment.

Connecting to i3 via the IPC interface is an interesting test in and of itself. If a user sees output, that implies they will also be able to run debugging commands like (for example) to capture the full layout state.
During a debugging session, running is an easy check to see if the version you just built is actually effective (see the line). Note that this is the same check that is relevant during production incidents: verifying that the “effectively running” version matches the “supposed to be running” version.
Showing the full path to the loaded config file will make it obvious if the user has been editing the wrong file. If the path alone is not sufficient, the modification time (displayed both absolute and relative) will flag editing the wrong file.

People build from the wrong revision.
People build, but forget to install.
People install, but their session does not pick it up (wrong location?).

Nix fetchers like are implemented by fetching an archive ( ) file from GitHub. The full repository is not transferred, which is more efficient.
Even if a repository is present, Nix usually intentionally removes it for reproducibility: directories contain packed objects that change across runs (for example), which would break reproducible builds (different hash for the same source).
We build from a directory, not a Go module, so the module version is .
The stamped buildinfo does not contain any information.

Fetchers. These are what Flakes use, but also non-Flake use-cases.
Fixed-output derivations (FOD). This is how is implemented, but the constant hash churn (updating the line) inherent to FODs is annoying.
Copiers. These just copy files into the Nix store and are not git-aware.

Avoid the Copiers! If you use Flakes:
    ❌ do not use as a Flake input
    ✅ use instead for git awareness
I avoid the fixed-output derivation (FOD) as well. Fetching the git repository at build time is slow and inefficient. Enabling , which is needed for VCS revision stamping with this approach, is even more inefficient because a new Git repository must be constructed deterministically to keep the FOD reproducible.

Nix tracks the VCS revision in the in-memory attrset.
Go expects to find the VCS revision in a repository, accessed via file access and commands.

It synthesizes a file so that Go’s detects a git repository.
It injects a command into the that implements exactly the two commands used by Go and fails loudly on anything else (in case Go updates its implementation).
It sets in the environment variable.

0 views
Jason Scheirer 1 month ago

Golang Webview Installer for Wails 3

Top Matter: Codeberg for the library, doc for the library.

I’ve forked Lea Anthony’s library that eventually made its way into core Wails for two reasons:

I want it in Wails 3 and it’s not there
I want to shave a meg off the binary size by not providing the embedded installer exe

So here we are.

0 views
Anton Zhiyanov 1 month ago

Porting Go's strings package to C

Creating a subset of Go that translates to C was never my end goal. I liked writing C code with Go, but without the standard library it felt pretty limited. So, the next logical step was to port Go's stdlib to C.

Of course, this isn't something I could do all at once. I started with the io package, which provides core abstractions like and , as well as general-purpose functions like . But isn't very interesting on its own, since it doesn't include specific reader or writer implementations. So my next choices were naturally and , the workhorses of almost every Go program. This post is about how the porting process went.

Bits and UTF-8 • Bytes • Allocators • Buffers and builders • Benchmarks • Optimizing search • Optimizing builder • Wrapping up

Before I could start porting , I had to deal with its dependencies first:

implements bit counting and manipulation functions.
implements functions for UTF-8 encoded text.

Both of these packages are made up of pure functions, so they were pretty easy to port. The only minor challenge was the difference in operator precedence between Go and C, specifically bit shifts ( , ). In Go, bit shifts have higher precedence than addition and subtraction. In C, they have lower precedence:

The simplest solution was to just use parentheses everywhere shifts are involved:

With and done, I moved on to . The package provides functions for working with byte slices:

Some of them were easy to port, like . Here's how it looks in Go:

And here's the C version:

Just like in Go, the ( → ) macro doesn't allocate memory; it just reinterprets the byte slice's underlying storage as a string. The function (which works like in Go) is easy to implement using from the libc API.

Another example is the function, which looks for a specific byte in a slice. Here's the pure-Go implementation:

And here's the C version:

I used a regular C loop to mimic Go's :

Loop over the slice indexes with ( is a macro that returns , similar to Go's built-in).
Access the i-th byte with (a bounds-checking macro that returns ).

But and don't allocate memory. What should I do with , since it clearly does? I had a decision to make.

The Go runtime handles memory allocation and deallocation automatically. In C, I had a few options:

Use a reliable garbage collector like Boehm GC to closely match Go's behavior.
Allocate memory with libc's and have the caller free it later with .
Introduce allocators.

An allocator is a tool that reserves memory (typically on the heap) so a program can store its data structures there. See Allocators from C to Zig if you want to learn more about them.

For me, the winner was clear. Modern systems programming languages like Zig and Odin clearly showed the value of allocators:

It's obvious whether a function allocates memory or not: if it has an allocator as a parameter, it allocates.
It's easy to use different allocation methods: you can use for one function, an arena for another, and a stack allocator for a third.
It helps with testing and debugging: you can use a tracking allocator to find memory leaks, or a failing allocator to test error handling.

An is an interface with three methods: , , and . In C, it translates to a struct with function pointers:

As I mentioned in the post about porting the io package, this interface representation isn't as efficient as using a static method table, but it's simpler. If you're interested in other options, check out the post on interfaces.

By convention, if a function allocates memory, it takes an allocator as its first parameter. So Go's :

Translates to this C code:

If the caller doesn't care about using a specific allocator, they can just pass an empty allocator, and the implementation will use the system allocator: , , and from libc.

Here's a simplified version of the system allocator (I removed safety checks to make it easier to read):

The system allocator is stateless, so it's safe to have a global instance:

Here's an example of how to call with an allocator:

Way better than hidden allocations!

Besides pure functions, and also provide types like , , and . I ported them using the same approach as with functions. For types that allocate memory, like , the allocator becomes a struct field:

The code is pretty wordy; most C developers would dislike using instead of something shorter like . My solution to this problem is to automatically translate Go code to C (which is actually what I do when porting Go's stdlib). If you're interested, check out the post about this approach, Solod: Go can be a better C.

Types that don't allocate, like , need no special treatment; they translate directly to C structs without an allocator field.

The package is the twin of , so porting it was uneventful. Here's a usage example in Go and C side by side:

Again, the C code is just a more verbose version of Go's implementation, plus explicit memory allocation.

What's the point of writing C code if it's slow, right? I decided it was time to benchmark the ported C types and functions against their Go versions.

To do that, I ported the benchmarking part of Go's package. Surprisingly, the simplified version was only 300 lines long and included everything I needed:

Figuring out how many iterations to run.
Running the benchmark function in a loop.
Recording metrics (ns/op, MB/s, B/op, allocs/op).
Reporting the results.

Here's a sample benchmark for the type:

Reads almost like Go's benchmarks.

To monitor memory usage, I created , a memory allocator that wraps another allocator and keeps track of allocations:

The benchmark gets an allocator through the function and wraps it in a to keep track of allocations:

There's no auto-discovery, but the manual setup is quite straightforward.

With the benchmarking setup ready, I ran benchmarks on the package. Some functions did well, about 1.5-2x faster than their Go equivalents:

But (searching for a substring in a string) was a total disaster; it was nearly 20 times slower than in Go:

The problem was caused by the function we looked at earlier:

This "pure" Go implementation is just a fallback. On most platforms, Go uses a specialized version of written in assembly. For the C version, the easiest solution was to use , which is also optimized for most platforms:

With this fix, the benchmark results changed drastically:

Still not quite as fast as Go, but it's close. Honestly, I don't know why the -based implementation is still slower than Go's assembly here, but I decided not to pursue it any further.

After running the rest of the function benchmarks, the ported versions won all of them except for two:

is a common way to compose strings from parts in Go, so I tested its performance too. The results were worse than I expected:

Here, the C version performed about the same as Go, but I expected it to be faster. Unlike , is written entirely in Go, so there's no reason the ported version should lose in this benchmark.

The method looked almost identical in Go and C:

Go's automatically grows the backing slice, while does it manually ( , on the contrary, doesn't grow the slice; it's merely a wrapper). So, there shouldn't be any difference. I had to investigate.

Looking at the compiled binary, I noticed a difference in how the functions returned results. Go returns multiple values in separate registers, so uses three registers: one for 8-byte , two for the interface (implemented as two 8-byte pointers). But in C, was a single struct made up of two unions and a pointer:

Of course, this 56-byte monster can't be returned in registers; the C calling convention passes it through memory instead. Since is on the hot path in the benchmark, I figured this had to be the issue. So I switched from a single monolithic type to signature-specific types for multi-return pairs:

Now, the implementation in C looked like this:

is only 16 bytes, small enough to be returned in two registers. Problem solved! But it wasn't: the benchmark only showed a slight improvement.

After looking into it more, I finally found the real issue: unlike Go, the C compiler wasn't inlining calls. Adding and moving to the header file made all the difference:

2-4x faster. That's what I was hoping for!

Porting and was a mix of easy parts and interesting challenges. The pure functions were straightforward: just translate the syntax and pay attention to operator precedence. The real design challenge was memory management. Using allocators turned out to be a good solution, making memory allocation clear and explicit without being too difficult to use.

The benchmarks showed that the C versions outperformed Go in most cases, sometimes by 2-4x. The only exceptions were and , where Go relies on hand-written assembly. The optimization was an interesting challenge: what seemed like a return-type issue was actually an inlining problem, and fixing it gave a nice speed boost.

There's a lot more of Go's stdlib to port. In the next post, we'll cover , a very unique Go package. In the meantime, if you'd like to write Go that translates to C (with no runtime and manual memory management), I invite you to try Solod. The and packages are included, of course.
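To make the operator-precedence point from this post concrete, here is a small Go sketch. In Go, shifts bind tighter than addition, so `x<<s + k` means `(x << s) + k`. C parses the same unparenthesized expression as `x << (s + k)`, which the second function spells out explicitly. The function names are made up for illustration.

```go
package main

import "fmt"

// goPrecedence relies on Go's rule that << binds tighter than +.
func goPrecedence(x, s, k uint) uint {
	return x<<s + k // parsed as (x << s) + k
}

// cPrecedence writes out what C computes for an unparenthesized `x << s + k`.
func cPrecedence(x, s, k uint) uint {
	return x << (s + k)
}

func main() {
	fmt.Println(goPrecedence(1, 2, 1)) // prints 5: (1 << 2) + 1
	fmt.Println(cPrecedence(1, 2, 1))  // prints 8: 1 << (2 + 1)
}
```

Parenthesizing every shift, as the post suggests, makes a translated expression mean the same thing in both languages.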

0 views
Taranis 1 month ago

Go has some tricks up its logging sleeve

Since it's more or less TDOV (IYKYK...), I'm going to talk about logging instead. Logging isn't exactly the most shiny or in-your-face thing that coders tend to think about, but it really can make or break large systems. Throwing in a few print statements (or fmt.Printf, or whatever) only scratches the surface.

I'm mostly talking about my own logging library here. If there's interest, I'd consider releasing it as open source, but it's currently a bit of a moving target. Feel free to comment if you think you'd find it useful, and I'll try to find the time to split it out from the Euravox codebase and put it on GitHub.

The Go programming language ships with logging capabilities in the standard library, found in the log package. If you don't have any better alternatives, using that package rather than raw fmt.Printf is far preferable. My own logging package is a bit nicer. It's not my first – one of my first jobs working in financial markets data systems back in the 90s was the logging subsystem for the Reuter Workstation, and there is some influence from that 30-odd years later in my library.

One of the first things I always recommend is breaking out log messages by log level. I currently define the following:

SPM -- Spam messages. Very verbose logging, not something you'd normally use, but the kind of thing that makes all the difference doing detailed debugging.
INF -- Information messages. These are intended to be low volume, used to help trace what systems are doing, but not actually representing an error (i.e., they explicitly are used to log normal behaviour).
WRN -- Warning messages. What it says on the tin. Something is possibly wonky, but not bad enough to be an actual error. Real production systems should have as close to zero of these things as possible -- something should either be normal (INF) or an actual error (ERR).
ERR -- Error messages. This represents recoverable errors. Something bad happened, but the code can keep running without risk.
FTL -- Fatal errors. These errors show that something very bad has happened, and that the code must abort immediately. There are two cases where this is appropriate. One is when something catastrophic has happened -- system has run out of handles, process is OOMing, etc. The second is where a serious logic bug has been detected. Though in some cases ERR can be OK for this, aborting makes it easier to spot that processes in production are badly broken (e.g., after a bad push), and need to be rolled back.

It's possible to set a configuration parameter that limits logging at a particular level. This makes it possible to crank logging all the way up for tests, but dial it down for production without changing the code or having to introduce if/then guards around the logging. It was a finding back in the 90s that systems would sometimes break when you took the logging out – this isn't something that's normally a problem with Go, because idiomatic code doesn't tend to have too many side-effects, but it was quite noticeable with C++. Of course, the library doesn't do the string formatting if the level is disabled, but any parameters are still evaluated, which tends to be a less risky approach.

It's common to send log messages to stdout or stderr. There's nothing fundamentally wrong with this, but I find it useful to have deeper capabilities than this. My own library has three options, which can be used together (and with different log levels):

stdout. Nothing special here, but I do have the option to send colour control codes for terminals that support it, which makes logs much more readable.
Files. This is similar to piping the process through the tee command, but has the advantage that things like log rotation can be built in. I need to get around to supporting log rotation, but file output works now.
Circular buffer. This is the one you don't see often. The idea here is you maintain an in-RAM circular buffer of N lines (say about 5000), which can be exposed via code. I use this to provide an HTTP/HTML interface that makes it possible to watch log output on a process via a web browser. This is a godsend when you have a large number of processes running across multiple VMs and/or physical machines.

Any good logging solution should be able to include file name and line number information in log output. Using an IDE like VS Code, this allows control/command-clicking a log entry and immediately seeing the code that generated it. C and C++ support this via some fancy #define stunts. Go lacks this kind of preprocessor, but actually has something far better: the runtime.Caller() library function. This makes it possible to pull back the file name and line number (and program counter if you care) anywhere up the call stack.

This code fragment comes from my logging function. The argument to Caller is typically 2, because this code is called from one of many convenience functions for syntactic sugar.

Typical log commands look something like this:

The logging library will automatically pick up the file paths and line numbers where the log commands are located. However, this isn't always useful, and sometimes can be a complete nightmare. Here's a small example:

In this case, the file name and line number that will be logged will be where the command is located. This can be absolutely maddening if has many call sites, because they will look exactly the same in the log.

My logging library has a small tweak that I've not seen elsewhere – I'm not claiming invention or ownership, because it's so obviously useful that I'd be shocked if nobody else has ever done it. It's just I've not personally seen it. Anyway, here goes:

In this case, works similarly to , but it takes an extra parameter at the start, which represents how many extra stack frames to look through to find the filename and line number. The parameter returns the filename and line number of the immediate caller, so the thing that makes its way into the log is the location of the calls, not the logging calls themselves.

This might seem to be a subtle difference, but the practical consequences are huge – get this right, and logs become useful traces of activity that make it possible to look backwards in time to see when particular data items have been acted upon, and exactly by what code. Almost as good as single-stepping with a debugger, but can be done after the fact.

Anyway, in conclusion, trans women are women, trans men are men, nonbinary and all other variant identities are valid. And fuck fascism.
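The call-site idea from this post can be sketched with the real `runtime.Caller` function. This is not the author's library: `emit`, `Infof`, `InfofSkip`, `sink`, and the two helper functions are hypothetical names. Only the level names and the "Caller argument is typically 2" detail come from the post.

```go
package main

import (
	"fmt"
	"path/filepath"
	"runtime"
)

type level int

const (
	SPM level = iota
	INF
	WRN
	ERR
	FTL
)

var levelNames = [...]string{"SPM", "INF", "WRN", "ERR", "FTL"}

// entry is one log record, including the call site that produced it.
type entry struct {
	lvl  level
	file string
	line int
	msg  string
}

// minLevel drops everything below it; here SPM is disabled.
var minLevel = INF

// sink receives finished entries; it is a variable so it can be swapped out.
var sink = func(e entry) {
	fmt.Printf("%s %s:%d %s\n", levelNames[e.lvl], filepath.Base(e.file), e.line, e.msg)
}

// emit is the shared core. skip=2 points at the caller of the convenience
// function: frame 0 is emit itself, frame 1 is Infof, frame 2 is user code.
func emit(skip int, lvl level, format string, args ...any) {
	if lvl < minLevel {
		return // args were evaluated by the caller; only formatting is skipped
	}
	_, file, line, ok := runtime.Caller(skip)
	if !ok {
		file, line = "???", 0
	}
	sink(entry{lvl: lvl, file: file, line: line, msg: fmt.Sprintf(format, args...)})
}

func Spamf(format string, args ...any) { emit(2, SPM, format, args...) }
func Infof(format string, args ...any) { emit(2, INF, format, args...) }

// InfofSkip looks `extra` additional frames up the stack for the location.
func InfofSkip(extra int, format string, args ...any) {
	emit(2+extra, INF, format, args...)
}

// plainHelper always logs its own line, whoever calls it.
func plainHelper() { Infof("plain helper") }

// skipHelper logs the line of whoever called it.
func skipHelper() { InfofSkip(1, "skip helper") }

func main() {
	Spamf("dropped because SPM is below minLevel")
	plainHelper()
	plainHelper() // same location in the log as the call above
	skipHelper()
	skipHelper() // different location: each call site is reported separately
}
```

With `skipHelper`, two calls from different lines produce two distinguishable log locations, which is the property the post describes for helpers with many call sites.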

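The circular-buffer output described in the logging post above could be sketched as follows. The type and method names (`ring`, `Add`, `Snapshot`) are assumptions, and the HTTP side is reduced to a plain-text `ServeHTTP` method rather than the HTML interface the author mentions.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
)

// ring keeps the most recent N log lines in memory.
type ring struct {
	mu    sync.Mutex
	lines []string
	next  int  // index where the next line will be written
	full  bool // true once the buffer has wrapped around at least once
}

func newRing(n int) *ring {
	if n < 1 {
		n = 1 // avoid a zero-length buffer
	}
	return &ring{lines: make([]string, n)}
}

// Add stores a line, overwriting the oldest one when the buffer is full.
func (r *ring) Add(line string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.lines[r.next] = line
	r.next = (r.next + 1) % len(r.lines)
	if r.next == 0 {
		r.full = true
	}
}

// Snapshot returns a copy of the buffered lines, oldest first.
func (r *ring) Snapshot() []string {
	r.mu.Lock()
	defer r.mu.Unlock()
	if !r.full {
		return append([]string(nil), r.lines[:r.next]...)
	}
	out := make([]string, 0, len(r.lines))
	out = append(out, r.lines[r.next:]...)
	return append(out, r.lines[:r.next]...)
}

// ServeHTTP exposes the buffer as plain text, one line per log entry.
// It could be mounted with http.Handle("/logs", r) in a real process.
func (r *ring) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "text/plain; charset=utf-8")
	for _, line := range r.Snapshot() {
		fmt.Fprintln(w, line)
	}
}

func main() {
	r := newRing(3)
	for _, l := range []string{"a", "b", "c", "d"} {
		r.Add(l)
	}
	fmt.Println(r.Snapshot()) // prints [b c d]: "a" was overwritten
}
```

A fixed-size slice with a write index keeps memory bounded no matter how long the process runs, which is what makes it practical to leave enabled in production.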
0 views