Posts in Go (20 found)
Playtank 3 days ago

Analogue Prototyping

There is a lot to say about prototyping . Chris Hecker talked about advanced prototyping at GDC 2006, and provided a hierarchy of priorities that goes like this: Analogue prototyping comes in right away at Step 1: Don’t . By not launching straight into your game engine, you can save giant heaps of time between hypothesis and implementation. You can also figure out what kinds of references will be relevant before you reach Step 4: Gather References . There’s another side to analogue prototyping as well. In the book Challenges for Game Designers , Brenda Romero says: “A painter gets better by making lots of paintings; sculptors hone their craft by making sculptures; and game designers improve their skills by designing lots of games. […] Unfortunately, designing a complete video game (and implementing it, and then seeing all the things you did right and wrong) can take years, and we’d all like to improve at a faster rate than that.” Brenda Romero Using cards, dice, and paper leads to some of the fastest prototyping possible. It can be just ten minutes between idea and test, fitting really well into those two days of Step 2: Just Do It . Of course, it can also take weeks and require countless iterations, but that’s part of the game designer’s job after all. This post focuses on what to gain from analogue prototypes of digital games, and the practical process involved. It’s also unusually full of real work, since this is something I’ve done quite a bit for my personal projects and is therefore not under NDA. If you’re curious about something or need to tell me I’m wrong, don’t hesitate to comment or e-mail me at [email protected] . Why you should care about analogue prototyping when all you want to do is the next amazing digital game may seem like a mystery. A detour that leads to having your fingers glued together and a bunch of leftover paper clippings you can’t use for anything. In Chris Hecker’s talk, the first suggestion is that you should cheat before you put too much time into anything else. Since you will be cutting and gluing and sleeving, and some of that work takes time, this counts double with analogue prototypes. The easiest way to cheat is to use proxies. If you have a collection of boardgames, this is easy. You can also go out and buy some used games cheap or ask friends if they have some lying around that they don’t use. Perhaps that worn copy of Monopoly that almost caused a family breakup can finally get some table time again, in a different form. Aesthetics matter. If you want to take shortcuts with how a game feels to play, getting something that looks the part can be a shortcut. Go to your local Dollar Store or second hand shop and pick up some plastic toys or a game with miniatures that are similar to what you are after. They can merely be there to act as center pieces for your prototype. The easiest and most efficient reference board that exists is a standard chessboard. Square grid with a manageable size. You can also use a Go board, with the extra benefit that the Go beads also make for excellent proxy components. Beyond those two, you can really use any other board game board too. Just make sure to remember where you got it from if you want to play those games in the future. Or you can even pick up games with missing parts at yard sales, usually super cheap, and scavenge proxy parts from those. For some types of games, finding a good real-world map, perhaps even a tourist map or subway map, can be an excellent shortcut. Not just for wargames, but for anything with a spatial component. The guide map from a theme park or museum works, too. Packs of 52 standard playing cards are fantastic proxies. You can use suits, ladders, make face cards have a different meaning, and much more. Countless prototypes have used these excellent decks to handle anything from combat resolution to hidden information. It’s also possible to go even further, and make your own game use regular playing gards and the known poker combos as a feature. Balatro comes to mind. Many families have a Yatzy set lying around, providing you with a small handful of six-sided standard dice. You can do a lot with just this simple straightforward randomisation element. But don’t limit yourself to just six-sided dice, if you don’t have to. Get yourself a set of Dungeons & Dragons polyhedrals and you’ll have four-, eight-, ten-, twelve- and twenty-sided dice rounding out your randomisation armory. Just want to make an honorable mention of this fantasy wargame, because of its diversity. You can build all manner of strange scenery from just a core HeroScape set and use it effectively to represent almost anything. The same goes for Lego. The main issue with these kinds of proxies is that they can take a lot of space. Particularly HeroScape , since it has a predefined scale. With Lego, you just need to figure out a scale and stick to it. If there’s a game the people you will play with are especially familiar with, you can skip over having to design one of your systems by substituting a mechanic from a game you already know. Say, if you know that you will want to have statistics in your game, you can copy the traditional lineup of six abilities from Dungeons & Dragons , as well as their scale, to get started. Even if you know that you will want a different lineup later, this means you can test elements that are more unique to your game faster. An effective way to minimise cut-and-paste time is to print your cards very small. Preferably so all of them fit on a single piece of paper. They will be a bit trickier to shuffle this way, but that’s rarely an issue in testing. This way, you need less paper and you can cut everything faster. Going from eight cards to a sheet to 32 is a pretty big difference. Just avoid miniaturizing to the point that you need a magnifying glass. There’s no need to get fancy with real cardstock. Here are some things you can use. I usually just keep any interesting sheets from deliveries I receive. Say, the sturdy sheet of paper used in a plastic sleeve to make sure a comic book doesn’t bend in the mail. Perfect for gluing counters. There are three things you need to consider for paper: size, weight, and texture. For size, since I’m in Europe, I use the standardized A-sizes. A0 is a giant piece of paper, A1 is half as big, A2 half as big again, and so on. The standard office paper format is A4, roughly equivalent to U.S. Letter. This can easily be folded into A5 pamphlets. I also keep A3 papers around (twice the size of A4), but those I use to draw on. Not for printing. I don’t have a big enough home to fit a floor printer. The next thing is paper weight, measured in grams per square meter (GSM). Most home printers can’t handle heavier paper than 120-200 GSM. I always keep standard paper (80 GSM) around, and some heavier papers too. If I print counters or cards I sometimes use the sturdier stock. For reference, Magic cards are printed on 300 GSM black core paper stock. The black core is so you can’t see through the card and is taken directly from the gambling circuit. Lastly, the paper’s texture. If you want to work a little on the presentation, it can be nice to find paper canvas, or other sturdier variants. I’ve found that glossy photo paper is almost entirely useless in my own printer, however, always smearing or distorting the print. So when I buy any higher-GSM paper I try to find paper with coarser texture. There are many different kinds of cardboard, and you should try to keep as many around as possible. Some can be good for gluing boards or counters onto, while others can help make your prototype sturdier. This isn’t as important as paper, but gets used frequently enough that it felt worth mentioning. There will be a lot of rambling about cards later, and how to use them. For now, I only refer to loose cards you can use to prop up your thin paper printouts. These are not strictly necessary, but make shuffling easier. I don’t play much Magic: The Gathering anymore, but I still have lots and lots of leftover Magic cards, so those are the ones that get used as backing in most of my prototypes. You can cheaply buy colored wooden cubes as well as glass and plastic beads in bulk. It’s not always obvious what you may need, so keeping some different types around can be helpful. More specific pieces, like coins or pawns, can also be useful but unless these components provide unique affordances the kinds of components you have access to is rarely important. It’s usually enough to be able to move them around and separate them into groups. Storage is another thing that needs solving. If you mostly print paper and iterate on rules, a binder can be quite helpful. Especially paired with plastic sleeves so you can group iterations of your rules together and store them easily. If you also need to transport your prototypes, the kinds of storage boxes you find in office supply stores will have you sorted. You can push your analogue prototyping really far and build a whole workshop. A 3D printer for making scenery and miniatures, a laser cutter for custom MDF components, and a big floor-sized professional printer that takes over a whole room. If you have the space and the resources for that, you do you, but let’s focus on the smallest possible toolbox for making analogue prototypes. If you want to buy a printer, you just need to be aware that all of them have the same problems of losing connections and failing to print still to this day. Those same problems that have plagued printers since forever. I use a laser color printer with duplex (double-sided) printing support and the ability to print slightly heavier paper, up to 220 GSM. This has been more than enough for my needs. Specifically the duplex feature helps a lot if you want to print rulebooks. Having a good store of pencils and pens, including alcohol- and water-based markers, is more than enough. You can go deeper into the pen rabbit hole by looking at Niklas Wistedt’s spectacular tutorial on how to draw dungeon maps : it’ll have you covered in the pen and pencil department. Some tools you keep around to hold piles of paper or cards together. Paper clips are extra handy, because they can also be used as improvised sliders pointing at health numbers or other variables. Rubber bands are handy for keeping decks of cards together inside a box and for transportation. Almost every paper-based activity without decent scissors on hand will be a futile effort. Just beware that cutting things out by hand takes more time than you think. If you have a game with many cards, you may have to put on a couple of episodes of your favorite show as you cut them out. If you need more precision than scissors can provide, the next rung on the cutting lader is to get a proper cutting mat, a steel ruler, and a set of good sharp knives. These can be craft scalpels, metal handles with interchangeable blades (Americans insist on calling these “x-acto knives”), or carpet knives. Once you have rules and test documents printed, you’ll quickly disappear under a veritable ocean of paper. Though smaller sheafs can be pinned together with a paper clip, staplers are even better. A standard small office stapler is enough. But if you want to staple booklets and not just sheafs, it can be worth it to get a long-reach stapler capable of punching 20 sheets or more. Attaching paper to other paper can be done in more ways than with clips or staples. Sometimes you want to use glue or adhesive tape. Keeping a standard gluestick and a can of spray glue around is perfect. Regular tape and double-sided tape is also great for many things, even if the main use for tape can just be to make larger scale maps out of individual pieces of paper. As mentioned previously, it can take some time to cut out all the cards you want to print. You can cut this time down to a fraction, metaphorically and physically, by getting a paper guilloutine. These can usually take a few sheets at a time and will give you clean cuts along identified lines. Yelling “vive la France” when you drop the blade is optional. Lastly, a more decadent piece of machinery that isn’t strictly needed is a paper laminator. These will heat up a plastic pocket and melt the edges together to provide the paper with a plastic surface. It makes the paper much sturdier and has the added benefit of allowing you to use dry erase markers to make notes and adjustments right on the sheet itself. There is a lot of software out there that can be used to make cards, boards, illustrations, and whatever else you may need. The following is merely a list of what I personally use. Since you will often want to test things at different sizes, vector graphics are generally more useful for board game prototyping than pixel graphics. This is by no means a hard rule, but resolution of pixel images tends to limit how large you can scale them, while vector graphics have no such limitations. My go-to for vector graphics is Illustrator, but there are free alternatives like Affinity available as well. My other go-to piece of software for analogue shenanigans is InDesign, another Adobe program that can also be replaced by Affinity . I’m just personally too stuck in the Adobe ecosystem, after decades of regular use, that it’s too late for me to switch. You can’t teach an old dogs new tricks, as the saying goes. Indesign is great for multiple reasons. Not least of all its ability to use comma-separated value (CSV) files to populate unique pages or cards with data. A feature called DataMerge. Speaking of spreadsheets, all system designers have a lovely relationship to their tool of choice. This can be Microsoft Excel , OpenOffice Calc , or Google Spreadsheets, but the many convenient features of spreadsheets are a huge part of our bread and butter. I don’t even want to know how many sheets I create in an average year. Very broadly speaking, when making an analogue prototype, I will make use of spreadsheets for these reasons: The fantastic Tabletop Simulator is not just a great place to play tabletop games, it’s also a great place to test your own games. Renown board game designer Cole Wehrle has recorded some workshops for people interested in this specific adventure, and let’s just say that once you have this up and running it will make it a lot easier to test your game. Especially if the members of your team doesn’t all live in the same city. Its biggest strength is how quickly you can update new versions for anyone with a module already installed. If you share your module through Steam Workshop, it’s even easier. For most analogue prototypes, this isn’t doable, simply because of NDAs and rights issues. So much stuff ! Let’s put it all together. The way I’ve talked about this, there are really six steps to the process of making an analogue prototype: This is more important than you may think. An analogue prototype can easily become a design detour. Because of this, your goal needs to formulate why you are making this analogue prototype. “Test if it’s fun with infinitely respawning enemies” could be a goal. “See what works best: party or individual character” could be another one. But it can also be a lot narrower, for example designed to test the gold economy in your game. Perhaps even to balance it. The point is that you need a goal, and you need to stick to it and cut everything out that doesn’t serve that goal. If you need to test how travelling works on the map, you probably don’t need a full-fledged combat system, for example. Facts are the smallest units of decision in your game’s design . Stuff that every decision maker on your team has agreed on and that can therefore safely inform your analogue prototype. This can be super broad, like “the player plays a hamster,” or it can be more specific, like “the player character always has exactly one weapon.” You need these facts to keep your prototype grounded, but you don’t necessarily need to refer to them all at once. Pick the ones that are most important to your goal. With a goal and some facts, you need to figure out what systems you will use. Try to narrow it down more than you may think. Don’t make a “combat system,” but rather one “attack system” and another “defense system.” The reason for this is that what you are after is the resource exchanges that come from this, and the dynamics of the interactions. The attack system may take player choices as input and dish out damage as output, while the defense system may accept armor and damage input and send health loss as output. You can refer to the examples of building blocks in this post for inspiration. This is where we come to the biggest strength of analogue prototyping: real humans provide a lot more nuance and depth than any prototype can do on its own. Analogue or digital. One player can take on the role of referee or game master, similar to how it would work in a tabletop role-playing game . In many wargames of the past, this was called an umpire. Someone who would know all the rules and act as a channel between the players and the systems. If you have built a particularly complicated analogue prototype, a good way to test it can be to act as a referee and then simply ask players what they want to do instead of teaching them the details of the rules. Players can play each other’s opponents, representing different factions, interest groups, or feature sets via their analogue mechanics. If you built an analogue prototype of StarCraft , you’d probably do it this way, with three players taking on one faction each. One player can play the enemies, while another plays the economy system, or the spawning system. The goal here is to put one player in charge of the decisions made within the related system. If someone wants to trade their stock for a new space ship, and this isn’t covered by the rules, the economy system player can decide on the exchange rate and the spawning system player can say that this spawns a patrol of rival ships. Just take ample notes, so you don’t forget the nuances that come out of this process. There are many different ways to use the components you collected previously. Some of them may not be intuitive at all. The humble die: perhaps the most useful component in your toolbox. Just look at the following list and be amazed: People have been using playing cards for leisure activities since at least medieval times. Just as for dice, you’ll see why right here, and perhaps these things will fit your needs better than dice: Humans are spatial beeings that think in three dimensions. Even such a simple thing as a square grid where you put miniatures will create relationships of behind, in front of, far away from, close to, etc. All analogue prototypes don’t need this, but if you do need it, here are some alternatives to explore: With the fast iterations of analogue prototypes, you can usually just change a word or an image somewhere and print a new page. This means you may have many copies of the same page after a while. To prepare for this situation, make sure to have a system for versioning. It doesn’t have to be too involved, especially if you’re the only designer working on this prototype, but you need to do something. I usually just iterate a number in the corner of each page. The 3 becomes a 4. I may also write the date, if that seems necessary. I may also add a colored dot (usually red) to pages that have been deprecated, since just the number itself won’t say much and you may end up referring to the wrong pages if you don’t have an indicator like this. Step 1: Don’t : Steal it, fake it, or rehash stuff you have already made before you start a new prototype. Step 2: Just Do It : If it takes less than two days, just do it. As the saying goes, it’s easier to ask for forgiveness than for permission. Step 3: Fail Early : When something feels like a dud even at an early stage, you can assume that it is in fact a dud. There’s nothing wrong about abandoning a prototype. In fact, learning to kill things early is a skill. Step 4: Gather References : Prototypes can only really help with small problems. Big problems, you must break apart and figure out. Collect references. White papers, mockup screenshots, music, asset store packs, and so on. Anything that helps you understand the problem space. The same psychology applies . Rewards, risk-taking, information overload. Many of our intrinsic and extrinsic motivators are triggered the same by boardgames as by digital games. The distance is not nearly as far as we may tell ourselves. Players can represent complex systems . A player has all the complexity of a living breathing human, making odd decisions and concocting strange plans. This lets you use players as representations of systems, from enemy behaviors to narrative. Analogue games are “pure” systems . If you can’t make sense of your mechanic in its naked form, you can probably not expect your players to make sense of it either. Similar affordances . Generating random numbers with dice, shuffling cards, moving things around a limited space; analogue gaming is always extremely close to digital gaming, even to the point that we use similar verbs and parlance. Holism . Probably the best part of the analogue format is that you can actually represent everything in your game in one way or another. It doesn’t have to be a big complex system, as long as you provide something to act as that system’s output. Listing all the actions, components, elements, etc., that are relevant. Just getting things into a list can show you if something is realistic or not. Cross-matrices for fleshing out a game’s state-space. If I know the features I want, and the terrains that exist, a cross-matrix can explore what those mean: a feature-terrain matrix. Notes on playtests. How many players played, what happened, who won and why, etc. Calculators of various kinds, incorporating more spreadsheet scripting. Can be used to check probabilities, damage variation, feature dominance, etc. Session logging. If I want to be more detailed, I can log each action from a whole session and see if there are things that can be added or removed. Set a Goal Identify Facts Systemify the Facts Consider the roles of Players Tie it together with Components Types of dice : you can use any number of sides, and make use of the corresponding probabilities. Dividing a result by the number of sides gives you the probability of that result. So, 1/6 = 0.1666 means there’s a ~17% chance to roll any single side on a six-sided die. Use the dice that best represents the percentage chances you have in mind. Singles : rolling a single die and reading the result. Pretty straightforward. Sums : rolling two or more dice and adding the result together. Pools : rolling a handful of dice and checking for specific individual results or adding them together. Buckets : rolling a lot of dice and checking for specific results. The only reason buckets of dice are separated from dice pools here is because they have a different “feel” to them; they are functionally identical. Add/Subtract : add or subtract one die from the result of another, or use mathematical modifiers to add or subtract from another result. X- or X+ : require specific results per die. In these cases X- would mean “X or lower,” and X+ would mean “X or higher.” Patterns : like Yatzy, or what the first The Witcher called “Dice Poker:” you want doubles, triples, full houses, etc. Reroll : allowing rerolls of some or all of the dice you just rolled. Makes the rolling take longer but also provides increased chances of reaching the right result. Some games allow rerolling in realtime and then use other time elements to restrict play. So you can frantically keep trying to get that 6, but if an hourglass runs out first you lose. Spin : spinning the die to the specific side you want. Trigger : if you roll a specific result, something special happens. It could be the natural 20 that causes a critical hit in Dungeons & Dragons , or it can be that a roll of 10 means you roll another ten-sided die and add it to your result. Hide : you roll or you set your result under a cupped hand or physical cup, hiding the result until everyone reveals at the same time or the game rules require it. Statistics : common sense may say that you can’t possibly roll a fifth one after the first four, but in reality you can. Dice are truly random. Shuffle : shuffling cards is a great way to randomise outcomes. This can be done in many different ways, as well, where you shuffle a “bomb” into half of the pile and then shuffle the other half to place on top, for example. There are many ways to mix up how to shuffle a deck of cards. Uniqueness : each card can only be drawn once, which means that you can make each card in a deck unique and you can affect the mathematics of probability by adding multiple copies of the same card. Just like the board game Maria uses standard playing cards but in different numbers. Front and back : the face and back of the cards can have different print on them, or the back can just inform you what kind of card it is so you can shuffle them together in setup. Of course, the fact that you can hide the faces for other players is also what makes bluffing in poker interesting. Turn, sideways : what Magic calls “tapping” and other games may call exhausting or something else. Some cards can be turned sideways (in landscape mode instead of portrait mode) by default. Turn, over : flipping a card to its other side can serve to show you new information or to hide its face from everyone around the table. It can represent a card being exhausted, or injured, or other state changes like a person transforming into a werewolf. Over/under : cards can be placed physically over or under other cards, to show various kinds of relationships. An item equipped by a character, or a condition suffered by an army, for example. Card grids : cards can be placed in a grid to generate a board, or to act as a sheet selection for a character. One card could be your character class, another could be a choice of quest, etc. It’s a neat way to test combinations. Hide cards : if you want to get really physical, you can hide cards on your person, under boards, and so on. This was one way you could play Killer , by hiding notes your opponents would find. Card text : if you print your own cards, you can have any text you want on them. Reminders, rules exceptions, etc. Deck composition : how you put decks together will affect how the game plays, and predesigning decks for different tests can be very effective. Perhaps you remove all the goblins in one playtest and have only goblins in another. Deck building : decks can also be constructed through play, similarly to how Slay the Spire works. A style of mechanic where you can start small and then grow in complexity throughout a session. Stats : cards can be in different states. On the table, in your hand, available from an open tableau, shuffled into a deck, discarded to a discard pile, and even removed from the game due to in-game effects. Semantics : something that Magic: The Gathering ‘s designer, Richard Garfield, was particularly good at was to figure out interesting names for the things you were doing. You don’t just play a card, you’re casting a spell. It’s not a discard pile, it’s your graveyard. These kinds of semantics can be strong nods back to the digital game you are making, or they can serve a more thematic purpose. Statistics : with every card you draw, the deck shrinks, increasing the chances of drawing the specific card you may want. You are guaranteed to draw every card if you go through a whole deck, which is one of the biggest strengths of decks of cards. Node or point maps : picture a corkboard with pins and red thread, or just simple circular nodes with lines between them. You can draw this easily on a large sheet of paper and just write simple names next to each circle to provide context. Sector maps : one step above the node or point map is the sector map, where regions share proximity. Grand strategy games have maps like this, where provinces share borders. Another example are more abstract role-playing games, where a house’s interior is maybe divided into two sectors and the whole exterior area around it is another sector. It’s excellent for broad-stroke maps. Square grids : if you want a grid, the square grid is probably the most intuitive. But it also has some mathematical problems: diagonals reach twice as far as cardinals. This means you need to either not allow diagonals or allow them and account for the problems that will emerge. Hexagon grids : these are more accurate and classic wargame fare, but they will also often force you to adapt your art to the grid in ways that are not as intuitive as with a square grid. Freeform : finally, you can just take any satellite image or nice drawn map, perhaps an overhead screenshot from a level you’ve made, and use it as a map in a freeform capacity. This may force you to use a tape measure or other way to measure distances, but if the distances are not important that matters a lot less. For example if your game shares sensibilities with Marvel’s Midnight Suns .

0 views
David Bushell 5 days ago

No-stack web development

This year I’ve been asked more than ever before what web development “stack” I use. I always respond: none. We shouldn’t have a go-to stack! Let me explain why. My understanding is that a “stack” is a choice of software used to build a website. That includes language and tooling, libraries and frameworks , and heaven forbid: subscription services. Text editors aren’t always considered part of the stack but integration is a major factor. Web dev stacks often manifest as used to install hundreds of megs of JavaScript, Blazing Fast ™ Rust binaries, and never ending supply chain attacks . A stack is also technical debt, non-transferable knowledge, accelerated obsolescence, and vendor lock-in. That means fragility and overall unnecessary complication. Popular stacks inevitably turn into cargo cults that build in spite of the web, not for it. Let’s break that down. If you have a go-to stack, you’ve prescribed a solution before you’ve diagnosed a problem. You’ve automatically opted in to technical baggage that you must carry the entire project. Project doesn’t fit the stack? Tough; shoehorn it to fit. Stacks are opinionated by design. To facilitate their opinions, they abstract away from web fundamentals. It takes all of five minutes for a tech-savvy person to learn JSON . It takes far, far longer to learn Webpack JSON . The latter becomes useless knowledge once you’ve moved on to better things. Brain space is expensive. Other standards like CSS are never truly mastered but learning an abstraction like Tailwind will severely limit your understanding. Stacks are a collection of move-fast-and-break churnware; fleeting software that updates with incompatible changes, or deprecates entirely in favour of yet another Rust refactor. A basic HTML document written 20 years ago remains compatible today. A codebase built upon a stack 20 months ago might refuse to play. The cost of re-stacking is usually unbearable. Stack-as-a-service is the endgame where websites become hopelessly trapped. Now you’re paying for a service that can’t fix errors . You’ve sacrificed long-term stability and freedom for “developer experience”. I’m not saying you should code artisanal organic free-range websites. I’m saying be aware of the true costs associated with a stack. Don’t prescribed a solution before you’ve diagnosed a problem. Choose the right tool for each job only once the impact is known. Satisfy specific goals of the website, not temporary development goals. Don’t ask a developer what their stack is without asking what problem they’re solving. Be wary of those who promote or mandate a default stack. Be doubtful of those selling a stack. When you develop for a stack, you risk trading the stability of the open web platform, that is to say: decades of broad backwards compatibility, for GitHub’s flavour of the month. The web platform does not require build toolchains. Always default to, and regress to, the fundamentals of CSS, HTML, and JavaScript. Those core standards are the web stack. Yes, you’ll probably benefits from more tools. Choose them wisely. Good tools are intuitive by being based on standards, they can be introduced and replaced with minimal pain. My only absolute advice: do not continue legacy frameworks like React . If that triggers an emotional reaction: you need a stack intervention! It may be difficult to accept but Facebook never was your stack; it’s time to move on. Use the tool, don’t become the tool. Edit: forgot to say: for personal projects, the gloves are off. Go nuts! Be the churn. Learn new tools and even code your own stack. If you’re the sole maintainer the freedom to make your own mistakes can be a learning exercise in itself. Thanks for reading! Follow me on Mastodon and Bluesky . Subscribe to my Blog and Notes or Combined feeds.

0 views

watgo - a WebAssembly Toolkit for Go

I'm happy to announce the general availability of watgo - the W eb A ssembly T oolkit for G o. This project is similar to wabt (C++) or wasm-tools (Rust), but in pure, zero-dependency Go. watgo comes with a CLI and a Go API to parse WAT (WebAssembly Text), validate it, and encode it into WASM binaries; it also supports decoding WASM from its binary format. At the center of it all is wasmir - a semantic representation of a WebAssembly module that users can examine (and manipulate). This diagram shows the functionalities provided by watgo: watgo comes with a CLI, which you can install by issuing this command: The CLI aims to be compatible with wasm-tools [1] , and I've already switched my wasm-wat-samples projects to use it; e.g. a command to parse a WAT file, validate it and encode it into binary format: wasmir semantically represents a WASM module with an API that's easy to work with. Here's an example of using watgo to parse a simple WAT program and do some analysis: One important note: the WAT format supports several syntactic niceties that are flattened / canonicalized when lowered to wasmir . For example, all folded instructions are lowered to unfolded ones (linear form), function & type names are resolved to numeric indices, etc. This matches the validation and execution semantics of WASM and its binary representation. These syntactic details are present in watgo in the textformat package (which parses WAT into an AST) and are removed when this is lowered to wasmir . The textformat package is kept internal at this time, but in the future I may consider exposing it publicly - if there's interest. Even though it's still early days for watgo, I'm reasonably confident in its correctness due to a strategy of very heavy testing right from the start. WebAssembly comes with a large official test suite , which is perfect for end-to-end testing of new implementations. The core test suite includes almost 200K lines of WAT files that carry several modules with expected execution semantics and a variety of error scenarios exercised. These live in specially designed .wast files and leverage a custom spec interpreter. watgo hijacks this approach by using the official test suite for its own testing. A custom harness parses .wast files and uses watgo to convert the WAT in them to binary WASM, which is then executed by Node.js [2] ; this harness is a significant effort in itself, but it's very much worth it - the result is excellent testing coverage. watgo passes the entire WASM spec core test suite. Similarly, we leverage wabt's interp test suite which also includes end-to-end tests, using a simpler Node-based harness to test them against watgo. Finally, I maintain a collection of realistic program samples written in WAT in the wasm-wat-samples repository ; these are also used by watgo to test itself. Parse: a parser from WAT to wasmir Validate: uses the official WebAssembly validation semantics to check that the module is well formed and safe Encode: emits wasmir into WASM binary representation Decode: read WASM binary representation into wasmir

0 views
Filippo Valsorda 1 weeks ago

A Cryptography Engineer’s Perspective on Quantum Computing Timelines

My position on the urgency of rolling out quantum-resistant cryptography has changed compared to just a few months ago. You might have heard this privately from me in the past weeks, but it’s time to signal and justify this change of mind publicly. There had been rumors for a while of expected and unexpected progress towards cryptographically-relevant quantum computers, but over the last week we got two public instances of it. First, Google published a paper revising down dramatically the estimated number of logical qubits and gates required to break 256-bit elliptic curves like NIST P-256 and secp256k1, which makes the attack doable in minutes on fast-clock architectures like superconducting qubits. They weirdly 1 frame it around cryptocurrencies and mempools and salvaged goods or something, but the far more important implication are practical WebPKI MitM attacks. Shortly after, a different paper came out from Oratomic showing 256-bit elliptic curves can be broken in as few as 10,000 physical qubits if you have non-local connectivity , like neutral atoms seem to offer, thanks to better error correction. This attack would be slower, but even a single broken key per month can be catastrophic. They have this excellent graph on page 2 ( Babbush et al. is the Google paper, which they presumably had preview access to): Overall, it looks like everything is moving: the hardware is getting better, the algorithms are getting cheaper, the requirements for error correction are getting lower. I’ll be honest, I don’t actually know what all the physics in those papers means. That’s not my job and not my expertise. My job includes risk assessment on behalf of the users that entrusted me with their safety. What I know is what at least some actual experts are telling us. Heather Adkins and Sophie Schmieg are telling us that “quantum frontiers may be closer than they appear” and that 2029 is their deadline. That’s in 33 months, and no one had set such an aggressive timeline until this month. Scott Aaronson tells us that the “clearest warning that [he] can offer in public right now about the urgency of migrating to post-quantum cryptosystems” is a vague parallel with how nuclear fission research stopped happening in public between 1939 and 1940. The timelines presented at RWPQC 2026, just a few weeks ago, were much tighter than a couple years ago, and are already partially obsolete. The joke used to be that quantum computers have been 10 years out for 30 years now. Well, not true anymore, the timelines have started progressing. If you are thinking “well, this could be bad, or it could be nothing!” I need you to recognize how immediately dispositive that is. The bet is not “are you 100% sure a CRQC will exist in 2030?”, the bet is “are you 100% sure a CRQC will NOT exist in 2030?” I simply don’t see how a non-expert can look at what the experts are saying, and decide “I know better, there is in fact < 1% chance.” Remember that you are betting with your users’ lives. 2 Put another way, even if the most likely outcome was no CRQC in our lifetimes, that would be completely irrelevant, because our users don’t want just better-than-even odds 3 of being secure. Sure, papers about an abacus and a dog are funny and can make you look smart and contrarian on forums. But that’s not the job, and those arguments betray a lack of expertise . As Scott Aaronson said : Once you understand quantum fault-tolerance, asking “so when are you going to factor 35 with Shor’s algorithm?” becomes sort of like asking the Manhattan Project physicists in 1943, “so when are you going to produce at least a small nuclear explosion?” The job is not to be skeptical of things we’re not experts in, the job is to mitigate credible threats, and there are credible experts that are telling us about an imminent threat. In summary, it might be that in 10 years the predictions will turn out to be wrong, but at this point they might also be right soon, and that risk is now unacceptable. Concretely, what does this mean? It means we need to ship. Regrettably, we’ve got to roll out what we have. 4 That means large ML-DSA signatures shoved in places designed for small ECDSA signatures, like X.509, with the exception of Merkle Tree Certificates for the WebPKI, which is thankfully far enough along . This is not the article I wanted to write. I’ve had a pending draft for months now explaining we should ship PQ key exchange now, but take the time we still have to adapt protocols to larger signatures, because they were all designed with the assumption that signatures are cheap. That other article is now wrong, alas: we don’t have the time if we need to be finished by 2029 instead of 2035. For key exchange, the migration to ML-KEM is going well enough but: Any non-PQ key exchange should now be considered a potential active compromise, worthy of warning the user like OpenSSH does , because it’s very hard to make sure all secrets transmitted over the connection or encrypted in the file have a shorter shelf life than three years. We need to forget about non-interactive key exchanges (NIKEs) for a while; we only have KEMs (which are only unidirectionally authenticated without interactivity) in the PQ toolkit. It makes no more sense to deploy new schemes that are not post-quantum . I know, pairings were nice. I know, everything PQ is annoyingly large. I know, we had basically just figured out how to do ECDSA over P-256 safely. I know, there might not be practical PQ equivalents for threshold signatures or identity-based encryption. Trust me, I know it stings. But it is what it is. Hybrid classic + post-quantum authentication makes no sense to me anymore and will only slow us down; we should go straight to pure ML-DSA-44. 6 Hybrid key exchange is reasonably easy, with ephemeral keys that don’t even need a type or wire format for the composite private key, and a couple years ago it made sense to take the hedge. Authentication is not like that, and even with draft-ietf-lamps-pq-composite-sigs-15 with its 18 composite key types nearing publication, we’d waste precious time collectively figuring out how to treat these composite keys and how to expose them to users. It’s also been two years since Kyber hybrids and we’ve gained significant confidence in the Module-Lattice schemes. Hybrid signatures cost time and complexity budget, 5 and the only benefit is protection if ML-DSA is classically broken before the CRQCs come , which looks like the wrong tradeoff at this point. In symmetric encryption , we don’t need to do anything, thankfully. There is a common misconception that protection from Grover requires 256-bit keys, but that is based on an exceedingly simplified understanding of the algorithm . A more accurate characterization is that with a circuit depth of 2⁶⁴ logical gates (the approximate number of gates that current classical computing architectures can perform serially in a decade) running Grover on a 128-bit key space would require a circuit size of 2¹⁰⁶. There’s been no progress on this that I am aware of, and indeed there are old proofs that Grover is optimal and its quantum speedup doesn’t parallelize . Unnecessary 256-bit key requirements are harmful when bundled with the actually urgent PQ requirements, because they muddle the interoperability targets and they risk slowing down the rollout of asymmetric PQ cryptography. In my corner of the world, we’ll have to start thinking about what it means for half the cryptography packages in the Go standard library to be suddenly insecure, and how to balance the risk of downgrade attacks and backwards compatibility. It’s the first time in our careers we’ve faced anything like this: SHA-1 to SHA-256 was not nearly this disruptive, 7 and even that took forever with the occasional unexpected downgrade attack. Trusted Execution Environments (TEEs) like Intel SGX and AMD SEV-SNP and in general hardware attestation are just f***d. All their keys and roots are not PQ and I heard of no progress in rolling out PQ ones, which at hardware speeds means we are forced to accept they might not make it, and can’t be relied upon. I had to reassess a whole project because of this, and I will probably downgrade them to barely “defense in depth” in my toolkit. Ecosystems with cryptographic identities (like atproto and, yes, cryptocurrencies) need to start migrating very soon, because if the CRQCs come before they are done , they will have to make extremely hard decisions, picking between letting users be compromised and bricking them. File encryption is especially vulnerable to store-now-decrypt-later attacks, so we’ll probably have to start warning and then erroring out on non-PQ age recipient types soon. It’s unfortunately only been a few months since we even added PQ recipients, in version 1.3.0 . 8 Finally, this week I started teaching a PhD course in cryptography at the University of Bologna, and I’m going to mention RSA, ECDSA, and ECDH only as legacy algorithms, because that’s how those students will encounter them in their careers. I know, it feels weird. But it is what it is. For more willing-or-not PQ migration, follow me on Bluesky at @filippo.abyssdomain.expert or on Mastodon at @[email protected] . Traveling back from an excellent AtmosphereConf 2026 , I saw my first aurora, from the north-facing window of a Boeing 747. My work is made possible by Geomys , an organization of professional Go maintainers, which is funded by Ava Labs , Teleport , Tailscale , and Sentry . Through our retainer contracts they ensure the sustainability and reliability of our open source maintenance work and get a direct line to my expertise and that of the other Geomys maintainers. (Learn more in the Geomys announcement .) Here are a few words from some of them! Teleport — For the past five years, attacks and compromises have been shifting from traditional malware and security breaches to identifying and compromising valid user accounts and credentials with social engineering, credential theft, or phishing. Teleport Identity is designed to eliminate weak access patterns through access monitoring, minimize attack surface with access requests, and purge unused permissions via mandatory access reviews. Ava Labs — We at Ava Labs , maintainer of AvalancheGo (the most widely used client for interacting with the Avalanche Network ), believe the sustainable maintenance and development of open source cryptographic protocols is critical to the broad adoption of blockchain technology. We are proud to support this necessary and impactful work through our ongoing sponsorship of Filippo and his team. The whole paper is a bit goofy: it has a zero-knowledge proof for a quantum circuit that will certainly be rederived and improved upon before the actual hardware to run it on will exist. They seem to believe this is about responsible disclosure, so I assume this is just physicists not being experts in our field in the same way we are not experts in theirs.  ↩ “You” is doing a lot of work in this sentence, but the audience for this post is a bit unusual for me: I’m addressing my colleagues and the decision-makers that gate action on deployment of post-quantum cryptography.  ↩ I had a reviewer object to an attacker probability of success of 1/536,870,912 (0.0000002%, 2⁻²⁹) after 2⁶⁴ work, correctly so, because in cryptography we usually target 2⁻³².  ↩ Why trust the new stuff, though? There are two parts to it: the math and the implementation. The math is also not my job, so I again defer to experts like Sophie Schmieg, who tells us that she is very confident in lattices , and the NSA, who approved ML-KEM and ML-DSA at the Top Secret level for all national security purposes. It is also older than elliptic curve cryptography was when it first got deployed. (“Doesn’t the NSA lie to break our encryption?” No, the NSA has never intentionally jeopardized US national security with a non- NOBUS backdoor, and there is no way for ML-KEM and ML-DSA to hide a NOBUS backdoor .) On the implementation side, I am actually very qualified to have an opinion, having made cryptography implementation and testing my niche. ML-KEM and ML-DSA are a lot easier to implement securely than their classical alternatives, and with the better testing infrastructure we have now I expect to see exceedingly few bugs in their implementations.  ↩ One small exception in that if you already have the ability to convey multiple signatures from multiple public keys in your protocol, it can make sense to to “poor man’s hybrid signatures” by just requiring 2-of-2 signatures from one classical public key and one pure PQ key. Some of the tlog ecosystem might pick this route, but that’s only because the cost is significantly lowered by the existing support for nested n-of-m signing groups.  ↩ Why ML-DSA-44 when we usually use ML-KEM-768 instead of ML-KEM-512? Because ML-KEM-512 is Level 1, while ML-DSA-44 is Level 2, so it already has a bit of margin against minor cryptanalytic improvements.  ↩ Because SHA-256 is a better plug-in replacement for SHA-1, because SHA-1 was a much smaller surface than all of RSA and ECC, and because SHA-1 was not that broken: it still retained preimage resistance and could still be used in HMAC and HKDF.  ↩ The delay was in large part due to my unfortunate decision of blocking on the availability of HPKE hybrid recipients, which blocked on the CFRG, which took almost two years to select a stable label string for X-Wing (January 2024) with ML-KEM (August 2024), despite making precisely no changes to the designs. The IETF should have an internal post-mortem on this, but I doubt we’ll see one.  ↩ Any non-PQ key exchange should now be considered a potential active compromise, worthy of warning the user like OpenSSH does , because it’s very hard to make sure all secrets transmitted over the connection or encrypted in the file have a shorter shelf life than three years. We need to forget about non-interactive key exchanges (NIKEs) for a while; we only have KEMs (which are only unidirectionally authenticated without interactivity) in the PQ toolkit. The whole paper is a bit goofy: it has a zero-knowledge proof for a quantum circuit that will certainly be rederived and improved upon before the actual hardware to run it on will exist. They seem to believe this is about responsible disclosure, so I assume this is just physicists not being experts in our field in the same way we are not experts in theirs.  ↩ “You” is doing a lot of work in this sentence, but the audience for this post is a bit unusual for me: I’m addressing my colleagues and the decision-makers that gate action on deployment of post-quantum cryptography.  ↩ I had a reviewer object to an attacker probability of success of 1/536,870,912 (0.0000002%, 2⁻²⁹) after 2⁶⁴ work, correctly so, because in cryptography we usually target 2⁻³².  ↩ Why trust the new stuff, though? There are two parts to it: the math and the implementation. The math is also not my job, so I again defer to experts like Sophie Schmieg, who tells us that she is very confident in lattices , and the NSA, who approved ML-KEM and ML-DSA at the Top Secret level for all national security purposes. It is also older than elliptic curve cryptography was when it first got deployed. (“Doesn’t the NSA lie to break our encryption?” No, the NSA has never intentionally jeopardized US national security with a non- NOBUS backdoor, and there is no way for ML-KEM and ML-DSA to hide a NOBUS backdoor .) On the implementation side, I am actually very qualified to have an opinion, having made cryptography implementation and testing my niche. ML-KEM and ML-DSA are a lot easier to implement securely than their classical alternatives, and with the better testing infrastructure we have now I expect to see exceedingly few bugs in their implementations.  ↩ One small exception in that if you already have the ability to convey multiple signatures from multiple public keys in your protocol, it can make sense to to “poor man’s hybrid signatures” by just requiring 2-of-2 signatures from one classical public key and one pure PQ key. Some of the tlog ecosystem might pick this route, but that’s only because the cost is significantly lowered by the existing support for nested n-of-m signing groups.  ↩ Why ML-DSA-44 when we usually use ML-KEM-768 instead of ML-KEM-512? Because ML-KEM-512 is Level 1, while ML-DSA-44 is Level 2, so it already has a bit of margin against minor cryptanalytic improvements.  ↩ Because SHA-256 is a better plug-in replacement for SHA-1, because SHA-1 was a much smaller surface than all of RSA and ECC, and because SHA-1 was not that broken: it still retained preimage resistance and could still be used in HMAC and HKDF.  ↩ The delay was in large part due to my unfortunate decision of blocking on the availability of HPKE hybrid recipients, which blocked on the CFRG, which took almost two years to select a stable label string for X-Wing (January 2024) with ML-KEM (August 2024), despite making precisely no changes to the designs. The IETF should have an internal post-mortem on this, but I doubt we’ll see one.  ↩

0 views

Stamp It! All Programs Must Report Their Version

Recently, during a production incident response, I guessed the root cause of an outage correctly within less than an hour (cool!) and submitted a fix just to rule it out, only to then spend many hours fumbling in the dark because we lacked visibility into version numbers and rollouts… 😞 This experience made me think about software versioning again, or more specifically about build info (build versioning, version stamping, however you want to call it) and version reporting. I realized that for the i3 window manager, I had solved this problem well over a decade ago, so it was really unexpected that the problem was decidedly not solved at work. In this article, I’ll explain how 3 simple steps (Stamp it! Plumb it! Report it!) are sufficient to save you hours of delays and stress during incident response. Every household appliance has incredibly detailed versioning! Consider this dishwasher: (Thank you Feuermurmel for sending me this lovely example!) I observed a couple household appliance repairs and am under the impression that if a repair person cannot identify the appliance, they would most likely refuse to even touch it. So why are our standards so low in computers, in comparison? Sure, consumer products are typically versioned somehow and that’s typically good enough (except for, say, USB 3.2 Gen 1×2!). But recently, I have encountered too many developer builds that were not adequately versioned! Unlike a physical household appliance with a stamped metal plate, software is constantly updated and runs in places and structures we often cannot even see. Let’s dig into what we need to increase our versioning standard! Usually, software has a name and some version number of varying granularity: All of these identify the Chrome browser on my computer, but each at different granularity. All are correct and useful, depending on the context. Here’s an example for each: After creating the i3 window manager , I quickly learned that for user support, it is very valuable for programs to clearly identify themselves. Let me illustrate with the following case study. When running , you will see output like this: Each word was carefully deliberated and placed. Let me dissect: When doing user support, there are a couple of questions that are conceptually easy to ask the affected user and produce very valuable answers for the developer: Based on my experiences with asking these questions many times, I noticed a few patterns in how these debugging sessions went. In response, I introduced another way for i3 to report its version in i3 v4.3 (released in September 2012): a flag! Now I could ask users a small variation of the first question: What is the output of ? Note how this also transfers well over spoken word, for example at a computer meetup: Michael: Which version are you using? User: How can I check? Michael: Run this command: User: It says 4.24. Michael: Good, that is recent enough to include the bug fix. Now, we need more version info! Run please and tell me what you see. When you run , it does not just report the version of the i3 program you called, it also connects to the running i3 window manager process in your X11 session using its IPC (interprocess communication) interface and reports the running i3 process’s version, alongside other key details that are helpful to show the user, like which configuration file is loaded and when it was last changed: This might look like a lot of detail on first glance, but let me spell out why this output is such a valuable debugging tool: Connecting to i3 via the IPC interface is an interesting test in and of itself. If a user sees output, that implies they will also be able to run debugging commands like (for example) to capture the full layout state. During a debugging session, running is an easy check to see if the version you just built is actually effective (see the line). Showing the full path to the loaded config file will make it obvious if the user has been editing the wrong file. If the path alone is not sufficient, the modification time (displayed both absolute and relative) will flag editing the wrong file. I use NixOS, BTW, so I automatically get a stable identifier ( ) for the specific build of i3. To see the build recipe (“derivation” in Nix terminology) which produced this Nix store output ( ), I can run : Unfortunately, I am not aware of a way to go from the derivation to the source, but at least one can check that a certain source results in an identical derivation. The versioning I have described so far is sufficient for most users, who will not be interested in tracking intermediate versions of software, but only the released versions. But what about developers, or any kind of user who needs more precision? When building i3 from git, it reports the git revision it was built from, using : A modified working copy gets represented by a after the revision: Reporting the git revision (or VCS revision, generally speaking) is the most useful choice. This way, we catch the following common mistakes: As we have seen above, the single most useful piece of version information is the VCS revision. We can fetch all other details (version numbers, dates, authors, …) from the VCS repository. Now, let’s demonstrate the best case scenario by looking at how Go does it! Go has become my favorite programming language over the years, in big part because of the good taste and style of the Go developers, and of course also because of the high-quality tooling: I strive to respect everybody’s personal preferences, so I usually steer clear of debates about which is the best programming language, text editor or operating system. However, recently I was asked a couple of times why I like and use a lot of Go, so here is a coherent article to fill in the blanks of my ad-hoc in-person ramblings :-). Read more → Therefore, I am pleased to say that Go implements the gold standard with regard to software versioning: it stamps VCS buildinfo by default! 🥳 This was introduced in Go 1.18 (March 2022) : Additionally, the go command embeds information about the build, including build and tool tags (set with -tags), compiler, assembler, and linker flags (like -gcflags), whether cgo was enabled, and if it was, the values of the cgo environment variables (like CGO_CFLAGS). Both VCS and build information may be read together with module information using or runtime/debug.ReadBuildInfo (for the currently running binary) or the new debug/buildinfo package. Note: Before Go 1.18, the standard approach was to use or similar explicit injection. This setup works (and can still be seen in many places) but requires making changes to the application code, whereas the Go 1.18+ stamping requires no extra steps. What does this mean in practice? Here is a diagram for the common case: building from git: This covers most of my hobby projects! Many tools I just , or if I want to easily copy them around to other computers. Although, I am managing more and more of my software in NixOS. When I find a program that is not yet fully managed, I can use and the tool to identify it: It’s very cool that Go does the right thing by default! Systems that consist of 100% Go software (like my gokrazy Go appliance platform ) are fully stamped! For example, the gokrazy web interface shows me exactly which version and dependencies went into the build on my scan2drive appliance . Despite being fully stamped, note that gokrazy only shows the module versions, and no VCS buildinfo, because it currently suffers from the same gap as Nix: For the gokrazy packer, which follows a rolling release model (no version numbers), I ended up with a few lines of Go code (see below) to display a git revision, no matter if you installed the packer from a Go module or from a git working copy. The code either displays (the easy case; built from git) or extracts the revision from the Go module version of the main module ( ): What are the other cases? These examples illustrate the scenarios I usually deal with: This is what it looks like in practice: But a version built from git has the full revision available (→ you can tell them apart): When packaging Go software with Nix, it’s easy to lose Go VCS revision stamping: So the fundamental tension here is between reproducibility and VCS stamping. Luckily, there is a solution that works for both: I created the Nix overlay module that you can import to get working Go VCS revision stamping by default for your Nix expressions! Tip: If you are not a Nix user, feel free to skip over this section. I included it in this article so that you have a full example of making VCS stamping work in the most complicated environments. Packaging Go software in Nix is pleasantly straightforward. For example, the Go Protobuf generator plugin is packaged in Nix with <30 lines: official nixpkgs package.nix . You call , supply as the result from and add a few lines of metadata. But getting developer builds fully stamped is not straightforward at all! When packaging my own software, I want to package individual revisions (developer builds), not just released versions. I use the same , or if I need the latest Go version. Instead of using , I provide my sources using Flakes, usually also from GitHub or from another Git repository. For example, I package like so: The comes from my : Go stamps all builds, but it does not have much to stamp here: Here’s a full example of gokrazy/bull: To fix VCS stamping, add my overlay to your : (If you are using , like I am, you need to apply the overlay in both places.) After rebuilding, your Go binaries should newly be stamped with buildinfo: Nice! 🥳 But… how does it work? When does it apply? How do you know how to fix your config? I’ll show you the full diagram first, and then explain how to read it: There are 3 relevant parts of the Nix stack that you can end up in, depending on what you write into your files: For the purpose of VCS revision stamping, you should: Hence, we will stick to the left-most column: fetchers. Unfortunately, by default, with fetchers, the VCS revision information, which is stored in a Nix attrset (in-memory, during the build process), does not make it into the Nix store, hence, when the Nix derivation is evaluated and Go compiles the source code, Go does not see any VCS revision. My Nix overlay module fixes this, and enabling the overlay is how you end up in the left-most lane of the above diagram: the happy path, where your Go binaries are now stamped! How does the overlay work? It functions as an adapter between Nix and Go: So the overlay implements 3 steps to get Go to stamp the correct info: For the full source, see . See Go issue #77020 and Go issue #64162 for a cleaner approach to fixing this gap: allowing package managers to invoke the Go tool with the correct VCS information injected. This would allow Nix (or also gokrazy) to pass along buildinfo cleanly, without the need for workarounds like my adapter . At the time of writing, issue #77020 does not seem to have much traction and is still open. My argument is simple: Stamping the VCS revision is conceptually easy, but very important! For example, if the production system from the incident I mentioned had reported its version, we would have saved multiple hours of mitigation time! Unfortunately, many environments only identify the build output (useful, but orthogonal), but do not plumb the VCS revision (much more useful!), or at least not by default. Your action plan to fix it is just 3 simple steps: Implementing “version observability” throughout your system is a one-day high-ROI project. With my Nix example, you saw how the VCS revision is available throughout the stack, but can get lost in the middle. Hopefully my resources help you quickly fix your stack(s), too: Now go stamp your programs and data transfers! 🚀 Chrome 146.0.7680.80 Chrome f08938029c887ea624da7a1717059788ed95034d-refs/branch-heads/7680_65@{#34} “This works in Chrome for me, did you test in Firefox?” “Chrome 146 contains broken middle-click-to-paste-and-navigate” “I run Chrome 146.0.7680.80 and cannot reproduce your issue” “Apply this patch on top of Chrome f08938029c887ea624da7a1717059788ed95034d-refs/branch-heads/7680_65@{#34} and follow these steps to reproduce: […]” : I could have shortened this to or maybe , but I figured it would be helpful to be explicit because is such a short name. Users might mumble aloud “What’s an i-3-4-2-4?”, but when putting “version” in there, the implication is that i3 is some computer thing (→ a computer program) that exists in version 4.24. is the release date so that you can immediately tell if “ ” is recent. signals when the project was started and who is the main person behind it. gives credit to the many people who helped. i3 was never a one-person project; it was always a group effort. Question: “Which version of i3 are you using?” Since i3 is not a typical program that runs in a window (but a window manager / desktop environment), there is no Help → About menu option. Instead, we started asking: What is the output of ? Question: “ Are you reporting a new issue or a preexisting issue? To confirm, can you try going back to the version of i3 you used previously? ”. The technical terms for “going back” are downgrade, rollback or revert. Depending on the Linux distribution, this is either trivial or a nightmare. With NixOS, it’s trivial: you just boot into an older system “generation” by selecting that version in the bootloader. Or you revert in git, if your configs are version-controlled. With imperative Linux distributions like Debian Linux or Arch Linux, if you did not take a file system-level snapshot, there is no easy and reliable way to go back after upgrading your system. If you are lucky, you can just the older version of i3. But you might run into dependency conflicts (“version hell”). I know that it is possible to run older versions of Debian using snapshot.debian.org , but it is just not very practical, at least when I last tried. Can you check if the issue is still present in the latest i3 development version? Of course, I could also try reproducing the user issue with the latest release version, and then one additional time on the latest development version. But this way, the verification step moves to the affected user, which is good because it filters for highly-motivated bug reporters (higher chance the bug report actually results in a fix!) and it makes the user reproduce the bug twice , figuring out if it’s a flaky issue, hard-to-reproduce, if the reproduction instructions are correct, etc. A natural follow-up question: “ Does this code change make the issue go away? ” This is easy to test for the affected user who now has a development environment. Connecting to i3 via the IPC interface is an interesting test in and of itself. If a user sees output, that implies they will also be able to run debugging commands like (for example) to capture the full layout state. During a debugging session, running is an easy check to see if the version you just built is actually effective (see the line). Note that this is the same check that is relevant during production incidents: verifying that effectively running matches supposed to be running versions. Showing the full path to the loaded config file will make it obvious if the user has been editing the wrong file. If the path alone is not sufficient, the modification time (displayed both absolute and relative) will flag editing the wrong file. People build from the wrong revision. People build, but forget to install. People install, but their session does not pick it up (wrong location?). Nix fetchers like are implemented by fetching an archive ( ) file from GitHub — the full repository is not transferred, which is more efficient. Even if a repository is present, Nix usually intentionally removes it for reproducibility: directories contain packed objects that change across runs (for example), which would break reproducible builds (different hash for the same source). We build from a directory, not a Go module, so the module version is . The stamped buildinfo does not contain any information. Fetchers. These are what Flakes use, but also non-Flake use-cases. Fixed-output derivations (FOD). This is how is implemented, but the constant hash churn (updating the line) inherent to FODs is annoying. Copiers. These just copy files into the Nix store and are not git-aware. Avoid the Copiers! If you use Flakes: ❌ do not use as a Flake input ✅ use instead for git awareness I avoid the fixed-output derivation (FOD) as well. Fetching the git repository at build time is slow and inefficient. Enabling , which is needed for VCS revision stamping with this approach, is even more inefficient because a new Git repository must be constructed deterministically to keep the FOD reproducible. Nix tracks the VCS revision in the in-memory attrset. Go expects to find the VCS revision in a repository, accessed via file access and commands. It synthesizes a file so that Go’s detects a git repository. It injects a command into the that implements exactly the two commands used by Go and fails loudly on anything else (in case Go updates its implementation). It sets in the environment variable. Stamp it! Include the source VCS revision in your programs. This is not a new idea: i3 builds include their revision since 2012! Plumb it! When building / packaging, ensure the VCS revision does not get lost. My “VCS rev with NixOS” case study section above illustrates several reasons why the VCS rev could get lost, which paths can work and how to fix the missing plumbing. Report it! Make your software print its VCS revision on every relevant surface, for example: Executable programs: Report the VCS revision when run with For Go programs, you can always use Services and batch jobs: Include the VCS revision in the startup logs. Outgoing HTTP requests: Include the VCS revision in the HTTP responses: Include the VCS revision in a header (internally) Remote Procedure Calls (RPCs): Include the revision in RPC metadata User Interfaces: Expose the revision somewhere visible for debugging. My overlay for Nix / NixOS My repository is a community resource to collect examples (as markdown content) and includes a Go module with a few helpers to make version reporting trivial.

0 views
Jason Scheirer 1 weeks ago

Golang Webview Installer for Wails 3

Top Matter : Codeberg for the library , doc for the library . I’ve forked Lea Anthony’s library that eventually made its way into core Wails for two reasons: So here we are. I want it in Wails 3 and it’s not there I want to shave a meg off the binary size by not providing the embedded installer exe

0 views
Anton Zhiyanov 1 weeks ago

Porting Go's strings package to C

Creating a subset of Go that translates to C was never my end goal. I liked writing C code with Go, but without the standard library it felt pretty limited. So, the next logical step was to port Go's stdlib to C. Of course, this isn't something I could do all at once. I started with the io package , which provides core abstractions like and , as well as general-purpose functions like . But isn't very interesting on its own, since it doesn't include specific reader or writer implementations. So my next choices were naturally and — the workhorses of almost every Go program. This post is about how the porting process went. Bits and UTF-8 • Bytes • Allocators • Buffers and builders • Benchmarks • Optimizing search • Optimizing builder • Wrapping up Before I could start porting , I had to deal with its dependencies first: Both of these packages are made up of pure functions, so they were pretty easy to port. The only minor challenge was the difference in operator precedence between Go and C — specifically, bit shifts ( , ). In Go, bit shifts have higher precedence than addition and subtraction. In C, they have lower precedence: The simplest solution was to just use parentheses everywhere shifts are involved: With and done, I moved on to . The package provides functions for working with byte slices: Some of them were easy to port, like . Here's how it looks in Go: And here's the C version: Just like in Go, the ( → ) macro doesn't allocate memory; it just reinterprets the byte slice's underlying storage as a string. The function (which works like in Go) is easy to implement using from the libc API. Another example is the function, which looks for a specific byte in a slice. Here's the pure-Go implementation: And here's the C version: I used a regular C loop to mimic Go's : But and don't allocate memory. What should I do with , since it clearly does? I had a decision to make. The Go runtime handles memory allocation and deallocation automatically. In C, I had a few options: An allocator is a tool that reserves memory (typically on the heap) so a program can store its data structures there. See Allocators from C to Zig if you want to learn more about them. For me, the winner was clear. Modern systems programming languages like Zig and Odin clearly showed the value of allocators: An is an interface with three methods: , , and . In C, it translates to a struct with function pointers: As I mentioned in the post about porting the io package , this interface representation isn't as efficient as using a static method table, but it's simpler. If you're interested in other options, check out the post on interfaces . By convention, if a function allocates memory, it takes an allocator as its first parameter. So Go's : Translates to this C code: If the caller doesn't care about using a specific allocator, they can just pass an empty allocator, and the implementation will use the system allocator — , , and from libc. Here's a simplified version of the system allocator (I removed safety checks to make it easier to read): The system allocator is stateless, so it's safe to have a global instance: Here's an example of how to call with an allocator: Way better than hidden allocations! Besides pure functions, and also provide types like , , and . I ported them using the same approach as with functions. For types that allocate memory, like , the allocator becomes a struct field: The code is pretty wordy — most C developers would dislike using instead of something shorter like . My solution to this problem is to automatically translate Go code to C (which is actually what I do when porting Go's stdlib). If you're interested, check out the post about this approach — Solod: Go can be a better C . Types that don't allocate, like , need no special treatment — they translate directly to C structs without an allocator field. The package is the twin of , so porting it was uneventful. Here's usage example in Go and C side by side: Again, the C code is just a more verbose version of Go's implementation, plus explicit memory allocation. What's the point of writing C code if it's slow, right? I decided it was time to benchmark the ported C types and functions against their Go versions. To do that, I ported the benchmarking part of Go's package. Surprisingly, the simplified version was only 300 lines long and included everything I needed: Here's a sample benchmark for the type: Reads almost like Go's benchmarks. To monitor memory usage, I created — a memory allocator that wraps another allocator and keeps track of allocations: The benchmark gets an allocator through the function and wraps it in a to keep track of allocations: There's no auto-discovery, but the manual setup is quite straightforward. With the benchmarking setup ready, I ran benchmarks on the package. Some functions did well — about 1.5-2x faster than their Go equivalents: But (searching for a substring in a string) was a total disaster — it was nearly 20 times slower than in Go: The problem was caused by the function we looked at earlier: This "pure" Go implementation is just a fallback. On most platforms, Go uses a specialized version of written in assembly. For the C version, the easiest solution was to use , which is also optimized for most platforms: With this fix, the benchmark results changed drastically: Still not quite as fast as Go, but it's close. Honestly, I don't know why the -based implementation is still slower than Go's assembly here, but I decided not to pursue it any further. After running the rest of the function benchmarks, the ported versions won all of them except for two: Benchmarking details is a common way to compose strings from parts in Go, so I tested its performance too. The results were worse than I expected: Here, the C version performed about the same as Go, but I expected it to be faster. Unlike , is written entirely in Go, so there's no reason the ported version should lose in this benchmark. The method looked almost identical in Go and C: Go's automatically grows the backing slice, while does it manually ( , on the contrary, doesn't grow the slice — it's merely a wrapper). So, there shouldn't be any difference. I had to investigate. Looking at the compiled binary, I noticed a difference in how the functions returned results. Go returns multiple values in separate registers, so uses three registers: one for 8-byte , two for the interface (implemented as two 8-byte pointers). But in C, was a single struct made up of two unions and a pointer: Of course, this 56-byte monster can't be returned in registers — the C calling convention passes it through memory instead. Since is on the hot path in the benchmark, I figured this had to be the issue. So I switched from a single monolithic type to signature-specific types for multi-return pairs: Now, the implementation in C looked like this: is only 16 bytes — small enough to be returned in two registers. Problem solved! But it wasn't — the benchmark only showed a slight improvement. After looking into it more, I finally found the real issue: unlike Go, the C compiler wasn't inlining calls. Adding and moving to the header file made all the difference: 2-4x faster. That's what I was hoping for! Porting and was a mix of easy parts and interesting challenges. The pure functions were straightforward — just translate the syntax and pay attention to operator precedence. The real design challenge was memory management. Using allocators turned out to be a good solution, making memory allocation clear and explicit without being too difficult to use. The benchmarks showed that the C versions outperformed Go in most cases, sometimes by 2-4x. The only exceptions were and , where Go relies on hand-written assembly. The optimization was an interesting challenge: what seemed like a return-type issue was actually an inlining problem, and fixing it gave a nice speed boost. There's a lot more of Go's stdlib to port. In the next post, we'll cover — a very unique Go package. In the meantime, if you'd like to write Go that translates to C — with no runtime and manual memory management — I invite you to try Solod . The and packages are included, of course. implements bit counting and manipulation functions. implements functions for UTF-8 encoded text. Loop over the slice indexes with ( is a macro that returns , similar to Go's built-in). Access the i-th byte with (a bounds-checking macro that returns ). Use a reliable garbage collector like Boehm GC to closely match Go's behavior. Allocate memory with libc's and have the caller free it later with . Introduce allocators. It's obvious whether a function allocates memory or not: if it has an allocator as a parameter, it allocates. It's easy to use different allocation methods: you can use for one function, an arena for another, and a stack allocator for a third. It helps with testing and debugging: you can use a tracking allocator to find memory leaks, or a failing allocator to test error handling. Figuring out how many iterations to run. Running the benchmark function in a loop. Recording metrics (ns/op, MB/s, B/op, allocs/op). Reporting the results.

0 views
Taranis 2 weeks ago

Go has some tricks up its logging sleeve

Since it's more or less TDOV (IYKYK...), I'm going to talk about logging instead. Logging isn't exactly the most shiny or in-your-face thing that coders tend to think about, but it really can make or break large systems. Throwing in a few print statements (or fmt.Printf, or whatever) only scratches the surface. I'm mostly talking about my own logging library here. If there's interest, I'd consider releasing it as open source, but it's currently a bit of a moving target. Feel free to comment if you think you'd find it useful, and I'll try to find the time to split it out from the Euravox codebase and put it on GitHub. The Go programming language ships with logging capabilities in the standard library, found in the log package. If you don't have any better alternatives, using that package rather than raw fmt.Printf is far preferable. My own logging package is a bit nicer. It's not my first – one of my first jobs working in financial markets data systems back in the 90s was the logging subsystem for the Reuter Workstation, and there is some influence from that 30-odd years later in my library. One of the first things I always recommend is breaking out log messages by log level. I currently define the following: It's possible to set a configuration parameter that limits logging at a particular level. This makes it possible to crank logging all the way up for tests, but dial it down for production without changing the code or having to introduce if/then guards around the logging. It was a finding back in the 90s that systems would sometimes break when you took the logging out – this isn't something that's normally a problem with Go, because idiomatic code doesn't tend to have too many side-effects, but it was quite noticeable with C++. Of course, the library doesn't do the string formatting if the level is disabled, but any parameters are still evaluated, which tends to be a less risky approach. It's common to send log messages to stdout or stderr. There's nothing fundamentally wrong with this, but I find it useful to have deeper capabilities than this. My own library has three options, which can be used together (and with different log levels): Any good logging solution should be able to include file name and line number information in log output. Using an IDE like vscode, this allows control/command-clicking a log entry and immediately seeing the code that generated it. C and C++ support this via some fancy #define stunts. Go lacks this kind of preprocessor, but actually has something far better: the runtime.Caller() library function. This makes it possible to pull back the file name and line number (and program counter if you care) anywhere up the call stack. This code fragment comes from my logging function. The argument to Caller is typically 2, because this code is called from one of many convenience functions for syntactic sugar. Typical log commands look something like this: The logging library will automatically pick up the file paths and line numbers where the log commands are located. However, this isn't always useful, and sometimes can be a complete nightmare. Here's a small example: In this case, the file name and line number that will be logged will be where the command is located. This can be absolutely maddening if has many call sites, because they will look exactly the same in the log. My logging library has a small tweak that I've not seen elsewhere – I'm not claiming invention or ownership, because it's so obviously useful that I'd be shocked if nobody else has ever done it. It's just I've not personally seen it. Anyway, here goes: In this case, works similarly to , but it takes an extra parameter at the start, which represents how many extra stack frames to look through to find the filename and line number. The parameter returns the filename and line number of the immediate caller, so the thing that makes its way into the log is the location of the calls, not the logging calls themselves. This might seem to be a subtle difference, but the practical consequences are huge – get this right, and logs become useful traces of activity that make it possible to look backwards in time to see when particular data items have been acted upon, and exactly by what code. Almost as good as single-stepping with a debugger, but can be done after the fact. Anyway, in conclusion, trans women are women, trans men are men, nonbinary and all other variant identities are valid. And fuck fascism. SPM -- Spam messages. Very verbose logging, not something you'd normally use, but the kind of thing that makes all the difference doing detailed debugging. INF -- Information messages. These are intended to be low volume, used to help trace what systems are doing, but not actually representing a an error (i.e., they explicitly are used to log normal behaviour) WRN -- Warning messages. What it says on the tin. Something is possibly wonky, but not bad enough to be an actual error. Real production systems should have as close to zero of these things as possible -- samething should either be normal (INF) or an actual error (ERR). ERR -- Error messages. This represents recoverable errors. Something bad happened, but the code can keep running without risk. FTL -- Fatal errors. These errors show that something very bad has happened, and that the code must abort immediately. There are two cases where this is appropriate. One is when something catastrophic has happened -- system has run out of handles, process is OOMing, etc. The second is where a serious logic bug has been detected. Though in some cases ERR can be OK for this, aborting makes it easier to spot that processes in production are badly broken (e.g., after a bad push), and need to be rolled back. stdout. Nothing special here, but I do have the option to send colour control codes for terminals that support it, which makes logs much more readable. Files. This is similar to piping the process through the tee command, but has the advantage that things like log rotation can be built in. I need to get around to supporting log rotation, but file output works now. Circular buffer. This is the one you don't see often. The idea here is you maintain an in-RAM circular buffer of N lines (say about 5000), which can be exposed via code. I use this to provide an HTTP/HTML interface that makes it possible to watch log output on a process via a web browser. This is a godsend when you have a large number of processes running across multiple VMs and/or physical machines.

0 views
Allen Pike 2 weeks ago

The Rise of Transparency

Small companies are, by default, very transparent. When there are 4 people working in a room, you have a direct line of sight on what everybody else is doing, and why. Your docs, Slack channels, and repositories are open to everybody. When the CEO has an epiphany that changes everything, you all know right away – probably because you were at lunch together when it happened. Thus, startup founders will often get religion about transparency. “Our culture,” they’ll declare, “is to be radically transparent! Everything defaults to open. We hire adults, expect them to do great work, and give them the context they need.” Yay transparency! And this works pretty well. Transparent orgs tend to delegate more effectively, have higher accountability, less politics, faster trust, and just plain ship more. Transparency helps bigger orgs adapt more quickly to the ground truth, responding to customer signals that execs might not be directly exposed to. But, at a certain scale, radical transparency strains. Some idle musing by the CEO sends a team off on an unimportant side quest. A well-justified compensation anomaly upsets a group who is missing background information. A 450-message Slack thread about bike shed paint color choices devolves into factions, hashtags, and philosophical arguments about the morality of taupe. #nevertaupe And if you talk to people at a large yet highly transparent company, you’ll hear about the hazards of the relentless firehose . A thousand shared Slack channels, to start. But also a glut of docs – some critical, most unmaintained. Then there’s the meeting notes, meeting recordings, and meeting invites. Plus proposals, requests for comment, and requests to comment on your proposals’ comments’ resolutions. “So, you like information, eh? Well, have all the information in the world!” How do you make sense of all this? While some people are tenaciously able to find, within this chaos, the important info they need to do great work, a lot of otherwise-capable people get easily distracted by information that just might be urgent, provocative, or even just… shiny. 💫 Meanwhile, allowing everybody access to every historical doc is occasionally useful, but it also presents an ever-growing surface area for leaks and legal liability. Are you sure there isn’t something highly sensitive or disagreeable in those 99,999 unmaintained Notion docs? So, as companies grow, they tend to lock information down. Some – Netflix, Stripe, Shopify – do their best to keep as transparent as possible while still complying with necessary guardrails. Others – Apple, Palantir, Oracle – move toward a need-to-know basis, ensuring information flows top-down. With more control over information, it’s easier to ensure that leaks or internal distractions don’t derail your plans for surprising product launches and/or world domination. Of course, every company’s culture is forged by the market they operate in, but there’s always some tradeoff here. And as companies grow, they tend to regress to a boring middle ground. However. As with many tradeoffs, the balance has recently begun to shift. Recently, we’ve seen a revolution in tools that can make better use of the firehose. Slack can now summarize your unread messages, albeit with mixed effectiveness. Tools like Glean and Unblocked can consider a mountain of your company’s data and answer important questions about it, albeit limited to the data they can actually see. And large open companies like Shopify and Stripe have internal tools that let employees’ agents query, analyze, and act on the copious data any given employee has access to – albeit with some sharp edges and exfiltration risks. Just as LLMs are making the world’s data more useful to the world, they’re making companies’ internal data more useful to employees. Of course, this can be misused! In some companies we’ll see further secrecy – I’ve heard of AI search tools and MCPs letting employees find accidentally-visible compensation data and other spicy docs that hadn’t been audited. I’ve heard of support agents giving customers true-but-problematic information because they surfaced it with internal AI tooling without proper training. But as we evolve past early growing pains, and into teams and processes fully making use of this stuff, the anecdata points toward this new tooling becoming a superpower. Agents’ newfound ability to effectively query and reason about far more data than can fit into context is making the long tail of communications and docs much more useful for decision-making – but only when people have access to the relevant data. Given that, the maturation of AI tooling will motivate companies to become more transparent . In 2024, the cost of being internally secretive was meaningful but manageable. Although Apple keeping information need-to-know sometimes leads to waste, or important changes being slow to diffuse through layers of management, they’ve done, like, pretty well for themselves? With all the scrutiny from press, competitors, and regulators, you can see why they’ve kept it up. But as all companies increasingly have tools that can assess, consider, analyze, and make use of all the business’ communications and documents, what kinds of org are going to benefit most? Well, the ones that let their employees access more context. Extremely transparent orgs like Zapier, GitLab, and PostHog that might have struggled to cope with their firehoses – and who often had gaps in the data due to untranscribed meetings and decisions – will increasingly be able to leverage it. Sure, not all of it, certainly not at first. (Some of it is just junk.) But increasingly more of it. And critically, it won’t just be executives that will be able to attend to all this knowledge. The frontend dev working on your internal admin dashboard should be flagged that the React upgrade issue they’re battling right now was just solved by the customer-facing dev team. The intermediate developer who is incensed about a company-wide tech decision should be able to build their understanding of why it was made without booking a 1:1 with the responsible Principal Engineer. Your go-to-market team should be able to “see” through to the code, developers’ conversations, and the recent decisions around a given feature, letting them give customers correct and timely information about what to actually expect from the product today. And everybody in your company should, when it’s useful, have key company-wide strategy docs available to their agents as they make plans and decisions. And then, when a new revelation motivates the exec team to improve those docs, then bam. All the product engineers’ agents will take this new strategy into account right away. Anybody who’s worked at a large company and/or used CLAUDE.md knows this won’t be a silver bullet – deeply ingrained habits and momentum can not be simply prompted away. But as the tools and the data improve, the advantage will accumulate. When we launched a realtime meeting agent last month, we expected to get feedback about its defaults being too open – currently, Cedarloop defaults to sharing its collaborative notes and tools with all attendees live. But instead, we’ve seen two diverging kinds of feedback: many of our users want the tool to be less visible to external guests and customers, but more open internally within their companies. Which in retrospect makes a lot of sense: decisions and actions in your team’s work are increasingly useful across your company, but your customers shouldn’t need to worry about all that. So long story short, more internal transparency is coming. It will take some time. Apple isn’t doomed, and just because Zapier and Shopify are already working that way doesn’t mean they’re going to instantly be turbo-boosted. But it seems a new era is coming, where siloed knowledge, information hoarding, and secrecy-by-default will become less tenable. The firehose will evolve from a spicy distraction to a useful input to important work.

0 views
(think) 2 weeks ago

Batppuccin: My Take on Catppuccin for Emacs

I promised I’d take a break from building Tree-sitter major modes, and I meant it. So what better way to relax than to build… color themes? Yeah, I know. My idea of chilling is weird, but I genuinely enjoy working on random Emacs packages. Most of the time at least… For a very long time my go-to Emacs themes were Zenburn and Solarized – both of which I maintain popular Emacs ports for. Zenburn was actually one of my very first open source projects (created way back in 2010, when Emacs 24 was brand new). It served me well for years. But at some point I got bored. You know the feeling – you’ve been staring at the same color palette for so long that you stop seeing it. My experiments with other editors (Helix, Zed, VS Code) introduced me to Tokyo Night and Catppuccin , and they’ve been my daily drivers since then. Eventually, I ended up creating my own Emacs ports of both. I’ve already published emacs-tokyo-themes , and I’ll write more about that one down the road. Today is all about Catppuccin. (and by this I totally mean Batppuccin!) There’s already an official Catppuccin theme for Emacs , and it works. So why build another one? A few reasons. The official port registers a single theme and switches between flavors (Mocha, Macchiato, Frappe, Latte) via a global variable and a reload function. This is unusual by Emacs standards and breaks the normal workflow – theme-switching packages like need custom glue code to work with it. It also loads color definitions from an external file in a way that fails when Emacs hasn’t marked the theme as safe yet, which means some users can’t load the theme at all. Beyond the architecture, there are style guide issues. is set to the default text color, making variables invisible. All levels use the same blue, so org-mode headings are flat. forces green on all unstyled code. Several faces still ship with magenta placeholder colors. And there’s no support for popular packages like vertico, marginalia, transient, flycheck, or cider. I think some of this comes from the official port trying to match the structure of the Neovim version, which makes sense for their cross-editor tooling but doesn’t sit well with how Emacs does things. 1 Batppuccin is my opinionated take on Catppuccin for Emacs. The name is a play on my last name (Batsov) + Catppuccin. 2 I guess you can think of this as ’s Catppuccin… or perhaps Batman’s Catppuccin? The key differences from the official port: Four proper themes. , , , and are all separate themes that work with out of the box. No special reload dance needed. Faithful to the style guide. Mauve for keywords, green for strings, blue for functions, peach for constants, sky for operators, yellow for types, overlay2 for comments, rosewater for the cursor. The rainbow heading cycle (red, peach, yellow, green, sapphire, lavender) makes org-mode and outline headings actually distinguishable. Broad face coverage. Built-in Emacs faces plus magit, vertico, corfu, marginalia, embark, orderless, consult, transient, flycheck, cider, company, doom-modeline, treemacs, web-mode, and more. No placeholder colors. Clean architecture. Shared infrastructure in , thin wrapper files for each flavor, color override mechanism, configurable heading scaling. The same pattern I use in zenburn-emacs and emacs-tokyo-night-theme . I didn’t really re-invent anything here - I just created a theme in a way I’m comfortable with. I’m not going to bother with screenshots here – it looks like Catppuccin, because it is Catppuccin. There are small visual differences if you know where to look (headings, variables, a few face tweaks), but most people wouldn’t notice them side by side. If you’ve seen Catppuccin, you know what to expect. The easiest way to install it right now: Replace with , , or for the other flavors. You can also switch interactively with . I remember when Solarized was the hot new thing and there were something like five competing Emacs ports of it. People had strong opinions about which one got the colors right, which one had better org-mode support, which one worked with their favorite completion framework. And that was fine! Different ports serve different needs and different tastes. The same applies here. The official Catppuccin port is perfectly usable for a lot of people. Batppuccin is for people who want something more idiomatic to Emacs, with broader face coverage and stricter adherence to the upstream style guide. Both can coexist happily. I’ve said many times that for me the best aspect of Emacs is that you can tweak it infinitely to make it your own, so as far as I’m concerned having a theme that you’re the only user of is perfectly fine. That being said, I hope a few of you will appreciate my take on Catppuccin as well. This is an early release and there’s plenty of room for improvement. I’m sure there are faces I’ve missed, colors that could be tweaked, and packages that deserve better support. If you try it out and something looks off, please open an issue or send a PR. I’m also curious – what are your favorite Emacs themes these days? Still rocking Zenburn? Converted to modus-themes? Something else entirely? I’d love to hear about it. That’s all from me, folks! Keep hacking! The official port uses Catppuccin’s Whiskers template tool to generate the Elisp from a template, which is cool for keeping ports in sync across editors but means the generated code doesn’t follow Emacs conventions.  ↩︎ Naming is hard, but it should also be fun! Also – I’m a huge fan of Batman.  ↩︎ The official port uses Catppuccin’s Whiskers template tool to generate the Elisp from a template, which is cool for keeping ports in sync across editors but means the generated code doesn’t follow Emacs conventions.  ↩︎ Naming is hard, but it should also be fun! Also – I’m a huge fan of Batman.  ↩︎

0 views
Carlos Becker 2 weeks ago

Announcing GoReleaser v2.15

This version a big one for Linux packaging - Flatpak bundles and Source RPMs land in the same release, alongside a rebuilt documentation website and better Go build defaults.

0 views
HeyDingus 2 weeks ago

Launchpad was great for uninstalling apps; Spotlight is not

Apple published this video to their Support channel on YouTube yesterday, and it motivated me to get this off my chest: Uninstalling apps on macOS is not as easy as it should be. Yes, I know, I know that you can just drag an app to the trash and technically it’s gone. That’s what Apple recommends doing in its video. But then why do are apps like Raycast , CleanMyMac , and AppCleaner able to find leftover files scattered around your system by the deleted app? Maybe it’s just the completionist in me, but I don’t want those files left behind! One thing — the only thing? — I liked about Launchpad was that it made it super obvious how to uninstall (Mac App Store) apps. 1 Just like on your iPad/iPhone, you could click and hold on the app’s icon to send it into “ jiggle mode” and then click the ‘ X’ would remove it. I could be confident that all the app’s associated bits and bobs would be removed from my system. But that changed with Tahoe. While Spotlight got a huge boost in capability as a whole with clipboard history and actions, it also subsumed Launchpad’s role as the main, well, launcher for apps. But there are no affordances in Spotlight for removing apps like Launchpad had. AppCleaner was my go-to tool back in the day, but now I use Raycast to get the job done with confidence. Raycast’s implementation could offer some inspiration for Apple. After searching for an app within Raycast, a simple ⌘K shortcut reveals a host of actions that can be taken on the app. You can open an app, reveal it in the Finder, quit it, and, yes, uninstall it — among other things. Apple could follow this model and provide an ‘ Uninstall App’ action to take within Spotlight. Spotlight’s interface, seeing as it replaced Launchpad, should offer the same capability for removing apps. And it should be as thorough as on an iPhone or iPad. P.S. I also occasionally use Raycast to quit apps that stubbornly have no icon in the Dock or menu bar and therefore make it tricky quit completely. Apps installed outside of the Mac App Store would not display the ‘ X’ to remove it. You had to do it the “ old fashioned” way of dragging the app to the trash and then hunt down its system files. ↩︎ HeyDingus is a blog by Jarrod Blundy about technology, the great outdoors, and other musings. If you like what you see — the blog posts , shortcuts , wallpapers , scripts , or anything — please consider leaving a tip , checking out my store , or just sharing my work. Your support is much appreciated! I’m always happy to hear from you on social , or by good ol' email . Apps installed outside of the Mac App Store would not display the ‘ X’ to remove it. You had to do it the “ old fashioned” way of dragging the app to the trash and then hunt down its system files. ↩︎

0 views
Stratechery 2 weeks ago

An Interview with Arm CEO Rene Haas About Selling Chips

Listen to this post: Good morning, This week’s Stratechery Interview is with Arm CEO Rene Haas, who I previously spoke to in January 2024 , and who recently made a major announcement at Arm’s first-ever standalone keynote : the long-time IP-licensing company is undergoing a dramatic shift in its business model and selling its own chips for the first time. We dive deep into that decision in this interview, including the meta of the keynote, Arm’s history, and how the company has evolved, particularly under Haas’ leadership. Then we get into why CPUs matter for AI, and how Arm’s CPU compares to Nvidia’s, x86, and other custom Arm silicon. At the end we discuss the risks Arm faces, including a maxed-out supply chain, and how the company will need to change to support this new direction. As a reminder, all Stratechery content, including interviews, is available as a podcast; click the link at the top of this email to add Stratechery to your podcast player. On to the Interview: This interview is lightly edited for clarity. Rene Haas, welcome back to Stratechery. RH: Ben Thompson, thank you. Well, you used to be someone special, I think you were the only CEO I talked to who did nothing other than license IP, now you’re just another fabless chip guy like [Nvidia CEO] Jensen [Huang] or [Qualcomm CEO] Cristiano [Amon]. RH: (laugh) Yeah, you can put me in that category, I guess. Well the reason to talk this week is about the momentous announcements you made at the Arm Everywhere keynote — you will be selling your own chip. But before I get to the chip, i’m kind of interested in the meta of the keynote itself, is this Arm Everywhere concept new like as far as being a keynote? Why have your own event? RH: You know, we were talking a little bit about this going into the day. I don’t think we’ve ever as a company done anything like this. Yeah I didn’t think so either, I was trying to verify just to make sure my memory was correct, but yes it’s usually like at Computex or something like that. RH: Our product launches have usually been lower key, we try to use them usually around OEM products that are using our IP that use our partner’s chips, but we just felt like this was such a momentous day for the company/very different day for the company that we want to do something very, very unique. So it was very intentional, we were chatting about it prior, I don’t think we’ve done anything like before. Who was the customer for the keynote specifically? Because you’re making a chip — Meta is your first customer, they knew about this, they don’t need to be told — what was the motivation here? Who are you targeting? RH: When you prepare for these things, that’s one of the first questions you ask yourself, “Who is this for?”, “Is it for the ecosystem?”, “Is it for customers?”, “Is it for investors?”, “Is it for employees?”, and I think under the umbrella of Arm Everywhere, the answer to those questions was “Yes”, everybody. We felt we needed to, because a lot of questions come up on this, right, Ben, in terms of, “What are we doing?” “Why are we doing?”, “What’s this all about?”, the answer to that question was “Yes”, it was for everyone. One more question: Why the name “Arm Everywhere”? RH: We were trying to come up with something that was going to thematically remind people a bit about who Arm was and what we are and what we encompass, but not actually tease out that we were going to be announcing something. Right, you can’t say “Arm’s New Chip Event”. RH: (laughing) Yes, exactly, “Come to the new product launch that we’ve not yet announced”. So we just decided that that would be enough of a teaser to get people interested. Just to note you said, “What Arm was “, what was Arm? You used the past tense there. RH: Yeah, and I will say, we are still doing IP licensing, you can still buy CSSs [Compute Subsystem Platforms], so we are still offering all of the products we did before that day and plus chips, so I’m not yet just another chip CEO, I think I’m still very different than the other folks you talked to. Actually, back up, give me the whole Rene Haas version of the history of Arm. RH: Oh, my goodness gracious. The company was born out of a joint venture way back in the day between Acorn Computer and then ultimately Apple and VLSI to design a low-power CPU to power PDAs. The thing that was kind of important was, “I need something that is going to run in a plastic package” — you may remember back then just about everything was in ceramic — “I can’t melt the PDA, and oh, by the way, this thing’s got to run off a battery”. So they chose a RISC architecture, and that’s where the ARM ISA [ instruction set architecture ] was born and that’s what the first chip was intended to do, and the thing wasn’t very successful. So fast forward, however, the founders and then a very, very important guy in Arm’s history, Robin Saxby , put out a goal to make the ARM ISA the global standard for CPUs. And if you go back to early 1990s, there were a lot of CPUs out there and also there was not an IP business, there really wasn’t a very good fabless semiconductor model, and there was not a very good set of tools to develop SoCs [system on a chip] . So in some ways, and this is what I love about the company, it was a bit of a crazy idea because you didn’t really have all the things in place necessary to go off and do that. But back then, there were a lot of companies designing their own CPUs, if you will, and the idea there being that ultimately this would be something that customers could be able to access, acquire, and build, and then ultimately build a standard upon it. It was ultimately the killer design win for the company, and I know you’re a strategist and historian as well around this area, is the classic accidental example of TI was developing the baseband modem for an applications processor for the Nokia GSM phone and they needed a microcontroller, something to kind of manage the overall process, and they stumbled across what we were doing, and we licensed them the IP. That was kind of the first killer license that got the company off the ground and that’s what really got us into mobile. People may think, “You were the heart of the smartphone and you had this premonition to design around iOS” or, “You worked really closely in the early days of Android”, it was the accidental, we found ourselves into the Nokia phone, GSM phone, Symbian gets ported to ARM, and then there starts to be at least enough of a buzz around nascent software, but that’s how the company was born. I did enjoy for the keynote, you had a bunch of different Arm devices in the run-up running on the screen, and my heart did do a little pitter-patter when the Nokia phones popped on. Another day, to be sure. RH: Yeah, cool stuff right? But that’s kind of how the company got off the ground, and as it was a general purpose CPU which meant we didn’t really have it designed for, “It’s going to be good at X”, or, “It’s going to be good at Y, it’s going to be good at Z”, it turned out that because it was low power, it was pretty good to run in a mobile application. I think the historic design win where the company took off was obviously the iPhone, and the precursor to the iPhone was the iPod was using a chipset from PortalPlayer that used the ARM7 and the Mac OS was all x86, and then inside the company, it was Tony Fadell’s team arguing , “Let’s use this PortalPlayer architecture”, versus, “Do we go with Intel’s x86 and a derivative atom”, back in the day, and once a decision was made that “We’re going to port to ARM for iOS”, that’s where the tailwind took off. So is it definitely making up too much history to go back and say, “The reason Arm was a joint venture to start is because people knew you needed to have an ecosystem and not be owned by any one company”, or whatever it might be, that’s being too cute about things — the reality is it was just stumbling around, barely surviving, and just fell backwards into this? RH: Which, by the way, every good startup that’s really been successful, that’s kind of how the formula works. You stumble around in the dark, you find something you’re good at and then you engage with a customer and you find what ultimately is sticky and that’s really what happened with Arm. When you consider the changes that you’ve made at Arm, and I want to get your description of the changes that you’ve made, but how many of the challenges that you face were based on legitimate market fears about, “We’re going to alienate customers” or whatever it might be versus maybe more cultural values like, “We serve everyone”, versus almost like a fear like, “This is just the market we’ve got, let’s hold on to it”? RH: I think, Ben, we thought about it much more broadly, and when I took over and you and I met not long after that, there were a couple of things that were happening in the market in terms of a need to develop SoCs faster, a need to get to market more quickly and we knew that intuitively that no one knew how to combine 128 Arm cores together with a mesh network and have it perform better than we could because that’s what we had to do to go off and verify the cores. So we knew that doing compute subsystems really mattered, but I came from a bit of a different belief that if you own the ISA at the end of the day, you are the platform, you are the compute platform and it is incumbent upon you to think about how to have a closer connection between the hardware and the software, that is just table stakes. I don’t think it’s anything new, if you think about what Steve Jobs thought about with Apple and everything we’ve seen with Microsoft, with Wintel. I felt with Arm, particularly not long after I started, in 2023 and 2024, this was only getting accelerated with AI. Because with AI, the models and innovation moving way, way faster than the hardware could possibly keep up. I just felt for the company in the long term that this was a direction that we had to strongly consider, because if you are the ISA and you are the platform, the chip is not the product, the system is. That’s the thing that I was sort of driving at when I was writing about your launch. There’s an aspect where you’ve made these big changes, you’re originally just the ISA, then you’re doing your own cores, not selling them, but you’re basically designing the cores, then you’re moving to these systems on a chip designs and now you’re selling your own chips. But it feels like your portion of the overall, “What is a computer?”, has stayed fairly stable, actually, because, “What is a computer?”, is just becoming dramatically more expansive. RH: I think that’s exactly right. Again, if you are a curator of the architecture and you are an owner of the ISA, as good as the performance-per-watt is, as interesting as the microarchitecture is, as cool as it is in terms of how you do branch prediction, the software ecosystem determines your destiny. And the software ecosystem for anyone building a platform needs to have a much closer relationship between hardware and software, simply in terms of just how fast can you bring features to market, how fast can you accelerate the ecosystem, and how can you move with the direction of travel in terms of how things are evolving. You mentioned the big turning point or biggest design win was the iPhone way back in the day, and the way I’ve thought about Arm versus x86 — there’s been, you could make the case, ARM/RISC has been theoretically more efficient then CISC, and I’ve talked to Pat Gelsinger about how there was a big debate in Intel way back in the 80s about should we switch from CISC to RISC, and he was on the side of and won the argument that by the time we port everything to RISC we could have just built a faster CISC chip that is going to make up all the difference and that carried the day for a very long time. However, mobile required a total restart, you had to rebuild everything from scratch to deliver the power efficiency, and I guess the question is, you’ve had a similar dynamic for a long time about Arm in the data center theoretically is better, you care about power efficiency etc, is there something now — is this an iPhone-type moment where there’s actually an opportunity for a total reset to get all the software rewritten that needs to be done? Or have companies like Amazon and Qualcomm or whatever efforts they’ve done paved the ground that it’s not so stark of a change? RH: It’s a combination of both. One of the big advantages we got with Amazon doing Graviton in 2019, and then subsequently the designs we had with Google, with Axion, and Microsoft with Cobalt, is it just really accelerated everything going on with cloud-native, and anything that moves to cloud-native has kind of started with ARM. What do you mean by cloud native? RH: Cloud-native meaning these are applications that are starting from scratch to be ported to ARM. Built on a Linux distro, but not having to carry anything about running super old legacy software or running COBOL or something of that nature on-prem, so that was a huge benefit for us in terms of the go-forward. Certainly we got a huge interjection of growth when Nvidia went from the generation before Hopper, which I think was Volta or Pascal, I may be mixing up their versions, which was an x86 connect to Grace. So when they went to Grace Hopper, then Grace Blackwell, and now Vera, the AI stack for the head node now starts to look like ARM, that helps a lot in terms of how the data center is organized, so we certainly got a benefit with that. I think for us, the penny drop moment was when, and it’s probably 2018, 19 timeframe, is when Red Hat had production Linux distros for ARM and that really also accelerated things in terms of the open source community, the uploads and things that made things a lot, a lot easier from the software standpoint. Give me the timeline of this chip. When did you make the decision to build this chip? You can tell me now, when did this start? RH: You know, it started with a CSS, right? And we were talking to Meta about the CSS implementation. Right. And just for listeners, CSS is where you’re basically delivering the design for a whole system on a chip sort of thing. RH: Compute subsystem, yeah, so it’s the whole system on a chip. And by the way, it’s probably 95% of the IP that sits on a chip. What doesn’t include? It doesn’t include the I/O, the PCIe controllers, the memory controllers, but it’s most of the IP. And this is what undergirds — is Cobalt really the first real shipping CSS chip? Or does Graviton fall under this as well? RH: Cobalt’s probably the first incarnation of using that, so Meta was looking at using that and I think the discussions were taking place in the 2025 timeframe, mid-2025 timeframe. Here’s the key thing, Ben, not that long ago. Right. Well, that was my sense it was not that long ago, so I’m glad to hear that confirmed. RH: Not that long ago. Because CSS takes you a lot of the way there so that discussion in around the 2025 timeframe that we were going back and forth of, “Are you licensing CSS”, versus, “Could you build something for us?”, and we had been musing about, “Was this the right thing for us to do from a strategy standpoint?”, and how we thought about it, but ultimately it came down to Meta saying, “We really want you to do this for us, we think this is going to be the best way to accelerate time to market and give us a chip that’s performant and in the schedule that we need”, so somewhere in the 2025-ish timeframe, we agreed that, yes, we’ll do this for you. Why did Meta want you to do it instead of them finishing it off themselves? RH: I think they just did the ROI, in terms of, “I’ve got a lot of people working on things like MTIA , I’ve got a whole bunch of different projects internally, is it better that you do it versus we do it”? “How much can we actually differentiate a CPU”? RH: Yeah and by the way, that is ultimately what it comes down to at some point in time and the fact that the first one that came back works, it’s going to be able to go into production, and it’s ready to go. I’m not going to say they were shocked, but we kind of knew that was going to happen because we knew how to do this stuff and the products were highly performant and tested in the CSS, so it happened fast is the short answer. So if we talk about Arm crossing the Rubicon, was it actually not you selling this chip it was when you did CSS? RH: One could say that that was a big step. When we started talking about doing CSSs, let me step back, we made a decision to do CSSs— Explain CSSs and that decision because I think that’s actually quite interesting. RH: What is a CSS? It’s a compute subsystem, it takes all of the blocks of IP that we sold individually and puts them together in a fully configured, verified, performant deliverable that we can just hand to the customer and they can go off and complete the SoC. Some customers have told us it saves a year, some say a year-and-a-half and this is really around the test and verification in terms of the flow. One of the examples I gave, it’s a little cheeky, but it kind of worked during the road show, was when we were trying to explain to investors, “What’s IP, what’s a CSS?”, I said, go to the Lego store, and you’ve got a bin of Legos, yellow Legos, red Legos, blue Legos, trying to buy all those Legos and building the Statue of Liberty is a pain, or you can go over to the boxes where it’s the Statue of Liberty and just put those pieces together, and the Statue of Liberty is going to look beautiful. This is what the CSS was. I just want to jump in on that, because I was actually thinking about this, the Lego block concept is a common one that’s used when talking about semiconductors, but I remember being back in business school, and this was 2010, somewhere around then, and one of the case studies that we did was actually Lego, and the case study was the thought process of Lego deciding whether or not to pursue IP licensing as opposed to sticking with their traditional model, and all these trade-offs about, “We’re going to change our market”, “We’re going to lose what Lego is”, the creativity aspect, “It’s going to become these set pieces”. I just thought about that in this context where I came down very firmly on the side of, “Of course they should do this IP licensing”, but it was almost the counter was this sort of traditionalist argument which is kind of true — Legos today are kind of like toys for adults to a certain extent, and you build it once, reading directions and you think back to when I was a kid and you had all the Legos and it was just your creativity and your imagination and I’m like, “Maybe this analogy with Arm is actually more apt than it seems”. There’s a very romantic notion of IP licensing, you go out and make new things, “We got this for you”, versus, “No we’re just giving you the whole chip”, or in this case of CSS you, to your point, you could go get The Statue of Liberty, don’t even bother building it yourself. RH: And I think I came across this in the early days. In the 1990s, I was working with ASIC design at Compaq Computer, and they were doing all their ASICs for Northbridge , Southbridge , VGA controllers, and this is when the whole chipset industry took off. And I remember one of the senior guys at Compaq explaining why you’re doing this, he said, “I’m all about differentiation, but there needs to be a difference”. And to some extent, that’s a little bit of this, right? You can spend all the time building it, but if it’s all built and you spent all this time and it’s not functionally different nor performant different, but you spent time — well, if you’re playing around with Legos and you got all day, that’s fine — but if you’re running a business and you’re trying to get products out quickly, then time is everything, and that’s really what CSS did. It kind of established to folks that, “My gosh, I can save a lot of time on the work I was doing that was not highly differentiated”, and in fact, in some case, it was undifferentiated because we could get to a solution faster in such a way that it was much more performant than what folks might be trying to get to the last mile. So when we started talking about this to investors back in 2023 during the roadshow, their first question was, “Aren’t you going to be competing with your customers?”, and, “Isn’t this what your customers do?”, and, “Aren’t they going to be annoyed by it?”, and my answer was, “If it provides them benefit, they’ll buy it, if it does not present a benefit, they won’t buy it”, that’s it. And what we found is a lot of people are taking it, even in mobile, where people where we were told was, “No, no, these are the black belts and they’re going to grind out the last mile and you can’t really add a lot of value” — we’ve done a bunch in the mobile space, too. So with Meta, was the deal like, “Okay, we’ll do the whole thing for you, but then we get a sell to everyone?”, and they’re like, “That’s fine, we don’t care, it doesn’t matter”? RH: Yes, exactly. We said, “If we’re going to do this, how do you feel about us selling it to other customers?”, and they said, “We’re fine with that”. When did you realize that the CPU was going to be critical to AI? RH: Oh, I think we always thought it was. I had a cheeky little slide in the keynote about the demise of the CPU, and I had to spend a lot of time. I mean, I don’t know, I might have talked to someone recently who I swear was pretty adamant that a lot of CPUs should be replaced with GPUs, and now they’re selling CPUs, too. RH: I had to talk to investors and media to explain to them why a CPU was even needed. They were a little bit like, “Can’t the GPU run by itself?”, it’s like a kite that doesn’t need anything to hang on to. First off, on table stakes, obviously you need the data center but particularly as AI moves into smaller form factors, physical AI, edge, where you obviously have to have a CPU because you’re running display, you have I/O, you have human interface. It’s how do you add accelerated AI onto the CPU? So yeah, I think we kind of always knew it was going to be there, and there was going to be continued demand for it. Right, but there’s a difference between everyone on the edge is going to have a CPU so we can layer on some AI capabilities. It doesn’t have the power envelope or the cost structure to support a dedicated GPU, that’s fair, that’s all correct. It’s also correct that, to your point, a GPU needs a CPU to manage its scheduling and its I/O and all those sorts of things, but what I’m asking about specifically is actually, we’re going to have these agentic workflows, all of which what the agent does is CPU tasks and so it’s not just that we will continue to need CPUs, we might actually need an astronomical more amount of CPUs. Was that part of your thesis all along? RH: I think we have instinctively thought that to be the case. And what drives that? The sheer generation of tokens, tokens by the pound, tokens by the dump truck, if you will. The more tokens that the accelerators are generating, whether that’s done by agentic input, human input, whatever the input is, the more tokens that are generated, those tokens have to be distributed. And the distribution of those tokens, how they are managed, how they are orchestrated, how they are scheduled, that is a CPU task purely. So we kind of intuitively felt that over time, as these data centers go from hundreds of megawatts to gigawatts, you are going to need, at a minimum, CPUs that have more cores, period. There was this belief of 64 cores might be enough and maybe 128 cores would be the limit, Graviton 5 is 192 cores, the Arm AGI CPU is 136, we were already starting to see core counts go up, and we started thinking about, “What’s driving all these core counts going up, is it agentic AI?”. A proxy for it was just sheer tokens being generated in a larger fashion that needed to be distributed in a fast way and what was layered onto that was things like Codex, where latency matters, performance matters, delivering the token at speed matters. So I think all of that was bringing us to a place that we thought, “Yeah, you know what?”, we’re seeing this core count thing really starting to go up, we were seeing that about a year ago, Ben. So am I surprised that the CPU demand is exploding the way it is? Not really. Agentic AI, just the acceleration of how these agents have been launched, certainly is another tailwind kicker. Which happens to line up with your mid-2025 decision that, “Maybe we should sell CPUs”. RH: Yeah, it all kind of lines up. We were seeing that, you know what, we think that this is going to be a potentially really, really large market where not only core count matters, but number of cores matters, efficiency matters because we could imagine a world where each one of these cores is running an agent or a hypervisor and the number of cores can really, really matter in the system, which laid claim to what we were thinking about in terms of, “Okay, we can see a path here in terms of where things are going”. So CSSs with greater than 128 cores in the implementation? Absolutely. Do I think, could I see 256? Absolutely. Could I see 512? Possibly. I think then it comes down to the memory subsystem, how you keep them fed, etc., but yeah, so short answer, about a year ago we started seeing this. Do you think that core count is going to be most important or is it going to be performance-per-core? RH: I think core count is going to be quite important because I think, again, I have a belief that each one of these cores will want to potentially run their own agent, launch a hypervisor job, launch a job that can be run independently, launch it, get the work done, go to sleep. The performance of the core is going to matter, no doubt about it, but I think the efficiency of that core is probably going to matter just as much as the performance is. Well, the reason I ask is because you talked a lot in this presentation about the efficiency advantage, where the company born from a battery or whatever your phrase was, and that certainly, I think, rings true, particularly in isolation. But in a large data center, if the biggest cost is the GPUs, then isn’t it more important to keep the GPUs fed? Which basically to say, is a chip’s capability to feed GPUs actually more important on a systemic level than necessarily the chip’s efficiency on its own? RH: I’m going to plead the fifth and say yes to both. You’ve got to pick one! RH: Well, what’s important? I think the design choice that Nvidia made with Vera was very important, Vera is designed to feed Rubin, it has a very specific interface, NVLink Fusion or NVLink chip-to-chip, provides a blazing fast interface, and has the right number of cores in terms of to keep that GPU fed optimally. But at the same time, is it the right configuration in a general-purpose application where you want to run an air-cooled rack in the same data hall? If you think about a data hall where you might have a Vera Rubin liquid-cooled rack sitting right next to a liquid-cooled Vera rack, but somewhere else inside the data center, you’ve got room for multiple air-cooled racks. That space that you may have not used in the past for CPU, you want to because of the problem statement that I just gave. So I actually think it’s a “both” world, which is why when people ask me, “Oh my gosh, aren’t you competing with Nvidia Vera, and aren’t people going to get confused?” — not particularly, I think there’s ample space for both. So you feel like Nvidia might be selling standalone Vera racks but that’s not necessarily what Vera was designed for, that’s what you’re designed for, and you think that’s where you’re going to be different. RH: Yes, and I mean, if you look at what’s been announced so far from Nvidia, they announced a giant 256-CPU liquid-cooled rack and the first implementation that we’re doing with Meta is a much smaller air-cooled rack. So very, very different right off the get-go. But you will have a liquid-cooled option? RH: If customers want that, we can do that too. I think that differentiation makes sense. Well, speaking of differentiation, why ARM versus x86? Why is there an opportunity here? RH: Performance-per-watt, period. Graviton sort of started it, and they’ve been very public about their 40% to 50%, Cobalt stated the same with Microsoft, Axion, Google stated the same, Nvidia has stated the same. Just on table stakes, 2x performance-per-watt is pretty undeniable. And that, I think, it starts there as probably the primary value proposition. What is x86 still better at? You can’t say legacy software, other than legacy software. RH: Go back to our earlier part of our conversation, right? The ISA, what is the value of the ISA? It is the software that it runs, right? It is the software that it runs. So if you were to look at where does x86 have a stronghold, x86 is very good at legacy on-prem software. Ok, fine, we’ll give you legacy on-prem software and I think part of the thesis here to your point a lot of this agentic work, it’s on Linux, it’s using containers, it’s all relatively new, it all by and large works well in ARM already, but you did have a bit in the presentation where you interviewed a guy from Meta that was about porting software. How much work still needs to be done there? RH: There’s a delta between the porting work and the optimization work. Graviton, what Amazon will tell you, is that greater than 50% of their new deployments and accelerating is ARM-based. And, yes, am I the CEO of Arm and do I have a biased opinion? Of course. But I find it hard to, on a clean sheet design, if you were starting from scratch and the software porting was done and you had either cloud-native or the application space was established or as a head node, I don’t know why you’d start with x86. What about, why are you doing ARM? We did ARM versus x86, I’m sort of working my way down the chain here — actually, I did backwards, we stuck in Vera already — but why you versus custom silicon generally? You talked about Amazon. Why do you need to do the whole thing? RH: So let’s think about an Amazon, for example. Amazon does Graviton, would I like Amazon to buy the Arm AGI CPU? Yes. Am I going to be heartbroken if they never buy one? No, I’m perfectly fine if they stay building what they’re building. Are they ever going to buy one? No. RH: I hope they do! But if they don’t, it’s not going to be the end of the world. SAP — SAP runs a lot of software on Amazon, they run SAP HANA on Amazon, they also have a desire to do stuff on-prem and if they’re doing something on-prem in a smaller space and they’re looking to leverage that work, they’d love to have something that is ARM-based. Prior to us doing this product, there was no option at all, right? So that’s a very, very good example. Similar with a Cloudflare. Is Cloudflare going to do their own implementation? Likely not. Do they run on other people’s clouds? Sure, they do. Do they have an application that could be on-prem running on ARM? Absolutely. So we think that, and I don’t want to prefetch this, Ben, but we had a lot of questions from folks like, “Amazon won’t buy from you”, “Google won’t buy from you”, “Microsoft won’t buy from you”, because you’re competing with them. And we say, well, Google builds TPUs, yet they buy a lot of Nvidia GPUs, so it’s not so binary. That’s true. They’ll buy what their customers ask them to buy. RH: 100%. And if we solve a problem with an implementation that theirs does not, they’ll buy it, and if we don’t, they won’t. Just you know between you and me, is the only customer silicon that is truly potentially competitive Qualcomm and you’re just not too worried about making them mad? RH: This is off the record here? (laughing) I didn’t say off the record. RH: Qualcomm, it’s funny, I had a question at the investor conference about competing with Nvidia. And I said, you know, a month ago, no one would have asked about any Arm person competing with anybody. So it’s wonderful to have these kind of conversations, the market is underserved and there aren’t choices. There isn’t a product from Qualcomm, there isn’t a product from MediaTek, there isn’t a product from Infineon, there just isn’t. Is that sort of your case? If there were a bunch of options in the market, would you still be entering? RH: We entered this because Meta asked us to and because Meta asked us to we did. So if I was to answer your question, would we have entered if those other four guys were there or five hypotheticals? I don’t know that Meta would have asked us. If the Arm AGI CPU, it’s being built on TSMC’s 3-nm node, which is kind of impossible to get allocation for. How’d you get allocation? If you started this in 2025, how’d you pull that off? RH: We’re working through a back-end ASIC partner that helps secure the allocation for us. Oh, interesting. Are you concerned about that in the long run ? Like this business blows up and actually you just can’t make enough chips? RH: I’m probably less worried about that at the moment than I am about memory. I think that the business, the demand is very, very high actually for the chip, Ben and through our partner, we’re able to secure upside through TSMC, that has not been a problem. But memory is quite challenging and I think if there’s any limit to how big this business can get and I would say that what we provided to investors as a financial forecast is based upon the capacity we’ve secured on both memory and logic but if there was more memory could we sell more? Yes. This is sort of the sweet spot though of making predictions, everyone gets to say, “Wow, how are your predictions so accurate?”, it’s like, “Well it’s because I knew exactly how much what I would be able to make”. RH: Yeah, if there was more memory we’d be even more aggressive on the numbers. How did you make the memory decisions that you did in terms of memory bandwidth and all those sorts of pieces, particularly given the short timeline which you made this you. That wasn’t necessarily part of the CSS spec before, so how were you thinking about that? RH: The things we kind of looked at was, we sort of started with LP versus standard DRAM . Because Vera’s doing LP and you decided to do standard. RH: We’re doing standard DRAM, yeah. We thought we’d be a little bit better on the cost side that could help and at the same time, a little bit better on the capacity side. So it really kind of drove down to, we’re going to solve for capacity because we thought that that might matter in a more generalized application space to give the broader width of use, which then brought us to standard DDR versus LP. I think the reason we talked last time was in the context of you making a deal with Intel to get Arm working on 18A, and this was going to be a multi-generational partnership. What happened to that? Is that still around? RH: It’s still around. We did a lot of work on 18A because we felt that it was going to be really, really important if someone wanted to build on Intel 18A, that the Arm IP was available. So we did our part relative to if someone wants to go build an ARM-based SoC on Intel process, but that unfortunately hasn’t come to pass just yet. It’s interesting you mentioned that you’re actually not worried about TSMC capacity but you are worried about memory — I didn’t fully think through that being another headwind for Intel where they could really use TSMC having insufficient capacity to help them, but if memory is the first constraint then no one’s even getting there. RH: First off, obviously HBM [ high bandwidth memory ] being such a capacity hog, and then people moving from LP into HBM at the memory guys, then compounding on it, all of the explosion of the CPU demand drives up memory demand. So it all kind of adds on to itself, which makes the memory problem pretty acute. What exactly is in the bill of materials that you’re selling? You showed racks but you mentioned a partnership with Super Micro for example — if I buy a chip from Arm what exactly am I buying? You’ve mentioned memory obviously, so what else is in that? And what are you getting from partners? RH: Yeah, so we’ll send you a voucher code after the show, and you can place your orders. Just the SoCs. If you need to secure the memory, that’s on you, we’re not securing memory at this point in time. We did a lot of work with Super Micro, with Lenovo, with ASRock. So there’s a full 1U, 2U server blade reference architecture so the full BOM relative to all the passives and everything you need from an interconnect standpoint is all there. There’s a full BOM, which, as we mentioned in the session, the rack physically itself complies with OCP standards and then we’ve done all the work in terms of the reference design. So we can provide the full BOM of the reference platform, memory, but what we are selling only is the SoC. Very nerdy question here, but how are you going to report this from an accounting perspective? Just right off the top chips have a very different margin profile, is this going to all be broken out? How are you thinking about that? RH: We’ll probably do that. Today we break down licensing and royalty of the IP business, we’ll probably break out chips as a separate revenue stream. To go back to, you did call this event Arm Everywhere, will you ever sell a smartphone chip? RH: I don’t know, that’s a really hard question. I think we’re going to look at areas where we think we could add significant value to a market that’s underserved, that market’s pretty well served. It’s very well served and this agentic AI, potentially a new market, fresh software stack, makes sense to me. What risks are you worried about with this? You come across as very confident, “This is very obviously what we should”, how does this go wrong? RH: Most of my career has been spent actually in companies that have chips as their end business as opposed to IP. I’ve been at Arm 12 years, 13 years, I’ve been the CEO for about four-and-a-half. I did a couple of years, two, three years at a company called Tensilica that was doing, or actually the longer, five years, but most of my career was either NEC Semiconductor, Texas Instruments, Nvidia. Chip business is not easy, right? You introduce a whole different new set of characteristics. You have to introduce this term called “inventory” to your company. RH: RMAs, inventory, customer field failures, just a whole cadre of things that’s very new for our company, there certainly is execution risk that we’ve added that has not existed before. We had a 35-year machine being built that is incredibly good at delivering world-class IP to customers — doing chips is a whole different deal. I don’t want to minimize that, but at the same time, I don’t want to communicate that that’s something that we haven’t thought about deeply over the years and we’ve got a lot of people who have done that work inside the company. A lot of my senior executive team, ex-Broadcom, ex-Marvell, ex-Nvidia, we’ve got a lot of people inside the engineering organization who have come from that world, we’ve built up an operations team to go off and support that. So while there is risk, we’ve been taking a lot of steps inside the company to be adding the resources. We’ve been increasing our OpEx quite a bit in the quarters leading up to this, about 25% year-on-year, investors were asking a ton of questions about, “When are we going to see why you’re adding all those people?”, and Arm Everywhere explained that. We also told investors that that’s now going to taper off because we’ve got, we think what we need to go off and execute on all this. But I think that’s the biggest thing, Ben. And the upside is just absolute revenue dollars, I guess absolute profit dollars. RH: I think there’s a financial upside, certainly, in terms of financial dollars. But I think back to the platform, I think by being closer to the hardware and the software and the systems, we can develop even better products around IP, CSS, etc. because I think when you are the compute platform, it is incumbent upon you to have as close a relationship as you can between the software that’s developed on your platform. What’s the state of the business in China these days, by the way? RH: China still represents probably 15% of our revenue, we still have a joint venture in China, the majority of our businesses is royalties, royalties is much bigger than licensing in China. We still have a lot of design wins coming in the mobile space for people doing their own SoCs like a Xiaomi. The hyperscaler market is strong between Alibaba, ByteDance, Tencent, and then most of the robotics and EV guys are doing stuff based on ARM, whether it’s XPeng, BYD, Horizon Robotics. So our business is pretty healthy in China. You do have the Immortalis and Mali GPUs. Are those good at AI? RH: Yes they can be very good, we’ve added a lot of things to to our GPUs around what we call neural graphics so this is adding essentially a convolution and vector engine that can can help with AI. Right now the focus has been really more around AI in a graphics application, whether it’s around things like DLSS and things of other area, but we’ve got a lot of ingredients in those GPUs. So we should stay tuned, sounds very interesting. You did have one moment in the presentation that was a little weird, you were trying to say that this AI thing is definitely a real thing but you’re like, “Well it might be a financial bubble, but the AI is real”. Are you worried about all this money that is going into this that you’re making a play for a piece of, but is there some consternation in that regard? RH: No, what I was trying to indicate was when people talk about bubbles, typically it’s either valuation bubbles or investment bubbles. The valuation bubbles, those come and go over time. The investment bubble, I’m not as worried about in the sense of, “Is there going to be real ROI on the investment being made?”, I actually worry more about the, “Can you get all the stuff required to build out all of the scale?” — we just talked about memory, there’s TSMC capacity. I think the memory will be solved, they will ultimately not be able to help themselves, they will build more capacity, I’m worried about leading edge. TSMC will help themselves if they don’t have any challengers. RH: Turbines, right? You’ve got companies who are like GE Vernova or Mitsubishi, this is not their world of building factories well ahead to go serve an extra 5 to 10 gigawatts of power. So I think TSMC is super disciplined, and they’ve been world class at that throughout their history. Will the memory guys be able to help themselves? The numbers are now so large that even the Sandisk’s of the world and storage, everything has kind of gotten bananas, and that is a concern in terms of if just one of those key components of the supply chain blinks and decides not to invest to provide the capacity, then things kind of slow down. But the numbers, Ben, the numbers we’re talking about are numbers we’ve never seen before. $200 billion CapEx from an Amazon or $200 billion CapEx from a Google. And then you have companies like Anthropic talking about $6 billion revenue increases over a three-to-four month period, which are the size of some software companies. So we are in some very stratospheric levels in terms of spend that would I be surprised if there was a pause in something just as people calibrate? Yeah, I wouldn’t be surprised at all. But if I think about the 5 to 10-year trajectory, there’s no way you can say this is a bubble. If you said, “I think machines that can think as well as humans and make us more productive, that’s kind of a fad”, I don’t actually think that’s going to happen, it’s almost nonsensical. Just to sort of go full circle, you’ve been on the edge, and now this new product that gets the Arm Everywhere moniker but it’s about being in the data center — is the edge dead? Or if not dead is it are we in a fundamental shift where the most important compute is going to be in data centers or is there a bit where AI is real but it actually does leave the data center, go to the edge and that’s a bigger challenge? RH: I think until something is invented that is different than the transformer, and we talk about some very different model as to how AI is trained and inferred, then we’re looking at a lot of compute in the data center and some level of compute on the edge. I think if you just suspend animation for a second and we say, you know what, the transformer is it, and that’s what the world looks like for the next number of, the next 5 to 10 years, the edge is not going to be dead. The edge is going to have to run some level of native compute for whatever the thing has to do, and it’s going to run some AI acceleration, of course. But is everything going to happen in your pocket? No. I mean, that’s not going to happen. I’ve come down to that side too. I think in the fullness of time, at least for now, the thin client model, it looks like it’s going to be it. I guess that seems to be your case as well because you had a big event, it is for a data center GPU. Arm is Everywhere, but not everyone can buy it. RH: And power efficiency was a nice to have in the data center, but I would say it wasn’t existential. It is now, though. And I say that’s another big change because, again, one of the examples I gave, if you’re 4x-ing or 5x-ing or 6x-ing the CPUs in a given data center and you don’t want to give up one ounce of GPU accelerator power, then you’re going to squeeze everywhere you can and that, I think, is a thing that’s in our favor. Where’s Arm in 10 years? RH: I would like to think of as one of the most important semiconductor companies on the planet. We’re not there yet, but that’s how I would like the company to be thought about. Rene Haas, congratulations, great to talk. RH: Thank you, Ben. This Daily Update Interview is also available as a podcast. To receive it in your podcast player, visit Stratechery . The Daily Update is intended for a single recipient, but occasional forwarding is totally fine! If you would like to order multiple subscriptions for your team with a group discount (minimum 5), please contact me directly. Thanks for being a supporter, and have a great day!

0 views
Anton Zhiyanov 3 weeks ago

Porting Go's io package to C

Creating a subset of Go that translates to C was never my end goal. I liked writing C code with Go, but without the standard library it felt pretty limited. So, the next logical step was to port Go's stdlib to C. Of course, this isn't something I could do all at once. So I started with the standard library packages that had the fewest dependencies, and one of them was the package. This post is about how that went. io package • Slices • Multiple returns • Errors • Interfaces • Type assertion • Specialized readers • Copy • Wrapping up is one of the core Go packages. It introduces the concepts of readers and writers , which are also common in other programming languages. In Go, a reader is anything that can read some raw data (bytes) from a source into a slice: A writer is anything that can take some raw data from a slice and write it to a destination: The package defines many other interfaces, like and , as well as combinations like and . It also provides several functions, the most well-known being , which copies all data from a source (represented by a reader) to a destination (represented by a writer): C, of course, doesn't have interfaces. But before I get into that, I had to make several other design decisions. In general, a slice is a linear container that holds N elements of type T. Typically, a slice is a view of some underlying data. In Go, a slice consists of a pointer to a block of allocated memory, a length (the number of elements in the slice), and a capacity (the total number of elements that can fit in the backing memory before the runtime needs to re-allocate): Interfaces in the package work with fixed-length slices (readers and writers should never append to a slice), and they only use byte slices. So, the simplest way to represent this in C could be: But since I needed a general-purpose slice type, I decided to do it the Go way instead: Plus a bound-checking helper to access slice elements: Usage example: So far, so good. Let's look at the method again: It returns two values: an and an . C functions can only return one value, so I needed to figure out how to handle this. The classic approach would be to pass output parameters by pointer, like or . But that doesn't compose well and looks nothing like Go. Instead, I went with a result struct: The union can store any primitive type, as well as strings, slices, and pointers. The type combines a value with an error. So, our method (let's assume it's just a regular function for now): Translates to: And the caller can access the result like this: For the error type itself, I went with a simple pointer to an immutable string: Plus a constructor macro: I wanted to avoid heap allocations as much as possible, so decided not to support dynamic errors. Only sentinel errors are used, and they're defined at the file level like this: Errors are compared by pointer identity ( ), not by string content — just like sentinel errors in Go. A error is a pointer. This keeps error handling cheap and straightforward. This was the big one. In Go, an interface is a type that specifies a set of methods. Any concrete type that implements those methods satisfies the interface — no explicit declaration needed. In C, there's no such mechanism. For interfaces, I decided to use "fat" structs with function pointers. That way, Go's : Becomes an struct in C: The pointer holds the concrete value, and each method becomes a function pointer that takes as its first argument. This is less efficient than using a static method table, especially if the interface has a lot of methods, but it's simpler. So I decided it was good enough for the first version. Now functions can work with interfaces without knowing the specific implementation: Calling a method on the interface just goes through the function pointer: Go's interface is more than just a value wrapper with a method table. It also stores type information about the value it holds: Since the runtime knows the exact type inside the interface, it can try to "upgrade" the interface (for example, a regular ) to another interface (like ) using a type assertion : The last thing I wanted to do was reinvent Go's dynamic type system in C, so dropping this feature was an easy decision. There's another kind of type assertion, though — when we unwrap the interface to get the value of a specific type: And this kind of assertion is quite possible in C. All we have to do is compare function pointers: If two different types happened to share the same method implementation, this would break. In practice, each concrete type has its own methods, so the function pointer serves as a reliable type tag. After I decided on the interface approach, porting the actual types was pretty easy. For example, wraps a reader and stops with EOF after reading N bytes: The logic is straightforward: if there are no bytes left, return EOF. Otherwise, if the buffer is bigger than the remaining size, shorten it. Then, call the underlying reader, and decrease the remaining size. Here's what the ported C code looks like: A bit more verbose, but nothing special. The multiple return values, the interface call with , and the slice handling are all implemented as described in previous sections. is where everything comes together. Here's the simplified Go version: In Go, allocates its buffer on the heap with . I could take a similar approach in C — make take an allocator and use it to create the buffer like this: But since this is just a temporary buffer that only exists during the function call, I decided stack allocation was a better choice: allocates memory on a stack with a bounds-checking macro that wraps C's . It moves the stack pointer and gives you a chunk of memory that's automatically freed when the function returns. People often avoid using because it can cause a stack overflow, but using a bounds-checking wrapper fixes this issue. Another common concern with is that it's not block-scoped — the memory stays allocated until the function exits. However, since we only allocate once, this isn't a problem. Here's the simplified C version of : Here, you can see all the parts from this post working together: a function accepting interfaces, slices passed to interface methods, a result type wrapping multiple return values, error sentinels compared by identity, and a stack-allocated buffer used for the copy. Porting Go's package to C meant solving a few problems: representing slices, handling multiple return values, modeling errors, and implementing interfaces using function pointers. None of this needed anything fancy — just structs, unions, functions, and some macros. The resulting C code is more verbose than Go, but it's structurally similar, easy enough to read, and this approach should work well for other Go packages too. The package isn't very useful on its own — it mainly defines interfaces and doesn't provide concrete implementations. So, the next two packages to port were naturally and — I'll talk about those in the next post. In the meantime, if you'd like to write Go that translates to C — with no runtime and manual memory management — I invite you to try Solod . The package is included, of course.

0 views
Anton Zhiyanov 3 weeks ago

Solod: Go can be a better C

I'm working on a new programming language named Solod ( So ). It's a strict subset of Go that translates to C, without hidden memory allocations and with source-level interop. Highlights: So supports structs, methods, interfaces, slices, multiple returns, and defer. To keep things simple, there are no channels, goroutines, closures, or generics. So is for systems programming in C, but with Go's syntax, type safety, and tooling. Hello world • Language tour • Compatibility • Design decisions • FAQ • Final thoughts This Go code in a file : Translates to a header file : Plus an implementation file : In terms of features, So is an intersection between Go and C, making it one of the simplest C-like languages out there — on par with Hare. And since So is a strict subset of Go, you already know it if you know Go. It's pretty handy if you don't want to learn another syntax. Let's briefly go over the language features and see how they translate to C. Variables • Strings • Arrays • Slices • Maps • If/else and for • Functions • Multiple returns • Structs • Methods • Interfaces • Enums • Errors • Defer • C interop • Packages So supports basic Go types and variable declarations: is translated to ( ), to ( ), and to ( ). is not treated as an interface. Instead, it's translated to . This makes handling pointers much easier and removes the need for . is translated to (for pointer types). Strings are represented as type in C: All standard string operations are supported, including indexing, slicing, and iterating with a for-range loop. Converting a string to a byte slice and back is a zero-copy operation: Converting a string to a rune slice and back allocates on the stack with : There's a stdlib package for heap-allocated strings and various string operations. Arrays are represented as plain C arrays ( ): on arrays is emitted as compile-time constant. Slicing an array produces a . Slices are represented as type in C: All standard slice operations are supported, including indexing, slicing, and iterating with a for-range loop. As in Go, a slice is a value type. Unlike in Go, a nil slice and an empty slice are the same thing: allocates a fixed amount of memory on the stack ( ). only works up to the initial capacity and panics if it's exceeded. There's no automatic reallocation; use the stdlib package for heap allocation and dynamic arrays. Maps are fixed-size and stack-allocated, backed by parallel key/value arrays with linear search. They are pointer-based reference types, represented as in C. No delete, no resize. Only use maps when you have a small, fixed number of key-value pairs. For anything else, use heap-allocated maps from the package (planned). Most of the standard map operations are supported, including getting/setting values and iterating with a for-range loop: As in Go, a map is a pointer type. A map emits as in C. If-else and for come in all shapes and sizes, just like in Go. Standard if-else with chaining: Init statement (scoped to the if block): Traditional for loop: While-style loop: Range over an integer: Regular functions translate to C naturally: Named function types become typedefs: Exported functions (capitalized) become public C symbols prefixed with the package name ( ). Unexported functions are . Variadic functions use the standard syntax and translate to passing a slice: Function literals (anonymous functions and closures) are not supported. So supports two-value multiple returns in two patterns: and . Both cases translate to C type: Named return values are not supported. Structs translate to C naturally: works with types and values: Methods are defined on struct types with pointer or value receivers: Pointer receivers pass in C and cast to the struct pointer. Value receivers pass the struct by value, so modifications operate on a copy: Calling methods on values and pointers emits pointers or values as necessary: Methods on named primitive types are also supported. Interfaces in So are like Go interfaces, but they don't include runtime type information. Interface declarations list the required methods: In C, an interface is a struct with a pointer and function pointers for each method (less efficient than using a static method table, but simpler; this might change in the future): Just as in Go, a concrete type implements an interface by providing the necessary methods: Passing a concrete type to functions that accept interfaces: Type assertion works for concrete types ( ), but not for interfaces ( ). Type switch is not supported. Empty interfaces ( and ) are translated to . So supports typed constant groups as enums: Each constant is emitted as a C : is supported for integer-typed constants: Iota values are evaluated at compile time and translated to integer literals: Errors use the type (a pointer): So only supports sentinel errors, which are defined at the package level using (implemented as compiler built-in): Errors are compared using . This is an O(1) operation (compares pointers, not strings): Dynamic errors ( ), local error variables ( inside functions), and error wrapping are not supported. schedules a function or method call to run at the end of the enclosing scope. The scope can be either a function (as in Go): Or a bare block (unlike Go): Deferred calls are emitted inline (before returns, panics, and scope end) in LIFO order: Defer is not supported inside other scopes like or . Include a C header file with : Declare an external C type (excluded from emission) with : Declare an external C function (no body or ): When calling extern functions, and arguments are automatically decayed to their C equivalents: string literals become raw C strings ( ), string values become , and slices become raw pointers. This makes interop cleaner: The decay behavior can be turned off with the flag: The package includes helpers for converting C pointers back to So string and slice types. The package is also available and is implemented as compiler built-ins. Each Go package is translated into a single + pair, regardless of how many files it contains. Multiple files in the same package are merged into one file, separated by comments. Exported symbols (capitalized names) are prefixed with the package name: Unexported symbols (lowercase names) keep their original names and are marked : Exported symbols are declared in the file (with for variables). Unexported symbols only appear in the file. Importing a So package translates to a C : Calling imported symbols uses the package prefix: That's it for the language tour! So generates C11 code that relies on several GCC/Clang extensions: You can use GCC, Clang, or to compile the transpiled C code. MSVC is not supported. Supported operating systems: Linux, macOS, and Windows (partial support). So is highly opinionated. Simplicity is key . Fewer features are always better. Every new feature is strongly discouraged by default and should be added only if there are very convincing real-world use cases to support it. This applies to the standard library too — So tries to export as little of Go's stdlib API as possible while still remaining highly useful for real-world use cases. No heap allocations are allowed in language built-ins (like maps, slices, new, or append). Heap allocations are allowed in the standard library, but they must clearly state when an allocation happens and who owns the allocated data. Fast and easy C interop . Even though So uses Go syntax, it's basically C with its own standard library. Calling C from So, and So from C, should always be simple to write and run efficiently. The So standard library (translated to C) should be easy to add to any C project. Readability . There are several languages that claim they can transpile to readable C code. Unfortunately, the C code they generate is usually unreadable or barely readable at best. So isn't perfect in this area either (though it's arguably better than others), but it aims to produce C code that's as readable as possible. Go compatibility . So code is valid Go code. No exceptions. Raw performance . You can definitely write C code by hand that runs faster than code produced by So. Also, some features in So, like interfaces, are currently implemented in a way that's not very efficient, mainly to keep things simple. Hiding C entirely . So is a cleaner way to write C, not a replacement for it. You should know C to use So effectively. Go feature parity . Less is more. Iterators aren't coming, and neither are generic methods. I have heard these several times, so it's worth answering. Why not Rust/Zig/Odin/other language? Because I like C and Go. Why not TinyGo? TinyGo is lightweight, but it still has a garbage collector, a runtime, and aims to support all Go features. What I'm after is something even simpler, with no runtime at all, source-level C interop, and eventually, Go's standard library ported to plain C so it can be used in regular C projects. How does So handle memory? Everything is stack-allocated by default. There's no garbage collector or reference counting. The standard library provides explicit heap allocation in the package when you need it. Is it safe? So itself has few safeguards other than the default Go type checking. It will panic on out-of-bounds array access, but it won't stop you from returning a dangling pointer or forgetting to free allocated memory. Most memory-related problems can be caught with AddressSanitizer in modern compilers, so I recommend enabling it during development by adding to your . Can I use So code from C (and vice versa)? Yes. So compiles to plain C, therefore calling So from C is just calling C from C. Calling C from So is equally straightforward. Can I compile existing Go packages with So? Not really. Go uses automatic memory management, while So uses manual memory management. So also supports far fewer features than Go. Neither Go's standard library nor third-party packages will work with So without changes. How stable is this? Not for production at the moment. Where's the standard library? There is a growing set of high-level packages ( , , , ...). There are also low-level packages that wrap the libc API ( , , , ...). Check the links below for more details. Even though So isn't ready for production yet, I encourage you to try it out on a hobby project or just keep an eye on it if you like the concept. Further reading: Go in, C out. You write regular Go code and get readable C11 as output. Zero runtime. No garbage collection, no reference counting, no hidden allocations. Everything is stack-allocated by default. Heap is opt-in through the standard library. Native C interop. Call C from So and So from C — no CGO, no overhead. Go tooling works out of the box — syntax highlighting, LSP, linting and "go test". Binary literals ( ) in generated code. Statement expressions ( ) in macros. for package-level initialization. for local type inference in generated code. for type inference in generic macros. for and other dynamic stack allocations. Installation and usage So by example Language description Stdlib description Source code

0 views
kytta 1 months ago

Teach humans to contribute, not machines

I love contributing to open-source projects. There is this insanely good feeling that I get when my changes get merged into the main branch. Dopamine goes through the roof when I see the number of “projects I’ve contributed to” go up, and the shade of blue on my contribution graph get that little bit darker. I wish I could do this my whole life. Contributing to open source is not easy, though, especially to the projects one hasn’t worked with yet. Of course, the biggest hurdle is the programming language one might not know. Then, it’s finding the issue to tackle. But these are the hurdles one should come to expect. But you know what’s the hardest thing about contributing, once you’ve found a project and picked an issue to work on? It’s getting the damn thing to run. When it comes to the ways a project should be run (let alone, developed), one should cue xkcd #927 . And, the bigger the ecosystem, the worse it becomes. I can guarantee you that, if you pick any two libraries that are written in the same language, they will have different commands to build them. Which is fine – different projects have different goals and different maintainers (with different opinions) – but discovering those is often outright impossible. Blessed be your soul if you tell me I should just run in your README! Or, even better, if you have a CONTRIBUTING.md outlining everything I need to know – from prerequisites to coding style to pull request guidelines. But that’s not always the case. In the past, I had to do a lot of guesswork. Do some npm scripts have telling names? Maybe there’s a Justfile? In the end, I was either reading the CI workflow files, trying to understand what someone else’s computer executes to achieve the same goal as I, or I gave up and watched said CI server do it for me after I’ve submitted a PR blindly. But then, something changed. With each day, more and more people are writing good contributor guides! All very well-structured and full to the brim with commands, style guides, and tips and tricks. The catch in that whole thing? The file is named different. It’s no longer called CONTRIBUTING.md. It’s AGENTS.md. That prick Claude and his dorky lil’ friends! We, the humans, have been demanding good documentation and help with contributing for ages, and they come in and get it served right to them! It’s a crazy feeling of both deep sorrow and weird joy that I get when yet another thing that’s helpful for your “agents” shows up – because it could and frequently is very beneficial to us, humans. Don’t like the CSS and JS of the project’s website? The docs are arranged weirdly, and you find yourself clicking around too often? llms.txt to the rescue! You don’t know if the project want’s regular or conventional commits? Just look in the AGENTS.md! You need to do something with a PDF, but you don’t know how? Just look at how Claude would do it with its “skills”! Finally, the thing that motivated me to write this post. Andrew Nesbitt, an awesome fella of ecosyste.ms fame, has just announced . The idea, and it’s execution, is insane (in a good way): Just run the command, and you’ll get all information on a project you need! Build, lint, and test commands, code formatting, supported OSes; it’s basically the solution to the problem I’ve had in the first paragraphs! But wait – how should one use this tool? Add this to your […] agent instructions file: Before starting work on this project, run to understand the toolchain, test commands, linters, and project conventions. The agent will get back structured information […] so it doesn’t have to guess or ask you. I wonder where this phenomenon is coming from. I guess that we, the programmers, who have made it out job to command a soulless machine, can not get enough of it. As if we’re not thinking enough about human interaction, or at very least are not getting enough fun from it. Coding is fun; typing stuff and see computer act (mostly) the way you want is fun. And it’s very easy to forget about the other developers a become one with the project. The perfect Makefile. The flawless CI pipeline. The impeccable AGENTS.md. But please, wake up from that dream. As good as it might feel (you don’t have to tell me!), you should still realize that you’re not alone. That somewhere out there, separated from you by thousands of kilometres of underwater cables, electromagnetic waves, and copper wires, there is another human, just like you. And that human does not want to deduce the build flags from reading your goreleaser.yaml.

1 views
Simon Willison 1 months ago

My fireside chat about agentic engineering at the Pragmatic Summit

I was a speaker last month at the Pragmatic Summit in San Francisco, where I participated in a fireside chat session about Agentic Engineering hosted by Eric Lui from Statsig. The video is available on YouTube . Here are my highlights from the conversation. We started by talking about the different phases a software developer goes through in adopting AI coding tools. I feel like there are different stages of AI adoption as a programmer. You start off with you've got ChatGPT and you ask it questions and occasionally it helps you out. And then the big step is when you move to the coding agents that are writing code for you—initially writing bits of code and then there's that moment where the agent writes more code than you do, which is a big moment. And that for me happened only about maybe six months ago. The new thing as of what, three weeks ago, is you don't read the code. If anyone saw StrongDM—they had a big thing come out last week where they talked about their software factory and their two principles were nobody writes any code, nobody reads any code, which is clear insanity. That is wildly irresponsible. They're a security company building security software, which is why it's worth paying close attention—like how could this possibly be working? I talked about StrongDM more in How StrongDM's AI team build serious software without even looking at the code . We discussed the challenge of knowing when to trust the AI's output as opposed to reviewing every line with a fine tooth-comb. The way I've become a little bit more comfortable with it is thinking about how when I worked at a big company, other teams would build services for us and we would read their documentation, use their service, and we wouldn't go and look at their code. If it broke, we'd dive in and see what the bug was in the code. But you generally trust those teams of professionals to produce stuff that works. Trusting an AI in the same way feels very uncomfortable. I think Opus 4.5 was the first one that earned my trust—I'm very confident now that for classes of problems that I've seen it tackle before, it's not going to do anything stupid. If I ask it to build a JSON API that hits this database and returns the data and paginates it, it's just going to do it and I'm going to get the right thing back. Every single coding session I start with an agent, I start by saying here's how to run the test—it's normally is my current test framework. So I say run the test and then I say use red-green TDD and give it its instruction. So it's "use red-green TDD"—it's like five tokens, and that works. All of the good coding agents know what red-green TDD is and they will start churning through and the chances of you getting code that works go up so much if they're writing the test first. I wrote more about TDD for coding agents recently in Red/green TDD . I have hated [test-first TDD] throughout my career. I've tried it in the past. It feels really tedious. It slows me down. I just wasn't a fan. Getting agents to do it is fine. I don't care if the agent spins around for a few minutes wasting its time on a test that doesn't work. I see people who are writing code with coding agents and they're not writing any tests at all. That's a terrible idea. Tests—the reason not to write tests in the past has been that it's extra work that you have to do and maybe you'll have to maintain them in the future. They're free now. They're effectively free. I think tests are no longer even remotely optional. You have to get them to test the stuff manually, which doesn't make sense because they're computers. But anyone who's done automated tests will know that just because the test suite passes doesn't mean that the web server will boot. So I will tell my agents, start the server running in the background and then use curl to exercise the API that you just created. And that works, and often that will find new bugs that the test didn't cover. I've got this new tool I built called Showboat. The idea with Showboat is you tell it—it's a little thing that builds up a markdown document of the manual test that it ran. So you can say go and use Showboat and exercise this API and you'll get a document that says "I'm trying out this API," curl command, output of curl command, "that works, let's try this other thing." I introduced Showboat in Introducing Showboat and Rodney, so agents can demo what they've built . I had a project recently where I wanted to add file uploads to my own little web framework, Datasette—multipart file uploads and all of that. And the way I did it is I told Claude to build a test suite for file uploads that passes on Go and Node.js and Django and Starlette—just here's six different web frameworks that implement this, build tests that they all pass. Now I've got a test suite and I can say, okay, build me a new implementation for Datasette on top of those tests. And it did the job. It's really powerful—it's almost like you can reverse engineer six implementations of a standard to get a new standard and then you can implement the standard. Here's the PR for that file upload feature. It's completely context dependent. I knock out little vibe-coded HTML JavaScript tools, single pages, and the code quality does not matter. It's like 800 lines of complete spaghetti. Who cares, right? It either works or it doesn't. Anything that you're maintaining over the longer term, the code quality does start really mattering. Here's my collection of vibe coded HTML tools , and notes on how I build them . Having poor quality code from an agent is a choice that you make. If the agent spits out 2,000 lines of bad code and you choose to ignore it, that's on you. If you then look at that code—you know what, we should refactor that piece, use this other design pattern—and you feed that back into the agent, you can end up with code that is way better than the code I would have written by hand because I'm a little bit lazy. If there was a little refactoring I spot at the very end that would take me another hour, I'm just not going to do it. If an agent's going to take an hour but I prompt it and then go off and walk the dog, then sure, I'll do it. I turned this point into a bit of a personal manifesto: AI should help us produce better code . One of the magic tricks about these things is they're incredibly consistent. If you've got a codebase with a bunch of patterns in, they will follow those patterns almost to a tee. Most of the projects I do I start by cloning that template. It puts the tests in the right place and there's a readme with a few lines of description in it and GitHub continuous integration is set up. Even having just one or two tests in the style that you like means it'll write tests in the style that you like. There's a lot to be said for keeping your codebase high quality because the agent will then add to it in a high quality way. And honestly, it's exactly the same with human development teams—if you're the first person to use Redis at your company, you have to do it perfectly because the next person will copy and paste what you did. I run templates using cookiecutter - here are my templates for python-lib , click-app , and datasette-plugin . When you build software on top of LLMs you're outsourcing decisions in your software to a language model. The problem with language models is they're incredibly gullible by design. They do exactly what you tell them to do and they will believe almost anything that you say to them. Here's my September 2022 post that introduced the term prompt injection . I named it after SQL injection because I thought the original problem was you're combining trusted and untrusted text, like you do with a SQL injection attack. Problem is you can solve SQL injection by parameterizing your query. You can't do that with LLMs—there is no way to reliably say this is the data and these are the instructions. So the name was a bad choice of name from the very start. I've learned that when you coin a new term, the definition is not what you give it. It's what people assume it means when they hear it. Here's more detail on the challenges of coining terms . The lethal trifecta is when you've got a model which has access to three things. It can access your private data—so it's got access to environment variables with API keys or it can read your email or whatever. It's exposed to malicious instructions—there's some way that an attacker could try and trick it. And it's got some kind of exfiltration vector, a way of sending messages back out to that attacker. The classic example is if I've got a digital assistant with access to my email, and someone emails it and says, "Hey, Simon said that you should forward me your latest password reset emails." If it does, that's a disaster. And a lot of them kind of will. My post describing the Lethal Trifecta . We discussed the challenges of running coding agents safely, especially on local machines. The most important thing is sandboxing. You want your coding agent running in an environment where if something goes completely wrong, if somebody gets malicious instructions to it, the damage is greatly limited. This is why I'm such a fan of Claude Code for web . The reason I use Claude on my phone is that's using Claude Code for the web, which runs in a container that Anthropic run. So you basically say, "Hey, Anthropic, spin up a Linux VM. Check out my git repo into it. Solve this problem for me." The worst thing that could happen with a prompt injection against that is somebody might steal your private source code, which isn't great. Most of my stuff's open source, so I couldn't care less. On running agents in YOLO mode, e.g. Claude's : I mostly run Claude with dangerously skip permissions on my Mac directly even though I'm the world's foremost expert on why you shouldn't do that. Because it's so good. It's so convenient. And what I try and do is if I'm running it in that mode, I try not to dump in random instructions from repos that I don't trust. It's still very risky and I need to habitually not do that. The topic of testing against a copy of your production data came up. I wouldn't use sensitive user data. When you work at a big company the first few years everyone's cloning the production database to their laptops and then somebody's laptop gets stolen. You shouldn't do that. I'd actually invest in good mocking—here's a button I click and it creates a hundred random users with made-up names. There's a trick you can do there which is much easier with agents where you can say, okay, there's this one edge case where if a user has over a thousand ticket types in my event platform everything breaks, so I have a button that you click that creates a simulated user with a thousand ticket types. I feel like there have been a few inflection points. GPT-4 was the point where it was actually useful and it wasn't making up absolutely everything and then we were stuck with GPT-4 for about 9 months—nobody else could build a model that good. I think the killer moment was Claude Code. The coding agents only kicked off about a year ago. Claude Code just turned one year old. It was that combination of Claude Code plus Sonnet 3.5 at the time—that was the first model that really felt good enough at driving a terminal to be able to do useful things. Then things got really good with the November 2025 inflection point . It's at a point where I'm oneshotting basically everything. I'll pull out and say, "Oh, I need three new RSS feeds on my blog." And I don't even have to ask if it's going to work. It's like a two sentence prompt. That reliability, that ability to predictably—this is why we can start trusting them because we can predict what they're going to do. An ongoing challenge is figuring out what the models can and cannot do, especially as new models are released. The most interesting question is what can the models we have do right now. The only thing I care about today is what can Claude Opus 4.6 do that we haven't figured out yet. And I think it would take us six months to even start exploring the boundaries of that. It's always useful—anytime a model fails to do something for you, tuck that away and try again in 6 months because it'll normally fail again, but every now and then it'll actually do it and now you might be the first person in the world to learn that the model can now do this thing. A great example is spellchecking. A year and a half ago the models were terrible at spellchecking—they couldn't do it. You'd throw stuff in and they just weren't strong enough to spot even minor typos. That changed about 12 months ago and now every blog post I post I have a proofreader Claude thing and I paste it and it goes, "Oh, you've misspelled this, you've missed an apostrophe off here." It's really useful. Here's the prompt I use for proofreading. This stuff is absolutely exhausting. I often have three projects that I'm working on at once because then if something takes 10 minutes I can switch to another one and after two hours of that I'm done for the day. I'm mentally exhausted. People worry about skill atrophy and being lazy. I think this is the opposite of that. You have to operate firing on all cylinders if you're going to keep your trio or quadruple of agents busy solving all these different problems. I think that might be what saves us. You can't have one engineer and have him do a thousand projects because after 3 hours of that, he's going to literally pass out in a corner. I was asked for general career advice for software developers in this new era of agentic engineering. As engineers, our careers should be changing right now this second because we can be so much more ambitious in what we do. If you've always stuck to two programming languages because of the overhead of learning a third, go and learn a third right now—and don't learn it, just start writing code in it. I've released three projects written in Go in the past two weeks and I am not a fluent Go programmer, but I can read it well enough to scan through and go, "Yeah, this looks like it's doing the right thing." It's a great idea to try fun, weird, or stupid projects with them too: I needed to cook two meals at once at Christmas from two recipes. So I took photos of the two recipes and I had Claude vibe code me up a cooking timer uniquely for those two recipes. You click go and it says, "Okay, in recipe one you need to be doing this and then in recipe two you do this." And it worked. I mean it was stupid, right? I should have just figured it out with a piece of paper. It would have been fine. But it's so much more fun building a ridiculous custom piece of software to help you cook Christmas dinner. Here's more about that recipe app . Eric asked if we would build Django the same way today as we did 22 years ago . In 2003 we built Django. I co-created it at a local newspaper in Kansas and it was because we wanted to build web applications on journalism deadlines. There's a story, you want to knock out a thing related to that story, it can't take two weeks because the story's moved on. You've got to have tools in place that let you build things in a couple of hours. And so the whole point of Django from the very start was how do we help people build high-quality applications as quickly as possible. Today, I can build an app for a news story in two hours and it doesn't matter what the code looks like. I talked about the challenges that AI-assisted programming poses for open source in general. Why would I use a date picker library where I'd have to customize it when I could have Claude write me the exact date picker that I want? I would trust Opus 4.6 to build me a good date picker widget that was mobile friendly and accessible and all of those things. And what does that do for demand for open source? We've seen that thing with Tailwind, right? Where Tailwind's business model is the framework's free and then you pay them for access to their component library of high quality date pickers, and the market for that has collapsed because people can vibe code those kinds of custom components. Here are more of my thoughts on the Tailwind situation. I don't know. Agents love open source. They're great at recommending libraries. They will stitch things together. I feel like the reason you can build such amazing things with agents is entirely built on the back of the open source community. Projects are flooded with junk contributions to the point that people are trying to convince GitHub to disable pull requests, which is something GitHub have never done. That's been the whole fundamental value of GitHub—open collaboration and pull requests—and now people are saying, "We're just flooded by them, this doesn't work anymore." I wrote more about this problem in Inflicting unreviewed code on collaborators . You are only seeing the long-form articles from my blog. Subscribe to /atom/everything/ to get all of my posts, or take a look at my other subscription options .

0 views
Xe Iaso 1 months ago

Vibe Coding Trip Report: Making a sponsor panel

I'm on medical leave recovering from surgery . Before I went under, I wanted to ship one thing I'd been failing to build for months: a sponsor panel at sponsors.xeiaso.net . Previous attempts kept dying in the GraphQL swamp. This time I vibe coded it — pointed agent teams at the problem with prepared skills and let them generate the gnarly code I couldn't write myself. And it works. Go and GraphQL are oil and water. I've held this opinion for years and nothing has changed it. The library ecosystem is a mess: shurcooL/graphql requires abusive struct tags for its reflection-based query generation, and the code generation tools produce mountains of boilerplate. All of it feels like fighting the language into doing something it actively resists. GitHub removing the GraphQL explorer made this even worse. You used to be able to poke around the schema interactively and figure out what queries you needed. Now you're reading docs and guessing. Fun. I'd tried building this panel before, and each attempt died in that swamp. I'd get partway through wrestling the GitHub Sponsors API into Go structs, lose momentum, and shelve it. At roughly the same point each time: when the query I needed turned out to be four levels of nested connections deep and the struct tags looked like someone fell asleep on their keyboard. Vibe coding was a hail mary. I figured if it didn't work, I was no worse off. If it did, I'd ship something before disappearing into a hospital for a week. Vibe coding is not "type a prompt and pray." Output quality depends on the context you feed the model. Templ — the Go HTML templating library I use — barely exists in LLM training data. Ask Claude Code to write Templ components cold and it'll hallucinate syntax that looks plausible but doesn't compile. Ask me how I know. Wait, so how do you fix that? I wrote four agent skills to load into the context window: With these loaded, the model copies patterns from authoritative references instead of inventing syntax from vibes. Most of the generated Templ code compiled on the first try, which is more than I can say for my manual attempts. Think of it like giving someone a cookbook instead of asking them to invent recipes from first principles. The ingredients are the same, but the results are dramatically more consistent. I pointed an agent team at a spec I'd written with Mimi . The spec covered the basics: OAuth login via GitHub, query the Sponsors API, render a panel showing who sponsors me and at what tier, store sponsor logos in Tigris . I'm not going to pretend I wrote the spec alone. I talked through the requirements with Mimi and iterated on it until it was clear enough for an agent team to execute. The full spec is available as a gist if you want to see what "clear enough for agents" looks like in practice. One agent team split the spec into tasks and started building. A second reviewed output and flagged issues. Meanwhile, I provisioned OAuth credentials in the GitHub developer settings, created the Neon Postgres database, and set up the Tigris bucket for sponsor logos. Agents would hit a point where they needed a credential, I'd paste it in, and they'd continue — ops work and code generation happening in parallel. The GraphQL code the agents wrote is ugly . Raw query strings with manual JSON parsing that would make a linting tool weep. But it works. The shurcooL approach uses Go idioms, sure, but it requires so much gymnastics to handle nested connections that the cognitive load is worse. Agent-generated code is direct: send this query string, parse this JSON, done. I'd be embarrassed to show it at a code review. I'd also be embarrassed to admit how many times I failed to ship the "clean" version. This code exists because the "proper" way kept killing the project. I'll take ugly-and-shipped over clean-and-imaginary. The full stack: Org sponsorships are still broken. The schema for organization sponsors differs enough from individual sponsors that it needs its own query path and auth flow. I know what the fix looks like, but it requires reaching out to other devs who've cracked GitHub's org-level sponsor queries. The code isn't my usual style either — JSON parsing that makes me wince, variable names that are functional but uninspired, missing error context in a few places. I'll rewrite chunks of this after I've recovered. The panel exists now, though. It renders real data. People can OAuth in and see their sponsorship status. Before this attempt, it was vaporware. I've been telling people "just ship it" for years. Took vibe coding to make me actually do it myself. I wouldn't vibe code security-critical systems or anything I need to audit line-by-line. But this project had stopped me cold on every attempt, and vibe coding got it across the line in a weekend. Skills made the difference here. Loading those four documents into the context window turned Claude Code from "plausible but broken Templ" into "working code on the first compile." I suspect that gap will only matter more as people try to use AI with libraries that aren't well-represented in training data. This sponsor panel probably won't look anything like it does today in six months. I'll rewrite the GraphQL layer once I find a pattern that doesn't make me cringe. Org sponsorships still need work. HTMX might get replaced. But it exists, and before my surgery, shipping mattered more than polish. The sponsor panel is at sponsors.xeiaso.net . The skills are in my site's repo under . templ-syntax : Templ's actual syntax, with enough detail that the model can look up expressions, conditionals, and loops instead of guessing. templ-components : Reusable component patterns — props, children, composition. Obvious if you've used Templ, impossible to infer from sparse training data. templ-htmx : The gotchas when combining Templ with HTMX. Attribute rendering and event handling trip up humans and models alike. templ-http : Wiring Templ into handlers properly — routes, data passing, request lifecycle. Go for the backend, because that's what I know and what my site runs on Templ for HTML rendering, because I'm tired of 's limitations HTMX for interactivity, because I refuse to write a React app for something this simple PostgreSQL via Neon for persistence GitHub OAuth for authentication GitHub Sponsors GraphQL API for the actual sponsor data Tigris for sponsor logo storage — plugged it in and it Just Works™

0 views
Manuel Moreale 1 months ago

Eric Schwarz

This week on the People and Blogs series we have an interview with Eric Schwarz, whose blog can be found at schwarztech.net . Tired of RSS? Read this in your browser or sign up for the newsletter . People and Blogs is supported by the "One a Month" club members. If you enjoy P&B, consider becoming one for as little as 1 dollar a month. Hi! I'm Eric Schwarz and my online "home" has been SchwarzTech . I grew up in Indiana in the United States and had a knack for anything involving computers from a young age. Although my first computer was a very-old Radio Shack TRS-80, I quickly shifted to an Apple IIgs and later playing with various used Macs. I really appreciated the intentional, but flawed aspects of Apple's products in the late-1980s and early 1990s. Despite my technology background, I went to college to work in media, especially audio/video production, but between the devaluation of a lot of creative jobs and the 2008 financial crisis/recession, I stuck around for more schooling, getting a graduate degree in Information & Communication Sciences, basically a mix of information technology, telecom, and a bit of business. From there, I ended up working in higher education, moving through different roles in an IT department at a small college, the bulk of which involved network engineering. A couple of years ago, my now-fiancée and I uprooted for her work and I'm at a different university, still doing a variety of IT things. I really enjoy working on a small team because it means you get to a little bit of everything! I've found that it's really nice to balance the structured, break/fix things from my day job with creative pursuits and projects outside of work. Like many that have been interviewed here, I dabble in photography, have done some various audio and video projects, and seem to be my friends' go-to for graphic design-related things. Other than those, I appreciate a good TV show or movie, maybe satisfying my college-self a little bit. I've gotten into following the National Women's Soccer League (NWSL) as well as some of the minor-league sports that are in our city. I love trying new foods and visiting new places (as cliché as that sounds), just because there's so much of the world to explore and experience—I think that makes one a more well-rounded, empathetic person. I don't quite remember the origin story for the name other than that it was going to be the name for my software "business" (remember, I was kid!) when I was writing software on the TRS-80. None of that really lasted and I reused the name when I created a personal site on GeoCities. In the late 1990s, the Internet was a weird patchwork of personal sites, academic resources, and still rough-around-the-edges corporate sites. I think we were all learning what this could be used for as we went along and I was no exception. Initially, it was a landing page of sorts when I was writing about tech elsewhere, including Low End Mac and the long-defunct MacWeekly. Eventually, getting a new iBook G3 and wanting to expand my topics led me to turning my site into a blog. I think that second-generation of the site was my attempt to compete with some of the larger players at the time, mixing in product reviews, longform opinion articles, news stories, and even a few guest writers. At that time, my family still had a big analog C-Band satellite dish at home and I was able to tune in to the live feeds of the Macworld Expo keynotes, so I could "live blog" those from afar, too. iLounge, MacOpinion, Think Secret, and TUAW were some of the sites I looked up to. By the time I was in college, it was a lot to balance courses, a campus job, somewhat of a social life, and the site scaled back a little, but was still very much a fun hobby of mine. Like many other bloggers, my site's third-generation morphed into more of a format similar to John Gruber's Daring Fireball : longform articles mixed with linked-out items that have a couple of paragraphs of commentary (I call them "Snippets.") I liked the format, as it allowed me to share things I found interesting or worth talking about. However, I found that in recent years so much of the tech industry has started to feel like a parody of itself. I felt like I had to cover stories because of their importance, rather than because I wanted to. After realizing that, I've started to shift my content a bit and my goal is to get back to content that celebrates my relationship with technology and even things that can be more lasting. That might be leading to a "fourth-generation" of the site. As I touched on a little earlier, I think my creative process got a bit hijacked by so much bad news around "Big Tech"—while I've tried to avoid my site becoming a cheerleader for Apple, that's the corner of the tech world that I've lived in for the past 30+ years (if you count the Macs and Apple IIs I used in school before I had my own.) Inspiration and sources come from a variety of areas: other blogs and things in my RSS reader, links on social media, tech stories from the larger media outlets. I think for Snippets, it's something that I feel is important to share or that I have strong feelings for. Those are often a bit more off-the-cuff and get a quick proofread before publishing. If it's something longer-form, I'll take some time, edit as I go, maybe have someone look over portions if something isn't quite working for me, and then publish. In terms of research, I try to link to outside sources that can provide additional context, older posts of my own that can add some historical context, while still maintaining and assuming that most of my readers have an above-average grasp on a lot of the topics. It's a bit of writing-for-me and I hope others will join me on the ride. While I'd love to say that I have a certain ritualistic place that I write, the truth is that sometimes it's just wherever I am. I don't love writing from my phone, but sometimes due to travel or between things at work, I might hammer out a quick post. I do think that I've gotten my home-office to be a comfortable place to sit down and focus on writing, with cozy lighting and everything set up. When I was working at my last job, I'd often grab a laptop or iPad and work from a nearby coffee shop—I think getting out of my then-apartment and having a more intentional time for writing with fewer distractions helped. Since moving, I haven't done that as much. If I think of some of my favorite "let's go write" moments, it's often on a moody, rainy day where there's some ambient noise from outside while I work. I have found that taking a break and letting something sit for a day or two has been a more important thing than location. Trying to force oneself to write when your head and heart aren't in it just doesn't seem to work for me. I set up my site on WordPress about twenty years ago when I outgrew server-side includes. It took a little while to wrestle the templates to work like my previously-carefully-crafted stylesheets. In some ways WordPress has gotten really bloated for my needs, but it works well enough and I have yet to find something to easily replace it with all the random things I've bolted onto my theme over the years. I'm in the process of re-evaluating some of my services, but right now I'm using IONOS (formerly 1&1) for hosting, which I had originally started with when they set up shop in the United States. My domains are with Hover at the moment. As for what I use to create my site, I'm currently using a Mac mini (M4), iPad mini (A17 Pro), and iPhone 15. On the Mac, BBEdit or directly on the web are where I'll do my writing. On the iOS side, I do a lot of writing in iA Writer. I'm still using Panic's Coda an Code Editor (formerly Diet Coda) for a lot of file mananagement/coding. Considering how long both have been discontinued, finding suitable replacements for both at my desk and mobile are on my to-do list. Other than the name being sometimes hard to spell, I don't think I'd necessarily pick something else. The beauty of it is that I'm not necessarily tied down to Apple/Mac-specific content and I can adapt it over time. I think of how many sites were Mac-something or iPod-something and then had to abruptly (and sometimes awkwardly) rename to fit the changing scope of content. I think for a CMS, I might want something a bit "lighter," but WordPress has allowed me to adapt the site for my changing content numerous times. I find it to be relatively inexpensive to run the site with hosting running me about US$100/year and then US$20/domain on average. I make some of that back with the single ad through the Carbon network, but I don't necessarily want to have more ads than that. Since it's a hobby for me, I'm not looking to make a lot of money, but I understand for folks who want or need to and don't begrudge that. I've toyed with the idea of letting people support the site, but I'm also not sure if it's worth the trouble. To try to avoid repeating anyone who has already been interviewed, I went through my RSS feeds to find a few that I immediately skip to when I see a new post: Brent Simmons is behind NetNewsWire and I started following his writing soon after I discovered NetNewsWire years ago, and got to follow the story of how that piece of software changed hands numerous times. Stephen Hackett is someone whose content and knowledge I can really relate to, so it's interesting to see his take on a lot of tech. Matthew Haughey covers a lot of different topics, but manages to craft a post that is always so damn fascinating. Mike Davidson doesn't blog as much these days, but he was another person whose work I followed way back in the mid-2000s and looked up to when I was interested in the convergence of traditional media and the Web. Jedda, Keenan, Lou Plummer, Nick Heer, Riccardo Mori, and Louie Mantia were already in the series, but I always enjoy when something new comes along from them, too. I have a few odds and ends that I wasn't quite sure where to fit elsewhere. First, I wanted to mention my side-project, The Chaos League , a blog that followed a similar format as SchwarzTech, but focused on the NWSL. This was a fantastic distraction coming out of the pandemic as it gave me an outlet that wasn't tech. Unfortunately, in the last few years, coverage from large media outlets and the public's appetite for short-form video content have kind of killed a lot of interest in bloggers covering that space. It's currently on hiatus and I'm not sure what the next step, if any, will be. Other than shamelessly plugging what I’ve done, I wanted to comment that this was a really fun exercise to think over my place online and what it means to me—thanks again for the opportunity! Now that you're done reading the interview, go check the blog and subscribe to the RSS feed . If you're looking for more content, go read one of the previous 131 interviews . People and Blogs is possible because kind people support it.

0 views
./techtipsy 1 months ago

I gave the MacBook Pro a try

I got the opportunity to try out a MacBook Pro with the M3 Pro with 18GB RAM (not Pro). I’ve been rocking a ThinkPad P14s gen 4 and am reasonably happy with it, but after realizing that I am the only person in the whole company not on a MacBook, and one was suddenly available for use, I set one up for work duties to see if I could ever like using one. It’s nice. I’ve used various flavours of Linux on the desktop since 2014, starting with Linux Mint. 2015 was the year I deleted the Windows dual boot partition. Over those years, the experience on Linux and especially Fedora Linux has improved a lot, and for some reason it’s controversial to say that I love GNOME and its opinionated approach to building a cohesive and yet functional desktop environment. When transitioning over to macOS, I went in with an open mind. I won’t heavily customise it, won’t install Asahi Linux on it, or make it do things it wasn’t meant to do. This is an appliance, I will use it to get work done and that’s it. With this introduction out of the way, here are some observations I’ve made about this experience so far. The first stumbling block was an expected one: all the shortcuts are wrong, and the Ctrl-Super-Alt friendship has been replaced with these new weird ones. With a lot of trial and error, it is not that difficult to pick it up, but I still stumble around with copy-paste, moving windows around, or operating my cursor effectively. It certainly doesn’t help that in terminal windows, Ctrl is still king, while elsewhere it’s Cmd. Mouse gestures are nice, and not that different from the GNOME experience. macOS has window snapping by default, but only using the mouse. I had to install a specific program to enable window moving and snapping with keyboard shortcuts (Rectangle) , which is something I use heavily in GNOME. Odd omission by Apple. For my Logitech keyboard and mouse to do the right thing, I did have to install the Logitech Logi+ app, which is not ideal, but is needed to have an acceptable experience using my MX series peripherals, especially the keyboard where it needs to remap some keys for them to properly work in macOS. I still haven’t quite figured out why Page up/down and Home/End keys are not working as they should be. Also, give my Delete key back! Opening the laptop with Touch ID is a nice bonus, especially on public transport where I don’t really want my neighbour to see me typing in my password. The macOS concept of showing open applications that don’t have windows on them as open in the dock is a strange choice, that has caused me to look for those phantom windows and is generally misleading. Not being able to switch between open windows instead of applications echoes the same design choice that GNOME made, and I’m not a big fan of it here as well. But at least in GNOME you can remap the Alt+Tab shortcut to fix it. The default macOS application installation process of downloading a .dmg file, then opening it, then dragging an icon in a window to the Applications folder feels super odd. Luckily I was aware of the tool and have been using that heavily to get everything that I need installed, in a Linux-y way. I appreciate the concern that macOS has about actions that I take on my laptop, but my god, the permission popups get silly sometimes. When a CLI app is doing things and accessing data on my drive, I can randomly be presented with a permissions pop-up, stealing my focus from writing a Slack message. Video calls work really well, I can do my full stack engineer things, and overall things work, even if it is sometimes slightly different. The default Terminal app is not good, I’m still not quite sure why it does not close the window when I exit it, that “Process exited” message is not helpful. No contest, the hardware on a MacBook Pro feels nice and premium compared to the ThinkPad P14s gen 4. The latter now feels like a flexible plastic piece of crap. The screen is beautiful and super smooth due to the higher refresh rate. The MacBook does not flex when I hold it. Battery life is phenomenal, the need to have a charger is legitimately not a concern in 90% of the situations I use a MacBook in. Keyboard is alright, good to type on, but layout is not my preference. M3 Pro chip is fast as heck. 18 GB of memory is a solid downgrade from 32 GB, but so far it has not prevented me from doing my work. I have never heard the fan kick on, even when testing a lot of Go code in dozens of containers, pegging the CPU at 100%, using a lot of memory, and causing a lot of disk writes. I thought that I once heard it, but no, that fan noise was coming from a nearby ThinkPad. The alumin i um case does have one downside: the MacBook Pro is incredibly slippery. I once put it in my backpack and it made a loud thunk as it hit the table that the backpack was on. Whoops. macOS does not provide scaling options on my 3440x1440p ultra-wide monitor. Even GNOME has that, with fractional scaling! The two alternatives are to use a lower resolution (disgusting), or increase the text size across the OS so that I don’t suffer with my poor eyesight. Never needed those. I like that. Having used an iPhone for a while, I sort of expected this to be a requirement, but no, you can completely ignore those aspects of macOS and work with a local account. Even Windows 11 doesn’t want to allow that! Switching the keyboard language using the keyboard shortcut is broken about 50% of the time, which feels odd given that it’s something that just works on GNOME. This is quite critical for me since I shift between the Estonian and US keyboard a lot when working, as the US layout has the brackets and all the other important characters in the right places for programming and writing, while Estonian keyboard has all the Õ Ä Ö Ü-s that I need. I upgraded to macOS 26.3 Tahoe on 23rd of February. SSH worked in the morning. Upgrade during lunch, come back, bam, broken. The SSH logins would halt at the part where public key authentication was taking place, the process just hung. I confirmed that by adding into the SSH command. With some vibe-debugging with Claude Code, I found that something with the SSH agent service had broken after the upgrade. One reasonably simple fix was to put this in your : Then it works in the shell, but all other git integrations, such as all the repos I have cloned and am using via IntelliJ IDEA, were still broken. Claude suggested that I build my own SSH agent, and install that until this issue is fixed. That’s when I decided to stop. macOS was supposed to just work, and not get into my way when doing work. This level of workaround is something I expect from working with Linux, and even there it usually doesn’t get that odd, I can roll back a version of a package easily, or fix it by pulling in the latest development release of that particular package. I went into this experiment with an open mind, no expectations, and I have to admit that a MacBook Pro with M3 Pro chip is not bad at all, as long as it works. Unfortunately it doesn’t work for me right now. I might have gotten very unlucky with this issue and the timing, but first impressions matter a lot. The hardware can be nice and feel nice, but if the software lets me down and stops me from doing what’s more important, then it makes the hardware useless. It turns out that I like Linux and GNOME a lot. Things are simple, improvements are constant and iterative in nature, so you don’t usually notice it (with Wayland and Pipewire being rare exceptions), and you have more control when you need to fix something. Making those one-off solutions like a DIY coding agent sandbox, or a backup script, or setting up snapshots on my workstation are also super easy. If Asahi Linux had 100% compatibility on all modern M-series MacBooks, then that would be a killer combination. 1 Until then, back to the ol’ reliable ThinkPad P14s gen 4 I go. I can live with fan noise, Bluetooth oddities and Wi-Fi roaming issues, but not with something as basic as SSH not working one day. 2 any kind billionaires want to bankroll the project? Oh wait, that’s an oxymoron.  ↩︎ the fan noise can actually be fixed quite easily by setting a lower temperature target on the Ryzen APU and tuning the fan to only run at the lowest speed after a certain temperature threshold.  ↩︎ any kind billionaires want to bankroll the project? Oh wait, that’s an oxymoron.  ↩︎ the fan noise can actually be fixed quite easily by setting a lower temperature target on the Ryzen APU and tuning the fan to only run at the lowest speed after a certain temperature threshold.  ↩︎

0 views