Posts in Ui (20 found)
(think) 2 weeks ago

Neocaml 0.1: Ready for Action

neocaml 0.1 is finally out! Almost a year after I announced the project, I'm happy to report that it has matured to the point where I feel comfortable calling it ready for action. Even better - neocaml recently landed in MELPA, which means installing it is now as easy as a single package-install command. That's quite the journey from "a fun experimental project" to a proper Emacs package!

You might be wondering what's wrong with the existing options. The short answer - nothing is wrong per se, but each mode offers a different set of trade-offs:

- caml-mode is ancient and barely maintained. It lacks many features that modern Emacs users expect and it probably should have been deprecated a long time ago.
- Tuareg is very powerful, but also very complex. It carries a lot of legacy code and its regex-based font-locking and custom indentation engine show their age. It's a beast - in both the good and the bad sense of the word.
- neocaml aims to be a modern, lean alternative that fully embraces TreeSitter. The codebase is small, well-documented, and easy to hack on. If you're running Emacs 29+ (and especially Emacs 30), TreeSitter is the future and neocaml is built entirely around it.

Of course, neocaml is the youngest of the bunch and it doesn't yet match Tuareg's feature completeness. But for many OCaml workflows it's already more than sufficient, especially when combined with LSP support. I started the project mostly because I thought that the existing Emacs tooling for OCaml was somewhat behind the times - e.g. both of the older modes have features that are no longer needed in the current era. Let me now walk you through the highlights of version 0.1.

The current feature-set is relatively modest, but all the essential functionality one would expect from an Emacs major mode is there.

neocaml leverages TreeSitter for syntax highlighting, which is both more accurate and more performant than the traditional regex-based approaches used by caml-mode and Tuareg. The font-locking supports 4 customizable intensity levels (controlled via a user option, default 3), so you can pick the amount of color that suits your taste. Both .ml (source) and .mli (interface) files get their own major modes with dedicated highlighting rules.

Indentation has always been tricky for OCaml modes, and I won't pretend it's perfect yet, but neocaml's TreeSitter-based indentation engine is already quite usable. It also supports cycle-indent functionality, so hitting TAB repeatedly will cycle through plausible indentation levels - a nice quality-of-life feature when the indentation rules can't fully determine the "right" indent. If you prefer, you can still delegate indentation to external tools or even to Tuareg's indentation functions. Still, I think most people will be quite satisfied with the built-in indentation logic.

neocaml provides proper structural navigation commands powered by TreeSitter, plus integration that makes navigating the definitions in a buffer easier than ever. The older modes provide very similar functionality as well, of course, but the use of TreeSitter in neocaml makes such commands more reliable and robust.

No OCaml mode would be complete without REPL (toplevel) integration. neocaml provides all the essentials:

- Start or switch to the OCaml REPL
- Send the current definition
- Send the selected region
- Send the entire buffer
- Send a phrase (code up to the next phrase terminator)

The default REPL is the standard OCaml toplevel, but you can easily switch to a different one (e.g. utop). I'm still on the fence on whether I want to invest time into making the REPL integration more powerful or keep it as simple as possible. Right now it's definitely not a big priority for me, but I do want to match what the other, older OCaml modes offered in that regard.

neocaml works great with Eglot and lsp-mode, automatically setting the appropriate language IDs for both .ml and .mli files. Pair it with ocaml-eglot and you get a pretty solid OCaml development experience. The creation of LSP really simplified the lives of major mode authors like me, as many of the features that were historically major-mode specific are now provided by LSP clients out of the box. That's also another reason why you probably want a leaner major mode like neocaml.

But, wait, there's more:

- A command to quickly switch between .ml and .mli files
- Prettify-symbols support for common OCaml operators
- Automatic installation of the required TreeSitter grammars

There's still plenty of work to do:

- Compatibility with Merlin for those who prefer it over LSP
- Support for additional OCaml file types
- Improvements to structured navigation using newer Emacs TreeSitter APIs
- Improvements to the test suite
- Addressing feedback from real-world OCaml users
- Actually writing some fun OCaml code with neocaml

If you're following me, you probably know that I'm passionate about both Emacs and OCaml. I hope that neocaml will be my way to contribute to the awesome OCaml community.
I'm not sure how quickly things will move, but I'm committed to making the best OCaml editing experience on Emacs. Time will tell how far I'll get! If you're an OCaml programmer using Emacs, I'd love for you to take neocaml for a spin. Install it from MELPA, kick the tires, and let me know what you think. Bug reports, feature requests, and pull requests are all most welcome on GitHub! That's all from me, folks! Keep hacking!

0 views
David Bushell 2 weeks ago

Declarative Dialog Menu with Invoker Commands

The off-canvas menu — aka the Hamburger, if you must — has been hot ever since Jobs invented the mobile web and Ethan Marcotte put a name to responsive design. Making an off-canvas menu free from heinous JavaScript has always been possible, but not ideal. I wrote up one technique for Smashing Magazine in 2013. Later I explored the idea again in an absurdly titled post where I used the new Popover API.

I strongly push clients towards a simple, always visible, flex-box-wrapping list of links. Not least because leaving the subject unattended leads to a multi-level monstrosity. I also believe that good design and content strategy should allow users to navigate and complete primary goals without touching the "main menu". However, I concede that Hamburgers are now mainstream UI. Jason Bradberry makes a compelling case.

This month I redesigned my website. Taking the menu off-canvas at all breakpoints was a painful decision. I'm still not at peace with it. I don't like plain icons. To somewhat appease my anguish I added big bold "Menu" text. The HTML for the button is pure declarative goodness. I added an extra "open" prefix for assistive tech. Aside note: Ana Tudor asked do we still need all those "visually hidden" styles? I'm using them out of an abundance of caution but my feeling is that Ana is on to something.

The menu HTML is just as clean. It's that simple! I've only removed my opinionated class names I use to draw the rest of the owl. I'll explain more of my style choices later. This technique uses the wonderful new Invoker Command API for interactivity. It is similar to the Popover API approach I mentioned earlier. With a real dialog we get free focus management and more, as Chris Coyier explains. I made a basic CodePen demo for the code above.

So here's the bad news. Invoker commands are so new they must be polyfilled for old browsers. Good news; you don't need a hefty script. Feature detection isn't strictly necessary. Keith Cirkel has a more extensive polyfill if you need full API coverage like JavaScript events. My basic version overrides the declarative API with the JavaScript API for one specific use case, and the behaviour remains the same.

Let's get into CSS by starting with my favourite: a strong contrast outline around buttons and links with room to breathe. This is not typically visible for pointer events. For other interactions like keyboard navigation it's visible. The first button inside the dialog, i.e. "Close (menu)", is naturally given focus by the browser (focus is 'trapped' inside the dialog). In most browsers focus remains invisible for pointer events. WebKit has a bug. When using invoker commands the focus style is visible on the close button even for pointer events. This seems wrong, it's inconsistent, and clients absolutely rage at seeing "ugly" focus — seriously, what is their problem?!

I think I've found a reliable 'fix'. Please do not copy this untested. From my limited testing with Apple devices and macOS VoiceOver I found no adverse effects. Below I've expanded the 'not open' condition within the event listener. First I confirm the event is relevant. I can't check for an instance of a specific event class because of the handler. I'd have to listen for keyboard events and that gets murky. Then I check if the focused element has the visible focus style. If both conditions are true, I remove and reapply focus in a non-visible manner. The boolean option is supported in Safari 18.4 onwards. Like I said: extreme caution! But I believe this fixes WebKit's inconsistency. Feedback is very welcome. I'll update here if concerns are raised.
Native dialog elements allow us to press the ESC key to dismiss them. What about clicking the backdrop? We must opt in to this behaviour with a dedicated attribute. Chris Ferdinandi has written about this and the JavaScript fallback. That's enough JavaScript!

My menu uses a combination of both basic CSS transitions and cross-document view transitions. For on-page transitions I use the setup below. As an example here I fade opacity in and out. How you choose to use nesting selectors and the at-rule is a matter of taste. I like my at-rules top level. My menu also transitions out when a link is clicked. This does not trigger the closing dialog event. Instead the closing transition is mirrored by a cross-document view transition. The example below handles the fade out for page transitions. Note that I only transition the old view state for the closing menu. The new state is hidden ("off-canvas").

Technically it should be possible to use view transitions to achieve the on-page open and close effects too. I've personally found browsers to still be a little janky around view transitions — bugs, or skill issue? It's probably best to wrap a reduced-motion media query around transitions. "Reduced" is a significant word. It does not mean "no motion". That said, I have no idea how to assess what is adequately reduced! No motion is a safe bet… I think?

So there we have it! Declarative dialog menu with invoker commands, topped with a medley of CSS transitions and a sprinkle of almost optional JavaScript. Aren't modern web standards wonderful, when they work? I can't end this topic without mentioning Jim Nielsen's menu. I won't spoil the fun, take a look! When I realised how it works, my first reaction was "is that allowed?!" It works remarkably well for Jim's blog. I don't recall seeing that idea in the wild elsewhere. Thanks for reading! Follow me on Mastodon and Bluesky. Subscribe to my Blog and Notes or Combined feeds.

0 views
Giles's blog 3 weeks ago

Writing an LLM from scratch, part 32d -- Interventions: adding attention bias

I'm still seeing what I can do to improve the test loss for a from-scratch GPT-2 small base model, trained using code based on Sebastian Raschka's book "Build a Large Language Model (from Scratch)". This is the third intervention I'm trying: adding bias to the attention weight matrices.

In the code from the book, we have this: we initialise the weights W_q, W_k and W_v as linear layers rather than simple matrices of weights, and have a qkv_bias parameter to say whether or not we should add bias to those. In all of our trains so far we've set that to False. Why do we have this parameter, and where did it come from?

In Raschka's book, the use of nn.Linear for these weights is introduced in section 3.4.2 with the wording:

We can improve the implementation further by utilizing PyTorch's nn.Linear layers, which effectively perform matrix multiplication when the bias units are disabled. Additionally, a significant advantage of using nn.Linear instead of a manual implementation is that nn.Linear has an optimized weight initialization scheme, contributing to more stable and effective model training.

So, it's presented essentially as a way of getting better weights for our untrained model, which makes good sense in and of itself -- but, if that's the only reason, why don't we just hard-wire it to have no bias? That would be the sensible thing to do if the initialisation were the only reason, but clearly there's more to it than that. Section 4.1 has a bit more information:

qkv_bias determines whether to include a bias vector in the Linear layers of the multi-head attention ... We will initially disable this, following the norms of modern LLMs, but we will revisit it in chapter 6 when we load pretrained GPT-2 weights from OpenAI into our model.

That looks like a typo, as the real explanation is in chapter 5, section 5 (page 164 in my copy), where we do indeed load the OpenAI weights:

OpenAI used bias vectors in the multi-head attention module's linear layers to implement the query, key and value matrix computations. Bias vectors are not commonly used in LLMs anymore as they don't improve the modeling performance and are thus unnecessary.

So, that all makes sense so far. QKV bias was part of the original GPT-2 models, perhaps just because it was standard at the time, inherited from something else, or perhaps for some other reason -- I can't find any reference to it in the actual paper. But people have found it doesn't help, so no-one uses it these days. But... is there some way in which an LLM of this specific size, or in some other way similar to the GPT-2 small model that we're training, might in some way benefit from having bias? That's what this experiment is for :-)

One thing that occurred to me while setting this up is that we have been training on a Chinchilla-optimal number of tokens, 20x the number of parameters. Without QKV bias, we have 163,009,536 parameters, so we've been training on 3,260,190,720 tokens, rounded up to the nearest batch size, which is 3,260,252,160 in our current setup for these experiments (per-GPU micro-batches of 12, with 8 GPUs, so a total batch size of 96). These extra bias terms will be parameters, though! We're essentially making our model larger by adding them, which changes the Chinchilla calculation. How much? OK, that's essentially nothing -- 27,648 extra total parameters on top of 163 million. I make it less than two hundredths of a percentage point larger! The correct number of tokens goes up to 3,260,743,680, so if we wanted to be very pedantic, we're under-training.
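To make the setup concrete, here is a rough sketch (not the book's or the post's actual code) of the three QKV projections as nn.Linear layers with a bias switch, plus the back-of-the-envelope arithmetic behind the numbers above. The 12-layer / 768-dimension shape and the 1,024-token context length are assumptions on my part that happen to reproduce the post's figures.

```python
import math
import torch.nn as nn

# Assumed GPT-2-small-style dimensions (12 layers, d_emb = 768) -- consistent
# with the 27,648 figure above, but not spelled out explicitly in the post.
d_emb, n_layers = 768, 12

class QKVProjections(nn.Module):
    """Just the three attention projections, to show where qkv_bias enters."""
    def __init__(self, d_in, d_out, qkv_bias=False):
        super().__init__()
        # nn.Linear gives the optimised weight init; the bias term is optional
        self.W_query = nn.Linear(d_in, d_out, bias=qkv_bias)
        self.W_key = nn.Linear(d_in, d_out, bias=qkv_bias)
        self.W_value = nn.Linear(d_in, d_out, bias=qkv_bias)

proj = QKVProjections(d_emb, d_emb, qkv_bias=True)
bias_per_layer = sum(p.numel() for name, p in proj.named_parameters() if "bias" in name)
print(n_layers * bias_per_layer)  # 12 * 3 * 768 = 27,648 extra parameters

# Chinchilla-style token budget: 20 tokens per parameter, rounded up to a whole
# optimiser step. Batch = 96 sequences; the 1,024-token context is my assumption.
params_without_bias = 163_009_536
tokens_per_step = 96 * 1024
target = 20 * params_without_bias                               # 3,260,190,720
print(math.ceil(target / tokens_per_step) * tokens_per_step)    # 3,260,252,160
print(20 * (params_without_bias + n_layers * bias_per_layer))   # 3,260,743,680
```

With those assumptions, the 27,648 extra parameters and all three token counts come out to exactly the figures quoted above, which is a nice sanity check that the bias really is just three extra vectors per layer.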
But I feel like training on a larger dataset is worse in terms of comparability between the baseline and our "intervened-on" model with QKV bias. So: we'll train a model with QKV bias on 3,260,252,160 tokens, accepting that it's a tiny bit less than Chinchilla-optimal. Let's see how it goes!

Here's the config file for this train. Running it gives this training chart: Pretty standard, though the loss spikes look less prominent than they have been in the other trains. Might QKV bias actually help with model stability in some way...? The train finished with these stats: Timing-wise, pretty much indistinguishable from the baseline train's 12,243.523 seconds. The final train loss looks a tad better, but we can't rely on that -- the test set loss is the important one. So it was time to download it, upload it to Hugging Face Hub, and then on to the evals.

Firstly, our normal "how should you continue" prompt: Not bad at all, borderline coherent! Next, the loss on the test set: Well, crap! Now that's a surprise. Let's look at that in the context of the other interventions to see how surprising that is, given Raschka's comments (which were undoubtedly backed up by serious research): So, adding QKV bias actually improved our test set loss by more than gradient clipping did! The loss spikes in the training chart look smaller than in the other trains 1, so, speculating wildly, perhaps with a model of this size, the bias stabilises things somehow? Or perhaps what we're seeing is the model become that tiny bit smarter because it has some extra parameters -- albeit less than 0.02 percent more? I'm not going to spend time investigating things now, but this is a really interesting result.

One extra thing that does occur to me is that research since GPT-2 has definitely moved in the direction of larger models. The attention weight matrices are sized d_emb × d_emb, so excluding bias they have d_emb² weights each. Bias adds on another d_emb. So, as a model scales up, the attention-related non-bias weights will scale quadratically -- doubling d_emb will quadruple their number -- while the bias weights will scale linearly. So perhaps it's just that the effect -- whatever causes it -- gets rapidly swamped as you scale out of toy-model territory. That, at least, seems pretty plausible.

One final note to self, though: these improvements are small enough that I do find myself wondering whether or not it might be some kind of noise, despite the random seeds I'm setting. I think that at the end of this, before I do a final train, it would be worth doing another baseline train and measuring the test set loss again, and doing another comparison. If it comes out exactly the same -- and I can bump up the number of significant figures in the output, it's just a formatting parameter -- then I don't need to worry. But if they vary to some degree, perhaps I'll need to update my mental model of what level of finding is significant, and what isn't.

I think it goes without saying that QKV bias definitely goes onto the list of interventions we want to add when training our best-possible GPT-2 small-scale model, assuming that the random seed test goes well. That surprises me a bit, I was expecting it to have negligible impact! That, of course, is why it's worth doing these tests. Next up, I think, is trying to understand how we can tweak the learning rate, and its associated parameters like weight decay.
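To make the "swamped at scale" intuition concrete, here's a tiny illustrative calculation of the bias share of a single attention projection at a few embedding widths. Only the 768 figure corresponds to the GPT-2 small setup discussed here; the larger widths are just examples I picked.

```python
# Per attention projection: d_emb**2 weights, plus d_emb bias terms if enabled.
# Doubling d_emb quadruples the weights but only doubles the biases, so the
# bias share shrinks roughly like 1 / d_emb as models get wider.
for d_emb in (768, 1536, 12288):  # 768 = GPT-2 small; the others are illustrative
    weights, bias = d_emb ** 2, d_emb
    share = bias / (weights + bias)
    print(f"d_emb={d_emb:>6}: bias is {share:.4%} of this projection's parameters")
```

At GPT-2 small's width the bias vectors are already only about 0.13% of each projection; a few doublings later they are effectively a rounding error.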
That learning-rate investigation will need a bit of a deep dive, so you can expect the next post late next week, or perhaps even later. I'm sure you can't wait ;-)

Note to self: is there some way I could quantitatively measure those? ↩

0 views
Evan Schwartz 3 weeks ago

Scour - January Update

Hi friends, In January, Scour scoured 805,241 posts from 16,555 feeds (939 were newly added). I also rolled out a lot of new features that I'm excited to tell you about. Maybe because of some of these, I found more posts than usual that I thought were especially worth sharing. You can find them at the bottom of this post. Let's dive in! The Scour homepage has been completely revamped. It includes a new tagline, a more succinct description, and a live demo where you can try out my feed right from that page. Let me know what you think! Scour also finally has its own logo! (And it looks great on my phone's home screen, if I do say so myself! See below ) Have you ever wondered how Scour works? There is now a full documentation section, complete with detailed write-ups about Interests , Feeds , Reactions , How Ranking Works , and more. There are also guides specifically for RSS users and readers of Hacker News , arXiv , Reddit , and Substack . All of the docs have lots of interactive elements, which I wrote about in Building Docs Like a Product . My favorite one is on the Hacker News guide where you can search for hidden gems that have been submitted to HN but that have not reached the front page. Thanks to Tiago Ferreira , Andrew Doran , and everyone else who gave me the feedback that they wanted to understand more about how Scour works! Scour is now a Progressive Web App (PWA). That means you can install it as an icon on your home screen and access it easily. Just open Scour on your phone and follow the instructions there. Thanks to Adam Benenson for the encouragement to finally do this! This is one of the features I have most wanted as a user of Scour myself. When you're browsing the feed, Scour now keeps track of which items you've seen and scrolled past so it shows you new content each time you check it. If you don't want this behavior, you can disable it in the feed filter menu or change your default view to show seen posts. If you subscribe to specific feeds, as opposed to scouring all of them, it's now easier to find the feed for an article you liked . Click the "..." menu under the post, then "Show Feeds" to show feeds where the item was found. When populating that list, Scour will now automatically search the website where the article was found to see if it has a feed that Scour wasn't already checking. This makes it easy to discover new feeds and follow websites or authors whose content you like. This was another feature I've wanted for a long time myself. Previously, when I liked an article, I'd copy the domain and try to add it to my feeds on the Feeds page. Now, Scour does that with the click of a button. Some of the most disliked and flagged articles on Scour had titles such as "The Top 10..." or "5 tricks...". Scour now automatically penalizes articles with titles like those. Because I'm explicitly trying to avoid using popularity in ranking , I need to find other ways to boost high-quality content and down-rank low-quality content. You can expect more of these types of changes in the future to increase the overall quality of what you see in your feed. Previously, posts found through Google News links would show Google News as the domain under the post. Now, Scour extracts the original link. You can now navigate your feed using just your keyboard. Type to get the list of available keyboard shortcuts. Finally, here are some of my favorite posts that I found on Scour in January. There were a lot! Happy Scouring! Have feedback for Scour? 
Post it on the feedback board and upvote others' suggestions to help me prioritize new features!

- I appreciate this minimalist approach to coding agents: Pi: The Minimal Agent Within OpenClaw, even though it didn't yet convince me to switch away from Claude Code.
- A long and interesting take on which software tools will survive the AI era: Software Survival 3.0.
- Scour uses Litestream for backup. While this new feature isn't directly relevant, I'm excited that it's now powering Fly.io's new Sprites offering (so I expect it to be a little more actively developed): Litestream Writable VFS.
- This is a very cool development in embedding models: a family of different-size (and, as a result, different-cost) models whose embeddings are interoperable with one another: The Voyage 4 model family: shared embedding space with MoE architecture.
- A thought-provoking piece from Every about How AI Made Pricing Hard Again. TL;DR: gone are the days when SaaS businesses had practically zero marginal cost for additional users or additional usage.
- A nice bit of UX design history about the gas tank arrow indicator on a car, with a lesson applied to AI: The Moylan Arrow: IA Lessons for AI-Powered Experiences.
- Helpful context for Understanding U.S. Intervention in Venezuela.
- Stoolap: an interesting new embedded database. Stoolap 0.2 Released For Modern Embedded SQL Database In Rust.
- I keep browsing fonts and, while I decided not to use this one for Scour, I think this is a neat semi-sans-serif from an independent designer: Heliotrope.

0 views
ava's blog 3 weeks ago

a month without caffeine - conclusion

In January, I wrote about doing a month without caffeine and gave an update one week in. In the original post, I wrote about realizing I was using it to override exhaustion rather than addressing it. I had been relying on matcha, black tea, some coffee and caffeinated water flavours to get through poor sleep, university pressure, workouts, and social commitments, which ultimately led to burnout. So I decided to quit for a month, and also not allow any decaf products, since they also contain a lesser amount. I experienced withdrawal headaches, nausea and changes to my hunger, but also my energy became steadier, my mood calmer, and my concentration more sustainable without the sharp spikes and crashes. I concluded with some lessons for when I resume, namely reserving caffeinated drinks for when it really matters, not consuming them after noon, reducing the caffeine intake (less strong matcha or black tea, fewer coffee shots etc.) and not using it to suppress hunger or other needs.

Now that the month has passed, I'm back to report that it continued like the first post ended; I feel very calm, emotions and situations are more manageable, and focus and task-switching are less of an issue. Getting up and going to bed feel easier. What took the longest to normalize were the gastrointestinal effects; it became clear my body relied on the caffeine to do that business at the usual times, and at first, everything was very delayed and I dealt with constipation. But during the third week, it went back to normal. I've had quite a few moments towards the end where I almost gave in, but I persisted. Sometimes I just really crave a specific taste or mouthfeel, and nothing can really replace matcha for me. It's such a comfort and reward. I'm also very, very used to having specific kinds of beverages to study or work on something, so breaking that was difficult.

I think this reset was great. I found out I can just go without caffeine without a meaningful drop in productivity, and I genuinely feel happier, more rested and stable. Now I know it's still entirely optional and I can enjoy it for the taste or specific rituals to get ready :) I like to think I have reset my palate with this too, which will come in handy for upcoming matcha reviews! Now I will enjoy a mocha chocolate bar I saved for this! Reply via email Published 04 Feb, 2026

0 views
Jeff Geerling 1 month ago

Ode to the AA Battery

Recently this post from @Merocle caught my eye: I'm fixing my iFixit soldering station. I haven't used it for a long time and the battery has gone overdischarge. I hope it will come back to life. Unfortunately, there are no replacements available for sale at the moment. Devices with built-in rechargeable batteries have been bugging me a lot lately. It's convenient to have a device you can take with you and use anywhere. And with modern Li-ion cells, battery life is remarkable.

0 views
iDiallo 1 month ago

Chatbots Only Exist Because the UI Sucks

There was a time when building a good UI was really hard. My default Microsoft Word window had at least five toolbars. My web browser opened to Yahoo, where finding anything felt impossible. Internet Explorer sprouted toolbars I never remembered installing. We crammed features into every nook and cranny of the screen. Then minimalist design took over the web. Finally, we had breathing room. Finally, I could see the actual background color of a page. But I also couldn't find anything useful because every feature was buried behind a hidden menu. Google became our default support tool. When was the last time you clicked the help menu in your favorite application? Never. You Google how to perform a task, and Google shows you how. But what if you're not looking for "how to do X?" What if you need to know where your order is? What if you need information specific to your account that Google has no way of accessing? For that, only the company or its support team can help you. But support teams are expensive. You need something resembling a call center. Even outsourcing doesn't eliminate the cost. A few years back, I worked at a small startup building AI chatbots to answer exactly these types of questions. Where is my order? How do I return this? Can I exchange these shoes for a different size? We automated the responses. It was amazing. We resolved up to 40% of queries without human intervention. We had clients worldwide, and it worked remarkably well. But one question remained in the back of my mind. Why are customers using the chatbot in the first place? Why are they asking for tracking information? Why don't they just check the website? After reviewing dozens of client websites, I reached a simple conclusion: their UI sucked . Our clients made their interfaces terrible in two ways. One unintentional, one deliberate. Many e-commerce sites bury order information in bizarre places. You navigate through submenus, click three links deep, only to find your order with no useful details. Poor integration between the e-commerce platform and the internal POS system means the order status never reflects reality. Sometimes you have to call and read your order number to someone with direct system access. Even if they couldn't provide real-time updates, why didn't they simply state on the website: "Orders update within 24 hours." Any information beats silence . Then there are the deliberate cases. Companies that actively hide information from customers. Facebook won't give you a direct link to close your account. Instead, their help page lists seven steps to reach the deletion page. I've bookmarked it, but they regularly update the process to ensure you always jump through hoops. At the AI startup, some potential clients explicitly asked us to create a maze to frustrate customers trying to cancel subscriptions. When we refused, they ghosted us. I found it ironic that clients would pay us to automate order information delivery instead of simply adding a prominent link to their homepage. We even built a product, a single page where customers could handle all order-related transactions themselves. Our clients never realized that if they'd just built that page themselves, they wouldn't have needed us at all. When a customer reaches for your chatbot, it's not because they think chatbots are cool. It's because you've failed them. Somewhere along the way, they couldn't find their tracking information. They got lost in a labyrinth of FAQs. They hit a roadblock trying to resolve a simple issue. 
By the time they open that chat window, they're already frustrated or confused. They don't want another layer of complexity. They want a fast, simple solution. Your chatbot is a band-aid covering a wound you inflicted. Fix your UI. Make information easy to find. Stop hiding basic functionality behind menus and mazes and you might not need that expensive chatbot after all.

2 views
Abhinav Sarkar 1 month ago

Implementing Co, a Small Language With Coroutines #5: Adding Sleep

In the previous post , we added channels to Co , the small language we are implementing in this series of posts. In this post, we add the primitive to it, enabling time-based coroutine scheduling. We then use sleep to build a simulation of digital logic circuits. This post was originally published on abhinavsarkar.net . This post is a part of the series: Implementing Co, a Small Language With Coroutines . Sleep is a commonly used operation in concurrent programs. It pauses the execution of the current Thread of Computation (ToC) for a specified duration, after which the ToC is resumed automatically. Sleep is used for various purposes: polling for events, delaying execution of an operation, simulating latency, implementing timeouts, and more. Sleep is generally implemented as a primitive operation in most languages, delegating the actual implementation to the underlying operating system. The operating system’s scheduler removes the ToC from the list of runnable ToCs , places it in a list of sleeping ToCs , and after the specified duration, moves it back to the list of runnable ToCs for scheduling. Since Co implements its own ToC (coroutine) scheduler, we implement sleep as a primitive operation within the interpreter itself 1 . We start by exposing and as built-in functions to Co : The built-in function takes one argument—the duration in milliseconds to sleep for. The function returns the current time in milliseconds since the Unix epoch . Both of them delegate to the functions explained next. The function evaluates its argument to a number, checks that it is non-negative, and then calls the function in the monad. calls and returns the milliseconds wrapped as a . The implementation of sleep is more involved than other built-in functions because it interacts with the coroutine scheduler. When a coroutine calls , we want to suspend the coroutine, and schedule it to be resumed after the specified duration. There may be multiple coroutines in the sleep state at a time, and they must be resumed according to their wakeup time (time at which sleep was called + sleep duration), and not in any other order. To be efficient, it is also important that the scheduler does not poll repeatedly for new coroutines to wake up and run, but instead waits till the right time. These are the two requirements for our coroutine scheduler. And the solution is: delayed coroutines. The coroutines we have implemented so far were scheduled to run immediately. To implement sleep, we extend the coroutine concept with Delayed Coroutines —coroutines that are scheduled to run at a specific future time. Now the data type holds an to signal when the coroutine is ready to be run. The old-style coroutines that run immediately are created ready to run by the function. But delayed coroutines are different: The key difference from a regular coroutine is that the used for signaling is created empty. We fork a thread 2 that sleeps 3 for the specified sleep duration, and then signals that the coroutine is ready to run by filling the . An is a synchronization primitive 4 —essentially a mutable box that can hold a value or be empty. When we call on an empty , it blocks until another thread fills it. This is what makes it powerful for our use case: instead of the interpreter repeatedly polling the queue asking “is this coroutine ready yet?”, we let the interpreter wait on the . The forked thread signals readiness at the right time by filling the . The interpreter wakes up immediately—no wasted CPU cycles, no busy-waiting. 
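The signalling pattern described here — an initially empty box that a forked thread fills after sleeping, and that the scheduler blocks on instead of polling — is exactly what Haskell's MVar provides. Purely as an illustration of the idea (and not the Co interpreter's actual Haskell code), here's a rough Python analogue that uses threading.Event as the one-shot "ready" signal; all the names are mine.

```python
import threading
import time
from dataclasses import dataclass, field

# A rough Python analogue of a "delayed coroutine": the ready flag starts unset,
# a helper thread sleeps for the delay and then sets it, and the scheduler
# simply waits on the flag instead of repeatedly polling the run queue.
@dataclass
class DelayedCoroutine:
    run: callable                      # the suspended continuation to resume
    ready: threading.Event = field(default_factory=threading.Event)

def schedule_after(delay_secs, continuation):
    coro = DelayedCoroutine(run=continuation)
    def waker():
        time.sleep(delay_secs)         # cheap green-thread sleep in the real interpreter
        coro.ready.set()               # signal "this coroutine may run now"
    threading.Thread(target=waker, daemon=True).start()
    return coro

def run_when_ready(coro):
    coro.ready.wait()                  # blocks with no busy-waiting
    coro.run()

# Usage sketch:
c = schedule_after(0.5, lambda: print("woke up after ~500 ms"))
run_when_ready(c)
```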
We already have a of coroutines in our . It is a min-priority queue sorted by timestamps, which we have been using as a FIFO queue till now. Now we use it for its real purpose: storing delayed coroutines sorted by their wakeup times. The queue also tracks the maximum wakeup time of all coroutines in the queue. This information is useful for calculating how long the interpreter should sleep before termination. The core operations on the queue are: We saw the function earlier : The function enqueues the given value at the given time in the queue. The function enqueues the value at the current time, thus scheduling it to run immediately. The function dequeues the value with the lowest priority from the queue, which in this case, is the value that is enqueued first. The function returns the monotonically increasing current system time. The function dequeues the coroutine with lowest priority, so if we use the wakeup time as priority, it will dequeue the coroutine that is to be run next. That works! The function calculates and tracks the maximum wakeup times of the coroutines as well. Next, we implement the scheduling of delayed coroutines: The function enqueues a coroutine in the interpreter coroutine queue with the specified wakeup time. We also improve the function to wait for the coroutine to be ready before running it. The function call blocks till the thread that was forked when creating the coroutine wakes up and fills the . So we don’t have to poll the queue. That’s all we have to do for having delayed coroutines. With the infrastructure in place, the function becomes straightforward: When a coroutine calls , we capture the current environment and use to capture the continuation—the code that should run after the sleep completes. We then create a new delayed coroutine with this continuation, schedule it for the future, and run the next coroutine in the queue. The scheduler machinery takes care of running the delayed coroutine at the right time. We also modify the function from the previous post to handle delayed coroutines. It now sleeps till the last wakeup time before checking if the queue is empty: Notice how we use the function we just defined in . The function calculates how long to sleep before the last coroutine becomes ready: That’s all for sleeping. This may be too much to take in, so let’s go through some examples. Sleep can be used for polling/waiting for events, delaying execution, simulating latency, implementing timeouts, and more. Let’s see some simple uses. An interesting example of sleep is the infamous sleep sort , which sorts a list of numbers by spawning a coroutine for each number that sleeps for the duration of that number, then prints it: Running this program prints what we expect: Don’t use for sorting your numbers though. Moving on. With sleep, we can implement JavaScript-like and functions: The function spawns a coroutine that sleeps for the specified duration and then calls the callback function. The function repeatedly calls a callback at a fixed interval using to reschedule itself. Running the above code prints alternating and every 1 second, forever: Notice that the scheduling is not accurate up to milliseconds, but only approximate. As a more complex example of using sleep, we implement a simulator for digital logic circuits, from basic Logic gates to a Ripple carry adder . 
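Before moving on to the circuit simulator, here's the sleep-sort trick from above sketched in Python with ordinary threads. The post's version of course uses Co coroutines and its sleep primitive; this is just an illustration of the idea, and the 0.05-second scale factor is an arbitrary choice of mine.

```python
import threading
import time

def sleep_sort(numbers, scale=0.05):
    """Print numbers in ascending order by sleeping proportionally to each value.
    A party trick, not a real sorting algorithm: it is timing-based and approximate."""
    result = []
    def worker(n):
        time.sleep(n * scale)   # bigger numbers sleep longer, so they print later
        result.append(n)
        print(n)
    threads = [threading.Thread(target=worker, args=(n,)) for n in numbers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result

sleep_sort([3, 1, 4, 1, 5, 9, 2, 6])   # prints 1 1 2 3 4 5 6 9 (usually!)
```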
The idea is to model circuits as a network of wires and gates, where the wires carry digital signal values ( or ), and the logic gates transform input signals to output signals with a propagation delay. The digital circuit simulation example is from the Wizard Book . Quoting an example: An inverter is a primitive function box [logic gate] that inverts its input. If the input signal to an inverter changes to 0, then one inverter-delay later the inverter will change its output signal to 1. If the input signal to an inverter changes to 1, then one inverter-delay later the inverter will change its output signal to 0. But first, we’ll need to make some lists. We implement a simple cons list (a singly linked list) using a trick from the book itself : creates an empty list, and we grow the list by prepending an element to it by calling the function. returns the first element of a list, and returns the rest of them. Notice that a cell is just a closure that holds references to its first and rest parameters, and returns a selector function to retrieve them. Next, we define a helper function to call a list of actions, yielding after each one: A wire holds a mutable signal value and a list of actions to call when the signal changes: A wire provides three operations: The function connects two wires, causing the signal from one to propagate to another. First, we define the basic logic operations: And a utility function to schedule a function to run after a delay: With these building blocks, we define the logic gates. Each gate computes its output based on its inputs and schedules the output update after a propagation delay specific to the gate: We add the action to each input wire, which runs when the input signals change, and sets the signal on the output wire after a delay. Let’s test an And gate: For probing, we define a helper that logs signal changes with milliseconds elapsed since start of the run: The output: It works as expected. You can notice the sleep and the And gate delay in action. Using the basic logic gates, next we build adders. A Half adder is a digital circuit that adds two bits: It has two input signals/bits and , and two output bits and . We simply connect the And, Or and Not gates with input, output and intermediate wires in our code as shown in the diagram: Nice and simple. Let’s test it: And the output: In binary, . Correct! Notice again how the signal propagation through the gates is delayed. Next up is the full adder. A Full adder adds three bits, two inputs and a carry-in: Notice that a full adder uses two half adders. Again, we follow the diagram and connect the wires: Let’s skip the demo for full adder and jump to something more exciting. A Ripple-carry adder chains together multiple full adders to add multi-bit numbers. The diagram below shows a four-bit adder: We create a ripple-carry adder that can add any number of bits. First we need some helper functions: creates a list of wires to represent an N-bit input/output. sets the bits of a N-bit wire list to a given N-bit value. Now we write a ripple-carry adder: The ripple-carry adder uses one full adder per bit, cascading the carry-out bit of each input bit-pair’s sum to the next pair of bits. To demonstrate, let’s add two 4-bit numbers: This one runs for a while because of the collective delays. Let me pick out the final output: We add and in binary, resulting in , which is correct again. Everything works perfectly. 
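To recap the simulator's core in a compact form: a wire supports three operations — reading the current signal value, setting a new signal value (which calls all registered actions if the value changed), and adding an action that is called on changes and once immediately when added. Here's a rough Python analogue of that machinery. The post's version is written in Co and uses sleep for the gate delays; this sketch uses a simulated-time agenda instead, and all the names (get_signal, set_signal, add_action, the 3-tick AND delay) are mine, not the post's.

```python
import heapq
import itertools

class Agenda:
    """A simulated clock: actions are queued with a wakeup time and run in order."""
    def __init__(self):
        self.now, self._counter, self._items = 0, itertools.count(), []
    def after_delay(self, delay, action):
        heapq.heappush(self._items, (self.now + delay, next(self._counter), action))
    def run(self):
        while self._items:
            self.now, _, action = heapq.heappop(self._items)
            action()

class Wire:
    def __init__(self):
        self.signal, self._actions = 0, []
    def get_signal(self):
        return self.signal
    def set_signal(self, value):
        if value != self.signal:          # notify actions only on a real change
            self.signal = value
            for action in self._actions:
                action()
    def add_action(self, action):
        self._actions.append(action)
        action()                          # called once immediately, as described above

AND_DELAY = 3                             # arbitrary propagation delay, in ticks

def and_gate(a, b, out, agenda):
    def on_change():
        new_value = a.get_signal() & b.get_signal()
        agenda.after_delay(AND_DELAY, lambda v=new_value: out.set_signal(v))
    a.add_action(on_change)
    b.add_action(on_change)

# Usage sketch: probe an And gate.
agenda = Agenda()
a, b, out = Wire(), Wire(), Wire()
out.add_action(lambda: print(f"t={agenda.now}: out -> {out.get_signal()}"))  # prints out -> 0 right away
and_gate(a, b, out, agenda)
a.set_signal(1)
b.set_signal(1)
agenda.run()   # processes the delayed gate output: prints out -> 1 at t=3
```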
With the addition of sleep, we've now implemented all major features of Co — a complete concurrent language with first-class coroutines, channels, and time-based scheduling. Over these five posts, we went from parsing source code to building a full interpreter that handles cooperative multitasking using coroutines. The key insight was realizing that coroutines are just environments plus continuations. By designing our interpreter to use continuation-passing style, we gained the ability to suspend execution at any point and resume it later. Channels built naturally on top of that, providing a way for coroutines to synchronize and pass messages. And sleep extended the scheduler to handle time-based execution, unlocking patterns like timeouts and periodic tasks. The examples we built along the way—pubsub system, actor system, and digital circuit simulation—show what becomes possible once these primitives are in place. Starting with basic arithmetic and functions, we ended up with a language capable of expressing real concurrent programs. What comes next? Maybe a compiler for Co? Stay tuned by subscribing to the feed or the email newsletter. The full code for the Co interpreter is available here. If you have any questions or comments, please leave a comment below. If you liked this post, please share it. Thanks for reading!

The sleep implementation in Co is not interruptible. That is, if a coroutine is sleeping, it cannot be resumed before the specified duration. This is different from sleep implementations in most programming languages, where the sleep operation can be interrupted by sending a signal to the sleeping ToC. ↩︎

Threads in GHC are Green Threads and are very cheap to create and run. It is perfectly okay to fork a new one for each delayed coroutine. ↩︎

So in a way, we cheat here by using the sleep primitive provided by the GHC runtime to implement our sleep primitive. If we write a compiler for Co, we'll have to write our own runtime where we'll have to implement our sleep function using the functionalities provided by the operating systems. ↩︎

To learn more about how they can be used to communicate between threads, read chapter 24 of Real World Haskell. ↩︎

0 views
Jason Fried 1 month ago

Great design was a Cuisinart

The original Cuisinart was peak interface/appliance design. Two perfect paddle buttons. One for ON that catches and stays down for continual operation. And one for PULSE/OFF that does exactly what it says. It doesn't stay down, it returns to reset when you let go. Hold it down as long as you want to pulse. Let go and it stops. Or just push it down to disengage ON, and turn it off. The description may sound a bit convoluted because of the dual purpose PULSE/OFF button, but when you use it it's absolutely obvious. And couldn't be better. Big flat toggle buttons under satisfying tension. Clear affordances. You push them down, not in. One click catches, one doesn't. You can use them with bare hands, oven mitts, fingers coated in slippery whatever. Doesn't matter. Plenty of surface area. Outstanding design. It's all been downhill from there. -Jason

0 views
Gabriel Weinberg 1 month ago

As AI displaces jobs, the US government should create new jobs building affordable housing

We have a housing shortage in the U.S., and it is arguably a major cause of long-term unrest about the economy. Putting aside whether AI will eliminate jobs on net, it will certainly displace a lot of them. And the displaced people are unlikely to be the same people who will secure the higher-tech jobs that get created. For example, are most displaced truck drivers going to get jobs in new industries that require a lot of education?

Put these two problems together and maybe there is a solution hiding in plain sight: create millions of new jobs in housing. Someone has to build all the affordable homes we need, so why not subsidize jobs and training for those displaced by AI? These jobs will arguably offer an easier onramp and are sorely needed now (and likely for the next couple of decades as we chip away at this housing shortage). Granted, labor may not be the primary bottleneck in the housing shortage, but it is certainly a factor and one that is seemingly being overlooked. There are many bills in Congress aimed at increasing housing supply through new financing and relaxed regulatory frameworks. A program like this would help complete the package. None of this has been happening via market forces alone, so the government would therefore need to create a new program at a large scale, like the Works Progress Administration (WPA) at the end of the Great Depression, but this time squarely focused on affordable housing (and otherwise narrowly tailored to avoid inefficiencies). There are a lot of ways such a program could work (or not work), including ways to maximize the long-term public benefit (and minimize its long-term public cost), but this post is just about floating the high-level idea.

So there you have it. I'll leave you though with a few more specific thought starters: Every state could benefit since every state has affordable housing issues. Programs become more politically viable when more states benefit from them. Such a program could be narrowly tailored, squarely focused on affordable housing (as mentioned above), but also keeping the jobs time-limited (the whole program could be time-limited and tied to overall housing stock), and keeping the wages slightly below local market rates (to complement rather than compete with private construction). It could also be tailored to those just affected by AI, but that doesn't seem like the right approach to me. The AI job market impact timeline is unclear, but we can nevertheless start an affordable-housing jobs program now that we need today, which can also serve as a partial backstop for AI-job fallout tomorrow. It seems fine to me if some workers who join aren't directly displaced by AI, since the program still creates net new jobs we will need anyway and to some extent jobs within an education band are fungible. We will surely need other programs as well to help displaced workers specifically (for example, increased unemployment benefits). Thanks for reading! Subscribe for free to receive new posts or get the audio version.

0 views
Stratechery 2 months ago

An Interview with Rivian CEO RJ Scaringe About Building a Car Company and Autonomy

Listen to this post: Good morning, Today's Stratechery Interview is with Rivian founder and CEO RJ Scaringe. Last week Rivian held their Autonomy and AI Day, where the company unveiled its plans for a fully integrated approach to self-driving. Rivian is building everything from its own chips to its own sensors — including video, LiDAR, and radar — and if all goes well, the company will supply a multitude of companies, particularly Volkswagen. In this interview we cover all aspects of Rivian, including the long path to starting the company, production challenges, and why partnerships with Amazon and Volkswagen are so important and point to relationships in the future. We also dive into autonomy, and why Rivian is taking a different path than Tesla, plus I ask why CarPlay isn't available on Rivian vehicles, and what that reveals about their nature. As an aside to podcast listeners: due to a mind-boggling mistake by me, the first 20 minutes of this podcast are of considerably lower audio quality. I forgot to hit 'Record', so the segment that remains is what the Rivian PR representative captured on her phone. I'm incredibly grateful for the save. As a reminder, all Stratechery content, including interviews, is available as a podcast; click the link at the top of this email to add Stratechery to your podcast player. On to the Interview: This interview is lightly edited for content and clarity. RJ Scaringe, welcome to Stratechery. RJS: Happy to be here. We are here to talk about Rivian and your recent Autonomy and AI Day. Before we get to that, however, I want to learn more about you and your background, and how you ended up sitting with me today. You were, as I understand it, into cars at a very early age. RJS: I've been around cars as long as I can remember. As a kid I was restoring and working on cars. I spent time in restoration shops helping and slowly learning how to do more than "help", but actually really help. And then around the age of, I guess 10-ish, decided I wanted to start a car company. Oh, okay. So there was no like, "Oh, I got a computer and started typing out BASIC", this is straight cars all the way. RJS: Yeah, I knew I wanted to start a car company. And at that point when you're a kid, you have no idea what it entails, you have no idea what the business is going to be, but I just knew that it was something I wanted to do and I sort of started charting out my future path with that as the end state goal, with that as the context. So I went and worked as a machinist, I ultimately went to school for engineering, I did a degree in mechanical engineering, then I went and did a master's and a PhD focused on automotive. And then the day after I finished my PhD, I officially started Rivian. So why did you think it was necessary to do that level of education? Not just a bachelor's or not just gain experience, but to go all the way through to the PhD? RJS: It was actually pretty intentional. I knew that to start a car company, I was intellectually honest with myself that it would take a lot of money and I knew that I didn't have any money. So that meant for me to do this and be successful, I would need to get other people to invest money into the idea and typically in the tech space, you could start something with not a lot of capital that you can make a very crude version of your first product… Because it's software, not hardware. RJS: Exactly, and I also knew that I didn't want to go work for 25 or 30 years to accumulate experience that would make me credible.
So I was like, "What's the fastest path to credibility?", and my thought was it would be a PhD. I said, "If I get a PhD from a top school" — I went to MIT — I thought that would be some earned credibility that would make it more likely that investors would want to get into the company. I didn't grow up around venture capital, I had no idea around any of these things, that was my hypothesis. And amazingly, it proved to be a key element in Rivian's journey, because one of our early investors, one of our earliest large investors, I should say, was someone that I was introduced to through MIT and was an alumnus of the school and I was connected with them through the provost, and that was ultimately what led to some of the really critical early capital into the business. So is it the end state that the PhD was totally worth it, but the actual academics was completely incidental to this introduction? RJS: Yeah, and I think as is the case I think with all higher education, the biggest takeaway is to learn how to learn, and to learn how to solve complex problems. I think undergrad, you're learning how to learn. Graduate school, particularly for technical degrees, you identify a problem and you work really hard to solve that problem, and you have broad responsibility and broad scope on doing that activity, and you build confidence and you build skillsets around problem solving. But the problems are going to change in the course of a life or in the course of your career, the things I was working on 20 years ago have nothing to do with Rivian whatsoever. Well, I'm actually kind of curious about that. What were the things that you were working on 20 years ago, and why aren't they applicable? RJS: Well, in the case of automotive research, in 2005 the work that was funded, which is the kind of work that you do as a PhD student, so you get sponsored by companies or by grants, was to look at making engines, internal combustion engines, more efficient, that was primarily the focus. And so I was working on something called a homogeneous charge compression ignition engine, which is a different type of combustion. We'd compression ignite a pre-mixed fuel air mixture, very hard to get- Like diesel? RJS: It's like diesel efficiency with gasoline-like cleanliness is the idea. Obviously, it's not a technology that has any runway and makes any sense in the future. (laughing) I'm hearing about it for the first time right now. RJS: Yeah, so it was an interesting project. It was really a study in software controls, because that was the challenge of this project. But I didn't take a single piece of that and use it in starting Rivian. Now, that's different, some folks turn their PhD into the foundation of a business and start a business off the back of it, I had the benefit of just being in the automotive lab, I had the benefit of working closely with car companies. Big, large car companies were funding a lot of the work and it further solidified my view that I didn't want to go work at one of those companies, and I thought the likelihood of me learning the necessary skills is lower working at one of those places than me learning by doing, by just going and starting a company as a 26-year-old PhD graduate. Right. So if you start out with cars as a child and you're coming all the way up, at what point did you know that the car — you wanted to start a car company, at what point did you know it was going to be electric and not internal combustion? RJS: That was far less clear.
I wanted to make cars very efficient and I wanted to design cars that would essentially help define what the future of state would look like. But when you’re 10, you have no idea what that means. When you’re 20 — and at this point, this was early 2000s, it still wasn’t that clear — and so it didn’t really become clear until I started the business. Even before starting the business, one of the concepts that was competing for what Rivian ultimately would become was this idea I had for a pedal-powered car, which at the time I was thinking could be a hybrid-electric, except the hybrid, it was human-plus-electric drive and amazingly, full circle, that happens with e-bikes. E-bike is the most popular electric— Oh right, yeah. RJS: But 25 years ago, 20-plus years ago, that wasn’t clear that e-bikes were going to be an explosive success as they’ve been. And then I created within Rivian, a skunkworks team that’s now spun out into a new company to actually focus on this pedal and pedal-hybrid electric vehicles. So we have a quadricycle we’re doing with Amazon as a first big customer, but the name of this company is Also, and the idea of this spin-out from Rivian is that if you want to electrify the world, you need to electrify vehicles, but you also need to electrify everything else, and so Also is doing everything else. So Tesla started in 2003. Was there any inspiration or connection there, or is it just incidental that it ended up being kind of around the same time? RJS: Yeah, so they started in ’03, I started in ’09. Of course I was aware of Tesla, but Tesla launched their first car, the Roadster , before I even started Rivian and so they launched the Roadster and then they were working on the Model S , which it doesn’t get talked about a lot, but there was a time when the Model S was considering using a series hybrid architecture as well. Ultimately, they went pure EV but that was in like 2008, 2009, and I started the company just as, my view is there’s going to need to be a lot of successful choices, and I’d been on that mission for a while — what I didn’t expect is just the process of raising capital is really hard. So is that actually where they did help you a lot, just because eventually once they got over the hump and it was a successful venture, did that make it easier for you in the long run? RJS: I think so, and I think Tesla was the existence proof that I’d say more than raising capital, what Tesla did is they showed that electric cars could be cool. RJS: And they did that with the Roadster. So they launched this Roadster, they took a Lotus Elise , they re-engineered it, they made it electric. It was super fast, it was really cool, this was way before anybody had thought about electric vehicles as something that could be fun or fast accelerating. Now it’s hard to believe that 20 years ago this was the case, but at the time they really took electric cars from this perspective like golf carts, to like, “Oh, this can be a highly capable performance machine”, and that just shifted mindset, and that was important. So you start Rivian in 2009, I believe the first vehicle comes out in 2021 . That’s a long time period, what was going on for those 12 years? Are these painful memories? RJS: No, no, no, no, they’re useful memories. In the beginning you have no capital, so you can’t realistically make progress on building something like a car unless you have some level of capital. 
So if you’re spending $1 million a year, you need to spend 5,000 times more than that to ultimately launch a car, something like that, maybe more, and so you’re not able to actually make real progress, you’re just working on demos and proof of concepts. And we didn’t even start with $1 million, we started with zero. So the first financing was I refinanced the house that I owned, which is comical when I think back now that my level of conviction and optimism. No, it’s awesome. RJS: I thought, “I’ll refinance my house, take the $100,000 to get out of it and use that to start a car company”, so that was what we did. But it’s very hard to then hire, we’re just getting a semblance of traction to have some capital, some amount of money that we could actually make real progress on a product. Right. And this was all — you still didn’t really know for sure what you were going to build, right? RJS: That’s what I was going to say, it’s actually really helpful that in those years we didn’t have capital, because we could have started building the wrong thing, so it provided me this few year period where I was learning how to run a company. I’d never run a company before, I was learning how to lead teams, I was learning how to hire, I was learning how to have hard conversations, I was learning how to raise capital, I was learning about strategy and design and brand and all these things. And so it was a really wonderful period of time for me because we were iterating so dramatically, so significantly on the strategy, the product, the type of company we’re building, the skillsets we want to accumulate and build in-house, in ways that we couldn’t today. I couldn’t walk in the door to Rivian, say, “All right, everybody, we’re going to do a completely different set of products, get ready”. You were on an e-bike back then, you could sort of go where you wanted to. The bigger you get, the more locked in you are. RJS: Yeah, the whole team could fit in one little room with one little table, investor management was very straightforward, it just gave us the freedom to be very iterative. And I look back and I’m so thankful that happened because this squiggly path led us to what Rivian ultimately became, this idea of building a really strong brand around enabling and inspiring adventure that scales across different price points and form factors. We came up with the concept for R1 as the flagship product, then we would follow that with R2 , which is now about to launch, and then R3 , which is going to launch shortly thereafter, and things took a lot longer. Once we even got all that defined, we still had to raise a lot more capital. We then raised a lot of capital and we’re on the path of execution, and there’s some big unplanned events. COVID was very, very, very challenging and maybe the worst possible time you could imagine it. Right. You’re just about to launch. RJS: Yeah, so trying to build a plant starting in 2020, which is sort of wild, and then turn on a supply chain with a bunch of suppliers that didn’t want to work with us, we had to pay extremely high prices to them just to get them to provide us components. Just a bunch of things that when you’re planning it years before, you don’t think, “Well, there’s going to be a supply chain crisis, there’s going to be a pandemic”, and there’s going to be all these externalities that make it really hard to start in that moment. You mentioned getting to this adventure brand identity, the R1T, R1S being your initial products, what was the process of honing in on that? 
Why did you decide that was the way to go? I mean, from the outside, you view obviously, Rivian, you’re always going to be compared to Tesla in a certain respect. They have this futuristic car looking very aerodynamic, and Rivian comes along, it’s like, “Yes, thank God, a pickup truck”, not a Cybertruck, they got an SUV. That’s my perception of the outside, but what was it like inside? RJS: Yeah, I mean, it wasn’t too different from that. We recognized that in order for us to earn the right to exist, we needed to do something that was unique and could stand on its own, and so some of the early things we thought about, we’d originally thought about doing a sports car, we realized that we were just going to be too close to what Tesla had done, and what Tesla had done well, by the way. So we went through deep soul searching to say, “What are the things we’re passionate about?”, “What are the things we want to enable?”, “What are the things that are going to matter?”, once everything’s electric, imagine every car on the road is electric, you can’t say we’re differentiated because we’re electric. “Why are you differentiated?”, “What is the reason for someone to choose to buy our products?”, and so we went through a lot of those thought processes and came out of it with this idea of preserving and inspiring the adventures that you want to have in your lifetime, the kinds of things you want to take photographs of. The reason why you want cars is you can go anywhere— RJS: Yeah, you can do all this, you can go to your grandparents’ house, you can go to the beach, you can go climbing. So that led to this really clear vision, which then led to product requirements. “Okay, if we want a car that’s going to enable and inspire adventure, what does it want to look like?”, “What are the features it needs to have?”, so storage becomes a really big consideration, being able to drive on any type of terrain becomes a big consideration, and then you say, “Okay, what’s the vehicle form factors that are going to do that?” — a re-imagination of a pickup truck and a re-imagination of a large SUV, that’s a great flagship. So it wasn’t always as direct as that single sentence, sometimes it took us a month to get there or more, but a lot of iteration, a lot of the product concepts, some of the early R1T stuff that we put together looked really futuristic and not inviting, which is the word we use all the time, like inviting you to use it. Not wanting to get dirty, or it didn’t want to get used or you don’t want to put a surfboard on the top so we became really very intentional around, “What is a Rivian?”, “What is not a Rivian?”, so we do all these exercises from a design aesthetic point of view, which of course now we know our aesthetic, but in 2015, we had no idea what a Rivian aesthetic was, so we had to define that. So we do these is/is not exercises, it’s all things that it’s amazing sitting here today to see it having played out where people actually connect and resonate with the brand that we were hoping to. It’s an incredibly strong brand, you can identify it right away. You mentioned the COVID production challenges, there’s also a bit where actually scaling up production is just really hard, even if everything is perfect. How do you distribute the blame between COVID and between the fact that actually, “This is much harder than I thought it would be”, in terms of the challenges in getting out the door? RJS: Yeah. 
I think we made a tactical or strategic error, which is we decided to launch three vehicles at the same time. Yep, that’s one of my questions coming up. RJS: Launching any vehicle is really hard. So just to put this into perspective, you have around 30,000 discrete components, which you purchase as a company as maybe 3,000 items and the reason there’s more discrete components is you buy something like a headlight as a single assembly, but it has many components in it. But all those tier two, tier three, tier four supply components, any one of those can stop production if they’re missing and so you still have to think about it considering the full complexity of all the parts, every single mechanical part that’s in the vehicle and so turning on a supply chain for the first time is hard for any product. For a car, it’s really hard. And for a car when the supply chain doesn’t want to work with you, meaning business is thriving, it’s very different than let’s say 2010 or ’11 or ’12 when the suppliers were all beat up from the recession and were willing to take any business. In 2020, they were busy, they didn’t want to take on this new customer Rivian with an unproven brand and unproven product. So it was very, very hard to get them to work with us and so just getting all those suppliers to ramp at the same rate on one car would’ve been tough. And the reason I say same rate, if some ramp faster than others and you have inventory issues. Right, then you have these working capital problems. RJS: You have to ramp at the same time so you can make a complete car and sell it, sounds so simple. (laughing) No, don’t worry. It does not sound simple at all. RJS: So we were doing that across three vehicles at the same time, that was already a big — the R1T, R1S launched at the same time plus a commercial van . And then on top of that, we had COVID, which made everything more challenging. Yeah, so it was maybe the most perfect of perfect storms for difficulty and so I wouldn’t use COVID as an excuse or I’m not putting blame there, I’m just saying it’s just a reality, it was just a reality. Oh, for sure. No question, I don’t think anyone is denying that. RJS: But if I were to do it over again, I would’ve launched probably the SUV first, spaced out maybe 12 months, then launched the truck, spaced out probably 12 months, launched the van, and had smoother launches that consume less capital that would allow us to get to profitability faster. But hey, you learn. And so here we are in R2- RJS: Yeah. So R2 is like, we’re launching one build combination, we’re launching a launch edition, we’re not launching R3 at the same time. You’ll laugh at this, Ben, there was a lot of debate like, “Oh man, R3 is so cool”, we had thousands of customers like, “Oh, we can’t wait to get an R3 as well, can you guys launch that quicker?”, and we’re like, “Should we try to launch R2 and R3?”, “No, no, don’t do it! Don’t do it! Hold it back!”, and so we held back. It’s like you needed to hire someone back in 2021 that says, “If we ever consider doing this again, stomp on the table and say ‘No’.” RJS: As a product person, you have all these ideas, you want to see them out in the world as fast as possible so simplicity and focus has been a major emphasis for us and so the entire business is laser locked on launching R2 and it’s a beautiful thing. We don’t have other programs that we have to manage, it’s like, “Let’s get that, that has to ramp quickly, that’s what’s going to drive us to profitability”. 
It’s key for cash flow, we have this enormous R&D spend we’ve created intentionally to build out all these vertically integrated technologies , whether it’s our chips, our software, our compute platforms, our high voltage architectures, that the scale that R&D necessitates the scale of more than just R1, more than just a flagship product that needs a mass market product, and that’s what R2 brings us. Got it. So you mentioned the van. Ideally from a production standpoint, you do SUV, then you do truck, then you do van. The van though came with a lot of money from Amazon , is that a critical component in why you launched it maybe sooner than you should have? RJS: No, at the time I didn’t think it was sooner. I mean, at the time I thought it was the right thing to launch them all at the same time. Amazon’s still our largest shareholder and they’ve been a great partner and they were an investor in us when we were private, as you said. But what’s so exciting about that program is it took a space that has the logistics based on last-mile e-commerce space that has such a clear value prop for electrification, meaning the vehicles start and end the day in the same spot, which is a great thing from a charging point of view. You know what they’re going to do, that you can deterministically control what they’re going to do in terms of mileage. You know your 99th percentile route in terms of number of miles, and you know your one percentile, so you can really optimize it for total cost of ownership, and so that’s what we did. So we went about and said, “Let’s make the ultimate delivery van, let’s make it the most cost-effective way to deliver”. So you talk about the complexity of doing three vehicles. Is that just in terms of getting started or is there a production capability, like you only do so many things, or is that part fine? It’s just the part of getting started? RJS: Part of the challenge is when you’re launching a manufacturing and supply chain infrastructure for the first time, in our case, we didn’t fully appreciate all the things you need to be really good at to do it and so we tried to very quickly learn how to be able to launch multiple programs at the same time, which eventually Rivian should be able to launch multiple vehicles in the same year at the same time, but we just didn’t have the maturity of process, maturity of our organization or the depth of teams to be able to support that. The issue is not things you can plan for, it’s all the little things you don’t plan for, and there’s all these little things, each of which requires problem solving. So I used to describe it in 2021 and 2022, is it’s not like there’s some giant unlock. Like, “If we just solve this, we will make more vehicles and we’ll get our cost structure in line”. There was just a stack of thousands, truly thousands of little things that needed to be adjusted or changed or negotiated and I think the thing that compounded all this that was really hard, is a lot of those issues were at our suppliers, and then those suppliers had a lot of leverage over us, because they know that in that time where they couldn’t get enough- If this doesn’t get done, you’re done. RJS: We broke up with a lot of these suppliers, but some of them would just say, “We want you to pay us twice what we previously negotiated if you want parts”, and we’d say, “No”. and they’d say, “Okay, fine, we just won’t send you the parts”, and we’d say, “Okay, how about one and a half?”. We just had no leverage. 
So that’s changed so dramatically and we see it with R2. R2’s the first, I’d say, clean sheet from a supply chain point. Even with the updates we made to R1, we were able to get rid of a lot of that and they call it inflation-related, COVID-related cost growth that was born out of a lack of leverage that we had, R2 is the first time we were able to really reset the negotiations. You think of it from the perspective of if you’re Volkswagen, the leverage is the other way around, which is Volkswagen has so much scale and so many diverse sets of suppliers that they could say, “Hey, if you don’t bring your costs down, we’re just going to switch to another supplier”, we didn’t have that. We couldn’t say, “Well, look, we’re going to pull this other program from you” — it was no leverage, so we sort of were complete takers in that. Yeah, that makes sense. Before we got into the AI stuff, I did want to ask about the VW partnership . This includes access to your electric vehicle software, electrical architectures, you get supply chain expertise from them. How do you characterize this deal as a whole? I should also mention a sort of massive investment on their side as well, give me the framework of that deal and why it’s important. RJS: Yeah, it’s a $5.8 billion deal, some of which is technology licensing, some of which are investments. Right, I was going to ask about that. Some of it is just actually putting money in the company, and some of it is they’re going to license your software and things like that going forward. RJS: Yeah, and a lot of those are upfront licensing fees, most of which have already been paid and before I get to the business of it, it’s important to talk about the mission of it. We’ve spent a lot of time developing what we call a zonal architecture, but essentially think of it as a number of computers consolidating into one that perform a wide array of functions across a physical zone of the vehicle and it allows us to do things like over-the-air updates very seamlessly because rather than having a bunch of smaller function or domain-based electronic control units, little mini computers run the software for different functions, we run all this software for those functions on one computer on our OS, which makes it much easier to update. And so the strategy there was, “Boy, we’ve spent a mountain of investment building this tech stack, it’d be really nice to see it applied in another way. Yup. You need to get leverage on that investment and you just don’t have the volume by yourself. RJS: And it aligns to our mission in terms of enabling more electric vehicles to get highly compelling electric vehicles on the road and then it gives us a lot of scale, scale for sourcing the components that are shared and then it gives us the benefits of other, what we think of as joint sourcing agreements, so sourcing partnerships that can exist with Volkswagen. It’s been a great relationship, those types of relationships are very, very hard to build because it does require buy-in from the top so one of the things that allowed us to work so well with Amazon, I mean, you think about Amazon and it’s one of the largest companies in the world, certainly the largest e-commerce company in the world, and imagine they go out and say, “We’re going to build our future logistics network around a van that’s being not dual sourced, but single sourced to one company” — this is in 2019 — “has never built a car before at scale, and they’re like a startup”. 
But that was born out of a great relationship that I had with [Former Amazon CEO] Jeff [Bezos] and Jeff’s trust in supporting us and that enabled them to really lean in with us and lean in in defining the product, defining what it was, that was a really big leap. So we’ve built, I’d say, organizationally, really great capability of taking the strengths of being a fast-moving startup and working with very large companies as partners and in the case of Volkswagen, my relationship with Oliver Blume , the CEO of the group — so Volkswagen Group is, we think of VW as a brand, but they’re a group — they’ve got Porsche, Audi, Lamborghini, SEAT, Škoda, these are brands that aren’t sold in the United States, but it’s the second-largest car company in the world, largest industrial company in Europe, a huge company. But having Oliver and I aligned just allowed us to really move through the deal mechanics and the deal structuring quite quickly. So this bit, as you sort of zoom out, the deal makes a lot of sense to me. Actually, I think it makes a lot of sense for both sides. RJS: Yeah, it’s a win-win. They get expertise that they’re not going to develop internally. I’ve had plenty of German cars, the software is okay for what it is, I don’t think it’s going to sort of go where you’re going to go. You also have on your side, you can do these huge investments like you talked about last week , and we’re about to transition into that, with the promise of scale that is much more than you can certainly deliver today. Is there a bit of you though is like, “If we had ramped up correctly, if we had not done multiple vehicles, we could actually be at scale, we could keep this all to ourselves”, or is this ultimately the best outcome that you’re sharing with them in the long run? RJS: I think in hindsight, I wish we’d ramped up more quickly, there’s things you’d change, but they’re also all things you’d learn from. We don’t spend any time lamenting them or anything like that. But to be clear, both in the case of our in house software and zonal controllers, which is what we’ve done with in Infotainment, which is what we’ve done with Volkswagen, as well as our autonomy platform and AI platforms , which is separate from the Volkswagen venture. Is that part of the deal? RJS: No, that’s not part of the deal. RJS: That’s 100% Rivian. RJS: But in both cases, we developed them thinking that we would eventually leverage this, not just with our own products, but with other companies as well. Got it. Okay. RJS: And so Volkswagen was, in many ways, the ideal first customer. And the reason I say it’s the ideal first customer, 1) it’s huge, as we’ve already described, but 2) it has the complexity of managing across many different brands, and so being able to support a company like Volkswagen Group, which spans very premium brands, like Porsche, down to one of the products that’s been announced that we’re doing together, the Volkswagen ID1 , which is a $22,000 EV, it’s the existence proof that we, Rivian, can support working across large complex organizations, across large ranges of price and product features, and across very different vehicle form factors. And if you’re another car company, you couldn’t look at Rivian and say, maybe before you could have, but now you couldn’t, say, “Well, I don’t think you could do this at this price point” — well, actually we cover every price point across the spectrum. So there’s an opportunity for other car companies to do the same thing. RJS: Absolutely, yeah. 
And now on the autonomy front, I think the opportunity there is actually bigger because this is a very, very hard problem to solve, it requires vertical integration in ways that are not typically — it’s just things that OEMs typically don’t do. Tell me your vertical integration story, because it is really interesting. Last week you talked about everything from your chip to your sensors to your software, you talked about building your own compiler. We are talking total front-to-back, end-to-end vertical integration. Why is that important? RJS: Yeah. It’s important to just talk about how autonomy is now being developed, and I do think for anyone listening to this, it’s very, very important to understand this because there’s some history to how it was done before. The idea of a vehicle driving itself isn’t a new idea, that’s been something, it’s been in sci-fi movies for decades, but in terms of actual technology development, it started in, call it early 2010s, in that time range, so roughly 15, 20 years ago. The early platforms and what was done in terms of the approach up until the very early 2020s was something that was designed around a rules-based approach and so what you would have is you’d have a set of sensors, perception that would identify objects in the world, so all the things in the scene, so that’s cars, people, bikes, kids, balls bouncing on the street, everything that you can see, it would identify all those objects, it would classify the objects as to what they are, it would then associate vectors to those objects, acceleration and velocity, and it would hand all those objects and their classifications and their vector associations to a rules-based planner. The rules-based planner was a team of software developers’ attempt to codify the rules of the road. So, I’m going to oversimplify here, but think of it as a whole series of if/then statements. Totally deterministic, but the biggest spaghetti code mess you’ve ever seen because there are so many possible exceptions and issues. RJS: It’s a giant, giant code base that’s trying to describe how the world works. And so, it wasn’t actually AI as we think about AI today, there was machine vision. Machine learning, neural nets, yeah. RJS: Yeah, there was machine vision for the object detection and classification, but in terms of the planning and the actuation of the vehicle, it was very much a rules-based environment. Then along came the idea of neural nets, and the idea of transformers to do encoding, and that happened, of course, in the LLM world, but that’s also happening in the physical world. Everything can be a token. When we think about it, everyone thinks of it in the context of letters and words, but everything can be a token. RJS: Yeah, everything can be tokenized and the whole world changed in self-driving, so everything that was done prior to, call it 2020, 2021 is largely throwaway, meaning the way the systems are now developed is you build, you need to have complete vertical control, it needs to be one developer that controls all the perception, because you don’t want a pre-processed set of outputs from a camera, you want the raw signals from a camera. If you have other modalities like a radar or LiDAR, you want the raw signals from those, you want to feed it in through a transformer-based encoding process early, so fuse all that information early, and build a complex, it’s hard to imagine in our human brains, but it’s a complex multidimensional neural net that describes how the vehicle drives.
Then you want to train that with lots and lots of data, and you’re training it offline. The word that gets used all the time is end-to-end, so it’s trained end-to-end from the vehicle through the human drivers back to the model and so, to do that well, you need a few ingredients, you need this vertically-controlled perception platform, you need a really robust onboard data infrastructure that can both trigger interesting data events, hold them, do something to them to make them a little easier to move off the vehicle, ideally through Wi-Fi, and a worst case through LTE, but mostly through Wi-Fi, all that data gets moved off the vehicles, and this is happening at millions and millions of miles accumulating just in the course of a day. And so all that data is moving off the vehicle, and then you’re training it on thousands and thousands of GPUs. You’re going around and around and around, and it gets better and better and better. That’s an approach that is so different, as I said from what was done before, but to do that, you need all those ingredients. Well, you need cars on the road. RJS: You need cars on the road. So, we looked at it, we launched in 2021 with our Gen 1 architecture, we almost immediately after that realized we needed a complete rethink of our self-driving approach. Right, that’s exactly what I was going to ask. Was this an issue where in some respects you launched later than you wanted to because all the supply chain issues, but then you actually launched earlier than you wanted to because you didn’t have the right sort of stuff on your cars? RJS: Well, we launched — and we didn’t realize, and this is the thing, and even some of our Gen 1 customers are not happy with this, but when we developed the Gen 1 system, this was on 2018, 2019, we didn’t know this big technical massive shift was going to happen. So, our Gen 1 architecture uses a Mobileye front-facing camera, and it uses — it’s a collection of things, it’s very classical rules-based approach, if you’re going to develop something around AI, it’s a completely different architecture, not a single shared line of code, not a single shared piece of hardware. So we started working in the beginning of 2021, right after launching on a whole new clean sheet, everything new, we didn’t try to morph anything over, it’s a complete melt and re-pour. In that new architecture, we designed cameras, we designed a new radar, we designed a new compute platform, we built, we call this our Gen 2 architecture. We built it around an Nvidia processor, we designed a data flywheel, we designed an offline training program. The vehicle launched in the middle of 2024, the features then were trained on a very small number of miles, which was our own internal fleet and now over the course of last year, we’ve built up enough data that’s allowed us this flywheel starting to spin. Yep. And that data is only coming from the Gen 2 vehicles, right? Not from the Gen 1 ones? RJS: Only Gen 2. Gen 1, it’s asymptoted, both in terms of capability and it has no value to us in terms of data, so only Gen 2. And so, in parallel to kicking off this Gen 2 platform, which we said, we need to get this in the field as fast as possible because we need to start the data flywheel, we also need to get better hardware so that when we have the model built, we can run it with a higher ceiling. That kicked off updates to the cameras that are going to our Gen 3 architecture, very importantly, an in-house silicon program. Why is that very important? 
RJS: Compute inference on the vehicle, we wanted to have — what would we have in our Gen 2 is around 200 TOPS [Trillions of Operations per Second], we wanted that to be closer to 200 TOPS per chip, so 400 TOPS total, sparse TOPS. Well, what’s going to be in Gen 3 will be 1,600 sparse TOPS, but importantly, we designed it specifically around a vision-based robotic platform. And so, the utilization of those TOPS is very high, much higher than what we see in other platforms that are more generalized, and then the power efficiency is very high, and then the cost is much lower. So we have a very, very high capability, low cost platform for which we can afford to put enormous compute in. All that is true, but the actual development of that is very expensive. Is this going to pay off with those lower unit costs, and that increased capability with just your vehicles, or like the VW deal, is this something that you’re going to be looking to sell broadly? RJS: Well, this is an interesting one. Even on its own within Rivian, just R1 and R2, it’ll pay off because the cost savings are so significant on the chips. But more than that, we believe we’re very, very — we’re spending billions of dollars in developing our self-driving platform, our level of conviction as this being one of the most important, I shouldn’t even say one of the most, the most important shift in transportation and transportation technology means that we want it to control the whole platform. Then once we control the whole platform, it makes it a very interesting system that can be provided to other manufacturers. And so, I think in time, the number of companies that will have all the ingredients to do what I’ve just described, they’d be very limited, I think there’ll be less than five in the West. Did you get any of this thinking from Jeff Bezos? Because there is a bit here where our cars are where we get out and develop this and prove it out, but the real payoff is to do the platform at scale across other entities. It sounds a little Amazon-like. RJS: Amazon’s our largest shareholder, and Jeff’s somebody I look to for a lot of inspiration on these kinds of things. So, certainly I think there’s some of that. We think of our vehicles as our own dog food, but we’re going to make a platform that’s so darn good that we think others will- You’ll sell a lot of vehicles. RJS: And if others aren’t buying our platform, we’ll monetize it through selling more vehicles, and we’ll grab market share. I think on both sides of that, we can win. I do think that it’s going to move far faster than anyone realizes. I think, the way I describe it is if you look at the last three or four years of development in autonomy, and you try to draw a line to represent the slope of improvement, and you look at the next three or four years, the two lines are completely unrelated. Totally agree. RJS: But the acceleration is going to be so fast. And what I’m surprised is people aren’t — I say this, I don’t think people fully realize it, but the LLM space should teach us that. Yeah. GPT-1, GPT-2, GPT-3, GPT-4. RJS: But look at the 1.0 architectures. Oh, which are rules-based. Yeah, to your real point, it’s exactly what it is. RJS: Rules-based. And look at the progress that was made on Alexa, let’s say, relative to the progress that’s happened on GPT-3, 4 now, and beyond, it’s just like they’re not even closely related. 
And so the same thing is happening in the physical world with cars, and if you don’t have a data flywheel approach, you’re just not in the game and there’s no way you can compete. And so, very few people have that, far fewer I think is right. A big differentiator between what you’re doing and what Tesla is doing, and we have to sort of come back to it, they shifted to the pure neural network approach, but they’re doing vision only. Do you just think that’s a fundamentally flawed decision? RJS: We have a different point of view. Right. Because you have radar and LiDAR too, is the difference there. RJS: Yeah. There’s a lot of alignment, and we both agree, and we’re both approaching it as building a neural net. So, I want to call that out that we have a very aligned view. Right. Your core philosophy is absolutely the same. And I think there’s an extent where Waymo is getting there as well. RJS: The same philosophy. And then it’s like, “How can we teach the brain as fast as possible?” is our question. They have the biggest fleet of data acquisition in the world, they have fewer cameras, that have far less dynamic range. When I say dynamic range, I mean performance on very low light conditions, and very bright light conditions. Right, yep. RJS: We have much better dynamic range that of course adds bill of material cost, but we did that intentionally. And then, we have the benefit of our whole fleet, all Gen 3 R2s, think of those as ground truth vehicles. They’ll have LiDAR and radar on them. Tesla just has a few ground truth vehicles that do have radar and LiDAR, but they’re trying to service the whole fleet. RJS: Yeah, I’m looking out the window here at El Camino and you just have to stand at the corner and see Teslas driving around and around everywhere. One will go by eventually, yeah. So that’s the question, is the benefit of putting radar and LiDAR on all your cars, is that just something you need to do now so you can just gather that much more data that much more quickly? Or is that going to be a necessary component for at scale, everyone has an autonomous vehicle and they need to have radar and LiDAR? RJS: Yeah, I think, the way I look at it is, in the absolute fullness of time, I think the sensor set will continue to evolve. But in the process of building the models and until cameras can become meaningfully better, there’s very low cost, very fast ways to supplement the cameras that solve their weaknesses. So seeing through fog we can solve with a radar, seeing through dense snow or rain we can solve with a radar, seeing extremely far distances well beyond that of a camera or human eye, we can solve that with a LiDAR, our LiDAR is 900 feet. And then the benefit of having that data set from the radar and the LiDAR is you can more quickly train the cameras. The cameras, when I say train, it doesn’t mean we’re in there writing code to do this. I think my audience broadly gets how this works, yeah. RJS: The model understands this and so you feed this in and the neural net understands because you have the benefits of these non-overlapping modalities that have different strengths and weaknesses to identify, “Is that blurry thing out there actually a car?”, “Is it a person?”, “Is it a reflection off of a building?”, and when you have the benefit of radar and the benefit of LiDAR, that blurry thing way off in the distance that the camera sees starts to become — you can ground truth that much faster. And then you teach your camera to figure out what it is. 
RJS: Then your cameras become better, and so that’s our thesis. And of course, that’s important that we have a thesis that’s different than Tesla, if we had an identical thesis to Tesla on perception- They just have way more cars out there. RJS: Yeah, the only way to catch up is by building a fleet of millions of vehicles, we want to catch up faster than that. So is it also sort of this advantage that — to what extent do you feel the auto industry, you start out and you’re sort of the outsider, you can’t get suppliers to help you, they’re ripping you off, all the sorts of problems you talked about. Now you’re like, I can imagine Volkswagen at a minimum is looking at you, “Please figure this out, we have a relationship, we can sort of jump on if need be” — do you get that sense more broadly from the industry? Because I don’t think anyone expects Tesla to share their technology, Google is sort of its own thing, do you have the potential to be the industry champion in some ways? RJS: We hope. I mean, I think every manufacturer has three choices, it’s pretty simple. They’re either going to develop their own autonomy platforms, they’re going to buy an autonomy platform, or they’re going to make this not a priority and they’re going to lose market share. But the last one, you have to accept that in not too much time, if you don’t prioritize this, you will lose market share. It’d be like trying to sell a house without electricity, it’s going to become so fundamental to the functioning of the vehicle. Why do you think that autonomy is so tightly tied to electric vehicles? Because there’s no reason an ICE vehicle couldn’t be autonomous. RJS: No, no. It’s more coincidence, it’s funny. I’d say autonomy, connectivity and modern infotainment and electrification are all completely separate topics, so there’s no reason they have to converge into one thing. It’s more just coincidence that all these things happen to be occurring at the same time and the electric vehicles tend to be the more advanced vehicles because they’re on new architecture. So it’s why you start to see from other non-peer review manufacturers that their EVs tend to be the most advanced but autonomy doesn’t care if it’s an engine or if it’s an electric motor. Right. It makes sense that’s just how it happened historically. I do need to ask this question, I think I know what the answer is, but people will be mad at me if I don’t ask. Why is there no CarPlay in Rivians? RJS: It is a good question, we get asked that a lot. We’re very convicted on this point. We believe that the aggregation of applications and the experience, and importantly now with AI acting as a web that’s integrating all these different applications into a singular experience where you can talk to the car and ask for things and where it has knowledge of the state of health of the vehicle, the state of charge, distance, outside temperature, everything becomes much more seamless in time if the vehicle is its own singular ecosystem versus having a window within the vehicle that’s into another ecosystem. And is that the issue, just the implementation effort on your side or that the customers are actually short-circuiting themselves?
RJS: We could turn on CarPlay really quickly, but then you end up with — you either enter into the CarPlay environment and it’s like Apple’s, they get to play the role of aggregating what apps are there and how they decide what’s integrated, how it’s done, versus us, and I think where it becomes really important is when AI happens. Our view is a lot of the applications will start to go away and you’ll have your AI assistant. There may be things happening below agent to agent under the covers, but when you say, “Rivian, tell me what’s on my schedule for later today”, you don’t care that it has to go agent to agent to Google Calendar to pull that out, you just want the information, that interface becomes really important, it becomes so fundamental to the user experience and the whole user journey. So as we’ve thought about this, inserting any sort of abstraction layer or aggregation layer that’s not our own just is extremely risky and you start to build dependencies on that that are hard to reverse. Is there a bit where Tesla covered for you because they don’t have CarPlay either, but now there’s a rumor they might add it and it might make it a little more difficult to hold your convictions? RJS: Maybe. As it’s always the case on these things, I think there’s people that are really used to having CarPlay and our goal is to make it such that the car is so good that they don’t even think of that. And if they were to go back to CarPlay, they’d miss having the integrated holistic experience that we can create. It’s interesting because I thought you were just going to go more on the — you just gave this strong pitch for integration and top-to-down, side-to-side, that wasn’t the core to your answer. I think your answer made a lot of sense in the future best interface, I can see your customers getting themselves on a local maxima because that’s what they’re used to and it’s there and they’re missing how much better it can be. But I guess it goes to your point, infotainment and electrification and autonomy, those are all separate areas. RJS: So think of it like this. The challenge is CarPlay is not everything, so if you have CarPlay and the vehicle’s driving itself, in most CarPlay instances, it takes over the whole screen. RJS: There are instances where you could have a screen in a screen, but then that is very — I always joke, this is something Apple would never do. They would never have a screen in a screen on their own devices. They would say, we want to have one experience and so you have one screen that’s putting up information that’s very specific to the vehicle operation that are things that are like, “Is the door open or closed?”, and then you have another that’s mapping— It’s competing. RJS: It’s like you have two different UIs playing out and I just think it’s poor UI, it’s a poor user experience. The only reason people want that is they’ve been trained because they’re in cars that have such bad UI that the life raft to escape the horrible UI that is embedded in the car is CarPlay, and CarPlay is a really important function for that. If I’m in a non-Rivian or non-Tesla and I get in, it’s like a disaster and I’m like, “Oh thank goodness there’s CarPlay”. It has some thoughtful UI, but we have a really thoughtful UI and the few things that are missing we’ve been adding. So we brought Google Maps in, which was a big one, there’s more mapping platforms that’ll come in over time. We’ve got all the music platforms, including Apple Music, natively integrated. 
But soon with AI integration, I just think a lot of this fades away because you want a singular layer and that may mean we’re running ChatGPT to do some portions, we may be running Gemini to do other portions, but we get to be the arbiter of all this stuff under the surface. What are we using for onboard diagnostics? What are we using for on the edge knowledge? What are we using for cloud knowledge? All that we get to build and decide on ourselves. And I think importantly, given how fast the models are moving, we have the ability to plug or unplug different models at our discretion, we can decide what’s the best model to use. For the record, I agree with your decision. And I think if Tesla added CarPlay it would be a bad decision. And the reason is, I think unless you own one of these vehicles, I have a Tesla, I don’t have a Rivian, but the tangible difference is, and people say this, but until you experience it it’s not quite clear, it is a computer on wheels, and the way I think about it is for ICE cars that I’ve had, automatic windshield wiping is like a luxury feature or automatic lights. If you step back it’s like, “Wait, this is a software thing that we can do it once and do it generally, of course even your cheapest Tesla is going to have this feature and then you get to remove the physical control and you should never even need to interface with that”. And if your car is a computer first and foremost, you have to go in on the user interface, it’s nuts to put something else there, even if people are crapping about it in the short run. So there’s my pitch for you for that answer next time. RJS: And I also think that people that are in Teslas and Rivians that are actually driving it, the number of people that actually complain about it is very, very low. The number of people that say they’re not buying Rivian because of CarPlay is a higher number, but once you get into it, you’re like, “Oh, what was I worried about? This is really good!”, and I think the same trend exists for Tesla. Yeah. RJ, it was very good to talk to you, thanks for coming on, I’m excited to see how this develops. RJS: Yeah, this has been great. Thanks so much. I appreciate the time, Ben. This Daily Update Interview is also available as a podcast. To receive it in your podcast player, visit Stratechery . The Daily Update is intended for a single recipient, but occasional forwarding is totally fine! If you would like to order multiple subscriptions for your team with a group discount (minimum 5), please contact me directly. Thanks for being a supporter, and have a great day!

Simon Willison 2 months ago

Useful patterns for building HTML tools

I've started using the term HTML tools to refer to HTML applications that I've been building which combine HTML, JavaScript, and CSS in a single file and use them to provide useful functionality. I have built over 150 of these in the past two years, almost all of them written by LLMs. This article presents a collection of useful patterns I've discovered along the way. First, some examples to show the kind of thing I'm talking about: These are some of my recent favorites. I have dozens more like this that I use on a regular basis. You can explore my collection on tools.simonwillison.net - the by month view is useful for browsing the entire collection. If you want to see the code and prompts, almost all of the examples in this post include a link in their footer to "view source" on GitHub. The GitHub commits usually contain either the prompt itself or a link to the transcript used to create the tool. These are the characteristics I have found to be most productive in building tools of this nature: The end result is a few hundred lines of code that can be cleanly copied and pasted into a GitHub repository. The easiest way to build one of these tools is to start in ChatGPT or Claude or Gemini. All three have features where they can write a simple HTML+JavaScript application and show it to you directly. Claude calls this "Artifacts", ChatGPT and Gemini both call it "Canvas". Claude has the feature enabled by default, ChatGPT and Gemini may require you to toggle it on in their "tools" menus. Try this prompt in Gemini or ChatGPT: Or this prompt in Claude: I always add "No React" to these prompts, because otherwise they tend to build with React, resulting in a file that is harder to copy and paste out of the LLM and use elsewhere. I find that attempts which use React take longer to display (since they need to run a build step) and are more likely to contain crashing bugs for some reason, especially in ChatGPT. All three tools have "share" links that provide a URL to the finished application. Examples: Coding agents such as Claude Code and Codex CLI have the advantage that they can test the code themselves while they work on it using tools like Playwright. I often upgrade to one of those when I'm working on something more complicated, like my Bluesky thread viewer tool shown above. I also frequently use asynchronous coding agents like Claude Code for web to make changes to existing tools. I shared a video about that in Building a tool to copy-paste share terminal sessions using Claude Code for web . Claude Code for web and Codex Cloud run directly against my simonw/tools repo, which means they can publish or upgrade tools via Pull Requests (here are dozens of examples ) without me needing to copy and paste anything myself. Any time I use an additional JavaScript library as part of my tool I like to load it from a CDN. The three major LLM platforms support specific CDNs as part of their Artifacts or Canvas features, so often if you tell them "Use PDF.js" or similar they'll be able to compose a URL to a CDN that's on their allow-list. Sometimes you'll need to go and look up the URL on cdnjs or jsDelivr and paste it into the chat. CDNs like these have been around for long enough that I've grown to trust them, especially for URLs that include the package version. The alternative to CDNs is to use npm and have a build step for your projects. I find this reduces my productivity at hacking on individual tools and makes it harder to self-host them. 
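To make the single-file pattern concrete, here is a minimal sketch of a JSON to YAML converter in the same spirit as the Canvas and Artifact examples listed later in this post: one HTML file, inline CSS and JavaScript, no React, and a single dependency (js-yaml) loaded from a CDN. The cdnjs URL, version number and element IDs are illustrative choices rather than code taken from any of the tools linked here - check cdnjs for a current release before relying on it.

```html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>JSON to YAML</title>
<!-- single dependency, loaded from a CDN: no npm, no build step, no React -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/js-yaml/4.1.0/js-yaml.min.js"></script>
<style>
  body { font-family: sans-serif; margin: 1em; max-width: 40em; }
  textarea { width: 100%; height: 10em; box-sizing: border-box; }
</style>
</head>
<body>
<h1>JSON to YAML</h1>
<textarea id="input" placeholder="Paste JSON here"></textarea>
<p>
  <button id="convert">Convert</button>
  <button id="copy">Copy to clipboard</button>
</p>
<textarea id="output" readonly></textarea>
<script>
// convert the pasted JSON to YAML using the jsyaml global provided by the CDN script
document.getElementById("convert").onclick = () => {
  const out = document.getElementById("output");
  try {
    const data = JSON.parse(document.getElementById("input").value);
    out.value = jsyaml.dump(data);
  } catch (err) {
    out.value = "Error: " + err.message;
  }
};
// one-tap copy button - handy on phones where text selection is fiddly
document.getElementById("copy").onclick = async () => {
  await navigator.clipboard.writeText(document.getElementById("output").value);
};
</script>
</body>
</html>
```

A file like this can be pasted straight into a repository and served as a static page with no build step, which is exactly what makes the pattern easy to self-host.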
I don't like leaving my HTML tools hosted by the LLM platforms themselves for a couple of reasons. First, LLM platforms tend to run the tools inside a tight sandbox with a lot of restrictions. They're often unable to load data or images from external URLs, and sometimes even features like linking out to other sites are disabled. The end-user experience often isn't great either. They show warning messages to new users, often take additional time to load and delight in showing promotions for the platform that was used to create the tool. They're also not as reliable as other forms of static hosting. If ChatGPT or Claude are having an outage I'd like to still be able to access the tools I've created in the past. Being able to easily self-host is the main reason I like insisting on "no React" and using CDNs for dependencies - the absence of a build step makes hosting tools elsewhere a simple case of copying and pasting them out to some other provider. My preferred provider here is GitHub Pages because I can paste a block of HTML into a file on github.com and have it hosted on a permanent URL a few seconds later. Most of my tools end up in my simonw/tools repository which is configured to serve static files at tools.simonwillison.net . One of the most useful input/output mechanisms for HTML tools comes in the form of copy and paste . I frequently build tools that accept pasted content, transform it in some way and let the user copy it back to their clipboard to paste somewhere else. Copy and paste on mobile phones is fiddly, so I frequently include "Copy to clipboard" buttons that populate the clipboard with a single touch. Most operating system clipboards can carry multiple formats of the same copied data. That's why you can paste content from a word processor in a way that preserves formatting, but if you paste the same thing into a text editor you'll get the content with formatting stripped. These rich copy operations are available in JavaScript paste events as well, which opens up all sorts of opportunities for HTML tools. The key to building interesting HTML tools is understanding what's possible. Building custom debugging tools is a great way to explore these options. clipboard-viewer is one of my most useful. You can paste anything into it (text, rich text, images, files) and it will loop through and show you every type of paste data that's available on the clipboard. This was key to building many of my other tools, because it showed me the invisible data that I could use to bootstrap other interesting pieces of functionality. More debugging examples: HTML tools may not have access to server-side databases for storage but it turns out you can store a lot of state directly in the URL. I like this for tools I may want to bookmark or share with other people. The localStorage browser API lets HTML tools store data persistently on the user's device, without exposing that data to the server. I use this for larger pieces of state that don't fit comfortably in a URL, or for secrets like API keys which I really don't want anywhere near my server - even static hosts might have server logs that are outside of my influence. CORS stands for Cross-origin resource sharing . It's a relatively low-level detail which controls if JavaScript running on one site is able to fetch data from APIs hosted on other domains. APIs that provide open CORS headers are a goldmine for HTML tools. It's worth building a collection of these over time. 
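Before getting into specific CORS-enabled APIs, here is a compact sketch of the clipboard, URL-state and localStorage patterns described above. It uses only standard browser APIs; the "log" element ID, the hash-parameter handling and the "api-key" storage key are illustrative assumptions, not code from clipboard-viewer or the other tools mentioned in this post.

```javascript
// Sketch of the clipboard-inspection pattern: list every format available on a paste.
// Assumes an element with id="log" exists in the page.
document.addEventListener("paste", (event) => {
  const lines = [];
  for (const type of event.clipboardData.types) {
    // Typical types: text/plain, text/html, Files
    const value = event.clipboardData.getData(type);
    lines.push(type + ": " + (value ? value.slice(0, 200) : "(binary or empty)"));
  }
  document.getElementById("log").textContent = lines.join("\n");
});

// Sketch of persisting small state in the URL so a tool can be bookmarked or shared.
const params = new URLSearchParams(location.hash.slice(1));
function saveState(key, value) {
  params.set(key, value);
  // replaceState avoids cluttering the back button with an entry per keystroke
  history.replaceState(null, "", "#" + params.toString());
}

// Sketch of the localStorage pattern for secrets that should never reach a server.
function getApiKey() {
  let key = localStorage.getItem("api-key");
  if (!key) {
    key = prompt("Paste your API key (it is stored only in this browser):");
    if (key) localStorage.setItem("api-key", key);
  }
  return key;
}
```

Using history.replaceState rather than pushState keeps the current state bookmarkable without breaking the back button.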
Here are some I like: GitHub Gists are a personal favorite here, because they let you build apps that can persist state to a permanent Gist through making a cross-origin API call. All three of OpenAI, Anthropic and Gemini offer JSON APIs that can be accessed via CORS directly from HTML tools. Unfortunately you still need an API key, and if you bake that key into your visible HTML anyone can steal it and use it to rack up charges on your account. I use the secrets pattern to store API keys for these services. This sucks from a user experience perspective - telling users to go and create an API key and paste it into a tool is a lot of friction - but it does work. Some examples: You don't need to upload a file to a server in order to make use of the <input type="file"> element. JavaScript can access the content of that file directly, which opens up a wealth of opportunities for useful functionality. Some examples: An HTML tool can generate a file for download without needing help from a server. The JavaScript library ecosystem has a huge range of packages for generating files in all kinds of useful formats. Pyodide is a distribution of Python that's compiled to WebAssembly and designed to run directly in browsers. It's an engineering marvel and one of the most underrated corners of the Python world. It also cleanly loads from a CDN, which means there's no reason not to use it in HTML tools! Even better, the Pyodide project includes micropip - a mechanism that can load extra pure-Python packages from PyPI via CORS. Pyodide is possible thanks to WebAssembly. WebAssembly means that a vast collection of software originally written in other languages can now be loaded in HTML tools as well. Squoosh.app was the first example I saw that convinced me of the power of this pattern - it makes several best-in-class image compression libraries available directly in the browser. I've used WebAssembly for a few of my own tools: The biggest advantage of having a single public collection of 100+ tools is that it's easy for my LLM assistants to recombine them in interesting ways. Sometimes I'll copy and paste a previous tool into the context, but when I'm working with a coding agent I can reference them by name - or tell the agent to search for relevant examples before it starts work. The source code of any working tool doubles as clear documentation of how something can be done, including patterns for using editing libraries. An LLM with one or two existing tools in its context is much more likely to produce working code. I built pypi-changelog by telling Claude Code: And then, after it had found and read the source code for zip-wheel-explorer: Here's the full transcript. See Running OCR against PDFs and images directly in your browser for another detailed example of remixing tools to create something new. I like keeping (and publishing) records of everything I do with LLMs, to help me grow my skills at using them over time. For HTML tools I built by chatting with an LLM platform directly I use the "share" feature for those platforms. For Claude Code or Codex CLI or other coding agents I copy and paste the full transcript from the terminal into my terminal-to-html tool and share that using a Gist. In either case I include links to those transcripts in the commit message when I save the finished tool to my repository. You can see those in my tools.simonwillison.net colophon.
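Returning to the file-handling and CORS patterns above, here is a hedged sketch of how the pieces tend to fit together: reading a user-selected file entirely in the browser, generating a downloadable file with a Blob, and fetching from a CORS-friendly endpoint (PyPI's JSON API, one of the services listed below). The "picker" ID, the function names and the choice of PyPI response fields are illustrative assumptions, not code from the tools described in this post.

```javascript
// Sketch of the open-a-file pattern: read a user-selected file without uploading it.
// Assumes an <input type="file" id="picker"> element in the page.
document.getElementById("picker").addEventListener("change", async (event) => {
  const file = event.target.files[0];
  if (!file) return;
  const text = await file.text(); // use file.arrayBuffer() for binary formats
  console.log(file.name, file.size, text.slice(0, 100));
});

// Sketch of the downloadable-file pattern: generate a file client-side and save it.
function downloadText(filename, content, mimeType = "text/plain") {
  const blob = new Blob([content], { type: mimeType });
  const url = URL.createObjectURL(blob);
  const a = document.createElement("a");
  a.href = url;
  a.download = filename; // the browser saves the file with no server involved
  a.click();
  URL.revokeObjectURL(url);
}

// Sketch of combining a CORS-friendly API with a generated download.
async function downloadPackageSummary(pkg) {
  const resp = await fetch(`https://pypi.org/pypi/${pkg}/json`);
  const data = await resp.json();
  downloadText(pkg + "-summary.txt", data.info.summary || "(no summary)");
}
```

The same Blob-and-anchor trick works for any format a JavaScript library can generate - ICS files, cropped images, rendered PNGs and so on.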
I've had so much fun exploring the capabilities of LLMs in this way over the past year and a half, and building tools in this way has been invaluable in helping me understand both the potential for building tools with HTML and the capabilities of the LLMs that I'm building them with. If you're interested in starting your own collection I highly recommend it! All you need to get started is a free GitHub repository with GitHub Pages enabled (Settings -> Pages -> Source -> Deploy from a branch -> main) and you can start copying in pages generated in whatever manner you like. Bonus transcript : Here's how I used Claude Code and shot-scraper to add the screenshots to this post. svg-render renders SVG code to downloadable JPEGs or PNGs pypi-changelog lets you generate (and copy to clipboard) diffs between different PyPI package releases. bluesky-thread provides a nested view of a discussion thread on Bluesky. The anatomy of an HTML tool Prototype with Artifacts or Canvas Switch to a coding agent for more complex projects Load dependencies from CDNs Host them somewhere else Take advantage of copy and paste Build debugging tools Persist state in the URL Use localStorage for secrets or larger state Collect CORS-enabled APIs LLMs can be called directly via CORS Don't be afraid of opening files You can offer downloadable files too Pyodide can run Python code in the browser WebAssembly opens more possibilities Remix your previous tools Record the prompt and transcript Go forth and build A single file: inline JavaScript and CSS in a single HTML file means the least hassle in hosting or distributing them, and crucially means you can copy and paste them out of an LLM response. Avoid React, or anything with a build step. The problem with React is that JSX requires a build step, which makes everything massively less convenient. I prompt "no react" and skip that whole rabbit hole entirely. Load dependencies from a CDN. The fewer dependencies the better, but if there's a well known library that helps solve a problem I'm happy to load it from CDNjs or jsdelivr or similar. Keep them small. A few hundred lines means the maintainability of the code doesn't matter too much: any good LLM can read them and understand what they're doing, and rewriting them from scratch with help from an LLM takes just a few minutes. ChatGPT JSON to YAML Canvas made with GPT-5.1 Thinking - here's the full ChatGPT transcript Claude JSON to YAML Artifact made with Claude Opus 4.5 - here's the full Claude transcript Gemini JSON to YAML Canvas made with Gemini 3 Pro - here's the full Gemini transcript hacker-news-thread-export lets you paste in a URL to a Hacker News thread and gives you a copyable condensed version of the entire thread, suitable for pasting into an LLM to get a useful summary. paste-rich-text lets you copy from a page and paste to get the HTML - particularly useful on mobile where view-source isn't available. alt-text-extractor lets you paste in images and then copy out their alt text. keyboard-debug shows the keys (and values) currently being held down. cors-fetch reveals if a URL can be accessed via CORS. exif displays EXIF data for a selected photo. icon-editor is a custom 24x24 icon editor I built to help hack on icons for the GitHub Universe badge . It persists your in-progress icon design in the URL so you can easily bookmark and share it.
word-counter is a simple tool I built to help me write to specific word counts, for things like conference abstract submissions. It uses localStorage to save as you type, so your work isn't lost if you accidentally close the tab. render-markdown uses the same trick - I sometimes use this one to craft blog posts and I don't want to lose them. haiku is one of a number of LLM demos I've built that request an API key from the user (via the function) and then store that in . This one uses Claude Haiku to write haikus about what it can see through the user's webcam. iNaturalist for fetching sightings of animals, including URLs to photos PyPI for fetching details of Python packages GitHub because anything in a public repository in GitHub has a CORS-enabled anonymous API for fetching that content from the raw.githubusercontent.com domain, which is behind a caching CDN so you don't need to worry too much about rate limits or feel guilty about adding load to their infrastructure. Bluesky for all sorts of operations Mastodon has generous CORS policies too, as used by applications like phanpy.social species-observation-map uses iNaturalist to show a map of recent sightings of a particular species. zip-wheel-explorer fetches a file for a Python package from PyPI, unzips it (in browser memory) and lets you navigate the files. github-issue-to-markdown fetches issue details and comments from the GitHub API (including expanding any permanent code links) and turns them into copyable Markdown. terminal-to-html can optionally save the user's converted terminal session to a Gist. bluesky-quote-finder displays quotes of a specified Bluesky post, which can then be sorted by likes or by time. haiku uses the Claude API to write a haiku about an image from the user's webcam. openai-audio-output generates audio speech using OpenAI's GPT-4o audio API. gemini-bbox demonstrates Gemini 2.5's ability to return complex shaped image masks for objects in images, see Image segmentation using Gemini 2.5 . ocr is the first tool I built for my collection, described in Running OCR against PDFs and images directly in your browser . It uses and to allow users to open a PDF in their browser which it then converts to an image-per-page and runs through OCR. social-media-cropper lets you open (or paste in) an existing image and then crop it to common dimensions needed for different social media platforms - 2:1 for Twitter and LinkedIn, 1.4:1 for Substack etc. ffmpeg-crop lets you open and preview a video file in your browser, drag a crop box within it and then copy out the command needed to produce a cropped copy on your own machine. svg-render lets the user download the PNG or JPEG rendered from an SVG. social-media-cropper does the same for cropped images. open-sauce-2025 is my alternative schedule for a conference that includes a downloadable ICS file for adding the schedule to your calendar. See Vibe scraping and vibe coding a schedule app for Open Sauce 2025 entirely on my phone for more on that project. pyodide-bar-chart demonstrates running Pyodide, Pandas and matplotlib to render a bar chart directly in the browser. numpy-pyodide-lab is an experimental interactive tutorial for Numpy. apsw-query demonstrates the APSW SQLite library running in a browser, using it to show EXPLAIN QUERY plans for SQLite queries. ocr uses the pre-existing Tesseract.js WebAssembly port of the Tesseract OCR engine. sloccount is a port of David Wheeler's Perl and C SLOCCount utility to the browser, using a big ball of WebAssembly duct tape. 
More details here . micropython is my experiment using @micropython/micropython-webassembly-pyscript from NPM to run Python code with a smaller initial download than Pyodide.
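One more pattern from above that's worth sketching is the secrets approach: ask the user for an API key once and keep it in localStorage so it never touches the static host. The storage key name and prompt wording below are my own illustrative choices, not code from any of the tools listed:

```typescript
// Hedged sketch of the localStorage secrets pattern - illustrative names only.
function getApiKey(storageKey = "example-api-key"): string | null {
  let key = localStorage.getItem(storageKey);
  if (!key) {
    // Ask once, then persist the answer on the user's own device.
    key = window.prompt("Paste your API key (stored only in this browser):");
    if (key) localStorage.setItem(storageKey, key);
  }
  return key;
}
```

The key then stays entirely client-side, which is exactly why this pattern pairs well with static hosting.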

1 views
annie's blog 2 months ago

Fish bowl

Our very brains, our human nature, our desire for comfort, our habits, our social structures, all of it, pushes us into being fish bowl swimmers. Tiny people moving in tiny circles. Staying in the circumscribed ruts of our comfort. Ignoring a whole big world of what's different and new and interesting just beyond. That's the problem: stuff out there might be new, and interesting, but it's also different. The newness — which is really not new, at all, it's just new to us, so — the differentness, of another mindset or culture, language or belief system, method or opinion or morality or lifestyle, sends our inward threat-o-meter into overdrive. We interpret new and different as scary and difficult , because in terms of our emotions and our mental somersaulting, it is. We don't know how to act. We don't know how to evaluate. We don't know what is safe. We don't know where we fit in. We don't know how our safe, comfortable fish bowl living is affected by this new, different, expanded puddle. Sameness makes us comfortable. And comfort is the height, the very pinnacle, the crowning achievement in our pursuit of happiness. What I mean is that we've mistaken comfort for happiness. All the ways we could pursue happiness, all the freedom and technology and abilities we have to pursue meaning and joy and interaction and challenge and exploration and improvement and aliveness … All of that, at our fingertips, and being comfortable tends to top the list of what we actually want, what we're willing to put effort towards. This seems pathetic. It is pathetic. But also: We're working hard all the time in ways we often don't acknowledge. We have infinite options but finite agency. We have endless information access and very little processing power. We get fucking worn out. It's a lot of work to make a string of decent choices for 10 or 12 hours at a time. It's a lot of effort, some days (most days), to do what is required of us to feel like decent human beings, and the idea of putting in more effort, expending more energy, is exhausting. So we value comfort highly. We're tired. We're exhausted by constant inputs, invisible demands, and the burden of infinite options. Of course we don't leap out of our comfort zones when the opportunity arises: we've already been out of it for so long, on high alert. Our brains are efficiency machines. By valuing comfort so highly, and by equating comfort with sameness, we have programmed our brains to ignore the unfamiliar. Ever wondered why you can feel bored when you have constant stimulation? This is why. We carefully allocate our energy to the highest priorities. Things that aren't familiar don't help. So we ignore them. Of course, we can't always ignore stuff that is different. Sometimes it is right there, glaringly obvious, annoyingly immune to our discomfort, and we are forced to see it, acknowledge it, encounter it, at least mentally. But don't worry! We have defenses! Oh baby, do we have defenses. If we can't keep these alien objects from encroaching upon our consciousness, we can, at least, quickly evaluate the threat they pose and deal with them appropriately. Threat is precisely how we see things that are different. Comfort is bolstered, even built, by the familiar. All things unfamiliar are threats to our comfort. So we're quick to see other groups, philosophies, lifestyles, belief systems, family structures, choices, etc., as weird and wrong. We want to believe they are wrong, because we want to believe that pursuing our own comfort is right. 
We want to believe we have our priorities in check. Our very desire for comfort creeps into our logical reasoning, so deeply does the desire go. So insidiously does it carry out its programmed mission: to keep us from being uncomfortable, our brains will subvert objectivity and keep us from seeing the fallacies in our own thinking, keep us from recognizing that we are, at heart, selfish and misguided creatures whose greatest delight is sitting around and feeling pretty good about ourselves. If needed, then, we will happily sacrifice the validity and value of every thing, person, or choice that is different from what we know and define as normal. We will, for the sake of our own rightness, define all different things as wrong. We don't even hesitate. Hesitation is a sign that you might be starting to see the truth of your own motivation. If you start hesitating before defining, before casting judgment, before categorizing and labeling, look out: your comfort is at stake. Your brain is scurrying, be sure of it, to come up with great reasons for you to resist this awful urge to be fair. Fair. Fair? Fair! Fair has no place in the pursuit of comfort. Equality is not a factor here. If we value all people equally, we must admit that our own comfort is not the highest priority. We must admit that others, too, have valid needs, valid ideas, that the fact of their differentness is not adequate reason for us to deny them the same respect and autonomy we demand for ourselves. We can't have that. That sort of thinking gets us in trouble. That sort of thinking demolishes the layer upon layer of defensive triggers and traps that we have laid, so carefully, over the entire course of our lives. We are aware, so very aware, of how it could all fall apart. We know the reasons are thin. We know, deep down, the very idea of a fish bowl is absurd. We live in an ocean, and it's big, and it's full of creatures, and we're terrified. We want to believe we can limit what is around us. We want a fish bowl so we can feel like the biggest fish in it. It is the only way we know to feel safe. But there is another way: to see, first, that the fish bowl is an illusion of our own making, with imaginary walls upheld by discriminatory defense systems. If we can begin to see that the walls are not even real, we can see a way out. Maybe we can stop putting so much work into keeping them in place. It's scary. It is being alive. The threat only exists when we think we have something of our own, something utterly more important than all else, to protect and defend. But we don't. We are swimming in this together, all of us. There is no safer ocean, only this one.

0 views
Ruslan Osipov 3 months ago

Turns out Windows has a package manager

I have a Windows 11 PC, and something that really annoyed me about Windows for decades is the inability to update all installed programs at once. It's just oh-so-annoying to have to update a program manually, which is worse for things I don't use often - meaning every time I open a program, I have to deal with update pop-ups. I was clearly living under a rock, because all the way back in 2020 Microsoft introduced a package manager which lets you install and, more importantly, update packages. It's as simple as opening a command line (ideally as administrator, so you don't have to keep hitting yes on the permission prompt for every program), and running . Yup, that's it. You'll update the vast majority of software you have installed. Some software isn't compatible, but when I ran the command for the first time, Windows updated a little over 20 packages, which included the apps I find myself having to update manually the most often. To avoid having to do this manually, I've used Windows Task Scheduler to create a new weekly task which runs a file consisting of a single line: I just had to make sure Run with the highest privileges is enabled in task settings. So long, pesky update reminders. My Windows apps will finally stay up-to-date, hopefully.
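The post doesn't show the exact command, but a minimal sketch of that kind of one-line batch file - assuming the standard winget upgrade flags - looks something like this:

```bat
REM upgrade-all.bat - a hedged sketch, not necessarily the author's exact file.
REM Upgrades every installed package that winget knows how to update.
winget upgrade --all
```

Pointing a weekly Task Scheduler job at a file like this, with "Run with highest privileges" enabled, matches the setup described above.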

0 views
Kev Quirk 4 months ago

Ten Pointless Facts About Me

I've seen this doing the rounds on a few blogs recently, so wanted to add my own version because I'm a narcissist. 🙃 Pete Moore did his version yesterday, and David did his version all the way back in April. I actually had this in draft from around then, but never got around to finishing it (there’s always something more fun to write). Well, I don’t have anything more fun to write at the moment, so Pete’s post prompted me to get it done. So here’s Ten Pointless Facts About Me… Kinda. A pet hate of mine is having food stuck in my teeth. So I always clean them out with a toothpick every time I eat. 🤢 All 3. I mostly drink water and coffee, but do enjoy a cup of tea with breakfast at the weekend. Crocs! I love Crocs! But I don’t wear them outdoors - they’re more like comfy slippers for around the house for me. When I’m out of the house, it’s usually trainers or walking shoes. Usually the latter as I’ll take comfort over fashion any day. My personal favourites are Merrell and Columbia. Anything lemon flavoured. Usually lemon drizzle, or lemon cheesecake (not the American kind though 🇬🇧). I always have a pint of water next to the bed. So the first thing I always do is to take a drink to freshen my mouth, then go to the bathroom to get rid of the water I drank the night before. Probably 28…ish. I think late 20s is a good balance between health, disposable income, and level of responsibility. I actually don’t know. 8 maybe? I have a few winter hats, a cap, some summer hats, and my old beret from when I was in the Army. A photo of one of the watches that I’m selling . I don’t take a lot of photos really. When I do, they’re mostly of my pets, my kids, or my motorbikes. No idea. I have a pretty low bar when it comes to TV and movies. I can usually find something I enjoy in pretty much everything I watch. The worst movie I’ve watched though was Dog Man ; absolute steaming pile of dog shit (pun intended). 💩 I didn’t have any serious aspirations to be honest. I was too busy being a child to worry about adult stuff. I did want to be a doctor for a while, but then I realised that I don’t like blood, and that I’m not clever enough. And that’s it, those are the Ten Pointless Facts About Me . Maybe you found it interesting and learned something about me? If you want to take part, here are the questions in a copy/paste format to dump into your own blog post… Do you floss your teeth? Tea, coffee, or water? Footwear preference? Favourite dessert? The first thing you do when you wake up? Age you’d like to stick at? How many hats do you own? Describe the last photo you took? Worst TV show? As a child, what was your aspiration for adulthood?

0 views
Ruslan Osipov 4 months ago

Thoughts on 3D printing

A few months back my wife gifted me a 3D printer: an entry level Bambu Lab A1 Mini . It’s a really cool little machine - it’s easy to set up, and it integrates with Maker World - a vast repository of free 3D models. Now that I’ve lived with a 3D printer for nearly half a year, I’d like to share what I’ve learned. After booting up the printer, printing benchy - a little boat which tests printer calibration settings - and seeing thousands of incredible designs on Bambu Lab’s Maker World, I thought I would never have to buy anything ever again. I was wrong. While some stuff printed on a 3D printer is fantastic, it’s not always the best replacement for mass produced objects. Many mass produced plastic items are made using injection molding - liquid plastic that gets poured into a mold - and that produces a much stronger final product. That might be different if you’re printing with tougher plastics like ABS, but you also wouldn’t be using beginner-friendly machines like the A1 Mini to do that. So yeah, you still need to buy the heavy duty plastic stuff. And even as you print things, I wouldn’t say it’s cheaper than buying things from a store. It’s probably about the same, given the occasional failed prints, the cost of the 3D printer, the need for multiple filaments, and the fact that by having a 3D printer you’re more likely to print things you don’t exactly need. Oh, I’ve printed so many useless things - it’s amazing. The Elden Ring warrior jar Alexander planter. A Solaire of Astora figurine. A beautiful glitch art sculpture. I even got a 0.2mm nozzle (smaller than the default 0.4mm) and managed to 3D print passable wargame and D&D miniatures. Which was pretty awesome, although you have to pay for the nicest looking models, which does take away from the enjoyment of making plastic miniatures appear in your house “out of nowhere”. I’m not against artists getting paid, they certainly deserve it, but the printed models were comparable to a mid-range Reaper miniature if you know what I mean, which certainly isn’t terrible, but it’s harder to break even. Maybe I could get better at getting the small details printed nicely. Oh, and if you’re into wargames - this thing easily prints incredible terrain. A basic 3D printer will pay for itself once you furnish a single battlefield. Once you’re done with printing basic things, you do need to start fiddling with the settings. Defaults only take you so far, and if you want a smoother surface, smaller details, or improvement in any other quality indicator - you have to tinker with the settings and produce test prints. It’s a hobby in its own right, and it’s fun and rewarding, but this can get in the way when you’re just trying to print something really cool. But the most incredible feeling of accomplishment came when I needed something specific around the house and was able to design it myself. We bought some hanging plants, and I wished I could just hang them on the picture rail of our century home. I was able to design a hanger, and it took me 3 iterations to create an item that fits my house perfectly and that I love. My mom needed a plastic replacement part for a long discontinued juicer. I was able to design the thing (don’t worry, I covered the PLA in food-safe epoxy), and the juicer will see another few decades of use. Door stops, highly specific tools, garden shenanigans - the possibilities are endless.
It took me a few months to move past using others’ designs and start making my own - Tinkercad has been sufficient for my use cases so far, although I’m sure I’ll outgrow it as my projects get more complicated. 3D printers aren’t quite a consumer product yet, but my A1 Mini showed me that this future is getting closer. Some day, we might all have a tiny 3D printer in our home (or a cheap corner 3D printing shop?), to quickly and effortlessly create many household objects. Until then, 3D printers remain a tinkerer’s tool, but a really fun one at that, and modern printers are reducing the barrier to entry, making it much easier to get into the hobby.

0 views
codedge 4 months ago

Random wallpaper with swaybg

Setting a wallpaper in Sway, with swaybg, is easy. Unfortunately there is no way of setting a random wallpaper automatically out of the box. Here is a little helper script to do that. The script is based on a post from Sylvain Durand 1 with some slight modifications. I just linked the script in my sway config instead of setting a background there. Sway config : The script spawns a new instance, changes the wallpaper, and kills the old instance. With this approach there is no flickering of the background when changing. An always up-to-date version can be found in my dotfiles . Original script from Sylvain Durand: https://sylvaindurand.org/dynamic-wallpapers-with-sway/   ↩︎
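The script itself isn't reproduced above, but based on the behaviour described - spawn a new swaybg instance with a random image, then kill the old one so the background never goes blank - a minimal sketch looks like this (the wallpaper directory and the one-second delay are assumptions, not the author's exact script):

```sh
#!/bin/sh
# Hedged sketch of the described approach: start the new instance first,
# then kill the old one so there is no flicker while switching.
old_pid=$(pidof swaybg)
wallpaper=$(find "$HOME/Pictures/wallpapers" -type f | shuf -n 1)
swaybg -i "$wallpaper" -m fill &
sleep 1                      # let the new instance draw before removing the old one
[ -n "$old_pid" ] && kill $old_pid
```

Calling a script like this from the sway config gives a fresh wallpaper without any visible flicker.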

0 views

LLMs Eat Scaffolding for Breakfast

We just deleted thousands of lines of code. Again. Each time a new LLM model comes out, it’s the same story. LLMs have limitations, so we build scaffolding around them. Each model introduces new capabilities, so old scaffolding must be deleted and new scaffolding added. But as we move closer to super intelligence, less scaffolding is needed. This post is about what it takes to build successfully in AI today. Every line of scaffolding is a confession: the model wasn’t good enough. LLMs can’t read PDFs? Let’s build a complex system to convert PDF to markdown. LLMs can’t do math? Let’s build a compute engine to return accurate numbers. LLMs can’t handle structured output? Let’s build complex JSON validators and regex parsers. LLMs can’t read images? Let’s use a specialized image-to-text model to describe the image to the LLM. LLMs can’t read more than 3 pages? Let’s build a complex retrieval pipeline with a search engine to feed the best content to the LLM. LLMs can’t reason? Let’s build chain-of-thought logic with forced step-by-step breakdowns, verification loops, and self-consistency checks. Etc, etc... millions of lines of code to add external capabilities to the model. But look at models today: GPT-5 is solving frontier mathematics, Grok-4 Fast can read 3000+ pages with its 2M context window, Claude Sonnet 4.5 can ingest images or PDFs, all models have native reasoning capabilities and support structured outputs. The once essential scaffolding is now obsolete. Those tools are baked into the models’ capabilities. It’s nearly impossible to predict what scaffolding will become obsolete and when. What appears to be essential infrastructure and industry best practice today can transform into legacy technical debt within months. The best way to grasp how fast LLMs are eating scaffolding is to look at their system prompts (the top-level instructions that tell the AI how to behave). Looking at how the prompt used in Codex, OpenAI’s coding agent, changed from the o3 model to GPT-5 is mind-blowing. o3 prompt: 310 lines. GPT-5 prompt: 104 lines. The new prompt removed 206 lines. A 66% reduction. GPT-5 needs way less handholding. The old prompt had complex instructions on how to behave as a coding agent (personality, preambles, when to plan, how to validate). The new prompt assumes GPT-5 already knows this and only specifies the Codex-specific technical requirements (sandboxing, tool usage, output formatting). The new prompt removed all the detailed guidance about autonomously resolving queries, coding guidelines, and git usage. It’s also less prescriptive. Instead of “do this and this” it says “here are the tools at your disposal.” As we move closer to super intelligence, the models require more freedom and leeway (scary, lol!). Advanced models require simple instructions and tooling. Claude Code, the most sophisticated agent today, relies on a simple filesystem instead of a complex index and uses bash commands (find, read, grep, glob) instead of complex tools. It moves so fast. Each model introduces a new paradigm shift. If you miss a paradigm shift, you’re dead. Having an edge in building AI applications requires deep technical understanding, insatiable curiosity, and low ego. By the way, because everything changes, it’s good to focus on what won’t change. The context window is how much text you can feed the model in a single conversation. Early models could only handle a couple of pages. Now it’s thousands of pages and it’s growing fast.
Dario Amodei, the founder of Anthropic, expects 100M+ token context windows, while Sam Altman has hinted at billions of context tokens . It means LLMs can see more context, so you need less scaffolding like retrieval-augmented generation. November 2022 : GPT-3.5 could handle 4K context. November 2023 : GPT-4 Turbo with 128K context. June 2024 : Claude 3.5 Sonnet with 200K context. June 2025 : Gemini 2.5 Pro with 1M context. September 2025 : Grok-4 Fast with 2M context. Models used to stream at 30-40 tokens per second. Today’s fastest models like Gemini 2.5 Flash and Grok-4 Fast hit 200+ tokens per second. A 5x improvement. On specialized AI chips (LPUs), providers like Cerebras push open-source models to 2,000 tokens per second. We’re approaching real-time LLMs: full responses to complex tasks in under a second. LLMs are becoming exponentially smarter. With every new model, benchmarks get saturated. On the path to AGI, every benchmark will get saturated. Every job can be done and will be done by AI. As with humans, a key factor in intelligence is the ability to use tools to accomplish an objective. That is the current frontier: how well a model can use tools such as reading, writing, and searching to accomplish a task over a long period of time. This is important to grasp. Models will not improve their language translation skills (they are already at 100%), but they will improve how they chain translation tasks over time to accomplish a goal. For example, you can say, “Translate this blog post into every language on Earth,” and the model will work for a couple of hours on its own to make it happen. Tool use and long-horizon tasks are the new frontier. The uncomfortable truth: most engineers are maintaining infrastructure that shouldn’t exist. Models will make it obsolete, and the survival of AI apps depends on how fast you can adapt to the new paradigm. That’s where startups have an edge over big companies. Big corporations are late by at least two paradigms. Some examples of scaffolding that are on the decline: Vector databases : Companies paying thousands per month when they could now just put docs in the prompt or use agentic search instead of RAG ( my article on the topic ). LLM frameworks : These frameworks solved real problems in 2023. In 2025? They’re abstraction layers that slow you down. The best practice is now to use the model API directly. Prompt engineering teams : Companies hiring “prompt engineers” to craft perfect prompts when current models just need clear instructions and open tools. Model fine-tuning : Teams spending months fine-tuning models only for the next generation of out-of-the-box models to outperform their fine-tune (cf. my 2024 article on that ). Custom caching layers : Building Redis-backed semantic caches that add latency and complexity when prompt caching is built into the API. This cycle accelerates with every model release. The best AI teams have mastered a few critical skills: Deep model awareness : They understand exactly what today’s models can and cannot do, building only the minimal scaffolding needed to bridge capability gaps. Strategic foresight : They distinguish between infrastructure that solves today’s problems versus infrastructure that will survive the next model generation. Frontier vigilance : They treat model releases like breaking news. Missing a single capability announcement from OpenAI, Anthropic, or Google can render months of work obsolete. Ruthless iteration : They celebrate deleting code.
When a new model makes their infrastructure redundant, they pivot in days, not months. It’s not easy. Teams are fighting powerful forces: Lack of awareness : Teams don’t realize models have improved enough to eliminate scaffolding (this is massive btw). Sunk cost fallacy : “We spent 3 years building this RAG pipeline!” Fear of regression : “What if the new approach is simple but doesn’t work as well on certain edge cases?” Organizational inertia : Getting approval to delete infrastructure is harder than building it. Resume-driven development : “RAG pipeline with vector DB and reranking” looks better on a resume than “put files in prompt”. In AI, the best teams build for fast obsolescence and stay at the edge. Software engineering sits on top of a complex stack. More layers, more abstractions, more frameworks. Complexity was seen as sophistication. A simple web form in 2024? React for UI, Redux for state, TypeScript for types, Webpack for bundling, Jest for testing, ESLint for linting, Prettier for formatting, Docker for deployment... AI is inverting this. The best AI code is simple and close to the model. Experienced engineers look at modern AI codebases and think: “This can’t be right. Where’s the architecture? Where’s the abstraction? Where’s the framework?” The answer: the model ate it, bro, get over it. The worst AI codebases are the ones that were best practices 12 months ago. As models improve, the scaffolding becomes technical debt. The sophisticated architecture becomes the liability. The framework becomes the bottleneck. LLMs eat scaffolding for breakfast and the trend is accelerating. Thanks for reading!
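To ground the "just put the docs in the prompt and call the model API directly" point, here's a minimal sketch. The request shape follows OpenAI's public chat completions API, but the model name and environment variable are assumptions for illustration:

```typescript
// Hedged sketch: whole document in the prompt, one direct API call, no framework.
import { readFile } from "node:fs/promises";

async function askAboutDocument(question: string, path: string): Promise<string> {
  const doc = await readFile(path, "utf8"); // large context windows make this viable
  const response = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`, // assumed env var
    },
    body: JSON.stringify({
      model: "gpt-5", // assumed model name for illustration
      messages: [
        { role: "system", content: "Answer using only the provided document." },
        { role: "user", content: `${question}\n\n---\n\n${doc}` },
      ],
    }),
  });
  const data = await response.json();
  return data.choices[0].message.content;
}
```

The point is what's missing: no retrieval pipeline, no vector database, no framework - just a file read and one HTTP call.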

1 views
ptrchm 4 months ago

Event-driven Modular Monolith

The main Rails app I currently work on has just turned eight. It’s not a huge app. It doesn’t deal with web-scale traffic or large volumes of data. Only six people working on it now. But eight years of pushing new code adds up. This is a quick overview of some of the strategies we use to keep the codebase maintainable. After the first few years, our codebase suffered from typical ailments: tight coupling between domains, complex database queries spread across various parts of the app, overgrown models, a maze of side effects triggered by ActiveRecord callbacks , endlessly chained associations (e.g. ) – with an all-encompassing model sitting on top of the pile. Modular Monolith Pub/Sub (Events) Patterns Service Objects Repositories for Database Queries Slim and Dumb Models Bonus: A Separate Frontend App How Do I Start?

0 views