Latest Posts (20 found)

Firefox AI Killswitch

Nice to see that the Firefox team have actually implemented their "AI killswitch" in the way that they said they would. Here's a screenshot from my copy of Firefox 148: Very happy to see this land, and it means I can end my hunt for a new browser for the time being. Thanks for reading this post via RSS. RSS is ace, and so are you. ❤️ You can reply to this post by email, or leave a comment.


Croissant and CORS proxy update

Croissant is my home-cooked RSS reader. I wish it could be only a progressive web app (PWA), but due to missing CORS headers, many feeds remain inaccessible. My RSS feeds have the header and so should yours! Blogs Are Back has a guide to enable CORS for your blog. Bypassing CORS requires some kind of proxy. Other readers use a custom browser extension. That is clever, but extensions can be dangerous.

I decided on two solutions. I wrapped my PWA in a Tauri app. This is also dangerous if you don’t trust me. I also provided a server proxy for the PWA. A proxy has privacy concerns but is much safer.

I’m sorry if anyone is using Croissant as a PWA, because the proxy is now gone. If a feed has the correct CORS headers it will continue to work. Sorry for the abrupt change. That’s super lame, I know! To be honest I’ve lost a bit of enthusiasm for the project and I can’t maintain a proxy. Croissant was designed to be limited in scope to avoid too much burden. In hindsight the proxy was too ambitious.

Could you run a proxy yourself? Technically, yes! But you’ll have to figure that out by yourself. If you have questions – where to find the code, how the code works, etc. – the answer is no. I don’t mean to be rude, I just don’t have any time! You’re welcome to ask for support, but unless I can answer in 30 seconds I’ll have to decline.

Croissant is feature complete! It does what I set out to achieve. I have fixed several minor bugs and tweaked a few styles. Until inspiration (or a bug) strikes I won’t do another update anytime soon. Maybe later in the year I’ll decide to overhaul it? Who can predict!

Thanks for reading! Follow me on Mastodon and Bluesky. Subscribe to my Blog and Notes or Combined feeds.


Quick Clarification on Pure Comments

A couple of people have reached out to me asking if I can offer a version of Pure Comments for their site, as they don't run Pure Blog. I obviously didn't make this clear in the announcement, or on the (now updated) Pure Comments site. Pure Comments can be used on ANY website. It's just an embed script, just like Disqus (only with no bloat or tracking). So you just have to upload the files to wherever you want to host Pure Comments, then add the embed snippet wherever you want comments to display (replacing the example domain with your own). You can use Pure Comments on WordPress, Bear Blog, Jekyll, 11ty, Hugo, Micro.blog, Kirby, Grav, and even Pure Blog! Anywhere you can inject that little snippet of code, Pure Comments will work. Thanks for reading this post via RSS. RSS is ace, and so are you. ❤️ You can reply to this post by email, or leave a comment.


Building Emacs Major Modes with TreeSitter: Lessons Learned

Over the past year I’ve been spending a lot of time building TreeSitter-powered major modes for Emacs – clojure-ts-mode (as co-maintainer), neocaml (from scratch), and asciidoc-mode (also from scratch). Between the three projects I’ve accumulated enough battle scars to write about the experience. This post distills the key lessons for anyone thinking about writing a TreeSitter-based major mode, or curious about what it’s actually like.

Before TreeSitter, Emacs font-locking was done with regular expressions, and indentation was handled by ad-hoc engines (SMIE, custom indent functions, or pure regex heuristics). This works, but it has well-known problems:

- Regex-based font-locking is fragile. Regexes can’t parse nested structures, so they either under-match (missing valid code) or over-match (highlighting inside strings and comments). Every edge case is another regex, and the patterns become increasingly unreadable over time.
- Indentation engines are complex. SMIE (the generic indentation engine for non-TreeSitter modes) requires defining operator precedence grammars for the language, which is hard to get right. Custom indentation functions tend to grow into large, brittle state machines. Tuareg’s indentation code, for example, is thousands of lines long.

TreeSitter changes the game because you get a full, incremental, error-tolerant syntax tree for free. Font-locking becomes “match this AST pattern, apply this face”, and indentation becomes “if the parent node is X, indent by Y”. The rules are declarative, composable, and much easier to reason about than regex chains. In practice, neocaml’s entire font-lock and indentation logic fits in about 350 lines of Elisp. The equivalent in tuareg is spread across thousands of lines. That’s the real selling point: simpler, more maintainable code that handles more edge cases correctly.

That said, TreeSitter in Emacs is not a silver bullet. Here’s what I ran into.
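To make the discussion concrete, here is roughly what those declarative rules look like, using Emacs’s built-in treesit API. This is a sketch: `mylang` and the node names are illustrative, not taken from a real grammar.

```elisp
;; Font-locking: "match this AST pattern, apply this face".
(treesit-font-lock-rules
 :language 'mylang
 :feature 'keyword
 ;; Fontify literal keywords and string nodes wherever they appear
 ;; in the tree -- no regexes, no string/comment special-casing.
 '(["if" "else" "while"] @font-lock-keyword-face
   (string_literal) @font-lock-string-face))

;; Indentation: "if the parent node is X, indent by Y".
(setq-local treesit-simple-indent-rules
            '((mylang
               ;; Closers align with the parent's beginning of line;
               ;; children of a block indent two columns past it.
               ((node-is "}") parent-bol 0)
               ((parent-is "block") parent-bol 2))))
```

In a real mode the first form’s result is stored in `treesit-font-lock-settings`; the point here is only the shape of the rules.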
TreeSitter grammars are written by different authors with different philosophies. The tree-sitter-ocaml grammar provides a rich, detailed AST with named fields. The tree-sitter-clojure grammar, by contrast, deliberately keeps things minimal – it only models syntax, not semantics, because Clojure’s macro system makes static semantic analysis unreliable.[1] This means font-locking forms in Clojure requires predicate matching on symbol text, while in OCaml you can directly match nodes with named fields.

You can’t learn “how to write TreeSitter queries” generically – you need to learn each grammar individually. The best tools for this are treesit-explore-mode (to visualize the full parse tree) and treesit-inspect-mode (to see the node at point). Use them constantly.

You’re dependent on someone else providing the grammar, and quality is all over the map. The OCaml grammar is mature and well-maintained – it’s hosted under the official tree-sitter GitHub org. The Clojure grammar is small and stable by design. But not every language is so lucky. asciidoc-mode uses a third-party AsciiDoc grammar that employs a dual-parser architecture – one parser for block-level structure (headings, lists, code blocks) and another for inline formatting (bold, italic, links). This approach makes sense for markup languages where block and inline syntax are largely independent. The problem is that the two parsers run independently on the same text, and they can disagree. The inline parser misinterprets markers such as list bullets as emphasis delimiters, creating spurious bold spans that swallow subsequent inline content.
The workaround is to use :override on all block-level font-lock rules so they win over the incorrect inline faces. This doesn’t fix inline elements consumed by the spurious emphasis – that requires an upstream grammar fix. When you hit grammar-level issues like this, you either fix them yourself (which means diving into the grammar’s JavaScript source and C toolchain) or you live with workarounds. Either way, it’s a reminder that your mode is only as good as the grammar underneath it. Getting the font-locking right in asciidoc-mode was probably the most challenging part of all three projects, precisely because of these grammar quirks.

I also ran into a subtle behavior: by default (without :override), font-lock skips an entire captured range if any position within it already has a face. So if you capture a whole parent node and a child was already fontified, the whole thing gets skipped silently. The fix is to capture specific child nodes instead. These issues took a lot of trial and error to diagnose. The lesson: budget extra time for font-locking when working with less mature grammars.

Grammars evolve, and breaking changes happen. clojure-ts-mode switched from the stable grammar to the experimental branch because the stable version had metadata nodes as children of other nodes, which caused structural navigation commands to behave incorrectly. The experimental grammar makes metadata standalone nodes, fixing the navigation issues but requiring all queries to be updated. neocaml pins to v0.24.0 of the OCaml grammar. If you don’t pin versions, a grammar update can silently break your font-locking or indentation. The takeaway: always pin your grammar version, and include a mechanism to detect outdated grammars – for example, by testing at startup a query that changed between versions.

Users shouldn’t have to manually clone repos and compile C code to use your mode. Both projects include grammar recipes: on first use, the mode checks for the grammar and offers to install it via treesit-install-language-grammar.
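A grammar recipe plus auto-install check might look like this. The pinned revision and source directory here are illustrative (they follow tree-sitter-ocaml’s repository layout), so treat this as a sketch rather than either project’s exact code:

```elisp
;; Tell Emacs where the grammar lives and which revision to pin.
(add-to-list 'treesit-language-source-alist
             '(ocaml "https://github.com/tree-sitter/tree-sitter-ocaml"
                     "v0.24.0"          ; pinned tag -- never track HEAD
                     "grammars/ocaml")) ; grammar sits in a subdirectory

;; On first use, offer to compile and install the missing grammar.
(unless (treesit-language-available-p 'ocaml)
  (when (y-or-n-p "Install the OCaml TreeSitter grammar? ")
    (treesit-install-language-grammar 'ocaml)))
```

Pinning the revision in the recipe is what protects you from the silent breakage described above.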
This works, but requires a C compiler and Git on the user’s machine, which is not ideal.[2]

The TreeSitter support in Emacs has been improving steadily, but each version has its quirks:

- Emacs 29 introduced TreeSitter support but lacked several APIs. For instance, some functions used for structured navigation simply don’t exist there, so you need version-guarded fallbacks.
- Emacs 30 added sentence navigation and better indentation support, among other things. But it also had a bug in parser offsets (#77848) that broke embedded parsers, and another that required disabling a function’s TreeSitter-aware version as a workaround.
- Emacs 31 has an off-by-one bug that leaves ` *)` behind when uncommenting multi-line OCaml comments. I had to skip the affected test with a version check.

The lesson: test your mode against multiple Emacs versions, and be prepared to write version-specific workarounds. CI that runs against Emacs 29, 30, and snapshot is essential.

Most TreeSitter grammars ship with query files for syntax highlighting (highlights.scm) and indentation (indents.scm). Editors like Neovim and Helix use these directly. Emacs doesn’t – you have to manually translate the patterns into treesit-font-lock-rules and treesit-simple-indent-rules calls in Elisp. This is tedious and error-prone. The query syntax is nearly identical, but you have to wrap everything in Elisp calls, map upstream capture names to Emacs face names, assign features, and manage override behavior. You end up maintaining a parallel set of queries that can drift from upstream. Emacs 31 will introduce support for using these query files for font-locking directly, which should help significantly. But for now, you’re hand-coding everything.

When a face isn’t being applied where you expect, check a few things: use treesit-inspect-mode to verify that the node type at point matches your query; set treesit--font-lock-verbose to t to see which rules are firing; and check the font-lock feature level – your rule might be in level 4 while the user has the default level 3. Remember also that rule order matters: without :override, an earlier rule that already fontified a region will prevent later rules from applying. This can be intentional (e.g. builtin types at level 3 take precedence over generic types) or a source of bugs.

TreeSitter modes define four levels of font-locking via treesit-font-lock-feature-list, and the default level in Emacs is 3. It’s tempting to pile everything into levels 1–3 so users see maximum highlighting out of the box, but resist the urge.
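For reference, a four-level split looks roughly like this – the exact feature names vary by mode, so these are illustrative:

```elisp
(setq-local treesit-font-lock-feature-list
            '((comment definition)                  ; level 1: essentials
              (keyword string)                      ; level 2
              (builtin constant type)               ; level 3: the default
              (bracket delimiter operator number))) ; level 4: full rainbow

;; Users can cherry-pick features regardless of level, e.g. enable
;; operator highlighting while dropping brackets:
(treesit-font-lock-recompute-features '(operator) '(bracket))
```

The first argument to treesit-font-lock-recompute-features is the features to add, the second the features to remove.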
When every token on the screen has a different color, code starts looking like a Christmas tree and the important things – keywords, definitions, types – stop standing out. Less is more here. Both neocaml and clojure-ts-mode distribute features across levels following the same philosophy: essentials first, progressively more detail at higher levels. This way the default experience (level 3) is clean and readable, and users who want the full rainbow can bump to 4. Better yet, they can use treesit-font-lock-recompute-features to cherry-pick individual features regardless of level. This gives users fine-grained control without requiring mode authors to anticipate every preference.

Indentation issues are harder to diagnose because they depend on tree structure, rule ordering, and anchor resolution. Set treesit--indent-verbose to t – this logs which rule matched for each line, what anchor was computed, and the final column. Use treesit-explore-mode to understand the parent chain. The key question is always: “what is the parent node, and which rule matches it?” Remember that rule order matters for indentation too – the first matching rule wins – so a typical set of rules reads top to bottom from most specific to most general. Watch out for the empty-line problem: when the cursor is on a blank line, TreeSitter has no node at point. The indentation engine falls back to the root node as the parent, which typically matches the top-level rule and gives column 0. In neocaml I solved this with a rule that looks at the previous line’s last token to decide indentation.

Testing is the single most important piece of advice. Font-lock and indentation are easy to break accidentally, and manual testing doesn’t scale. Both projects use Buttercup (a BDD testing framework for Emacs) with custom test macros. Font-lock tests insert code into a buffer, run font-lock-ensure, and assert that specific character ranges have the expected face. Indentation tests insert code, reindent it, and assert the result matches the expected indentation. Integration tests load real source files and verify that both font-locking and indentation survive on the full file. This catches interactions between rules that unit tests miss. neocaml has 200+ automated tests and clojure-ts-mode has even more.
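A Buttercup font-lock spec along these lines might look like this. It’s a simplified sketch, not the actual test macros from either project, and the mode name `neocaml-mode` is an assumption:

```elisp
(require 'buttercup)

(describe "neocaml font-locking"
  (it "fontifies the let keyword"
    (with-temp-buffer
      (neocaml-mode)                  ; assumes the mode is installed
      (insert "let x = 42")
      (font-lock-ensure)
      ;; The face at buffer position 1 (the "l" of "let") should be
      ;; the keyword face.
      (expect (get-text-property 1 'face)
              :to-equal 'font-lock-keyword-face))))
```

In practice both projects wrap this pattern in a macro so each spec is a one-liner pairing a code snippet with the expected face ranges.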
Investing in test infrastructure early pays off enormously – I can refactor indentation rules with confidence because the suite catches regressions immediately. When I became the maintainer of clojure-mode many years ago, I really struggled with making changes. There were no font-lock or indentation tests, so every change was a leap of faith – you’d fix one thing and break three others without knowing until someone filed a bug report. I spent years working on a testing approach I was happy with, alongside many great contributors, and the return on investment was massive. The same approach – almost the same test macros – carried over directly to clojure-ts-mode when we built the TreeSitter version. And later I reused the pattern again in neocaml and asciidoc-mode. One investment in testing infrastructure, four projects benefiting from it.

I know that automated tests, for whatever reason, never gained much traction in the Emacs community. Many popular packages have no tests at all. I hope stories like this convince you that investing in tests is really important and pays off – not just for the project where you write them, but for every project you build after.

One more lesson that applies broadly: compiling TreeSitter queries at runtime is expensive. If you’re building queries dynamically (e.g. with treesit-query-compile called at mode init time), consider pre-compiling them ahead of time as constants. This made a noticeable difference in startup time.

The Emacs community has settled on a -ts-mode suffix convention for TreeSitter-based modes: python-ts-mode, ruby-ts-mode, and so on. This makes sense when both a legacy mode and a TreeSitter mode coexist in Emacs core – users need to choose between them. But I think the convention is being applied too broadly, and I’m afraid the resulting name fragmentation will haunt the community for years. For new packages that don’t have a legacy counterpart, the suffix is unnecessary. I named my packages neocaml and asciidoc-mode, skipping the -ts-mode suffix, because there was no prior mode to disambiguate from.
The -ts- infix is an implementation detail that shouldn’t leak into the user-facing name. Will we rename everything again when TreeSitter becomes the default and the non-TS variants are removed? Be bolder with naming. If you’re building something new, give it a name that makes sense on its own merits, not one that encodes the parsing technology in the package name.

I think the full transition to TreeSitter in the Emacs community will take 3–5 years, optimistically. There are hundreds of major modes out there, many maintained by a single person in their spare time. Converting a mode from regex to TreeSitter isn’t just a mechanical translation – you need to understand the grammar, rewrite font-lock and indentation rules, handle version compatibility, and build a new test suite. That’s a lot of work.

Interestingly, this might be one area where agentic coding tools can genuinely help. The structure of TreeSitter-based major modes is fairly uniform: grammar recipes, font-lock rules, indentation rules, navigation settings, imenu. If you give an AI agent a grammar and a reference to an existing high-quality mode, it could probably scaffold a reasonable new mode fairly quickly. The hard parts – debugging grammar quirks, handling edge cases, getting indentation just right – would still need human attention, but the boilerplate could be automated.

Still, knowing the Emacs community, I wouldn’t be surprised if a full migration never actually completes. Many old-school modes work perfectly fine, their maintainers have no interest in TreeSitter, and “if it ain’t broke, don’t fix it” is a powerful force. And that’s okay – diversity of approaches is part of what makes Emacs Emacs.

TreeSitter is genuinely great for building Emacs major modes. The code is simpler, the results are more accurate, and incremental parsing means everything stays fast even on large files. I wouldn’t go back to regex-based font-locking willingly. But it’s not magical.
Grammars are inconsistent across languages, the Emacs APIs are still maturing, you can’t reuse upstream query files (yet), and you’ll hit version-specific bugs that require tedious workarounds. The testing story is better than with regex modes – tree structures are more predictable than regex matches – but you still need a solid test suite to avoid regressions.

If you’re thinking about writing a TreeSitter-based major mode, do it. The ecosystem needs more of them, and the experience of working with syntax trees instead of regexes is genuinely enjoyable. Just go in with realistic expectations, pin your grammar versions, test against multiple Emacs releases, and build your test suite early.

Anyways, I wish there had been an article like this one when I was starting out, so there you have it. I hope that the lessons I’ve learned along the way will help you build better modes with TreeSitter down the road. That’s all I have for you today. Keep hacking!

[1] See the excellent scope discussion in the tree-sitter-clojure repo for the rationale. ↩︎
[2] There’s ongoing discussion in the Emacs community about distributing pre-compiled grammar binaries, but nothing concrete yet. ↩︎


How To Run Services on a Linux Server

I have been running services myself for a few years on Linux servers. It took a while to figure out what works best. Here's what I've learned. First of all, all maintenance is done on headless servers via SSH. Learning this might seem daunting for some at first, but it is truly unbeatable in terms of productivity and speed. To easily log in via SSH, add the SSH keys to the server and then add the server to your `~/.ssh/config`. For example:

```
Host arnold
    Hostname 123.456.789.012
    User rik
    IdentityFile ~/.ssh/arnold
```

Now you can log in via `ssh arnold` instead of having to ma...


I made a voice note taker

Have you ever always wanted a very very small voice note recorder that would fit in your pocket? Something that would always work, and always be available to take a note at the touch of a button, with no fuss? Me neither. Until, that is, I saw the Pebble Index 01 – then I absolutely needed it right away and had to have it in my life immediately, but alas, it is not available, plus it’s disposable, and I don’t like creating e-waste. What was a poor maker like me supposed to do when struck down so cruelly by the vicissitudes of fate? There was only one thing I could do: I could build my own, shitty version of it for $8, and that’s exactly what I did.

Like everyone else, I have some sort of undiagnosed ADHD, which manifests itself as my brain itching for a specific task, and the itch becoming unbearable unless I scratch it. This usually results in me getting my phone out, no matter where I am or who I’m with, and either noting stuff down or doing the task, which some people perceive as rude, for inexplicable reasons that are almost certainly their fault. Because, however, it has proved easier to just not get my phone out in polite company than convince everyone of how wrong they are, I just do the former now, but that makes the itch remain. Also, sometimes I’m just in the middle of something, and an idea pops into my head for later pursuit, but I get distracted by a squirrel, a car going by, or the disturbing trend of the constant and persistent erosion of civil rights all over the world, and I forget the idea.

The Pebble Index showed me that there’s a better way, a device that’s unobtrusive, available, and reliable enough that I could just press a button, speak into it, and know for sure that my sonorous voice would reach the bowels of my phone, where it would be stored safely until I was bored and wanted something to do.
I didn’t want to have to get my phone out, unlock it, open a voice recorder app, hold down a button, speak, wonder if it heard me, look at the button, realize I had already pressed it, press it again, say the thing again, press it again to stop, exit the app, lock my phone, and put it back into my pocket. I wanted to take a thing out, press a button, speak, release the button, done.

The initial thinking was that I’d use a microcontroller (an ESP32 is my microcontroller of choice these days), a microphone, and a lithium battery, and that’s basically all the hardware this needs! Most of the heavy lifting would need to be done in software. This would need:

- A way for the device to record audio onto some sort of persistent storage, for the case where you didn’t have your phone close to you.
- A way for the device to sleep, consuming almost no power, until it was woken up by the button.
- A way to transfer the files from the device to the phone, for later listening.
- A battery indicator would be very nice, so I knew when to recharge it.

Luckily, I know enough about electronics to know that LLMs would definitely know how to build something like that. Indeed, Claude confirmed my suspicions by saying that all I need is a microphone and an ESP32. It recommended an ESP32-C6, but I went with an ESP32-S3, as it had an onboard charge controller and would be able to charge a lithium battery from USB, which is very handy when you’re making a thing that runs on battery. The ESP32 is a microcontroller, a little computer that’s just really small. The main difference between the S3 and the C6 is that the S3 is more capable and more powerful.

I keep an assortment of random components around, so I had an ESP32-S3 board. It’s a no-name, crappy one from AliExpress, not a good, Seeed-branded one from AliExpress, but it would have to do. Unfortunately, I didn’t have a MEMS microphone (which is basically an angelic grain of rice that can hear, with excellent quality), but I did have an electret mic, which is huge and bad quality and would sound like an old-timey radio, but it was there and it was ready and it was willing, and after a few beers it seemed like it was right, or at least right for right now. I also had a very thin LiPo battery, which would suit very well.
For the final device I’d want a battery that’s a tiny bit shorter, as this one was around 40% longer than the ESP32, but it would do great for now. I quickly soldered everything together and recorded some audio. It worked! It worked and nobody was going to take that from me, even though it was crackly and the quality wasn’t great. Unfortunately, at this stage I realized that the analog electret microphone consumes too much energy, even when sleeping, which is terrible on a device that would spend more time sleeping than the beauty from that fairytale, Sleepy the Dwarf. To counteract that, I decided to use a MOSFET to cut power to the mic when the device was asleep. A MOSFET is a little switch that you can turn on and off from a microcontroller, basically. Full disclosure here, before using the MOSFET to turn the mic on and off, I went down a multi-hour rabbit hole trying to design a latching circuit that would allow the ESP32 to turn itself off and consume almost no power. Instead, it consumed a lot of my time, without anything to show for it, because I didn’t manage to make it work at all. The MOSFET for the mic worked fairly well, though, and the device didn’t consume much power when asleep. The real gains, however, were going to be had when the MEMS microphone I ordered arrived, as those use infinitesimal amounts of current when asleep, and have much better sound quality as well, as they are digital. The analog microphone crackled and popped and took a while to stabilize after boot, which was unfortunate because I wanted the device to be ready as soon as the user pressed the button. There was also a recording bug where the recording was missing a few milliseconds of audio every so often, which led to dropped phonemes and words sometimes sounding like other words because parts of them were dropped. 
All these problems were weird enough and hard enough to debug that I resolved to just wait for my digital MEMS microphone to arrive, which would solve them in one fell swoop, as it is digital and amazing.

After the relatively easy part of connecting a few wires together, now came the hard part: designing a case for the whole thing that would fit without leaving much empty space, to make the device as small as possible. This was very hard to do with this massive microphone that was as tall as everything else (including battery) combined. I initially tried to point the microphone downward while mounting it at the top, so it would take up the least amount of vertical space possible, but the PCB made that hard, as the microphone was soldered to it. I ended up desoldering the mic from the PCB, trimming the PCB to make it shorter, and connecting the mic to it with wires. That allowed me to make the case (and thus the device) smaller, but at what cost? Nothing, turns out, because it worked great.

The device was working great, but I didn’t want it tethered to my computer; I wanted to be able to take it out and about and show it the wonders of the world. To do this, I needed Bluetooth. Unfortunately, I have exactly zero idea how Bluetooth works, and would need to spend days or weeks figuring stuff out, but, luckily for me, I had a Claude subscription. It took a bit of back-and-forth, but I did manage to end up with a Python script that would connect to the pendant, download the audio files, and convert them from ADPCM to MP3, for expanded compatibility.

To maximize battery life, the way things worked was:

- You pressed the button.
- If you held it down for more than half a second, the recording would “count”.
- If there was a recording made (i.e. if you held the button down long enough), it would be saved.
- Bluetooth would turn on and look for a phone or computer that’s ready to receive.
- The device would send the file and go to sleep again.

This worked really well: the device was only awake for a small amount of time (10 seconds), but it could be awoken at any time just by tapping the button. At that point, it would transfer to the PC any files that were on the pendant, and go back to sleep. One downside was that transfers would take an inordinate amount of time, sometimes reaching 2 minutes for a 10-second clip.
OpenAI’s Codex was really helpful here, finding a solution for fast BLE transfers that made sending files 100x faster than it was before. Because I’m too impatient to wait for the slow boat from China, I ordered the same microphone locally. I had to pay an arm and a leg in shipping and impatience fees, but it was worth it, because I finally had a MEMS mic! It’s so cute and tiny, I immediately found a spot for it over the board, added the switch, added a voltage divider for sensing battery voltage, and that was it! The new mic sounds fantastic, it sounds better than recording with your phone, for some odd reason that I’m sure is all in my head. What’s more, it doesn’t have the weird bugs that plagued me with the analog mic. With this smaller mic, I could now design a better case. I designed the case you see on the right, which is the second generation. There will be a third, when I receive the shorter battery, which means I will have a choice of either making the device longer but half as thick, or around 40% shorter. I think I will go for longer but thinner, I’d quite prefer to have a thin device in my pocket, even if it’s long, than a stubby one that pokes out. Still, the new battery (and the new case) will mark the completion of this project and make me a very happy man. For the second-gen case, I decided to jazz it up and add a red stripe around it, because it was easy to do and because I think it looks good. Unfortunately, the feature I wanted most (fillets, i.e. rounded corners) wasn’t possible due to the lack of empty space inside the case. I hope the final device will have some more space for fillets, at least. Once I was done with the device, it was time to make it more ergonomic: I’d need to create an Android app so I wouldn’t have to wait to get to my PC. I also knew I wanted note transcription, as it’s really useful to be able to see what you said without having to listen to the audio again. 
Unfortunately again, I have no idea about Android development, only having written a small app years ago. Fortunately, though, Claude turned out to be pretty good at it, and one-shotted this app that you see here. For the transcription, I used GPT-4o Transcribe, which is great and understands both English and Greek, languages I fail to speak in equal measure. I have to say, it’s pretty magical to speak into a little box and to see the audio already captured and transcribed on your phone.

With the Android app, I could now test the device in real-world use. One thing I noticed is that the battery dies way too fast. I suspect that has something to do with the cheap board, so I’ve ordered an original Seeed Xiao board, and I hope that will fix the problem once and for all, as they advertise low power usage and they’re a trustworthy brand. I also added a “webhook” convenience function to the Android app, so that the latter would be able to send the transcription to a server for further processing. The device is extremely reliable, which makes me a lot more likely to use it. I know that, if I press the button, the audio will be recorded and stored, and nothing will happen to it, which makes for a very relaxed and calming experience.

Before I continue, I want to say you can find all the files in this project (firmware, Android app, whatever else) in its GitHub repository: https://github.com/skorokithakis/middle

That’s right, I called it Middle, because it was the next thing after the Index. I know it’s a silly name, I don’t care, don’t use it, I’m not changing it.

In the “draw the rest of the fucking owl” portion of this article, I realized I didn’t want the notes to just go to my phone when LLMs exist. I wanted an LLM to take the notes and do something with them, so I spent a few weeks writing an AI agent that’s more useful than what currently exists. The device’s Android app sends the transcribed text to this AI, which processes it.
I’m going to write another post about this, but basically, I wanted an AI personal assistant that could help with all the little chores in my life. AI assistants are interesting because they’re very open-ended tools, and highly personal. This means that, when everyone inevitably asks “what is it good for”, I can’t really give a good answer, because the answer is “it takes care of all the little annoyances for me”, but nobody has the same annoyances or can really imagine what the bot does, so they don’t engage with it. The amazing thing about AI assistants, for me, is that they can string together multiple (otherwise small) tools to do something that’s more valuable than the sum of its parts. For example, I asked the agent to give me a daily briefing every morning, consisting of my todos for the day, my calendar events, whether any refund has hit my bank, and whether any packages are due to be delivered today. The agent also checks my gym bookings and asks me every morning if I do plan to go, or if I intend to cancel. If I tell it to cancel, it does, but if I say I’ll go, it sets an alarm for a few minutes before, which I’m much more likely to see than my calendar’s one. It will also (entirely of its own volition) mention things like “you have a gym booking today 7-8pm but you have a restaurant booking at 9pm and it’ll take you more than an hour to shower and make it”, which a regular calendar wouldn’t be able to figure out. I’ve made it fantastically secure: everything is sandboxed, and you can run it on your laptop without fear. I use it constantly throughout the day for many little things, and the integration with the device takes the whole setup to another level. You can find the bot here: https://github.com/skorokithakis/stavrobot Do let me know if you try it, it’s like OpenClaw but won’t steal your data and eat your firstborn. If you have any ideas, feedback, flamebait, or whatever, you can Tweet or Bluesky me, or email me directly.
What I needed from the device:

- A way to record audio onto some sort of persistent storage, for the case where you didn’t have your phone close to you.
- A way to sleep, consuming almost no power, until it was woken up by the button.
- A way to transfer the files from the device to the phone, for later listening.
- A battery indicator would be very nice, so I knew when to recharge it.

The recording flow:

- You pressed the button. If you held it down for more than half a second, the recording would “count”.
- If there was a recording made (i.e. if you held the button down long enough), it would be saved.
- Bluetooth would turn on and look for a phone or computer that’s ready to receive.
- The device would send the file and go to sleep again.

Jeff Geerling Yesterday

How to Securely Erase an old Hard Drive on macOS Tahoe

Apparently Apple thinks nobody with a modern Mac uses spinning rust (hard drives with platters) anymore. I plugged in a hard drive from an old iMac into my Mac Studio using my Sabrent USB to SATA Hard Drive enclosure, and opened up Disk Utility, clicked on the top-level disk in the sidebar, and clicked 'Erase'. Lo and behold, there's no 'Security Options' button on there, as there had been since—I believe—the very first version of Disk Utility in Mac OS X!


On NVIDIA and Analyslop

Hey all! I’m going to start hammering out free pieces again after a brief hiatus, mostly because I found myself trying to boil the ocean with each one, fearing that if I regularly emailed you you’d unsubscribe. I eventually realized how silly that was, so I’m back, and will be back more regularly. I’ll treat it like a column, which will be both easier to write and a lot more fun. As ever, if you like this piece and want to support my work, please subscribe to my premium newsletter. It’s $70 a year, or $7 a month, and in return you get a weekly newsletter that’s usually anywhere from 5000 to 18,000 words, including vast, extremely detailed analyses of NVIDIA , Anthropic and OpenAI’s finances , and the AI bubble writ large . I am regularly several steps ahead in my coverage, and you get an absolute ton of value. In the bottom right hand corner of your screen you’ll see a red circle — click that and select either monthly or annual.  Next year I expect to expand to other areas too. It’ll be great. You’re gonna love it.  Before we go any further, I want to remind everybody I’m not a stock analyst nor do I give investment advice.  I do, however, want to say a few things about NVIDIA and its annual earnings report, which it published on Wednesday, February 25: NVIDIA’s entire future is built on the idea that hyperscalers will buy GPUs at increasingly-higher prices and at increasingly-higher rates every single year. It is completely reliant on maybe four or five companies being willing to shove tens of billions of dollars a quarter directly into Jensen Huang’s wallet. If anything changes here — such as difficulty acquiring debt or investor pressure cutting capex — NVIDIA is in real trouble, as it’s made over $95 billion in commitments to build out for the AI bubble .  Yet the real gem was this part: Hell yeah dude! 
After misleading everybody that it intended to invest $100 billion in OpenAI last year ( as I warned everybody about months ago , the deal never existed and is now effectively dead ), NVIDIA was allegedly “close” to investing $30 billion . One would think that NVIDIA would, after Huang awkwardly tried to claim that the $100 billion was “ never a commitment ,” say with its full chest how badly it wanted to support OpenAI and how intentionally it would do so. Especially when you have this note in your 10-K: What a peculiar world we live in. Apparently NVIDIA is “so close” to a “partnership agreement” too , though it’s important to remember that Altman, Brockman, and Huang went on CNBC to talk about the last deal and that never came together. All of this adds a little more anxiety to OpenAI's alleged $100 billion funding round which, as The Information reports , Amazon's alleged $50 billion investment will actually be $15 billion, with the next $35 billion contingent on AGI or an IPO: And that $30 billion from NVIDIA is shaping up to be a Klarna-esque three-installment payment plan: A few thoughts: Anyway, on to the main event. New term: analyslop, when somebody writes a long, specious piece of writing with few facts or actual statements with the intention of it being read as thorough analysis.  This week, alleged financial analyst Citrini Research (not to be confused with Andrew Left’s Citron Research)  put out a truly awful piece called the “2028 Global Intelligence Crisis,” slop-filled scare-fiction written and framed with the authority of deeply-founded analysis, so much so that it caused a global selloff in stocks .  This piece — if you haven’t read it, please do so using my annotated version — spends 7000 or more words telling the dire tale of what would happen if AI made an indeterminately-large amount of white collar workers redundant.  
It isn’t clear what exactly AI does, who makes the AI, or how the AI works, just that it replaces people, and then bad stuff happens. Citrini insists that this “isn’t bear porn or AI-doomer fan-fiction,” but that’s exactly what it is — mediocre analyslop framed in the trappings of analysis, sold on a Substack with “research” in the title, specifically written to spook and ingratiate anyone involved in the financial markets.  Its goal is to convince you that AI (non-specifically) is scary, that your current stocks are bad, and that AI stocks (unclear which ones those are, by the way) are the future. Also, find out more for $999 a year. Let me give you an example: The goal of a paragraph like this is for you to say “wow, that’s what GPUs are doing now!” It isn’t, of course. The majority of CEOs report little or no return on investment from AI , with a study of 6000 CEOs across the US, UK, Germany and Australia finding that “ more than 80%  [detected] no discernable impact from AI on either employment or productivity .” Nevertheless, you read “GPU” and “North Dakota” and you think “wow! That’s a place I know, and I know that GPUs power AI!”  I know a GPU cluster in North Dakota — CoreWeave’s one with Applied Digital that has debt so severe that it loses both companies money even if they have the capacity rented out 24/7 . But let’s not let facts get in the way of a poorly-written story. I don’t need to go line-by-line — mostly because I’ll end up writing a legally-actionable threat — but I need you to know that most of this piece’s arguments come down to magical thinking and utterly empty prose. For example, how does AI take over the entire economy?  That’s right, they just get better. No need to discuss anything happening today. Even AI 2027 had the balls to start making stuff up about “OpenBrain” or whatever. This piece literally just says stuff, including one particularly-egregious lie:  This is a complete and utter lie. A bald-faced lie.
This is not something that Claude Code can do. The fact that we have major media outlets quoting this piece suggests that those responsible for explaining how things work don’t actually bother to do any of the work to find out, and it’s both a disgrace and embarrassment for the tech and business media that these lies continue to be peddled.  I’m now going to quote part of my upcoming premium (the Hater’s Guide To Private Equity, out Friday), because I think it’s time we talked about what Claude Code actually does. I’ve worked in or around SaaS since 2012, and I know the industry well. I may not be able to code, but I take the time to speak with software engineers so that I understand what things actually do and how “impressive” they are. Similarly, I make the effort to understand the underlying business models in a way that I’m not sure everybody else is trying to, and if I’m wrong, please show me an analysis of the financial condition of OpenAI or Anthropic from a booster. You won’t find one, because they’re not interested in interacting with reality. So, despite all of this being very obvious , it’s clear that the markets and an alarming number of people in the media simply do not know what they are talking about or are intentionally avoiding thinking about it. The “AI replaces software” story is literally “Anthropic has released a product and now the resulting industry is selling off,” such as when it launched a cybersecurity tool that could check for vulnerabilities (a product that has existed in some form for nearly a decade) causing a sell-off in cybersecurity stocks like Crowdstrike — you know, the one that had a faulty bit of code cause a global cybersecurity incident that lost the Fortune 500 billions , and resulted in Delta Airlines having to cancel over 1,200 flights over a period of several days .  
There is no rational basis for anything about this sell-off other than that our financial media and markets do not appear to understand the very basic things about the stuff they invest in. Software may seem complex, but (especially in these cases) it’s really quite simple: investors are conflating “an AI model can spit out code” with “an AI model can create the entire experience of what we know as ‘software,’ or is close enough that we have to start freaking out.” This is thanks to the intentionally-deceptive marketing peddled by Anthropic and validated by the media. In a piece from September 2025, Bloomberg reported that Claude Sonnet 4.5 could “code on its own for up to 30 hours straight,”  a statement directly from Anthropic repeated by other outlets that added that it did so “on complex, multi-step tasks,” none of which were explained. The Verge, however, added that apparently Anthropic “ coded a chat app akin to Slack or Teams ,” and no, you can’t see it, or know anything about how much it costs or its functionality. Does it run? Is it useful? Does it work in any way? What does it look like? We have absolutely no proof this happened other than Anthropic saying it, but because the media repeated it, it’s now a fact.  As I discussed last week, Anthropic’s primary business model is deception , muddying the waters of what’s possible today and what might be possible tomorrow through a mixture of flimsy marketing statements and chief executive Dario Amodei’s doomerist lies about all white collar labor disappearing .  Anthropic tells lies of obfuscation and omission.  Anthropic exploits bad journalism, ignorance and a lack of critical thinking. As I said earlier, the “wow, Claude Code!” articles are mostly from captured boosters and people that do not actually build software being amazed that it can burp up its training data and make an impression of software engineering.
And even if we believe the idea that Spotify’s best engineers are not writing any code , I have to ask: to what end? Is Spotify shipping more software? Is the software better? Are there more features? Are there fewer bugs? What are the engineers doing with the time they’re saving? A study from last year from METR found that, despite engineers thinking they were 24% faster, LLM coding tools made them 19% slower.  I also think we need to really think deeply about how, for the second time in a month, the markets and the media have had a miniature shitfit based on blogs that tell lies using fan fiction. As I covered in my annotations of Matt Shumer’s “Something Big Is Happening,” the people that are meant to tell the general public what’s happening in the world appear to be falling for ghost stories that confirm their biases or investment strategies, even if said stories are full of half-truths and outright lies. I am despairing a little. When I see Matt Shumer on CNN or hear from the head of a PE firm about Citrini Research, I begin to wonder whether everybody got where they were not through any actual work but by making the right noises.  This is the grifter economy, and the people that should be stopping them are asleep at the wheel. NVIDIA beat estimates and raised expectations, as it has quarter after quarter. People were initially excited, then started reading the 10-K and seeing weird little things that stood out. $68.1 billion in revenue is a lot of money! That’s what you should expect from a company that is the single vendor in the only thing anybody talks about.  Hyperscaler revenue accounted for slightly more than 50% of NVIDIA’s data center revenue . As I wrote about last year , NVIDIA’s diversified revenue — that’s the revenue that comes from companies that aren’t in the magnificent 7 — continues to collapse.
While data center revenue was $62.3 billion, 50% ($31.15 billion) was taken up by hyperscalers…and because we don’t get a 10-Q for the fourth quarter, we don’t get a breakdown of how many individual customers made up that quarter’s revenue. Boo! It is both peculiar and worrying that 36% (around $77.7 billion) of its $215.938 billion in FY2026 revenue came from two customers. If I had to guess, they’re likely Foxconn or Quanta Computer, two large Taiwanese ODMs (Original Design Manufacturers) that build the servers for most hyperscalers.  If you want to know more, I wrote a long premium piece that goes into it (among the ways in which AI is worse than the dot com bubble). In simple terms, when a hyperscaler buys GPUs, they go straight to one of these ODMs to put them into servers. This isn’t out of the ordinary, but I keep an eye on the ODM revenues (which are published every month) to see if anything shifts, as I think it’ll be one of the first signs that things are collapsing. NVIDIA’s inventories continue to grow, sitting at over $21 billion (up from around $19 billion last quarter). Could be normal! Could mean stuff isn’t shipping. NVIDIA has now agreed to $27 billion in multi-year-long cloud service agreements — literally renting its GPUs back from the people it sells them to — with $7 billion of that expected in its FY2027 (Q1 FY2027 will report in May 2026).  For some context, CoreWeave (which reports FY2025 earnings today, February 26) gave guidance last November that it expected its entire annual revenue to be between $5 billion and $5.15 billion. CoreWeave is arguably the largest AI compute vendor outside of the hyperscalers. If there was significant demand, none of this would be necessary.
NVIDIA “invested” $17.5bn in AI model makers and other early-stage AI startups, and made a further $3.5bn in land, power, and shell guarantees to “support the build-out of complex datacenter infrastructures.” In total, it spent $21bn propping up the ecosystem that, in turn, feeds billions of dollars into its coffers.  NVIDIA’s long-term supply and capacity obligations soared from $30.8bn to $95.2bn , largely because NVIDIA’s latest chips are extremely complex and require TSMC to make significant investments in hardware and facilities , and it’s unwilling to do that without receiving guarantees that it’ll make its money back.  NVIDIA expects these obligations to grow .  NVIDIA’s accounts receivable (as in goods that have been shipped but are yet to be paid for) now sits at $38.4 billion, of which 56% ($21.5 billion) is from three customers. This is turning into a very involved and convoluted process! It turns out that it's pretty difficult to actually raise $100 billion. This is a big problem, because OpenAI needs $655 billion in the next five years to pay all its bills , and loses billions of dollars a year. If OpenAI is struggling to raise $100 billion today, I don't see how it's possible it survives. If you're to believe reports, OpenAI made $13.1 billion in revenue in 2025 on $8 billion of losses , but remember, my own reporting from last year said that OpenAI only made around $4.329 billion through September 2025 with $8.67 billion of inference costs alone. It is kind of weird that nobody seems to acknowledge my reporting on this subject. I do not see how OpenAI survives.

- It coded for 30 hours [from which you are meant to intimate the code was useful or good and that these hours were productive].
- It made a Microsoft Teams competitor [that you are meant to assume was full-featured and functional like Teams or Slack, or…functional? And they didn’t even have to prove it by showing you it].
- It was able to write uninterruptedly [which you assume was because it was doing good work that didn’t need interruption].


Does ChatGPT know what a question is?

I was recently explaining to a friend that ChatGPT is, in essence, “just” a model that predicts the next word, the one that comes after a sequence of other words. So when you ask it “What is the capital of France?”, it doesn’t (really) answer the question: rather, it completes a sequence of words it was trained on, thoroughly and very effectively.


Build Your Own Key-Value Storage Engine—Week 7

Curious how leading engineers tackle extreme scale challenges with data-intensive applications? Join Monster Scale Summit (free + virtual). It’s hosted by ScyllaDB, the monstrously fast and scalable database.

Agenda:

- Week 0: Introduction
- Week 1: In-Memory Store
- Week 2: LSM Tree Foundations
- Week 3: Durability with Write-Ahead Logging
- Week 4: Deletes, Tombstones, and Compaction
- Week 5: Leveling and Key-Range Partitioning
- Week 6: Block-Based SSTables and Indexing
- Week 7: Bloom Filters and Trie Memtable

Over the last few weeks, you refined your LSM tree to introduce leveling. In case of a key miss, the process requires the following steps:

- Lookup from the memtable.
- Lookup from all the L0 SSTables.
- Lookup from one L1 SSTable.
- Lookup from one L2 SSTable.

Last week, you optimized the lookups by introducing block-based SSTables and indexing, but a lookup is still not a “free” operation. Worst case, it requires fetching two pages (one for the index block and one for the data block) to find out that a key is missing in an SSTable. This week, you will optimize searches by introducing a “tiny” level of caching per SSTable. If you’re an avid reader of The Coder Cafe, we already discussed a great candidate for such a cache:

- One that doesn’t consume too much memory, to make sure we don’t increase space amplification drastically.
- One that is fast enough so that a lookup doesn’t introduce too much overhead, especially if we have to check a cache before making any lookup in an SSTable.

You will implement a cache using Bloom filters: a space-efficient, probabilistic data structure to check for set membership. A Bloom filter can return two possible answers:

- The element is definitely not in the set (no false negatives).
- The element may be in the set (false positives are possible).

In addition to optimizing SSTable lookups, you will also optimize your memtable. In week 2, you implemented a memtable using a hashtable.
Let’s get some perspective to understand the problems of using a hashtable:

- A memtable buffers writes. As it’s the main entry point for writes, a write has to be fast. → OK: a hashtable has O(1) average inserts, plus O(k) (k: the length of the key) for hashing.
- For reads, doing a key lookup has to be fast. → OK: O(1) average lookups, plus O(k) to hash.
- Doing range scanning operations (week 5, optional work), such as: “Give me the list of keys between bar and foo.” → A hashtable, because it’s not an ordered data structure, is terrible: you end up touching everything, so O(n) with n the number of elements in the hashtable.
- Flush to L0. → A hashtable isn’t ordered, so it requires sorting all the keys (O(n log n) with n the number of elements) to produce the SSTables.

Because of these negative points, could we find a better data structure? Yes! This week, you will switch the memtable to a radix trie (see Further Notes for a discussion on alternative data structures). A trie is a tree-shaped data structure usually used to store strings efficiently. The common example to illustrate a trie is to store a dictionary. For example, suppose you want to store two words that share the same first four letters: stored separately, they require a total of 4 + 5 = 9 letters. Tries optimize the storage required by sharing prefixes. Each node stores one letter. Here’s an example of a trie storing these two words in addition to the word foo (marked nodes represent the end of a word): As you can see, we didn’t duplicate the first four letters of the first word to store the second. In this very example, instead of storing 9 letters for the two words, we stored only five letters. Yet, you’re not going to implement a “basic” trie for your memtable; instead, you will implement a compressed trie called a radix trie (also known as a patricia trie). Back to the previous example, storing one node (one square) has an overhead. It usually means at least one extra field to store the next element, usually a pointer.
In the previous example, we needed 11 nodes in total, but what if we could compress the number of nodes required? The idea is to combine nodes with a single child: This new trie stores the exact same information, except it requires 6 nodes instead of 11. That’s what radix tries are about. To summarize the benefits of switching a memtable from a hashtable to a radix trie:

- Ordered by design: Tries keep keys in order and make prefix/range lookups natural, which helps for range scans and for streaming a sorted flush.
- No rebalancing/rehashing pauses: The shape doesn’t depend on insertion order, and operations don’t need rebalancing; you avoid periodic rehash work.
- Prefix compression: A radix trie can cut duplicated key bytes in the memtable, reducing in-memory space.

💬 If you want to share your progress, discuss solutions, or collaborate with other coders, join the community Discord server: Join the Discord

Let’s size the Bloom filter. You will target:

- p (false-positive rate) = 1%
- n (max elements per SSTable) = 1,953
- k (hash functions) = 5

Using the formula from the Bloom Filters post, m = −k·n / ln(1 − p^(1/k)), we get m ≈ 19,230 bits, i.e., 2,404 B. We will round up to 2,496 B (39 × 64 B), so the bitset is a whole number of cache lines.

NOTE: Using k = 7 would shave only ~2–3% space for ~40% more hash work, so k = 5 is a good trade-off.

To distribute elements across the bitvector, you will use the following approach: compute two base hashes with xxHash64 under two different constant seeds, then derive the k bit indices by double hashing, i.e., index_i = (h1 + i·h2) mod m for i in 0..k−1.

The required changes to introduce Bloom filters:

Startup: For each SSTable in the MANIFEST, cache its related Bloom filter in memory. Since each Bloom filter requires only a small amount of space, this optimization has a minimal memory footprint. For example, caching 1,000 Bloom filters of the type you designed requires less than 2.5 MB of memory.

SSTable creation: For each new SSTable you write, initialize an empty bitvector of 2,496 B.
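As a concrete sketch of the sizing arithmetic and the double-hashing index derivation described above (assumptions: the function names are mine, and `hashlib.blake2b` with two salts stands in for xxHash64 with two seeds, since xxHash64 needs a third-party package and any pair of independent 64-bit hashes works for the illustration):

```python
import hashlib
import math

def bloom_size_bits(n: int, p: float, k: int) -> int:
    """Bits needed so that k hash functions over n keys give a
    false-positive rate of p: solve p = (1 - e^(-k*n/m))^k for m."""
    return math.ceil(-k * n / math.log(1 - p ** (1 / k)))

def bloom_indices(key: bytes, k: int, m: int) -> list:
    """Two base hashes, then k bit indices by double hashing:
    index_i = (h1 + i*h2) mod m."""
    h1 = int.from_bytes(hashlib.blake2b(key, digest_size=8, salt=b"seed-1").digest(), "little")
    h2 = int.from_bytes(hashlib.blake2b(key, digest_size=8, salt=b"seed-2").digest(), "little")
    return [(h1 + i * h2) % m for i in range(k)]

m_min = bloom_size_bits(n=1953, p=0.01, k=5)  # just over 19,000 bits (the post rounds to ~19,230)
m = 2496 * 8                                  # rounded up to 39 cache lines (2,496 B)
```

Setting a bit for each index at creation time, and checking all k bits at lookup time, follows directly from these two helpers.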
Build the Bloom filter in memory as you emit the keys (including tombstones): compute the k bit indices based on the key, and for each index, set the bit at that position. When the SSTable is done, persist a sidecar Bloom filter file next to it and update the MANIFEST file. Update the cache containing the Bloom filters.

Compaction: Delete from memory the Bloom filters corresponding to deleted SSTables.

Lookup: Before reading an SSTable, compute the k bit indices based on the key. If all the corresponding bits are set, the key may be present; therefore, proceed with your normal lookup in the SSTable. Otherwise, skip this SSTable.

Now, let’s replace your hashtable with a trie. Each node stores:

- A compressed edge fragment.
- A map keyed by the next character after the fragment, pointing to a child node.
- A state, with the possible values: prefix (the node is just a prefix, no full key ends here), value (a full key exists at this node), or tombstone (this key was explicitly deleted).
- If the state is value, the corresponding value.

Root is a sentinel node with an empty fragment.

Put: Walk from the root, matching the longest common prefix against the key. If there is a partial match in the middle of an edge, split once: create a parent with the common part and two children, the old suffix and the new suffix. Descend via the next child (next unmatched character). At the terminal node: set the value and mark the state as value.

Get: Walk edges by longest-prefix match. If an edge doesn’t match, return not found. At the terminal node: if the state is value, return the value; if it is prefix or tombstone, return not found.

Delete: Walk as in Put. If the path doesn’t fully exist, create the missing suffix nodes so that a terminal node exists. At the terminal node: set the state to tombstone (you may have to clear the value).

Flush process, via an in-order traversal: value nodes emit the key/value pair, tombstone nodes emit a tombstone, and prefix nodes emit nothing.

There are no changes to the client. Run it against the same file ( put-delete.txt ) to validate that your changes are correct. Use per-SSTable random seeds for the Bloom hash functions. Persist them in the Bloom filter files. In Bloom Filters , you introduced blocked Bloom filters, a variant that optimizes spatial locality by dividing the Bloom filter into contiguous blocks, each the size of a cache line.
Restricting each query to a single block ensures all bit lookups stay within the same cache line. Switch to blocked Bloom filters and see the impact on latency and throughput. If you implemented the range-scan operation from week 5 (optional work), wire it to your memtable radix trie. That’s it for this week! You optimized lookups with per-SSTable Bloom filters and switched the memtable to a radix trie, an ordered data structure. Since the beginning of the series, everything you built has been single-threaded, and flush/compaction remains stop-the-world. In two weeks, you will finally tackle the final boss of LSM trees: concurrency. If you want to dive more into tries, Trie Memtables in Cassandra is a paper that explains why Cassandra moved from a skip list + B-tree memtable to a trie, and what it changed for topics such as GC and CPU locality. A popular variant of the radix trie is the Adaptive Radix Tree (ART): it dynamically resizes node types based on the number of children to stay compact and cache-friendly, while supporting fast in-memory lookups, inserts, and deletes. This paper (or this summary ) explores the topic in depth. You should also be aware that tries aren’t the only option for memtables, as other data structures exist. For example, RocksDB relies on a skip list. See this resource for more information. About Bloom filters, some engines keep a Bloom filter not only per SSTable but per data-block range as well. This was the case for RocksDB’s older block-based filter format ( source ). RocksDB later shifted toward partitioned index/filters, which partition the index and full-file filter into smaller blocks with a top-level directory for on-demand loading. The official doc delves into the new approach. Missing direction in your tech career? At The Coder Cafe, we serve timeless concepts with your coffee to help you master the fundamentals. Written by a Google SWE and trusted by thousands of readers, we support your growth as an engineer, one coffee at a time.
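To recap the memtable change, here is a minimal runnable sketch of the radix-trie memtable described above. The names (`RadixMemtable`, `fragment`, the `State` enum, and so on) are my own inventions; the post leaves the exact layout and API to you:

```python
from enum import Enum, auto

class State(Enum):
    PREFIX = auto()     # node is just a shared prefix; no full key ends here
    VALUE = auto()      # a full key ends at this node
    TOMBSTONE = auto()  # this key was explicitly deleted

class Node:
    def __init__(self, fragment=""):
        self.fragment = fragment  # compressed edge fragment
        self.children = {}        # next character -> child Node
        self.state = State.PREFIX
        self.value = None

class RadixMemtable:
    def __init__(self):
        self.root = Node()        # sentinel node with an empty fragment

    def _descend(self, key):
        """Walk to the terminal node for `key`, splitting edges once as needed."""
        node, rest = self.root, key
        while rest:
            child = node.children.get(rest[0])
            if child is None:
                new = Node(rest)
                node.children[rest[0]] = new
                return new
            frag, i = child.fragment, 0
            while i < len(rest) and i < len(frag) and rest[i] == frag[i]:
                i += 1  # longest common prefix of rest and the edge fragment
            if i < len(frag):
                # partial match mid-edge: split into parent (common part) + old suffix
                mid = Node(frag[:i])
                mid.children[frag[i]] = child
                child.fragment = frag[i:]
                node.children[rest[0]] = mid
                child = mid
            node, rest = child, rest[i:]
        return node

    def put(self, key, value):
        node = self._descend(key)
        node.state, node.value = State.VALUE, value

    def delete(self, key):
        node = self._descend(key)  # creates missing suffix nodes if needed
        node.state, node.value = State.TOMBSTONE, None

    def get(self, key):
        node, rest = self.root, key
        while rest:
            child = node.children.get(rest[0])
            if child is None or not rest.startswith(child.fragment):
                return None  # an edge doesn't match: not found
            node, rest = child, rest[len(child.fragment):]
        return node.value if node.state is State.VALUE else None

    def flush(self):
        """In-order traversal: VALUE emits (key, value), TOMBSTONE emits
        (key, None) as a tombstone marker, PREFIX emits nothing."""
        out = []
        def walk(node, prefix):
            key = prefix + node.fragment
            if node.state is State.VALUE:
                out.append((key, node.value))
            elif node.state is State.TOMBSTONE:
                out.append((key, None))
            for ch in sorted(node.children):
                walk(node.children[ch], key)
        walk(self.root, "")
        return out
```

Because children are visited in sorted order and every child key extends its parent's key, `flush()` emits keys in lexicographic order, which is exactly what streaming a sorted L0 SSTable needs.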
❤️ If you enjoyed this post, please hit the like button. I’m sure you are. Week 0: Introduction Week 1: In-Memory Store Week 2: LSM Tree Foundations Week 3: Durability with Write-Ahead Logging Week 4: Deletes, Tombstones, and Compaction Week 5: Leveling and Key-Range Partitioning Week 6: Block-Based SSTables and Indexing Week 7: Bloom Filters and Trie Memtable Over the last few weeks, you refined your LSM tree to introduce leveling. In case of a key miss, the process requires the following steps: Lookup from the memtable. Lookup from all the L0 SSTables. Lookup from one L1 SSTable. Lookup from one L2 SSTable. One that doesn’t consume too much memory to make sure we don’t increase space amplification drastically. One that is fast enough so that a lookup doesn’t introduce too much overhead, especially if we have to check a cache before making any lookup in an SSTable. The element is definitely not in the set (no false negatives). The element may be in the set (false positives are possible). A memtable buffers writes. As it’s the main entry point for writes, a write has to be fast. → OK: a hashtable has average inserts, plus ( : the length of the key) for hashing. For reads, doing a key lookup has to be fast → OK: average lookups, plus to hash. Doing range scanning operations (week 5, optional work), such as: “ Give me the list of keys between bar and foo “ → A hashtable, because it’s not an ordered data structure, is terrible: you end up touching everything so with the number of elements in the hashtable. Flush to L0 → A hashtable isn’t ordered, so it requires sorting all the keys ( ) with n the number of elements) to produce the SSTables. As you can see, we didn’t duplicate the first four letters of to store . In this very example, instead of storing 9 letters for and , we stored only five letters. Yet, you’re not going to implement a “basic” trie for your memtable; instead, you will implement a compressed trie called a radix trie (also known as a patricia 2 trie). 
Back to the previous example, storing one node (one square) has an overhead. It usually means at least one extra field to store the next element, usually a pointer. In the previous example, we needed 11 nodes in total, but what if we could compress the number of nodes required? The idea is to combine nodes with a single child: This new trie stores the exact same information, except it requires 6 nodes instead of 11. That’s what radix tries are about. To summarize the benefits of switching a memtable from a hashtable to a radix trie: Ordered by design: Tries keep keys in order and make prefix/range lookups natural, which helps for and for streaming a sorted flush. No rebalancing/rehashing pauses: The shape doesn’t depend on insertion order, and operations don’t need rebalancing; you avoid periodic rehash work. Prefix compression: A radix trie can cut duplicated key bytes in the memtable, reducing in-memory space. (false-positive rate) = 1% (max elements per SSTable) = 1,953 (hash functions) = 5 Startup: For each SSTable in the MANIFEST, cache its related Bloom filter in memory. Since each Bloom filter requires only a small amount of space, this optimization has a minimal memory footprint. For example, caching 1,000 Bloom filters of the type you designed requires less than 2.5 MB of memory. SSTable creation: For each new SSTable you write, initialize an empty bitvector of 2,496 B. Build the Bloom filter in memory as you emit the keys (including tombstones): Compute based on the key. For each , set bit at position . When the SSTable is done, persist a sidecar file next to it (e.g., and ) and the file. Update the cache containing the Bloom filters. Compaction: Delete from memory the Bloom filters corresponding to deleted SSTables. Lookup: Before reading an SSTable: Compute based on the key. If all the bits of are set: The key may be present, therefore, proceed with your normal lookup in the SSTable. Otherwise: Skip this SSTable. : Compressed edge fragment. 
: A map keyed by the next character after to a node. : An enum with the different possible values: : The node is just a prefix, no full key ends here. : A full key exists at this node. : This key was explicitly deleted. : If is , the corresponding value. : Walk from the root, matching the longest common prefix against . If partial match in the middle of an edge, split once: Create a parent with the common part, two children: the old suffix and the new suffix. Descend via the next child (next unmatched character). At the terminal node: set and : Walk edges by longest-prefix match. If an edge doesn’t match, return not found. At the terminal node: If : return If or , return not found. : Walk as in . If the path doesn’t fully exist, create the missing suffix nodes with so that a terminal node exists. At the terminal node: set (you may have to clear ). In-order traversal: : Emit . : Emit tombstone. : Emit nothing. Dividing the bloom filter into contiguous blocks, each the size of a cache line. Restricting each query to a single block to ensure all bit lookups stay within the same cache line.
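The node layout and operations described above can be sketched as a compact radix trie. This is illustrative Python; the field and method names are my own guesses at the structure described, not the course’s actual API:

```python
TOMBSTONE = object()  # marker for deleted keys; kept until compaction

class _Node:
    __slots__ = ("children", "terminal", "value")
    def __init__(self):
        self.children = {}     # first char of edge label -> (label, _Node)
        self.terminal = False  # a full key ends at this node
        self.value = None      # value, or TOMBSTONE, when terminal

def _common_prefix_len(a, b):
    i, n = 0, min(len(a), len(b))
    while i < n and a[i] == b[i]:
        i += 1
    return i

class RadixTrie:
    def __init__(self):
        self.root = _Node()

    def _insert(self, key, value):
        node = self.root
        while True:
            if not key:
                node.terminal, node.value = True, value
                return
            entry = node.children.get(key[0])
            if entry is None:  # no matching edge: attach a fresh leaf
                leaf = _Node()
                leaf.terminal, leaf.value = True, value
                node.children[key[0]] = (key, leaf)
                return
            label, child = entry
            p = _common_prefix_len(key, label)
            if p == len(label):  # whole edge matched: descend
                node, key = child, key[p:]
                continue
            # Partial match in the middle of an edge: split once.
            mid = _Node()
            node.children[key[0]] = (label[:p], mid)
            mid.children[label[p]] = (label[p:], child)
            node, key = mid, key[p:]

    def put(self, key, value):
        self._insert(key, value)

    def delete(self, key):
        self._insert(key, TOMBSTONE)  # record a tombstone, don't remove

    def get(self, key):
        node = self.root
        while key:
            entry = node.children.get(key[0])
            if entry is None:
                return None
            label, child = entry
            if not key.startswith(label):
                return None
            node, key = child, key[len(label):]
        if node.terminal and node.value is not TOMBSTONE:
            return node.value
        return None  # absent, prefix-only, or deleted

    def scan(self):
        # In-order traversal: yields (key, value) in sorted key order,
        # including tombstones, ready to stream into an SSTable flush.
        def walk(node, prefix):
            if node.terminal:
                yield prefix, node.value
            for first in sorted(node.children):
                label, child = node.children[first]
                yield from walk(child, prefix + label)
        yield from walk(self.root, "")
```

Note how `delete` is just a `put` of a tombstone, and how `scan` falls out of the trie’s ordering for free: no sort is needed before flushing.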


Flexible I/O for Database Management Systems with xNVMe

Flexible I/O for Database Management Systems with xNVMe Emil Houlborg, Simon A. F. Lund, Marcel Weisgut, Tilmann Rabl, Javier González, Vivek Shah, Pınar Tözün CIDR’26 This paper describes xNVMe , a storage library (developed by Samsung), and demonstrates how it can be integrated into DuckDB. Section 2 contains the hard sell for xNVMe. The “x” prefix serves a similar role to the “X” in DirectX. It is fast, while also being portable across operating systems and storage devices. The C API will feel like home for folks who have experience with low-level graphics APIs (no shaders on the disk yet, sorry). There are APIs to open a handle to a device, allocate buffers, and submit NVMe commands (synchronously or asynchronously). Listing 3 has an example, which feels like “Mantle for NVMe”: Source: https://www.cidrdb.org/cidr2026/papers/p6-houlborg.pdf The API works on Linux, FreeBSD, Windows, and macOS. Some operating systems have multiple backends available (e.g., io_uring and libaio on Linux). The point of this paper is that xNVMe is easy to drop into an existing application. The paper describes an implementation of the DuckDB filesystem interface that uses xNVMe. It creates dedicated queues for each DuckDB worker thread to avoid synchronization (similar tricks are used by applications calling graphics APIs in parallel). The paper also describes how the integration supports shiny new NVMe features like Flexible Data Placement (FDP). This allows DuckDB to pass hints to the SSD to colocate buffers with similar lifetimes (which improves garbage collection performance). Most of the results in the paper show comparable performance for the xNVMe-backed filesystem vs the baseline DuckDB filesystem. Fig. 5 shows one benchmark where it yields a significant improvement: Source: https://www.cidrdb.org/cidr2026/papers/p6-houlborg.pdf Dangling Pointers I think the long-term success of xNVMe will depend on governance. Potential members of the ecosystem could be scared off by Samsung’s potential conflict of interest (i.e., will Samsung privilege Samsung SSDs in some way?)
There is a delicate balancing act between an API driven by a sluggish bureaucratic committee, and an API which is dominated by one vendor.

Stratechery Yesterday

An Interview with Bill Gurley About Runnin’ Down a Dream

An interview with long-time (retired) VC Bill Gurley about his new book about building a career you love, Uber, and the modern state of VC.

Brain Baking Yesterday

Managing Multiple Development Ecosystem Installs

In the past year, I occasionally required another Java Development Kit besides the usual one defined in to build certain modules against older versions and certain modules against bleeding edge versions. In the Java world, that’s rather trivial thanks to IntelliJ’s project settings: you can just interactively click through a few panels to install another JDK flavour and get on with your life. The problem starts once you close IntelliJ and want to do some command line work. Luckily, SDKMan , the “Software Development Kit Manager”, has got you covered. Want to temporarily change the Java compiler for the current session? . Want to change the default? . Easy! will point to , a symlink that gets rewired by SDKMan. A Java project still needs a dependency management system such as Gradle, but you don’t need to install a specific global Gradle version. Instead, just points to the jar living at . Want another one? Change the version number in and it’ll be auto-downloaded. Using Maven instead? Tough luck! Just kidding: don’t use but , the Maven Wrapper that works exactly the same. .NET comes with built-in support to change the toolchain (and specify the runtime target), more or less equal to a typical Gradle project. Actually, the command can both build and list its own installed toolchains. Yet installing a new one is done by hand. You switch toolchains by specifying the SDK version in a global.json file and tell the compiler to target a runtime in the file. In Python , the concept of virtual environments should solve that problem: each project creates its own that points to a specific version of Python. Yet I never really enjoyed working with this system: you’ve got , , , , , … That confusing mess is solved with a relatively new kid in town: uv , “An extremely fast Python package and project manager, written in Rust.” It’s more than that: it also manages your multiple development ecosystems. Want to install a new Python distribution? .
Want to temporarily change the Python binary for the current session? . Creating a new project with will also create a virtual environment, meaning you don’t run your stuff with but with that auto-selects the correct version. Lovely! What about JS/TS and Node ? Of course there the options are many: there’s nvm —but that’s been semi-abandoned ?—and of course someone built a Rust-alternative called fnm , but you can also manage Node versions with . I personally don’t care and use Bun instead, which is aimed not at managing but at replacing the Node JS runtime. But who will manage the Bun versions? PHP is more troublesome because it’s tied to a web server. Solutions such as Laravel Herd combine both PHP and web server dependency management into a sleek-looking tool that’s “free”. Of course you can let your OS package manager manage your SDK packages: and then . That definitely feels a bit more hacky. For PHP, I’d even consider Mise. Speaking of which… Why use a tool that limits the scope to one specific development environment? If you’re a full-stack developer you’ll still need to know how to manage both your backend and frontend dev environment. That’s not needed with Mise-en-place , a tool that manages all these things . Asdf is another popular one that manages any development environment that doesn’t have its own dedicated tool. I personally think that’s an abstraction layer too far. You’ll still need to dissect these tools separately in case things go wrong. Some ecosystems come with built-in multi-toolkit support, such as Go : simply installs into your directory 1 . That means you’ve installed the compiler (!) in exactly the same way as any other (global) dependency, how cool is that? The downside of this is that you’ll have to remember to type instead of so there’s no symlink rewiring involved. or can do that—or the above Mise. But wait, I hear you think, why not just use containers to isolate everything?
Spinning up containers to build in an isolated environment: sure, that’s standard practice in continuous integration servers, but locally? Really? Really. Since the inception of Dev Containers by Microsoft, specifically designed for VS Code, working “inside” a container is as easy as opening up the project and “jumping inside the container”. From that moment on, your terminal, IntelliSense, … runs inside that container. That means you won’t have to wrestle Node/PHP versions on your local machine, and you can even use the same container to build your stuff on the CI server. That also means your newly onboarded juniors don’t need to wrestle through a week of “installing stuff”. Microsoft open sourced the Dev Container specification and the JetBrains folks jumped the gun: it has support for but I have yet to try it out. Of course the purpose was to integrate this into GitHub: their cloud-based IDE Codespaces makes heavy use of the idea—and yes, there’s an open-source alternative . Is there Emacs support for Dev Containers? Well, Tramp allows you to remotely open and edit any file, also inside a container . So just install the Dev Container CLI, run it and point Emacs to a source file inside it. From then on, everything Emacs does—including the LSP server, compilation, …—happens inside that container. That means you’ll also have to install your LSP binaries in there. devcontainer.el just wraps compilation commands to execute inside the container whilst still letting you edit everything locally in case you prefer a hybrid approach. And then there’s Nix and devenv . Whatever that does, it goes way over my head! You’ll still have to execute after that.  ↩︎ Related topics: / containers / By Wouter Groeneveld on 26 February 2026.  Reply via email .


curl up 2026

The annual curl users and developers meeting, curl up, takes place May 23-24 2026 in Prague, Czechia. We are in fact returning to the same city and the exact same venue as in 2025. We liked it so much! This is a cozy and friendly event that normally attracts around 20-30 attendees. We gather in a room through a weekend and we talk curl. The agenda is usually set up with a number of talks through the two days, and each talk ends with a follow-up Q&A and discussion session. So no big conference thing, just a bunch of friends around a really large table. Over a weekend. Anyone is welcome to attend – for free – and everyone is encouraged to submit a talk proposal – anything that is curl and Internet transfer related goes. We make an effort to attract and lure the core curl developers and the most active contributors of recent years into the room. We do this by reimbursing their travel and hotel expenses. The agenda is a collaborative effort and we are going to work on putting it together from now all the way until the event, in order to make sure we make the best of the weekend and get to talk about and listen to all the curl related topics we can think of! Help us improve the Agenda in the curl-up wiki: https://github.com/curl/curl-up/wiki/2026 Meeting up in the real world as opposed to doing video meetings helps us get to know each other better, allows us to socialize in ways we otherwise never can do and in the end it helps us work better together – which subsequently helps us write better code and produce better outcomes! It also helps us meet and welcome newcomers and casual contributors. Showing up at curl up is an awesome way to dive into the curl world wholeheartedly and in the deep end. Needless to say this event costs money to run. We pay our top people to come, we pay for the venue and pay for food. We would love to have your company mentioned as top sponsor of the event or perhaps a social dinner on the Saturday? Get in touch and let’s get it done!
Everyone is welcome and encouraged to attend – at no cost. We only ask that you register in advance (the registration is not open yet). We always record all sessions on video and make them available after the fact. You can catch up on previous years’ curl up sessions on the curl website’s video section . We also live-stream all the sessions on curl up during both days. To be found on my twitch channel: curlhacker . Our events are friendly to everyone. We abide by the code of conduct and we have never had anyone come even close to violating it.


Notes on Linear Algebra for Polynomials

We’ll be working with the set P_n(\mathbb{R}) , real polynomials of degree \leq n . Such polynomials can be expressed using n+1 scalar coefficients a_i as follows: The set P_n(\mathbb{R}) , along with addition of polynomials and scalar multiplication, forms a vector space . As a proof, let’s review how the vector space axioms are satisfied. We’ll use p(x) , q(x) and r(x) as arbitrary polynomials from the set P_n(\mathbb{R}) for the demonstration. Similarly, a and b are arbitrary scalars in \mathbb{R} . Associativity of vector addition : This is trivial because addition of polynomials is associative [1] . Commutativity of vector addition : Similarly trivial, for the same reason. Identity element of vector addition : The zero polynomial 0 serves as an identity element. \forall p(x)\in P_n(\mathbb{R}) , we have 0 + p(x) = p(x) . Inverse element of vector addition : For each p(x) , we can use q(x)=-p(x) as the additive inverse, because p(x)+q(x)=0 . Identity element of scalar multiplication : The scalar 1 serves as an identity element for scalar multiplication. For each p(x) , it’s true that 1\cdot p(x)=p(x) . Associativity of scalar multiplication : For any two scalars a and b : Distributivity of scalar multiplication over vector addition : For any p(x) , q(x) and scalar a : Distributivity of scalar multiplication over scalar addition : For any scalars a and b and polynomial p(x) : Since we’ve shown that polynomials in P_n(\mathbb{R}) form a vector space, we can now build additional linear algebraic definitions on top of that. A set of k polynomials p_1(x), \dots, p_k(x) \in P_n(\mathbb{R}) is said to be linearly independent if \sum_{i=1}^{k} a_i p_i(x) = 0 implies a_i=0 \quad \forall i . In words, the only linear combination resulting in the zero vector is when all coefficients are 0. As an example, let’s discuss the fundamental building blocks of polynomials in P_n(\mathbb{R}) : the set \{1, x, x^2, \dots, x^n\} .
These are linearly independent because \sum_{i=0}^{n} a_i x^i = 0 is true only for the zero polynomial, in which all the coefficients a_i=0 . This comes from the very definition of polynomials. Moreover, this set spans the entire P_n(\mathbb{R}) because every polynomial can be (by definition) expressed as a linear combination of \{1, x, x^2, \dots, x^n\} . Since we’ve shown these basic polynomials are linearly independent and span the entire vector space, they are a basis for the space. In fact, this set has a special name: the monomial basis (because a monomial is a polynomial with a single term). Suppose we have some set of polynomials, and we want to know if they form a basis for P_n(\mathbb{R}) . How do we go about it? The idea is using linear algebra the same way we do for any other vector space. Let’s use a concrete example to demonstrate: Is the set Q a basis for P_n(\mathbb{R}) ? We’ll start by checking whether the members of Q are linearly independent. Write: By regrouping, we can turn this into: For this to be true, the coefficient of each monomial has to be zero; mathematically: In matrix form: We know how to solve this, by reducing the matrix into row-echelon form . It’s easy to see that the reduced row-echelon form of this specific matrix is I , the identity matrix. Therefore, this set of equations has a single solution: a_i=0 \quad \forall i [2] . We’ve shown that the set Q is linearly independent. Now let’s show that it spans the space P_n(\mathbb{R}) . We want to analyze: And find the coefficients a_i that satisfy this for any arbitrary \alpha , \beta and \gamma . We proceed just as before, by regrouping on the left side: and equating the coefficient of each power of x separately: If we turn this into matrix form, the matrix of coefficients is exactly the same as before. So we know there’s a single solution, and by rearranging the matrix into I , the solution will appear on the right hand side. It doesn’t matter for the moment what the actual solution is, as long as it exists and is unique.
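For instance, with the hypothetical set \{1,\ 1+x,\ 1+x+x^2\} in P_2(\mathbb{R}) (an illustrative stand-in, not necessarily the Q used above), the regrouping and matrix steps look like this:

```latex
a_0 \cdot 1 + a_1 (1+x) + a_2 (1+x+x^2) = 0
\;\Longrightarrow\;
(a_0 + a_1 + a_2) + (a_1 + a_2)\,x + a_2\,x^2 = 0
\;\Longrightarrow\;
\begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} a_0 \\ a_1 \\ a_2 \end{pmatrix} = \mathbf{0}
```

The coefficient matrix row-reduces to I , so a_i = 0 is the only solution and this particular set is linearly independent.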
We’ve shown that Q spans the space! Since the set Q is linearly independent and spans P_n(\mathbb{R}) , it is a basis for the space. I’ve discussed inner products for functions in the post about Hilbert space . Well, polynomials are functions , so we can define an inner product using integrals as follows [3] : Where the bounds a and b are arbitrary, and could be infinite. Whenever we deal with integrals we worry about convergence; in my post on Hilbert spaces, we only talked about L^2 - the square integrable functions. Most polynomials are not square integrable, however. Therefore, we can restrict this using either: A special weight function w(x) to make sure the inner product integral converges. Finite bounds on the integral, in which case we can just set w(x)=1 . Let’s use the latter, and restrict the bounds into the range [-1,1] , setting w(x)=1 . We have the following inner product: Let’s check that this satisfies the inner product space conditions. Conjugate symmetry : Since real multiplication is commutative, we can write: We deal in the reals here, so we can safely ignore complex conjugation. Linearity in the first argument : Let p_1,p_2,q\in P_n(\mathbb{R}) and a,b\in \mathbb{R} . We want to show that Expand the left-hand side using our definition of inner product: The result is equivalent to a\langle p_1,q\rangle +b\langle p_2,q\rangle . Positive-definiteness : We want to show that for nonzero p\in P_n(\mathbb{R}) , we have \langle p, p\rangle > 0 . First of all, since p(x)^2\geq0 for all x , it’s true that: What about the result 0 though? Well, let’s say that \langle p, p\rangle = 0 . Since p(x)^2 is a non-negative function, this means that the integral of a non-negative function ends up being 0. But p(x) is a polynomial, so it’s continuous , and so is p(x)^2 . If the integral of a continuous non-negative function is 0, it means the function itself is 0. Had it been non-zero in any place, the integral would necessarily have to be positive as well. We’ve proven that \langle p, p\rangle=0 only when p is the zero polynomial. The positive-definiteness condition is satisfied.
In conclusion, P_n(\mathbb{R}) along with the inner product we’ve defined forms an inner product space . Now that we have an inner product, we can define orthogonality on polynomials: two polynomials p,q are orthogonal (w.r.t. our inner product) iff \langle p, q\rangle = 0 . Contrary to expectation [4] , the monomial basis polynomials are not orthogonal using our definition of inner product. For example, calculating the inner product for 1 and x^2 : There are other sets of polynomials that are orthogonal using our inner product. For example, the Legendre polynomials ; but this is a topic for another post.
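Concretely, the computation for 1 and x^2 under the inner product defined above works out as:

```latex
\langle 1, x^2 \rangle
  = \int_{-1}^{1} 1 \cdot x^2 \, dx
  = \left[ \frac{x^3}{3} \right]_{-1}^{1}
  = \frac{1}{3} - \left(-\frac{1}{3}\right)
  = \frac{2}{3} \neq 0
```

Since the result is nonzero, 1 and x^2 are not orthogonal under this inner product.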

Evan Hahn Yesterday

Introducing gzpeek, a tool to parse gzip metadata

In short: gzip streams contain metadata, like the operating system that did the compression. I built a tool to read this metadata. I love reading specifications for file formats. They always have little surprises. I had assumed that the gzip format was strictly used for compression. My guess was: a few bytes of bookkeeping, the compressed data, and maybe a checksum. But then I read the spec . The gzip header holds more than I expected! In addition to two bytes identifying the data as gzip, there’s also: The operating system that did the compression. This was super surprising to me! There’s a single byte that identifies the compressor’s OS: 0 for Windows (the spec calls it “FAT filesystem”), 1 for the Amiga, 3 for Unix, and many others I’d never heard of. Compressors can also set 255 for an “unknown” OS. Different tools set this value differently. zlib, the most popular gzip library, changes the flag based on the operating system . (It even defines some OSes that aren’t in the spec, like for BeOS.) Many other libraries build atop zlib and inherit this behavior, such as .NET’s , Ruby’s , and PHP’s . Java’s , JavaScript’s , and Go’s set the OS to “unknown” regardless of operating system. Some, like Zopfli and Apache’s , hard-code it to “Unix” no matter what. All that to say: in practice, you can’t rely on this flag to determine the source OS, but it can give you a hint. Modification time for the data. This can be the time that compression started or the modification time of the file. It can also be set to 0 if you don’t want to communicate a time. This is represented as an unsigned 32-bit integer in the Unix format. That means it can represent any moment between January 1, 1970 and February 7, 2106. I hope we devise a better compression format in the next ~80 years, because we can only represent dates in that range. In my testing, many implementations set this to 0 . A few set it to the current time or the file’s modification time—the command is one of these.
FTEXT , a boolean flag vaguely indicating that the data is “probably ASCII text”. When I say vaguely, I mean it: the spec “deliberately [does] not specify the algorithm used to set this”. This is apparently for systems which have different storage formats for ASCII and binary data. In all my testing, nobody sets this flag to anything but 0 . An extra flag indicating how hard the compressor worked. 2 signals that it was compressed with max compression (e.g., ), 4 for the fastest algorithm, and 0 for everything else. In practice, zlib and many others set this correctly per the spec, but some tools hard-code it to . And as far as I can tell, this byte is not used during decompression, so it doesn’t really matter. The original file name . For example, when I run , the name is set to . This field is optional, so many tools don’t set it, but the command line tool does. You can disable that with . A comment . This optional field is seldom used, and many decompressors ignore it. But you could add a little comment if you want. Extra arbitrary data . If the other metadata wasn’t enough, you can stuff whatever you want into arbitrary subfields. Each subfield has a two-byte identifier and then 0 or more bytes of additional info. That’s way more info than I expected! I was intrigued by this metadata and I’ve been wanting to learn Zig , so I wrote gzpeek . gzpeek is a command-line tool that lets you inspect the metadata of gzip streams. Here’s how to read metadata from a gzipped file: It extracts everything I listed above: the operating system, original file name, modification time, and more. I used it a bunch when surveying different gzip implementations. Give it a try, and let me know what gzip metadata you find.
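The fixed part of the header described above is easy to poke at directly. Here is a small Python sketch (not gzpeek itself, just an illustration of the RFC 1952 layout) that decodes the 10-byte fixed header:

```python
import gzip
import struct

# RFC 1952 fixed header: magic (2 B), method, flags, mtime (4 B), xfl, os.
OS_NAMES = {0: "FAT/Windows", 1: "Amiga", 3: "Unix", 255: "unknown"}

def peek_gzip_header(data: bytes):
    if data[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    method, flags = data[2], data[3]
    (mtime,) = struct.unpack("<I", data[4:8])  # little-endian Unix time
    xfl, os_byte = data[8], data[9]
    return {
        "method": method,             # 8 = DEFLATE, the only common value
        "ftext": bool(flags & 0x01),  # FTEXT: "probably ASCII text"
        "has_name": bool(flags & 0x08),  # FNAME: original file name follows
        "mtime": mtime,               # 0 = no timestamp available
        "xfl": xfl,                   # 2 = max compression, 4 = fastest
        "os": OS_NAMES.get(os_byte, f"other ({os_byte})"),
    }

# Python's own gzip module writes OS byte 255 ("unknown"):
print(peek_gzip_header(gzip.compress(b"hello", mtime=0)))
```

Note that the optional fields (FNAME, FCOMMENT, FEXTRA) come after these 10 bytes and are length-prefixed or NUL-terminated, so a full parser like gzpeek has a bit more work to do.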

Chris Coyier 2 days ago

Tucci Pan Review

Stanley Tucci has a set of cookware named after him that GreenPan sells. I’ve got these two pans: I forget where they came from exactly, some silent auction or something, but I unboxed and started using them about 8 months ago. I was so hyped the first few months! It’s my daily-driver pan. I’d say it’s used once a day, on average. Then it loses its luster after a while. I could scrub the bottom, but I just don’t care about that. The inside was more concerning. I hit up their customer support, as it’s not just the aesthetics that were dimming here; the pan really seems maybe half as nicely non-stick as it was 8 months ago, and cleaning it with non-abrasive techniques takes much longer. Fill the pan halfway with water and bring it to a simmer for about 2 minutes. Pour out the water and place the pan on a safe sturdy surface. Carefully use a Melamine sponge (Mr. Clean Magic Eraser, our Restoring Sponge or any melamine sponge) and a little plain water on the warm surface to wipe away the food or stuck-on oil. This should do the trick. Fair enough: that technique worked well to remove what they called “a layer of carbonized oil”. I got it entirely clean with a bit of elbow grease. I’d say the pan performs 10% better after that. But it ain’t back to its former glory. I highly suspect at the one-year mark the pan is basically gonna be toast. So my review is: it’s an incredible pan for 6 months and a so-so pan for 6 months, then you’re done. There is some kind of coating, and it’s way better than average, but it’s just not a forever thing. If you can stomach a few hundred bucks a year to replace it, go for it. Me, I’ve got some research to do on what to replace it with because I think I want a little longer longevity. And yes, I’ve got a well-seasoned cast-iron pan I’ve used most of my life. That’s fine, but I wanna try other things. Specifically, less-honkin’ pans that are easier to handle.
Ultra extremely non-stick: washing them with a soft sponge is nearly effortless because of how non-stick they are. Feels good, like I’m taking care of it correctly. The edges of the pan, with the steep angles, are perfect for that cool chef move where you toss/flip stuff in the pan with a wrist movement.

Simon Willison 2 days ago

I vibe coded my dream macOS presentation app

I gave a talk this weekend at Social Science FOO Camp in Mountain View. The event was a classic unconference format where anyone could present a talk without needing to propose it in advance. I grabbed a slot for a talk I titled "The State of LLMs, February 2026 edition", subtitle "It's all changed since November!". I vibe coded a custom macOS app for the presentation the night before. I've written about the last twelve months of development in LLMs in December 2023 , December 2024 and December 2025 . I also presented The last six months in LLMs, illustrated by pelicans on bicycles at the AI Engineer World’s Fair in June 2025. This was my first time dropping the time covered to just three months, which neatly illustrates how much the space keeps accelerating and felt appropriate given the November 2025 inflection point . (I further illustrated this acceleration by wearing a Gemini 3 sweater to the talk, which I was given a couple of weeks ago and is already out-of-date thanks to Gemini 3.1 .) I always like to have at least one gimmick in any talk I give, based on the STAR moment principle I learned at Stanford - include Something They'll Always Remember to try and help your talk stand out. For this talk I had two gimmicks. I built the first part of the talk around coding agent assisted data analysis of Kākāpō breeding season (which meant I got to show off my mug ), then did a quick tour of some new pelicans riding bicycles before ending with the reveal that the entire presentation had been presented using a new macOS app I had vibe coded in ~45 minutes the night before the talk. The app is called Present - literally the first name I thought of. It's built using Swift and SwiftUI and weighs in at 355KB, or 76KB compressed . Swift apps are tiny! It may have been quick to build but the combined set of features is something I've wanted for years . I usually use Keynote for presentations, but sometimes I like to mix things up by presenting using a sequence of web pages. 
I do this by loading up a browser window with a tab for each page, then clicking through those tabs in turn while I talk. This works great, but comes with a very scary disadvantage: if the browser crashes I've just lost my entire deck! I always have the URLs in a notes file, so I can click back to that and launch them all manually if I need to, but it's not something I'd like to deal with in the middle of a talk. This was my starting prompt : Build a SwiftUI app for giving presentations where every slide is a URL. The app starts as a window with a webview on the right and a UI on the left for adding, removing and reordering the sequence of URLs. Then you click Play in a menu and the app goes full screen and the left and right keys switch between URLs That produced a plan. You can see the transcript that implemented that plan here . In Present a talk is an ordered sequence of URLs, with a sidebar UI for adding, removing and reordering those URLs. That's the entirety of the editing experience. When you select the "Play" option in the menu (or hit Cmd+Shift+P) the app switches to full screen mode. Left and right arrow keys navigate back and forth, and you can bump the font size up and down or scroll the page if you need to. Hit Escape when you're done. Crucially, Present saves your URLs automatically any time you make a change. If the app crashes you can start it back up again and restore your presentation state. You can also save presentations as a file (literally a newline-delimited sequence of URLs) and load them back up again later. Getting the initial app working took so little time that I decided to get more ambitious. It's neat having a remote control for a presentation... 
So I prompted: Add a web server which listens on 0.0.0.0:9123 - the web server serves a single mobile-friendly page with prominent left and right buttons - clicking those buttons switches the slide left and right - there is also a button to start presentation mode or stop depending on the mode it is in. I have Tailscale on my laptop and my phone, which means I don't have to worry about Wi-Fi networks blocking access between the two devices. My phone can access directly from anywhere in the world and control the presentation running on my laptop. It took a few more iterative prompts to get to the final interface, which looked like this: There's a slide indicator at the top, prev and next buttons, a nice big "Start" button and buttons for adjusting the font size. The most complex feature is that thin bar next to the start button. That's a touch-enabled scroll bar - you can slide your finger up and down on it to scroll the currently visible web page up and down on the screen. It's very clunky but it works just well enough to solve the problem of a page loading with most interesting content below the fold. I'd already pushed the code to GitHub (with a big "This app was vibe coded [...] I make no promises other than it worked on my machine!" disclaimer) when I realized I should probably take a look at the code. I used this as an opportunity to document a recent pattern I've been using: asking the model to present a linear walkthrough of the entire codebase. Here's the resulting Linear walkthroughs pattern in my ongoing Agentic Engineering Patterns guide , including the prompt I used. The resulting walkthrough document is genuinely useful. It turns out Claude Code decided to implement the web server for the remote control feature using socket programming without a library ! Here's the minimal HTTP parser it used for routing: Using GET requests for state changes like that opens up some fun CSRF vulnerabilities. For this particular application I don't really care. 
Vibe coding stories like this are ten a penny these days. I think this one is worth sharing for a few reasons:

- This solved a real problem for me. I've always wanted a good way to serve a presentation as a sequence of pages, and now I have exactly that.
- Swift, a language I don't know, was absolutely the right choice here. I wanted a full screen app that embedded web content and could be controlled over the network. Swift had everything I needed.
- When I finally did look at the code it was simple, straightforward and did exactly what I needed and not an inch more.

This doesn't mean native Mac developers are obsolete. I still used a whole bunch of my own accumulated technical knowledge (and the fact that I'd already installed Xcode and the like) to get this result, and someone who knew what they were doing could have built a far better solution in the same amount of time.

It's a neat illustration of how those of us with software engineering experience can expand our horizons in fun and interesting directions. I'm no longer afraid of Swift! Next time I need a small, personal macOS app I know that it's achievable with our existing set of tools.

Kev Quirk 2 days ago

Introducing Pure Comments (and Pure Commons)

A few weeks ago I introduced Pure Blog, a simple PHP based blogging platform that I've since moved to, and I'm very happy with it. Once Pure Blog was done, I shifted my focus to improving my commenting system. I ended that post by saying:

At this point it's battle tested and working great. However, there's still some rough edges in the code, and security could definitely be improved. So over the next few weeks I'll be doing that, at which point I'll probably release it to the public so you too can have comments on your blog, if you want them.

I've now finished that work and I'm ready to release Pure Comments to the world. 🎉

I'm really happy with how Pure Comments has turned out; it slots in perfectly with Pure Blog, which got me thinking about creating a broader suite of apps under the Pure umbrella. I've had Simple.css since 2022, and now I've added Pure Blog and Pure Comments to the fold. So I decided I needed an umbrella to house these disparate projects. That's where Pure Commons comes in. My vision for Pure Commons is to build it into a suite of simple, privacy focussed tools that are easy to self-host, and have just what you need and no more.

Concurrent to working on Pure Comments, I've also started building a fully managed version that people will be able to use for a small monthly fee. That's about 60% done at this point, so I should be releasing that over the next few weeks. In the future I plan to add a managed version of Pure Blog too, but that will be far more complex than a managed version of Pure Comments, so I think that will take some time.

I'm also looking at creating Pure Guestbook, which will obviously be a simple, self-hosted guestbook in the same vein as the other Pure apps. This should be relatively simple to build, as a guestbook is basically a simplified commenting system, so most of the code already exists in Pure Comments. Looking beyond Pure Guestbook I have some other ideas, but you will have to wait and see...
In the meantime, please take a look at Pure Comments - download the source code, take it for a spin, and provide any feedback/bugs you find. If you have any ideas for apps I could add to the Pure Commons family, please get in touch. Thanks for reading this post via RSS. RSS is ace, and so are you. ❤️ You can reply to this post by email, or leave a comment.

Martin Fowler 2 days ago

Fragments: February 25

I don't tend to post links to videos here, as I can't stand watching videos to learn about things. But some talks are worth a watch, and I do suggest this overview on how organizations are currently using AI by Laura Tacho. There are various nuggets of data from her work with DX:

- 92.6% of devs are using AI assistants
- devs reckon it's saving them 4 hours per week
- 27% of code is written by AI without significant human intervention
- AI cuts onboarding time by half

These are interesting numbers, but most of them are averages, and those who know me know I teach people to be suspicious of averages. Laura knows this too:

average doesn't mean typical.. there is no typical experience with AI

Different companies (and teams within companies) are having very different experiences. Often AI is an amplifier to an organization's practices, for good or ill.

Organizational performance is multidimensional, and these organizations are just going off into different extremes based on what they were doing before. AI is an accelerator, it's a multiplier, and it is moving organizations off in different directions. (08:52)

Some organizations are facing twice as many customer incidents, but others are facing half.

❄                ❄                ❄                ❄                ❄

Rachel Laycock (Thoughtworks CTO) shares her reflections on our recent Future of Software Engineering retreat in Utah. The topics she covers include:

- We need to address cognitive load
- The staff engineer role is changing
- What happens to code reviews?
- Agent Topologies
- What exactly does AI mean for programming languages?
- Self-healing systems

On the latter:

One of the most interesting and perhaps immediately applicable ideas was the concept of an 'agent subconscious', in which agents are informed by a comprehensive knowledge graph of post mortems and incident data.

This particularly excites me because I've seen many production issues solved by the latent knowledge of those in leadership positions. The constant challenge comes from what happens when those people aren't available or involved.

❄                ❄                ❄                ❄                ❄

Simon Willison (one of my most reliable sources for information about LLMs and programming) is starting a series of Agentic Engineering Patterns:

I think of vibe coding using its original definition of coding where you pay no attention to the code at all, which today is often associated with non-programmers using LLMs to write code. Agentic Engineering represents the other end of the scale: professional software engineers using coding agents to improve and accelerate their work by amplifying their existing expertise.

He's intending this to be closer to evergreen material, as opposed to the day-to-day writing he does (extremely well) on his blog. One of the first patterns is Red/Green TDD:

This turns out to be a fantastic fit for coding agents. A significant risk with coding agents is that they might write code that doesn't work, or build code that is unnecessary and never gets used, or both. Test-first development helps protect against both of these common mistakes, and also ensures a robust automated test suite that protects against future regressions.

❄                ❄                ❄                ❄                ❄

Aaron Erickson is one of those technologists with good judgment who I listen to a lot:

As much fun as people are having with OpenClaw, I think the days of "here is my agent with access to all my stuff" are numbered. Fine scoped agents who can read email and cleanse it before it reaches the agentic OODA loop that acts on it, policy agents (a claw with a job called "VP of NO" to money being spent)

You structure your agents like you would a company. Insert friction where you want decisions to be slow and the cost of being wrong is high, reduce friction where you want decisions to be fast and the cost of being wrong is trivial or zero.

I've posted here a lot about security concerns with agents. Right now I think this notion of fine-scoped agents is the most promising direction.

Last year Korny Sietsma wrote about how to mitigate agentic AI security risks. His advice included splitting the tasks, so that no agent has access to all parts of the Lethal Trifecta:

This approach is an application of a more general security habit: follow the Principle of Least Privilege. Splitting the work, and giving each sub-task a minimum of privilege, reduces the scope for a rogue LLM to cause problems, just as we would do when working with corruptible humans.

This is not only more secure, it is also increasingly a way people are encouraged to work. It's too big a topic to cover here, but it's a good idea to split LLM work into small stages, as the LLM works much better when its context isn't too big. Dividing your tasks into "Think, Research, Plan, Act" keeps context down, especially if "Act" can be chunked into a number of small independent and testable chunks.

❄                ❄                ❄                ❄                ❄

Doonesbury outlines the opportunity for aging writers like myself. (Currently I'm still writing my words the old fashioned way.)

❄                ❄                ❄                ❄                ❄

An interesting story someone told me. They were at a swimming pool with their child; she looked at a photo on a poster advertising an event there and said "that's AI". Initially the parents didn't think it was, but looking carefully spotted a tell-tale six fingers. They concluded that fresher biological neural networks are being trained to quickly recognize AI.

❄                ❄                ❄                ❄                ❄

I carefully curate my social media streams, following only feeds where I can control whose posts are picked up. In times gone by, editors of newspapers and magazines would do a similar job. But many users of social media are faced with a tsunami of stuff, much of it ugly, and don't have the tools to control it.

A few days ago I saw an Instagram reel of a young woman talking about how she had been raped six years ago, struggled with thoughts of suicide afterwards, but managed to rebuild her life again. Among the comments – the majority of which were from men – were things like "Well at least you had some", "No way, she's unrapeable", "Hope you didn't talk this much when it happened", "Bro could have picked a better option." Reading those comments, which had thousands of likes and many boys agreeing with them, made me feel sick.

My tendencies are to free speech, and I try not to be a Free Speech Poseur, but the deluge of ugly material on the internet isn't getting any better. The people running these platforms seem to be "tackling" this problem by putting their heads in the sand and hoping it won't hurt them. It is hurting their users.
