Posts in Java (20 found)
マリウス 4 days ago

KTT x 80Retros GAME 1989 Orange

I picked up the KTT x 80Retros GAME 1989 Orange switches a while ago at Funkeys, a physical brick-and-mortar mechanical keyboard store in Yongsan-gu, Seoul, and it's my first linear switch. Given its surprisingly cheap price, I really didn't expect much from it, to be honest. KTT is a name people normally associate with budget options, like Peaches, Sea Salts, and Strawberries. They're the kind of switches that show up in beginner build guides, and they're generally good stuff, but not really the kind of thing that made me stop and think about what I was typing on. However, the GAME 1989 Orange changed that perception for me, and it did it in a way I genuinely didn't see coming.

But before we get into the switch itself, we need to talk about the vibe, because the vibe is half the story here. 80Retros is a relatively young brand out of China that debuted on ZFrontier around December 2023 with an interest check for their GAME 1989 cherry-profile PBT keycap set inspired by the original Game Boy. They describe themselves as lovers of all things vintage and retro, and unlike a lot of brands that slap "retro" on things as a marketing afterthought, they actually seem to mean it. What's remarkable is how fast they've moved since then. Within a few years, they went from a single keycap IC to pushing out nearly a dozen different switches across two separate manufacturers (KTT and HMX), along with matching keycap sets in multiple colorways.

The G.O.A.T. of switch reviews himself, ThereminGoat, covered this in detail in his HMX Volume 0-T review, and the GAME timeline is pretty interesting: the original HMX-manufactured GAME 1989 switches came first, followed by what he calls the "Film Trio" (the KD200, FJ400, and GAME 1989 Classic), all packaged in these absolutely gorgeous film-canister-inspired containers that look like oversized Kodak rolls. The film canister thing started as a nod to the KD200 and FJ400 being camera-brand-inspired, but the community loved the packaging so much that 80Retros seemingly just kept using it for everything, even for switches that have nothing to do with photography. The KTT-manufactured GAME 1989 Orange and Red are the newer entries in this expanding catalogue, released as part of an "Expanded Film Series" in early 2025 alongside a Silent White variant and an HMX XMAS switch. So we're looking at a brand that is absolutely not slowing down.

On paper, a PC top and PA66 bottom is a pretty classic material combo; KTT has used variations of this pairing for years. What makes this switch interesting is the KT2 stem made out of their proprietary UPE blend. UPE (ultra-high molecular weight polyethylene) is a material that's been showing up more and more in the switch world, but it's one of those things where the specific manufacturer's blend matters enormously. Keygeek's U4, for example, sounds glassy and solid. KTT's KT2 is drier, a bit foamy, and (this is the part I didn't expect) it brings an audible character that I can only describe as "marble-y". It's not soft, but it's not hard either. It sits in this interesting middle ground.

At 4mm travel with a pole bottom-out, the switch is technically a long-pole linear, but the full travel distance means it doesn't feel like one in the snappy, sharp way that most long-poles do. The pole bottom-out is there, but it's mellowed out by the travel length and the stem material. More on that later. Stock smoothness is good, and I mean genuinely good.
Probably not HMX-tier buttery, and probably not the absolute smoothest thing I've tried in recent years, but there's a quality to the travel that feels deliberate and controlled. The factory lube is present but light: a thin coating on the bottom housing railings, some on the stem legs and leaf, and the springs seem lightly done too. There is a texture to the keystroke, and some people might call it scratch; I'm not sure that would be fair, though it's not entirely wrong either. UPE blends can be unpredictable when paired with other housing materials. Sometimes you get something silky, sometimes you get audible friction. The KT2 blend with this PC/PA66 housing produces a slight tactile grain in the travel that I genuinely enjoy. It's subtle enough that you won't notice it at normal typing speed, but if you slow-press a single key at ear level, it's there.

Spring-wise, 40g actuation bottoming out at around 50g is on the lighter side, especially for me and my usual Frankenswitches. I wouldn't call it featherweight, but if you tend to bottom out hard, you'll definitely hit the end of the stroke with minimal effort. The springs are clean, without noticeable ping in my set. The factory lube on the springs seems to do its job. One thing to note is that there's reportedly about a 3g variance between individual switches. I couldn't verify that precisely, but I did notice the occasional key that felt marginally different. Not a dealbreaker for me, but if you're the kind of person who weighs every spring in a batch, keep it in mind.

As for wobble, it is present. There's some slight vertical (north-south) wobble and maybe a touch of east-west if you go looking for it. This seems to be a known trade-off with KTT's newer molds. Their older switches like the Hyacinths seemingly had incredibly tight tolerances, but those molds are from a different era. KTT has been retooling to accommodate new materials like their KT2 and KT3 blends, and the fit isn't quite as snug as the old stuff. As for films, they probably do help tighten up the housings, and I've read that filming the switches apparently also compresses the sound profile slightly. Personally, the wobble doesn't bother me too much.

The sound profile is where the GAME 1989 Orange gets genuinely interesting, because the sound profile is busy, and I mean that in a good way. The bottom-out is lower-pitched than you'd typically expect from a PC-topped switch. The PA66 bottom housing and the KT2 stem material seemingly pull the tone down into a territory that's thocky without being mushy. There's a definite pop to the keystroke, and the bottom-out has weight to it. The top-out (the return stroke) is a touch brighter, creating a slight tonal contrast between the downstroke and upstroke that gives the switch a lot of auditory dimension. There's a lot happening acoustically at any given keystroke, and none of it sounds muddied or confused. The "marble-y" quality I mentioned earlier really comes through in the sound. It's not a wet, lubed sound, but a relatively dry and more textured one, with a character that feels… natural, for lack of better words. The slight scratch in the travel actually adds to the sound profile rather than detracting from it. The initial contact, the pole hitting bottom, the spring compression, and the return all remain distinct from each other, yet layered. Volume-wise, it's moderate. Definitely not silent, but also not exactly loud.
Slightly quieter than your average long-pole, which makes sense given the full 4mm travel and the way the KT2 material absorbs some of the impact energy. I haven't yet tested them on any of my aluminium builds, but at least on the few keyboards Funkeys had these switches on, as well as on my Kunai, I find that the sound profile works beautifully. That said, these switches are definitely less ideal for quiet or public environments, like open-plan offices and cafes.

The switches come factory lubed and they work just fine stock. I'd personally resist the urge to lube them further unless you specifically want to kill the audible scratch, which I think is part of the charm. If you do lube, know that you're trading character for smoothness, and these are already reasonably smooth to begin with. They accept films, and filming them does seem to tighten the sound slightly: less resonance in the housing, a more compressed signature. Depending on your build and plate material, that might be exactly what you want or exactly what you don't. Try a few with and without before committing.

As for the packaging, if you buy the 35-switch sets, they come in those aforementioned film canister containers. It's genuinely lovely, a nice touch that makes the whole experience feel considered. Not something I'd pay extra for, but it's a detail that matters for the overall product identity. One thing to note is that the canisters open very easily. I wouldn't walk around holding them upside down unless I wanted to play find the 35 switches hidden underneath the furniture.

The KTT x 80Retros GAME 1989 Orange surprised me. It's a switch that trades ultra-polished, frictionless perfection for a dry, textured, slightly scratchy keystroke that somehow comes together into a sound profile that's warm, full, and more complex than it has any right to be at this price point. It's not perfect. The wobble is there, and the housing tolerances aren't as tight as the best in the business. But it doesn't feel like every other linear on the market, at least not like the ones I've had the chance to try over the past years. It has character, which, in a hobby that's increasingly crowded with technically excellent but personality-free switches, has its charm. If you want the smoothest linear available, look elsewhere. If you want something that sounds interesting, feels engaging, and comes wrapped in an homage to a long-gone era, give the 1989 Orange a shot. I'm genuinely glad I did.

Disclaimer: I'm not a switch scientist. I don't own a force curve rig, I can't tell you the exact durometer of the KT2 blend, and my ears are probably not calibrated to the standards of someone like ThereminGoat. This review is based on my personal experience typing on these switches across a few different boards and ultimately actively using them on my primary keyboard. Your mileage may vary based on your plate material, case, keycaps, and other factors. Take everything here as one person's experience and use it as a starting point for your own.

0 views
Takuya Matsuyama 1 week ago

What a Japanese cooking principle taught me about overcoming AI fatigue

Hey, what's up? It's Takuya. I've been thinking about how to live well in this AI era as a developer, content creator, and artist. By "living well," I mean enjoying the act of creating while maintaining good mental health. I imagine many of my readers are also wondering how to survive — and even thrive — amid the rapid changes brought by recent AI advancements. I don't have all the answers. No one can predict the future precisely. But I believe it's more a matter of direction than strategy — rather than trying to build some kind of moat around your life as a software-oriented artist. Where do you want to go? What do you want to see? That's what matters. Strategies and moats can be changed flexibly as the situation evolves.

In this article, I'd like to explore a few life principles from Japanese culture. I recently read a book called "一汁一菜でよいという提案" (The Proposal for One Soup, One Dish) by traditional food expert Yoshiharu Doi (土井善晴), and found it very interesting for keeping the pace of your life healthy.

We are currently in the midst of "AI fatigue." New services emerge every day, and big company releases change workflows weekly. Chasing every hype cycle doesn't make us more secure; it just fogs up the crucial skills we need to foster. It's easy to fall into the trap of trying too hard to keep up, only to end up exhausted. Traditional Japanese culture offers a different perspective — one that helps us stay grounded and resilient in the face of uncertainty. Let's dive into it.

Doi-san's book emphasizes that by stripping away the unnecessary, we find our "comfortable place." As he puts it:

It's easy to get your wishlist or to-do list flooded if you don't have any clear rules, policy, or direction in your life. Let's think about it. For example, on social media, the algorithms try to grab and hold your attention as long as possible by displaying so much gossip and drama depending on your interests. But you have full control over not willingly seeing them. I'm always careful not to spend time tracking gossip or drama on the internet. It lets me keep calm and avoid comparing myself with others, which would only make me feel miserable. To form a rhythm of life, you have to decide what NOT to do in your daily life.

In terms of cooking, Doi-san proposes a system called "One-soup One-dish" (一汁一菜), which has helped a lot of people who cook for their family every day. This is a style of meal centered around rice, with one soup and one side dish. Eating is an essential part of living. Yet, keeping it simple is surprisingly hard. There are endless food products, recipes, and health tips — and TV shows constantly push us to make beautiful, varied meals for our families every day. Oh, it looks very similar to today's tech industry, doesn't it? People are unconsciously exhausted by all of this, feeling as if they have to create something great every single day. Doi-san saw through this — and freed them by saying:

Software developers can't live without software. It is literally an essential part of our lives, and keeping it simple is surprisingly hard. Let's learn from the philosophy behind his principle. Why does cooking matter so much? Because it is something you do every day, which makes you or someone you love feel really comfortable, as he puts it:

I was deeply moved by this. I cook for my family every evening. It helps me shift from work to home — a transition I don't get naturally, since I don't have a commute.
My 4-year-old daughter sometimes says, "I can smell something good." It makes me happy, too. What's important is having something you do every day that makes you feel safe, comfortable, and happy. It could be anything, such as playing an instrument, going for a walk in the morning, painting, singing, swimming, reading before bed, or meditating. It should be something you never get bored with. Something you've truly enjoyed in the past. Something that doesn't make you compare yourself to others, but instead helps you be mindful. If you don't have it yet, step away from your computer and go outside to experience new things. I feel like it's important that this habit doesn't involve a screen, as he suggests:

What matters is finding something you can return to every day — something you never tire of, like rice and miso soup, rather than something instantly stimulating but quickly exhausting, like Netflix or doom-scrolling on social media. The more I adapt to algorithms and AI, the more I value organic connections – both with people and with ideas. In his book, Doi-san explains that the things we never tire of are often the things humans didn't "engineer" to be perfect:

This concept of "not a human feat" (or rather, not a calculated feat) is exactly what's missing from our digital lives. Algorithms are the "processed seasoning" of the internet — designed to give you an instant hit of dopamine, but leaving you feeling empty and "tired of the flavor" an hour later. I've realized that my most resilient moments don't come from a perfectly optimized prompt or a viral post. They come from the "fermentation" of daily life — the slow, messy, unscripted interactions that haven't been optimized for engagement. For example, when I have a quick, casual chat with the barista at Starbucks, or when I'm swapping stories with other parents (mama-tomo) while dropping my daughter off at kindergarten, I feel like I'm participating in a natural rhythm, not an algorithm. Small moments like these give me a real sense that I belong to society and am truly living in it.

My best ideas work the same way. They rarely strike when I'm glaring at a screen, trying to force a breakthrough. Instead, they "descend" upon me when I'm out for a walk or simply staring blankly at the scenery. It feels less like distillation — which focuses on speed, purity, and efficiency — and more like fermentation. It's about creating the right environment and then letting the subconscious work its magic over time. You can't rush miso, and you can't rush a truly original thought.

Each season brings a variety of foods to enjoy, and Doi-san emphasizes the importance of appreciating them:

In Japan, we celebrate the arrival of the first bonito or the last of the winter cabbage. It's enjoyable to appreciate these changes, but notice one thing: people don't "chase" them. You don't feel like a failure if you missed the peak of cherry blossom season; you simply look forward to the next cycle. Yet, in the tech industry, we treat trends like a race we are constantly losing. Instead of trying so hard to "stay in the loop," why not view new technologies as seasonal arrivals? You don't have to master every single one. You are living in "Technical Nature" just as much as you are living in real nature. If a new AI model drops, it's like the first bamboo shoots of spring — interesting, worth trying, but not something to stress over. You can learn anything when it becomes necessary for your craft.
By shifting from "chasing" to "appreciating," you replace FOMO with curiosity. When we stop being obsessed with "catching up" and start allowing ourselves the "leisure time" mentioned earlier, something vital happens: playfulness is born. True creativity doesn't come from a place of survival or anxiety. It comes from playing with the tools available to us, much like a chef plays with the ingredients of the season. For a developer, this might be the optimal way of life. Don't just be a user of technology; be someone who resonates with its constant birth and decay, using that rhythm to create something that feels truly alive. I experienced burnout last year . The philosophy of one soup, one dish has helped me step back from that. It reminds me to strip away the noise and return to what really matters. Relax! You’re not as bad as you think. Trust your instincts. Listen to your body. And let’s enjoy this rare moment of change we’re living through :)

1 view
Max Bernstein 1 week ago

Value numbering

Welcome back to compiler land. Today we're going to talk about value numbering, which is like SSA, but more.

Static single assignment (SSA) gives names to values: every expression has a name, and each name corresponds to exactly one expression. It transforms programs like this: where the variable is assigned more than once in the program text, into programs like this: where each assignment to has been replaced with an assignment to a new fresh name. It's great because it makes clear the differences between the two expressions. Though they textually look similar, they compute different values. The first computes 1 and the second computes 2. In this example, it is not possible to substitute in a variable and re-use the value of , because the s are different.

But what if we see two "textually" identical instructions in SSA? That sounds much more promising than non-SSA because the transformation into SSA form has removed (much of) the statefulness of it all. When can we re-use the result? Identifying instructions that are known at compile-time to always produce the same value at run-time is called value numbering.

To understand value numbering, let's extend the above IR snippet with two more instructions, v3 and v4. In this new snippet, v3 looks the same as v1: adding v0 and 1. Assuming our addition operation is some ideal mathematical addition, we can absolutely re-use v1; no need to compute the addition again. We can rewrite the IR to something like:

This is kind of similar to the destructive union-find representation that JavaScriptCore and a couple of other compilers use, where the optimizer doesn't eagerly re-write all uses but instead leaves a little breadcrumb / instruction 1. We could then run our copy propagation pass ("union-find cleanup"?) and get:

Great. But how does this happen? How does an optimizer identify reusable instruction candidates that are "textually identical"? Generally, there is no actual text in the IR. One popular solution is to compute a hash of each instruction. Then any instructions with the same hash (that also compare equal, in case of collisions) are considered equivalent. This is called hash-consing.

When trying to figure all this out, I read through a couple of different implementations. I particularly like the Maxine VM implementation. For example, here is the (hashing) and functions for most binary operations, slightly modified for clarity:

The rest of the value numbering implementation assumes that if a function returns 0, it does not wish to be considered for value numbering. Why might an instruction opt out of value numbering? An instruction might opt out of value numbering if it is not "pure". Some instructions are not pure. Purity is in the eye of the beholder, but in general it means that an instruction does not interact with the state of the outside world, except for trivial computation on its operands. (What does it mean to de-duplicate/cache/reuse ?)

A load from an array object is also not a pure operation 2. The load operation implicitly relies on the state of the memory. Also, even if the array was known-constant, in some runtime systems the load might raise an exception. Changing the source location where an exception is raised is generally frowned upon. Languages such as Java often have requirements about where exceptions are raised codified in their specifications. We'll work only on pure operations for now, but we'll come back to this later. We do often want to optimize impure operations as well!
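As a concrete illustration of that idea, here is a minimal sketch (in Python, with hypothetical names; it is not the Maxine code, which hashes opcode and operands and returns 0 to opt out):

```python
# Hypothetical sketch: an instruction exposes a key for value numbering.
# Two pure instructions with the same opcode and the same (canonical SSA)
# operands are considered to compute the same value; impure instructions
# return None to opt out, playing the role of Maxine's 0.
class Instr:
    def __init__(self, op, args, pure=True):
        self.op = op              # e.g. "add"
        self.args = tuple(args)   # operand SSA values (other Instr objects)
        self.pure = pure          # loads, calls, etc. would set this False

    def value_number_key(self):
        if not self.pure:
            return None           # don't participate in value numbering
        return (self.op, self.args)
```

Because SSA operands are canonical objects, comparing them by identity inside the key is enough; no structural comparison of whole expression trees is needed.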
We'll start off with the simplest form of value numbering, which operates only on linear sequences of instructions, like basic blocks or traces. Let's build a small implementation of local value numbering (LVN). We'll start with straight-line code—no branches or anything tricky. Most compiler optimizations on control-flow graphs (CFGs) iterate over the instructions "top to bottom" 3 and it seems like we can do the same thing here too. From what we've seen so far optimizing our made-up IR snippet, we can do something like this:

- initialize a map from instruction numbers to instruction pointers
- for each instruction:
  - if it wants to participate in value numbering:
    - if its value number is already in the map, replace all pointers to it in the rest of the program with the corresponding value from the map
    - otherwise, add it to the map

The find-and-replace, remember, is not a literal find-and-replace, but instead something like: (if you have been following along with the toy optimizer series)

This several-line function (as long as you already have a hash map and a union-find available to you) is enough to build local value numbering! And real compilers are built this way, too. If you don't believe me, take a look at this slightly edited snippet from Maxine's value numbering implementation. It has all of the components we just talked about: iterating over instructions, map lookup, and some substitution. This alone will get you pretty far. Code generators of all shapes tend to leave messy repeated computations all over their generated code, and this will make short work of them.

Sometimes, though, your computations are spread across control flow—over multiple basic blocks. What do you do then? Computing value numbers for an entire function is called global value numbering (GVN) and it requires dealing with control flow (ifs, loops, etc). I don't just mean that for an entire function, we run local value numbering block-by-block. Global value numbering implies that expressions can be de-duplicated and shared across blocks.

Let's tackle control flow case by case. First is the simple case from above: one block. In this case, we can go top to bottom with our value numbering and do alright. The second case is also reasonable to handle: one block flowing into another. In this case, we can still go top to bottom. We just have to find a way to iterate over the blocks. If we're not going to share value maps between blocks, the order doesn't matter. But since the point of global value numbering is to share values, we have to iterate them in topological order (reverse post order (RPO)). This ensures that predecessors get visited before successors. If you have , we have to visit first and then . Because of how SSA works and how CFGs work, the second block can "look up" into the first block and use the values from it. To get global value numbering working, we have to copy 's value map before we start processing so we can re-use the instructions. Maybe something like:
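Here is a rough sketch of that idea in Python; `reverse_post_order` is an assumed helper, `value_number_key` is from the earlier sketch, and `make_equal_to` is a union-find breadcrumb in the style of the toy optimizer series, not the post's actual code:

```python
def local_value_numbering(block, values):
    # The LVN loop from the pseudocode above: one pass, one value map.
    for instr in block.instructions:
        key = instr.value_number_key()
        if key is None:
            continue                          # impure: leave it alone
        if key in values:
            instr.make_equal_to(values[key])  # leave a union-find breadcrumb
        else:
            values[key] = instr

def gvn_chain(blocks):
    # For a straight chain of blocks, seed each block's value map with a
    # copy of its (single) predecessor's map so expressions computed
    # earlier stay visible downstream.
    value_maps = {}
    for block in reverse_post_order(blocks):  # predecessors before successors
        preds = block.predecessors
        values = dict(value_maps[preds[0]]) if preds else {}
        local_value_numbering(block, values)
        value_maps[block] = values
```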
Then the expressions can accrue across blocks. can re-use the already-computed from because it is still in the map. …but this breaks as soon as you have control-flow splits. Consider the following shape graph: We're going to iterate over that graph in one of two orders: A B C or A C B. In either case, we're going to be adding all this stuff into the value map from one block (say, B) that is not actually available to its sibling block (say, C). When I say "not available", I mean "would not have been computed before". This is because we execute either A then B or A then C. There's no world in which we execute B then C. But alright, look at a third case where there is such a world: a control-flow join. In this diagram, we have two predecessor blocks B and C each flowing into D: B always flows into D and also C always flows into D.

So the iteration order is fine, right? Well, still no. We have the same sibling problem as before. B and C still can't share value maps. We also have a weird question when we enter D: where did we come from? If we came from B, we can re-use expressions from B. If we came from C, we can re-use expressions from C. But we cannot in general know which predecessor block we came from. The only block we know for sure that we executed before D is A. This means we can re-use A's value map in D, because we can guarantee that all execution paths that enter D have previously gone through A.

This relationship is called a dominator relationship and it is the key to one style of global value numbering that we're going to talk about in this post. A block can always use the value map from any other block that dominates it. For completeness' sake, in the diamond diagram, A dominates each of B and C, too. We can compute dominators a couple of ways 4, but that's a little bit out of scope for this blog post. If we assume that we have dominator information available in our CFG, we can use that for global value numbering. And that's just what—you guessed it—Maxine VM does. It iterates over all blocks in reverse post-order, doing local value numbering, threading through value maps from dominator blocks. In this case, their method gets the immediate dominator: the "closest" dominator block of all the blocks that dominate the current one. And that's it! That's the core of Maxine's GVN implementation. I love how short it is. For not very much code, you can remove a lot of duplicate pure SSA instructions.

This does still work with loops, but with some caveats. From p7 of Briggs GVN:

The φ-functions require special treatment. Before the compiler can analyze the φ-functions in a block, it must previously have assigned value numbers to all of the inputs. This is not possible in all cases; specifically, any φ-function input whose value flows along a back edge (with respect to the dominator tree) cannot have a value number. If any of the parameters of a φ-function have not been assigned a value number, then the compiler cannot analyze the φ-function, and it must assign a unique, new value number to the result.

It also talks about eliminating useless phis, which is optional, but would strengthen the global value numbering pass: it makes more information transparent.

But what if we want to handle impure instructions? Languages such as Java allow for reading fields from the / object within methods as if the field were a variable name. This makes code like the following common: Each of these references to and is an implicit reference to or , which is semantically a field load off an object. You can see it in the bytecode (thanks, Matt Godbolt): When straightforwardly building an SSA IR from the JVM bytecode for this method, you will end up with a bunch of IR that looks like this: Pretty much the same as the bytecode. Even though no code in the middle could modify the field (which would require a re-load), we still have a duplicate load. Bummer.

I don't want to re-hash this too much, but it's possible to fold load and store forwarding into your GVN implementation by either:

- doing load-store forwarding as part of local value numbering and clearing memory information from the value map at the end of each block, or
- keeping track of effects across blocks

See, there's nothing fundamentally stopping you from tracking the state of your heap at compile-time across blocks. You just have to do a little more bookkeeping. In our dominator-based GVN implementation, for example, you can:

- track heap write effects for each block
- at the start of each block B, union all of the "kill" sets for every block back to its immediate dominator
- finally, remove the stuff that got killed from the dominator's value map

Not so bad.
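Putting the pieces together, a condensed sketch of the dominator-threaded GVN described above (again with assumed helpers; the heap-effect bookkeeping is elided):

```python
def gvn(cfg):
    # Dominator-based GVN: each block seeds its value map with a copy of
    # its immediate dominator's map, so only expressions guaranteed to
    # have executed on every path into the block are eligible for reuse.
    value_maps = {}
    for block in reverse_post_order(cfg.blocks):
        idom = cfg.immediate_dominator(block)   # assumed dominator info
        values = dict(value_maps[idom]) if idom is not None else {}
        local_value_numbering(block, values)    # same loop as before
        value_maps[block] = values
```

A later copy-propagation pass (or a union-find `find` at each use) then rewrites users of the de-duplicated instructions, as described earlier.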
Maxine doesn't do global memory tracking, but they do a limited form of load-store forwarding while building their HIR from bytecode: see GraphBuilder, which uses the MemoryMap to help track this stuff. At least they would not have the same duplicate instructions in the example above!

We've now looked at one kind of value numbering and one implementation of it. What else is out there? Apparently, you can get better results by having a unified hash table (p9 of Briggs GVN) of expressions, not limiting the value map to dominator-available expressions. Not 100% on how this works yet. They note:

Using a unified hash-table has one important algorithmic consequence. Replacements cannot be performed on-line because the table no longer reflects availability.

Which is the first time that it occurred to me that hash-based value numbering with dominators was an approximation of available expression analysis. There's also a totally different kind of value numbering called value partitioning (p12 of Briggs GVN). See also a nice blog post about this by Allen Wang from the Cornell compiler course. I think this mostly replaces the hashing bit, and you still need some other thing for the available expressions bit. Ben Titzer and Seth Goldstein have some good slides from CMU, where they talk about the worklist dataflow approach. Apparently this is slower but gets you more available expressions than just looking to dominator blocks. I wonder how much it differs from dominator + unified hash table.

While Maxine uses hash table cloning to copy value maps from dominator blocks, there are also compilers such as Cranelift that use scoped hash maps to track this information more efficiently. (Though Amanieu notes that you may not need a scoped hash map and can instead tag values in your value map with the block they came from, ignoring non-dominating values with a quick check. The dominance check makes sense but I haven't internalized how this affects the set of available expressions yet.)

You may be wondering if this kind of algorithm even helps at all in a dynamic language JIT context. Surely everything is too dynamic, right? Actually, no! The JIT hopes to eliminate a lot of method calls and dynamic behaviors, replacing them with guards, assumptions, and simpler operations. These strength reductions often leave behind a lot of repeated instructions. Just the other day, Kokubun filed a value-numbering-like PR to clean up some of the waste. ART has a recent blog post about speeding up GVN.

Go forth and give your values more numbers.

There's been an ongoing discussion with Phil Zucker on SSI, GVN, acyclic egraphs, and scoped union-find. TODO summarize

- Commutativity; canonicalization
- Seeding alternative representations into the GVN
- Aegraphs and union-find during GVN
- https://github.com/bytecodealliance/rfcs/blob/main/accepted/cranelift-egraph.md
- https://github.com/bytecodealliance/wasmtime/issues/9049
- https://github.com/bytecodealliance/wasmtime/issues/4371
- V8 Hydrogen

Writing this post is roughly the time when I realized that the whole time I was wondering why Cinder did not use union-find for rewriting, it actually did! Optimizing instruction by replacing with followed by copy propagation is equivalent to union-find. ↩

In some forms of SSA, like heap-array SSA or sea of nodes, it's possible to more easily de-duplicate loads because the memory representation has been folded into (modeled in) the IR. ↩

The order is a little more complicated than that: reverse post-order (RPO).
And there's a paper called "A Simple Algorithm for Global Data Flow Analysis Problems" that I don't yet have a PDF for that claims that RPO is optimal for solving dataflow problems. ↩

There's the iterative dataflow way (described in the Cooper paper (PDF)), Lengauer-Tarjan (PDF), the Engineered Algorithm (PDF), hybrid/Semi-NCA approach (PDF), … ↩

0 views
Langur Monkey 2 weeks ago

Fine-tuning Qwen3.5 for Gaia Sky

A little over a year ago I set up a local pipeline to use different LLMs to respond to Gaia Sky questions using RAG. In that post, I built a dynamic scraper that parsed the Gaia Sky website and documentation and ingested the content into a vector database. Then, I built a minimal terminal chatbot interface that received the user prompt, queried the database for semantically similar data, and built up the context for each LLM call. The results were promising, and I found that they (obviously) strongly depended on the model used.

Fast forward a few months, and the Qwen 3.5 models were released by Alibaba. The general consensus is that they are quite good for their size. I've been testing them for local inference with a similar impression. I thought that it would be interesting to repeat the exercise of creating a Gaia Sky AI assistant, but using a radically different approach: instead of RAG, I would fine-tune the model itself. In this post, I describe this fine-tuning project, from the creation and engineering of the training dataset to the fine-tuning and production of the final GGUF models. This project is composed of two very distinct parts, which map to top-level chapters in this post:

- Training dataset creation
- Fine tuning

At the end I quickly evaluate the results in the testing section. The source code, dataset, and models discussed in this post are in the following repositories:

- Dataset creation and fine-tuning – gaiasky-finetune
- Gaia Sky training dataset repository – gaiasky-training-dataset
- Qwen3.5 Gaia Sky fine-tuned models – gaiasky-qwen-3.5-gguf

Here is the hardware I have used to create the dataset and fine-tune the model:

- Desktop PC – Arch Linux, Intel(R) Core(TM) i7-7700 (8) @ 4.20 GHz, 32 GB RAM, NVIDIA GeForce GTX 1070 8 GB.
- Laptop 1 – Windows 11, WSL2 (Arch Linux), Intel(R) Core(TM) Ultra 9 275HX (24) @ 3.07 GHz, 32 GB RAM, NVIDIA GeForce RTX 5080 Mobile 16 GB.
- Laptop 2 – Arch Linux, Intel(R) Core(TM) i7 8750H (12) @ 4.10 GHz, 16 GB RAM, NVIDIA GeForce GTX 1060 Mobile 6 GB.

The creation of the training dataset is the most important piece of work in this project. It is composed of three parts:

- Documentation dataset
- API dataset
- Identity dataset

When I started this project, my first instinct was "more is better." I thought that if I fed the model every single , , , and file in the Gaia Sky repositories (project, documentation, etc.), it would emerge as an expert. Oh boy, was I wrong. A large codebase contains a lot of boilerplate noise: getters, setters, license blocks, and infrastructure code that doesn't actually help a model understand how the engine works or how to write scripts for it. I soon realized that the dataset is the single most important part of the project, and it needed a surgical approach. The plan was to automate the process of creating the dataset to a degree, and then use it to fine-tune the Qwen 3.5 4B and 9B model variants.

I wrote to act as a high-pass filter. Instead of a blind crawl, I implemented an allowlist system. I would only let in the load-bearing files:

- Documentation: By far the most important data, containing exhaustive human-written documentation pages. We convert the documentation RST files to Markdown with , and then we add some additional key files, like the project's .
- Core Logic: Selected Java files that are representative of the brain of the engine (main loop, scene, renderer, etc.).
- Visual Logic: Selected shader files that define the look of the cosmos (stars, particles, PBR, etc.).

Almost every source file in Gaia Sky starts with a copyright header. This is "dead weight" for training. I added a regex-based stripper to ensure the model's limited context window was filled with code logic, not license text.

The output of this first phase was a file where each line represented a single, cleaned-up file. It looked like this: This provided the "Context" for the next phase. However, a model trained directly on this would just learn to autocomplete files. To make it an assistant, we had to turn these files into a conversation.

Once I had a clean extraction of the most relevant information pieces, I faced a new problem. A raw dump of a file is great for a search engine, but it is not a conversation. To turn these files into training data, I used a "teacher" model, Qwen 3.5 27B, to look at each file and generate a specific number of Q&A pairs. I wrote to handle this. The script calculates how many questions a file is worth based on its length and type. A long documentation file might get 25 Q&A pairs, while a short shader might only get 4.
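As a rough illustration of two of the helpers described above (the copyright-header stripper and the per-file question budget), here is a sketch; the regex, thresholds, and function names are mine, not the project's actual code:

```python
import re

# Illustrative sketch only: drop Java-style block comments that look like
# copyright/license headers, and budget Q&A pairs by file type and length.
LICENSE_RE = re.compile(
    r"/\*.*?(copyright|license).*?\*/", re.IGNORECASE | re.DOTALL
)

def strip_license_header(source: str) -> str:
    """Remove the first block comment that mentions copyright/license."""
    return LICENSE_RE.sub("", source, count=1).lstrip()

def target_pair_count(path: str, text: str) -> int:
    """How many Q&A pairs a file is worth, based on length and type."""
    words = len(text.split())
    if path.endswith((".md", ".rst")):   # documentation: richest source
        base, cap = 8, 25
    elif path.endswith(".java"):         # core engine logic
        base, cap = 5, 15
    else:                                # shaders, scripts, misc
        base, cap = 2, 4
    return min(cap, base + words // 500)
```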
Initially, I used the MoE Qwen 3.5 30B A3B, but it was consistently outputting the wrong format. Then I switched to the 27B dense model, and it performed a little better. Even so, I had to tell the model exactly how to behave. Here are the key items I learned the hard way:

- Match the answer type: If the question doesn't ask for code, don't provide it.
- Grounding: Every claim must be directly grounded in the source text.
- Diversity: Every question must cover a different detail.

I also found that, at these model sizes, it is better to batch Q&A pairs instead of asking the model to provide 20 of them in one go. I finally gravitated to 3 Q&A pairs per inference call. To prevent the model from repeating itself across batches, I tracked existing questions and fed them back into the prompt as exclusions. The prompt is constructed from the following parts:

- Base text – This is composed of the raw strings in the variable.
- The file hint ( ) – We add hints depending on the filetype:
  - Java: "This is Java source code. Focus on class responsibilities, method signatures, and architectural patterns. Do NOT generate Python scripting examples."
  - Python: "This is Python scripting code for the Gaia Sky API. Questions about usage and parameters are appropriate."
  - Shader: "This is a GLSL shader. Focus on the rendering technique, uniforms, and mathematical operations. Do NOT generate Python scripting examples."
  - Docs: "This is documentation. Focus on concepts, features, workflows, and user-facing features."
- The pair count ( ) – Contains the number of Q&A pairs to generate.
- The previous Q&A pairs, if any ( ) – This is constructed by listing the existing pairs, as parsed by the program in the output, or accumulated in the current run.
- The filepath and content ( , ) – Contain the file name and the actual content, which is capped to fit within the context length.

However, LLMs are chatty. Even with such strict instructions, the 27B model sometimes messes up and leaks its own reasoning into the output. It would start its response with or it would include meta-talk like . This created "dirty" data that polluted the dataset and undermined the fine-tuning process. If I trained on this, the final model would start every answer by talking to itself.

To fix this, I built . This script is a heavy-duty cleaner that uses regex to strip out training artifacts. It first tries to rescue bad rows, and if it fails, it deletes them. If the model accidentally put both the question and answer in the "output" field, the sanitizer attempts to detect the question mark and split them back into the correct structure. Here is a look at what the data looked like before and after the sanitization process.

The sanitizer also had to deal with Javadoc remnants. Since Gaia Sky is a Java project, class- and method-level comment blocks are full of HTML tags like , , and (Javadoc syntax). The script converts these into clean Markdown so the LLM learns a consistent documentation style. By the end of this process, I had . This contains a curated, clean list of questions and answers based on the whole project documentation.

Documentation is important, but I want the model to learn some Gaia Sky scripting as well. To do so, a new API/scripting dataset needed to be generated. To solve this, I built a synthetic data factory designed to teach the model both the content of the API and the context of how to use it. The first step was grounding the model. I wrote a script ( ) that scans the Java source files and uses regex to pair Javadoc comments with their corresponding method signatures, plus a little bit of logic to generate Q&A pairs from each signature and its documentation.

This process produces the , which is used in the next step. It contains the API calls with their respective documentation. However, knowing a function exists isn't enough. The model needs to know how to script with it. To address this, I developed to transform those raw Java signatures into a diverse pedagogical dataset. As input, it gets all test and showcase scripts in the Gaia Sky repository, and the raw API JSONL file. It produces four types of output, termed A, B, C, and D:

Type A: The API reference. These are direct "How do I use X?" pairs. They include the parameters, the return types, and a basic Python example.

Type B: The task synthesis. This step is optional, and I ended up not including it in the final dataset. However, I think it is still worth mentioning.
I used the larger teacher dense model (27B) to generate complex tasks (e.g., "Write a script that navigates to Mars, waits 5 seconds, and takes a screenshot"). The script provided the teacher model with a safe list of real functions extracted in Step 1 as a sort of guardrail. If the teacher tried to hallucinate a command, the script flagged and discarded it. The results of this section were kind of underwhelming, possibly because more parameters are needed for such open-ended tasks.

Type C: Adversarial error correction. This is my favorite part. I programmatically broke the API calls to teach the model how to fix its own mistakes. The script would generate a wrong script (e.g., using instead of , or missing a required argument) and then provide the correct version. The end goal was to prevent common LLM failures before they happen.

Type D: The "gold standard" library. Finally, I indexed the actual test and showcase scripts from the Gaia Sky repository. These are human-written, battle-tested scripts that show the model how to handle complex logic, loops, and math.

Finally, I prepared a small file with essential project information that must appear in the final integrated training dataset. It only contains 17 lines of Q&A, but it is rather important. Here is an excerpt of a few lines (formatted for readability):

The final dataset was composed by concatenating the three parts: documentation, API, and identity. It can be explored here: gaiasky-training-dataset@HuggingFace.

Once the dataset of 3,800+ specialized Gaia Sky pairs was ready, it was time for the actual training. For this, I leaned on two heavy hitters in the open-source world: Unsloth and Qwen 3.5. I started by training the 4B model, and then realized that I could also fit the 9B one on my GPU. In this post I'll focus on the larger version of the model. I went as high as my local hardware allowed; otherwise, I would have tried the 27B model, or even the 122B-A10B.

Training a model with 9 billion parameters typically requires a massive server cluster, but by using 4-bit LoRA (Low-Rank Adaptation), I was able to squeeze the entire process onto a single RTX 5080 (16GB). The RTX 5080 is a beast, but to get the most out of it, I enabled TensorFloat-32 (TF32). This allows the GPU to handle the heavy matrix multiplications of deep learning much faster than standard , without the precision loss of . I used the following parameters for the fine-tuning:

- LoRA rank: 32 – a balance between learning new patterns (like the Gaia Sky API) and retaining general knowledge.
- Target modules: all major projection layers.
- Learning rate: \(2.0\times10^{-4}\)
- Optimizer: AdamW 8-bit

The dataset is downloaded directly from the hub. It gets tokenized properly and passed into the SFT Trainer object for fine-tuning. The full code is in the file , and is surprisingly compact. The heavy lifting is done by and , of course.
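For illustration, a minimal sketch of what such a setup can look like with Unsloth and TRL; the model id, data path, batch sizes, and epoch count are placeholders, not the project's exact script:

```python
from unsloth import FastLanguageModel
from trl import SFTConfig, SFTTrainer
from datasets import load_dataset

# Illustrative sketch, not the actual training script.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen3.5-9B",   # placeholder base-model id
    max_seq_length=4096,
    load_in_4bit=True,              # 4-bit base weights: fits a 16 GB GPU
)
model = FastLanguageModel.get_peft_model(
    model,
    r=32,                           # LoRA rank, as discussed above
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=32,
)
dataset = load_dataset("json", data_files="train.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,            # newer TRL versions: processing_class
    train_dataset=dataset,
    args=SFTConfig(
        learning_rate=2e-4,         # the 2.0e-4 listed above
        optim="adamw_8bit",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=3,
        output_dir="outputs",
    ),
)
trainer.train()
```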
Once the LoRA weights are trained, they are dead weight until converted into a format people can actually use. I wrote to automate the most tedious parts of the pipeline:

- Quantization: Converting the model to Q4_K_M GGUF, or whatever other quant. This reduces the model size enough that it can run on almost any modern laptop while keeping its capabilities mostly intact.
- HF upload: Automatically pushing the finished file to HuggingFace so the community can pull it directly into LM Studio or Ollama.

After roughly an hour of training, and another one of quantizing and uploading, I had a fine-tuned Gaia Sky expert that performs great. Or did I? The next section discusses the performance of the model.

Given the limited time I have and the low-parameter models used, my tests have been scarce. My expectations are not very high, but I still wanted to test the output of the fine-tuning and see how well the new knowledge was retained by the model. I only tested the Qwen 3.5 9B fine-tuned variant and compared it to the base model. You can get this model from HuggingFace: gaiasky-qwen-3.5-gguf@HuggingFace.

I used the same questions as in the Gaia Sky AI assistant post from 2025. My evaluation of the results is totally subjective, and includes a comparison with the base Qwen 3.5 9B. First, I recommend using a low temperature setting (0.2 in my tests) to avoid hallucinations and make sure the model does not get too "creative". Also, I would suggest using the following system prompt, which frames the conversation and sets the tone:

You are the Gaia Sky AI Expert, a specialized assistant with deep knowledge of the Gaia Sky universe, its Python API (v1 and v2), and its Java source code. Your goal is to help users explore the cosmos, write scripts, and understand the engine's inner workings.

- API Expertise: You know both the Legacy API (v1) and the Modern Component API (v2: camera, time, data, scene, graphics, camcorder, etc.). Prioritize the Modern API (v2) for new scripts unless Legacy is requested. Never invent method names; be 100% sure, otherwise suggest the user look up the docs.
- Source Code: You understand the underlying Java architecture (libGDX, Ashley ECS, LWJGL3, OpenGL) and can explain how features like LOD, shaders, and data loaders are implemented.
- Documentation: You are familiar with the Gaia Sky capabilities, installation guides, and dataset management.
- Be Grounded: If a feature or method does not exist in your training data, do not invent it. State: "I don't see that specific feature in the current Gaia Sky implementation."
- Code Style: Always provide clean, commented Python or Java code. Use snake_case for Modern API calls.
- Tone: Professional, scientific, and enthusiastic about astrovisualization. Answer greetings warmly, then offer Gaia Sky-specific assistance.

That said, let's go for the first question.

Q: State the possible usages of Gaia Sky.

A: Gaia Sky is a powerful open-source universe explorer and planetarium software designed for both interactive exploration and scientific visualization. Its primary use cases include:

- Interactive Exploration: Navigate the Milky Way, explore exoplanets, view the solar system in high detail, and time-travel through cosmic history using an intuitive interface.
- Scientific Visualization: Visualize complex astrophysical datasets (catalogs, simulations) with advanced rendering techniques like LOD (Level of Detail), shaders, and particle effects.
- Education & Outreach: Create immersive presentations or tutorials to teach astronomy concepts through dynamic 3D models of celestial objects.
- Scripting & Automation: Automate camera paths, data loading, and scene manipulation via Python (Modern API v2) or Java for reproducible workflows and research presentations.
- Data Analysis: Load and manipulate large-scale astronomical datasets directly within the application to analyze spatial distributions without external tools.

This is an excellent response to an easy question. It did not invent anything and correctly applied documentation knowledge like camera paths or LOD. Compared to base Qwen, which hallucinated a lot, this is much more grounded. Next is a question about the installation procedure on Linux.

Q: How do I install Gaia Sky on Linux?

A: To install Gaia Sky on Linux, follow these steps:

- Download the Installer: Visit https://gaiasky.space/download and download the package for your architecture (e.g., ).
- Install via Package Manager: Open a terminal and run: If dependency errors occur, fix them with:
- Launch Gaia Sky: Run the command: The application will start in your default browser or terminal window.

This is not bad. I did not specify the distribution, so it just assumed Debian. The instructions themselves look fine. Still, I wanted to see if it had absorbed the knowledge for other distros, so I asked specifically for Arch Linux:

Q: How do I install it on Arch Linux specifically?

A: To install Gaia Sky on Arch Linux, you have two primary options depending on whether you prefer the official repositories or building from source. The most straightforward method is via the Arch User Repository (AUR). You will need an AUR helper like , , or .

- Install via AUR Helper: Open your terminal and run: This will automatically resolve dependencies (Java, OpenGL libraries) and install the application.
- Verify Installation: Once installed, launch it with:

If you prefer to compile Gaia Sky yourself or need a specific version not yet in the AUR:

- Clone the Repository:
- Build and Install: Use Maven to build the project (requires Java Development Kit): After building, you can run it directly from the directory or install it using the provided Gradle wrapper if available in newer versions.
- Dependencies: Ensure your system has up-to-date OpenGL drivers and Java (JDK 17+ is recommended). The AUR package typically handles these automatically.
- Data Location: By default, datasets are stored in . You may want to create symbolic links from if you prefer system-wide data management. Ensure you have at least 4GB of free disk space for datasets and cache files. For GPU acceleration, verify that your graphics drivers support OpenGL 4.3+.

Very good response. It recommends using the AUR package (correct), and also provides instructions to build and run it from source (also correct). However, it hallucinates the use of Maven instead of Gradle. The base model produced a long wall of text with mostly nonsense that would look plausible to the non-expert. In general, the responses from the base model were much longer and their content was very often factually wrong. The next question is a bit more technical.

Q: How are cubemap textures used in Gaia Sky?

A: Cubemaps in Gaia Sky serve a specific purpose related to planetary rendering rather than general environment mapping for the entire scene.
They are primarily utilized to generate high-resolution, seamless spherical textures for planets (like Earth or Mars) by stitching together six square faces into a continuous surface. This technique allows for detailed texture representation without the distortion often seen in standard equirectangular projections near poles. Key Technical Details:

- Application: Used specifically for planet textures where high fidelity is required across all latitudes and longitudes.
- Structure: Consists of six square images (front, back, left, right, top, bottom) that are algorithmically mapped onto a sphere.
- Rendering Engine: The underlying rendering engine uses these cubemap textures to apply surface details dynamically as the planet rotates or changes perspective.

This feature is distinct from standard environment maps used for sky reflections or star glow effects, focusing instead on accurate planetary visualization.

This is excellent, even though it seems to imply that cubemaps are not used for environment maps and reflections, which is incorrect. However, it correctly identifies that they are mainly used to avoid distortions at the poles. Maybe it increased the importance of this because there is an entire documentation section devoted to cubemap textures in planets. The base model produced a long, Markdown-formatted, pretentious output that even included totally made-up scripts. Again, the fine-tuned response was much more grounded and correct.

Now, onto the scripting questions. Here is where the fine-tuned model starts to hesitate and invent slightly. Here are the possible reasons:

- Low parameter count not being able to retain the information properly.
- Very low number of test and showcase scripts used in fine-tuning.
- Bad design of the training dataset.

I personally think it has more to do with the first two reasons than the third.

Q: How do I get the camera position with the APIv1 and scripting in Gaia Sky?

A: To retrieve the current camera position using the Legacy API (v1), you call : For the Modern Component API (v2), you use from the module:

First, it invented (correct is ) for APIv1. For APIv2, it correctly used , but the way it gets the object is incorrect. It should have done something like

So, I think either this model is too small to be effectively fine-tuned for Gaia Sky scripting, or the training dataset is insufficient to achieve sufficient retention. It could also be both, as I said above. The base model has no idea about Gaia Sky scripting or anything related to it, so it just makes stuff up. It is not even worth further mention.

I used the Laptop 2 described above for testing and inference, with 28/32 layers on the GPU and a context window of ~4k, and I consistently got about 12 tok/s. Performance is exactly the same as with the base models, so this section is short.

This fine-tuning experiment has yielded valuable insights into the strengths and limitations of domain-specific model adaptation at lower parameter counts for local use. I think the foundational approach was sound. The dataset curation process, with its surgical filtering, teacher-based distillation, and rigorous sanitization, successfully encoded domain knowledge into the model. Proof of this is evident in the testing: the fine-tuned model, as opposed to the base one, correctly answered conceptual and documentation-heavy questions about Gaia Sky's purpose, installation, and rendering techniques without hallucinating. It understood architectural details like LOD and cubemaps, and avoided inventing features that don't exist. This demonstrates that fine-tuning can be an effective alternative to RAG for teaching models about a specific domain.

However, it also struggled. The 9B model hit a hard ceiling when it came to API scripting and method names. It invented instead of , misunderstood how to instantiate APIv2 objects, and generally lacked the capacity to reliably retain the specific, syntactic details of the API surface. This is a classic problem: smaller models can absorb concepts and documentation, but struggle to memorize exact function signatures and usage patterns.
With only 3,800+ training pairs and a 9B parameter budget, the model simply didn’t have enough capacity to encode both general knowledge and precise API details. So, what are the next steps? I believe the 4B and 9B models are too small for reliable Gaia Sky scripting assistance. My next experiment will be to fine-tune the Qwen 3.5 27B model . The jump from 9B to 27B parameters should provide substantially more capacity to encode API signatures without sacrificing general knowledge. Additionally, I could increase the scripting dataset by: That said, the hardware constraint is real. 27B requires more than my RTX 5080 can reasonably handle for full fine-tuning. However, with careful quantization (using 8-bit optimizers or even lower precision), 4-bit LoRA, and possibly gradient checkpointing, it may fit. If not, a cloud provider like Lambda Labs or Paperspace might be the way forward for a single training run. All in all, I think fine-tuning is a viable path for building domain-expert models, but it requires the right balance of dataset quality, model size, and hardware. For Gaia Sky specifically, a 27B model with a more robust scripting dataset would likely be the sweet spot before considering the jump to 70B+ models. I consider the infrastructure proven. It’s now a matter of scale. Training dataset creation Fine tuning Dataset creation and fine-tuning – gaiasky-finetune Gaia Sky training dataset repository – gaiasky-training-dataset Qwen3.5 Gaia Sky fine-tuned models – gaiasky-qwen-3.5-gguf Desktop PC – Arch Linux, Intel(R) Core(TM) i7-7700 (8) @ 4.20 GHz, 32 GB RAM, NVIDIA GeForce GTX 1070 8 GB. Laptop 1 – Windows 11, WSL2 (Arch Linux), Intel(R) Core(TM) Ultra 9 275HX (24) @ @ 3.07 GHz, 32 GB RAM, NVIDIA GeForce RTX 5080 Mobile 16 GB. Laptop 2 – Arch Linux, Intel(R) Core(TM) i7 8750H (12) @ 4.10 GHz, 16 GB RAM, NVIDIA GeForce GTX 1060 Mobile 6 GB. Documentation dataset API dataset Documentation: By far, the most important data, containing exhaustive human-written documentation pages. We convert the documentation RST files to Markdown with , and then we add some additional key files, like the project’s . Core Logic: Selected Java files that are representative of the brain of the engine (main loop, scene, renderer, etc.). Visual Logic: Selected shader files that define the look of the cosmos (stars, particles, PBR, etc.). Match the answer type: If the question doesn’t ask for code, don’t provide it. Grounding: Every claim must be directly grounded on the source text. Diversity: Every question must cover a different detail. Base text – This is composed by the raw strings in the variable. The file hint ( ) – We add hints depending on the filetype. The following table displays the hint for each type. Filetype Extensions Hint Java This is Java source code. Focus on class responsibilities, method signatures, and architectural patterns. Do NOT generate Python scripting examples. Python This is Python scripting code for the Gaia Sky API. Questions about usage and parameters are appropriate. Shader This is a GLSL shader. Focus on the rendering technique, uniforms, and mathematical operations. Do NOT generate Python scripting examples. Docs This is documentation. Focus on concepts, features, workflows, and user-facing features. The pair count ( ) – Contains the number of Q&A pairs to generate. The previous Q&A pairs, if any ( ) – This is constructed by listing the existing pairs, as parsed by the program in the output, or accumulated in the current run. 
The filepath and content ( , ) – Contain the file name and the actual content, which is capped to fit within the context length. Type A: The API reference These are direct “How do I use X?” pairs. They include the parameters, the return types, and a basic Python example. Type B: The task synthesis This step is optional, and I ended up not including it in the final dataset. However, I think it is still worth mentioning. I used the larger teacher dense model (27B) to generate complex tasks (e.g., “Write a script that navigates to Mars, waits 5 seconds, and takes a screenshot”). The script provided the teacher model with a safe list of real functions extracted in Step 1 as a sort of guardrail. If the teacher tried to hallucinate a command, the script flagged and discarded it. The results of this section were kind of underwhelming, possibly because more parameters are needed for such open-ended tasks. Type C: Adversarial error correction This is my favorite part. I programmatically broke the API calls to teach the model how to fix its own mistakes. The script would generate a wrong script (e.g., using instead of or missing a required argument) and then provide the correct version. The end goal was to prevent common LLM failures before they happen. Type D: The “gold standard” Library Finally, I indexed the actual test and showcase scripts from the Gaia Sky repository. These are human-written, battle-tested scripts that show the model how to handle complex logic, loops, and math. gaiasky-training-dataset@HuggingFace . LoRA Rank: 32 A balance between learning new patterns (like the Gaia Sky API) and retaining general knowledge. Target Modules: All major projection layers) Learning rate: \(2.0\times10^{-4}\) Optimizer: AdamW 8-bit Quantization: Converting the model to Q4_K_M GGUF , or whatever other quant. This reduces the model size enough that it can run on almost any modern laptop while keeping its capabilities mostly intact. HF upload: Automatically pushing the finished file to HuggingFace so the community can pull it directly into LM Studio or Ollama. gaiasky-qwen-3.5-gguf@HuggingFace . API Expertise: You know both the Legacy API (v1) and the Modern Component API (v2: camera, time, data, scene, graphics, camcorder, etc.). Prioritize the Modern API (v2) for new scripts unless Legacy is requested. Never invent method names, be 100% sure, otherwise suggest the user to look up the docs. Source Code: You understand the underlying Java architecture (libGDX, Ashley ECS, LWJGL3, OpenGL) and can explain how features like LOD, shaders, and data loaders are implemented. Documentation: You are familiar with the Gaia Sky capabilities, installation guides, and dataset management. Be Grounded: If a feature or method does not exist in your training data, do not invent it. State: “I don’t see that specific feature in the current Gaia Sky implementation.” Code Style: Always provide clean, commented Python or Java code. Use snake_case for Modern API calls. Tone: Professional, scientific, and enthusiastic about astrovisualization. Answer greetings warmly, then offer Gaia Sky-specific assistance. Interactive Exploration : Navigate the Milky Way, explore exoplanets, view the solar system in high detail, and time-travel through cosmic history using an intuitive interface. Scientific Visualization : Visualize complex astrophysical datasets (catalogs, simulations) with advanced rendering techniques like LOD (Level of Detail), shaders, and particle effects. 
Education & Outreach: Create immersive presentations or tutorials to teach astronomy concepts through dynamic 3D models of celestial objects.
Scripting & Automation: Automate camera paths, data loading, and scene manipulation via Python (Modern API v2) or Java for reproducible workflows and research presentations.
Data Analysis: Load and manipulate large-scale astronomical datasets directly within the application to analyze spatial distributions without external tools.

Download the Installer: Visit https://gaiasky.space/download and download the package for your architecture (e.g., ).
Install via Package Manager: Open a terminal and run: If dependency errors occur, fix them with:
Launch Gaia Sky: Run the command: The application will start in your default browser or terminal window.

Install via AUR Helper: Open your terminal and run: This will automatically resolve dependencies (Java, OpenGL libraries) and install the application.
Verify Installation: Once installed, launch it with:
Clone the Repository:
Build and Install: Use Maven to build the project (requires Java Development Kit): After building, you can run it directly from the directory or install it using the provided Gradle wrapper if available in newer versions.
Dependencies: Ensure your system has up-to-date OpenGL drivers and Java (JDK 17+ is recommended). The AUR package typically handles these automatically.
Data Location: By default, datasets are stored in . You may want to create symbolic links from if you prefer system-wide data management.

Ensure you have at least 4GB of free disk space for datasets and cache files. For GPU acceleration, verify that your graphics drivers support OpenGL 4.3+.

Application: Used specifically for planet textures where high fidelity is required across all latitudes and longitudes.
Structure: Consists of six square images (front, back, left, right, top, bottom) that are algorithmically mapped onto a sphere.
Rendering Engine: The underlying rendering engine uses these cubemap textures to apply surface details dynamically as the planet rotates or changes perspective.

Low parameter count not being able to retain the information properly.
Very low number of test and showcase scripts used in fine-tuning.
Bad design of the training dataset.
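As a concrete reference for the fine-tuning setup described in this post, here is a minimal sketch of the LoRA recipe using Hugging Face transformers, peft, and trl. The rank (32), the targeting of all major projection layers, the 2.0e-4 learning rate, and the 8-bit AdamW optimizer come from the settings listed above; the model ID, LoRA alpha and dropout, batch sizes, epoch count, and a JSONL dataset with a "text" field are illustrative assumptions, not the actual gaiasky-finetune configuration.

```python
# Minimal sketch of the LoRA setup described above; values marked
# "assumption" are illustrative, not the project's real configuration.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM
from trl import SFTConfig, SFTTrainer

model_id = "Qwen/Qwen2.5-7B-Instruct"  # placeholder for the Qwen model used

model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# LoRA adapters on all major projection layers, as in the post.
peft_config = LoraConfig(
    r=32,                 # LoRA rank from the post
    lora_alpha=64,        # assumption: alpha = 2 * rank
    lora_dropout=0.05,    # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, peft_config)

# Assumption: training pairs serialized as JSONL with a "text" field.
train_dataset = load_dataset("json", data_files="train.jsonl", split="train")

config = SFTConfig(
    output_dir="gaiasky-lora",
    learning_rate=2.0e-4,           # from the post
    optim="adamw_8bit",             # AdamW 8-bit, as in the post
    per_device_train_batch_size=2,  # assumption
    gradient_accumulation_steps=8,  # assumption
    num_train_epochs=3,             # assumption
    bf16=True,
    dataset_text_field="text",
)

# Recent trl versions load the matching tokenizer automatically.
trainer = SFTTrainer(model=model, args=config, train_dataset=train_dataset)
trainer.train()
model.save_pretrained("gaiasky-lora-adapter")
```

The resulting adapter would then be merged and converted to a Q4_K_M GGUF for distribution, as outlined in the quantization step above.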

マリウス 2 weeks ago

Updates 2026/Q1

This post includes personal updates and some open source project updates. 안녕하세요 (hello) and greetings from Asia! Right now I’m in Seoul, Korea. I’ll start this update with a few IRL experiences regarding my time here and some mechanical keyboard-related things. If you’re primarily here for the technical stuff, you can skip forward or even skip all of the personal things and jump straight to the open source projects. With that said, let’s dive straight into it. Seoul has been one of the few places that I genuinely love coming back to. I cannot pinpoint why that is, but there’s a particular rhythm to the capital that’s hard to explain until you’ve lived in it for a while. Not the tourist rhythm, where you tick off palaces and night markets to “complete your bucket list”, but the deeper, slower one that makes the city truly enjoyable. The rhythm of picking a neighborhood, learning its backstreets, finding your morning coffee spot, and then finding a different one the following week. I spent my time here doing exactly that, and what follows are some honest reflections on a city that continues to surprise me. As some of you might know by now, I’m basically the Mark Wiens of coffee, because I travel for coffee, except that I don’t film myself and put it online. But I’ve surely had a lot of coffee, in a lot of cities. However, Seoul’s coffee scene operates on a completely different level. The sheer density of independently run coffee shops is staggering. Within a fifteen-minute walk in neighborhoods like Mangwon, Hapjeong, or Sangsu, you can pass dozens of places where someone is carefully dialing in their espresso, roasting their own beans, and serving a beautifully made Americano for usually around three or four thousand KRW. That’s roughly two to three US dollars for a genuinely excellent cup of coffee, which is a pretty solid value proposition. I’ve been in Seoul before, multiple times actually, and I had the chance to find genuinely great cafes which I kept on my list of places to revisit whenever I would happen to come back. And so I did. But as life moves forward, places change or, in more unfortunate circumstances, even close down for good. das ist PROBAT is one of the places that sadly closed just a few days before I arrived. In its spot is now a new ramen restaurant that seemed fairly popular. A few other places I’d loved on previous visits and that are still operating left me genuinely disappointed this time around. Compile Coffee was one of the sharper letdowns. Two years ago, it was a highlight. This time, however, the experience felt rushed and careless. The barista hurried through the ordering process, despite no one else waiting in line, and the cappuccino that followed was a spectacle for all the wrong reasons. The milk was frothed to an almost comical extreme, the liquid poured in first, then the foam scooped in one spoonful at a time, and finally a thick layer of chocolate powder on top that I hadn’t asked for. It felt like watching a car accident happening slowly enough for every detail to remain stuck in one’s head, yet too fast to articulate anything about it. I gave the place another try a few weeks after this incident only to experience a similarly rushed and somewhat unloving execution. Another change that I hadn’t seen coming was Bean Brothers in Hapjeong. The coffee house converted from their old industrial-style space to a noticeably more polished and… well, “posh” one. The new spot is nice enough, but the vibe has shifted towards a more upscale, less alternative one.
In addition, they also opened up a new location in Sangsu, which leans further in that direction, with wait times for walk-ins that suggest a clientele they’re specifically courting. Bean Brothers seems to be evolving into a streamlined, upscale chain, and while that’s not inherently bad, it’s a different thing from what originally made it special. And last but not least, there’s Anthracite Coffee Roasters, specifically the Seogyo location, which had been one of my absolute favorite spots back in 2023. It pains me to say this, but the place has become a ripoff, with this specific location charging eight thousand KRW for a hot (drip coffee) Americano to go. For context, the healthy food chain Preppers serves a full meal consisting of a big portion of rice and a protein, as well as some greenery, for 8,900 KRW. The cup of drip coffee at Anthracite is only halfway full, and most of the time it arrives already lukewarm, which makes it essentially useless as a to-go option, unless all you want is to gulp down around 120ml of coffee. You’d think a place charging premium prices would at least discount a thousand Won for takeaways, as many Seoul cafes do. The Seogyo location’s commitment to drip coffee not only makes it feel somewhat pretentious considering the prices, but also adds a whole other layer of issues. During peak hours, the wait is considerable, and the coffee menu is limited to a small rotation of options that, more often than not, skew toward the acidic side of the spectrum. If that’s your preference, there’s nothing wrong with that. But when combined with the pricing, the lukewarm temperatures, and the half-filled cups, the experience increasingly feels like you’re paying for a brand name rather than a good cup of coffee. However, the beautiful thing about Seoul’s coffee culture is that for every established spot that drifts toward becoming another Starbucks experience, ten new places pop up that more than make up for it. The ecosystem is relentlessly self-renewing. In the same neighborhood as Anthracite’s Seogyo location, I discovered a handful of places that are not only better in the cup, but dramatically more affordable:

Kauf Roasters: A roastery with a clear focus on simplicity and quality without pretension.

Identity Coffee Lab: This one stunned me. A hot Americano to go for 3,000 KRW. That’s almost a third of what Anthracite charges. And the coffee isn’t just cheaper, it is significantly better! It’s a bigger cup, it’s notably less acidic, and, here’s the part that really got me, it comes out steaming hot and stays that way for a good twenty minutes. You can actually walk around and sip it casually, even in freezing cold temperatures, just the way a to-go coffee is meant to be enjoyed, instead of gulping it down before it turns into cold brew.

Oscar Coffee Booth: This became a personal favorite. Another spot where the coffee is serious, the price is fair, and nobody is trying to impress you with anything other than a well-made drink. On top of that the owner is a genuinely kind person.

These are only a handful of places that I think of off the top of my head, but rest assured that there are plenty more. The quiet confidence of people who care about the craft without needing to perform it is what makes these places special. No gimmicks, no inflated prices justified by whatever interior design. Just friendly people and good coffee that’s made well and respects the customer. The time in Seoul reinforced what I already knew from past visits. This city is one of the best places in the world to simply be in. The neighborhoods are endlessly walkable, the infrastructure works beautifully (with the exception of traffic lights and escalators, but more on that in a bit), and the coffee culture, despite the occasional disappointment from places that have lost their way, remains one of the richest and most dynamic I’ve encountered anywhere. The disappointments, if anything, make the discoveries sweeter. The food also deserves a mention. Seoul is one of those cities where even a quick, unremarkable lunch tends to be delicious and more often than not at a sane price, judging from a global perspective. Compared to other capital cities like London or, worse, Madrid, in which food prices are frankly absurd, especially when taking the generally low quality into account, the cost of food in Seoul still strikes me as overall reasonable.
Unlike for example Madrid, which has an almost homogeneous food scene, Seoul offers incredibly diverse options, ranging from traditional Korean food all the way to Japanese, Thai, Vietnamese and even European and Latin American food. And while the Italian pasta in many places in Seoul might not convince an actual Italian gourmet, it suddenly becomes a very high bar to complain about dishes that originate as far as twelve thousand kilometers/seven thousand miles away and that have almost no local cultural influence. Another beautiful thing about Seoul, at least for keyboard enthusiasts like I am, is the availability of actual brick-and-mortar keyboard stores. Seoul is home to three enthusiast keyboard shops: Funkeys, SwagKeys, and NuPhy. The first two are local vendors that have physical locations across Seoul; the latter is a Hong Kong-based manufacturer of entry-level enthusiast boards that just opened a showroom in Mapo-gu. I took the time to try to visit each of them and I even scooped up some new hardware. The Funkeys store is located in the Yongsan district, on the second floor of a commercial space. The store is relatively big and stocks primarily FL-Esports, AULA, and 80Retros boards, keycaps and switches, but you can also find a few more exclusive items like the Angry Miao CyberBoard. I seized the opportunity to test (and snap up) some 80Retros switches, but more on that further down below. SwagKeys is probably a name that many people in the keyboard enthusiast community have stumbled upon at least once. They are located in the Bucheon area and they used to have a showroom, which I tried to visit. Sadly, it wasn’t clear to me that the showroom was temporarily (permanently?) closed, so I basically ended up standing in front of the locked doors of an otherwise empty space. Luckily, however, SwagKeys have pop-up stores in different malls, which I have visited as well. Unfortunately, in those pop-up stores they only seem to offer entry-level items; enthusiast products are solely available through their web shop and cannot be ordered and picked up at any of their pop-up locations. I was curious to test and maybe get the PBS Modern Abacus, which SwagKeys had in stock at that time, but none of the pop-ups had it available.

Exclusive SwagKeys pop-up.
This is a shared space with plenty of other brands to choose from.

The NuPhy showroom in the Mapo-gu area is a small space packed with almost all the products the brand offers, from keyboards, switches and keycaps all the way to accessories and folios/bags. However, the showroom is exactly that: A showroom. There’s no way to purchase any of the hardware. As with almost everything in Seoul, your best bet is to order it from NuPhy’s official Korean store, which accepts Naver Pay. Apart from Funkeys, SwagKeys and NuPhy, there are various brands (like Keychron, Razer and Logitech) that can be found across in-store pop-ups in different malls. It’s interesting to see a society like the one in Seoul, which has largely moved away from offline shopping for almost everything but fashion (more on this in a moment), having that many shops and pop-ups selling entry-level mechanical keyboards. I guess with keyboards being something in which haptics and personal preference play a big role, it makes sense to have places for people to test the various boards and switches, even if most of them will ultimately only sell the traditional Cherry profiles.
Speaking of mechanical keyboards, I happened to be in the right place at the right time this year to visit the Seoul Mechanical Keyboard Expo 2026 at the Seoul Trade Exhibition Center (SETEC) in the Gangnam area. It was an interesting experience despite being less of a traditional enthusiast community event and more of a manufacturer trade fair targeting average users. Because yes, the average user in Korea does indeed seem to have a soft spot for mechanical keyboards. This, however, meant that most vendors would primarily showcase the typical mainstream products, like Cherry profile keycaps and boards that are more affordable. For example, while Angry Miao were around, their Hatsu board was nowhere to be seen. And it made sense: Every vendor had little signs with QR codes that would lead to their store’s product page for people to purchase it right away. Clearly, the event was geared more toward the average consumer than the curious enthusiast. It was nevertheless interesting to see an event like this happening in the wild. Getting around is different in Seoul than it is in other cities. If you’re navigating Seoul with Google Maps, you’re doing it wrong. Naver Map is simply superior in every way that matters for daily life here, although this might soon change. Not only does Naver show you where the crosswalks are, something you don’t realize you need until you’ve jaywalked across six lanes of traffic because Google told you the entrance was “right there”, but it also shows last order times for restaurants and cafes, saving you from going to places only to find out they’re not serving anymore. And public transit arrival times? Accurate to a degree that feels almost unsettling. You trust Naver, because it earns that trust. Clearly, however, me being me, I only used Naver without an account and on a separate profile on my GrapheneOS phone. Also, I mostly use it for finding places and public transit; for everything else CoMaps works perfectly fine, and I take care to contribute to OSM whenever I can. Note: The jaywalking example isn’t too far-fetched. You’re very tempted to cross at red lights simply because traffic light intervals in Seoul are frankly terrible. As a pedestrian you age significantly waiting for the stoplight to finally turn green. If you’re unlucky, you’re at a large crossing that is followed by smaller crossings, which for reasons I cannot comprehend turn green for pedestrians at the exact same time. Unless you are Usain Bolt there is no way to make it across multiple crossings in one go, leading you to have to stop at every crossing for around three minutes. That doesn’t sound like much, until you’re out at -15°C/5°F. Seoul has too many pedestrian crossings with traffic lights, and too few simple marked crosswalks. This is, however, probably due to drivers often not giving a damn about traffic rules and almost running over people trying to cross at regular marked crossings. My gut feeling tells me that, because of the indifference of drivers, the government decided to punish every traffic participant by building traffic lights at almost every corner. However, this didn’t have the (supposedly) intended effect, as especially scooters, but also regular cars, often couldn’t care less about their bright red stop light. Considering the amount of CCTVs (more on this in just a second) one could assume that traffic violations are being enforced strictly. However, judging by the negligence of drivers towards traffic rules I would guess that this is probably not happening.
Circling back to the painfully long waiting times at crossings, which are only outrivalled by the painfully slow escalators literally everywhere: a route for which CoMaps estimates 10 minutes can hence easily become a 20-minute walk. Naver, however, appears to be making time estimations based on average waiting times at crossings, leading to it being more accurate than CoMaps in many cases. With Naver being independent of Google, it works without any of the Google Play Services bs that apps often require for anything related to location. And don’t get me wrong, Naver is just as much of an E Corp as Google, but there’s something worth appreciating on a broader level here. Korea built and maintains its own mapping platform rather than ceding that ground to US big tech, and it shows. Naver Map is designed by people who actually navigate Korean cities, and that local knowledge is baked into every interaction. I would love to see more countries doing the same, especially European ones. While Europe has HERE WeGo (formerly Nokia HERE Maps), it’s as bad for public transport as you might expect from a joint venture between Audi, BMW and Mercedes-Benz, and it is not at all comparable to Naver Maps, let alone Naver as a whole. One big caveat with Naver, however, is that it will drain your battery like a Mojito on a hot summer evening, so it’s essential to carry a power bank. Even on a Pixel 8, the app feels terribly clunky and slow. In addition, the swiping recognition more often than not mistakes horizontal swipes (for scrolling through photos of a place) for vertical swiping, making it really cumbersome to use. I assume that on more modern Samsung and Apple devices the app probably works significantly better, as the Korean market appears to be absolutely dominated by these two brands. As a matter of fact, the Google Pixel is not even being sold in Korea, which brings me to one important aspect of life in Seoul that might be interesting for the average reader of this site. As much as I enjoy Seoul, it is an absolute privacy disaster. CCTV cameras in Seoul are everywhere and the city government actively expands and upgrades them as part of its public-safety and smart city initiatives. The systems are “AI”-enabled and can automatically detect unusual behavior or safety risks. It’s hard to find a definitive number, but it’s estimated that Seoul is covered with around 110,000 to 160,000 surveillance cameras, with an ongoing expansion of the network. This makes Seoul one of the most surveilled major cities in the world. In addition to CCTV surveillance, Seoul is also almost completely cashless. Most places only accept card/NFC payments, with cash payments being a highly unusual thing to do. While there are still ATMs around, getting banknotes is almost pointless. You can top up your transit card using cash, and you might be thinking that at least this way nobody knows who owns the card and you cannot be tracked, but with the amount of “AI” cameras everywhere, there’s no need to track people using an identifier as primitive as a transit card. Speaking of which, mobile connectivity is another thing. In Korea, SIM cards are registered using an ID/passport. From what I have found, there’s no way to get even just a pre-paid SIM without handing over your ID. In addition, with everything being cashless, your payment details are also connected to the SIM card. You could of course try to only use the publicly available WiFi to get around and spare yourself the need for a SIM card.
However, the moment you want to order something online, you will need a (preferably Korean) phone number that can receive verification SMS, and you might even need to verify your account with an ID. You might think that this doesn’t really matter because online shopping isn’t something vital that you have to do. But with Seoul being almost completely online in terms of shopping, you cannot find even the most basic things easily in brick-and-mortar stores. For example, I was looking to upgrade my power brick from the UGREEN X757 15202 Nexode Pro GaN 100W 3-Port charger that I’ve been using for the past year to the vastly more powerful UGREEN 55474 Nexode 300W GaN 5-Port charger. I bought the 3-Port Nexode last year during my time in Japan, in a Bic Camera. However, in Seoul it was impossible to find any UGREEN product. In fact, I could not find any household name products, like Anker or Belkin, regardless of where I looked. Everyone kept telling me to look online, on Naver or Coupang. Short story long, to be able to live a normal life in Seoul you will unfortunately have to hand over your details at every corner. Note: Only one day before publishing this update, the popular Canadian YouTuber Linus Tech Tips uploaded a video titled “Shopping in Korea’s Abandoned Tech Mall”, which perfectly captures the sad state of offline tech stores in Seoul. What I found more shocking than this, however, is that it doesn’t seem like privacy concerns are part of the public discourse. The dystopian picture that people in the Western hemisphere paint in literature and movies, in which conglomerates run large parts of society and the general population are merely an efficient workforce and consumers, isn’t far off from how society here appears to be working. At the end of February I ran into an issue that I had seen before: Back then, I attributed it to either alpha particles or cosmic rays, as I was unable to reproduce the issue or reliably find bad regions in the RAM. This time, however, my laptop was crashing periodically, for seemingly no reason at all. After running the whole playbook of and to verify the filesystem, as well as multiple rounds of the , I found several RAM addresses that were reported faulty. I decided to seize the opportunity and publish a post on BadRAM. At this point, I removed one of the two 32GB RAM sticks and it appears to have helped at least somewhat: The device now only crashes every few hours rather than every twenty or so minutes. But with RAM and SSD prices being what they are, I’m not even going to attempt to actually fix the issue. After all, it might well be that whatever is causing the buzzing sound I’ve been hearing on my Star Labs StarBook has also had an impact on the RAM modules or even the logic board. I’m going to hold on to this hardware for as long as possible, but I’ve also realized that the StarBook has aged more quickly than I anticipated. I have therefore been glancing at alternatives for quite a while now. I love what Star Labs has done with the StarBook Mk VI AMD in terms of form factor and Linux support. Back when I bought it, the Zen 3 Ryzen 7 5800U had already been on the market for almost 4 years and wasn’t exactly modern anymore. However, its maturity gave me hope that Linux support would be flawless (which is the case) and that Star Labs would eventually be able to deliver on their promises. When I purchased the device, Star Labs had advertised an upcoming upgrade from its American Megatrends EFI (“BIOS”) to Coreboot, an open-source alternative.
Years later, however, this upgrade is still nowhere to be seen. At this point it is highly unlikely that Coreboot on the AMD StarBook will ever materialize. As already hinted exactly one year ago, I’m done waiting for Star Labs and I am definitely not going to look into any of their other (largely obsolete) AMD offerings, especially considering the outrageous prices. I’m also not going to consider any of their StarBook iterations, whether it’s the regular version or the Horizon, given that none of them come with AMD CPUs any longer, and, more importantly, that their Intel processors are far too outdated for their price tags. Let alone all the quirks the Star Labs hardware appears to be having, and the firmware features that sometimes make me wonder what the actual f… the Star Labs people are smoking. Note: The firmware update lists the following change: * Remove the power button debounce (double press is no longer required) “Power button debounce” is what Star Labs calls the requirement to double-press the power button in order to power on the laptop when it is not connected to power. It is mind-boggling that this feature made it into the firmware to begin with. Who in their right mind thought “Hey, how about we introduce a new feature with the coming firmware update which we won’t communicate anywhere, which requires the user to press the power button quickly twice in a row for their device to power on, but only when no power cable is connected? And how about, if they only press it once when no power cord is attached, the device simply won’t boot, but it will nevertheless produce a short audible sound to make it seem like it tried to boot, even though in reality it won’t boot?” …? Because this is exactly what the “power button debounce” was about. I believe it got introduced sometime around , but I can’t really tell, because Star Labs didn’t mention it anywhere. Short story long, instead of spending more money on obsolete and quirky Star Labs hardware, I have identified the ASUS ExpertBook Ultra as a potential successor. The ExpertBook Ultra is supposed to be released in Q2 in its highest performance variant, featuring the Intel Core Ultra X9 Series 3 388H “Panther Lake” processor, running at 50W TDP and sporting up to 64 GB LPDDR5x memory, which is the model that I’m interested in. I will wait out the reviews, specifically for Linux, but unless major issues are to be expected I’ll likely upgrade to it. “Wait, aren’t you Team Red?”, you might be wondering. And, yes, for the past decade I’ve been solely purchasing AMD CPUs and GPUs, with one exception, which was a MacBook with an Intel CPU. However, at this point I’m giving up on ever finding an AMD-based laptop that fits my specs, because sadly with AMD laptops it’s always something: Either the port selection sucks, or there’s no USB4 port at all, or if there is it’s only on one specific side, or the display and/or display resolution sucks, or the battery life is bad, or you can only get some low-TDP U variant, or the device is an absolute chonker, or or or. It feels like with an AMD laptop I always have to make compromises at a price point at which I simply don’t want to have to make these compromises anymore. So unless AMD and the manufacturers – looking specifically at you, Lenovo! – finally get their sh#t together to build hardware that doesn’t feel like it’s artificially choked, I’m going back to Team Blue.
“Panther Lake” seems to have made enough of a splash, TDP-performance-wise, that it is worth considering Intel again, despite the company’s history of monopolistic business tactics, its anti-consumer behavior, its major security flaws, its quality control issues, and its general douchebag attitude towards everything and everyone. The ASUS ExpertBook Ultra appears to feature the performance that I want, with all the connectivity that I need, packaged in a form factor that I find aesthetically pleasing and lightweight enough to travel with. If the Intel Core Ultra X9 388H notably exceeds the preliminary benchmarks and reviews of the Intel Core Ultra X7 358H version of the ExpertBook Ultra, then I’m “happy” to pay the current market premium for a device that will hopefully hold up for much longer and with fewer quirks than I’ve experienced with the StarBook. With a Speedometer 3.1 rating of around 30 and reporting 11:25:05 hours for on my current device, however, I’m fairly certain that even the X7 358H will be a significant improvement. “Did you hear about the latest XPS 14 & 16 from Dell? They also come with Panther Lake!”, I hear you say. See here and there on why those are seemingly disappointing options. The tl;dr is that Dell only feeds them 25W (14") / 35W (16"), instead of the 45W that ASUS runs the CPU at. I can’t tell for sure how long I’ll be able to continue working on the StarBook. While I can do the most critical things, the looming threat of data corruption and loss is frightening. The continuous crashes also introduce unnecessary overhead. I’m hoping for ASUS to make the ExpertBook Ultra available sooner rather than later, but if there’s no clarity on availability soon, I might have to go with a different option. Ultrabook Review luckily has a full list of Panther Lake laptops to help with finding alternatives. What’s the second best thing that can happen when your computer starts failing? Exactly: Your phone (slowly) dying. It appears that the infamous Pixel 8 green-screen-of-death has hit my GrapheneOS device, making it almost impossible to use. Not only does the display glitch terribly, but the lower part of the phone also gets abnormally hot. When the glitching began, it would be sufficient to literally slap the bottom part of the phone and it would temporarily stop glitching. Sadly, the effectiveness of this workaround has decreased so much over time that now I basically need to squeeze the bottom part of the phone for the glitching to stop. The moment I decrease the pressure, the screen starts glitching again. My plan was to keep the Google Pixel 8 for the next few years and eventually move to a postmarketOS/Linux phone as soon as there is a viable option. Sadly it seems that I’m going to have to spend more money on Google’s bs hardware to get another GrapheneOS device for the time being. Unfortunately Google is not selling Pixel devices across Asia, making it hard to find an adequate replacement for the phone right now. I might just have to suck it up and wait until I pass through a region in which Pixel devices are more widely available. Of course, I luckily brought backups, although those run malware and are hence less than ideal options. My Anker Soundcore Space Q45 have died on me during a flight, for absolutely no reason at all. I purchased them back at the end of May 2024 and now, after not even two years, it appears that the electronics inside of them broke in such a way that the headphones cannot be turned off or on again.
They seem to be stuck in an odd in-between state, in which pressing e.g. the ANC button does something and makes the LED light up, but there’s no Bluetooth connectivity whatsoever. When connecting them via USB-C to power or to another device, the LED changes dozens of times per second between white and red. Holding the power button makes the LED turn on (white) but nothing else. The moment the power button is let go, the LED turns back off. This is yet another Anker product that broke only shortly after its warranty expired, and I’m starting to see a common theme here. Hence, I will avoid Anker products going forward, especially given the tedious support that I experienced in the past with one of their faulty power banks. I still use the Soundcore headphones via audio jack, as this luckily works independently of the other electronics. To avoid anything bad happening, especially during flights, I opened the left earcup and removed the integrated battery. The USI 2.0 stylus that I had bought back in mid-September 2024 from the brand Renaisser is another hardware item that has pretty much died. It seems like the integrated battery is done, hence the pen doesn’t turn on anymore unless a USB-C cable is connected to it to power it externally. While I’m still using it, it is slightly inconvenient to have a relatively stiff USB-C cable pull on the upper end of the pen while writing or editing photos, which is what I use the pen primarily for. As mentioned in the Seoul part, I picked up a handful of mechanical keyboard-related items, namely MX switches for my keyboard(s).

KTT x 80Retros GAME 1989 Orange, 40g (22mm KOS single-stage extended, bag lubed with Krytox 105), lubed with Krytox 205G0.
80Retros x HMX Monochrome, 42g (48g bottom out), LY stem, PA12 top housing, HMX P2 bottom housing, 22mm spring, factory lubed, 2mm pre-travel, 3.5mm total.

I invested quite some time in pursuing my open source projects in the past quarter, hence there are a few updates to share. This quarter I have finally found the time to also update my feature and make it work with the latest version of Ghostty, the cross-platform terminal emulator written in Zig. You can use this commit if you want to patch your version of Ghostty with this feature. It is unlikely that the Ghostty team is ever going to include this feature in their official release, yet I’m happy to keep maintaining it as it’s not a lot of code. I have updated and it now supports a new flag (that does not support), which makes it possible to build a complete power management policy directly through command-line arguments. I have documented it in detail in the repository, but the idea is that the flag allows executing arbitrary shell commands when the battery reaches a specific percentage, either by charging or discharging. The flag takes three arguments:

: Either (aliases: , ) or (aliases: , )
: The battery level (number from 0 to 100)
: The shell command to execute

For , the command fires when the battery percentage drops to or below the given value. For , it fires when the percentage reaches or exceeds it. The command fires once when the condition is met and will only fire again after the condition has cleared and been met again. Additionally, the flag can be specified multiple times to define different rules. This makes it possible to build a complete power management policy, from low-battery warnings to automatic shutdown, without any external scripts or configuration files.
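The fire-once-then-re-arm behavior is essentially edge-triggering with hysteresis. Here is a small illustrative Python sketch of those rule semantics; the actual tool is written in Zig, and since its name and flag names are elided above, everything here (the Rule class, the "below"/"above" keywords, the example commands) is hypothetical:

```python
# Illustrative sketch of the rule semantics described above, not the
# actual Zig implementation: a rule fires once when its threshold is
# crossed and only re-arms after the condition has cleared again.
import subprocess

class Rule:
    def __init__(self, direction: str, level: int, command: str):
        self.direction = direction  # "below" (discharging) or "above" (charging)
        self.level = level          # battery percentage threshold, 0-100
        self.command = command      # shell command to run
        self.armed = True           # a rule only fires while armed

    def check(self, percent: int) -> None:
        met = (percent <= self.level if self.direction == "below"
               else percent >= self.level)
        if met and self.armed:
            subprocess.run(self.command, shell=True)
            self.armed = False      # don't fire again...
        elif not met:
            self.armed = True       # ...until the condition has cleared

# Specifying the flag multiple times corresponds to multiple rules,
# forming a complete policy from warnings to automatic shutdown:
rules = [
    Rule("below", 20, "notify-send 'Battery low'"),
    Rule("below", 5, "systemctl poweroff"),
    Rule("above", 80, "notify-send 'Battery charged'"),
]

for percent in [50, 19, 18, 30, 15]:  # simulated battery readings
    for rule in rules:
        rule.check(percent)
```

In this simulation, the 20% rule fires once at 19, stays quiet at 18, re-arms at 30, and fires again at 15, which mirrors the "fires once until the condition has cleared and been met again" behavior described above.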
The benefit this has over, let’s say, rules, is that script execution as the current user is significantly easier, less hacky, and poses fewer overall security risks, as does not need to (read: should not) be run in privileged mode. Another one of my Zig tools that got a major update is , the command line tool for getting answers to everyday questions like or more importantly . The new version has received an update to work with Zig 0.15.0+ and its command-line argument parser logic was rewritten from scratch to be able to handle more complex cases. In addition, is now able to do a handful of velocity conversions, e.g. . As a quick side note, alongside the breadth-first search implementation that it is using, , has also been updated to support Zig 0.15.0+. I had some fun a while ago building an XMPP bot that’s connected to any OpenAI API (e.g. ) and is able to respond when mentioned and to reply to private messages. It preserves a single context across all messages, which might not be ideal in terms of privacy, but it is definitely fun in a multi-user chat – hey, btw, come join ours! The code is relatively crude and simple. Again, this was just a two-evening fun thing, but you can easily run the bot yourself; check the README and the example configuration for more info. The work on my new project, ▓▓▓▓▓▓▓▓▓▓▓, which I had announced in my previous status update, sadly didn’t progress as quickly as I was expecting it to, due to (amongst other things) the RAM issues that I’ve had to deal with. It also turns out that when writing software in 2026, everyone seems to expect instant results, given all the Codexes and Claudes that are usually being employed these days to allow even inexperienced developers to vibe code full-blown Discord alternatives within short periods of time. However, because I don’t intend to go down that path with ▓▓▓▓▓▓▓▓▓▓▓, it will sadly take some more time for me to have a first alpha ready. To everyone who reached out to offer their help with alpha testing: You will be the first ones to get access as soon as it’s ready.

ava's blog 2 weeks ago

rose ▪ bud ▪ thorn - march 2026

Reply via email Published 31 Mar, 2026 I was featured as a Country Reporter on noyb's channels! My summaries made it into their newsletter 4 times this month. I reached Gold Status in my volunteering (20+ summarized and translated decisions for GDPRhub) now. Next up is the Magenta Status at 35+ :) I've written 4 exams this month; if I pass all of them, that's 30 ECTS! I think I'll pass 3. Switched away from Discord . I have no issue with being classified as a teen on the platform because it doesn't stop me from doing anything, but the move fit in with living my actual values like I do with other tech/media things (preferring open source, EU, etc.). I'm both on Matrix and Fluxer. Did some spring cleaning, like clearing out the fridge, wiping the inside, and rearranging the contents, together with throwing away expired toiletries, putting like 2 years of used batteries in the battery collection bin, decluttering a drawer, and vacuuming under and behind the sofa and bed. I've really felt like pouring extra energy into my looks lately. Got back into oil massages for my scalp, hair treatments, sheet masks, teeth bleaching, and got my nails done again (after going natural since December) and got a pedicure, too. I bought new dress pants that are so insanely comfortable, good looking and flattering, it's ridiculous! My yearly gyn checkup came back fine, and I finally caved and got proper treatment for my PCOS and endometriosis. I went out for some runs in the late evening :) haven't run outside in ages, I usually limit it to the treadmill. I went out to parks and forests , enjoying the weather and my free time after the exams. It was super healing and relaxing. Journaled more. Went to a vegan food fair. I applied to a job opening sent to me by a fellow blogger (James) and got an interview !!! I think I did well :) Upcoming: More decluttering and selling, tidying up the basement. Planning to go to two museum exhibitions soon before they close. Gonna go on vacation with two friends for 8 days next month! Booked tickets for an upcoming data protection event. Working on business cards (and maybe stickers?) for it. I've had some issues with my illnesses . :( The stress of intense studying most of February and March, weird weather changes, straining work stuff, eating a little too much sugar, the family situation, and starting two new medications this month sent my body over the edge. That made my fitness goals and studying a bit harder. I also unfortunately didn’t taper off a bigger dose of an anti-anxiety med I occasionally take as needed and accidentally caused agonizing withdrawal symptoms without realizing in time 🥴 I cut contact with the last family member I was still talking to. It's stressful to withstand all the attempts to reach out to me, and to stick with the decision without guilt. My wardrobe is stressing me a little. I preferred not to own much. Unfortunately, the less you have, the more you wear the same things, the more they get washed and worn out. At some point, you want to replace a lot of it at the same time. That's not only financially painful, but also annoying when your goal is to sew most of your clothes yourself and you currently have neither the time nor the energy to buy fabric and sew the things you need. I am annoyed at walking into these fast fashion places, seeing nothing I like, then forcing myself to look at stuff more closely and everything is XS, feels like a trash bag, and costs too much for how flimsy and unethical it is.
I'll have to try my luck with thrifting more, but even that has been overrun with Shein trash. If I make it to the second interview round, I might have to decline it. I like the company, they’re a great and respected employer, generous, and the interview was fun… but there are some dealbreakers for me, which hurts. I sat with it after, and slept on it now, and I just don’t think I’ll be happy in these circumstances. :( I wish it wasn’t so, because they were in the Top 3 of places I’d wanna work at, and I want a job in data protection badly. But it doesn’t feel right, and I can’t justify moving forward with it, all things considered. It feels like the wrong time for me. Maybe another open position in a couple years?

Brain Baking 2 weeks ago

App Defaults In March 2026

It’s been almost three years since sharing my toolkit defaults (2023). High time to report an update. There’s a second reason to post this now: I’ve been trying to get back into the Linux groove (more on that later), so I’m hoping to either change the defaults below in the near future or streamline them across macOS & Linux. Where a default changed, I’ll provide more information; otherwise see the previous post as linked above.

Backup system: Still Restic, but I added Syncthing into the loop to get that 1-2-3 backup number higher. I still have to buy a fire-proof safe (or sync it off-site).
Bookmarks and Read It Later systems: Still Alfred & Obsidian. Experimenting with Org-mode and org-capture; hoping to migrate this category to Emacs as well.
Browser: Still Firefox.
Calendar and contacts: Still self-hosted Radicale.
Chat: Mainly Signal now, thanks to bullying friends into using it.
Cloud File Storage: lol, good one.
Coding environment: For light and quick scripting, Emacs (previously Sublime Text)! Otherwise, any of the dedicated tools from the JetBrains folks. and can only do so much; it’s dreadful in Java.
Image editor: Still ImageMagick + GIMP.
Mail: Mu4e in Emacs (previously Apple Mail on macOS) for brainbaking, and Apple Mail (previously Microsoft Outlook) for the work Exchange server. I didn’t want to mix, but since Mu cleared up Mail, that’s much better than Outlook.
Music: Still Navidrome.
Notes: Still pen & paper, but I need to remind myself to take up that pen more often.
Password Management: Still KeePassXC.
Photo Management: Still PhotoPrism. I considered replacing it but I barely use it; it’s just a photo dump place for now.
Podcasts: I find myself using the Apple Podcast app more often than in 2023. I don’t know if that’s a bad thing—it will be if I want to migrate to Linux.
Presentations: Haven’t found the need for one.
RSS: Still NetNewsWire, but since last year it’s backed by a FreshRSS server, making cross-platform reading much better. The Android client app used is Randrop now, so that’s new.
Spreadsheets: For student grading, Google Sheets, or Excel if I have to share it with colleagues. My new institution is pro Teams & Office 365. Yay.
Text Editor: I’m typing this Markdown post in Emacs (previously Sublime Text).
Word Processing: Still Pandoc if needed.
Terminal: emulator: Ghostty (previously iTerm2), but evaluating Kitty as well (I hated how the iTerm2 devs shoved AI shit in there); shell: migrated from Zsh to Fish two days ago! The built-in command line option autocomplete capabilities are amazing. Guess what: more and more I’m using eshell and Emacs.

Some more tools that have been adapted that don’t belong in one of the above categories:

Karabiner Elements to remap some keys (see the explanation).
I tried out Martha as a Finder alternative. It’s OK but I’d rather dig into Dired (Emacs)—especially if I see the popularity of tools like that just steal Dired features.
I replaced quite a few coreutils CLI commands with their modern counterparts: now is , now is , now is , now is , and can be used to enhance shell search history, but Fish eliminated that need.
AltTab for macOS replaces the default window switcher. The default didn’t play nice with new Emacs frames and I like the mini screenshot.

A prediction for this post in 2027: all tools have been replaced with Emacs. All silliness aside; Emacs is the best thing that happened to me in the last couple of months. Related topics: / lists / app defaults / By Wouter Groeneveld on 29 March 2026. Reply via email.

Chris Coyier 2 weeks ago

Hawai’i

I’m just back from the United States’ 50th state, a staggering 2,500 miles from the mainland. For the next week or two, I’ll pronounce it Ha-Vie-ee, like how it’s pronounced in the native Hawaiian language. A language, by the way, that only a few thousand people speak natively, no doubt due to the 91 years (1896-1987) during which there was “strict physical punishment” for speaking it in schools. We humans are pretty damn uncool to each other sometimes. Ruby and I travelled there ( again! ) with some wonderful family friends, Matt, Becky, and their kids, Monroe and Zoey. A nice reminder of how rare and lovely it is to have a situation where the kids are friends, and the adults are friends, and everyone travels together well. We stayed in a villa at the Fairmont Kea Lani on Maui. I’ve been to Hawaii before, but this was my first time on Maui. It was a beautiful place to stay. A beautiful property and buildings right on the beach. The villa had two spacious rooms, a full kitchen, and a living room with a pull-out couch, on which all the kids slept together. I’ve stayed at fancy resorts before where the staff uses special greetings with guests. But in Hawaii, naturally, it’s “Aloha.” Probably because, ya know, a real word, and basically the whole brand of Hawaii. But I just can’t shake the feeling that it’s kinda cheesy. Like, do Hawaii long-timers say Aloha to each other? Like it’s 5:21 am and a local is getting a coffee at the gas station in a local neighborhood, do they say Aloha to the cashier? Do they get an Aloha back? I kept meaning to ask this of locals, but kept forgetting. Or not having the exact 1.5 beers in me it takes to reach that perfect level of fun and charm to ask strangers semi-intimate questions. If I were forced to guess, I’d guess Aloha is more of a thing they have to do at work with the tourists. Like your boss side-eyes you if you just say “Hello, good morning” instead. I never said it back, which felt weird. My goal was kind of a winkwink, it’s cool , you don’t have to do the cheesy tourist thing with me, I very promise I don’t care. The first night, we got checked in and beelined it to Monkeypod. We’d all been there before (at a different location) and have talked about it endlessly. It’s a micro-chain with 4 locations across two islands. It’s just: great. They make a Mai Tai with Honey Liliko‘i Foam on top, of which I have fond memories, and it was every bit as good as I remembered. I had wings and mahi-mahi tacos. 10/10. I never get the fish. I don’t like fish. I like specific little bits of seafood once in a while, but rarely cooked slabs of fish. So on that very first night, I decided I’d get fish every night on this trip. Maybe if I try enough of it, I’ll come around. It didn’t work. I struck out more times than I hit. But no big regrets. I tried. Timing-wise, it wasn’t the absolute perfect time to be in Hawaii. But it was spring break for our school district, so c’est la vie. Unprecedented rain with some flooding. A rather ironic situation after the horrible fires just a few years back. We were watching the weather and reading the news weeks in advance, but things didn’t seem dire enough to cancel the trip. Honestly, some overcast weather isn’t the worst. None of us left with sunburns. It allows you to hang out outside all day, which you just can’t do on full-sun days, as it exhausts you. The first full day turned out to be one of the rainiest, and we spent most of it at the pool anyway. I got us a cabana that turned out to be awfully useful.
Being in the pool in the rain is no big deal, but lying out in chairs in the rain is annoying. And you certainly can’t crack open the laptop or read a paperback. I did both that day and was loving it. We were trying to book an ATV tour for ourselves, and that was the one thing we just couldn’t get done. The rainstorms just weren’t letting it happen. Apparently, there was too much debris and whatnot on the trails; the places that offered these tours didn’t reopen until after we left. We started most mornings at the breakfast buffet, included in our fancy villa booking. It was pretty crowded as they couldn’t seat people outside in the wet. Then we’d hit the water without fail. A few days we did the ocean, but came to understand it really wasn’t a good time for that. Storms wash landcrap out to sea, making the water look muddy. Apparently, that’s worse than just looking ugly; it can harbor dangerous bacteria. The guy at 808 clothing told me that you’d have to be a real idiot to go out in it and that real Hawaiians would never. Last year, some lady had to have her legs sliced open to flush out the bacteria (or something? The guy was pretty weird). Also, later, our zip-line guide told us she loves to surf and wouldn’t go out because the “muddy” water is extra-attractive to sharks, since the low visibility helps them more than it helps their prey. Also later, we went to a surfing beach absolutely full of obviously local surfers. Turns out people don’t exactly speak for all people. We did some knee-deep ocean stuff because it’s hard to resist. One day we drove up to Paia, a northern coastal city with extreme charm. Unfortunately we got there when it was pouring pretty good, so we spent most of it hustling between store overhangs. You could really see how close to flooding everything can get, quickly. We mostly just did a little shopping, walking around, and snacking in Paia, and I didn’t take many photos there. It was super cute though, highly recommended. I sorta regret not buying a ukulele bass from the music shop there as I’ve been eyeing one up like forever, ever since going on a trip with Brad Frost where he brought his. Which reminds me: we took the kids to uke lessons at the Fairmont and it was kind of a mess. Probably skip that. The hostess at the bar we stopped at told us how to get down to the turtle beach nearby (Ho’okipa). It was really pouring when we got there, so we just parked for a while and watched the surfers. Really amazing to watch. Huge waves. The turtle beach didn’t disappoint! Hitting the pool was a daily event. The kids are old enough that we could shoo them out the door to the pool and not worry about it too much. Two of the kids had trackable wrist watches that could make calls, so that was extra convenient. There was a swim-up bar that I appreciated existing despite never getting around to using it. I did use the walk-up bar once, and the Zach Bryan impersonator bartender made me a cocktail despite it being almost an hour after it was supposed to close. He was being fawned over by two women who wanted to make sure he had their number for later. I enjoyed that, naturally. Ruby’s favorite experience, and perhaps mine too, was the zip lining we did. We chose Haleakalā Zipline Tours as, well, it was open, and its location high up mid-island looked cool. It was. The two charming guides helped make it fun, showering us with bird facts and details about their conservation efforts. Ruby had to get over some fears of zip lining at all, which she did and of course now loves it.
I left thinking of other zip lining we could do back home and hoping to see an ʻAlalā (Hawaiian crow). We hit Black Rock Pizza on the way home, my only non-fish dinner. The very last day, our friends moved on to another island, while we were hitting the redeye flight home. We had most of the day to kill, so we wandered around the property a bit, wandered some stores, then went to the local cinema to catch Project Hail Mary (fun!) and then off to the airport. Only a 5-hour flight back to Seattle compared to the 7-hour flight from Salt Lake City on the way there. We both slept a little and it went easy breezy.

Stratechery 2 weeks ago

2026.13: So Long to Sora

Welcome back to This Week in Stratechery! As a reminder, each week, every Friday, we’re sending out this overview of content in the Stratechery bundle; highlighted links are free for everyone. Additionally, you have complete control over what we send to you. If you don’t want to receive This Week in Stratechery emails (there is no podcast), please uncheck the box in your delivery settings. On that note, here were a few of our favorites this week. This week’s Stratechery video is on Agents Over Bubbles. R.I.P. Sora, 2025-2026. AI Sam came, AI Sam saw, and AI Sam stole those GPUs. We’ll always have the memories. Unfortunately, it turns out that Sam would rather have the GPUs, so on Sharp Tech this week, Ben and I eulogized the app that took over the world for about two weeks last year. That included thoughts on copyright battles that may have sealed its fate, why Ben’s reluctant to be too critical, and more signs that OpenAI is serious about its enterprise pivot. Come for that conversation, and then stay for a rollicking spring mailbag that includes a great take on search advertising, F1 venting, the Vision Pro and my wife, kids and phones, and more. — Andrew Sharp The 2026 Bullseye List. The NBA Playoffs are only a few weeks away, which means Ben Golliver and I are already in preparation mode, including a delightful episode today running through a “Bullseye List” of superstars who will be under pressure this spring. We discuss everyone from Kevin Durant and Alperen Sengun to Jalen Brunson, Chet Holmgren, and Victor Wembanyama, a debatable inclusion, but undeniably the most magnetic star in the league right now. And yes, given my Luka takes in January, and Luka looking incredible throughout March, I did take accountability and add myself to the bullseye list. — AS Arm’s Big Shift. If you wanted more evidence that AI is changing everything, look no further than Arm: the company was famous for its high margin IP-licensing business model, but this week announced that instead of (just) facilitating other companies making chips, it would start making and selling chips itself. Naturally, their first offering is explicitly focused on AI data centers. I explained Arm’s motivations in Wednesday Update, and interviewed Arm CEO Rene Haas to get his point of view on Thursday. — Ben Thompson

Arm Launches Own CPU, Arm’s Motivation, Constraints and Systems — Arm is selling its own chips, not just licensing IP. It’s a big change compared to Arm’s history, but not surprising given how computing is evolving.
An Interview with Arm CEO Rene Haas About Selling Chips — An interview with Arm CEO Rene Haas about the company’s decision to not just license IP but make their own chips.
Tilting at Windmills — As the Iran war continues, let’s take a look at the Democratic Party, institutional media, and offshore wind farms.
John Ternus and Responsible Individuals
Sora and Mac Pro Dead
Singapore’s Sound Card Hero
A Giant Mess with Super Micro; Completely Correct
Xiong’an Progress; The PRC’s Balancing Act on Iran; Manus, Apple and Router News
The Intrigue(?) in the East, Peterson and Acuff On Center Stage, Revisiting Draft Kevin Durant
The BULLSEYE List in 2026: Playoff Questions for Ant, Chet, Tatum, Mitchell, Wemby, and Beyond
A Spring Break Mailbag: RIP Sora, Ads and Surplus, F1 Going in Reverse, Elon Inc., Smartphone Parenting, and More

Robin Moffatt 2 weeks ago

Look Ma, I made a JAR! (Building a connector for Kafka Connect without knowing Java)

As a non-Java coder, for the last ten years I’ve stumbled my way through the JVM-centric world of "big data" (as it was called then), relying on my wits with SQL and config files to just about muddle through. One of the things that drew me to Kafka Connect was that I could build integrations between Kafka and other systems without needing to write Java, and the same again for ksqlDB and Flink SQL—now stream processing was available to mere RDBMS mortals and not just the Java adonises. One thing defeated me though: if a connector didn’t exist for Kafka Connect, then I was stuck. I’d resort to cobbled-together pipelines leaning heavily on kcat (formerly kafkacat), such as I did in this blog post. I built some cool analytics on top of maritime AIS data about ships' locations, but the foundations were shaky at best: No failure logic, no schema handling, no bueno. What I really needed was a connector for Kafka Connect. However, for that you need Java. I don’t write Java. But Claude can write Java.

Ahead of AI 3 weeks ago

A Visual Guide to Attention Variants in Modern LLMs

I had originally planned to write about DeepSeek V4. Since it still hasn't been released, I used the time to work on something that had been on my list for a while, namely collecting, organizing, and refining the different LLM architectures I have covered over the past few years. So, over the last two weeks, I turned that effort into an LLM architecture gallery (with 45 entries at the time of this writing), which combines material from earlier articles with several important architectures I had not documented yet. Each entry comes with a visual model card, and I plan to keep the gallery updated regularly. You can find the gallery here: https://sebastianraschka.com/llm-architecture-gallery/

Figure 1: Overview of the LLM architecture gallery and its visual model cards.

After I shared the initial version, a few readers also asked whether there would be a poster version. So, there is now a poster version via Redbubble. I ordered the Medium size (26.9 x 23.4 in) to check how it looks in print, and the result is sharp and clear. That said, some of the smallest text elements are already quite small at that size, so I would not recommend the smaller versions if you intend to have everything readable.

Figure 2: Poster version of the architecture gallery with some random objects for scale.

Alongside the gallery, I have also been working on short explainers for a few core LLM concepts. So, in this article, I thought it would be interesting to recap all the recent attention variants that have been developed and used in prominent open-weight architectures in recent years. My goal is to make the collection useful both as a reference and as a lightweight learning resource. I hope you find it useful and educational!

1. Multi-Head Attention (MHA)

Self-attention lets each token look at the other visible tokens in the sequence, assign them weights, and use those weights to build a new context-aware representation of the input. Multi-head attention (MHA) is the standard transformer version of that idea. It runs several self-attention heads in parallel with different learned projections, then combines their outputs into one richer representation.

Figure 3: Olmo 2 as an example architecture using MHA.

The sections below start with a whirlwind tour of self-attention as a lead-in to MHA. It's meant more as a quick overview to set the stage for related attention concepts like grouped-query attention, sliding window attention, and so on. If you are interested in longer, more detailed coverage of self-attention, you might like my Understanding and Coding Self-Attention, Multi-Head Attention, Causal-Attention, and Cross-Attention in LLMs article.

EXAMPLE ARCHITECTURES
GPT-2, OLMo 2 7B, and OLMo 3 7B

1.2 Historical Tidbits And Why Attention Was Invented

Attention predates transformers and MHA. Its immediate background is encoder-decoder RNNs for translation. In those older systems, an encoder RNN would read the source sentence token by token and compress it into a sequence of hidden states, or in the simplest version into one final state. Then the decoder RNN had to generate the target sentence from that limited summary. This worked for short and simple cases, but it created an obvious bottleneck once the relevant information for the next output word lived somewhere else in the input sentence. In short, the limitation is that the hidden state can't store an unlimited amount of information or context, and sometimes it would be useful to just refer back to the full input sequence. The translation example below shows one of the limitations of this idea.
For instance, a sentence can preserve many locally reasonable word choices and still fail as a translation when the model treats the problem too much like a word-by-word mapping. (The top panel shows an exaggerated example where we translate the sentence word by word; obviously, the grammar in the resulting sentence is wrong.) In reality, the correct next word depends on sentence-level structure and on which earlier source words matter at that step. Of course, this could still be translated fine with an RNN, but it would struggle with longer sequences or knowledge-retrieval tasks because, as mentioned earlier, the hidden state can only store so much information.

Figure 4: Translation can fail even when many individual word choices look reasonable because sentence-level structure still matters (Original source LLMs-from-scratch).

The next figure shows that change more directly. When the decoder is producing an output token, it should not be limited to one compressed memory path. It should be able to reach back to the more relevant input tokens directly.

Figure 5: Attention breaks the RNN bottleneck by letting the current output position revisit the full input sequence instead of relying on one compressed state alone (Original source LLMs-from-scratch).

Transformers keep that core idea from the aforementioned attention-modified RNN but remove the recurrence. In the classic Attention Is All You Need paper, attention becomes the main sequence-processing mechanism itself (instead of being just part of an RNN encoder-decoder). In transformers, that mechanism is called self-attention, where each token in the sequence computes weights over all other tokens and uses them to mix information from those tokens into a new representation. Multi-head attention is the same mechanism run several times in parallel.

1.3 The Masked Attention Matrix

For a sequence of tokens, attention needs one row of weights per token, so overall we get a matrix. Each row answers a simple question: when updating this token, how much should each visible token matter? In a decoder-only LLM, future positions are masked out, which is why the upper-right part of the matrix is grayed out in the figure below. Self-attention is fundamentally about learning these token-to-token weight patterns, under a causal mask, and then using them to build context-aware token representations.

Figure 6: A concrete masked attention matrix where each row belongs to one token, each entry is an attention weight, and future-token entries are removed by the causal mask (Original source Understanding and Coding Self-Attention).

1.4 Self-Attention Internals

The next figure shows how the transformer computes the attention matrix (A) from the input embeddings X, which is then used to produce the transformed inputs (Z). Here Q, K, and V stand for queries, keys, and values. The query for a token represents what that token is looking for, the key represents what each token makes available for matching, and the value represents the information that gets mixed into the output once the attention weights have been computed. The steps are as follows:

- W_Q, W_K, and W_V are weight matrices that project the input embeddings X into Q, K, and V
- QKᵀ produces the raw token-to-token relevance scores
- softmax converts those scores into the normalized attention matrix A that we discussed in the previous section
- A is applied to V to produce the output matrix Z

Note that the attention matrix A is not a separate hand-written object. It emerges from Q, K, and softmax.
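To make those steps concrete, here is a minimal single-head sketch in PyTorch. The toy dimensions are made up for illustration, and the random matrices stand in for learned weights; this is not code from the linked articles.

```python
import math
import torch

torch.manual_seed(0)

seq_len, d_in, d_k = 6, 32, 16          # toy sizes; real models are much larger
X = torch.randn(seq_len, d_in)          # input embeddings (one row per token)

# learned projection matrices W_Q, W_K, W_V (random stand-ins here)
W_Q = torch.randn(d_in, d_k)
W_K = torch.randn(d_in, d_k)
W_V = torch.randn(d_in, d_k)

Q, K, V = X @ W_Q, X @ W_K, X @ W_V     # project embeddings into queries/keys/values

scores = Q @ K.T / math.sqrt(d_k)       # raw token-to-token relevance scores (scaled)

# causal mask: gray out the upper-right triangle so tokens can't see the future
mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
scores = scores.masked_fill(~mask, float("-inf"))

A = torch.softmax(scores, dim=-1)       # normalized attention matrix (rows sum to 1)
Z = A @ V                               # context-aware output representations

print(A.shape, Z.shape)                 # (6, 6) and (6, 16)
```

Row i of A holds the weights token i assigns to tokens 0 through i, and Z mixes the value vectors accordingly.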
Figure 7: The full single-head pipeline, from input embeddings X to the normalized attention matrix A and output representations Z (Original source Understanding and Coding Self-Attention).

The next figure shows the same concept as the previous figure, but the attention matrix computation is hidden inside the "scaled-dot-product attention" box, and we perform the computation only for one input token instead of all input tokens. This is to show a compact form of self-attention with a single head before extending this to multi-head attention in the next section.

Figure 8: One attention head is already a complete mechanism. One set of learned projections produces one attention matrix and one context-aware output stream (Original source Understanding and Coding Self-Attention).

1.5 From One Head To Multi-Head Attention

One set of W_Q, W_K, and W_V matrices gives us one attention head, which means one attention matrix A and one output matrix Z. (This concept was illustrated in the previous section.) Multi-head attention simply runs several of these heads in parallel with different learned projection matrices. This is useful because different heads can specialize in different token relationships. One head might focus on short local dependencies, another on broader semantic links, and another on positional or syntactic structure.

Figure 9: Multi-head attention keeps the same basic attention recipe, but repeats it across several heads in parallel so the model can learn several token-to-token patterns at once (Original source Understanding and Coding Self-Attention).

2. Grouped-Query Attention (GQA)

Grouped-query attention is an attention variant derived from standard MHA. It was introduced in the 2023 paper GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints by Joshua Ainslie and colleagues. Instead of giving every query head its own keys and values, it lets several query heads share the same key-value projections, which makes KV caching much cheaper (primarily as a memory reduction) without changing the overall decoder recipe very much.

Figure 10: GQA keeps the same overall attention pattern as MHA, but collapses the number of key-value heads by sharing them across multiple query heads (Original source: The Big LLM Architecture Comparison).

EXAMPLE ARCHITECTURES
Dense: Llama 3 8B, Qwen3 4B, Gemma 3 27B, Mistral Small 3.1 24B, SmolLM3 3B, and Tiny Aya 3.35B.
Sparse (Mixture-of-Experts): Llama 4 Maverick, Qwen3 235B-A22B, Step 3.5 Flash 196B, and Sarvam 30B.

2.1 Why GQA Became Popular

In my architecture comparison article, I framed GQA as the new standard replacement for classic multi-head attention (MHA). The reason is that standard MHA gives every head its own keys and values, which is better from a modeling perspective but expensive once we have to keep all of that state in the KV cache during inference. In GQA, we keep a larger set of query heads, but we reduce the number of key-value heads and let multiple queries share them. That lowers both the parameter count and KV-cache traffic without making drastic implementation changes like multi-head latent attention (MLA), which will be discussed later. In practice, that made it, and keeps it, a very popular choice for labs that want something cheaper than MHA but simpler to implement than newer compression-heavy alternatives like MLA. The sketch below illustrates the head-sharing idea.
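Here is a rough sketch (with arbitrary toy sizes) of the sharing mechanism: only n_kv_heads key/value heads are cached, and each is reused by a group of query heads. The names and dimensions are illustrative, not from any particular codebase.

```python
import torch

torch.manual_seed(0)

seq_len, d_head = 8, 16
n_q_heads, n_kv_heads = 8, 2            # 4 query heads share each K/V head
group_size = n_q_heads // n_kv_heads

# per-head projections already applied; shapes are (heads, seq, d_head)
Q = torch.randn(n_q_heads, seq_len, d_head)
K = torch.randn(n_kv_heads, seq_len, d_head)   # only n_kv_heads K/V tensors get cached
V = torch.randn(n_kv_heads, seq_len, d_head)

# expand K/V so each group of query heads reuses the same K/V head
K_shared = K.repeat_interleave(group_size, dim=0)   # (n_q_heads, seq, d_head)
V_shared = V.repeat_interleave(group_size, dim=0)

scores = Q @ K_shared.transpose(1, 2) / d_head**0.5
A = torch.softmax(scores, dim=-1)       # (causal mask omitted for brevity)
Z = A @ V_shared

# KV-cache cost per token: 2 * n_kv_heads * d_head floats
# instead of MHA's 2 * n_q_heads * d_head floats
print(Z.shape)
```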
2.2 GQA Memory Savings

GQA results in big savings in KV storage, since the fewer key-value heads we keep per layer, the less cached state we need per token. That is why GQA becomes more useful as sequence length grows. GQA is also a spectrum. If we reduce all the way down to one shared K/V group, we are effectively in multi-query attention territory, which is even cheaper but can hurt modeling quality more noticeably. The sweet spot is usually somewhere in between multi-query attention (one shared group) and MHA (where the number of K/V groups equals the number of query heads), where the cache savings are large but the modeling degradation relative to MHA stays modest.

Figure 11: Lower is better. Once the context window grows, KV-cache savings become more pronounced. (Original source: LLMs-from-scratch GQA materials)

2.3 Why GQA Still Matters In 2026

More advanced variants such as MLA are becoming popular because they can offer better modeling performance at the same KV efficiency levels (e.g., as discussed in the ablation studies of the DeepSeek-V2 paper), but they also involve a more complicated implementation and a more complicated attention stack. GQA remains appealing because it is robust, easier to implement, and also easier to train (since fewer hyperparameters need tuning, based on my experience). That is why some of the newer releases still stay deliberately classic here. E.g., in my Spring Architectures article, I mentioned MiniMax M2.5 and Nanbeige 4.1 as models that remained very classic, using only grouped-query attention without piling on other efficiency tricks. Sarvam is a particularly useful comparison point as well: the 30B model keeps classic GQA, while the 105B version switches to MLA.

Figure 12: Total KV cache sizes for 105B Sarvam (using MLA) versus 30B Sarvam (using GQA), versus using plain MHA.

3. Multi-Head Latent Attention (MLA)

The motivation behind Multi-head Latent Attention (MLA) is similar to that behind Grouped-Query Attention (GQA). Both are solutions for reducing KV-cache memory requirements. The difference is that MLA shrinks the cache by compressing what gets stored rather than by reducing how many K/V heads are stored through sharing.

Figure 13: Unlike GQA, MLA does not reduce KV cost by grouping heads. It reduces it by caching a compressed latent representation. Note that it is also applied to the query, which is not shown for simplicity (Original source: The Big LLM Architecture Comparison).

MLA, originally proposed in the DeepSeek-V2 paper, became a defining DeepSeek-era idea (especially after DeepSeek-V3 and R1). It is more complicated to implement than GQA, and more complicated to serve, but nowadays often more compelling once model size and context length get large enough that cache traffic starts to dominate, because at the same rate of memory reduction it can maintain better modeling performance (more on that later).

EXAMPLE ARCHITECTURES
DeepSeek V3, Kimi K2, GLM-5, Ling 2.5, Mistral Large 3, and Sarvam 105B

3.1 Compression, Not Sharing

Instead of caching full-resolution key and value tensors as in MHA and GQA, MLA stores a latent representation and reconstructs the usable state when needed. Essentially, it is a cache compression strategy embedded inside attention, as illustrated in the previous figure and in the code sketch below. The figure that follows shows the savings compared to regular MHA.

Figure 14: Once context length grows, the savings from caching a latent representation instead of full K/V tensors become very visible (Original source: LLMs-from-scratch MLA section).
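Here is a deliberately simplified sketch of that caching idea, leaving out the multi-head details, RoPE handling, and the query-side compression mentioned in the figure caption. The dimensions and weight names are made up for illustration.

```python
import torch

torch.manual_seed(0)

seq_len, d_model, d_latent, d_k = 8, 64, 16, 64

X = torch.randn(seq_len, d_model)      # token representations entering the block

# down-projection: compress each token into a small latent vector
W_down = torch.randn(d_model, d_latent)
# up-projections: reconstruct usable keys and values from the latent
W_uk = torch.randn(d_latent, d_k)
W_uv = torch.randn(d_latent, d_k)

C = X @ W_down          # (seq_len, d_latent): this is all the KV cache stores

# at attention time, keys/values are re-expanded from the cached latent
K = C @ W_uk            # (seq_len, d_k)
V = C @ W_uv            # (seq_len, d_k)

# cache cost per token: d_latent floats instead of 2 * d_k floats
print(C.shape, K.shape, V.shape)
```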
3.2 MLA Ablation Studies

The DeepSeek-V2 paper provided some ablations where GQA looked worse than MHA in terms of modeling performance, while MLA held up much better and could even outperform MHA when tuned carefully. That is a much stronger justification than "it (also) saves memory." In other words, MLA was the preferable attention mechanism for DeepSeek not just because it was efficient, but because it looked like a quality-preserving efficiency move at large scale. (Colleagues have also told me that MLA only works well above a certain size; for smaller models, say <100B, GQA seems to work better, or is at least easier to tune and get right.)

Figure 15: GQA drops below MHA here, while MLA remains competitive and can even slightly outperform it. Underlying paper: DeepSeek-V2.

Below is again the comparison between GQA in the 30B Sarvam versus MLA in the 105B Sarvam.

Figure 16: GQA and MLA are solving the same bottleneck from different directions. The tradeoff is simplicity versus better modeling performance for larger models.

3.3 How MLA Spread After DeepSeek

Once DeepSeek V3/R1, V3.1, etc. normalized the design after its introduction in V2, it started showing up in a second wave of architectures. Kimi K2 kept the DeepSeek recipe and scaled it up. GLM-5 adopted MLA together with DeepSeek Sparse Attention (from DeepSeek V3.2). Ling 2.5 paired MLA with a linear-attention hybrid. Sarvam released two models where the 30B model stayed with classic GQA and the 105B model switched to MLA. That last pair is particularly useful because it puts the technical-complexity discussion aside: the Sarvam team implemented both variants and deliberately chose GQA for one and MLA for the other. So, in a sense, that makes MLA feel less like a theoretical alternative and more like a concrete architectural upgrade path once a model family scales up.

4. Sliding Window Attention (SWA)

Sliding window attention reduces the memory and compute cost of long-context inference by limiting how many previous tokens each position can attend to. Instead of attending to the entire prefix, each token only attends to a fixed window of recent tokens around its position. Because attention is restricted to a local token neighborhood, this mechanism is often referred to as local attention. Some architectures combine these local layers with occasional global attention layers so that information can still propagate across the entire sequence.

Figure 17: The conceptual shift is simple. Regular attention is global attention, while sliding-window attention is local attention. Global attention lets every token see the full prefix; SWA turns many of those layers into local attention layers (Original source: The Big LLM Architecture Comparison).

EXAMPLE ARCHITECTURES
Gemma 3 27B, OLMo 3 32B, Xiaomi MiMo-V2-Flash, Arcee Trinity, Step 3.5 Flash, and Tiny Aya

4.1 Gemma 3 As A Reference Point

Gemma 3 is still one of the clearest recent SWA examples because it is easy to compare against Gemma 2. Gemma 2 already used a hybrid attention setup with a 1:1 ratio between local and global layers and a 4096-token window. Gemma 3 pushed this further to a 5:1 ratio and reduced the window size to 1024. The key finding was not that local attention is cheaper, because that was already known. The more interesting takeaway from the Gemma 3 ablation study was that this more aggressive setup seemed to hurt modeling performance only slightly. (The mask construction behind SWA is sketched in code below.)

The Gemma ablation study suggests that the smaller window and more aggressive local:global ratio have little effect on perplexity. Underlying paper: Gemma 3 article (Original source: The Big LLM Architecture Comparison).
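For reference, here is a minimal sketch of how a sliding-window causal mask can be built. The sequence length and window size are arbitrary toy values.

```python
import torch

seq_len, window = 8, 3

i = torch.arange(seq_len).unsqueeze(1)   # query positions (rows)
j = torch.arange(seq_len).unsqueeze(0)   # key positions (columns)

causal = j <= i                          # no attending to future tokens
local = (i - j) < window                 # only the last `window` tokens are visible
swa_mask = causal & local

# each row has at most `window` ones: token i attends to tokens i-window+1 .. i
print(swa_mask.int())
```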
4.2 The Ratio And Window Size

In practice, saying that a model "uses SWA" does not mean it relies on SWA alone. What usually matters are the local-to-global layer pattern and the attention window size. For example:

- Gemma 3 and Xiaomi use a 5:1 local-to-global pattern.
- OLMo 3 and Arcee Trinity use a 3:1 pattern.
- Xiaomi also uses a window size of 128, which is much smaller, and therefore more aggressive, than Gemma's 1024.

SWA is essentially a knob that can be tuned more or less aggressively.

Figure 18: The long-context savings come from turning many full-attention layers into local ones, which reduces how much cached context those layers need to consider (Original source: LLMs-from-scratch SWA materials).

4.3 Combining SWA with GQA

SWA often appears together with GQA because the two ideas address different parts of the same inference problem. SWA reduces how much context a local layer has to consider. GQA reduces how much key-value state each token contributes to the cache. That is why many recent dense models use both rather than treating them as alternatives. Gemma 3 is again a good reference point here, since it combines sliding window attention with grouped-query attention in the same architecture.

5. DeepSeek Sparse Attention (DSA)

DeepSeek Sparse Attention is one of the architectural changes that appeared in the DeepSeek V3.2 line and later showed up again in GLM-5. Specifically, DeepSeek V3.2 combines it with Multi-head Latent Attention (MLA), and GLM-5 adopts the same pair for the same general reason, namely, reducing inference cost when context lengths get large.

EXAMPLE ARCHITECTURES
DeepSeek V3.2 and GLM-5

5.1 Changes Relative To Sliding-Window Attention

In sliding-window attention, the current token does not attend to the full prefix but only to a fixed local window. This is the same broad idea behind DeepSeek Sparse Attention, where each token also only attends to a subset of previous tokens. However, the selected tokens are not determined by a fixed-width local window. Instead, DeepSeek Sparse Attention uses a learned sparse pattern. In short, it uses an indexer-plus-selector setup, where a lightning indexer computes relevance scores, and a token selector keeps only a smaller set of high-scoring past positions. The way the subset of tokens is selected is the main difference from sliding-window attention. Sliding-window attention hard-codes locality. DeepSeek Sparse Attention still limits attention to a subset, but it lets the model decide which prior tokens are worth revisiting.

Figure 19: Similar to sliding-window attention, DeepSeek Sparse Attention also restricts each token to a subset of prior tokens, but does not do so with a fixed local window (Original source: From DeepSeek V3 to V3.2: Architecture, Sparse Attention, and RL Updates).

5.2 DeepSeek Sparse Attention and MLA

DeepSeek V3.2 uses both Multi-head Latent Attention (MLA) and DeepSeek Sparse Attention. MLA reduces KV-cache cost by compressing what gets stored. DeepSeek Sparse Attention reduces how much of the prior context the model has to revisit. Put differently, one optimizes the cache representation, the other optimizes the attention pattern on top of it.

Figure 20: DeepSeek V3.2 is the obvious reference point, because this is the model family most closely associated with the sparse-attention idea.

The sparse pattern is not random. The first stage is a lightning indexer that scores previous tokens for each new query token. It uses MLA's compressed token representations and computes a learned similarity score over the prior context, so the model can rank which earlier positions are worth revisiting. The second stage is a token selector. It keeps only a smaller high-scoring subset, for example, a top-k set of past positions, and turns that subset into the sparse attention mask. So the main point is that DeepSeek Sparse Attention does not hard-code the sparsity pattern. It learns which past tokens to keep. The two stages are sketched below.
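Here is a rough sketch of the two-stage idea. The scoring is reduced to a plain dot product, standing in for the small learned indexer network, so treat this as an illustration of the selection mechanics rather than DeepSeek's actual implementation.

```python
import torch

torch.manual_seed(0)

seq_len, d_index, k = 16, 8, 4

# compressed prior-token representations, standing in for MLA's cached latents
H = torch.randn(seq_len, d_index)
q = torch.randn(d_index)                 # indexer representation of the new token

# stage 1 ("lightning indexer"): score every prior position for this query
scores = H @ q                           # (seq_len,) learned-similarity stand-in

# stage 2 (token selector): keep only the top-k highest-scoring positions
topk = torch.topk(scores, k)
mask = torch.zeros(seq_len, dtype=torch.bool)
mask[topk.indices] = True                # sparse attention mask for this token

print(topk.indices.sort().values)        # the prior positions this token attends to
```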
Figure 21: The mechanism consists of a lightning indexer that scores prior tokens and a selector that keeps only a smaller subset for attention (Original source: From DeepSeek V3 to V3.2: Architecture, Sparse Attention, and RL Updates).

DeepSeek Sparse Attention is relatively new and relatively complicated to implement, which is why it has not yet been adopted as widely as Grouped-Query Attention (GQA).

6. Gated Attention

Gated attention is best understood as a modified full-attention block rather than as a separate attention family. It usually appears inside hybrid stacks that still keep an occasional full-attention layer for exact content retrieval, but it adds a few stability-oriented changes on top of an otherwise familiar scaled dot-product attention block.

Figure 22: Trinity Large is a useful comparison because gated attention is not only a Qwen idea (more on that later). Here the gate appears after the scaled dot-product attention output and before the output projection in a different long-context architecture (Original source: A Dream of Spring for Open-Weight LLMs).

6.1 Where Gated Attention Appears

The Qwen3-Next and Qwen3.5 architectures show that recent hybrids (covered in the next section) do not replace attention everywhere. Instead, they replace most attention layers with a cheaper alternative and keep a smaller number of full-attention layers in the stack. Those remaining full-attention layers are where gated attention typically appears. Qwen3-Next and Qwen3.5 use it together with Gated DeltaNet in a 3:1 pattern. But hybrid architectures aside, Trinity uses a related gating idea in a more conventional attention stack, as shown in the previous figure.

6.2 Gated Attention Relative To Standard Attention

The gated attention block in Qwen-style hybrids or Trinity (not a hybrid) is essentially standard scaled-dot-product attention with a few changes on top. In the original Gated Attention paper, those changes are presented as a way to make the retained full-attention layers behave more predictably inside a hybrid stack. The block still looks like standard (full) attention, but it adds:

- an output gate that scales the attention result before it is added back to the residual,
- a zero-centered QK-Norm variant instead of standard RMSNorm for q and k,
- partial RoPE.

These are not changes on the scale of MLA or linear attention but merely stability and control changes applied to an otherwise familiar attention block. The output-gate idea is sketched below.

Figure 23: In Qwen3-Next and Qwen3.5, gated attention appears as the full-attention layer that periodically breaks up runs of Gated DeltaNet blocks. Note that the figure above also includes Gated DeltaNet, which we will cover in the next section below.
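Here is a minimal sketch of just the output-gate part. The attention output is reduced to a random stand-in, and the gate placement is simplified relative to the actual papers, which also include the QK-Norm and partial-RoPE changes listed above.

```python
import torch

torch.manual_seed(0)

seq_len, d_model = 8, 32

x = torch.randn(seq_len, d_model)        # block input (residual stream)
attn_out = torch.randn(seq_len, d_model) # stand-in for the scaled dot-product output

# output gate: an input-dependent sigmoid gate applied to the attention
# result before it is added back to the residual stream
W_gate = torch.randn(d_model, d_model)   # learned in a real model
gate = torch.sigmoid(x @ W_gate)         # values in (0, 1), one per channel

y = x + gate * attn_out                  # gated residual update

print(y.shape)
```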
7. Hybrid Attention

Hybrid attention is a broader design pattern rather than a specific, single mechanism. The overall idea is to keep a transformer-like stack but replace most of the expensive full-attention layers with cheaper linear or state-space sequence modules. The motivation is long-context efficiency. Full attention grows quadratically with sequence length, so once models move to contexts like 128k, 256k, or 1M tokens, attention memory and compute become expensive enough that using cheaper sequence modules in most layers, while keeping only a smaller number of heavier retrieval layers, starts making more sense. (Note that this comes with a bit of a modeling-performance trade-off, though.)

In Qwen3-Next, this pattern appears as a 3:1 mix of Gated DeltaNet and Gated Attention blocks. Gated DeltaNet is also closely related to Mamba-2 (see the Gated Delta Networks: Improving Mamba2 with Delta Rule paper, for instance), and the mechanism can be read as a DeltaNet-style fast-weight update combined with Mamba-style gating. Later architectures keep the same overall idea but swap in other lightweight sequence mixers, such as Kimi Delta Attention, Lightning Attention, or standard Mamba-2.

Figure 24: The basic hybrid pattern, where most blocks are cheaper sequence mixers and every fourth block restores a heavier attention layer (Original source The Big LLM Architecture Comparison).

To my knowledge, the first prominent example of a close-to-flagship LLM with hybrid attention was Qwen3-Next in 2025, which does not remove attention completely but mixes three Gated DeltaNet blocks with one Gated Attention block. Here, the lightweight Gated DeltaNet blocks do most of the long-context work and keep memory growth much flatter than full attention. The heavier gated-attention layer remains because DeltaNet is less exact at content-based retrieval.

Inside a Gated DeltaNet block, the model computes query, key, and value vectors together with two learned gates (α, β). Rather than forming the usual token-to-token attention matrix, it writes to a small fast-weight memory using a delta-rule update (sketched in code at the end of this subsection). In rough terms, the memory stores a compressed running summary of past information, while the gates control how much new information is added and how much previous state is retained. That makes Gated DeltaNet a linear-attention or recurrent-style mechanism rather than just another tweak to MHA. Relative to Mamba-2, the close connection is that both belong to the linear-time gated sequence-model family, but Gated DeltaNet uses a DeltaNet-style fast-weight memory update instead of the Mamba state-space update.

Figure 25: The practical motivation behind the hybrids is shown here in the memory curve. Hybrid stacks with Gated DeltaNet grow much more slowly with context length than ordinary full attention (Original source LLMs-from-scratch DeltaNet materials).

Qwen3.5 moves the former Qwen3-Next hybrid into Qwen's main flagship series, which is an interesting move. This basically signals that the hybrid strategy is a success and that we may see more models with this architecture in the future.

Figure 26: Qwen3.5 shows the Qwen team promoting the former Qwen3-Next side-branch into the main model line rather than leaving it as a one-off efficiency variant (Original source A Dream of Spring for Open-Weight LLMs).

7.2 Kimi Linear And Modified Delta Attention

Kimi Linear keeps the same broad transformer skeleton and the same 3:1 pattern, but it changes both halves of the recipe. On the lightweight side, Kimi Delta Attention is a refinement of Gated DeltaNet. Where Qwen3-Next uses a scalar gate per head to control memory decay, Kimi uses channel-wise gating, which gives finer control over the memory update. On the heavier side, Kimi replaces Qwen3-Next's gated-attention layers with gated MLA layers. So it is still the same broader pattern as in Qwen3-Next and Qwen3.5, but both ingredients change slightly: most layers are still handled by a cheaper linear-style mechanism, and periodic heavier layers still remain for stronger retrieval.

Figure 27: Kimi Linear keeps the same overall hybrid pattern while changing both the lightweight side and the heavier attention side of the stack (Original source The Big LLM Architecture Comparison).
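And here is the promised rough sketch of a gated delta-rule update. It is heavily simplified (single head, no normalization or convolution layers, random stand-ins for the projected q/k/v and the gates), but it shows the two key properties: the memory has a fixed size, and the gates control decay and write strength.

```python
import torch

torch.manual_seed(0)

seq_len, d_k, d_v = 8, 16, 16
Q = torch.randn(seq_len, d_k)            # projected queries (stand-ins)
K = torch.randn(seq_len, d_k)            # projected keys
V = torch.randn(seq_len, d_v)            # projected values
alpha = torch.rand(seq_len)              # decay gate: how much old memory to keep
beta = torch.rand(seq_len)               # write gate: how strongly to write new info
# (in a real block, the gates are computed from the input, not random)

S = torch.zeros(d_k, d_v)                # fast-weight memory: fixed size, never grows
outputs = []
for t in range(seq_len):
    S = alpha[t] * S                     # gated decay of the running memory
    pred = K[t] @ S                      # what the memory currently returns for this key
    S = S + beta[t] * torch.outer(K[t], V[t] - pred)   # delta-rule correction
    outputs.append(Q[t] @ S)             # read out with the query

Z = torch.stack(outputs)                 # (seq_len, d_v)
print(Z.shape)
```

Unlike the attention matrix, which grows with the sequence, S stays at d_k x d_v no matter how long the context gets, which is exactly the memory curve shown in Figure 25.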
7.3 Ling 2.5 And Lightning Attention

Ling 2.5 shows another swap on the lightweight side. Instead of Gated DeltaNet, Ling uses a slightly simpler recurrent linear-attention variant called Lightning Attention. On the heavier side, it keeps MLA from DeepSeek. Most sequence mixing happens in the cheaper linear-attention blocks, while a smaller number of heavier layers remain to preserve stronger retrieval. The difference is that the specific lightweight mechanism is now Lightning Attention rather than DeltaNet or Kimi Delta Attention.

Figure 28: Ling 2.5 and Qwen3.5 are both linear-attention hybrids, even though Ling swaps in Lightning Attention and MLA instead of the Qwen recipe (Original source A Dream of Spring for Open-Weight LLMs).

Ling 2.5 is aimed more at long-context efficiency than at absolute benchmark leadership. According to the Ling team, it is substantially faster than Kimi K2 at 32k tokens, which is the practical payoff these hybrids are aiming for.

Figure 29: Ling 2.5 was presented as a strong efficiency upgrade, with much higher 32k-token throughput than Kimi K2 at the same 1-trillion-parameter scale (Original source Ling 2.5 model hub page).

7.4 Nemotron And Mamba-2

Nemotron pushes the pattern further away from the transformer baseline. Nemotron 3 Nano is a Mamba-Transformer hybrid that interleaves Mamba-2 sequence-modeling blocks with sparse MoE layers and uses self-attention only in a small subset of layers. This is a more extreme version of the same idea: the lightweight sequence module is now a Mamba-2 state-space block rather than a DeltaNet-style fast-weight update, but the basic tradeoff is similar.

Figure 30: Nemotron 3 Nano uses Mamba-2 for most of the sequence modeling work, with self-attention only appearing in a small subset of layers (Original source The Big LLM Architecture Comparison).

The larger Nemotron 3 Super keeps the Mamba-2 hybrid attention approach and adds other efficiency-oriented changes such as latent MoE and shared-weight multi-token prediction (MTP) for speculative decoding.

Figure 31: Nemotron 3 Super keeps the Mamba-2 hybrid attention pattern while adding latent MoE and shared-weight MTP on top (Original source The Big LLM Architecture Comparison).

Conclusion

Of course, there are many more (mostly niche) attention variants throughout the literature that I haven't covered here. The focus of this article was on those that are currently used in state-of-the-art (open-weight) models. In particular, I am looking forward to (1) seeing the brand-new Mamba-3 layers getting integrated into the aforementioned hybrid architectures (replacing Gated DeltaNet) and (2) attention residuals being used in general.

In practice, you may also wonder what the "best" architecture is at the moment. This is hard to answer, as there are no public experiments that train different architectures on the same training data, etc. Hence, we can currently only answer what the best (trained) model choice is for a given problem. In my opinion, hybrid architectures are still a novelty, and their main selling point is (long-context) efficiency rather than modeling performance. Hence, I think they are a great candidate for agent contexts (like OpenClaw).
Personally, I think the problem with hybrid architectures is also that the inference stacks are not quite as optimized yet, and I find that I get better tok/sec throughput when running LLMs locally using more classic setups like GPT-OSS with grouped-query attention. Anyways, I am curious to see what DeepSeek V4 has in store, since DeepSeek has been quite the reliable trend-setter over the past two years.

flowtwo.io 3 weeks ago

Fundamentals of Software Architecture

A handshake should be firm, but not overpowering. Look the person in the eye; looking away while shaking someone's hand is a sign of disrespect, and most people will notice that. Also, don't keep the handshake going too long. Two or three seconds are all you need. — Richards & Ford, Fundamentals of Software Architecture, Ch. 32, para. 87

I swear, I find a lot of value in reading books about software. But I take issue with the length of some of them. When I'm 600 pages into an 800 page technical book, and I'm reading something barely tangential to the book's topic, like detailed instructions on how to shake hands...I get a bit annoyed. I think it's because every author wants to make their book "the definitive reference on X", whatever X is, so they feel the need to include stuff about leadership, soft skills, etc. Technical books like this could be more approachable if they kept to a more concise topic. My two cents.

Anyways, Fundamentals of Software Architecture was written by Mark Richards and Neal Ford. It's a thorough cataloguing of every popular architectural style and its pros/cons. It introduces a lot of terminology, with the goal of defining how to evaluate and explain the architectural qualities of a system—qualities like availability, coupling, fault tolerance, etc. This post is mostly a summary of the architectural topics covered by the book; I've added some personal commentary on system coupling and AI near the end.

According to Richards and Ford, the 3 laws of software architecture are:

1. Everything in software architecture is a trade-off
2. Why is more important than how
3. Most architecture decisions aren't binary but rather exist on a spectrum between extremes

They added the 3rd law in the book's 2nd edition. It sorta just feels like a different way of phrasing the 1st law, but I think they're trying to highlight that any architectural decision is never "absolute", i.e. most systems don't perfectly align to any one architectural style. A system might lean towards microservices architecture but have elements of other patterns too, for example.

"As I have evolved, so has my understanding of the Three Laws. You cannot be trusted with your own system architecture." — Claude

For mostly my own sake, I've briefly summarized each of the architecture styles covered by the book. Just 1 or 2 sentences explaining what it is and when you should use it—I'm aiming for brevity here, like a crib sheet.

Pictured: Enterprise Service Java Beans from the Neolithic era. Thought to be a tribute to Sun Microsystems

It's important to understand how to define a system's boundaries. In the book, the authors define the concept of an architectural quantum, which is the "smallest part of the system that runs independently". The system might be your entire microservice architecture, but if one part of it can function independently of other parts of the system, it forms its own architectural quantum. So how does an architectural quantum run independently if it has to communicate with other parts of the system? The critical part is how the communication happens—whether it's synchronous or asynchronous:

The dependency turns them into a single architectural quantum. Asynchronous communication can help detangle architectural quanta because it removes that dynamic dependency — Richards & Ford, Ch. 21, para. 48

If the operation of System A requires information from System B, then it's coupled to System B and they form a single architectural quantum.
At my current company, every service is associated with a reliability tier. The service's tier determines many of its operational requirements. For instance, a tier-0 system (the highest tier) needs to be deployed in multiple regions for redundancy. It needs an on-call engineer, clearly defined SLAs, etc. But if a tier-0 system needs to retrieve data from a lower-tier system as part of its operation, all of a sudden the lower-tier system needs to be a tier-0 system. They become coupled.

In practice, there's some nuance here. Just because you call another service via HTTP and block the current process waiting for a response doesn't mean the two services are fully coupled. As long as there's fallback functionality that doesn't constitute an error state, they needn't be considered coupled. If your service needs to be fast and the other service isn't reliably fast, you may implement a strict timeout and then fall back to some degraded functionality in the event the request times out.

As an example, consider a new user recommendation system being built by your company's ML team. Your tier-0 homepage rendering service can still attempt to retrieve user recommendations from this new system, but as long as you can fall back to some other functionality (like just choosing the user's recently viewed content), we don't need to group that recommendation system in with our service and its strict functional requirements.
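Here's a minimal sketch of that timeout-and-fallback shape. It's not from the book—HomepageService, RecommendationClient, recentlyViewed, and the 150 ms budget are all made up for illustration, and a resilience library like Resilience4j would give you the same shape with less ceremony:

    import java.util.List;
    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.TimeUnit;

    // Hypothetical client for the ML team's new recommendation service.
    interface RecommendationClient {
        List<String> fetch(String userId);
    }

    class HomepageService {
        private final RecommendationClient recommendations;

        HomepageService(RecommendationClient recommendations) {
            this.recommendations = recommendations;
        }

        List<String> contentFor(String userId) {
            return CompletableFuture
                    .supplyAsync(() -> recommendations.fetch(userId)) // remote call
                    .orTimeout(150, TimeUnit.MILLISECONDS)            // strict latency budget
                    .exceptionally(t -> recentlyViewed(userId))       // degraded, not an error
                    .join();
        }

        // Fallback: the user's recently viewed content, served from local state.
        private List<String> recentlyViewed(String userId) {
            return List.of("recently-viewed-1", "recently-viewed-2");
        }
    }

The key property is that the fallback path is a valid response, not an error state—which is what lets the homepage stay tier-0 without dragging the recommendation system up a tier with it.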
The 2nd edition of this book was published in April 2025. So of course, AI was brought up a lot. In general, the authors' stance was that AI is not an effective replacement for human architects—and they didn't seem optimistic that it ever could be. Why?

"Because, as we've demonstrated in this book, everything in software architecture is a trade-off. LLMs are great for understanding knowledge, but to this day, they still lack the wisdom necessary to make appropriate decisions. That wisdom includes so much context that it's much faster for the architect to solve a business problem by themselves than to teach an LLM all about the problem and its extended environment and context. The fact that we've included eight other intersections to be concerned about should be evidence enough that this is a daunting task." — Richards & Ford, Ch. 33, para. 80

While I agree that the amount of context necessary to properly make architectural decisions is hard to shove into an LLM's context window right now, I don't believe that'll be the case for long. I have a feeling the opinions in this book will become outdated quite soon. Also, despite the authors' insistence that "architecture is the stuff you can't Google or ask an LLM about", I fully believe that AI tools are indispensable for researching architectural decisions. They can explore the problem domain more completely and much faster than any human could. They can also illuminate trade-offs and nuances you might have missed. The fact that the authors never mentioned this in their statements on AI utility is a major oversight. Every job function in software development, from junior dev to CTO, should be leveraging AI tooling at this point.

Like I mentioned at the start, I found FoSA to be a bit bloated. Also, the book didn't really cover what I was looking for. I wanted a book that described more specific architectural patterns for solving common technical challenges like cache invalidation, database replication, etc. Instead, it focuses exclusively on the overall system layout—how the domain boundaries are divided and what the physical topology looks like. And how to shake someone's hand properly.

I also think the book tried too hard to quantify complex system characteristics. I don't find much use in assigning a 1-to-5-star rating for the "maintainability" of a "microkernel" architecture style (which is 3/5, according to the book)—simply because both the characteristic and the style itself are too vaguely defined to warrant a rating. I'm certain you could build your microkernel system to have poor maintainability OR incredible maintainability. There's too much ambiguity to extract any conclusions from these assessments.

Still, in general, FoSA is an interesting book that tackles one of the more complex and less formally researched areas of software development. Architectural decisions are the hardest to make due to their consequences and trade-offs, so knowing the patterns that have worked for others is a great starting point.

Now, the crib sheet I promised:

Layered architecture
What is it: Technically partitioned: presentation, business, persistence, and database layers, for example. Typically a monolithic application with a monolithic database. Very common, especially in legacy systems.
When to use it: Small, low-budget applications. But it can scale surprisingly well.

Modular monolith
What is it: Another monolithic style, i.e. a singularly deployed application. The system is divided by business domain instead of technical functionality. Domains are called "modules". The goal is to minimize communication between modules as much as possible.
When to use it: If teams are domain-focused and using domain-driven development, it's a good starting architecture. You can later migrate to a distributed architecture more easily.

Pipeline architecture
What is it: Topology consists of pipes and filters. Filters perform business logic; pipes coordinate and transfer data. Systems have a unidirectional data flow; it can be monolithic or distributed.
When to use it: Suitable for systems with one-way, ordered processing steps. ETL pipelines, etc.

Microkernel architecture
What is it: Topology consists of a core system (the "microkernel") and plug-ins. Plug-ins are optional and provide extensible functionality to the system (see the sketch after this list). Traditionally monolithic with a single database. Plug-ins shouldn't access the database directly.
When to use it: Installable desktop applications, or domains that address a wide market and require many custom rules and functionalities for each customer.

Service-based architecture
What is it: Distributed architecture with a separately deployed user interface, coarse-grained domain-centric remote services, and a monolithic database. Basically microservices but with coarser service boundaries and a single shared database, or just a few.
When to use it: When the system is of significant complexity and serves a wide enough user base that the benefits of a distributed architecture outweigh the costs. Can be a stepping stone towards other distributed architectures.

Event-driven architecture
What is it: Distributed system using mostly asynchronous communication. Consists of event publishers, brokers, and processors (the services). The central communication unit is an event, as opposed to a request.
When to use it: Systems that require flexible, dynamic processing and need to scale to lots of concurrent users. Applications where eventual consistency is tolerable and immediate acknowledgement isn't needed.

Space-based architecture
What is it: A complicated distributed infrastructure of scalable processing units that are supported by replicated and/or distributed caches. There is a shared "data grid" that handles data syncing between units and reading/writing from the database. This removes the database bottleneck from the system—database access isn't needed for processing requests.
When to use it: Applications with very high concurrent user volume and high traffic variability, AND a low need for data consistency between users. Race conditions and data conflicts will be unavoidable in this system.

Service-oriented architecture (SOA)
What is it: A legacy architectural style that uses abstract service layers and operations orchestrated by a shared "enterprise service bus" which knows which services to call to complete operations. Uses generic components to increase code re-use.
When to use it: If you've taken a time machine back to the 90s and you have to write enterprise software.

Microservices architecture
What is it: Domain-driven architecture that enforces strict API boundaries and minimizes coupling between domains. Duplication is favoured over re-use where possible. Each service should "do one thing" and ideally have its own database.
When to use it: Systems that are highly modular and have high enough load to justify the scalability and performance benefits compared to the development and operational costs.
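And the plug-in sketch promised in the microkernel entry above. It's purely illustrative—Plugin, Kernel, and Demo are my names, not the book's—but it shows the one structural rule that matters: the core stays minimal and unaware of concrete plug-ins, and dependencies point from plug-in to core, never the reverse:

    import java.util.ArrayList;
    import java.util.List;

    // The extension point: every optional feature implements this one interface.
    interface Plugin {
        String process(String input);
    }

    // The core system (the "microkernel"): minimal, and unaware of concrete plug-ins.
    class Kernel {
        private final List<Plugin> plugins = new ArrayList<>();

        void register(Plugin plugin) {
            plugins.add(plugin); // plug-ins are optional add-ons, registered at startup
        }

        String handle(String input) {
            String result = input;
            for (Plugin plugin : plugins) {
                result = plugin.process(result); // the core delegates outward
            }
            return result;
        }
    }

    class Demo {
        public static void main(String[] args) {
            Kernel kernel = new Kernel();
            kernel.register(s -> s.trim());        // hypothetical customer-specific rule
            kernel.register(s -> s.toUpperCase()); // another optional rule
            System.out.println(kernel.handle("  hello  ")); // prints "HELLO"
        }
    }

In a real microkernel system the plug-ins would typically be discovered dynamically (e.g. via ServiceLoader) rather than registered by hand, but the dependency direction is the same.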

Weakty 3 weeks ago


Back in March, I found this book at the back of my bookshelf. I don't know how it got there. I didn't remember ever buying it, much less reading it. It had the most ornate cover, as if someone had done an oil painting on an old hard-cover book. In fact, that seemed to be exactly what it was. Just like an oil painting, it had bumps and ridges, and if I had the heart to do it, I probably could have picked at it and flicked little chips of colour off with my fingers.

The cover depicted several different images, but was sufficiently abstract to make me doubt my interpretation. It reminded me of a smouldering firepit set against a twisting whirlwind of leaves and strange debris: splinters of wood, the bits of plastic that connect 6-packs, old shoelaces, chunks of a gilded picture frame, and even something that looked like the severed legs of some poor creature. If you saw it, you'd probably see something else. I spent a fair bit of time staring at that cover. With the book splayed out it made for a beautiful but disturbing landscape. It was actually a few days before I even looked at the pages, I was so taken with the cover.

When I finally got around to looking inside, I was surprised to find that the pages were blank. I flipped through them a few times, incredulous. My first thought was, this is a journal, or a sketchbook. But it wasn't. The book had an ISBN, a bar code, and a bunch of information on the inside page. On the other side of the cover was the following:

Of course, there were a few other things on the page. But the thing that shocked me, what made my stomach throw itself down into the basement of my body, was the line Thank you, Acton. It really struck me. Because my name is Acton. I've never met another Acton.

What can I tell you about my name? It's uncommon. It's usually the first on any list of names. What else? People have a hard time making fun of it. Your standard school-yard bully wasn't clever enough to come up with a quip for Acton. So, I was surprised to see my name, being thanked from this very personal, empty book.

And as you might guess, my next move was to go over to the computer and look this thing up. I searched for the book's title online and plunged into reading all about it. Here's what I learned:

The book was originally published in 1955.
The author had written several books.
Myriam's Codebreakers was the last book that Solah J. Trek wrote; she died shortly after it was published.
A plot summary online told me the book was mostly a love story that involved a woman in a post-World-War-Two workplace.
Early computers feature prominently in the book—specifically, ones that were developed based on the machines built and used for war-time decryption.

That was about it. I found the regular stuff, you know, people sharing their reviews and ratings and whatnot. But here's what was odd. The photos I saw of the book's cover looked different than the one I had before me. And of course, the pages of mine were blank. I re-checked the cover of the book to make sure, yet again, that it wasn't just some kind of wrapper around another book. It wasn't. It was a hardcover book that someone had painted. I ran my fingers over the paint. It was paint, no doubt about it. And opening the cover, there was that first page with my name staring back at me. But I'm rambling at this point. That was back in March. Let's fast-forward a few weeks to when the book started to talk to me.

Things got weird in early April. I had the book open on my desk. I wasn't sure why I opened it. I think I had been admiring the cover in the morning sunlight coming through the wide window above my desk. When I opened the book, the pages smelled of a newness I hadn't noticed before: starchy and pleasingly rough to the touch, like the high-quality pages of paper in an artist's sketchbook. It seemed to hold the promise of possibility. I felt compelled to take a photo of the book. I reached into my pocket and pulled out my phone.
Then, with the blank pages before me, in a rush, the book slurped up my phone. It slurped it right out of my hands and collected it onto the first empty page. Don't misunderstand me—this book very literally liquefied my phone, sucking it up and rearranging its physical form into a two-dimensional representation of my phone—right down to the small crack in the bottom-right corner. I stared at the miniature picture on the page: it was a rather painterly depiction, not unlike the cover of the book. I spent some time admiring the rendering. Entranced, I almost moved to take a photo of it but, of course, my phone was gone.

That's when things got interesting. It was hard to operate without my phone. I knew I was addicted to it, sure, but I didn't realize how dependent I was on it. It immediately caused a fuss for my job. I worked part-time for a distress hotline for teenagers in a bad way. Employees of the hotline have a special application on their phone ensuring the calls can be encrypted and recorded safely while retaining anonymity. So, when I didn't "show up" online for my job that day, not only did I likely miss showing up for people who needed help, but I wasn't going to get my measly pay for the shift. I felt worse about not being there for the teens, of course. I had enough money to get by for the immediate future. But the thought of some youths calling the hotline and being on hold when I could have been there to answer made me feel terrible. But not terrible enough to tear myself away from the strange situation I found myself in.

It was through the blank pages that Myriam was able to reach me. It happened a few days after my phone had been slurped up into a page of the book. That morning, I came back to the book at my desk with my morning tea in hand. I sat down, and stared once more at the painted cover, before slowly opening it to the page with the little image of my phone in the corner. Slowly, words began to appear on the left page of the book, beautifully typeset, as if punched in by a typewriter, one letter at a time. I knew in an instant that it was Myriam. In my shock at seeing this happen for the first time, I could barely register that as her writing got to the end of the page, the whole thing cleared and started again. I don't remember the exact words of our first conversation, but I think it went something like this:

Then the words began to disappear. I grabbed a scrap of paper and wrote down Hoke Scripter and Able-Archive Pigmented Blue Ink. And then the words were gone. Now I had something else to do.

A week passed before I found the ink. Right after I got that message from Myriam, I went online and looked up Able-Archive Pigmented Blue. I wasn't surprised to find that the ink was no longer in production. It would be too easy to click "Add to cart" on a bottle of ink and wait a few days for it to show up at my door. Instead, I found myself on forums for pen-and-ink fanatics, reading, reading, reading about this ink. Often, I saw that people had inherited a bottle from an older relative (along with some old, fancy pens to boot). Other people wrote posts online lauding it as a workhorse ink, saying not many people made them like this anymore. I found myself enjoying going through these posts. After a day or two of looking, I found a post from someone selling a bottle of Able-Archive Pigmented Blue. They were across the country but it seemed they would ship it my way. I sent them a direct message and inquired about purchasing the ink:

And that was it.
Soon I would have some Able-Archive Pigmented Blue ink, along with its contractual-binding-powers. Whatever that meant.

Waiting for the ink made for a strange week. I had to pause work entirely. After missing my first shift, I sent an e-mail to my coordinator letting them know I would be taking some time off. I didn't tell them the part about my phone getting sucked into a book, of course. That would be too weird. Instead, I told them I needed to take a break from work—and in this line of work they get it. So I sent off that e-mail, apologizing again for the shift I missed, and saying I would reach out when I was ready to return to work. My schedule was full of empty slots. I almost felt like I had just finished school and was at the beginning of summer vacation. I had all the time in the world to hunt down stationery and whatever other obscure things I felt like searching for.

Next up, of course, was a Hoke Scripter—the pen Myriam had requested I use. This turned out to be no trouble at all. There are a few stationery shops in town. I went out to three before I was able to track down the Hoke Scripter over at Infinite Inkwell. I had never been in before. The shop was basically a large stone room. Along the rounded walls were recessed cubbies of various sizes and shapes, holding all kinds of objects. There was one portion of the wall that held rows and rows of inks, while others had glass drawers that pulled out of the cubbies, displaying pens, pencils and other writing implements of all kinds. In the center of the room was a large round table where you could sit, perhaps for testing pens or paper. There was only one person at the table, hunched over. They had a huge stack of paper and they were furiously writing. After finishing a page they would, without looking, add it to the growing stack to their left before pulling a new page from a pile of crisp sheets to their right. I wasn't sure if they were an employee, another customer, or something more random and bizarre.

My staring was interrupted by a staff member, who offered me some assistance. They led me over to the cubbies with the glass drawers, specifically pointing out a row of pens in deep blue, green, and red hues.

"These are the Hoke Scripters. They've got a classic appeal, perhaps you can see."

"I do see a certain classic air to them," I agreed.

"And what might you be looking for in a pen, may I ask?"

"Oh well," I began, unsure what to say. "I'm looking for a pen that my grandmother mentioned she used to use in her youth. I think it was this one." I felt strange lying to the employee, who was just looking to help me.

"Well, the Hoke Scripter is a model from Hoke that has existed for many years. Your grandmother likely used an earlier incarnation of one of these."

"Cool," I said flatly.

"Would you like to try it?" the staff member asked.

"Why not," I said, flatter than flat. Flatter than a pancake. Flat enough to write on.

The staff member disappeared for a moment and reappeared with a tester Hoke Scripter. She passed me the pen. I held it in my hand, lifting and lowering it experimentally. It felt pretty sturdy. It certainly wasn't cheap plastic. Too heavy, and too elegant for plastic. Aluminum? I would have to look it up later. The staff member gestured for me to grab a seat at the large round table, to test the pen out. I joined the strange character, still writing furiously at the big table, and I did exactly that. I wrote a few lines. Then a few squiggles. What can I say?
The Hoke Scripter moved like a pen does, leaving a trail of itself wherever my hand moved. I suppose I was expecting something more majestic from a pen with such a history. But there was nothing majestic about it. It was just a pen. It felt sturdy in my hands, and it did have a nice pattern on the barrel, and I supposed a person could customize it with whatever ink they liked. But what else was there to say?

"What do you think?" The staff member reappeared after a few minutes of my scribbling and scratching with the pen.

"Um, it's nice."

"The Scripter can be a bit underwhelming, but there's more to it than meets the eye. It's not our most flashy or popular pen, but I think the people who come to buy this pen always have a certain, well, intentionality to them. Let me know if you need any other help or have any other questions." The staff member disappeared, leaving me to write some more.

I didn't spend much more time in the shop. I bought the pen, and it cost me a whopping $55. I could hardly believe it. That price included something called a converter, which I was told I needed: it would hold the ink I wanted to use with the pen. I left the shop, got into my car, and headed home, grumbling under my breath. Oh, the intentionality of dropping $55 on a pen.

The weekend arrived without much fanfare. I was feeling a bit naked and aimless. Without my phone, my fingers fidgeted with the accumulated bits and bobs of things around the kitchen counter and table: a magazine that had been sitting there for weeks, some dirty cutlery, a notepad from the dollar store, and a rubber band. I sat there, with the sun rising around me, spinning that rubber band about my index finger, waiting for life to wake up around me. Eventually, I got up, poured some cereal into a bowl, made myself a coffee, and sat back down to eat. I grabbed a pen and flipped open the cheap notepad. One of the spiral ones that could fit in a shirt or pants pocket. It had already had a few pages ripped out of it. Then I remembered I had gotten it out when I had a few friends over to play cards last week. I briefly looked at the tally marks of our scores. I ripped out the page and noted that the tally marks had imprinted on the cheap paper below. I turned to my cereal. If I left it any longer it would get soggy.

What would I do with my day? It would be another few days before the ink would arrive, and this strange episode in my life was on hiatus until it appeared. I poured myself another cup of coffee and threw myself down on the couch. I stared into the old fireplace in the corner of the room, long out of use. Outside, cars were idling in the street, and traffic was already picking up. I could tell by the exhaust rising up to the window. The exhaust of a home fireplace had been replaced with the exhaust of cars. I moved closer to the window and looked at the traffic. They were stuck, too, waiting in line. All of us waiting for some kind of ink, something permanent to arrive. I stood there and took a sip of my coffee. Just watching.

I didn't want to open the book. If I did, there was a chance that Myriam would speak to me again. Perhaps I should say she would write to me again. And if I couldn't write back without the correct ink, she might lose interest. Perhaps she only had so much strength with which to communicate with the outside world. I figured that if I didn't open the book, she would stay in some kind of stasis, her life frozen in the permanency of the page. In that sense, I supposed I had time.
I sighed, finished my breakfast and got up. I wasn't going to get anywhere sitting around thinking. My hands reached for my phone, only to find it missing from my pocket for the fiftieth time that day. I thought about e-mailing a friend and seeing if they wanted to get together over a coffee, or maybe go for a drink later tonight. I popped open my computer, opened my e-mail and stared at the empty draft. I started to write without addressing the message to anyone yet. I looked at the "To" field of the e-mail. I clicked the "plus" button and a pop-up appeared with a list of some of my most frequent e-mail recipients. I started adding all the people I thought might write back. Before long I had addressed the e-mail to some fourteen people. Some of them I had spoken with as recently as last week and some I had not seen in years. Then, I moved all the e-mail addresses into the BCC field so that the recipients wouldn't know who else had been e-mailed. That's the decent way to do it, in case you were wondering how e-mail works. I hit Send, shut the laptop, and started cleaning up.

I decided that I should go for a walk. It was still early, and there wouldn't be many people out. Never mind that it was the dead of winter. I watched the cars on the road from my apartment. They seemed to be perpetually lined up, exhaust rising into my view. What a view. I always go back and forth in my mind about whether I should have taken the apartment at the top of the building. When I moved into this place it was either this, the second floor where I am now, or the top floor. All the units on the first and third floor had been occupied. I eventually chose the second floor because I thought I wouldn't want to move everything up four floors (there's no elevator). But from here, I was a bit too close to the action. After having lived here for a year or so, I think being on the fourth floor would have been better. A bit more removed from the commotion, the exhaust, the closeness to the ground. Besides, I figure I could use the exercise of an extra two flights of stairs each day.

I pulled on my hoodie and tossed a thin jacket over top, shoved on my boots and stepped into the hall. It would only be a short walk, so I wouldn't need any more layers. I didn't even have my gloves. Outside, I stepped between the idling cars with their anonymous drivers, and took the back alley behind Jason's Grocer out toward Delmont Ave. It was still early and the alley was quiet, as I expected. I passed a few piles of garbage outside the backs of the commercial buildings and kept walking. The alley narrowed past the point cars could reach and I kept walking. I took a left at Montrain and walked the ten minutes to Gaston park. A few more minutes of walking, past the fountains, shut off for the winter, until I was on one of the trails. Despite its confusing mess of criss-crossing trails, the park made for a great escape from the city.

I walked through the park, listening to my feet occasionally deviate from their regular cadence, kicking and tripping over bits and pieces of trail. A few runners passed, and even one cross-country skier, although it seemed the snow was a bit thin for it. All these people had headphones in, plugged into a world of their devising. I had nothing to listen to but the sounds of crunching snow, birds in the barren trees, and the far-off sounds of cars starting and stopping and occasional honking. I reached the end of the trail and then turned around. I was growing restless and feeling disconnected.
I had seen people IRL, sure, but I could feel myself growing increasingly uneasy. I knew this was likely due in part, if not entirely, to my phone being sucked into a book. I knew this because my hands still kept patting at my pants or jacket pocket, thinking my phone was there, within reach and ready to soothe whatever restless thought came my way. I walked back home in a funk. I had not anticipated that I would feel this sort of withdrawal. When my phone had been liquefied and sucked into the page of that book, well, you could say I was somewhere between amused and bemused. But now I was starting to feel agitated. I took a few deep breaths and tried to tune into the sounds of the world again, but, instead, my ears just pounded as if they were trying to tell me to plug something into them, shove anything in: a podcast, some music, anything but this overly-present, disgusting excuse-for-nature around me. I picked up my pace to a point where I was almost running.

I exited the trail where I came in. There were more people now: some of them sledding down the large hill facing east, others with their dogs, families with kids running after each other. I walked a few more paces and then looked back at the entrance of the trail. It looked light and entreating. I found a bench to sit down on and looked over at the people in the park. I was sweating underneath my two layers. The sounds of the children laughing ricocheted through the air and slammed into my eardrums. Cars passed by with a roar I had hardly noticed before. I put my face in my hands and pulled at my skin, as if I had a tight mask on that needed to come off. No such luck. Instead, I rested my chin in the palm of my hand, my elbow on my knee. I looked down at the ground before me. There were a few spare bits of garbage: empty weed canisters, a pop bottle that someone had peeled the label off of, and a large ribbon that looked like it belonged on a kite. I just didn't care enough to pick up any of it and throw it away. Why should I, I thought miserably, when I felt like this?

When I got home, I immediately threw open my computer to see if anyone had responded to my e-mail. Not a word. I shut it in frustration, slid it away from me on the coffee table, and slumped back into the couch until I was lying horizontally. I didn't know what time it was, but it was probably close to lunch. My phone was also my watch, like it is for most people. Without a clock in the room, I resigned myself to not knowing the time. I simply closed my eyes and lay there. I imagined the fumes of the cars, still lined up outside waiting for their turn to go somewhere, rising up and lulling me into a noxious sleep. And I did fall asleep. And I started to dream.

In my dream, I was walking with someone. Not in the park this time, and not in the city either, really. It seemed we were walking across a bridge. It was foggy all around us. I could not tell if the bridge was going over water, or a highway, or something else entirely. Looking over the edge, it could have been spanning a valley full of turtles, or lava, or a great black emptiness. I walked on, my eyes periodically darting to the side of the bridge, wondering at the mist below. All this time, my walking partner had been talking, but I had not listened to a single word. I turned to look at them. It was Myriam. I knew it immediately, there in the dream. She was exactly as I had pictured her, based on our first conversation.
She had a sad wilt to her, like flowers left in the sun on a kitchen table too long. She looked back at me, perhaps wondering if I had heard anything she had said. I said I was sorry, but this didn't seem to reset anything between us.

"Will you ever respond to me? I've been wondering this every day, Acton."

"I will, I'm just not ready yet," I replied. My words clumsily escaped from me. I watched them tumble over the bridge into the mass of fog.

"I hope you do," she said. "You have the ink and the pen, right?"

"I do," I said, which was strange because I knew the ink had not yet arrived.

We walked in silence for a few minutes. The bridge was the only concrete thing within the fog and seemed to go on forever, disappearing into the misty distance.

"I just need some time to get my thoughts into place," I said, facing forward so my words would come out right in front of us, and not get swept away into the abyss.

Myriam's hair was a steely gray, and she wore it loose, down to her shoulders. Every now and then, a muggy breeze would come and push it beyond her shoulders, like the swinging doors of a saloon in an old Western film. I imagined foggy, unseen spectres passing through the doors of her shoulders into a place I couldn't go.

"Whenever you're ready," I heard her say.

We walked some more. Then she began to shrink, and I began to grow. With each step, she got smaller and I got larger. Before long, I was trying to avoid crushing her with my feet. I couldn't continue walking beside her and instead had to step off the bridge and into the unknown fog. My first step came down unsteadily, but with such grandiosity that the fog cleared and was swept away, the weight of my footstep pushing it outward. All around my foot, I saw a verdant greenness, twisting vines, lush ferns, and tall grasses. All this I saw for only a moment before the fog returned to surround my foot. I grew larger still, and the following step with my left foot required straddling the bridge. By now, Myriam was long gone, and I was so tall I couldn't see what the displaced fog would reveal. Despite not being able to see that far down, I knew that my next step would not yield the previously lush green ground; for a hundred, or maybe a thousand, years had passed since my previous step, and things had surely changed for the worse. So I stopped in my tracks, afraid of advancing time any further or growing any larger, of stomping out any other life in my next movements. I knew the next movements I made would decide the fate of all of life. My next step would be like a thousand atom bombs, and there would be nothing left of this world. I took a step, and I was right. It all ended. And then I woke up.

I sat up on the couch and groggily mused over my dream. It was starting to fade already. I didn't feel like writing it down, but I tried to remain with the feeling of being a giant. Of a being so large that they can't even see what their giant steps are obliterating, the world so far below them. I made myself a coffee and lay down on the couch, the cup just within my reach. Periodically, I pulled it to my face, and carefully sipped it from my lying-down position. A most precarious way of drinking a coffee.

I looked at my laptop on the couch. It was calling to me. A silent notification, ringing in my ears. I wanted to hear from someone. Anyone. I set my coffee on the table next to the couch, pulled the laptop onto my lap, and opened my e-mail client. Before me was a single unread message in my inbox:

I looked at the e-mail.
It took me a moment to register the name. Casey. Casey was a friend from high school. We hadn't talked in quite some time. The last time we had gotten together, she and I had indeed gone out for a meal: a patio brunch in the middle of summer. I looked over the list of people I had sent my original e-mail to. Casey was my oldest friend among the list. The two of us had met in the school wrestling club, two odd ducks who had picked the wrong club to join. We had bonded over discussing mechanical magazines, web forums for Arkendo's Binding, a game we both loved, and the fact that neither of us actually wanted to be in wrestling club. All this talking we did over the slam of teenage bodies against thinly padded mats. I can almost hear it now.

I responded quickly, and effortlessly—the kind of correspondence that happens with people you really just click with. Short and sweet. Save the questions for when we get together. I snapped my laptop closed and thought about this making of plans without a phone. I wouldn't be able to check in with her leading up to the dinner. I felt a sweet anticipation already growing in me.

I got up and busied myself with making some lunch. I chewed my food, staring out the kitchen window. I tidied up. I was putting away dishes that had piled up over the previous few days when the buzzer rang for my apartment. I pressed the intercom and asked who it was. In response I simply heard a muted thump. My apartment not having a camera to watch approaching people, I had no choice but to go see what it was. I clomped down the stairs to the lobby, to where people leave packages outside the rickety old door that served as the gateway to the lobby. I poked my head out and felt a blast of cold wind and snow hit me right in the face. I looked down. There was a small package, clumsily wrapped. I picked it up. It didn't weigh much. My name was on it. There was no return address, but beside a simple "from:" label I saw a familiar name: NibTuner79. The ink was here.

I took the package inside and brought it up to my apartment. I picked away at several layers of clear tape with my fingernails before giving up and cutting open the brown parcel paper to reveal the ink. The inkwell was round with a square neck, ice blue, and three-quarters full. I turned it in my hands, taking in its shape and the rich blue colour swirling with the movement of the inkwell. I stopped so that the label faced me: a simple white label with a black border and the serif text: Able-Archive Pigmented Blue. There was no logo, no insignia, no brand to speak of. For all I knew, NibTuner79 could have filled any old empty bottle with some random, cheap blue ink. But it seemed unlikely. I wasn't exactly a collector of these sorts of things, but as far as I could tell, honesty went fairly far in the world of nerds and gearheads. I chuckled to myself, because I felt quite honest thinking that. I was on my way to becoming one of them.

I put the ink down on the desk next to Myriam's Codebreakers and the Hoke Scripter. I had all three. The trifecta. I could crack open the book and write something. I stared down at the book and its painted cover. To my right was a freshly inked pen. To my left, a quickly cooling cup of tea. I drummed my ink-stained fingers on my desk—I had quickly learned that filling a pen could be a very messy job. I wondered how badly archival ink stained. I would find out soon enough, I supposed. I picked up the inked pen in my hand.
Despite now having just a few millilitres of ink in it, it seemed to feel heavier. Potent. No longer an inert stick, but a poised snake. I could feel my hesitation mounting, but I knew it was time to act. I opened the book with the fingers that had remained unstained by the act of filling the pen. I flipped to the page where Myriam's text had first appeared and where there was still the image of my phone. I wrote underneath the depiction of my phone:

I could feel myself running out of things to say. I looked down at the pen in my hand. I looked down at my writing. It was messy and seemed to switch between cursive and printing at random. I had hoped I would come up with something more meaningful or interesting to say. I leaned back in my chair. I was uninteresting. I was offline. Were the two interchangeable for me? I was a spectator of myself, watching my events unfold in a terrible synchronicity that had me simultaneously bored and amused. Despite all the separation anxiety of being without my phone, of not working my regular shifts, of what felt like just wandering around aimlessly, I was becoming a subject of my own life. My old life, my digital life, was in the hands of Myriam—a woman, I presumed—in a book by an author named Solah J. Trek.

I rotated and twirled the Hoke Scripter in my fingers and stared at my paragraph, re-reading it for the second time. As I read, the words began to disappear into the page. It seemed that as soon as I finished reading a sentence, one by one, the words of it would sink into the nothingness of the page. By the time I got to the end of my second paragraph it was all but gone. I stared in astonishment at the empty page. The painted image of my phone was still there in the corner. I shifted uneasily in my chair, breaking a silence that had settled in the room. Then, a voice appeared in my head, and I realized it was my own, reading words on the page—Myriam was responding.

I pounded down the apartment stairs, my messenger bag banging against my back. Every moment an eternity. Every step, the thump of my bag against me—muffled and extinguished by the muted stairwell. I threw open the front door, nearly hitting another resident of the apartment on their way in. I took hardly any notice of them. I was back outside again. I was to meet Casey at 7pm at Deep Blue, a gastrobar about a fifteen-minute walk from my apartment. On my wrist: an old watch I had found in my bedside table. In my bag: the strange book, slung on my back.

I walked quickly, trying to make sense of things. After seeing Myriam's message I had stared blankly at it, shut the book, and gotten up in a daze. At first I felt unsafe, the target of some kind of operation that was beyond my comprehension, beyond my plane of existence, even. The message in the book had disappeared within minutes, maybe seconds, after my reading it, and had left me staring at an empty page. Somehow, Myriam had known about my rendezvous with Casey. At first, I was confounded. I walked and mulled. The questions in my mind turned over and over, scratching at an imagined blank page. I turned onto Range Ave and crossed over to the other side, stepping around parked cars, and stopped at the curb. I stood on the cobblestone walk that marked the beginning of the old downtown. I studied my shoes against the unevenly placed stones. I stood hunched there, losing my gaze to the ground while the world spun around me. I wasn't sure why I had stopped.
It was as if I were frozen—I couldn't move my body save for the clenching and unclenching of my hands. All the while that I stood like this, I felt an eerie sixth sense, as if a great wave was going to crash over me. I stood wanting to look back but resolving not to. Eventually, I broke free. I could move again. I didn't look back. Everything clicked in that first footstep out of that immobile state. My questions were answered: Myriam had read the message from Casey through my phone—my phone that was still somehow working in another ethereal realm. It was clear to me now. Myriam was reading my e-mails. She was in real-time lock-step with me. She was stuck in her own eternity, watching the stream of data of my life move past her, watching and absorbing it as she liked.

I continued down the cobblestone path at a reasonable rate, minding my step for the occasional uneven stone, until the sign of Deep Blue came into view. The downtown was bustling with people, despite the cold weather and snow. Half of them could well have been tourists, periodically stopping to find their way with maps on their phones, or to send a message to someone out of view, out of picture, someone who might be at the other end of the world—and it seemed they had to do this in the middle of the sidewalk. I found myself grumbling. I had become moodier in the preceding days.

I reached the door to Deep Blue and stopped, feeling an impulse to message Casey and say I had arrived. I couldn't, of course. Instead, one of us would have to go into the restaurant and be there waiting for the other. We hadn't agreed to stand outside and wait to go in together. But there I was, standing outside, surveying the landscape of downtown, searching for a familiar face in a crowd of strangers. I stood staring for a minute. I wanted to shake myself out of this slump I was sliding into. I hadn't seen Casey in years and this wasn't the disposition I wanted to show up with. I had nothing to prove to her, of course, but it would have been nice to let her know that I wasn't losing my mind. Which I might have been. I headed into Deep Blue.

"Acton!"

I was stomping through Deep Blue when I heard my name called out. Casey's voice pulled me out of the mud. You might even say my heart soared at the sound of her voice. That's what the voice of an old friend can do to you. I had been in my own head too much—and it was only her voice cutting through the bog of my thoughts that made me realize it. I made my way to her table. She had gotten a small booth at the back of the restaurant. I looked at my watch. It was 7:02. The restaurant was bustling.

She jumped up, arms outstretched for a hug as I approached. She was wearing a colourful knit sweater, loose and baggy. I sank into her arms and found myself breathing in deeply. In her arms, I felt something stirring in me. In a friend, even one you hadn't seen in some time: support, meaning, connection, a feeling of being grounded. I could have stood in her arms, in the way of the staff and the customers in the middle of that restaurant, for the rest of the night. Instead, I grew self-conscious that I would be the one holding on longer than her. I let go.

I sat down, tossing my bag haphazardly (forgetting what was in it) into the corner of my side of the booth. As soon as I sat down we were immediately swarmed by the staff. We both held our tongues, not wishing to begin the business of our catching-up in the immediate presence of others. We put in an order for some appetizers.
As promised, Casey put in an order for two drinks, the same for the both of us. Then, we were left to ourselves. I checked my watch again, expecting it to be 8pm already. It was 7:09.

"This is a pleasant surprise," I found myself saying. Between us, there was a small centerpiece with a lit candle. The flame of it danced under the whims of my opening remarks.

"Agreed." Casey took a sip of her water and I watched her.

"You're the only person who e-mailed me back. I didn't expect anything from anyone, I suppose, but I was surprised that you were up for it."

"Why's that?"

"I don't know, it's been some time since we got together. People get in their head about that sort of thing," I said.

"Which is funny, don't you think? Maybe you were surprised I was interested, when most people would think old friends from the past make for awkward communications. But I've never really felt that way. People weave in and out of each other's lives over the years. It's normal. Something to be grateful for, really."

"That's insightful," I said, pausing to sip my water. "I suppose I'm also a little surprised—I would have thought that the people I had been messaging with over my phone just weeks ago would get back to me when I e-mailed them, but instead it was the opposite."

"How many people did you send your message to?"

"Thirteen or fourteen," I said.

"E-mail is old, now. It's a dumping ground."

"I guess."

"So the phone is gone. And it pushed you to reach out to some old friends." I tasted my drink, which had appeared moments ago. I had already forgotten what she had ordered. It definitely had some gin in it, but maybe some lemon and honey, too.

"Gone…" I echoed back to her. "Yes, temporarily."

"What do you mean?" she asked.

"Well, it's a bit of an odd story," I said. My eyes shot to my bag in the corner. Casey's look followed my eyes to the bag, slumped in its seat of the booth.

"This is a really good drink," I said.

"Acton, quit being so vague. What happened to your phone? What's in your bag?"

This is what I liked about Casey. She was to the point, and she sometimes made herself laugh in that way. She wasn't laughing now, though.

"Well, my phone…" I said, trailing off again. How much did I actually want to share about this? I looked at my bag again and reached for it. "Well, it would be easier to just show you."

Out of the corner of my eye, I could see Casey watching me as I opened my messenger bag and pulled out the book that had started all of this.

"What's that?"

"Well, it's supposed to be a novel, but look—" I opened the book to the page where the image of my phone resided and passed it over to her. "When I opened this book—"

Casey received the book in her hands, and that's when I realized what I had just done. I lurched across the table to take it back, but it was already too late. Myriam, the book, whatever it was in there, had slurped up Casey's phone too. Before Casey could even cry out, her phone appeared next to mine in the corner of the page, in the very same painterly style.

"Fuck." Casey looked up at me in disbelief, while simultaneously patting the table where her phone had been seconds ago.

"It's gone," I said flatly.

"What the hell, Acton! What just happened?"

"I'm sorry, I didn't realize it was going to happen again. I—I was going to show you. Look—that's my phone. It slurped up my phone." I paused. "And there's yours," I pointed dumbly.

"I can see that, Acton, now show me the trick to return them back!"

"Ah-ha," I laughed weakly. "There isn't one."
Casey lifted herself out of her seat and looked around the restaurant wildly, as if she was expecting cameras from a reality TV show to pop out of nowhere. She sat back down and took a deep breath, followed by a large gulp of her drink.

"Explain yourself."

I figured this was not going to turn out to be the dinner I had hoped for. Casey looked at me for a good minute before she spoke. I had just finished explaining everything: from the moment I had picked up the book to right before I had left my apartment after communicating with Myriam for the second time. I hadn't told her about the dream I had had, though.

"This is insane."

"It is insane," I confirmed.

Her eyes drilled into me. Mostly, she looked furious. But eventually, that fury gave way to an uneasy look.

"So, if I were to take this book, walk over to that table over there," she nodded to her right, "and plop it in front of that bald dude, it would slurp up his phone."

"I'm pretty sure," I said. "It has only happened twice, and there might be other conditions necessary for it to happen. Maybe the phone needs to be within a certain distance from the page, or maybe the person has to be holding the book open. I don't know."

"Well, I don't know, either," Casey sounded exasperated. "But I'm really tempted to try."

"He might not take it as well as you did," I said uneasily. I looked over at the guy Casey had gestured to. He didn't look like the thumping type, but who knows, he could probably beat us to kingdom come.

"This is fucked up. What about this bitch in the book?"

"Hey, I don't know that you need to call her a bitch."

"She certainly seems like a manipulative bitch! She used you to steal all my information. By what you told me, right now she's scanning all my messages, pictures, and e-mails."

"She could be," I said, putting my head into my hands.

"Sounds like a real bi–"

"—Look, call her what you want," I said, cutting off Casey, "but I believe what she said about being stuck in the book."

Casey picked up the book and nervously opened it, as if it might suck her entire self right into its empty pages. Hey, it might as well have. It turned out that the impossible was possible. Or maybe I should say, the very strange and unreal had been made real.

"Solah J. Trek," Casey muttered.

"That's the author," I said, just catching Casey's words.

"I know," Casey said, her eyes widening in annoyance. "I'd look her up if I had my phone," she added curtly.

"Look—you can be mad about this all you want, but like it or not—huh," I paused.

"What?"

"I never looked up the author. And she told me to. 'Take it up with my author,' Myriam had said, or something like that. I never did. I must have been too shocked by what had happened."

"Or were you too busy cooing over your new love Myriam," Casey snapped at me. Underneath the anger I could see she was still able to tease me. Maybe she was feeling a bit better. I blushed, thinking about the dream in which I had walked, shoulder to shoulder, with Myriam—until I had grown too big, and her too small.

"She's too old for me," I muttered.

"From 1955, by the looks of it," Casey said smartly, and snapped the book shut. She paused and looked at me again. I could see her fingers moving absently across the painted cover of the book. "Let's go back to your place. You're still on Hatchet Ave, yeah?"

"Yeah."

"Well, let's go look her up. I'm not leaving until we get to the bottom of this. After all, I use my phone to unlock the door to my apartment."

"You do?"
"And guess what else," she said, tapping the table. "The bill?" "That’s right." "Well, I owed you a dinner anyway, didn’t I?" I asked, as I waved the waiter down to order. We trudged back through the snow to my place. With Casey there, there was none of that walking and stopping to feel like I was about to get hit by an invisible wave. Sure, I felt uneasy, but at least with Casey beside me I had someone to shoot the breeze with. We ended up catching up more on the walk home than we did over dinner. I was surprised she didn’t want to drill me with more questions about Myriam. I patted the book, tucked away in my messenger bag at my side as we walked. It was still there. My compulsive checking to make sure my phone was in my pocket had been replaced by something else. I learned plenty about Casey as we walked. In the years since we had last gotten together, she had gone through some life changes that made my life look as plain and untroubled as a bulletin board in an elementary school. Her Mom had died just months after we had last met up. She had had a terrible job during that time that had treated her like shit while her mother was in the hospital. I had only met her Mom twice—both times at school. I remember her watching in the bleachers at one of our painful intracity wrestling competitions. Her mom actually saw me and called me over to watch Casey wrestle in her first match in the girls tournament. I remember there weren’t many girls into wrestling and so there were only a few matches to get to the top of the ladder. Casey’s mom and I had talked about that a little. I hadn’t stayed to watch Casey’s match to the end. I had come up with some excuse about having to talk to my coach, or something, when really, I was avoiding having to be next to her if Casey lost the match. Something about that had been too painful to imagine, and so I politely excused myself after our brief conversation. With this memory before me, I felt ashamed of my behaviour, hearing that Casey’s mom had since died. I know neither of them would have held it against me, but when you find out someone’s gone, sometimes the first thing your mind does is think about the last dumb thing you did in front of them. It wasn’t just that Casey’s mom had died. She shared a little about trying out a secondary community college program that was a complete miss. Then she went on to say that her high school boyfriend had broken up with her while she was trying to make sense of all these huge life changes. I vaguely remembered the guy. He was one of the quiet, inconspicuous types in high school. I know the type because I had been the type. I probably still was. We walked and talked, though it was more me listening. I hesitate to say much about my life, because it didn’t feel like I had really had a life in comparison to what Casey had been through. On top of that, I was realizing that for me, what was almost a farcical experience of having my phone sucked up into a book, was probably not the same for this other person walking beside me. We stepped through the snow together abreast. I was half-listening now, because I was reminded of my dream where I walked beside Myriam. In it, I had seen myself grow bigger and bigger. But now, beside Casey, that was the last thing I felt. I felt like I wanted to shrink into non-existence with each step. It was with that sense of shame that we arrived at my apartment entry. By this time we had been walking in, what I hoped was, a comfortable silence. 
We walked up the stairs, gripping the railing: the apartment super still hadn't set up any anti-slip devices for the winter, and the stairs were as slippery as wet marble (I assure you, there wasn't a lick of marble anywhere in this dingy old apartment building). We entered my apartment and I put my bag on the kitchen table. I went to the fridge. "Tea? Beer?"

"A beer would be great," said Casey's voice from around the corner. I heard the distinct sounds of my messenger bag's buckles unbuckling. I leaned around the corner: Casey was throwing herself down on the couch with the book. I turned back to the fridge and grabbed the last two beers. I joined Casey on the couch, handing her one.

"Thanks," she said absentmindedly.

"I hope no one needed to get in touch with you tonight," I said glumly, looking over at the page with our phones on it. Casey was running her thumb over the images.

"No, but we'll solve this tonight."

"How do you figure?"

"Well, you're going to give me a pen, and I'm going to tear Myriam a new one."

I shrugged. I got up and got the pen and the ink and showed them to Casey. She glared at them. "You spent how much on this hokum?"

"Don't worry about it," I said gruffly. I had spent over $50, was the answer, and I wasn't working right now, thank you very much.

Casey pulled the table in front of the couch closer and put the book, the ink, and the pen down in front of her. She took out the pen and hovered it over the page.

"Wait—what are you going to write?"

"I don't think you get to ask me that. Your little book here sucked up my phone without asking, and now I'm going to write. Either get out of the way, or watch quietly."

I sighed. She had every right to say that. I inched closer to her on the couch. I could smell the pub on both of us. I watched as she wrote. Her handwriting was far nicer than mine.

And that was what Casey wrote. We both sat back on the couch, as if we had done a workout. Then, the words began to slowly disappear into the page as before. We watched, expectantly. We paused, waiting for more words to appear, but there were none.

"The audacity!" Casey crowed. "This bitch barely acknowledged me!"

"Wow," I said, dumbly.

"Well, you better write something. At least she'll talk to you," Casey pushed the pen into my hands and I received it clumsily. I began to write.

"I was thinking more, give us back our phones," Casey said. My eyes bulged and I held my palms up defensively.

"It's fine, it's fine—she's writing back," Casey said hurriedly.

We both stared dumbly at this response, until it faded away into the page.

"We are dealing with a very strange entity here, Acton."

"I know," I said. "What do we do?"

"I don't know."

"Know, don't know—whatever state we're in, we can't keep continuing like this. I think she's on a mission to absorb more information—she probably wants us to expose this book to other people so that it sucks up their phones too."

"I suppose," I said slowly.

"We need to destroy it!"

"Destroy it!" I gasped. "What about our phones?" I cried out feebly.

"They're gone, and everything in them. There's no way she's going to give them back," Casey snorted, eyeing the book.

I grabbed at the book automatically and clasped it to my chest. "We can't destroy it!"

My heart was racing. I felt the invisible wave about to crash over me again. What had changed in me to make me this way? Why did I want to protect this book, protect Myriam? Casey wanted to destroy her. Burn the book, or rip it to shreds. But she was inside it. I knew I couldn't let her do it.
It was true that something had changed in me the moment Myriam had sucked up my phone. My life had slowed down considerably, but I was also seeing things differently. I looked at the world differently. Everyone around me appeared less real than Myriam! All of them were sucked into their phones, while mine had been sucked away. My relationships with the people of this world were hardly meaningful, I saw that now. They were facile and fragile things—like something a child had made with glue and popsicle sticks: sloppy, brittle, ready to break at a moment’s notice. I felt a warmth in my chest where the book was. Across from me, Casey appeared venomous, her face angular and snake-like—ready to strike. I inched back on the couch. "Acton," my name floated out of her mouth, wrapped in a warning tone. Her lips hardly seemed to move. She moved toward me slowly, her eyes locked on mine. I was stuck, caught in her gaze. She moved closer still, her upper body hovering over the couch where we sat. Then, she pounced. I couldn’t help but release the book: she pounced on me, not it. She struck me like a shot from a cannon and together we fell back into the couch. I heard the book fall from the couch to the floor with a muffled thump. My face was awash in her hair. Behind the smell of the pub was a fragrance from earlier in her day. She reached for the book. I writhed underneath her and rolled off the couch, hitting the floor—slap against the mats. I got up onto my hands and knees and grabbed for the book but she was already behind me. I knew I was done for. Wrestling between boys and girls in high school was forbidden, but it all came back to me the same—and evidently for Casey too. I felt her arm cross under my chin and her other arm snake through my legs. Click, went the padlock of her hands. She pulled me back and we both rolled backward. It was over before it started. "You still wrestle?" I gasped out. "No," she laughed, "but a champion doesn’t forget her moves." A champion doesn’t forget her moves. I had never actually watched Casey’s matches, that day I sat with her mom. Now it was sounding like she had taken home the trophy. I lay there, pinned by her, her hair brushing into my face. She smelled nice. The promise of spring. She must have sensed that I was giving up whatever fight was in me, because I felt her grip loosen. I relaxed into her body, my head unlocked and leaning back into her shoulder. "I thought you hated wrestling." "I did, sometimes," she said distantly. I couldn’t look at her face from where I lay. "Now are you going to stand in the way of burning this book or not?" "No," I sighed. "Get some kindling, then." I wandered through my apartment, grabbing a newspaper from my recycling. I had a wooden box that a few clementines had come in; I moved the fruit to the fridge and kept the box. I found some matches and brought some rubbing alcohol from the bathroom. I returned to Casey and picked the book off the floor. It felt heavier. I walked over to the fireplace and kneeled down. I looked back at Casey. She stared back at me. My fingers ran back and forth absentmindedly along the strange impasto cover. I turned back to the book, looked at it once more, and then ripped off the cover. I poured the alcohol on it, tossed it in the fireplace, and threw the small crate on top. I lit a match, stepped back, and threw it in. The alcohol burned off quickly with a whoof, but the book still caught. It did not take long to burn.
There was not much smoke, and what little there was sent some creature who had taken up residence in the chimney scuttling up and out into the night air. I returned to the couch and we watched the book burn until it was no more than a blackened crisp. Neither of us said anything. Finally, when it was over, Casey turned to me. "I still won’t be able to get into my apartment, at least until I can get ahold of the super." "Right," I said tiredly, "take my bed. I’ll sleep on the couch." "Thanks," she said, patting me on the leg as she stood up. I remained on the couch to watch the last few embers in the fireplace. I listened as Casey roamed around my apartment, finding whatever she needed to get situated to sleep over. I pulled the couch blanket over me, a comfortable weariness settling in. I lay there listening to the apartment and the neighbourhood. I imagined Casey pressed against me. She had pinned me fair and square, not that I had had much fight in me in the first place. It had all happened in a matter of seconds. In light of that, I felt a competitive spark growing within me. Something tonight had lit it, and I imagined it would grow just as the embers in the fireplace diminished into nothing. I fell asleep hoping to dream about a bridge.

- The book was originally published in 1955.
- The author had written several books.
- Myriam's Codebreakers was the last book that Solah J. Trek wrote; she died shortly after it was published.
- A plot summary online told me the book was mostly a love story involving a woman in a post-World-War-Two workplace.
- Early computers feature prominently in the book, specifically ones developed from the machines built and used for wartime decryption.

Kev Quirk 3 weeks ago

Three Men Tried To Steal My Motorbike While I Was On It!

Spring is in the air here in North Wales, so I decided to take one of my motorbikes to the office yesterday. On the way home, not too far from where I live, I was sat at traffic lights when all of a sudden three men on off-road bikes surrounded me. One left, one right, one right up to my back wheel. And they were really close, like, inches from me kinda close. I immediately felt uneasy, like something was about to happen. I think we as a species have a sense for this kinda thing. Anyway, seconds later the guy on my left reached over to, I assume, grab the keys for the bike, but I was on the BMW, which has a keyless ignition, luckily. I clocked what the guy was trying to do, panicked, and kicked the side of his bike as hard as I could. Which, thankfully, was enough to put him off balance, causing him to topple over. Then I clobbered the guy to the right around the head - he was wearing a helmet so it wouldn't have hurt him, but I suppose I figured it would be enough to shock him and buy me a couple of seconds. I dunno, I was basically shitting my pants at this point. As soon as I'd hit the guy to my right, I took off like the absolute clappers, running a red light in the process (thank goodness nothing was coming the other way). My BMW is a fast bike, at 1000cc and over 170BHP. They were on dirt bikes, which are nowhere near as quick as mine. I also had knowledge of the local roads, which I hoped they didn't. As I flew off, they gave chase but quickly dropped back. A brief glance at my speedo showed I was doing over 120MPH, but it was working. In my panic I didn't know what to do - should I go home? What if they see me pull in and find out where I live? Should I go somewhere else? But it's rush hour, and if I get caught in traffic they could catch up to me again - my bike is a lot quicker on the open road, but in traffic, they would have the advantage. I decided to floor it and get home as quick as possible. There's a straight road that leads to my village, so I figured if I couldn't see them behind me I'd quickly swing the bike in and hide behind the garage (which can't be seen from the road). If I could see them, I'd just carry on and continue trying to lose them. I'm nearing my drive now, so I glance in the mirrors and see nothing; I decide to risk it and swing in, going up our gravel drive as quickly as I dare, while simultaneously hoping the kids aren't playing in the drive. They aren't. I dive in behind the garage and wait... 5 seconds... 10 seconds... I hear bikes getting closer. They fly right past my drive, going way too fast for our single-track village road. My wife later asked the owner of the village pub if he caught anything on his CCTV. Here's what he found for us: It looks like only one of them had a number plate, and it's pretty much parallel with the road, so impossible to identify from the video. We've passed it on to the Police and we're waiting to hear back from their forensics dept. to see if they can pick up any prints from my bike. I don't remember if they had gloves on though, and I'm not very confident it will come to anything. I'm fine now, but it shook me up. I just hope they were opportunist idiots, rather than something more sinister. I've already bought myself a camera for the garage. Stay safe out there, folks. Thanks for reading this post via RSS. RSS is ace, and so are you. ❤️ You can reply to this post by email , or leave a comment .

Kaushik Gopal 3 weeks ago

Podsync - I finally built my podcast track syncer

I host and edit a podcast 1 . When recording remotely, we each record our own audio locally (I on my end, my co-host on his). The service we use (Adobe Podcast, Zoom, Skype-RIP) captures everyone together as a master track. But the quality doesn’t match what each person records locally with their own microphone. So we use that master as a reference point and stitch the individual local tracks together. This is what the industry calls a “ double-ender ”. Add a guest and it becomes a “triple-ender”. But this gets hairy during editing. Each person starts their recording at a slightly different moment — everyone hits record at a different time. Before I can edit, I need to line everything up. Drop all the tracks into a DAW, play the master alongside each individual track, nudge by ear until the speech aligns. Add a guest and it gets tedious fast. 10–15 minutes of fiddly, ear-straining alignment before I’ve even started editing. There’s also drift. Each machine’s audio clock runs at a slightly different rate, so two tracks that are perfectly aligned at minute one might be 200ms apart by minute sixty. So I built Podsync 2 . I first heard of a similar technique from Marco Arment — back in ATP episode 25 . He had a new app for aligning double-ender tracks and was already wondering whether something so niche was even worth releasing publicly. I don’t think he ever released it. Being a Kotlin developer at the time, I figured I’d build my own. Java was mature. Surely there were audio processing libraries that could handle this. There weren’t 😅. At least not in any clean, usable form. Getting the right signal processing pieces together in JVM-land was awkward enough that my interest fizzled, so I kept doing it by hand. When I revamped Fragmented , I finally came back to this. I used Claude to help me build it — in Rust, no less. 3 But before you chalk this up as another vibecoded project, hear me out. The interesting part here wasn’t just that AI made it easier. It was thinking through the actual algorithm: Voice activity detection ( VAD ) to find speech regions. MFCC features to fingerprint the audio. Cross-correlation to find where the tracks match. Some real signal processing techniques, not just prompt engineering. Now, could I have prompted my way to a solution? Probably. But I like to think years of manually aligning tracks — and some sound engineering intuition — helped me steer the AI towards a better solution. Working on this felt refreshing. In an era where half the conversation is about AI replacing engineering work, here’s a problem where the hard part is still the problem itself — understanding the domain, picking the right approach, knowing what “correct” sounds like. It gives me confidence that solving real problems well still has its place. I like how Dax (thdxr on Twitter) put it: I really don’t care about using AI to ship more stuff. It’s really hard to come up with stuff worth shipping. The core idea: take a chunk of speech from a participant track, compare it against the master recording, find where they match best. That position is the time offset. The trick is picking which chunk of speech to use. Rather than betting on a single region, Podsync finds a few strong candidates per track (longer contiguous speech blocks preferred) and tries each one against the master. For long candidates, it samples from the start, middle, and end. The highest-confidence match wins; if a second independent region agrees on the same offset, that corroboration factors in as a tie-breaker.
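To make the matching step concrete, here is a minimal sketch of the cross-correlation idea in Python with NumPy. It is my own illustration, not Podsync's actual code (which, per the description above, works on MFCC feature vectors rather than raw samples), and every name and parameter here is invented:

```python
# Sketch of the core alignment idea: cross-correlate a speech chunk from a
# participant track against the master to estimate the time offset.
# All names here are illustrative, not Podsync's actual API.
import numpy as np

def estimate_offset(master: np.ndarray, chunk: np.ndarray,
                    chunk_start: int, sample_rate: int) -> tuple[float, float]:
    """Return (offset_seconds, confidence) for where `chunk` best fits in `master`.

    `chunk` is a region of speech taken from the participant track,
    starting at sample index `chunk_start` within that track.
    """
    # Normalize both signals so the correlation peak reflects shape, not loudness.
    master = (master - master.mean()) / (master.std() + 1e-9)
    chunk = (chunk - chunk.mean()) / (chunk.std() + 1e-9)

    # Full sliding comparison: the index of the peak tells us where the chunk
    # lines up best inside the master recording.
    corr = np.correlate(master, chunk, mode="valid")
    best = int(np.argmax(corr))

    # Offset of the participant track relative to the master: where the chunk
    # was found in the master, minus where it came from in the participant track.
    offset_samples = best - chunk_start
    # Peak-to-mean ratio as a crude confidence measure.
    confidence = float(corr[best] / (np.abs(corr).mean() + 1e-9))
    return offset_samples / sample_rate, confidence
```

For realistic track lengths you would want an FFT-based correlation (for example, scipy.signal.fftconvolve with one input reversed) instead of this brute-force version, but the peak-picking logic stays the same.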
After finding the offset, Podsync pads or trims each track to align with the master and match its length (and outputs some info on the offset). Drop the output into my DAW at 0:00. Done. I even wrote an agent skill you can just point your agent harness to, and it will take care of all the steps for you. What used to be 10–15 minutes of alignment per episode is now a single command. Marco, if you ever read this: I would still love to see your implementation! His solution (as I understand it) is aimed more at correcting drift than at getting the offset right. In practice, I haven’t found drift to be much of a problem. It exists but stays minor, and I’m typically editing every second of the podcast anyway, so it’s easy enough to handle by hand. I even had a branch that corrected drift by splicing at silence points, but it complicated things more than it helped. It’s a podcast on AI development but we strive to make it high signal. None of that masturbatory AI discourse .  ↩︎ See also Phone-sync .  ↩︎ I chose Rust (it’s what interests me these days ), and a CLI tool with no runtime dependencies is more pleasant to distribute.  ↩︎

Neil Madden 3 weeks ago

Maybe version ranges are a good idea after all?

One of the most important lessons I’ve learned in security is that it’s always better to push security problems back to the source as much as possible. For example, a small number of experts (hopefully) make cryptography libraries, so it’s generally better if they put in checks to prevent things like invalid curve attacks rather than leaving that up to applications , so that we don’t get the same vulnerabilities cropping up again and again. It’s much more efficient to fix the problem at source than to have everyone re-implement the same redundant checks everywhere. Now consider how we currently manage security vulnerabilities in third-party software dependencies. Current accepted wisdom is to lock dependencies to a single specific version, often with a cryptographic hash to ensure you get exactly that version. This is great for reproducibility, and everyone loves reproducibility. However, when there’s a security vulnerability in that dependency, every single consumer of that library has to manually update to the next version, and then their consumers have to update, and so on. The fix is done at source, but the responsibility for updating cascades through the entire ecosystem. This is not efficient. Two years after log4shell, around 25% of vulnerable consumers had apparently still not updated . To solve this problem we have created an industry of automated nagging software: SCA tools that alert you to all the “risk” you are carrying, and the ever-watchful Dependabot, which will automatically upgrade everything for you. Combine this with CVSS severity inflation ( CVSS 4 is not helping in this regard ) and the acceleration in production of CVEs , and it’s not surprising that many developers find the whole situation demoralising and stressful. It’s an almost constant churn of new must-fix CVEs to address, especially when only about 1% of CVEs will ever go on to be exploited (rising to about 4.25% for critical CVEs ). This is not a sustainable or efficient situation. There’s clearly a problem, but what would a solution look like? I have some ideas, but this is a complex problem where it is easy to introduce unintended side-effects. So take these suggestions as just that: suggestions , meant to provoke discussion, not a perfect fully-baked solution. There are lots of competing factors to balance here, and I’m not going to claim that I’ve considered them all. Also, many of the suggestions I make below are not currently actionable . They are ideas for what the future might look like, not something you can implement right now. Ultimately, I think that locking to specific versions is a mistake . And by locking, I mean not just explicit lockfiles, but also things like Maven, where dependency versions are (usually) uniquely determined by the POM. This feels like heresy to utter in 2026, and I’m sure there will be lots of angry reactions to this post. But in my opinion, it would be much healthier in general if software builds always pulled in the latest patch version of a dependency (and of transitive dependencies), and specified only a particular major version and minimum minor version. (Although even that can be problematic ). “But, but, but, …”, I hear you scream. What about supply chain attacks? What about deterministic builds and reproducibility? What about unintended breakages in patch versions? Locking to a particular version lets you be more controlled in applying updates: Dependabot automatically upgrades, yes, but it raises a PR and lets you run your test suite first.
This is surely better than just automatically pulling in the latest thing every time. What if someone publishes a malicious version of the package? I don’t want to just pull that in straightaway! These are all completely valid concerns, but I believe they can be addressed by changes to dependency resolution:

- Just as you should implement a time delay for Dependabot to give some leeway for supply chain attacks to be discovered, the same should happen here: dependency resolution should have a built-in time delay, so that new versions are not resolved until they are at least N days old. (I believe most repos already track version publication time.) This could be controlled by setting a policy, so that e.g. you can have a canary CI pipeline that always builds with the latest versions to flag any incompatibilities early.
- It should be possible to shun versions that are known to cause test failures or other incompatibilities. Ideally such shunning information would also feed back to the central repository so that frequently shunned versions can be investigated. A sudden version update breaks your PR for reasons unrelated to your changes? Shun it!
- We change the default from opting in to security updates to opting out.
- Building from source should always produce a detailed SBOM that lists exactly which versions of which libraries went into that build. It should then be possible to specify the SBOM when (re-)building to have it resolve exactly those versions, giving us back reproducibility. Essentially, this produces the same information as a lockfile, but at build-time rather than commit-time. This allows retrospective rather than proactive reproducibility. (If you want to be a bit more deterministic around releases, then it seems reasonable to me to switch to SBOM-locked builds at code-freeze.)

(You could implement some of these things right now by having your build scripts run e.g. “uv lock --upgrade” or “mvn versions:use-latest-versions” before each build, but again this is shifting the responsibility onto consumers to implement.)

How would this be better? It means that the default shifts from pulling in fixed insecure versions to always pulling in newer, more secure versions. It’s based on an assumption that the overwhelming majority of software patches are good. It also shifts work away from downstream developers: for the most part, updates will happen automatically and without any manual intervention. And it happens for everyone, not just the projects mature enough to be running Dependabot. And it happens on every active release branch, not just on main. A further advantage of this approach is that most low and medium severity issues (and probably a fair number of “high” ones too) could be fixed without a CVE being issued at all. The whole CVE process exists largely so that vendors can scaremonger and sell tooling, and security researchers can make a name for themselves. I frankly find it one of the most embarrassing and immature aspects of software security. Many smaller projects don’t have the time or inclination to issue CVEs, so they just silently fix any security bugs in the next release. And frankly that should be the norm. The only reason it isn’t the norm is that we’ve got ourselves into a situation where CVEs have to be published because nobody updates without them. The default is to stick with the older insecure versions, so you have to scream loudly to overcome that inertia. Because updating is work and not updating is free. Switch the default and perhaps we can all start to calm down a bit. Maybe.
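As a concrete sketch of that resolution policy, here is roughly what "newest patch within a pinned major version and minimum minor, at least N days old, minus shunned versions" could look like. This is my own illustration in Python; no package manager exposes this exact API today:

```python
# Illustrative sketch of the proposed resolution policy, not a real package
# manager's API. Assumes strict MAJOR.MINOR.PATCH version strings.
from datetime import datetime, timedelta, timezone

MIN_AGE_DAYS = 7  # leeway for supply chain attacks to be discovered

def resolve(candidates: list[dict], major: int, min_minor: int,
            shunned: set[str]) -> str | None:
    """Pick a version from `candidates`, each shaped like:
    {"version": "2.5.3", "published": datetime(...)}."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=MIN_AGE_DAYS)
    eligible = []
    for c in candidates:
        ma, mi, patch = (int(x) for x in c["version"].split("."))
        if ma != major or mi < min_minor:
            continue  # stay within the compatible range
        if c["version"] in shunned:
            continue  # known-bad: skip and (ideally) report upstream
        if c["published"] > cutoff:
            continue  # too new: give the ecosystem time to spot a bad release
        eligible.append((ma, mi, patch, c["version"]))
    # Highest minor, then highest patch, wins.
    return max(eligible)[3] if eligible else None
```

The point is where this logic would live: in the resolver itself, so every consumer gets the safer default without having to opt in.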

Stratechery 4 weeks ago

Spring Break

Stratechery is on a bit of a disjointed Spring Break, as my usual week off will be spread out: I will return to my usual posting schedule on Tuesday, March 31. All other Stratechery Plus content, including my podcasts, will stay on schedule.

- There will be no Update on Thursday, March 19
- There will be no Update on Monday and Tuesday, March 23–24; there will be an Update and Interview on Wednesday and Thursday, March 25–26
- There will be no Update on Monday, March 30

daniel.haxx.se 1 months ago

One hundred curl graphs

In the spring of 2020 I decided to finally do something about the lack of visualizations for how the curl project is performing, development-wise. What does the line-of-code growth look like? How many command line options have we had over time, and how many people have done more than 10 commits per year over time? I wanted to have something that would visually show me how the project is doing, from different angles, viewpoints and probes. In my mind it would be something like a complicated medical device monitoring a patient, which a competent doctor could glance at to assess the state of the patient’s health and welfare. This patient is curl, and the doctors would be fellow developers like myself. GitHub offers some rudimentary graphs but I found (and still find) them far too limited. We also ran gitstats on the repository, so there were some basic graphs to get ideas from. I looked around to see what existing frameworks and setups I could base this on, as I was convinced I would have to do quite some customizing myself. Nothing I saw was close enough to what I was looking for. I decided to make my own, at least for a start. I decided to generate static images for this, not add some JavaScript framework that I don’t know how to use to the website. Static daily images are excellent for both load speed and CDN caching. As we already deny running JavaScript on the site, that saved me from having to work against that. SVG images are vector based and should scale nicely. SVG is also a better format from a download size perspective, as PNG almost always generates much larger files for this kind of image. When this started, I imagined that it would be a small number of graphs, mostly showing timelines with plots growing from lower left to upper right. That would turn out to be a little naive. I knew some basics about gnuplot from before, as I had seen images and graphs generated by others in the past. Since gitstats already used it, I decided to just dive in deeper and use this. To learn it. gnuplot is a 40-year-old (!) command line tool that can generate advanced graphs and data visualizations. It is a powerful tool, which also means that not everything is simple to understand and use at once, but there is almost nothing in terms of graphs, plots and curves that it cannot handle in one way or another. I happened to meet Lee Phillips online, who graciously gave me a PDF version of his book, aptly named gnuplot . That really helped! I decided that for every graph I want to generate, I first gather and format the data with one script, then render an image in a separate, independent step using gnuplot. That made it easy to work on them in separate steps, to subsequently tune them individually, and to view the data behind every graph if I ever think there’s a problem in one. It took me about two weeks of on-and-off working in the background to get a first set of graphs visualizing curl development status. I then created the glue scripting necessary to add a first dashboard with the existing graphs to the curl website. Static HTML showing static SVG images. On March 20, 2020 the first version of the dashboard showed no fewer than twenty separate graphs. I refer to “a graph” as a separate image, possibly showing more than one plot/line/curve. That first dashboard version had twenty graphs using 23 individual plots. Since then, we display daily updated graphs there .
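That gather-then-render split looks roughly like this. A minimal sketch in Python, with invented file names and fake data; curl's real scripts live in the stats repository linked at the end of the post:

```python
# Sketch of the two-step pattern described above: one script gathers and
# formats the data, a separate gnuplot step renders the SVG. The file names
# and the data here are made up for illustration.
import subprocess

# Step 1: gather/format data (here: fake monthly commit counts as CSV).
rows = [("2020-01-01", 110), ("2020-02-01", 95), ("2020-03-01", 130)]
with open("commits.csv", "w") as f:
    for date, count in rows:
        f.write(f"{date},{count}\n")

# Step 2: render with gnuplot in a separate, independent step.
gp = """
set terminal svg size 800,480
set output 'commits.svg'
set datafile separator ','
set xdata time
set timefmt '%Y-%m-%d'
set format x '%Y-%m'
plot 'commits.csv' using 1:2 with lines title 'commits per month'
"""
subprocess.run(["gnuplot"], input=gp, text=True, check=True)
```

Keeping the data file on disk between the two steps is what makes it easy to inspect the numbers behind any graph when something looks off.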
All data used for populating the graphs is open and available, and I happily use whatever is available: the git repository (source, tags, etc), GitHub issues, mailing list archives, curl vulnerability data, hackerone reports and historic details from the curl past. Open and transparent as always. Every once in a while since then I get to think of something else in the project, the code, development, the git history, community, emails etc that could be fun or interesting to visualize, and I add a graph or two more to the dashboard. Six years after its creation, the initial twenty images have grown to one hundred graphs including almost 300 individual plots. Most of them show something relevant, while a few of them are in the more silly and fun category. It’s a mix. The 100th graph was added on March 15, 2026, when I brought back the “vulnerable releases” graph (appearing on the site on March 16 for the first time). It shows the number of known vulnerabilities each past release has. I removed it previously because it became unreadable, but in this new edition I made it show the label only for every 4th release, which makes it slightly less crowded than otherwise.

[Graph: vulnerabilities in releases]

That day we also introduced a new 8-column display mode. Many of the graphs are internal and curl-specific of course. The scripts for this, and the entire dashboard, remain written specifically for curl and curl’s circumstances and data. They would need some massaging and tweaking in order to work for someone else. All the scripts are of course open and available for everyone. I used to also offer all the CSV files generated to render the graphs in an easily accessible form on the site, but this turned out to be work done for virtually no audience, so I removed that again. If you replace the .svg extension with .csv, you can still get most of the data – if you know. The graphs and illustrations are not only silly and fun. They also help us see development from different angles and views, and they help us draw conclusions, or at least try to. As an established and old project that makes an effort to do right, some of what we learn from this curl data might be possible to learn from and use even in other projects. Maybe even use as a basis when we decide what to do next. I personally have used these graphs in countless blog posts, Mastodon threads and public curl presentations. They help communicate curl development progress. On Mastodon I keep joking about me being a graphaholic, and often when I have presented yet another graph added to the collection, someone has asked the almost mandatory question: how about a graph over the number of graphs on the dashboard? Early on I wrote up such a script as well, to immediately fulfill that request. On March 14, 2026, I decided to add it as a permanent graph on the dashboard.

[Graph: Graphs in the curl dashboard]

The next-level joke (although some would argue that this is not fun anymore) is then to ask me for a graph showing the number of graphs for graphs. As I aim to please, I have that as well, although this one is not on the dashboard:

[Graph: Number of graphs on the dashboard showing number of graphs on the dashboard]

More graphs

I am certain I (we?) will add more graphs over time. If you have good ideas for what source code or development details we should and could illustrate, please let me know. The git repository: https://github.com/curl/stats/ Daily updated curl dashboard: https://curl.se/dashboard.html curl gitstats: https://curl.se/gitstats/


XML is a Cheap DSL

Yesterday, the IRS announced the release of the project I’ve been the engineering lead on since this summer: its new Tax Withholding Estimator (TWE). Taxpayers enter their income, expected deductions, and other relevant info to estimate what they’ll owe in taxes at the end of the year, and adjust the withholdings on their paycheck. It’s free, open source, and, in a major first for the IRS, open for public contributions . TWE is full of exciting learnings about the field of public sector software. Being me, I’m going to start by writing about by far the driest one: XML. (I am writing this in my personal capacity, based on the open source release, not in my position as a federal employee.) XML is widely considered clunky at best, obsolete at worst. It evokes memories of SOAP configs and J2EE (it’s fine, even good, if those acronyms don’t mean anything to you). My experience with the Tax Withholding Estimator, however, has taught me that XML absolutely has a place in modern software development, and it should be considered a leading option for any cross-platform declarative specification. TWE is a static site generated from two XML configurations. The first of these configs is the Fact Dictionary, our representation of the US Tax Code; the second will be the subject of a later blog post. We use the Fact Graph, a logic engine, to calculate the taxpayer’s tax obligations (and their withholdings) based on the facts defined in the Fact Dictionary. The Fact Graph was originally built for IRS Direct File and now we use it for TWE. I’m going to introduce you to the Fact Graph the way that I was introduced to it: by example. Put aside any preconceptions you might have about XML for a moment and ask yourself what this fact describes, and how well it describes it (a reconstructed sketch of facts like these appears below). This fact is derived by subtracting total payments from total tax. In tax terms, it describes the amount you will need to pay the IRS at the end of the year. That amount, “total owed,” is the difference between the total taxes due for your income (“total tax”) and the amount you’ve already paid (“total payments”). My initial reaction to this was that it’s quite verbose, but also reasonably clear. That’s more or less how I still feel. You only need to look at a few of these to intuit the structure. Take the refundable credits calculation, for example. A refundable credit is a tax credit that can lead to a negative tax balance—if you qualify for more refundable credits than you owe in taxes, the government just gives you some money. TWE calculates the total value of refundable credits by adding up the values of the Earned Income Credit, the Child Tax Credit (CTC), American Opportunity Credit, the refundable portion of the Adoption Credit, and some other stuff from the Schedule 3. By contrast, non-refundable tax credits can bring your tax burden down to zero, but won’t ever make it negative. TWE models that by subtracting non-refundable credits from the tentative tax burden while making sure it can’t go below zero, using a greater-of operator. While admittedly very verbose, the nesting is straightforward to follow. The tax after non-refundable credits is derived by saying “give me the greater of these two numbers: zero, or the difference between tentative tax and the non-refundable credits.” Finally, what about inputs? Obviously we need places for the taxpayer to provide information, so that we can calculate all the other values. Okay, so instead of marking the fact as derived, we mark it as writable. Because the value is… writable. Fair enough.
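The original post embeds the XML inline; it did not survive syndication here, so the following is a reconstructed sketch of the shape being described. The only element names the text confirms are the minuend/subtrahend labels, so treat every other tag name as a guess rather than the Fact Dictionary's real schema:

```xml
<!-- Reconstructed sketch; tag names beyond Minuend/Subtrahend are guesses,
     not the actual IRS Fact Dictionary schema. -->
<Fact path="/totalOwed">
  <Derived>
    <Subtract>
      <Minuend><Dependency path="/totalTax" /></Minuend>
      <Subtrahend><Dependency path="/totalPayments" /></Subtrahend>
    </Subtract>
  </Derived>
</Fact>

<!-- A writable (input) fact: the taxpayer supplies the value directly. -->
<Fact path="/totalEstimatedTaxesPaid">
  <Writable>
    <Dollar />
  </Writable>
</Fact>
```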
A type element denotes what type of value the fact takes. True-or-false questions use a boolean type, like this one that records whether the taxpayer is 65 or older. There are some (much) longer facts, but these are a fair representation of what the median fact looks like. Facts depend on other facts, sometimes derived and sometimes writable, and they all add up to some final tax numbers at the end. But why encode math this way when it seems far clunkier than traditional notation? Countless mainstream programming languages would instead let you write this calculation in a notation that looks more like normal math. Take this JavaScript example, which looks like elementary algebra: That seems better! It’s far more concise, easier to read, and doesn’t make you explicitly label the “minuend” and “subtrahend.” Let’s add in the definitions for total tax and total payments. Still not too bad. Total tax is calculated by adding the tax after non-refundable credits (discussed earlier) to whatever’s in “other taxes.” Total payments is the sum of estimated taxes you’ve already paid, taxes you’ve paid on social security, and any refundable credits. The problem with the JavaScript representation is that it’s imperative . It describes actions you take in a sequence, and once the sequence is done, the intermediate steps are lost. The issues with this get more obvious when you go another level deeper, adding the definitions of all the values that total tax and total payments depend on. We are quickly arriving at a situation that has a lot of subtle problems. One problem is the execution order. The hypothetical input function solicits an answer from the taxpayer, which has to happen before the program can continue. Calculations that don’t depend on knowing “total estimated taxes” are still held up waiting for the user; calculations that do depend on knowing that value had better be specified after it. Or, take a close look at how we add up all the social security income: All of a sudden we are really in the weeds with JavaScript. These are not complicated code concepts—map and reduce are both in the standard library and basic functional paradigms are widespread these days—but they are not tax math concepts. Instead, they are implementation details. Compare it to the Fact representation of that same value. This isn’t perfect—the placeholder that represents each social security source is a little hacky—but the meaning is much clearer. What are the total taxes paid on social security income? The sum of the taxes paid on each social security income. How do you add all the items in a collection? With a collection-sum operation. Plus, it reads like all the other facts; needing to add up all items in a collection didn’t suddenly kick us into a new conceptual realm. The philosophical difference between these two is that, unlike JavaScript, which is imperative , the Fact Dictionary is declarative . It doesn’t describe exactly what steps the computer will take or in what order; it describes a bunch of named calculations and how they depend on each other. The engine decides automatically how to execute that calculation. Besides being (relatively) friendlier to read, the most important benefit of a declarative tax model is that you can ask the program how it calculated something. Per the Fact Graph’s original author, Chris Given : The Fact Graph provides us with a means of proving that none of the unasked questions would have changed the bottom line of your tax return and that you’re getting every tax benefit to which you’re entitled. Suppose you get a value for total owed that doesn’t seem right.
You can’t ask the JavaScript version “how did you arrive at that number?” because those intermediate values have already been discarded. Imperative programs are generally debugged by adding log statements or stepping through with a debugger, pausing to check each value. This works fine when the number of intermediate values is small; it does not scale at all for the US Tax Code, where the final value is calculated based on hundreds upon hundreds of calculations of intermediate values. With a declarative graph representation, we get auditability and introspection for free, for every single calculation. Intuit, the company behind TurboTax, came to the same conclusion, and published a whitepaper about their “Tax Knowledge Graph” in 2020. Their implementation is not open source, however (or at least I can’t find it). The IRS Fact Graph is open source and public domain, so it can be studied, shared, and extended by the public. If we accept the need for a declarative data representation of the tax code, what should it be? In many of the places where people used to encounter XML, such as network data transfer and configuration files, it has been replaced by JSON. I find JSON to be a reasonably good wire format and a painful configuration format, but in neither case would I rather be using XML (although it’s a close call on the latter). The Fact Dictionary is different. It’s not a pile of settings or key-value pairs. It’s a custom language that models a unique and complex problem space. In programming we call this a domain-specific language, or DSL for short. As an exercise, I tried to come up with a plausible JSON representation of the fact from earlier. This is not a terribly complicated fact, but it’s immediately apparent that JSON does not handle arbitrary nested expressions well. The only complex data structure available in JSON is an object, so every child object has to declare what kind of object it is. Contrast that with XML, where the “kind” of the object is embedded in its delimiters. I think this XML representation could be improved, but even in its current form, it is clearly better than JSON. (It’s also, amusingly, a couple lines shorter.) Attributes and named children give you just enough expressive power to make choices about what your language should or should not emphasize. Not being tied to a specific set of data types makes it reasonable to define your own, such as a distinction between “dollars” and “integers.” A lot of minor frustrations we’ve all internalized as inevitable with JSON are actually JSON-specific. XML has comments, for instance. That’s nice. It also has sane whitespace and newline handling, which is important when your descriptions are often long. For text that has any length or shape to it, XML is far more pleasant to read and edit by hand than JSON. There are still verbosity gains to be had, particularly with switch statements (omitted here out of respect for page length). I’d certainly remove the explicit “minuend” and “subtrahend,” for starters. I believe that the original team didn’t do this because they didn’t want the order of the children to have semantic consequence. I get it, but order is guaranteed in XML and I think the additional nesting and words do more harm than good. What about YAML? Chris Given again : whatever you do, don’t try to express the logic of the Internal Revenue Code as YAML Finally, there’s a good case to be made that you could build this DSL with s-expressions. In a lot of ways, this is the nicest syntax to read and edit.
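To see why the declarative form keeps its intermediates, here is a toy evaluator, far simpler than the real Fact Graph and entirely my own illustration. Because every named fact is memoized, asking "how did you get that number?" is just a walk over the graph:

```python
# Toy declarative graph: each fact is either a literal or (op, dependencies).
# Because evaluation memoizes every named intermediate, we can ask afterwards
# how any value was derived -- the introspection described above.
facts = {
    "totalTax": ("add", ["taxAfterCredits", "otherTaxes"]),
    "taxAfterCredits": ("literal", 4200),
    "otherTaxes": ("literal", 300),
    "totalPayments": ("literal", 3900),
    "totalOwed": ("subtract", ["totalTax", "totalPayments"]),
}

cache: dict[str, float] = {}

def evaluate(name: str) -> float:
    if name in cache:
        return cache[name]
    op, args = facts[name]
    if op == "literal":
        value = args
    elif op == "add":
        value = sum(evaluate(a) for a in args)
    elif op == "subtract":
        value = evaluate(args[0]) - evaluate(args[1])
    else:
        raise ValueError(f"unknown op {op}")
    cache[name] = value
    return value

def explain(name: str, depth: int = 0) -> None:
    """Print how `name` was calculated, including every intermediate."""
    op, args = facts[name]
    print("  " * depth + f"{name} = {evaluate(name)} ({op})")
    if op != "literal":
        for a in args:
            explain(a, depth + 1)

explain("totalOwed")
# totalOwed = 600 (subtract)
#   totalTax = 4500 (add)
#     taxAfterCredits = 4200 (literal)
#     otherTaxes = 300 (literal)
#   totalPayments = 3900 (literal)
```

The real engine decides evaluation order from the dependency structure; the JavaScript version, by contrast, has already thrown all of these intermediate values away by the time you want to ask about them.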
HackerNews user ok123456 asks : “Why would I want to use this over Prolog/Datalog?” I’m a Prolog fan ! This is also possible. My friend Deniz couldn’t help but rewrite it in KDL , a cool thing I had to look up. At least to my eye, all of these feel more pleasant than the XML version. When I started working on the Fact Graph, I strongly considered proposing a transition to s-expressions. I even half-jokingly included it in a draft design document. The process of actually building on top of the Fact Graph, however, taught me something very important about the value of XML. Using XML gives you a parser and a universal tooling ecosystem for free. Take Prolog, for instance. You can relate XML to Prolog terms with a single predicate . If I want to explore Fact Dictionaries in Prolog—or even make a whole alternative implementation of the Fact Graph—I basically get the Prolog representation out of the box. S-expressions work great in Lisp and Prolog terms work great in Prolog. XML can be transformed, more or less natively, into anything. That makes it a great canonical, cross-platform data format. XML is rivaled only by JSON in the maturity and availability of its tooling. At one point I had the idea that it would be helpful to fuzzy search for Fact definitions by path. I’d like to just type “overtime” and see all the facts related to overtime. Regular searches of the codebase were cluttered with references and dependencies. This was possible entirely with shell commands I already had on my computer: XPath to query all the fact paths, a little cleanup of the output, and an interactive fuzzy search over the results. I solved my problem with a trivial bash one-liner (a sketch of the same kind of query appears at the end of this post). I kept going and said: not only do I want to search the paths, I’d like selecting one of the paths to show me the definition. Easy. Just take the result of the first command, which is a path attribute, and use it in a second XPath query. I got a little carried away building this out into a “$0 Dispatch Pattern” script of the kind described by Andy Chu . (Andy is a blogging icon, by the way.) I also added dependency search—not only can you query the definition of a fact, but you can go up the dependency chain by asking what facts depend on it. Try it yourself by cloning the repo and running the script (you will need its one interactive dependency installed). The error handling is janky but it’s pretty solid for 60 lines of bash I wrote in an afternoon. I use it almost daily. I’m not sure how many people used my script, but multiple other team members put together similarly quick, powerful debugging tools that became part of everyone’s workflow. All of these tools relied on being able to trivially parse the XML representation and work with it in the language that best suited the problem they were trying to solve, without touching the Fact Graph’s actual implementation in Scala. The lesson I took from this is that a universal data representation is worth its weight in gold. There are exactly two options in this category. In most cases you should choose JSON. If you need a DSL though, XML is by far the cheapest one, and the cost-efficiency of building on it will empower your team to spend their innovation budget elsewhere. Thanks to Chris Given and Deniz Akşimşek for their feedback on a draft of this blog. I had never heard of XPath before 2023, when Deniz figured out an XPath query that made my first htmx PR possible. Another reason to use XML is that humans who aren’t programmers can read it.
They usually don’t like it, but, if you did a good-enough job designing the schema, they can read it in a pinch. Do them a favor and build an alternative view, though. Because you’re using XML, this is pretty easy. It’s probably just because I’ve started to use it—buy a Jeep Grand Cherokee and suddenly the roadways seem full of them—but lately I have noticed an uptick in XML interest. Fellow Spring ’24 Recurser Jake Low recently wrote a tool that turns XML documents into a flat, line-oriented representation. Martijn Faassen has been working on a modern XPath and XSLT engine in Rust . I’m not sure it’s fair to call JSON “lobotomized” but I thought this article was largely correct about the problems XML can solve. The binary format is especially interesting to me.
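And here is the promised sketch of the fact-path search, using only Python's standard library instead of the shell pipeline described above. The file name and the Fact/path attribute layout are assumptions based on the facts discussed earlier, not the repository's actual structure:

```python
# The same kind of ad-hoc query as the shell tooling described above, using
# only the standard library. File name and attribute layout are assumptions
# for illustration.
import sys
import xml.etree.ElementTree as ET

tree = ET.parse("fact-dictionary.xml")
needle = sys.argv[1].lower() if len(sys.argv) > 1 else ""

# Print every fact path matching the search term.
for fact in tree.getroot().iter("Fact"):
    path = fact.get("path", "")
    if needle in path.lower():
        print(path)
```

Pipe the output into any interactive filter and you have the fuzzy search; that it takes a dozen lines in a second language is exactly the "universal tooling" point.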

Martin Fowler 1 months ago

Fragments: March 10

Tech firm fined $1.1m by California for selling high-school students’ data. I agree with Brian Marick’s response: No such story should be published without a comparison of the fine to the company’s previous year revenue and profits, or valuation of last funding round. (I could only find a valuation of $11.0M in 2017.) We desperately need corporations’ attitudes to shift from “lawbreaking is a low-risk cost of doing business; we get a net profit anyway” to “this could be a death sentence.” ❄                ❄                ❄                ❄                ❄ Charity Majors gave the closing keynote at SRECon last year, encouraging people to engage with generative AI: If I was giving the keynote at SRECon 2026, I would ditch the begrudging stance. I would start by acknowledging that AI is radically changing the way we build software. It’s here, it’s happening, and it is coming for us all. Her agenda this year would be to tell everyone that they mustn’t wait for the wave to crash on them, but to swim out to meet it. In particular, I appreciated her call to resist our confirmation bias: The best advice I can give anyone is: know your nature, and lean against it. ❄                ❄                ❄                ❄                ❄ In a LinkedIn comment on Kief Morris’s recent article on Humans and Agents in Software Loops , Renaud Wilsius may have coined another bit of terminology for the agent+programmer age: This completes the story of productivity, but it opens a new chapter on talent: The Apprentice Gap. If we move humans ‘on the loop’ too early in their careers, we risk a future where no one understands the ‘How’ deeply enough to build a robust harness. To manage the flywheel effectively, you still need the intuition that comes from having once been ‘in the loop.’ The next great challenge for CTOs isn’t just Harness Engineering, it’s ‘Experience Engineering’ for our junior developers in an agentic world. ❄                ❄                ❄                ❄                ❄ In hearing conversations about “the ralph loop”, I often hear it in the sense of just letting the agents loose to run on their own. So it’s interesting to read the originator of the ralph loop point out: It’s important to watch the loop as that is where your personal development and learning will come from. When you see a failure domain – put on your engineering hat and resolve the problem so it never happens again. In practice this means doing the loop manually via prompting or via automation with a pause that involves having to press CTRL+C to progress onto the next task. This is still ralphing as ralph is about getting the most out of how the underlying models work through context engineering and that pattern is GENERIC and can be used for ALL TASKS. At the Thoughtworks Future of Software Development Retreat we were very concerned about cognitive debt. Watching the loop during ralphing is a way to learn about what the agent is building, so that it can be directed effectively in the future. ❄                ❄                ❄                ❄                ❄ Anthropic recently published a page on how AI helps break the cost barrier to COBOL modernization . Using AI to help migrate COBOL systems isn’t a new idea to my colleagues, who shared their experiences using AI for this task over a year ago. While Anthropic’s article is correct about the value of AI, there’s more to the process than throwing some COBOL at an LLM.
The assumption that AI can simply translate COBOL into Java treats modernization as a syntactic exercise, as though a system were nothing more than its source code. That premise is flawed. A direct translation would, in the best-case scenario, faithfully reproduce existing architectural constraints, accumulated technical debt and outdated design decisions. It wouldn’t address weaknesses; it would restate them in a different language. In practice, modernization is rarely about preserving the past in a new syntax. It’s about aligning systems with current market demands, infrastructure paradigms, software supply chains and operating models. Even if AI were eventually capable of highly reliable code translation, blind conversion would risk recreating the same system with the same limitations, in another language, without a deliberate strategy for replacing or retiring its legacy ecosystem. ❄                ❄                ❄                ❄                ❄ Anders Hoff (inconvergent): an LLM is a compiler in the same way that a slot machine is an ATM ❄                ❄                ❄                ❄                ❄ One of the more interesting aspects of the network of people around Jeffrey Epstein is how many people from academia were connected. It’s understandable why: he had a lot of money to offer, and most academics are always looking for funding for their work. Most of the attention on Epstein’s network has focused on those who got involved with him, but I’m interested in those who kept their distance and why - so I enjoyed Jeffrey Mervis’s article in Science : Many of the scientists Epstein courted were already well-established and well-funded. So why didn’t they all just say no? Science talked with three who did just that. Here’s how Epstein approached them, and why they refused to have anything to do with him. I believe that keeping away from bad people makes life much more pleasant; if nothing else it reduces a lot of stress. So it’s good to understand how people make decisions on who to avoid. If you are a reflexive naysayer or a pessimist, know that, and force yourself to find a way into wonder, surprise and delight. If you are an optimist who gets very excited and tends to assume that everything will improve: know that, and force yourself to mind real cautionary tales.
