Latest Posts (20 found)

📝 2026-06-23 12:58: Create one of those Uses pages. Still a work in progress, but there's a good...

Create one of those Uses pages. Still a work in progress, but there's a good chunk of the stuff I use on there now. https://kevquirk.com/uses

0 views

Hijacking ELF entry points for NixOS compatibility or WTF is wrap-buddy?

We are part-way through TacoSprint 2026 and a project that has inspired me has been the long-standing pursuit of producing relocatable binaries in Nix. This is something I’ve been discussing since as early as 2022 . We’ve made pretty great headway! 🥳 I posted a proposal to the Linux kernel mailing list to add support for to , which will allow for resolving the interpreter relatively. I also submitted PR#534339 to nixpkgs which improves the generation and shrinking by modifying them to leverage as well. This needs no new Linux kernel support and will make Nix derivations a teeny bit more relocatable. Throughout this investigation, I was informed about similar efforts via wrap-buddy by the venerable Mic92 . I opened the GitHub project and I have to admit, I did not quite understand it. Jörg is an amazingly prolific and technical developer, and despite my knowledge of the space, it took me a while to understand the craziness beauty of what was being done. So, wtf is wrap-buddy ? Nix is all about explicit dependencies and it leverages this with techniques like on the ELF binary. This all works for newly minted code, but if you try to download any precompiled binary on your NixOS machine, you’ll hit an error for a myriad of reasons. One of the biggest being that the dynamic linker/interpreter, , does not exist on NixOS. We would love to compile everything from source, but the reality is that plenty of software people want to use is closed . In order to allow that to work on NixOS machines, derivations may patch the ELF files with patchelf setting things like and to Nix-friendly paths. In some rare cases, however, that doesn’t work. The documentation in claims: autoPatchelfHook can be error-prone and may break binaries that, have unusual ELF layouts. In these pathological cases, is an alternative that takes over the startup of the binary to modify it at runtime. 🤯 Let’s take a look with a small example. 
We can build a small C program linked against two shared libraries, and , forcing a non-NixOS interpreter path: If we run this binary, it fails immediately because doesn’t exist or it can’t resolve . Now we patch it using pointing to our library paths: Now if we run our binary, , we see that it works:
What did it do? 🤔 First off, it copies the first 416 bytes of our program code into a hidden file named . Let’s peek at the original binary and the instructions for : saves those starting 416 bytes to the hidden file . The configuration file format starts with a 22-byte header, followed by the interpreter string (83 bytes) and string (442 bytes), placing our saved original instructions at offset 547 ( ): Next, it clears our to so the Linux kernel thinks it’s a statically linked binary and boots it directly: Lastly, it overwrites our entrypoint with that small stub (416 bytes). We can see in the disassembly that immediately redirects and calls now:
Why all this complexity? What is doing? The goal of is to find a known custom loader, , which will help us finish all the dynamic linking. The custom loader gets even more nuanced and low-level. It would be a disservice to try and completely go over everything it does, and at this point the README does a fairly good job. At a high level:
It reads the saved original bytes from the file and copies the original bytes back over our stub in memory. To any observer, the binary is now completely clean and resembles the original.
It injects the custom by creating a brand new dynamic section in memory and populates it with the containing our library search paths that we stored in .
It loads the real NixOS interpreter into memory.
It rewrites the kernel’s stack metadata (auxiliary vector pointers like , , and ) to trick the native loader ( ) into believing it was loaded natively by the kernel.
Finally, it jumps to the entry point of the NixOS interpreter.
The NixOS dynamic linker takes over, uses the to resolve and . We can now run the application using the restored original entry point with everything resolved. Magic. Wizard. Mic92. 🧙
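The offset quoted for the saved instructions follows directly from the sizes given in the walkthrough: a 22-byte header, then the 83-byte interpreter string, then the 442-byte second string. The sketch below only checks that arithmetic. It assumes the regions are packed back to back with no padding, and the names `ConfigLayout`, `compute_layout`, and `search_paths` are hypothetical labels for illustration, not wrap-buddy's own identifiers; the header's internal fields are not modeled at all.

```python
from dataclasses import dataclass

# Header size as stated in the post; its field layout is not shown there.
HEADER_SIZE = 22


@dataclass(frozen=True)
class ConfigLayout:
    """Byte offsets of the variable-length regions that follow the header."""
    interp_offset: int        # start of the interpreter string
    search_paths_offset: int  # start of the second string (hypothetical name)
    saved_bytes_offset: int   # start of the saved original instructions


def compute_layout(interp_len: int, search_paths_len: int) -> ConfigLayout:
    """Place the regions back to back, assuming no padding between them."""
    interp_offset = HEADER_SIZE
    search_paths_offset = interp_offset + interp_len
    saved_bytes_offset = search_paths_offset + search_paths_len
    return ConfigLayout(interp_offset, search_paths_offset, saved_bytes_offset)


# Sizes taken from the example in the post.
layout = compute_layout(interp_len=83, search_paths_len=442)
print(layout.saved_bytes_offset)  # 547
```

With the example's numbers, 22 + 83 + 442 = 547, which matches the offset the post reports for the saved bytes.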

0 views

The Coming Loop

I don’t prompt Claude anymore. I have loops running that prompt Claude and figuring out what to do. My job is to write loops. — Boris Cherny Over the last months I have watched more and more people build something on top of coding agents that feels meaningfully different from just using a coding agent. Some of this happens on top of Pi which is cool to see for sure! The pattern is the same everywhere though: work is put into a queue of sorts, a machine picks it up, attempts it, stops, and then some harness decides whether that was actually the end. If not, the harness continues the same session, injects another message, starts a fresh session with modified context, or sends the task to another machine. The task stays alive beyond the point where the model by itself would normally have said: “I am done.” I think about that type of loop more than I want to admit. There is already an agent loop inside every coding agent. The model calls a tool, incorporates the result, calls another tool, reads a file, edits a file, runs tests, and eventually produces some answer. That loop is one we have been quite familiar with for a long time. The other loop is the harness level loop: the loop outside the agent loop. That loop is also not new . We have been doing versions of this since early Claude Code days, but that loop is becoming ever more present in agentic engineering and in recent weeks it has started to dominate the Twitter discourse. My current status is that I have not had much success with this way of working for code I deeply care about which turns out to be quite a lot of code. Part of that is taste and part of it is control. I attempt to set a high bar for what I want code to look like, and I want to understand the code I ship. Under pressure, or in a discussion with another human, I want to be able to explain what the system does without first having to ask a clanker to explain it to me. 
Now there is obviously a question if this desire to understand the code is one that I will still have a few years from now. For now I have not moved past the point of comprehension being important to me. Given this desire, there is something I lack with my experience of code written without me paying attention, particularly from loops. Present-day models tend to produce code that is too defensive, too complex, too local in its reasoning. They avoid strong invariants. They add fallbacks instead of making bad states impossible. They duplicate code, invent bad abstractions, and paper over unclear design with more machinery. Worse though: I so far see very little progress of this improving. If anything, on that front it feels to me that we might even be making steps in the wrong direction. At least for my taste, present-day hands-off harnesses like Claude Code with ultracode produce worse code than what we were producing last autumn. That’s because Claude Code, with Fable for instance will be working uninterrupted on a problem for thirty minutes or more, when previously the process would have been much more human in the loop. Furthermore it’s well understood that models tend to observe some local failure and add a local defense. Karpathy mentioned how they are “mortally terrified of exceptions”. In systems with important invariants, especially persisted data formats or core infrastructure, the right fix is not “handle every malformed case.” The right fix is to make the malformed case unrepresentable or impossible to write in the first place. Yet even with a lot of manual steering, that type of code does not come out of LLMs naturally, and even if the code comes out naturally like that, they will still attempt to handle now impossible errors. When you take that behavior and you put it behind loops, you tend to amplify it. If each iteration adds another small defense, the system slowly becomes less understandable while appearing more robust. 
The more hands-off you are, the more that happens. It also teaches really bad practices when tools like this are given to juniors without clear guidance. Because if you ask them why they are doing all that, they will convincingly argue their case. At the same time, it would be dishonest to pretend the loop pattern does not work because it already works astonishingly well in some domains. Porting code is one of them. There are already impressive examples of large automatic porting efforts, including the reported work around moving parts of Bun from Zig to Rust. I have used it with success myself to port MiniJinja to Go. Performance explorations are another case where this works beautifully. A machine can try experiments, benchmark them, discard failures, and keep searching. Security scanning fits naturally too and so does almost any type of research: asking a system to explore a complex problem space and report back without necessarily committing lasting code. One thing that many of these have in common is that they either do not generate new code, but transform code that already exists, or they produce code that intentionally does not have a long shelf life. They either produce proofs of concept or ideas, surface findings or are more akin to mechanical transformation. I believe that producing artifacts that do not need longevity, or performing some form of clearly verifiable mechanical translation, matters more than the general ability of a harness to mechanically measure a goal. Many successful applications of loops use another LLM as a judge or as an orchestrator. The mechanical translation case can be verified with a binary test case, but it can also be judged by an LLM instead! Claude Code, for instance, is increasingly good at creating entire experimental workflows that it will then execute.
Sure, the code it produces is slop, but that’s more the fault of the model than of the harness not being a good judge of whether a step in the workflow resulted in a net improvement or completion. The harness just needs some signal that lets it continue. It does not have to be objective or binary, it just has to be useful enough to drive another iteration. I absolutely love loops already that take the boring parts out of my day to experiment and measure and to give me ideas. On the other hand, using that same looping methodology to write lasting code does not yet sit well with me. The metaphor I like to reach for is one of moving from software as a deterministic machine to software as an organism. I became a software engineer in an environment that encouraged me to understand the machine. There was always a layer you could peel off to deepen your understanding. Machines that did not exhibit deterministic observable behavior were maybe accepted, but generally seen as not exactly optimal. Software architecture-wise, I saw it as desirable to push further towards more determinism rather than less. Likewise the ability to understand the code has been an undeniable goal. Although that was not always possible in practice, we still took pride in writing code so that it became possible even for new engineers to navigate complex code bases through clever architecture. On well designed systems there were always engineers that knew where the invariants lived, which parts were load-bearing and which changes were safe. Ideally all of that was also well documented. Where that understanding was lacking, it was generally regarded as something to improve upon. Obviously that ideal has always been strained. Many software systems, especially very successful ones, had periods where engineers on the team were able to keep them clean. Large software systems are not infrequently too big, too dynamic and too dependent on external services to fit into anyone’s head.
Even without LLMs we already diagnose distributed systems somewhat like doctors in that we observe symptoms, form hypotheses, “order more tests”, try some remedies, and observe again. Yet with LLMs we’re pushing much further in that direction and much quicker. We use them to write the code and we also use them for diagnosis and remedy. There are plenty of engineers that already live in a world in which the first step after the occurrence of a production issue is followed by having a clanker read logs, propose root causes and proactively put up a patch. The resulting patch is then often picked up by another machine that reviews, sometimes even landing it on main without any human supervision. Obviously that is powerful and I cannot deny that it sounds appealing. But giving in to that idea, particularly with less and less human oversight means accepting that we may no longer understand the whole system in the same way. We treat it, we monitor it, we stabilize it, but we do not necessarily comprehend it. I have no doubts that for some software, that is okay. Not every line of code deserves human authorship and worse code might have been written in the past. But do I want all software to be authored this way? What’s very uncomfortable is that opting out of this fully machine-driven future may not be an option. Security is the clearest example today. Even if you do not use loops to build your software, other people will use loops against your software. Attackers will run machines continuously and even if it’s not attackers, then security researchers will and some of that automated work will throw up dust but also find real issues. And both the signal and the noise will come your way at a volume that makes it almost impossible to deal with unless you yourself throw a machine at the problem. Daniel Stenberg’s post about curl’s summer of bliss is a good example of the pressure maintainers are already under. 
As far as I know, AI does not play a tremendous role in the core development of curl today. Yet despite all of this, maintainers are overwhelmed by reports, most of which are now AI-generated ones. If attackers and reporters loop, defenders will eventually need to loop too to keep up. Maybe not to write patches directly, maybe just to triage and reproduce and pressure will increase. The same is true competitively as some teams will out-build others through raw speed. Some projects will suddenly move faster because a tiny group figures out how to orchestrate machines effectively. Some startups will do with five people what used to require fifty. Some people might literally put a machine against your product in a loop and ask it to “make it like the other one.” And if their users are happy, does it really matter? Not all software will be equally affected. Some domains will punish sloppiness and demand trust and responsibility, but a lot of software lives in a world where raw speed, quick experimentation, and vast coverage matter enormously. The scariest part to me is that we become dependent on these new machines in new ways. Software has always depended on tools. I remember the time when I had to pay for compilers. These new tools are a flashback to times where creating software came with real costs. But now it’s no longer a one-time payment, it’s a constant dependency. Not just a dependency on a filled wallet, but also a cognitive dependency. If a codebase is produced by loops, reviewed by loops, patched by loops, and kept alive by loops, what happens when you no longer have access to the same class of systems? What happens when some trade restrictions take away access to the most powerful models? What if just the cost becomes unbearable? What if you and your team just lose the last remaining ability to understand the code without using the machine? 
We may create codebases that are not merely hard to maintain by humans, but that assume machine participation as part of their maintenance model. This is already happening! It’s not happening everywhere, and it might not even be happening in ways that are seen as problematic, but we see more and more of it. People more and more merge code they cannot fully explain. People lose their ability to create issue reports or discuss things in chat, without augmenting or rephrasing their messages with the context provided by a clanker. Too many people increasingly rely on a machine to summarize or contextualize it. More and more do I encounter people who converse with me through the indirection of an LLM. Again, maybe that is not even going to be wrong, but it’s a massive change to how we did things. I have little doubt that this is where things are going but going there will require us to do something about our tooling everywhere, and not just in the coding agents. Just orchestrating more loops won’t be enough. Better visualizations of changes or orchestration or agents will not restore our understanding. Either we need to find clever ways to jolt the human back into the loop and make the changes of the loops legible long term, or we need to find better ways to compose these ever more complex systems. This is also where my thinking about the role of Pi is changing. Pi has been cautious, and I think that caution is good. I do not want a future where every interaction turns into an uncontrolled swarm of machines making changes I cannot follow. I would not want Pi to become an unmaintainable mess in an effort to win the race towards software that writes itself and I would not want Pi to promote this type of engineering either. At the same time Pi is a harness and harnesses are at the center of people running these new types of experiments. Task queues for coding tasks, orchestration of agents, subagents, durable sessions will matter more and more. 
Even those of us who have our reservations and are not blindly embracing loops will have to start doing those experiments. We need to, because we need to understand how to make this future bounded and survivable. As you can read from this post, I’m very uneasy about this future. Not because of fear, but because of caution given experiences with this technology so far. Adopting the idea of harness loops means that the harness decides when work is finished. In the agent loop, the model eventually says “done” and I review. Even before that, I usually steer along the way. I am involved and I enjoy learning along the way. In the harness-operated loop I’m not sure what my role even is. Even the “done” signal loses all meaning and is merely communicated to yet another machine that judges. My role is reduced to that of a messenger. Today I do not like much of the code that I see from systems built that way and neither do I enjoy interacting with too much of the software built with AI assistance. Looping is powerful but it removes responsibility more and more, and it at least today very much encourages us to give in to the machine. And yet I have no doubts that this looping future is going to be our future despite the fact that I presently resent it. I already see astonishingly small teams building at impossible speed and I see codebases turning more and more into obscure and confusing organisms that can only be diagnosed by more machines. Those codebases are simultaneously useful and messy. So I guess I’m coming to terms with the fact that the question is not whether we will loop, because clearly we will. Maybe the question is how, in a future of loops, we avoid abdicating judgment, how we can retain rules of good engineering, how we can ensure that responsible humans can continue to supervise, and how we need to rethink the way we architect code to retain sanity along the way.

0 views

Porting the Moebius 0.2B image inpainting model to run in the browser with Claude Code

This morning on Hacker News I saw Moebius: 0.2B Lightweight Image Inpainting Framework with 10B-Level Performance , describing a small but effective inpainting model - a model where you can mark regions of an image to remove and the model imagines what should fill the space. The released model required PyTorch and NVIDIA CUDA , but since it described itself as 0.2B I decided to try and get it running using WebGPU in a browser. TL;DR: I got it working, and you can try the demo at simonw.github.io/moebius-web/ . Read on for the details. Here's a video demo of the finished tool: You can open any image in it (non-square images get letterboxed), highlight areas to remove, click the "Run inpaint" button and wait for the model to do its magic. My main project for today was landing a major feature in Datasette: a UI for creating and altering tables, as a follow-up to the insert and edit rows feature I released last week. I was working on that in Codex Desktop (here's the PR ) and often found myself spending 5-10 minutes spinning my fingers waiting for it to complete a mid-sized refactor or add the finishing touches to a change to the UI. (An amusing thing about coding agents is that the harder a problem is the more time you have to get distracted while you wait for them to finish crunching!) So I decided to spin up Claude Code in a terminal window and see how far I could get at porting Moebius to the web. My first step was to ask regular Claude about the feasibility of this project. In Claude.ai , which has the ability to clone repos from GitHub: (I hadn't spotted the link to the weights yet, that's tucked away in the "News" section.) I like telling models to "muse on X", it's the shortest way I've found of expressing that I want them to contemplate a problem for me without providing them with a concrete goal. Here's that chat transcript . I copied out the last answer and saved it as research.md for Claude Code to read later. 
Claude suggested using ONNX Runtime Web on the WebGPU backend - the layer below the Transformers.js library I had suggested. That was enough to convince me it was worth setting Claude Code loose and seeing how far it could get. I usually start projects like this by gathering as much information as the coding agent might need as possible. Since I didn't expect this project to actually work I did everything in my folder: I created a directory for the rest of the project and ran in that so Claude could start committing code notes: I fired up a instance in the folder, the level above all of the research materials I had prepared for it. I prompted: As it started to work I dropped in this follow-up (typos included): I often ask agents to keep notes like this - the end result is often interesting, both for myself and for the next agent session that touches the same project. Here's what that notes.md file looked like at the end of the project. I kicked it off and went back to my main project, checking in occasionally to see how Claude was doing. When it looked like it might have something that worked I prompted: Then I tried it out in Chrome and pasted some errors (and screenshots of errors) back into Claude Code. After a few rounds of this we had something that appeared to work! Time to put it on the internet so other people could use it. Claude Code knows how to use the CLI tool, so I created a model repo on Hugging Face , then created a token that could write to that repo and dropped it into a file so Claude could use it. It published the 1.24GB of converted ONNX weights to huggingface.co/simonw/Moebius-ONNX for me. I'd seen other demos load weights into the browser from Hugging Face before, so I knew it was possible. I decided to host my own frontend code on GitHub Pages, so I said: Telling it the final URL was important in case it needed to fix the URLs in the demos that it was building so they would work when deployed to production. 
After a few more rounds of iteration, in between working on my main project, we got to a working, deployed version! Except... each time I reloaded the page it seemed to download ~1.3GB of model weights. Browser caching seemed pretty important for this! I knew that Transformers.js projects could handle this properly, so I grabbed a copy of the Whisper Web demo, dropped it into and said: That project was entirely obfuscated, built JavaScript files so I figured using a subagent would avoid spending the rest of my top-level token context deciphering those files. Claude figured out that it was using - the CacheStorage API - and added that to our project. I've shared the full Claude Code transcript for this project (published using my claude-code-transcripts tool). This definitely counts as vibe coding: I didn't look at a single line of code from the project, restricting my input to testing, suggesting small feature improvements (like a progress bar for the large file downloads) and pointing the model in the direction of examples of how I wanted things to work. Since I didn't write any code the amount I learned about the underlying technologies - WebGPU, ONNX, and the Moebius model itself - was very limited. As is usually the case with this kind of project the most important things I learned concerned what was possible:
Claude Opus 4.8 is capable of converting a PyTorch model to ONNX, publishing the result to Hugging Face and then building out a web application and interface that can load and execute that model.
Chrome, Firefox and Safari are all now capable of running this kind of model - I tried it in all three.
The CacheStorage API works with ~1.3GB model files.
... which means we can have inpainting as a feature of a client-only web application! (If our users can tolerate the 1.3GB download.)
I felt like I should probably try and learn a little more about my project. I fired up Claude.ai and prompted: Here's the transcript and the understanding.md Markdown file it created, which I've now added to the GitHub repo. I found the explanation of ONNX particularly enlightening: ONNX (Open Neural Network Exchange) is a portable, framework-neutral file format for neural networks. An file is essentially two things bundled together:
A computation graph — a directed graph of nodes, where each node is an operator ( , , , , , , , …) wired together by named tensors flowing between them. This is the "recipe" for the forward pass.
The weights — the learned parameter tensors (the convolution kernels, the embedding table, etc.), stored as initializers in that same graph.
Crucially, ONNX describes what to compute, abstractly, without saying how or on what hardware. The operator set is versioned by an opset number (this repo uses opset 18), which pins down exactly which operators exist and what their semantics are. It turns out PyTorch has built in mechanisms for exporting to ONNX, as seen here in export_onnx.py : Claude also included a handy glossary and an only-slightly-broken ASCII-art diagram showing how the model pipeline fits together.
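To make the "computation graph plus weights" description of ONNX a little more concrete, here is a toy sketch in plain Python. It is not the ONNX file format or any ONNX library API: `Node`, `OPERATORS`, and `run_graph` are hypothetical names invented for illustration, and the three operators (named after the real ONNX operators Add, Mul and Relu) are implemented naively on lists of floats. The point it tries to show is the separation the explanation describes: the graph names which operators to apply to which tensors, the weights are just named tensors stored alongside it, and whatever executes the graph decides how.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Tensor = List[float]

# Toy "operator set": operator name -> one possible implementation.
# A real runtime would supply its own (CPU, WebGPU, ...) versions.
OPERATORS: Dict[str, Callable[..., Tensor]] = {
    "Add": lambda a, b: [x + y for x, y in zip(a, b)],
    "Mul": lambda a, b: [x * y for x, y in zip(a, b)],
    "Relu": lambda a: [max(0.0, x) for x in a],
}


@dataclass
class Node:
    op: str            # operator name, looked up in OPERATORS
    inputs: List[str]  # names of the tensors this node reads
    output: str        # name of the tensor this node produces


def run_graph(nodes: List[Node], initializers: Dict[str, Tensor],
              feeds: Dict[str, Tensor]) -> Dict[str, Tensor]:
    """Evaluate nodes in order; assumes they are already topologically sorted."""
    tensors = {**initializers, **feeds}
    for node in nodes:
        args = [tensors[name] for name in node.inputs]
        tensors[node.output] = OPERATORS[node.op](*args)
    return tensors


# The "recipe": y = Relu(x * w + b), expressed only as named nodes.
graph = [
    Node("Mul", ["x", "w"], "scaled"),
    Node("Add", ["scaled", "b"], "shifted"),
    Node("Relu", ["shifted"], "y"),
]
# The "weights": learned tensors stored by name next to the graph.
weights = {"w": [2.0, 2.0, 2.0], "b": [1.0, -5.0, 0.0]}

result = run_graph(graph, weights, {"x": [1.0, 2.0, -3.0]})
print(result["y"])  # [3.0, 0.0, 0.0]
```

Swapping the lambdas in `OPERATORS` for a different backend would leave `graph` and `weights` untouched, which is the portability property the quoted explanation highlights.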

0 views

Consistency, But in Excellence Not Appearance

Consistency serves a purpose in visual design, but it seems to have become the purpose of a lot of visual design. Look no further than these evolutions of macOS icons ( image courtesy of BasicAppleGuy ): The Creator Studio icons are undeniably consistent visually: rounded rectangles, controlled gradients, simplified forms, restrained depth, etc. In contrast (and by modern standards) the originals seem heretically inconsistent. They lack coherence in visual details like shape, material, and lighting. But what they lack in visual consistency between one another, they make up for in excellence individually. In fact, their aversion to familial visual consistency almost seems like an intentional choice — a deliberate augmentation of individual purpose. What purpose? To be singularly representative and deeply iconic. Icons that are iconic . To be iconic, by definition, is to be famously distinctive. None of the Creator Studio icons, especially when held up as a suite, are iconic. None are atypical, they’re merely typical. All in pursuit of what, consistency — amongst each other and across platforms — as the overriding goal? This over-emphasis on “systems” design seems endemic to modern software. Systems prescribe rules because they are the easiest attributes to document, enforce, and automate — “All icons must use this shape, this lighting, this stroke.” Excellence, by contrast, is harder to systematize. It requires judgment, taste, care, experience, and a sensitivity to context — all in service of meaning and purpose, not superficial similarity. When you strive for consistency across a suite, individual elements lose their ability to be exceptional and iconic on their own terms. Consistency for the group becomes a ceiling on individual excellence. But if you flip that, if you make excellence the goal for each individual element, something interesting can happen: excellence becomes your motif of consistency. 
It’s no longer a consistency of shapes and gradients, but one of quality and intention that serves a deeper meaning and purpose than superficial visuals. Give me a consistency of excellence any day over a consistency of appearance. Reply via: Email · Mastodon · Bluesky

0 views
Unsung Yesterday

“Ketchup is next, which is similar in construction to the mustard.”

Since we’re talking about pixel art, in this 30-minute video, Stuart Brown (known as Ahoy) embarks on recreating an illustration called Four-Byte Burger. The original picture was created by artist Jack Haeger on the influential Commodore Amiga computer in 1985, on prototype software that “was in such an early stage of development that it lacked a save feature, entirely.” Proper to-disk screenshotting didn’t come to computers until the 1990s, so the only reproduction of the picture was a photograph taken off of the display and reproduced in print in a manual for the graphics software; the original image pixels evaporated when the computer was eventually turned off. Brown recreates the image using more modern means (Photoshop), but eventually goes back to an Amiga to try to display it as close to the original as possible. It’s a soothing watch, and there are some fun moments in the video, like rotating the CRT to “portrait mode” – in a world populated by smartphones, in some sense the image aspect ratio seems oddly prescient. (Also, if you ever find yourself having to rotate a CRT, you can just degauss it instead of waiting all night. Degaussing a monitor is one of the forgotten weird tactile pleasures bordering on dark magic, and if you’re ever near an old CRT, ask someone to show you.)
#art #emulation #history #youtube

1 views
Unsung Yesterday

“Playing through it felt like reading a love letter.”

The videogame MainFrames was released on Steam and Nintendo Switch in 2025 to positive reviews: MainFrames invites you to meet Floppy and to browse a clever and charming platformer that plays out entirely within the windows and desktop of a PC monitor. You won’t want to press the escape key on this cozy outing! Recently, I stumbled upon Alexis Morille, an artist who worked on the game, sharing a few visuals and animations on Bluesky. Here’s what really happens under the hood when you resize the window: And here are the other “UI daemons” helping you scroll the contents: I believe the word gremlins, before being usurped by the 1984 horror comedy, was generally used to denote little mischievous creatures that live inside machinery and cause trouble. I wonder what the word would be for the little creatures that do all the hard work. I haven’t tried the game yet, but I found these to be delightful. #art #games

0 views

Can We Agree on a Storage/Workload Architecture Taxonomy?

The lines between transactional systems, analytical systems, hybrid systems, and shared storage architectures are getting blurry. This post proposes a small taxonomy for describing the different ways systems, workloads, storage tiers, visibility, and durable copies relate to each other. OLTP, OLAP, HTAP, and now LTAP? We can think of the first two as two types of workload which have specialized query engines and storage systems to support them. OLTP systems, such as RDBMSs like Postgres and MySQL, use row-based storage engines. OLAP systems, such as ClickHouse, cloud data warehouses and the lakehouse, use column-based storage. HTAP is a hybrid workload system: one system -> both transactional and analytical workloads. The HTAP system therefore has specialized storage and a specialized query engine to stitch together the row-based and columnar data. So far, we’re dealing with a single system. A Postgres (OLTP), a ClickHouse (OLAP), a SingleStore or TiDB (HTAP). So what is the recent Databricks LTAP announcement? LTAP is the two workloads (OLTP and OLAP) but also two systems (e.g. Postgres and lakehouse/Spark) and some blend of two different storage systems. As well as single vs multi-system and single vs multi-workload, there are other relevant concepts such as tiering and materialization. Tiering: A single system can tier (move) data from hot to cold storage (for cost efficiency). One system, one copy, two tiers. Hot and cold might be the same storage format (both row-based or both columnar), or might be different formats (hot is row-based, cold is columnar). We can also have two systems share the same storage tier. System A tiers (moves) hot data to the storage of System B. Two systems, one copy, though System B doesn’t yet see the newest data, which only exists on A. Materializing: One system can materialize (copy) data into another system. Two systems, two copies. Note that when I say “copy of the data”, I mean a durable copy, so caching doesn’t count.
If the number of copies really matters to you as a metric, then maybe caching does count, depending on how much cached data you need to make it work? If only life were simpler. It would be nice to have some shared vocabulary around this, so we can talk about system architecture more easily. So I defined some terms last year for this, and expanded it as seen below. Vis means Visibility (when is data available in the other workload). The broad classification scheme: Single tier, one system, one workload. Example: Postgres with SSD, single tier CockroachDB, standard Kafka cluster. Internal Tiering, one system, one workload, commonly tiers from hot to cold storage for cost efficiency, e.g. hot=SSD, cold=S3. Though tiering could also serve other purposes than cost. Example: Apache Kafka tiered storage, ClickHouse MergeTree tiered storage. Hybrid-Sync (aka HTAP), one system, two workloads, two or more storage with potentially different formats/tiers, e.g. hot row-based data on SSD, long-term columnar data on S3. Data is immediately available to both workloads (e.g. OLTP queries and OLAP queries). Example: SingleStore and TiDB (Pingcap). Hybrid-Async , one system, two workloads. Like Hybrid-Sync except hot row-based data is asynchronously tiered to long-term columnar format. OLAP queries do not see the very newest data. Example: Snowflake Hybrid tables. Materializing , two workloads, two systems, two copies. System A copies data to System B. Each system is dedicated to one workload, with specialized query engine and storage. Example: ETL in general, many Kafka-compatible services have automatic Iceberg materialization of topics e.g. Confluent Tableflow, Databricks Synced tables asynchronously materialize from lakehouse to lakebase (Postgres). Shared Tiering , two workloads, two systems. one copy across hot tier + shared colder tier (e.g. hot row-based data on SSD for System A, colder columnar data on S3 for System A + B). 
Example: Apache Fluss tiers hot data (Fluss servers) to lakehouse (lakehouse is a shared tier), LTAP. Potentially, a 7th and 8th category could hypothetically exist: Shared-Sync-RR and Shared-Sync-MM. Two systems, two workloads, one synchronous storage (each write is immediately visible in the other system). The read-replica (RR) variant has one master system and one read-only system (e.g. writes to Postgres are immediately visible for reads in lakehouse). Multi-master (MM) allows both systems to write (hard!!). At the time of writing the details on LTAP are scarce, but it seems like LTAP will fall into Shared Tiering. The thing that differentiates HTAP from LTAP is that HTAP is a single hybrid system which makes data visible to both transactional and analytical queries at the same time. LTAP is a way of unifying the data of two different systems (each targeting a different workload) and sharing the colder data such that there is no (durable) data copy required. It is fundamentally asynchronous: the hottest data is only in System A and the remaining colder data is stored in System B but made available to System A (as its cold tier). Of course, LTAP could potentially move towards the hypothetical category Shared-Sync-RR, given both systems exist in the same platform. Then it gets murky again: because it’s one platform, it’s veering towards HTAP (Hybrid-Sync). One thing that the marketing material of unified OLTP-OLAP systems commonly glosses over is the different data models used in each, such as Third Normal Form (3NF) common in OLTP and Kimball (star and snowflake schema) common in analytics. This adds another dimension, on top of query engine, storage layout and storage substrate. If you want 3NF for OLTP and Kimball for analytics, then it’s probably going to be Materialization (as star schema is not viable as a cold tier for 3NF). What do you think of this broad classification scheme?
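To make the six categories easier to compare, here is a minimal Python sketch that records them as plain data. The `Category` class, its field names, and the `matching` helper are my own hypothetical names, not part of the post. Copy counts and visibility are filled in only where the text states them and are left as `None` otherwise.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Category:
    name: str
    systems: int
    workloads: int
    copies: Optional[int]      # durable copies; None where the post does not say
    visibility: Optional[str]  # "sync" or "async"; None where the post does not say


# The six categories of the classification scheme, as described in the text.
TAXONOMY = [
    Category("Single tier", 1, 1, None, None),
    Category("Internal Tiering", 1, 1, 1, None),
    Category("Hybrid-Sync", 1, 2, None, "sync"),
    Category("Hybrid-Async", 1, 2, None, "async"),
    Category("Materializing", 2, 2, 2, None),
    Category("Shared Tiering", 2, 2, 1, "async"),
]


def matching(systems: int, workloads: int) -> List[str]:
    """Return the names of all categories with this system/workload count."""
    return [
        c.name
        for c in TAXONOMY
        if c.systems == systems and c.workloads == workloads
    ]


print(matching(2, 2))  # the two-system, two-workload categories
print(matching(1, 2))  # the hybrid (single-system, two-workload) categories
```

Laid out this way, Materializing and Shared Tiering share the same system and workload counts and differ only in the number of durable copies, which is the distinction the post draws between them.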
Find me on social media :) ps, some thoughts on data copies… With Shared Tiering, you can think of the data-copy question as a dial: Dialing it to no-copies-at-all means evicting data as soon as it has been tiered. Lower storage cost, but maybe it would be good to hang on to the hot data a little longer for performance. Dialing it to lots-of-data-overlap means aggressively tiering to System B but hanging on to the data in System A for the better performance profile, at the additional storage cost. And technically it would now count as cached data which might not count as a data copy, depending on how you define that. However, the data-copy question is also murky with Materialization. Because we have two (or more) independent systems, each can potentially use independent data expiration policies. For example, Kafka might store 7 days of data, but the lakehouse might store 7 years. In that case, while theoretically it is a two-copy system, the total duplication would only be about 0.27% (7 days out of roughly 2,555). I generally dislike the whole “zero-copy” or “one-copy” thing; it’s too much marketing. Focusing on how many copies you have is just weird as a primary design point when you’re building data systems; the real world is more nuanced.
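The duplication figure in the Kafka/lakehouse retention example can be checked with a few lines of Python. This is a rough sketch under two assumptions of mine: data arrives at a steady rate, and a year has 365 days. The function name `overlap_fraction` is hypothetical. With 7 days against 7 years the ratio is 1/365, which is about 0.0027 as a fraction, or about 0.27% as a percentage.

```python
def overlap_fraction(short_retention_days: float, long_retention_days: float) -> float:
    """Share of the long-retention dataset that is also held in the
    short-retention system, assuming a steady ingest rate."""
    if long_retention_days <= 0:
        raise ValueError("long_retention_days must be positive")
    return min(short_retention_days, long_retention_days) / long_retention_days


kafka_days = 7            # retention in the source system
lakehouse_days = 7 * 365  # retention in the materialized copy (365-day years assumed)

fraction = overlap_fraction(kafka_days, lakehouse_days)
print(f"fraction: {fraction:.4f}, percentage: {fraction:.2%}")
```

The same helper works for the Shared Tiering "dial": a short overlap window relative to total retention keeps the duplicated share small.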

0 views
Brain Baking Yesterday

Week of the Eclair

Last week I arrived home with a delicious pear frangipane pie from the local bakery. The contents of the cardboard pie box didn’t last long, but on top of it, in a corner, a pink round sticker caught my attention: it read WEEK VAN DE ECLAIR (8th Edition). Today marks the last day of that special week so perhaps it’s not too late to rush off to fetch some after finishing this article, even though I’m not particularly fond of them. You see, exactly ten years ago, during this period my friend & I had to prepare and defend our bakery skills in front of a jury. After three years of éclair after éclair after éclair, you get sick at just the sight of them. The Belgian patisserie scene is just like most other Belgian gastronomic affairs: it’s mainly variations of French cuisine—hence the éclair as the pinnacle of an indulgent Sunday afternoon snack. During our training, the cooking of the pudding and the preparation of the choux pastry dough was perfected by repetition, until we could do it with our eyes closed. Most of these Fridays I would return home with a painful tummy. How else do you judge the readiness of the pudding than by tasting too much of it? Hence my growing disgust of the éclair. If you don’t know what a choux pastry is, just think huge profiterole tower—the one where you pour a ton of hot chocolate over. These round doughy balls, risen by employing the steam of the wet pastry, are usually filled with either whipped cream or custard/pudding. An éclair is a longer pudding-filled profiterole. One of our assignments was the creation of such a tower, yet now that I look back at the photos we took of our table displaying the baked goods, somehow the profiteroles disappeared? Perhaps in the jury’s belly? I can’t remember. Our bread and pastry baked goods table, presented for the jury as part of our degree as professional bakers. What I can remember is that we didn’t like the traditional Flemish/Belgian take on most of these pastries. 
My friend and I visited Paris to do a marathon bakery run and the result was no ordinary pudding/custard. Instead, we infused the boiling milk with a good dose of chai tea. Most other pastries got the “twist treatment” as well. For example, the frangipane dough we were taught is awfully dry. We Belgians are used to that kind of filling but I didn’t like it. I found recipes that combine half frangipane and half pudding/custard to create a smoother texture (the little tartlets on the left part of the picture with pear slices in them). The result? Our jury was displeased: the filling was “too wet”. Sure, if you compare it to the junk typical Belgian bakeries sell you, then yes. Needless to say, the training encouraged you to stay in line and not experiment too much. The black gooey swirls are supposed to be what we call brioches made with buttery yeasted dough. That’s a bit of a deception though: a French brioche is something else entirely , and most swirls you’ll encounter in a French bakery are made with laminated dough instead. Historically speaking, both brioches should share the same yeasted basis, but ours is much less enriched with butter/eggs. Normally, we would spread out the dough, slather it with more pudding and raisins, roll them up and proceed to bake after a quick second rise. We couldn’t stand even more pudding so we turned to David Lebovitz’ chocolate babka whilst keeping the form of a Flemish brioche . Want more pudding? You can’t handle the pudding! I know I can’t. The round sugar powdered things on the far left are boule de Berlins or Berliners. They are traditionally made by deep frying like a doughnut but we made them from sandwich dough instead to keep them light (and digestible). The baked and cooled round shapes are then cut in half, after which… one proceeds by filling them with a thick layer of pudding with the help of a piping bag. No jam-filled Berliner to be found here: pudding pudding pudding. 
I think I threw up once after getting home from one of the many pudding-filled lessons. As you can see, the end exam consists of a very evenly distributed assignment of 90% pastry and 10% bread. We did make some of what we call pistolets, a typical Belgian round bread roll with a crispy crust (due to plenty of steam injected during baking) and soft inside. These are heavily yeasted rolls. I had to sneak in some kind of sourdough somewhere so the salt-and-rosemary sprinkled focaccia was, to me, the tastiest thing we produced. Of course, the jury disagreed. We got our diplomas anyway but I wouldn’t dare to dream of opening a dreadfully boring average Belgian bakery where the yeast, questionable bright yellow looking margarines called “superior butter”, and pudding flow and flow. It turns out that Week of the Eclair is organised by Puratos, a big industrial supplier of raw bakery materials. I wonder if all participating bakeries are Puratos clients as well? I know our local bakery, where I got the sticker from, is, as I regularly spot the Puratos truck restocking them. The initiative is a Belgian one; I doubt that French people really need such a week to promote the éclair. Just walk into a random bakery in downtown Paris and you’ll be mesmerised by the unique shapes, forms, colours, and tastes of their éclair offer. Meanwhile, here in Belgium, most bakers offer just one: the classic chocolate top-coated one with vanilla pudding filling. There goes my stomach again… Now if you’ll excuse me, I’ll be off to get my éclair fix. Related topics: / patisserie / By Wouter Groeneveld on 22 June 2026.

0 views
マリウス Yesterday

I Do Not Recommend Google Hardware

I’ve been a GrapheneOS user for years now. Back in 2022 I switched away from /e/OS on a Samsung Galaxy S10 to a Google Pixel 6a that I had bought, because at the time it happened to be one of the cheapest devices on the short list of officially supported Pixels . However, my history with Google phones goes way past the 6a and ever since I got my first Nexus , every single piece of Google (branded and manufactured) hardware that has passed through my hands has eventually broken on a hardware level, way quicker than expected. At this point I have run out of patience with Google ’s consumer electronics and have decided to stop giving the company any more of my money. This post is part personal post-mortem, part survey of the wider Pixel landscape, and part forward-looking note on what I’m going to do instead. Disclosure: The opinions in here are entirely my own, formed from years of using Google hardware as a paying customer. To be very clear up-front, I have never been a fan of Google as a company, and I have certainly never been a fan of their hardware design language. I normally do not run Google ’s software on any of my devices , I avoid Google services , and I would prefer not to give the company a single cent. The only reason I have nevertheless ended up with a stack of Pixel devices on my desk is GrapheneOS . Graphene , to this day, requires Pixel hardware because Google ’s phones are essentially the only consumer Android devices that ship with a verified-boot chain, a relockable bootloader after flashing, and a security coprocessor ( Titan M2 ) that the project considers sufficient for its threat model. There is no other Android manufacturer in this market that offers a comparable hardware security surface for an alternative OS. So if you want the strongest privacy- and security-hardened Android, you buy a Pixel . That’s literally the only reason. In my original write-up of the switch to GrapheneOS I went into the why in much more detail. 
The short version is that, I no longer trusted any stock smartphone OS, and after years of bouncing between CyanogenMod , LineageOS , and /e/OS , GrapheneOS was the first ROM that felt like actual engineering rather than a community paint-job over a vendor blob. In my follow-up post about the Pixel 8 I went so far as to call the Pixel 8 “a solid piece of hardware, if you happen to find a fully functional device” . In hindsight, I have to admit that I was wrong. Let me start with the actual Google devices that I have owned, in chronological order. The Nexus 5 was the first Google -branded phone I bought, back when the device was still being manufactured by LG and GrapheneOS was not yet a thing. I ran it for a while on Google ’s stock Android and, after the initial honeymoon period, switched it over to CyanogenMod , the project that, years later , would be reborn as LineageOS . For its first year or so, the Nexus 5 was actually a likeable phone, as it was compact, light, with a clean software experience that, at the time, felt refreshing compared to the bloated OEM skins on competing Android devices. Then the hardware started giving up. The battery, which had been mediocre to begin with, became unreliable and the phone would report 40% charge one moment and shut off entirely the next, and over time it began to randomly reboot and power off without any obvious trigger. The decline was not gradual either and once the battery started misbehaving, the device was effectively unusable within a matter of weeks. Combined with a charging port that became increasingly finicky about which cables it would accept, the phone went from likeable to unusable in well under two years of moderate use. The Nexus 6 , which, ironically given where this post is heading, was actually built by Motorola rather than by Google itself, replaced the Nexus 5 once the latter had given up on life. 
As with its predecessor, GrapheneOS was still years away, so I alternated between Google ’s stock firmware, CyanogenMod , and eventually LineageOS over the course of owning it. What made the Nexus 6 particularly memorable was the way in which its internals seemed to fail one component at a time , almost like a series of unfortunate but separate events. First, the microphone began cutting out during calls, with the other end of the line hearing nothing or only a faint, crackling signal. Then the loudspeaker and earpiece started developing distortion, eventually to the point where music and call audio were barely intelligible. Finally, true to the pattern that would later repeat on every subsequent Google / Pixel device I owned, the battery rapidly lost capacity and started misbehaving, with the phone shutting off at high reported charge levels and refusing to hold a charge during light use. All of this happened within the first few years of ownership, well before any reasonable expectation of obsolescence. In retrospect, the Nexus 6 also gave me my first real taste of what Motorola hardware can feel like. It’s worth keeping that in mind for the later section on Motorola ’s planned GrapheneOS -compatible devices . The Pixel 2 XL was my first phone branded purely as a Google device, with all the responsibility for design, hardware integration, and support sitting with Google itself. GrapheneOS still didn’t exist as it does today (the project’s early predecessor, CopperheadOS , was in the middle of its very public implosion right around this time), so the device once again spent its life running Google ’s stock firmware as well as LineageOS . The Pixel 2 XL disappointed me from essentially day one, and only got worse from there. 
The two main themes were performance, which, even fresh out of the box, felt sluggish for a flagship that was supposed to be competing with the Galaxy S8 and the iPhone 8, as well as battery, which, as with every Google device before, deteriorated rapidly. On performance, the Pixel 2 XL was particularly bad, with animations stuttering, app launches being inconsistent, and the whole experience feeling half a generation behind what Samsung and Apple were shipping that year. As the device aged, this only got worse. Within the first year I was already noticing significant drops in standby and active runtime, and by the second year I was forced to carry a power bank everywhere and even basic tasks like opening the camera app or switching between recent apps became noticeably slow. In addition, the Pixel 2 XL shipped with a notoriously bad display that suffered from blue-tint shifting, screen burn-in within months of light use, and uneven color rendering, all of which were defects that Google, in classic form, partially acknowledged with software workarounds rather than hardware replacements for most affected owners. The Pixel 2 XL was the phone that, at the time, made me seriously question whether I wanted to keep buying Google hardware at all. The answer, sadly, turned out to be yes, but only because of the eventual emergence of GrapheneOS and the absence of viable alternatives on comparable hardware. The Pixel 6a was purchased on sale for $299 in late 2022. It came with the Tensor G1, served as my primary GrapheneOS device for roughly a year and a half, and was eventually relegated to “spyware phone” duty after I upgraded to the Pixel 8. As with every Google phone prior, the Pixel 6a battery life declined noticeably and the device eventually became part of Google’s Battery Performance Program, which, depending on how you look at it, was either a voluntary repair offer or an opaque battery nerf forced on owners via a mandatory update.
In addition, the charging port developed an unstable connection, which made charging frustrating and unreliable. After roughly two years of daily use, the device became unusable enough for me to downgrade it to a backup device, only to finally toss it after just two more years. Note: Google’s entire A-series has a documented track record of battery problems. The Pixel 4a has been the subject of a UK Office for Product Safety and Standards alert for overheating and fire risk, the Pixel 6a has been the subject of multiple melted-device reports and was pulled from Google’s refurbished store after fire incidents, and the Pixel 7a has had its own battery swelling repair program. Google has not initiated a proper recall in any of these cases. The Pixel 8 replaced the Pixel 6a in mid-2024 after I came across an unusually good tax-refunded deal. I have been running GrapheneOS on it from day one. Within less than two years of moderate, careful use, this phone developed the now-infamous Pixel 8 green-screen-of-death, a display defect that causes the screen to glitch with vertical green lines and flicker until you physically squeeze the lower part of the chassis. Google has, in a rare admission, extended the warranty on Pixel 8 displays to three years specifically because of how widespread this defect is, while pointedly not extending it to the Pixel 8 Pro despite reports of the same problem on that model. However, because my device suffered a drop that shattered its back glass and scratched open the adjacent corner, Google will blame the screen issue (in my case) on the impact and won’t grant me a free repair. It’s also important to note that the lower portion of the device gets noticeably and uncomfortably hot under normal load, which is a known issue with the Tensor G3 SoC and its Samsung Exynos 5300 modem. In addition, the Pixel 8 suffers from the family’s connectivity issues, which had already plagued the Pixel 7 series.
When I sat down to research a possible replacement (a Pixel 9 or Pixel 10 ), the picture only got bleaker, but more on this in a moment. Note: Probably the most maddening pet peeve that I have with the Pixel 8 is its slippery surface. It is the only phone that I ever had that, no matter on what surface I put it, will eventually slide down without me interacting with it. Without stickers or protectors on its back the phone is so slippery that it will glide away from virtually any surface material. Put the Pixel 8 on a smooth wooden table and it will move by itself over time. Put it on a rough wooden speaker box and it will fall over the edge halfway through the first song that’s playing. Put it on top of another smartphone and it will fall off sideways. Whenever I hear a hollow knock I already know that it was the Pixel 8 randomly falling off of whatever surface I had put it on. The Pixel Tablet joined the line-up in late 2024 specifically because it is the only tablet that GrapheneOS supports. I wrote a relatively positive review of it at the time, with the significant caveat that it’s “underpowered” and not really suitable for anything more demanding than media consumption and light note-taking. A year and a half later, that already-modest assessment has aged poorly. The device’s Tensor G2 , which was already two generations old at the point Google shipped the tablet, has become noticeably sluggish as apps have continued to grow heavier. Lightroom Mobile , which was one of my primary reasons for buying it, runs with random glitches, crashes and odd behaviour , to the point where I’m looking to migrate my photography workflow once again to something else. Also, it seems like the device developed some WiFi connectivity issue leading to specifically streamed content pausing/stuttering for around half a second before resuming for maybe another half a minute, only to then repeat this behavior. 
I don’t know whether this is a hardware issue or a GrapheneOS bug, but I’ve been noticing it for over a year now. Additionally, the battery life has degraded faster than anticipated, with editing workloads draining a full charge in under three hours even with the screen way below maximum brightness, and overall the tablet has aged significantly faster than anticipated, rendering it largely useless for any of the things I originally bought it for. In short, Google shipped a 2023 tablet with a 2021 chip and a sealed battery, and in 2026 this has become very noticeable. Note: These were only the Google-branded and -made devices that I owned, alongside a long list of other Android devices from HTC, Sony, Samsung, OnePlus, and even OPPO, that in all honesty weren’t exponentially better with regard to reliability and longevity. It would be easy to write all of the above off as bad luck, so let me back up the personal experience with what is documented elsewhere. Google’s A-series phones in particular have, by now, a multi-generation track record of batteries that swell, overheat, or catch fire. The Pixel 4a was included in the UK Office for Product Safety and Standards alert for fire risk. Google’s Battery Performance Program nerfed the device’s battery via a mandatory update rather than acknowledging a hardware defect. The Pixel 6 had reports of battery swelling and off-gassing, with some users describing flame and smoke incidents. The Pixel 6a saw multiple fire incidents, was pulled from the refurbished store, and was subjected to the same Battery Performance Program as the 4a. Google restricted charge rate and capacity after 400 cycles via forced OTA on July 8, 2025. The Pixel 7 and 7 Pro had widespread swelling reports less than three years post-launch. Google’s response has been described as “inconsistent” by Android Central, with some users receiving free replacements and others being told to pay out of pocket.
Oh, and the Pixel 7a has its own repair program for swollen batteries. When the same failure mode shows up across five consecutive versions/generations of phones from the same vendor, and the vendor’s first response is to throttle charging rather than replace the cells, you’re no longer looking at bad luck but at a structural problem with battery sourcing, cell qualification, or thermal design. I’ve mentioned my own Pixel 8 display dying above. The Extended Repair Program that Google published in response covers Pixel 8 devices that exhibit “a vertical line running from the bottom of the display to the top or a display flicker” , with coverage extended to three years post-purchase. Pixel 8 Pro owners with the same vertical line defect have not been so lucky and are largely on their own. Manufacturers don’t extend warranties on a whim. Google extending warranties on the Pixel 8 display by a factor of three is, in itself, the admission-of-a-defect that the company has otherwise tried to avoid in public. Since the Tensor G2 , Google ’s Pixel flagships have been using a Samsung Exynos 5300 modem (and its successors) for cellular connectivity. This is the same modem family that has, generation after generation, been criticised for worse signal stability than the Qualcomm modems used by competitors, as well as significantly higher power consumption, especially on 5G, and battery drain bugs that essentially trade-off endurance for modem efficiency. Google ’s answer in the Pixel 10 generation has been to switch to a MediaTek T900 , which according to early benchmarks is an improvement, but does not retroactively help any of the millions of Pixel 6 / 7 / 8 / 9 owners who paid flagship prices for what was, by industry standards, a sub-par modem. Google ’s Tensor chips were seemingly never designed to compete head-to-head with Qualcomm or Apple on raw CPU or GPU throughput, despite the pricing being in a similar range. 
For example, the Snapdragon 8 Gen 3 is roughly 68% faster than the Tensor G3 in Geekbench 6 multi-core, and about 32% faster in single-core. In some graphics workloads, it’s roughly twice as fast. The Apple A17 Pro is nearly 50% faster than the Tensor G4 in multi-core, and the Pixel 9 Pro XL ’s Tensor G4 loses up to 50% of its sustained CPU performance under thermal throttling, with the throttling kicking in within three to four minutes of full load . The Pixel 10 and Pixel 10 Pro , powered by the Tensor G5 , score 3,707 in the Vulkan GPU benchmark , compared to 26,333 for the Samsung Galaxy S25+ , which is a difference of roughly 7x . Even the Pixel 9 Pro ’s outgoing chip outperforms its successor at 9,023 points. In 3DMark Wild Life Extreme , neither the Pixel 10 nor the Pixel 10 Pro break 20 FPS, while a Snapdragon 8 Elite device comfortably clocks 38 FPS. Hence, the Snapdragon 8 Elite -based Galaxy S25 comfortably outscores the Pixel 10 in both single- and multi-core CPU performance , with the S25 posting roughly 75% higher multi-core scores. If you want a single chart that summarises this, Geekbench ’s Android benchmarks page is a good overview and shows that Pixel flagships do not appear anywhere near the top. What this means in practice is that when you buy a Pixel , you are paying roughly the same money as you would for a Samsung , OnePlus , Xiaomi , or Apple flagship, but you are getting an SoC that is one and a half to two generations behind on raw compute, and even further behind on graphics. The phone feels snappy because Android is optimized for these chips and because Google ’s AI use cases are accelerated by the TPU , but once you actually push the device, e.g. with raw photo editing, gaming, prolonged camera use, or pretty much anything that requires sustained performance, it falls behind quickly. 
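The "roughly 7x" figure follows directly from the two Vulkan scores quoted above. As a quick sanity check, here is a small Python sketch that uses only the numbers from the text. The dictionary keys are just labels and `times_faster` is a hypothetical helper name.

```python
# Vulkan GPU scores exactly as quoted in the text; keys are plain labels.
vulkan_scores = {
    "Pixel 10 / 10 Pro": 3_707,
    "Pixel 9 Pro": 9_023,
    "Galaxy S25+": 26_333,
}


def times_faster(score: float, baseline: float) -> float:
    """How many times larger `score` is than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return score / baseline


baseline = vulkan_scores["Pixel 10 / 10 Pro"]
for device, score in vulkan_scores.items():
    print(f"{device}: {times_faster(score, baseline):.2f}x the Pixel 10 score")
```

A ratio of benchmark scores is only a rough proxy for real-world speed, but it puts the gap between these devices on a single scale.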
Beyond the flagship failures, Pixel devices have, generation after generation, shipped with a steady stream of quality-control issues that read more like early-access hardware than flagship. For example, with the Pixel 8, Google shipped a batch of factory-unlocked phones without the ability to relock the bootloader, requiring a return. Then we had the Pixel 8 green screen recall, which was the precursor to the extended-warranty program, as well as the phantom touches issue, where intermittent ghost touches were frequently dismissed by support as user error before being diagnosed as actual hardware problems. The Pixel 9 Pro XL had its infamous camera tilt issue, where some users reported the 5x telephoto lens shipping physically tilted out of the box, and the Pixel Tablet had the “check charging accessory” issue, where the charging dock dies surprisingly often, with troubleshooting steps that boil down to “clean the contacts and hope for the best”. You can find an essentially endless stream of similar reports on and the official Pixel Phone Community forums, and the pattern is always the same: a defect is reported, Google’s official support insists on app uninstalls and factory resets, and after enough public outcry the defect is eventually quietly acknowledged via a support page hidden so deep that most people probably won’t bother to look. Honestly, in my circle of people who care about privacy, the answer is almost always the same as mine, namely because of GrapheneOS. For everyone else, the answer is the camera and the “AI features”, plus a vague brand loyalty to Google that exists for reasons I truly struggle to understand. The camera is, to be fair, very good. Google’s computational photography pipeline is one of the few areas where the company’s ML-first approach to silicon pays off in a way the user actually notices. 
If you primarily care about point-and-shoot photography out of a phone, the Pixel camera is still near the top of the pile, even on the cheaper A-series . Everything else, in my view, is not competitive with what Samsung , Xiaomi , OnePlus , Nothing , or Apple ship for the same money or, in some cases, less. You can verify that for yourself. After my Pixel 8 green-screened on me, my initial instinct was to do what I’ve always done and just replace it with the next Pixel . I spent a few weeks looking at deals on the Pixel 9 and Pixel 10 , reading through their respective issue threads on Reddit , looking at the benchmarks above, and decided that I simply don’t want to give Google any more of my money for what is, charitably put, garbage hardware sold at flagship prices. The interesting development that makes this decision possible is that, on March 2, 2026 , at MWC 2026 , Motorola officially announced a partnership with the GrapheneOS Foundation . This is the first time GrapheneOS will officially support a non- Pixel vendor, with availability expected to begin in 2027. There is some uncertainty in all of this, though, as hardware schedules often slip and partnerships sometimes dissolve, and there’s no guarantee that the eventual Motorola device will meet Graphene ’s requirements (verified boot, relockable bootloader, etc.) at a price point that’ll be remotely interesting to the average GrapheneOS user. There is also the risk that Android 17 turns into more of an Intelligence System launcher than an actual OS. However, I’d rather wait six to twelve months and roll the dice on Motorola than spend another $800-$1000 on a phone that, by all available evidence, is statistically likely to develop a hardware defect shortly past its warranty window. 
The obvious follow-up question is whether existing Motorola hardware, like the Edge series, or the current razr line-up, is any good to begin with, since these broadly resemble what the eventual GrapheneOS -compatible devices are likely to be. Frankly, I have no idea. The reviews of the Motorola Razr Ultra (2025) seem relatively positive on durability. Android Central ’s one-year follow-up describes the display still looking “like the day it was received” after a year of regular use, with the major caveat that the vegan leather on the back has been peeling. Reviewers have called it “Motorola’s best and most popular flagship phone thus far” . The Motorola Edge 60 is even more interesting from a durability perspective. It carries an IP69 rating , which is above the IP68 on the latest Pixels and means the device is certified against high-pressure, high-temperature water jets in addition to sustained submersion. Motorola also commits to three OS updates and four years of security updates , which is a little behind Google ’s nominal seven years on the Pixel , but in line with the rest of the Android industry, and arguably more honest given that Google ’s seven years are seemingly predicated on the device not physically falling apart in years two and three. Note: I’ve started to believe that Google ’s 7 years of updates is simply a marketing stunt and that the company knows that most of its hardware will fail well before users get even close to the seventh year. If you look up (used) offers for e.g. the now almost 7-year-old Pixel 4a on marketplaces like eBay you’ll find the offer to be surprisingly thin. Similarly, the slightly younger 5a is also relatively hard to come by in good shape. Older smartphones sustained above 80% of their original battery capacity for up to 500 charging cycles, which amounts to less than 3 years if you assume a full charge every two days, which is unrealistically generous especially for an Android device. 
Even if we assume that modern smartphones sustain 80% capacity for up to 1000 recharges and we use the generous two-day cycle, the phone will likely drop below 80% battery capacity within 5 and a half years. Again, that’s a very positive calculation that doesn’t take into account prolonged charging cycles (over night), environmental impacts (high heat or freezing cold) and arbitrary battery deterioration. A more realistic outlook is a drop below 80% within the device’s first three years. It is also worth noting that at some point past the 80% mark degradation speeds up sharply and becomes roughly exponential, as Lithium plating, electrolyte depletion, and loss of active material compound on each other. This means that the drop from 80% capacity to 60% will happen significantly faster than the initial drop from 100% to 80%. The 80% mark was deliberately chosen by manufacturers as it kind of marks the practical end of the stable region of the battery. Past that point, the phone will become less stable and show effects like sudden reboots, or at some point even shutdowns at around 30% indicated charge. Compared to the Pixel line, Motorola ’s 2025 hardware appears to have notably better water- and dust-ingress protection ( IP69 vs IP68 ), use Qualcomm Snapdragon silicon, which means, per the benchmarks above, meaningfully better raw performance and meaningfully better modem efficiency, have a build quality that holds up better through year-one stress tests, even on the foldable form factors that are notoriously hard to engineer, and are priced lower than the equivalent Pixel Pro , with the obvious caveat that the razr ultra at $1,300 is, in fact, a tough pill to swallow . What it doesn’t appear to offer, at least yet, is the Pixel ’s camera quality. Reviews of the Edge 60 and Edge 50 Ultra are competent but not class-leading on the photography front. 
For someone who uses a dedicated camera for serious photography and reserves the phone for documentary snapshots, this is a perfectly acceptable trade-off, but your mileage may vary. Until GrapheneOS-compatible Motorola hardware is actually on shelves, I’m going to keep using the Pixel 8 with its hardware workaround (yes, I’m literally squeezing the lower part of the chassis whenever the screen starts glitching) and avoid spending any more money on Google hardware. Unless the Pixel 8 dies completely or becomes otherwise unusable, I won’t be purchasing another Google device. For anyone in a similar situation, my recommendation is to not upgrade if your current Pixel still works, and instead hold on to it. Pixel-to-Pixel generational improvements are marginal at best, and you’re almost certainly going to inherit a fresh set of defects with each new model. Also, e-waste is a real concern, especially with repairability scores below most Apple devices, particularly because of the extensive use of adhesives within Pixel phones. If you have to get a replacement in the meantime, buy used or discounted. The Pixel 8a is occasionally available below $300 refurbished, the Pixel 9 is now in the same price band as the Pixel 8 was a year ago, and the Pixel 9a is probably the best affordable entry point. Keep in mind that none of the historical hardware-defect patterns have spared the Pro models, but the Pro pricing has consistently included an Apple-level markup for what amounts to a bigger screen and one extra camera sensor. Hence, I would avoid those variants. If you can hold off on a phone purchase for another year or so, see how the Motorola / GrapheneOS situation develops. If the first compatible devices land at a reasonable price with an acceptable build quality, that will be the first competitive alternative to the Pixel line for privacy-conscious users. If you’re a tech power user, however, maybe consider Linux on mobile as a more radical alternative. 
I’ve been eyeing postmarketOS on the Fairphone 6 for a while, as it appears to be making meaningful progress, but it is not yet a daily-driver experience and probably won’t be for another year or two. The Pinephone is a dead end , imho, but it seems like Ubuntu Touch is coming along nicely. Google ’s consumer hardware is, in my unscientific but consistent personal experience, garbage. The A-series has a multi-generation track record of batteries that swell or catch fire. The Pixel 8 has a display defect serious enough to introduce an extended warranty program. The Pixel Tablet shipped with a chip that was already two generations old. Tensor -based flagships are routinely outperformed by competitors at the same price point, and thermal-throttle hard enough under sustained load that the silicon is barely delivering half of its rated performance for any task longer than a few minutes. I have given Google enough of my money over the past years. The only reason I have kept doing so is because of the community ROMs and, in the recent past, because of GrapheneOS , which I consider one of the most important pieces of consumer software in the privacy and security space today, that has been Pixel -only by hardware necessity. As of MWC 2026 , that constraint has an end date however. Until either GrapheneOS -compatible Motorola hardware actually ships, or Linux on Mobile becomes actually usable on a halfway modern device like the Fairphone (with replaceable battery), I am holding on to my squeezable Pixel 8 and not buying anything else from Google . After that, I expect to never own another Pixel ever again. Note: I deliberately picked the same title format as my I Do Not Recommend Bitwarden and I Do Not Recommend Proton Mail posts. The reason is the same in all three cases, which is that I used the product, in many cases over the course of years, recommended it to others in writing on this site, and have since come to a different conclusion. 
If your own experience has been different and you’re happily using a Pixel without issues, that’s great. This post is, in part, an updated honest disclosure of where I personally landed, and a counterweight to my own earlier, more positive reviews of these devices.

fLaMEd fury Yesterday

I Am A Link Curator

What’s going on, Internet? Friend of the site James recently shared a new post, Blogger Archetypes, which asks a series of questions to help you narrow down your character as a member of the blogging community. A bit of indie web fun. Here are my results: You are a Link curator: The web is not just its pages, but the connections between pages. You have internalised this and love spending your time exploring the web and sharing what you find with the world. You are also a Culture maker: You love to help push the blogging community forward by starting discussions, encouraging thought, and sharing what’s on your mind. And the other archetypes on offer: Explorer: To you, the web feels like a library that’s open all hours and has everything you could ever imagine! You love reading others blogs, and know how important readers are to the whole of the indie web community! Community gardener: You love to help contribute to building the blogging community, either through your writing or how you share the spirit of writing on the web with friends. Author: You love writing and have a growing backlog of posts on your website! Words are your best friend and you’re always thinking about what to write next. Link curator feels about right. A lot of what I do here is exactly that. Surfing the web, finding the good stuff, and passing it along. The bookmarks, the links back to other people’s writing (which I need to get back into doing regularly). That’s the fun part for me. If you’ve got a blog, go and take the quiz and write up what you got. Send it my way, I’d love to see it. 🤙 Hey, thanks for reading this post in your feed reader! Want to chat? Reply by email or add me on XMPP, or send a webmention. Check out the posts archive on the website.

Kev Quirk Yesterday

📝 2026-06-22 09:39: The fox continues to prowl around our chickens. This morning we caught it in the...

The fox continues to prowl around our chickens. This morning we caught it in the GARDEN a few feet from our favourite chicken. Luckily the magpies warned us and we were able to scare it away. It's not nice keeping the little cluckers cooped up in this heat, but needs must unfortunately. Thanks for reading this post via RSS. RSS is ace, and so are you. ❤️ You can reply to this post by email , or leave a comment .

Kev Quirk Yesterday

No, I don't want you to summarise the page!

I've talked about LLMs a few times here - the TL;DR is that I find them useful for certain use cases . Searching something complex? Great. Checking my code, or helping me with a problem in said code? Count me in. But summarising a page I'm reading? Absofuckinglutely not. One of the things I really enjoy about the web is surfing it and reading . Reading is one of the great joys I get from the web, and in general . Why would I want a bastardised version of your words presented to me by a computer when I can read the actual words you took time to write? LLMs have their place and are useful tools in my opinion, but I'm getting sick of them being crammed into every facet of computing. Hopefully the bubble will burst soon and we can all enjoy an LLM enriched web, not an LLM hijacked web. Thanks for reading this post via RSS. RSS is ace, and so are you. ❤️ You can reply to this post by email , or leave a comment .

iDiallo Yesterday

Everything you say CAN and WILL be used against you

- "If you talk to me, I'll punch you in the face, are you ok with talking with me?" - "Nods in agreement." - "Proceeds to punch the man in the face." That's how I feel whenever I hear the Miranda rights being read. It was designed specifically to scare anyone being read to, into silence. Don't incriminate yourself. If you are like me, guilty of watching those police bodycams videos on youtube, then you know that people proceed to talk right after they are read their rights, as if they heard absolutely nothing. Those rights exist solely to protect you from the very authority addressing you. They have authority over you, so you need protection to balance the playing field. The perfect way to balance it, is by affording people the right to remain silent and not to be coerced into incriminating yourself. We can all agree that the Miranda rights are a fundamental power we have and should exercise. The State reads you Miranda rights to limit its own power over you. So let me rephrase Miranda rights in a way that you will find relevant in this tech focused blog: “You have the right to remain silent. Everything you say, do, or generate on this device can and will be used against you… Would you like to create an account?” Yet, we agree to these terms constantly. It’s second nature. We sign up to test a new AI or a new service, telling ourselves, "If I don’t like it, I’ll just cancel." We ignore the reality that while it takes one click to sign up, it often requires a fax machine or a physical letter to cancel. This week, following the "Fable" kerfuffle, Anthropic announced they now support customer identification. You can upload your government-issued ID or passport to verify your identity. We are rolling out identity verification for a few use cases, and you might see a verification prompt when accessing certain capabilities, as part of our routine platform integrity checks, or other safety and compliance measures. 
This will eventually be used to determine who is considered an "approved" user. In other words, when you type "Fix this code" into Claude, it will check your verified status before executing, all in the name of compliance. By uploading those documents, you are surrendering control. You are giving up your rights, your identity, just to access a service. If things go wrong, there is no "Miranda warning" for the consumer. Every action on your account is now permanently tethered to your identity. In the digital world, the corporation reads you the Terms of Service to expand its own power over you. When you agree to Claude's terms, OpenAi’s or any corporation, you are waiving your right to remain silent. And then you are providing them with a searchable database with the most intimate information about yourself, that can and will be used against you. For example, imagine you upload your Driver’s License to Claude to unlock advanced coding features. Three months later, you ask Claude to review a snippet of open-source code that accidentally contains proprietary company secrets (you didn't realize it). Under Miranda, you could have said, "I refuse to discuss this code." Online, you already discussed it. Your verified identity is now permanently attached to that leak, making you the prime suspect for corporate espionage, even if it was an accident. Or you make a joke. You ask Claude: "Fix this bug before I throw my laptop out the window, and delete the entire production database." Because you verified your ID, this log is permanently stored. Six months later, your company undergoes a security audit. The audit team subpoenas your AI logs. They see a verified user (you) threatening to delete a database. You now face a disciplinary board for "security threats," and the AI log is treated as a written confession, because you gave up your right to contextual defense when you agreed to permanent, verbatim logging. 
Even worse, when all your data is logged and attached to your identity, it can later be cross-referenced against laws that don’t exist yet. Say you are an aspiring writer, but you just weren’t gifted with words. So you use Fable to write a short story. You verified your ID, of course, and then you prompted a story about a rogue AI overthrowing the government. A few years later, an Anti-Terrorism AI Monitoring directive is passed under the leadership of new secretary of war Alex Karp. Your sci-fi hobby is retroactively flagged, and you are put on a watchlist. When you are read Miranda rights, the officer is saying: "You have a right to a lawyer, and if you cannot afford one, one will be provided." The State bears the burden of providing you protection. In the digital ToS, the corporation is saying: "We have a right to audit you, and if you cannot afford to fight us in court, too bad." You are giving up the presumption of innocence. In a physical court, your silence cannot be used against you. In a digital audit, your silence doesn't exist. Every click is a spoken word. By uploading your ID, you are giving the corporation a signed affidavit that you are the one pressing the buttons. If a hacker steals your account, you still bear the burden of proof to clear your name, because the logs show your verified ID. A ToS exists only to protect the corporation. Remember, Disney tried to use its Disney+ ToS to dodge a wrongful death case stemming from a fatal allergic reaction at a restaurant on its property. The user is giving up the right to be forgotten, the right to be misinterpreted favorably, and the right to change their mind. They are trading their 5th Amendment-equivalent (protection from self-incrimination) for a free API call. The only way to win is to treat every prompt as if you are testifying under oath in a courtroom, because legally, thanks to that uploaded passport, you are.

iDiallo Yesterday

Happy Father's Day.

I am a father of twin boys. There is a question I often think about. It often appears as a midlife crisis where I am not sure when I became a man responsible for a family. It looked so easy for my father. It was as if he was born into it. He was a leader, a strong man, one that an entire community could rely on. When does that kick in for me? When do I become this leader? Or have I already become? That's what I was thinking when I wrote this short story about my father 10 years ago. I couldn't find a way to describe him, without mentioning clocks. It was fitting since he loved them so much. I hope you enjoy this "Ode to my Father" . Happy Father's day to you all. PS: please excuse the AI generated images, I'm still trying to find the best way to present it. The text was entirely written by me in 2016-2017.


Expert-aware quantisation: near-Q4 quality at near-Q2 size?

While researching and writing my last article on the history of KV cache compression, it occurred to me that while there has been so much implemented research on KV cache efficiency, quantisation of the actual model weights is still pretty blunt. This makes sense - at large scale, with many tens of thousands of GPUs, the weights themselves aren't a huge efficiency bottleneck for the most part, and KV cache starts dominating memory usage. But for us lowly serfs who don't have access to a warehouse full of HBM, it is a problem. The key constraint for local models is (mostly) just loading the weights into something fast enough. I spend a lot of time profiling applications to improve their performance, and a couple of months ago I built a tool to do the same for MoE models. This got me thinking. What if, instead of just quantising the entire model to a certain level - the blunt hammer I mentioned - we profile the model first and then quantise the "cold" experts selectively, for that specific set of tasks? For this research I profiled Qwen3.6 35B-A3B on C++ programming tasks. There's an important nuance worth flagging up front: this particular model is very well load-balanced, so when it's reading code it spreads the work across its experts almost evenly (a per-layer Gini coefficient near 0 - basically uniform). The selectivity only shows up when it's generating code. And there it concentrates hard. Running a handful of C++ generation prompts through, the per-layer Gini coefficient jumps to 0.61 - meaning the top 32 of the 256 experts handle ~50% of the routing during code generation, versus the 12.5% you'd expect at random. That concentration is exactly what we can exploit: if only a subset of experts really matter for the task, we can keep those at high precision and crush the rest. Once we've got these traces showing which experts are hot (used a lot for the specific domain) vs cold (not used), we can then move on to the next step. 
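To make the hot/cold split a bit more concrete, here is a rough sketch of how a Gini coefficient, a top-k routing share and a "hot" expert set could be derived from per-expert routing counts for one layer. The routing counts are synthetic and the function names (`gini`, `top_k_share`, `hot_experts`) are made up for illustration; this is not the profiling tool used for the measurements in this post.

```python
def gini(counts):
    """Gini coefficient of routing counts: 0 = perfectly even, near 1 = concentrated."""
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    if total == 0:
        return 0.0
    # Standard rank-weighted formula over the ascending-sorted counts.
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


def top_k_share(counts, k):
    """Fraction of all routing decisions handled by the k busiest experts."""
    return sum(sorted(counts, reverse=True)[:k]) / sum(counts)


def hot_experts(counts, k):
    """Indices of the k most-used experts; everything else is treated as cold."""
    order = sorted(range(len(counts)), key=lambda e: counts[e], reverse=True)
    return order[:k]


# Perfectly load-balanced layer: 256 experts, identical usage.
uniform = [10] * 256
uniform_gini = gini(uniform)              # 0.0
uniform_share = top_k_share(uniform, 32)  # 32 / 256 = 0.125

# Synthetic skewed layer (hypothetical numbers, not measured data).
skewed = [100] * 32 + [10] * 224
skewed_gini = gini(skewed)
skewed_share = top_k_share(skewed, 32)
hot = hot_experts(skewed, 64)             # protect these, crush the rest
```

The uniform case reproduces the 12.5% baseline mentioned above: with 256 evenly used experts, any 32 of them handle exactly 32/256 of the routing. In a real run the counts would come from router traces collected per layer, and the hot set would presumably be chosen per layer as well.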
This took (Claude Code) a fair while - ironically I suspect Fable would have been perfect for this kind of task. The core idea was to allow llama.cpp to read different levels of quantisation per expert, which had a fair few issues. Eventually though, it figured it out (running autonomously for a good 90 minutes!). It also wrote a script to take the profiling data and do quantisation per expert. All numbers below are perplexity (lower is better) measured on CPU. "Reading code" is a held-out chunk of real C++ source; "writing code" is a set of the model's own C++ generations. The tiered models keep a "hot" set of 64 experts (out of 256) at high precision and drop the other 192 "cold" experts to 2-bit. * The "writing code" eval was generated by the Q8 model, so scoring Q8 against it would be circular - it's left out. A few things jump out. First, the baseline. Full-fat Q8 (35 GB) scores 1.568 reading C++, and a "blunt" Q2 quantisation of everything (13 GB) jumps to 2.103 - a big drop in quality for less than half the size. (Perplexity is roughly "how surprised the model is at each token" - so going from 1.57 to 2.10 is the model getting noticeably dumber, not lobotomised, but clearly worse.) Now the actual experiment. I A/B tested the tiered approach two ways: random - pick the hot experts arbitrarily, as a control - versus profiled - keep the experts our profiling flagged as hot for C++ and crush the cold ones. The profiled version wins every single time: across two precision tiers and both eval sets, that's four out of four. With Q8 hot / Q2 cold (18 GB), random tiering scores 1.667 while the profiled version recovers nearly half of that gap back towards Q8, landing at 1.620. So the core idea works - which experts you protect matters, and the profile tells you which ones. But here's the catch I have to be honest about: uniform Q4 is really good. On code, 4-bit is almost lossless - Q4 (20 GB) scores 1.582, basically tying Q8. 
So the fancy Q8-hot/Q2-cold model, despite all the cleverness, doesn't actually beat just using Q4 everywhere at a similar size. The win shows up when you go smaller than Q4. I built a Q4-hot / Q2-cold version - 4-bit for the hot experts, 2-bit for the cold ones - which comes in at 14 GB, just 1 GB more than the blunt Q2 model. And it scores 1.635 reading and 1.477 writing - recovering ~90% of the quality gap between Q2 and Q4 for that single extra gigabyte. That's the real result: near-Q4 quality at near-Q2 size , by spending your bits on the experts that actually matter for the task. This is absolutely nowhere near production ready and needs a lot of work from someone that knows the llama.cpp codebase far better than me. I only ran this on CPU which is (very) slow, my eval sets are small, and there's no doubt the vibe coded implementation Claude came up with could be improved further. There's loads of interesting angles to continue researching on this. I tried a couple of tiers here (Q8/Q2 and Q4/Q2), but there's no reason you couldn't go further - pushing the cold experts down to sub-2-bit (IQ1/IQ2) would drop the model below the uniform-Q2 size while keeping the hot experts sharp. You could imagine a whole gradient: hottest experts at high precision, then incrementally more aggressive quantisation as experts get colder. There's also the question of how many experts you keep sharp. I protected 64 of the 256 here, which turns out to be pretty generous - generous enough that even picking those 64 at random recovers around 80% of the Q2-to-Q4 gap by itself. That's less surprising than it first looks: an MoE layer's output is a weighted sum of its active experts, so keeping any quarter of them accurate anchors the result no matter which quarter you pick. Profiling buys the last ~10% by making sure the experts that actually fire are the protected ones. 
Where I'd expect it to really pull away is at small hot sets - keep only 16 experts sharp and random selection would mostly be protecting cold experts and fall apart, while the profile tells you exactly which handful matter. That's the experiment I'd run next: it should shrink the model and widen the profiled-vs-random gap, which is where this whole approach earns its keep. But in the end I think this is a really interesting approach. If we could get mass-scale profiling data from real world llama.cpp executions, it may allow a really big jump in quality. I can see a world where the harness detects what domain the task is in, downloads a quantised model for that specific domain and then runs prompts through it. This really takes advantage that storage is cheap (relatively speaking) and RAM is expensive . So having a bunch of different quantisation variants - of the same model - on disk is pretty doable. I should add that there is a fair bit of prior art in this space. The closest I found, DynaExq, does something very similar dynamically at serving time from router traces - but I couldn't find anyone doing it as a static, domain-profiled quantisation you ship as a single GGUF. Here are some links that I read up on while doing this: Closest prior art (variable precision per expert): Foundational quantisation: On-device routing in the wild: If you're working on these kind of optimisations I'd love to hear from you - please feel free to reach out on my contact page . DynaExq - a serving system with a "hotness-aware precision controller" that reads router traces to keep hot experts at higher precision and crush the cold ones, done dynamically at runtime. Mixture-Compressor - folds expert activation frequency directly into a per-expert bit-width allocation. MoPEQ - assigns per-expert bits by sensitivity and explicitly avoids using activation frequency to do it - a nice counterpoint to the approach here. 
AWQ , GPTQ , SmoothQuant and SpQR - the lineage of protecting the salient weights and crushing the rest. SpQR in particular is basically the within-tensor version of what I'm doing across experts. llama.cpp's importance matrix (imatrix) and k-quants - already uses calibration-data importance to steer per-weight quantisation. This is really just pushing that same idea up to per-expert granularity. Apple's on-device and server foundation models - their ~3B on-device model uses a mixed 2-bit/4-bit scheme (~3.7 bits per weight) with LoRA adapters to claw back quality, then routes harder requests to the cloud. Not far off the "small quantised model for the task at hand" idea.
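For anyone who wants to check the "gap recovered" figures quoted in the results, the arithmetic is simple enough to write down. The perplexity values below are the ones reported in this post for the "reading code" eval; the helper name `gap_recovered` is just for illustration.

```python
def gap_recovered(worse, better, candidate):
    """Fraction of the perplexity gap between `worse` and `better`
    that `candidate` closes (1.0 = as good as `better`, 0.0 = no gain)."""
    return (worse - candidate) / (worse - better)


# Reading-code perplexities reported above (lower is better).
Q8 = 1.568            # uniform Q8, 35 GB
Q4 = 1.582            # uniform Q4, 20 GB
Q2 = 2.103            # uniform Q2, 13 GB
Q8Q2_RANDOM = 1.667   # Q8 hot / Q2 cold, hot set picked at random
Q8Q2_PROFILED = 1.620 # Q8 hot / Q2 cold, hot set from profiling
Q4Q2_PROFILED = 1.635 # Q4 hot / Q2 cold, hot set from profiling, 14 GB

# Q4-hot/Q2-cold versus the uniform Q2 -> Q4 gap: roughly 0.90.
q4q2_recovery = gap_recovered(Q2, Q4, Q4Q2_PROFILED)

# Profiled versus random tiering, measured against Q8: a bit under 0.5.
profiled_vs_random = gap_recovered(Q8Q2_RANDOM, Q8, Q8Q2_PROFILED)

print(f"Q4/Q2 tiered recovers {q4q2_recovery:.0%} of the Q2-to-Q4 gap")
print(f"Profiling recovers {profiled_vs_random:.0%} of the random-to-Q8 gap")
```

Both numbers line up with the prose: about 90% of the Q2-to-Q4 gap for one extra gigabyte, and nearly half of the random-tiering gap back towards Q8.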


GixSQL, COBOL and FreeBSD

GixSQL, COBOL and FreeBSD This is more of a small bookmark for my own sake, rather than a full article. I created a tiny repository with working GnuCOBOL, GixSQL and SQLite examples: gnucobol-gixsql-sqlite-examples It is intentionally small. There is one SELECT example, one INSERT example, a tiny SQLite schema, a Makefile, and some notes about the things that were harder to discover than I expected. Not a framework, not a full tutorial, not a polished “this is how you should do it” document.


sqlite-utils 4.0rc1 adds migrations and nested transactions

sqlite-utils is my combined Python library and CLI tool for working with SQLite databases. It provides an extensive set of higher-level operations on top of Python's default sqlite3 package , including support for complex table transformations , automatic table creation from JSON data and a whole lot more. I released sqlite-utils 4.0rc1 , the first release candidate for sqlite-utils v4. The major version bump indicates some (minor) backwards incompatible changes, so I'm interested in having people try this out before I commit to a stable release. There are two significant new features in this RC compared to the previous 4.0 alphas. The first is support for database migrations . This isn't a completely new implementation - it's a slightly modified port of the sqlite-migrate package I released a few years ago. I think that package has proved itself over time, so I'm now ready to bundle it with directly. Here's what a set of migrations in a file looks like: This defines a set of two migrations, one creating the table and another adding a column to it. You can then run those migrations either using Python: Or with the command-line command: The system is deliberately small: it doesn't provide reverse migrations, so any mistakes you make should be fixed by deploying a fresh migration to undo them. Its predecessor has been used by LLM and various other projects for several years, so I'm confident that the design is stable and works well. The new migrations feature is documented here . This feature is a lot less exercised than migrations, so it deserves more attention from testers. Previously, mostly left transaction management up to its users, via a construct that reused the mechanism directly. SQLite supports nested transactions in the form of savepoints, so I wanted an abstraction that could make those as easy to use as possible. I borrowed the terminology "atomic" from Django and Peewee. Here's what the new API looks like: More details in the documentation . 
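The savepoint mechanism underneath nested transactions can be illustrated with nothing but Python's standard `sqlite3` module. The sketch below is not the sqlite-utils API: the `atomic` helper here is a hypothetical stand-in written for this example, and the table and row values are made up. It only shows how `SAVEPOINT`, `ROLLBACK TO` and `RELEASE` let an inner block be undone while the outer block still commits.

```python
import sqlite3
from contextlib import contextmanager
from itertools import count

_savepoint_ids = count()


@contextmanager
def atomic(conn):
    """Hypothetical helper: run a block inside a uniquely named savepoint."""
    name = f"sp_{next(_savepoint_ids)}"
    conn.execute(f"SAVEPOINT {name}")
    try:
        yield
    except BaseException:
        # Undo everything since the savepoint, then drop it from the stack.
        conn.execute(f"ROLLBACK TO SAVEPOINT {name}")
        conn.execute(f"RELEASE SAVEPOINT {name}")
        raise
    else:
        # Releasing the outermost savepoint commits the transaction.
        conn.execute(f"RELEASE SAVEPOINT {name}")


# isolation_level=None stops the sqlite3 module from issuing its own BEGINs,
# so the savepoints above are the only transaction control in play.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE creatures (name TEXT)")

with atomic(conn):
    conn.execute("INSERT INTO creatures VALUES ('Cleo')")
    try:
        with atomic(conn):
            conn.execute("INSERT INTO creatures VALUES ('Pancakes')")
            raise ValueError("something went wrong in the inner block")
    except ValueError:
        pass  # inner insert was rolled back, outer block carries on

names = [row[0] for row in conn.execute("SELECT name FROM creatures")]
print(names)
```

Only the outer insert survives, because the inner savepoint was rolled back before the outer one was released. A real abstraction would also have to cope with connections that are already inside an explicit transaction, which this sketch ignores.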
The backwards incompatible changes in v4 were described in the alpha release notes.

For 4.0a0:

- Upsert operations now use SQLite's syntax on all SQLite versions later than 3.23.1. This is a very slight breaking change for apps that depend on the previous followed by behavior. (#652)
- Python library users can opt in to the previous implementation by passing to the constructor, see Alternative upserts using INSERT OR IGNORE.
- Dropped support for Python 3.8, added support for Python 3.13. (#646)
- is now provided by the sqlite-utils-tui plugin. (#648)
- Test suite now also runs against SQLite 3.23.1, the last version (from 2018-04-10) before the new syntax was added. (#654)

And for 4.0a1:

- Breaking change: The method now only works with tables. To access a SQL view use instead. (#657)
- The and methods can now accept an iterator of lists or tuples as an alternative to dictionaries. The first item should be a list/tuple of column names. See Inserting data from a list or tuple iterator for details. (#672)
- Breaking change: The default floating point column type has been changed from to , which is the correct SQLite type for floating point values. This affects auto-detected columns when inserting data. (#645)
- Now uses in place of for packaging. (#675)
- Tables in the Python API now do a much better job of remembering the primary key and other schema details from when they were first created. (#655)
- Breaking change: The and mechanisms no longer skip values that evaluate to . Previously the option was needed; this has been removed. (#542)
- Breaking change: Tables created by this library now wrap table and column names in in the schema. Previously they would use . (#677)
- The CLI argument now accepts a path to a Python file in addition to accepting a string full of Python code. It can also now be specified multiple times. (#659)
- Breaking change: Type detection is now the default behavior for the and CLI commands when importing CSV or TSV data. Previously all columns were treated as unless the flag was passed. Use the new flag to restore the old behavior. The environment variable has been removed. (#679)

You can install the new RC like this:

Or try the CLI version directly with like this:

Come chat with us about it in the sqlite-utils Discord channel, or file any bugs in GitHub Issues.

You are only seeing the long-form articles from my blog. Subscribe to /atom/everything/ to get all of my posts, or take a look at my other subscription options.
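The release notes above mention inserting from an iterator of lists or tuples whose first item holds the column names. As a rough illustration of that input shape, the sketch below converts such an iterator into dictionaries in plain Python. The `rows_from_sequences` helper, its length check, and the sample data are assumptions for this example, not how sqlite-utils implements the feature.

```python
from typing import Any, Dict, Iterable, Iterator, Sequence


def rows_from_sequences(rows: Iterable[Sequence[Any]]) -> Iterator[Dict[str, Any]]:
    """Hypothetical helper: treat the first item as column names, the rest as rows."""
    iterator = iter(rows)
    try:
        columns = list(next(iterator))
    except StopIteration:
        return  # empty input yields nothing
    for values in iterator:
        # Assumption for this sketch: reject rows whose length differs from the header.
        if len(values) != len(columns):
            raise ValueError(f"expected {len(columns)} values, got {len(values)}")
        yield dict(zip(columns, values))


data = [
    ("id", "name", "score"),   # header: column names
    (1, "Cleo", 4.5),          # rows may be tuples...
    [2, "Pancakes", 3.0],      # ...or lists
]
records = list(rows_from_sequences(data))
```

Because the helper is a generator, it never needs the whole input in memory, which suits the streaming style an iterator-based insert implies.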

Kev Quirk Yesterday

📝 2026-06-21 18:42: It's handy when your riding buddy is a photographer. You end up with some nice...

It's handy when your riding buddy is a photographer. You end up with some nice photos.
