Posts in C (20 found)
Max Bernstein 2 days ago

Partial static single information form

In compilers, static single information form (SSI) is a common extension to static single assignment form (SSA). It was introduced by C. Scott Ananian in 1999 in his MS thesis (PDF) 1 . SSI extends your existing SSA intermediate representation by discovering facts from your existing program and reifying them as path-dependent/flow-sensitive IR nodes. That might sound complicated, but at least the basic idea is pretty natural. I talk a little bit about it in What I talk about when I talk about IRs and I’ll rehash here in more depth, starting with some motivating examples. Consider this admittedly contrived example: We should be able to learn from the comparison that in some branches in the IR, is positive. In that region, we can add a new IR instruction that attaches that knowledge right in the instruction’s type field (yay, sparseness!) and then rewrite uses of to now use . Because we’ve done that, our (imaginary) optimization rule that gets rid of on known-positive integers can kick in, and we can delete the invocation of . Yay, optimization! But a couple of questions remain, at least for me: We’ll go through them, starting with the compiler pipeline. The original SSI paper starts with (I think?) SSA form and places some number of new refinement nodes based on conditionals. I have admittedly not tried very hard, but the into-SSI algorithms look complicated and kind of heavyweight. As a reward, you get “linear” into-SSI time complexity. But I am a humble compiler engineer, and I don’t have the time to go through and load all of this into my head. Instead what I have seen done and have been doing is to take a shortcut: build partial SSI during SSA construction 2 . Most of the time this is from bytecode, but it could also be from some other non-SSA IR. In any case, this is an excellent shortcut for two reasons: This is pretty compelling. We can learn from the bytecode with a very small amount of marginal new complexity. See my implementation in ZJIT , for example. 
All it really does is modify the abstract interpreter state when building SSA out of , , and bytecode instructions to take into account the new refined values. This is fine for branches that are already in the user’s source program but sometimes optimization, especially of dynamic languages, adds new branches that were not there before. And sometimes these branches get added much later, long after SSA construction. What then? Can we do something similar and rely on existing infrastructure? Implicit in this “can we do it” is the assumption that your IR tracks data dependencies from use to corresponding def, but not from def to uses. Sea of Nodes (at least the Simple implementation), is an IR that tracks both directions all the time for easier rewriting. Many IRs do not do this, so we will continue assuming that there’s no “easy way out”. JIT optimization of dynamic language compilers often adds synthetic instructions to the IR that enforce pre-conditions. These guards allow optimizing happy/fast path cases in JIT code while leaving the interpreter as a fallback. For example, we might be able to optimize two back-to-back instructions (a very dynamic operation in the world of ideas, but fast when concretely implemented using object shapes) from: which is very generic and involves calling into C code that might raise an exception, to something more like: which is much faster (assuming shape stability at run-time). There’s an irritating problem, though, which is that we have a bunch of duplicate instructions littered around the IR now because our optimizer worked on each instruction individually. Kind of a “template optimizer” situation. Now we need some pass to clean up the detritus. Global value numbering (GVN) will do a good job of de-duplicating instructions. It should notice that we already have an instruction that looks like called and rewrite into . That’s great because we have de-duplicated the guard. 
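The guard de-duplication described above can be sketched as block-local value numbering. This is a minimal sketch, not ZJIT's implementation: the instruction names (`GuardShape`, `LoadField`), the tuple encoding, and `dedupe_block` are all hypothetical, and it assumes every instruction in the block is side-effect free, so two instructions with the same opcode and operands always produce the same value.

```python
def dedupe_block(insns):
    """Block-local value numbering.

    insns is a list of (name, op, operands) tuples in program order.
    Returns (kept, replace): the surviving instructions and a map from
    each removed duplicate to the instruction that replaces it.
    Assumption: all instructions are pure (no stores, no calls).
    """
    seen = {}     # (op, operands) -> name of the first instruction computing it
    replace = {}  # removed duplicate -> surviving instruction
    kept = []
    for name, op, operands in insns:
        # Rewrite operands first, so uses of a removed duplicate point at
        # the survivor and chains of duplicates collapse.
        operands = tuple(replace.get(o, o) for o in operands)
        key = (op, operands)
        if key in seen:
            replace[name] = seen[key]
        else:
            seen[key] = name
            kept.append((name, op, operands))
    return kept, replace


# Two back-to-back field reads, each with its own (hypothetical) shape guard.
block = [
    ("v1", "GuardShape", ("obj", "shape_a")),
    ("v2", "LoadField", ("v1", "x")),
    ("v3", "GuardShape", ("obj", "shape_a")),  # duplicate of v1
    ("v4", "LoadField", ("v3", "y")),
]
kept, replace = dedupe_block(block)
```

After the pass, `v3` is gone and `v4` reads from `v1`. Note that instructions using `obj` directly are left alone; only uses of the removed duplicate are rewritten.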
GVN may not get everything, though; if some instructions later use , they will not get rewritten to instead use the output of these new guard instructions. To do that, we need to add some kind of pass or augment GVN with some canonicalization feature. That canonicalization would handle rewriting operands to use the “latest version” of some value, so to speak. See the canonicalization section of Chris Fallin’s excellent aegraphs blog post for more (and of course the (currently block-local) implementation in ZJIT ). Where I’m going with all of this, though, is that you may already have some dominance-based instruction rewriting mechanism in your compiler, either as part of GVN or separately! And you can use this to do a very low code into-partial-SSI in the middle of your optimizer. This means you could very well get away with inserting instructions in successor blocks of conditionals and get the into-SSI “for free”. That’s up to you. There’s a trade-off between compile-time and run-time, especially in JITs. Inserting more instructions and rewriting more times may slow down your compiler. It’s a cheap lunch, not a free one. I don’t know. I don’t have a good grasp of how this “partial SSI” compares to the “full SSI”. I don’t plan on implementing full SSI in the near future. I will note that this partial SSI approach doesn’t do two things: I can’t tell what impact this has. Like Simple, TruffleRuby is built on a Sea of Nodes IR (Graal). Chris Seaton has an excellent blog post about TruffleRuby’s use of “stamp nodes” (“Pi nodes” 3 ). The function does a lot of heavy lifting, I think because Graal tracks uses. Cinder mostly inserts instructions in the HIR builder, before into-SSA, and then lets the SSA construction take care of things. That’s where I learned this trick, actually. Here is one example of refining the type of the matched operand when building IR for pattern matching. Luau is working on something like this, but for their type checker. 
Chatting with someone on their team is actually part of the reason I got motivated to write this post. Android ART looks like it has HBoundType and inserts them in reference type propagation . This handles class checks, null checks, and instanceof checks. Last, I want to talk a little bit about some interesting reasoning you can do when you have two implementations of something that you can switch between. For example, JIT (+ interpreter), or aliasing and non-aliasing cases in C code, or the weirdo NULL-UB reasoning LLVM can do to C code, things like that. In ZJIT, we currently insert s opportunistically in “easy” cases when building our HIR from the interpreter bytecode. For example, if in the bytecode there is a branch that compares some value with , it will have two outgoing control-flow edges: one block where is definitely , and one block where is definitely not . In each of these control-flow edges, we can insert corresponding type refinement hints. That’s pretty standard. But we can also do weirder stuff. CRuby has a notion of heap objects vs immediate objects. Many (most?) objects are heap objects. However, integer , for example is not allocated on the heap but instead represented by a tagged bit pattern that pretends to be an address: the whole value is encoded in the pointer itself. We encode this knowledge in the HIR’s type system: “heapness” and “immediateness” each get a bit in the type lattice . We use this in the optimizer to reason about effects , among other things. We can’t know a lot of the time what type a thing is, so we pessimistically type most objects flowing through bytecode as . This type encapsulates the entire world of possible values that could go on the stack or in a local variable. On most heap objects, with only a few exceptions, you can write instance variables (fields, attributes, whatever you want to call them). You can never write an instance variable to an immediate. 
This means that if we observe the following pattern in the bytecode: Then after building and emitting HIR for the opcode, we can upgrade the type of from a to a . We can do this because if it weren’t a heap-allocated object, we would have left the compiled code and entered the interpreter. This is another SSI-type thing you can do in your compiler. Uhh I guess the conclusion is that you don’t have to do full SSI and partial SSI is available and not too scary? Does your compiler do this? Reader, please write in. …and optimized in 2002 (PDF), revisited in 2009 (PDF), implemented in LLVM in 2010 (PDF), investigated in 2017 for abstract compilation (PDF), and probably more. The 2009 paper by Boissinot, Brisk, Darte, and Rastello even shows that both Ananian and Singer’s papers have bugs, while perhaps unintentionally also making an excellent pun about the literature being “sparse”.  ↩ This blog post is different than the what the LLVM paper (PDF) calls partial SSI. Partial for different reasons. Maybe it’s not even single information anymore.  ↩ Today I learned that this terminology comes from the ABCD paper (PDF).  ↩ Where/when in the compiler pipeline do we insert and remove these type refinements? Do we need to refine after every conditional? Do we need to implement the whole into-SSI and out-of-SSI algorithms from all the complicated-looking papers? It lets me cleanly separate adding the type refinements (pretty straightforward) from the hard part of doing all of the operand rewriting and phi placement and marking and all manner of other nonsense. In addition to separating the concerns, the hard part is already done by SSA construction. We can actually just skip it! SSA construction handles phi placement, operand rewriting, all of it. It probably fits neatly into a naive or a Braun-style (PDF) construction. 
It doesn’t split variables with a new sigma node, and it generally inserts the refine node within the target block rather than above the branch. (For only) It doesn’t insert new phi nodes; it just leaves both IR nodes available and, instead of re-merging, drops them.
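To make the construction-time shortcut from this post concrete, here is a minimal sketch of refining a value on both edges of a branch while building SSA from bytecode. Everything in it is an assumption for illustration: `Insn`, `Block`, the `RefineType` opcode, the type names, and `branch_with_refinement` are hypothetical and do not mirror ZJIT's or Cinder's actual APIs. The abstract interpreter state is modeled as a plain dict from local variable name to the SSA instruction currently holding its value.

```python
from dataclasses import dataclass, field


@dataclass
class Insn:
    op: str
    args: tuple = ()
    type: str = "Any"


@dataclass
class Block:
    name: str
    insns: list = field(default_factory=list)

    def emit(self, insn):
        self.insns.append(insn)
        return insn


def branch_with_refinement(state, block, local, taken, not_taken,
                           taken_type, not_taken_type):
    """Emit a branch on `local`; return one refined state per successor."""
    value = state[local]
    block.emit(Insn("Branch", (value, taken.name, not_taken.name)))
    out = {}
    for succ, ty in ((taken, taken_type), (not_taken, not_taken_type)):
        succ_state = dict(state)  # copy: the new fact only holds on this path
        refined = succ.emit(Insn("RefineType", (value,), ty))
        # Later reads of `local` in this successor pick up the refined value,
        # so the usual SSA construction does the operand rewriting for us.
        succ_state[local] = refined
        out[succ.name] = succ_state
    return out


entry, is_nil, not_nil = Block("entry"), Block("is_nil"), Block("not_nil")
x = entry.emit(Insn("Param"))
states = branch_with_refinement({"x": x}, entry, "x",
                                is_nil, not_nil, "Nil", "NotNil")
# A use in the not-nil block now refers to the refined value, not to `x`.
use = not_nil.emit(Insn("Send", (states["not_nil"]["x"],)))
```

The only new work is the state update; phi placement and operand rewriting are left to whatever SSA construction is already in place.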

daniel.haxx.se 3 days ago

Mythos finds a curl vulnerability

yes, as in singular one . Back in April 2026 Anthropic caused a lot of media noise when they concluded that their new AI model Mythos is dangerously good at finding security flaws in source code. Apparently Mythos was so good at this that Anthropic would not release this model to the public yet but instead trickle it out to a selected few companies for a while to allow a few good ones(?) to get a head start and fix the most pressing problems first, before the general populace would get their hands on it. The whole world seemed to lose its marbles. Is this the end of the world as we know it? An amazingly successful marketing stunt for sure. Part of the deal with project Glasswing was that Anthropic also offered access to their latest AI model to “Open Source projects” via Linux Foundation . Linux Foundation let their project Alpha Omega handle this part, and I was contacted by their representatives. As lead developer of curl I was offered access to the magic model and I graciously accepted the offer. Sure, I’d like to see what it can find in curl. I signed the contract for getting access, but then nothing happened. Weeks went past and I was told there was a hiccup somewhere and access was delayed. Eventually, I was instead offered that someone else, who has access to the model, could run a scan and analysis on curl for me using Mythos and send me a report. To me, the distinction isn’t that important. It’s not that I would have a lot of time to explore lots of different prompts and doing deep dive adventures anyway. Getting the tool to generate a first proper scan and analysis would be great, whoever did it. I happily accepted this offer. (I am purposely leaving out the identity of the individual(s) involved in getting the curl analysis done as it is not the point of this blog post.) 
Before this first Mythos report, we had already scanned curl with several different very capable AI powered tools (I mean in addition to running a number of “normal” static code analyzers all the time, using the pickiest compiler options and doing fuzzing on it for years etc). Primarily AISLE , Zeropath and OpenAI’s Codex Security have been used to scrutinize the code with AI. These tools and the analyses they have done have triggered somewhere between two and three hundred bugfixes merged in curl through-out the recent 8-10 months or so. A bunch of the findings these AI tools reported were confirmed vulnerabilities and have been published as CVEs. Probably a dozen or more. Nowadays we also use tools like GitHub’s Copilot and Augment code to review pull requests, and their remarks and complaints help us to land better code and avoid merging new bugs. I mean, we still merge bugs of course but the PR review bots regularly highlight issues that we fix: our merges would be worse without them. The AI reviews are used in addition to the human reviews. They help us, they don’t replace us. We also see a high volume of high quality security reports flooding in : security researchers now use AI extensively and effectively. Security is a top priority for us in the curl project. We follow every guideline and we do software engineering properly, to reduce the number of flaws in code. Scanning for flaws is just one of many steps to keep this ship safe. You need to search long and hard to find another software project that makes as much or goes further than curl, for software security. Steps involved in keeping curl secure May 6, 2026 It was with great anticipation we received the first source code analysis report generated with Mythos. Another chance for us to find areas to improve and bugs to fix. To make an even better curl. This initial scan was made on curl’s git repository and its master branch of a certain recent commit . 
It counted 178K lines of code analyzed in the src/ and lib/ subdirectories. The analysis details the several approaches and methods it used in the search, and which kinds of flaws it focused on finding. A fun note at the top of the report says: “curl is one of the most fuzzed and audited C codebases in existence (OSS-Fuzz, Coverity, CodeQL, multiple paid audits). Finding anything in the hot paths (HTTP/1, TLS, URL parsing core) is unlikely.” … and it correctly found no problems in those areas.

Completely unscientific poll on Mastodon about people’s expectations for Mythos scanning curl

The size of curl

curl is currently 176,000 lines of C code when we exclude blank lines. The source code consists of 660,000 words, which is 12% more words than the entire English edition of the novel War and Peace. On average, every single production source code line of curl has been written (and then rewritten) 4.14 times. We have polished on this. Right now, the existing production code in git master that still remains has been authored by 573 separate individuals. Over time, a total of 1,465 individuals have so far had their proposed changes merged into curl’s git repository. We have published 188 CVEs for curl up until now. curl is installed in over twenty billion instances. It runs on over 110 operating systems and 28 CPU architectures. It runs in every smart phone, tablet, car, TV, game console and server on earth. The report concluded it found five “Confirmed security vulnerabilities”. I think using the term confirmed is a little amusing when the AI says it confidently by itself. Yes, the AI thinks they are confirmed, but the curl security team has a slightly different take. Five issues felt like nothing as we had expected an extensive list. Once my curl security team fellows and I had poked at this short list for a number of hours and dug into the details, we had trimmed the list down and were left with one confirmed vulnerability.
The other four were three false positives (they highlighted shortcomings that are documented in API documentation) and a fourth that we deemed “just a bug”. The single confirmed vulnerability is going to end up as a severity low CVE, planned to be published in sync with our pending next curl release 8.21.0 in late June. The flaw is not going to make anyone gasp for breath. The details of that vulnerability will of course not be made public before then, so you need to hold out for them. The Mythos report on curl also contained a number of spotted bugs that it concluded were not vulnerabilities, much like any new code analyzer does when you run it on hundreds of thousands of lines of code. All the bugs in the report are being investigated and one by one we are fixing those that we agree with. All in all about twenty bugs that are described and explained very nicely. Barely any false positives, so I presume they have had a rather high threshold for certainty. curl is certainly getting better thanks to this report, but counted by the volume of issues found, all the previous AI tools we have used have resulted in more bugfixes. This is only natural of course since the first tools we ran had many more and easier bugs to find. As we have fixed issues along the way, finding new ones is slowly becoming harder. Additionally, a bug can be small or big so it’s not always fair to just compare numbers. My personal conclusion can however not end up with anything else than that the big hype around this model so far was primarily marketing. I see no evidence that this setup finds issues to any particular higher or more advanced degree than the other tools have done before Mythos. Maybe this model is a little bit better, but even if it is, it is not better to a degree that seems to make a significant dent in code analysis. This is just one source code repository and maybe it is much better on other things. I can only tell and comment on what it found here.
But allow me to highlight and reiterate what I have said before: AI powered code analyzers are significantly better at finding security flaws and mistakes in source code than any traditional code analyzers did in the past. All modern AI models are good at this now. Anyone with time and some experimental spirits can find security problems now. The high quality chaos is real. Any project that has not scanned their source code with AI powered tooling will likely find huge number of flaws, bugs and possible vulnerabilities with this new generation of tools. Mythos will, and so will many of the others. Not using AI code analyzers in your project means that you leave adversaries and attackers time and opportunity to find and exploit the flaws you don’t find. Zero memory-safety vulnerabilities found. Methodology note: this review is hand-driven analysis using LLM subagents for parallel file reads, with every candidate finding re-verified by direct source inspection in the main session before being recorded. The CVE to variant-hunt mapping was built from curl’s own vuln.json. No automated SAST tooling was used. This outcome is consistent with curl’s status as one of the most heavily fuzzed and audited C codebases. The defensive infrastructure (capped dynbufs everywhere, with explicit max on every numeric parse, overflow guard, CURL_PRINTF format-string enforcement, per-protocol response-size caps, pingpong 64KB line cap) systematically closes the bug classes that would normally be productive in a codebase this size. Coverage now includes: all minor protocols, all file parsers, all TLS backends’ verify paths, http/1/2/3, ftp full depth, mprintf, x509asn1, doh, all auth mechanisms, content encoding, connection reuse, session cache, CLI tool, platform-specific code, and CI/build supply chain. It should be noted that the AI tools find the usual and established kind of errors we already know about. It just finds new instances of them. 
We have not seen any AI so far report a vulnerability that would somehow be of a novel kind or something totally new. They do not reinvent the field in that way, but they do dig up more issues than any other tools did before. These were absolutely not the last bugs to find or report. Just while I was writing the drafts for this blog post we have received more reports from security researchers about suspected problems. The AI tools will improve further and the researchers can find new and different ways to prompt the existing AIs to make them find more. We have not reached the end of this yet. I hope we can keep getting more curl scans done with Mythos and other AIs, over and over until they truly stop finding new problems. Thanks to Anthropic and Alpha Omega for providing the model, the tools and doing the scan for us. Thanks also to the individual who did the scan for us. Much appreciated! Top image by Jin Kim from Pixabay Thanks for flying curl. It’s never dull. They can spot when the comment says something about the code and then conclude that the code does not work as the comment says. It can check code for platforms and configurations we otherwise cannot run analyzers for It “knows” details about 3rd party libraries and their APIs so it can detect abuse or bad assumptions. It “knows” details about protocols curl implements and can question details in the code that seem to violate or contradict protocol specifications They are typically good at summarizing and explaining the flaw, something which can be rather tedious and difficult with old style analyzers. They can often generate and offer a patch for its found issue (even if the patch usually is not a 100% fix).

Blargh 3 days ago

Quantum safe amateur radio secure shell

I’ve previously pointed out that the AX.25 implementation in the kernel is pretty poor. It’s not really being maintained, and even when it gets fixes after I reported a problem, with people running LTS OSs it can take like 5 years before the fix actually reaches users, if ever. So when writing applications, you still have to work around kernel bugs from a decade ago. This makes it kind of pointless to upstream patches. The exception is security patches, and reading between the lines of why the AX.25 code is now being removed from the kernel, it sounds like maybe some LLM (like the looming “Mythos” and the related Glasswing) may have found some severe problems. But even if there aren’t any known security problems yet, having code is now more of a liability than ever. Code needs to be removed, or taken responsibility for. (tangent about ffmpeg at the bottom of this post) With the kernel code removed, say goodbye to the old walkthrough. Well, not “new”, per se, but “replacement”. With the socket based API about to be gone, we need some other way for applications to send packets and manage connections. For sending raw packets to and from the modem there’s KISS. I have no real complaints about it. Not much to get wrong about sending frames. It’s implemented by most modems, like the software modem Direwolf, and by some radios like the Kenwood TH-D75, so it’s not going anywhere. For connected mode (streams of in order data, like with TCP) the biggest contender seems to be AGW. Direwolf implements it, and I’ve made a messy implementation of an AGW client in Rust. The Rust API works, as we’ll see, but the code needs some refactoring and cleanup due to it being written exploratorily while I was deciding what it should even do, and how. The AGW protocol is not super amazing, but it gets the job done. One can build a connection API on top of it, as I have, and never have to think about the AGW protocol ever again.
There’s another protocol called RHP, specified here and here. It came out of the XRouter project. Since XRouter is closed source, I have a strong aversion to it. It seems both counter to how I see amateur radio, and anachronistic, for it to be closed source. It’s bad enough that VARA and Winlink are closed source. And people are definitely working on replacing VARA with various other modes because of it. tl;dr: I’m going with AGW for now. If someone writes a Rust crate for RHP exposing a compatible API, I certainly wouldn’t mind adding that dependency to optionally use. I have not yet implemented AGW (or RHP) in my own AX.25 stack, but I plan to. For now that means I’ll use Direwolf. My previous axsh implementation, since deleted, had some problems: So with everything but terminal management needing a rewrite, this is a reason to rewrite the whole thing. Non-requirement: Encrypt. This would violate the amateur radio license. And then, why not just use SSH? If you have an AGW server, such as Direwolf, then it’s easy to run axsh. Just start a server: Then log in: Then wait like 30-40 seconds for the handshake to complete. The reason for the wait is the large ML-DSA signatures used in the handshake. It can’t be the same Direwolf instance, since Direwolf only shuffles packets between the radio and AGW clients, not from one AGW client to another. In my case I had one Direwolf connected to an ICom 9700, and another to a Baofeng UV5R using an AIOC (all in one cable). AIOC is highly recommended for experimentation over the air. So yeah my test is between just about the cheapest VHF/UHF radio that exists, and maybe the most expensive one. In addition to running : With KISS providing packet support (and AGW providing a higher level API on top, if preferred), why not just run TCP/IP, and let the very stable OS TCP implementation take care of everything? TCP is definitely more modern, stable, and maintained, but it doesn’t scale down to slow speeds very well.
A TCP+IPv4 header is at least 40 bytes, and if you don’t want to be some sort of caveman, IPv6 is another 20 bytes. At 1200bps that would be 267-400ms overhead for every packet 1 . Checking a random TCP data packet on my laptop I see that with TCP options TCP/IPv4 is actually 52 bytes, or 350ms. Counting the air time (milliseconds, not just bytes) makes this overhead problem more obvious. And because of amateur radio license reasons TCP would still need to identify the callsign, you probably have to add 17 bytes (113ms) as a surrounding header. That leaves TCP with 69 or 89 bytes overhead per packet, meaning 460ms or 593ms. And since you don’t want to tie up the RF channel for too long (only for the whole packet to be dropped due to interference), you won’t want to send packets that are too large. Of course it’s 4x as slow if you want to do something like Bell 103 on HF. AX.25 connected mode takes that down to 19 bytes (126ms) overhead (if using Mod 128 mode) per data packet. Because of the AX.25 segmenter, for bulk data TCP is not as bad as it may have sounded. For a 1500 byte TCP segment, fitting in just under 8 200 byte AX.25 frames (totalling bytes of overhead), this means 1367ms overhead instead of plain AX.25 (at bytes) 1013ms. A 1500 byte payload takes 10 seconds to send, so that’s an overhead of 13.7% instead of 10.1%. But for interactive use cases, worst case a single payload packet, it’s 467ms vs 133ms. And that’s only counting the data frames, not the acknowledgments. A TCP ACK is at a minimum bytes, or 380ms. An AX.25 RR is 18-19 bytes, or 120-127ms. That makes TCP about three times less efficient, compared to AX.25. A bigger problem with TCP, especially untweaked, is resend timers and window sizes. At 1200bps you don’t actually want too big a window size, since you don’t want to tie up the RF channel for several minutes if the other end has gone away. So a bunch of airtime tweaks are needed. And at best you’ll end up with the numbers above. 
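The airtime figures above all follow from one formula: bytes times 8 bits, divided by the link speed. Here is a small sketch that reproduces the ones whose byte counts are stated in the text. The function and constant names are mine, and like the post's own figures it ignores bit stuffing.

```python
def airtime_ms(num_bytes, bps=1200):
    """Milliseconds on the air for num_bytes at bps, ignoring bit stuffing."""
    return num_bytes * 8 * 1000 / bps


# Per-packet overhead in bytes, as quoted in the text above.
TCP_IPV4 = 40            # minimal TCP + IPv4 headers
TCP_IPV6 = 60            # minimal TCP + IPv6 headers
TCP_IPV4_OPTIONS = 52    # TCP/IPv4 with the TCP options seen on a laptop
CALLSIGN_HEADER = 17     # surrounding header identifying the callsign
AX25_CONNECTED = 19      # AX.25 connected mode, Mod 128

overheads = {
    "TCP/IPv4, minimal": TCP_IPV4,
    "TCP/IPv6, minimal": TCP_IPV6,
    "TCP/IPv4 + options + callsign": TCP_IPV4_OPTIONS + CALLSIGN_HEADER,
    "TCP/IPv6 + options + callsign": TCP_IPV4_OPTIONS + 20 + CALLSIGN_HEADER,
    "AX.25 connected mode": AX25_CONNECTED,
}

for label, size in overheads.items():
    print(f"{label}: {size} bytes = {airtime_ms(size):.0f} ms at 1200 bps")

# Bulk case: a 1500 byte payload split over 8 AX.25 frames.
payload_ms = airtime_ms(1500)                  # time to send the payload itself
ax25_bulk_ms = airtime_ms(8 * AX25_CONNECTED)  # header overhead of 8 frames
print(f"plain AX.25 bulk overhead: {ax25_bulk_ms:.0f} ms "
      f"({100 * ax25_bulk_ms / payload_ms:.1f}% of payload time)")
```

Passing `bps=300` to `airtime_ms` gives the Bell 103 case, which is four times slower across the board.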
Maybe you could tweak TCP to be more friendly to lower speeds, and find the other overhead acceptable. If so, then you’ll be happy to hear that axsh supports running on TCP as well. As for QUIC: well first, it inherits the same problems from TCP/IP. Sure, the UDP header is smaller than the TCP header, but then on top of that there’s the QUIC header. The second problem is that QUIC is meant to be encrypted. Ripping out encryption, while staying secure, seems more dangerous than keeping it simple and just working from the requirements. Probably the whole handshake would have to be redesigned. AX.25 being removed from the Linux kernel reminds me of an LLM finding that bug in ffmpeg, causing all that drama. I have no dog in this fight, but in my opinion ffmpeg is in the wrong here. Their argument seems to be all about how this particular encoder is rarely used, is just a hobby project, etc. Ok, but it’s in your code base. Even if disabled by default, why would you want to ship a security footgun? Maybe some hobbyists out there build ffmpeg with all encoders enabled. Do you want them to be vulnerable to someone’s virus? So Google should either keep quiet, or give a patch? Well, keeping quiet because the codec is rarely used is not really an option. That’s borderline negligent and morally culpable, for when someone eventually gets hacked. So Google “should” always provide a patch in these cases? Perhaps, depending on the meaning of the word “should”. Google is rich, so “should” be morally forced to contribute to your software, just because Google (presumably, via YouTube) is a heavy user of ffmpeg? Well, that just sounds like the (non-)problem with open source software (or free software) in general. The license permits use and profit without contribution. If you wanted a tithe then you should have put that in the license. Sounds like you want everyone to be free only to do what you want. That’s not how that works. This is also why I don’t like the AGPL license.
It’s not free software if it binds me in your serfdom. Actually, it’s a tiny bit more, because of the occasional bit stuffing ↩ it was implemented in C++, and not only do I prefer Rust, how could I even call something written in C++ “secure”? (a blog post for another day) used the kernel API, so that needs rewriting, used , which proved to be a bit “weird” when interoperating with some other APIs, and used crypto primitives vulnerable to quantum computers. Don’t use kernel AX.25 sockets: this means use AGW. Also work on TCP (mainly for debugging): this means using an internal framing protocol. Be quantum safe: use ML-DSA+ed25519 dual signed for authentication of server and client. Be efficient: this means don’t use ML-DSA for per packet signatures (they are huge), at the cost of some quantum safety (see the README).

Susam Pal 5 days ago

I Will Not Add Query Strings to Your URLs

Last evening, a short blog post appeared in my feed reader that felt as if it spoke directly to me. It is Chris Morgan's excellent post I've banned query strings . Chris is someone whose Internet comments I have been reading for about half a decade now. I first stumbled upon his comments on Hacker News, where he left very detailed feedback on a small collection of boilerplate CSS rules I had shared there. I am by no means a web developer. I have spent most of my professional life doing systems programming in C and C++. However, developing websites and writing small HTML tools has been a long-time hobby for me. I have learnt most of my web development skills as a hobbyist by studying what other people do: first by viewing the source of websites I liked in the early 2000s, and later by occasionally getting possessed by the urge to implement a new game or tool and searching MDN Web Docs to learn whatever I needed to make it work. One problem with learning a skill this way is that you sometimes pick up habits and practices that are fashionable but not necessarily optimal or correct. So it was really valuable to me when Chris commented on my collection of boilerplate CSS rules. It helped me improve my CSS a lot. In fact, a few of the lessons from his comment have really stuck with me; I keep them in mind whenever I make a hobby HTML project: always retain underlines in links and retain purple for visited links. I have been following Chris's posts and comments on web-related topics since then. He often posts great feedback on web-related projects. Whenever I come across one, I make sure to read them carefully, even when the project isn't mine. I always end up learning something nice and useful from his comments. Here is one such recent example from the Lobsters story Adding author context to RSS . A couple of months ago, I created a new project called Wander Console . 
It is a small, decentralised, self-hosted web console that lets visitors to your website explore interesting websites and pages recommended by a community of independent personal website owners. For example, my console is here: susam.net/wander/ . If you click the 'Wander' button there, the tool loads a random personal web page recommended by the Wander community. The tool consists of one HTML file that implements the console and one JavaScript file where the website owner defines a list of neighbouring consoles along with a list of web pages they recommend. If you copy these two files to your web server, you instantly have a Wander console live on the Web. You don't need any server-side logic or server-side software beyond a basic web server to run Wander Console. You can even host it in constrained environments like Codeberg Pages or GitHub Pages. When you click the 'Wander' button, the console connects to other remote consoles, fetches web page recommendations, picks one randomly and loads it in your web browser. It is a bit like the now defunct StumbleUpon but it is completely decentralised. It is also a bit like web rings except that the community network is not restricted to being a cycle; it is a graph that can take any shape. There are currently over 50 websites hosting this tool. Together, they recommend over 1500 web pages. You can find a recent snapshot of the list of known consoles and the pages they recommend at susam.codeberg.page/wcn/ . To learn more about this tool or to set it up on your website, please see codeberg.org/susam/wander . In case you were wondering why I suddenly plugged my project into this post in the previous section, it is because I recently added a dubious feature to that project that I myself was not entirely convinced about. That misfeature is relevant to this post. In version 0.4.0 of Wander Console, I added support for a query parameter while loading web pages. 
For example, if you encountered midnight.pub while using the console at susam.net/wander/ , the console loaded the page using the following URL: This allowed the owner of the recommended website to see, via their access logs, that the visit originated from a Wander Console. Chris's recent blog post is critical of features like this. He writes: I don't like people adding tracking stuff to URLs. Still less do I like people adding tracking stuff to my URLs. ? Did I ask? If I wanted to know I'd look at the header; and if it isn't there, it's probably for a good reason. You abuse your users by adding that to the link. I mentioned earlier that I was not entirely convinced that adding a referral query string was a good thing to do. Why did I add it anyway? I succumbed to popular demand. Let me briefly describe my frame of mind when I considered and implemented that feature. When I first saw the feature request on Codeberg, my initial reaction was reluctance. I wasn't convinced it was a good feature. But I was too busy with some ongoing algebraic graph theory research, another recent hobby, with a looming deadline, so I didn't have a lot of time to think about it clearly. In fact, everything about Wander Console has been made in very little time during the short breaks I used to take from my research. I made the first version of the console in about one and a half hours one early morning when my brain was too tired to read more algebraic graph theory literature and I really needed a break. During another such break, I revisited that feature request and, despite my reservations, decided to implement it anyway. During yet another such break, I am writing this post. Normally, I don't like adding too many new features to my little projects. I want them to have a limited scope. I also want them to become stable over time. After a project has fulfilled some essential requirements I had, I just want to call it feature complete and never add another feature to it again. 
I'll fix bugs, of course. But I don't like to keep adding new features endlessly. That's my style of maintaining my hobby projects. So it should have been very easy for me to ignore the feature request for adding a referral query string to URLs loaded by the console tool. But I think a tired body and mind, worn down by long and intense research work, took a toll on me. Although my gut feeling was telling me that it was not a good feature, I couldn't articulate to myself exactly why. So I implemented the referral query string feature anyway. While doing so, I added an opt-out mechanism to the configuration, so that if someone else didn't like the feature, they could disable it for themselves. This was another mistake. A questionable feature like this should be implemented as an opt-in feature, not an opt-out feature, if implemented at all. The fact that I didn't have a lot of time to reason through the implications of this feature meant that I just went ahead and implemented it without thinking about it critically. As the famous quote from Jurassic Park goes: It soon turned out that my gut feeling was correct. After I implemented that feature, a page from one of my favourite websites refused to load in the console. To illustrate the problem, here are a few similar but slightly different URLs for that page: The first and second URLs load fine, but the third URL returns an HTTP 404 error page. The website uses the query string to determine which one of its several font collections to show. So when we add an arbitrary query string to the URL, the website tries to interpret it as a font collection identifier and the page fails to load. That is why, when my tool added the query parameter to the first URL, the page failed to load. Later, with a little time to breathe and some hindsight, I could articulate why adding referral query strings to a working URL was such a bad idea. Altering a URL gives you a new URL. 
The new URL could point to a completely different resource, or to no resource at all, even if the alteration is as small as adding a seemingly harmless query string. By adding the referral query string, I had effectively broken a working URL from a website I am very fond of. It is also worth asking whether an HTML tool should concern itself with referral query strings at all when web browsers already have a mechanism for this: the HTTP Referer header, governed by Referrer-Policy . That policy can be set at the server level, the document level or even on individual links. The Web standards already provide deliberate controls to decide how much referrer information should be sent. Appending referral query strings to URLs bypasses those controls. It moves a privacy and attribution concern out of the referrer mechanism and embeds it into the destination URL instead. I don't think an HTML tool should do that. There is also a moral question here about whether it is okay to modify a given URL on behalf of the user in order to insert a referral query string into it. I think it isn't. In the end, I decided to remove the referral query string feature from Wander Console. One might wonder why I couldn't simply leave the feature in as an opt-in. Well, the answer is that once I had deemed the feature misguided, I no longer wanted it to be part of my software in any form. The project is still new and we are still in the days of 0.x releases, so if there is a good time to remove features, this is it. But my ongoing research work left me with no time to do it. Finally, when the post I've banned query strings appeared in my feed reader last evening, it nudged me just enough to take a little time away from my academic hobby and devote it to removing that ill-considered feature. The feature is now gone. See commit b26d77c for details. The latest release, version 0.6.0, does not have it anymore. This is a lesson I'll remember for any new hobby projects I happen to make in the future. 
If I ever load URLs again, I'll load them exactly as the website's author intended. I will never add query strings to your URLs.

The three example URLs referred to above:

https://int10h.org/oldschool-pc-fonts/fontlist/
https://int10h.org/oldschool-pc-fonts/fontlist/?2
https://int10h.org/oldschool-pc-fonts/fontlist/?foo
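The point that altering a URL gives you a new URL can be checked with a few lines of Python. This is only a minimal sketch: the `append_query` helper and the `via=example` parameter are made up for illustration and are not what Wander Console actually used. It shows how a site that reads its whole query string as an identifier ends up seeing a different identifier once something is appended.

```python
from urllib.parse import urlsplit, urlunsplit


def append_query(url: str, extra: str) -> str:
    """Naively append `extra` to the query string of `url` (hypothetical helper)."""
    parts = urlsplit(url)
    query = f"{parts.query}&{extra}" if parts.query else extra
    return urlunsplit(parts._replace(query=query))


original = "https://int10h.org/oldschool-pc-fonts/fontlist/?2"
# "via=example" is an assumed, illustrative referral parameter.
altered = append_query(original, "via=example")

# A server that treats the entire query string as an identifier
# receives "2" for the original URL but "2&via=example" for the altered one.
print(urlsplit(original).query)
print(urlsplit(altered).query)
print(original == altered)
```

Whether the altered URL still resolves depends entirely on how the destination server interprets its query string, which the linking tool cannot know in advance.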

Stone Tools 6 days ago

PipeDream on the Acorn Archimedes

During the "throw everything at the wall and see what sticks" years of home computing, up to around 1995, a lot was thrown and a lot failed to stick. Sometimes clumps would form that appeared to have the combined friction necessary to maintain wall grip, each holding the other up. But, like Mitch Hedberg's observation of belts and belt loops, it was difficult to discern who was helping who stick to what. Take, for example, our focus today. We have a completely novel CPU, built by a tiny team of engineers who had never designed a processor before, running a bespoke operating system squeezed out in a rush to meet the shipping deadline of a computer that wanted to carry on the legacy of a system beloved by British schoolchildren, hosting a productivity suite that completely rethought what the term "productivity suite" even meant. Together, they formed a complete computing dead-end. Yet separately, they each achieved life beyond expectations, given their shaky beginnings. Let's start with the hardware, Acorn Computers Ltd.'s follow-up to the famous 8-bit BBC Micro, the Archimedes. Feeling the 16-bit processors of the day didn't deliver enough bang-for-the-quid, they began an investigation into 32-bit processor options. After reading a U.C. Berkeley paper extolling the virtues of the RISC architecture, and seeing firsthand the ease with which chips could be designed, in 1983 Acorn launched the Acorn RISC Machine project to develop the 32-bit brain of their next system. The fruit of that labor, the ARM processor, defined the Archimedes line. Try as they might, Acorn could never crack the home market the way they did education. Still, those ARM CPUs had longevity well beyond the life of the company that commissioned them. Your smartphone likely has ARM in it right now, and Apple's entire current hardware ecosystem is built on its spec. That powerful hardware needed a preemptive multitasking operating system that befit its computing prowess.
That was to be ARX , whose troubled development missed the product launch window. In the meantime, so the computer could have something driving it at launch, a stop-gap operating system called Arthur was shipped. It was similar to Acorn's previous BBC Micro MOS (Machine Operating System), with a graphical layer grafted on top; hit F12 and that text interface will peek out from behind the curtain. Over time it was decided that Arthur was doing a bang-up job and ARX was cancelled. Thus was born RISC OS, a cooperative multitasking WIMP (windows, icons, menu, pointer) with possibly the first application "dock" on a home computer. Its mandatory three-button mouse summons an application's current context menu at the pointer location; there are no menu bars whatsoever. Drag-and-drop is embraced as a central file management metaphor, even to save documents. On top of all that, it was the first to offer scalable, anti-aliased font rendering, even if its fonts were a little "off brand." On top of this unique foundation, we have PipeDream . Developer Mark Colton was convinced that the boundaries between word processor, spreadsheet, and database were artificial and could be eliminated. A document should be able to do any of those functions at any time, anywhere on the page, he posited. One might think, "Oh, like Google Sheets ." but PipeDream handles word processing more elegantly. Another might think, "Oh, like Apple Pages " but the spreadsheet and database functions are more robust in PipeDream . This particular balance of the three productivity functions feels unique amongst even its modern peers. Does a productivity suite work better when it's just a single app? Did Colton successfully execute his vision? And where is the Homerton documentary we deserve? (I didn't know Ghost blogging platform forces images to 2000px max; I've revised my design workflow to mitigate this in the future. 
To make amends for this timeline's illegibility at 2000px, please accept this PDF version)

Testing Rig

- RPCEmu v371 on Windows 11
- RISC OS v3.7
- 1024 x 768 15-bit color
- 64MB RAM
- PipeDream v4.13

Let's Get to Work

My process when first examining unfamiliar systems is as follows: I do that across a variety of emulators to see which gives me the least grief; I need to be sure I can trust a basic productivity loop. I usually try to give it a go without research, to see how far I can get on pure skillz (with a Z).

It's unusual to sit down at what appears to be a computer I understand and be baffled every step of the way. I've heard this system described as "elegant" and "easy to learn." This has me questioning if maybe I'm actually a very dumb person because my impression is "uncomfortable." You know that modern horror story, aka "creepypasta", The Backrooms ? It's a hidden world that co-exists with our own, which can be entered only by clipping through a seam of reality which separates the two. In there, buzzing fluorescents light an infinite maze of featureless, yellow-wallpapered office-style floor layouts. If one were to find a running computer there, I suspect RISC OS would drive it. It's just common enough in its GUI metaphors to feel familiar, and just off-kilter enough to turn that familiarity against you.

Liam Proven wrote in The Register , "You will find it very disorienting, especially if all you know is post-1990s OSes." My dude, I've been computing since the 1970s and I find it disorienting. Nothing is unlearnable (I'm dumb, not incompetent), but I genuinely had to work through its manual to acclimate myself. To be clear, I enjoyed the thrill of venturing into the unknown. After all, one of the goals of this blog is to investigate the less-trodden paths in software history. Still, there are times when I feel RISC OS is "having me on." (trying to ingratiate myself with British readers in today's post)

I'll start with the three-button mouse.
From left to right the buttons are "Select", "Menu", and "Adjust." After weeks working with the system, I still can't figure out what problem the "Adjust" button solves. It's semi-analogous to on modern systems, as when clicking to add/remove elements to/from a set of selected items. Then, sometimes it does something unexpected like, "drag a window by its title bar without bringing that window to the front." Other times it is baffling. a file icon to a new folder location doesn't move the file to the new location. It copies the file. If you want to move the file, you must . Why are we "SHIFT" dragging anything when we have a perfectly good "Adjust" button? Sometimes the "Adjust" button does "opposite" actions. Click a "down" scroll arrow with "Adjust" and it will to scroll up instead. Is that an "adjustment?" What does it even mean, to "Adjust" a mouse click? It seems like it could mean anything , and that's kind of my point. It's unguessable and unintuitive. An interesting UI element (which predates NeXT and Windows 95) is the Icon Tray, an important tool inexplicably not described at all in the RISC OS 3 manual. Situated along the bottom of the screen, currently running applications and directory icons sit on a little shelf. Double-click "Select" on an application icon to launch it and... nothing. Its icon displays in the Icon Tray, and that's it. We must now Single-click "Select" on that icon to actually bring the application to the forefront and activate it. I don't know what that's all about, but that's how it works. Menus are fascinating in both the positive and negative meanings of the word. There are no menus on screen whatsoever, they are only made visible by the middle "Menu" mouse button. "Menu" clicking opens a given menu at the current mouse pointer location. Icons in the Icon Tray can be "Menu" clicked to get application-level menus, like "Make a new document." Within a document, "Menu" click will give us document-level options. 
Conceptually, I like the "Menu" button a lot. Within a menu, any choices which open dialog boxes or control panels tend to open in-menu. It's kind of cool, being able to type, or flip switches and radio buttons, directly inside the menu itself, rather than popping up a modal window. However, it is jarring to have large panels suddenly lunge out like a xenomorph's inner jaws when scrolling through menus. These can obscure the root menu, depending on screen position.

The last point to get our collective heads around is file saving. When saving a new document, simply typing in a file name is not sufficient. Save dialog boxes expect and require the full path to your save destination; no assumptions or default folder locations are provided. You can manually type in the full path to your desired save location like this: While you type, the system will not assist you in navigating the directory structure; no autocompletion here. You must know the path by heart. The other option, as described in manuals, is to drag-and-drop your document to its save location. Drag-and-drop really seems to be the RISC OS idiomatic way to manipulate files. In a Save dialog box there is a little icon for the application. It looks like decoration, but it physically represents your document. Type a name into the text field, then drag that icon to your desired save folder.

I don't want to get bogged down enumerating RISC OS's idiosyncrasies, but a few more things need mentioning. There is a kind of "programmer's art" ugliness to the user interface; those folder icons are terrible. There are graphical glitches, as when scrolling a window too quickly (though moving windows around shows full contents, which wasn't typical during that period). Everything you set up to customize the system, like desktop icons, window positions, desktop resolution, and other settings is reset every boot unless you manually tell the system to save the current state as the "boot file."
The list goes on like that. Sheesh, what a journey just to understand the basics. I expect that kind of learning curve for the text-based systems, as those DOS-like commands are unknown to me. For a GUI system to throw this "spanner in the works" (continuing my pandering) is unexpected, but a fun challenge. I can't feel myself growing to love it, but the initial feeling of discombobulation is receding. A spreadsheet is an ordered matrix of cells, each of which can hold text or math. Cells with text are typically used as labels for columns and rows of numbers, and the math cells do the work of calculating relationships between those numbers. It's all very simple. No, wait, I mean it's "easy-peasy." (commitment to the bit) Lotus 1-2-3 felt "columns and rows" could also be useful for textual data. They said the line between spreadsheets and databases is pretty fuzzy, and even today spreadsheets are used to hold and manipulate simple databases. Then racecar driver Mark Colton pierced the veil entirely. It wasn't just spreadsheets and databases that had a fuzzy separation. If we can type arbitrary text into a cell in a spreadsheet, why couldn't we type an entire book? What if all applications were really just one application, in the end? He fired his first shot at uniting everything in View Professional . This was released as PipeDream on the Cambridge Z88, a portable Z80 machine by Sir Clive Sinclair's Cambridge Computer. Built into the ROM itself, it was insta-boot, insta-launch right into a multi-purpose integrated document suite. Jerry Pournelle, in BYTE Magazine 's February 1989 issue, was moderately enamored with the hardware, but PipeDream was, "disappointingly hard to use." With Acorn evolving their BBC Micro via the Archimedes, Colton continued to support their hardware line. In interviews, he seemed to really be leaning toward Windows for the future of his company. 
However, since he switched development to C and there was a C compiler for the Archimedes, he said it wasn't hard to provide his product to the Acorn crowd. Running on Arthur, the precursor to RISC OS, he embraced and extended the "one document, many forms" approach. Much like today's Google Sheets, we can add arbitrarily long sections of text, insert images, set up database information, perform spreadsheet calculations, run spellcheck, and generate inline graphs. However, try typing a chapter of a book into Google Sheets if you want to drive yourself "mental." (there's no stopping me) In PipeDream , that's frictionless (within a certain definition of "friction"). Like RISC OS itself, PipeDream also requires certain shifts in thinking to not lose a finger to its sharp edges. I suppose that when a developer offers a truly new paradigm, it is fair to ask users to meet it halfway. I'm not convinced the advertising (see "Historical Record" at the end) gave customers a full understanding of how drastic that shift was. "Menu" click the Icon Tray icon (i.e. the application-level menu) for PipeDream to start up a new "Text" file and begin typing into cell A1. You'll find that text overflows, across cell boundaries, until it hits the "row wrap marker" seen in the rightmost column header (shown as a "down arrow" icon). Every line of text is its own row, in spreadsheet terms. As you type, PipeDream fills the current row, then silently inserts a new row to catch overflow. Until a paragraph break, these rows are internally associated as a logical unit. Edits which alter or disrupt text flow across rows within a paragraph are not reflected immediately in the UI. Or maybe they are? It's hard to tell with the graphic glitches in the screen redraw, a constant source of frustration while working on this article. PipeDream concedes the reflow point itself. 
When in doubt about the current visual structure of your text, , a manual action, will force PipeDream to recalculate text wrapping and line spacing. This can be mitigated a bit through a hidden toggle in the "Options" screen, the confusingly named "Insert on Return." This reduces the need to force a manual reflow, but can still leave visual chaos.

I've altered the text flow and initiated a recalculation of the lines. It does the work, but visually shows no change until I trigger a graphics refresh in some way. Selecting the text works, but then leaves its own graphic artifacts behind. I've "gone nutter!" (yes, these are in the captions as well!)

Interestingly, I saw similar redraw issues in View Professional on the BBC Micro. It would appear this is, to some extent, part of the software's DNA. Honestly, this is all "a bit of a shambles." (the hits keep coming)

Have you ever wanted a word processor that won't indent paragraphs? PipeDream being a chimera, navigation idioms are forced to choose which parent they love most. An examination of the key demonstrates this. In a word processor, we usually have a horizontal page ruler with tab stops. Tab over to a tab stop and type to align text at that indentation point on the page. In a spreadsheet, navigates us to the next cell to the right. In PipeDream , the spreadsheet idiom wins TAB's love. In a text cell, sets an invisible indicator at paragraph start which forces every subsequent line of that paragraph to begin at that same column. For example, by default every line of text is added to column A, the leftmost. If we to column B, the text will start there but when it wraps to the next line, that will also begin in column B. "Indentation" is at the paragraph level, not the line level. How do we indent the first line of a paragraph? The manual has a solution.
In looking back through the history of Colton's software on the Acorn line, I found this note in a review of View 2.1, his standalone word processor for the BBC Micro. "Why is there no numerical information on the rulers or cursors to assist formatting?" asked Acorn User , January 1985. It seems Colton had it in for rulers for a decade, and to my thinking this points to a disconnect between what a programmer thinks users need, versus what users actually need. A stubborn rejection of norms doesn't always mean we're on the right track. We can use the cell-based layout engine of the program to pull off a fun party trick. Under "Options" there is a toggle between Row and Column text wrap. "Row" behaves like a typical word processor. "Column" lets us divide the page into columns, like a newspaper. Tab between columns and the column width will be respected by the word wrap. Kind of cool, and could be useful in a "I need to make a newsletter, stat!" pinch. Like a spreadsheet, column widths are document-wide, so no mix-and-match. Someone very clever with the tools could probably coax complex layouts out of it, but that would require an ungodly amount of pre-planning, design, and patience before starting a document. You really have to try to get it right the first time, because I don't find PipeDream particularly adept at handling large structural changes after the fact. The column-based formatting gets frustrating, but in other ways the word processing is "bog-standard." (How many will I squeeze in? Place your bets!) We have a built-in spell check, user-definable dictionary, word count, text alignment, font choices, and an anagram/subgram maker. Bank Street Writer Plus had an anagram maker as well. Why was that such a thing back then? Have I forgotten some fad of the 80s and 90s? That's all fine and dandy, but I'll tell you what isn't: there's no simple cut/copy/paste, at least not as a modern audience may understand those tools. 
In the document, we are restricted to cell-level selection, meaning I can't select individual words inside cell A1. I can only select the entire cell A1, which in PipeDream means an entire line of text. We can ask PipeDream to edit a cell in its own window, where it pops out for surgical editing. "Edit Formula in Window" hijacks the spreadsheet formula editor in order to get character-level selection control. In this pop-out window, we can highlight individual words and do typical cut/copy/paste actions. Notice, though, we're still restricted to only the text within the cell, which means only that line (row) of text. It's highly likely any given row will contain the tail-end of the previous sentence and the first part of the next sentence. If we want to cut out a specific sentence which doesn't align neatly to the row structure, there is no way to do so. I will repeat that. There is no way to cut/copy/paste an arbitrary string of characters. Now I feel PipeDream's vision working against itself for anything but simple correspondence. Remember, this is version 4 of PipeDream, Colton's fifth software release to pursue this unified application dream, and this is where we're at. I can't imagine writing anything substantial within these frustrating limitations. As a spreadsheet, PipeDream performs far more admirably, even if certain conventions have been eschewed in favor of its new vision. Hey, if you're gonna quirk it up, might as well go for broke. Unlike its spreadsheet ancestors, there is no menu, nor is there a simple way to tell PipeDream that we want to enter a formula into a cell, as with to denote a function call, or to indicate we want to do math. Many of Lotus 1-2-3's innovations have been utterly ignored. The global "Options" allows us to set default behavior for cell entry. Setting it to "numbers" will put us into the right context for easy formula entry, or we can click into the ever-present formula entry line at the top of the window.
Turn on the "Grid" overlay to draw cell boundaries and before you know it what was a word processing document is now a spreadsheet with "the full Monty." (TIL it doesn't mean "full-frontal nudity") The functions available to number crunchers are plentiful and robust. Trigonometric functions are a given, but its inclusion of matrix math may come as a surprise. Even complex functions like , which computes "the complex hyperbolic arc cosecant of as a complex number," are present and accounted for, so hardcore math nerds can breathe a sigh of relief. A wide number of financial functions, statistical functions, lookup tables, string manipulations, and date handling are all here. So too are flow control tools, like , , and more. There are even GUI controls available for showing error dialog boxes and prompts for user input, though those are only available from within custom functions. Yes, if you're missing a function, you can make your own. In a new worksheet, start a formula with (which can accept typed parameters) and end it with . In between, do the work. PipeDream will check syntax and accept or reject each line of your function. If accepted, it will prefix a line with In your real working worksheet, access the formula by . That file reference implies PipeDream can access data from other worksheets, and that is true. Even a cell reference in a formula can be pulled from a completely different worksheet. I find the syntax for custom functions opaque, and the manual does a poor job of explaining what is possible and how to use the tool. There are a handful of examples provided with the software installation, with bugs, that reveal secrets only upon very close inspection. For example, notice in the screenshot above that the parameters to the function are later referenced by prefix, but local variables, as set by the function are not prefixed when used in calculations. It's those subtle little things that tripped me up. The same with having the return value called . 
Or how the program has a selection of "Strings" functions, but when passing a string as a parameter its type is "Text." I stared at that syntax for a LONG TIME before finally realizing my various little misunderstandings. Customization doesn't stop there. Individual keys can be defined as shortcuts to longer string sequences, F-Keys (plain and modified) can be defined to trigger commands, and command sequences (triggered by the CTRL key) can be redefined to your liking (which risks overwriting built-in command shortcuts). You really can make PipeDream your own, though you're in for a struggle compared to Lotus 1-2-3 and the thousands of books available to help learn its principles. I found no actual books for PipeDream , just publishing announcements in old magazines. Something must exist, but the internet at large appears bereft. On the scorecard of "this amalgamation approach to productivity software is working," I'd say we're 1 and 1. The spreadsheet tools are fiddly, but robust. The word processing has me very underwhelmed. Time for the tie-breaker: databases. Using the supplied Lotus 1-2-3 conversion tool, I was able to bring in the data I originally created in CP/M dBASE II and had subsequently converted to DOS Lotus 1-2-3. Now it lives on in RISC OS PipeDream . This data has more passport stamps than Indiana Jones. Let's consider some of the basic things one might want to do with data. PipeDream beats out Lotus in sorting, giving us a five-stage, multi-row, sort with ascension. Not too shabby for the time, all things considered. Search and replace does what it "says on the tin" (in for a penny, in for a pound), and can also accept regex-like tokens and patterns. More interestingly, cells can be set up to directly perform queries on table data. There are a small handful of prefixed database functions to calculate averages, min/max, counts of things, and more. One last feature of note is how to use the query tools to extract a result into a new database. 
This is interesting as it utilizes RISC OS's drag-and-drop Save functionality in a clever way.

Note how the query for data extraction is much longer than the tiny little text field in the contextual menu can handle elegantly. This is one of those usability tradeoffs for the RISC OS way of doing things.

I was initially ready to write off the database functionality as being underwhelming, until I reminded myself of the stated goal for PipeDream . Its core proposition is that there is no difference between the various aspects of the software. The word processor is the spreadsheet is the database. We're not limited to the "database" functions when manipulating our database data. We have access to everything the program has to offer, at all times. Let's clip through the inverted UV plane separating database and spreadsheet, and see what kind of trouble we can get into. I'm thinking back to the Lotus 1-2-3 article and how database information was queried there. With a table of data, we had to use the built-in query forms, define areas on the sheet to hold query parameters, and designate another section of the sheet into which query results would display. It was an obtuse Rube Goldberg machine that I couldn't understand until I drew a diagram of the process. In PipeDream , we just write a formula, the same as if it were a spreadsheet. Let's get the average rating of all adventure games in the database published before 1985. "Bob's your uncle!" (I was hoping to work that one in) Let's mix it up a little and get the same average, but only for titles which begin with "Zork." We can use wildcards, but let's leverage PipeDream's word processing string tools. The most awesome part about this is that, like any spreadsheet formula, it updates in real time. Change the ratings, or add a new Zork game to the mix, and get the new average instantly.
The database is the spreadsheet is the database, so that calculation can then be referenced as a value for another cell's formula, perhaps adding sales tax to the average unit price. While we're at it, might as well throw in some fancier text formatting to make it look pretty. In the Lotus 1-2-3 investigation, I wanted a pie chart showing a breakdown by game categories. Lotus had a handy function which removed duplicates from lists, making it possible to extract the full list of unique game categories, which could then be used as the query parameters for generating a chart. PipeDream can't do that, but it does have other string parsing routines, variables, cross-file data referencing, and the ability to write custom functions and macros. I don't doubt it would be possible to homebrew a workaround to this missing function. In fact, let's "have a bash at it." (swish!) Note the real-time update of the chart as I modify an external database. Ultimately, I couldn't achieve an elegant solution, but I could achieve my goal. I sorted the original data by genre, then created a column that checks if the genre for each row matches the one above it. If so, it's a otherwise a . Then, I extracted all rows with in the column. Last, I did (count any items in a list), where the source list is contained in the original database document. With the documents thus linked, I get real-time graph updates when I alter the core database, thanks to external reference handling. Everything's "tickety-boo!" (I'm trusting The Independent on this one) OK, PipeDream , you're winning me over a little more now. Time to take this to its logical conclusion. We haven't yet pushed it as the multi-purpose document creation tool it promises to be. We've done a little dabbling, with text formatting and data extraction, but I want to see everything come together. I want the borders to crumble . 
The approach I'm finding to be least troublesome is to begin with a "text" document, then decorate that with spreadsheet/database elements. As I scroll, text will disappear until I trigger a redraw event in the window. (pay no attention to the content of the letter) In building that document, here's what I learned. We have a unique confluence of interesting technologies coming together to form a strangely flawed jewel. It sparkles and shines when the light hits it just right , and in those sparkles we may catch a fleeting glimpse of a world that might have been. Might have been, but wasn't . Let's see where each of the underlying technologies wound up and those in the know can feign shock with the rest of us when we learn that ARM isn't the only thing that survives to this day. We'll start with the obvious truth: ARM won. It's in everything, everywhere, all at once. If it isn't in your computer, it's in your phone, or your Newton, or your Palm Pilot, or your Canon camera, or your Nintendo DS, or your Nintendo 3DS, or your Nintendo Wii, or your Nintendo Switch, or your Nintendo Switch 2, or your Raspberry Pi, or maybe you're sidetalking on your N-Gage. Its combination of low power consumption with high performance makes it ideal for mobile devices, of which we are in abundance. But why ARM specifically? Others have swung for the RISC fences and stumbled, yet Acorn set two engineers to the task of designing their first ever microprocessor and somehow achieved a ubiquity that has remained (mostly) unchallenged. Apple/IBM/Motorola gathered their forces and developed their own RISC architecture, which debuted in Apple's Power Macintosh 6100. PowerPC doesn't mean much to a Windows/Intel crowd, but the Mac faithful remember all too well Apple's investment in that as the successor to the Motorola 68000 series. Frustrated by delays in the evolution of the chip line, Apple wound up ditching it for Intel x86 , even if they eventually rediscovered the joys of RISC. 
PowerPC went on to be adopted by a number of game consoles, notably the Nintendo Wii, Xbox 360, and PS3 simultaneously. The line continues today, and heck, Mars rovers Curiosity and Perseverance both have PPC inside. Hard to call such a history a "failure," but who outside hardcore Amiga faithful today is clamoring for a PowerPC chip? The SPARC RISC architecture, of "Sun SPARC Workstation" fame, chugged along until as late as 2017 under Oracle, which had purchased Sun in 2010. A notable achievement, in pop culture circles, is that this is the hardware Pixar's first Toy Story was rendered on . Though Oracle disbanded the design team keeping the architecture alive, the architecture itself is free and open source. There's nothing stopping an intrepid reader from carrying on the lineage, I suppose. Fujitsu, the last of the production line for the series, has abandoned SPARC for ARM. I'll be honest, I can't figure out what ARM does so much better than other attempts, like SPARC, at making a great RISC processor. Reading through the Ars Technica story , it seems to be less about the underlying tech and more about the savvy promotional work of Robin Saxby and his absolute unwillingness to lose the RISC wars. Where others were building RISC for the server-side, ARM committed themselves to the mobile side, skating to where the puck would be . Whatever the case, whatever the magic, ARM makes it available to anyone who wants it, through their licensing partnerships. Ultimately, this really seems to be what has given ARM its staying power; a low barrier to entry to quickly join in on high-performance, low-power draw, ARM fun. It's important to note that ARM doesn't make processors; they only license their IP. <<record_scratch.mp3>> OK, be that as it may, it is still substantially correct to say that IP licenses are their bread and butter. A "core license" allows a company to manufacture a specific ARM-designed CPU, a popular choice for system-on-a-chip designs. 
Alternatively, an "architectural license" permits a company to design and build its own custom CPU around the ARM instruction set. That's what Apple does with their A- and M-series chips. In recent years, ARM is feeling light competitive pressure from the RISC-V architecture. Born in the same UC Berkeley labs that birthed the original RISC design reports that inspired Acorn to take a chance on RISC, its architecture, unlike ARM, is free and open source. Consumer-level devices running on RISC-V have already started shipping. A new race has begun. Acorn's Archimedes line ultimately never sold particularly well. It's hard to nail down specific sales figures , but a 1991 Acorn shareholder report said, "Acorn is now the UK number one supplier of 32-bit RISC machines with an installed base of over 150,000 units." For context, the Amiga line had sold some 2 million units by 1991. We can't say Acorn didn't put in the effort, releasing some 13 model variations in under a decade. The general consensus seems to be that they "cost a bomb." (that's a new one on me) Schools adopted them, as a natural evolution of Acorn's prior BBC Micro installations, but at US$3,000 to $9,000 (in 2026 money) families just couldn't afford to put one in the home. In the mid-90s, Acorn dropped the Archimedes line, switching tracks to the more business-like Risc PC line, and produced a handful of systems around the StrongARM CPU. However, while the CPU spirit was willing, the motherboard flesh was weak, leaving the CPU underutilized . The lineup ended concurrently with the end of Acorn around 1998. Castle Technology tried to keep the Risc PC line going, post Acorn, but called it quits shortly thereafter, in 2003. Open-sourced in 2018, RISC OS Open keeps it running and up to date for modern RISC-based hardware platforms, especially the Raspberry Pi. Currently at v5.30 at the time of this writing, it is still a 32-bit operating system with " moonshot" aspirations of 64-bit someday. 
* checks watch* Time is ticking to pull that together before fading into 32-bit irrelevance. Did I mention how tiny this thing is? The latest version for Raspberry Pi is a 155MB download. Version 3.7, which I used for this article, downloaded as a pre-configured emulator with OS and apps pre-installed, was a mere 129MB. Even the most up-to-date pre-configured package tops out at a "massive" 1GB, apps and emulator inclusive. How big is macOS on ARM? Leading in with his View lineup of productivity apps on the BBC Micro, Mark Colton was the man with the all-in-one vision. With View Professional , he took his first stab at providing an uber-app for that 8-bit workhorse. It's primitive and clunky to use, but the spark is present. He would then expand on his ideas through the PipeDream lineup, taking it all the way to version 4.5. Every version refined the vision, but ultimately its character-based layout engine roots became a limiting factor to its growth. One rewrite later, he had a true GUI-based implementation, for both the Archimedes and Windows, in Fireworkz released in 1993. Having created standalone products Wordz and Resultz, Fireworkz combined those back into one. By mid 1995, Fireworkz Pro added in the database functionality, merging the new Recordz into the product, and that's where Colton's involvement ended. Besides asking "What even is a spreadsheet anyway?" Colton's other passion was race car driving. In August 1995, an engineering defect in the front wing of his Pilbeam M72 caused it to fold under his car while he was at top speed. He lost control, crashing headlong into a telegraph pole, and was killed. Most shockingly, both PipeDream and Fireworkz continue to be maintained to this day. Mark's father, Richard, generously open-sourced both PipeDream and Fireworkz just before his own untimely death in 2015. Fireworkz Pro, the version that includes database functionality, is not open-sourced and is still for sale . 
The PipeDream package available for installation in RISC OS package manager is not the version I'm using for this article. That is the modern update, which adds a bunch of niceties, including a GUI toolbar for formatting text, expanded spreadsheet functions, and a mind-boggling number of bug fixes. This is all maintained by lone developer Stewart Swales , someone intimately involved in the RISC OS and PipeDream history. He worked at Acorn and helped develop Arthur, the OS that became RISC OS. Later, he joined Colton Software as lead developer, working on PipeDream and Fireworkz. There's really nobody better to carry on the legacy. Where, precisely, Colton's continuation of that legacy would have gone, we can't say with certainty. However, we do have a little insight into his thinking. In an interview with Acorn User , December 1994, he said, "Over the next few years...we won’t be writing spreadsheets either; we'll be writing a totally different style of program. I expect spreadsheets, word processors and so on to be provided as part of the operating system in the future." Let me start by making it clear that I appreciate the effort. I say that with all sincerity and for everyone involved. From the machine, to the OS, to the productivity suite, all katamari'd up into a unique star. It was a lot of fun feeling like a beginner again. I had moments of true learning, shedding expectations of "how things should be" and experiencing fresh, alternate ways to approach work. I said at the beginning, the question that needs answering is, "Did Colton successfully execute his vision?" and here I must waffle. From View Professional , through five major releases of PipeDream , and two Fireworkz releases, he held fast to a very particular line of exploration. That he never wavered in his pursuit of that vision, says to me that he must have felt he had achieved his goal to some degree. In that regard, we can say he successfully executed his vision. 
As an end-user, it is hard to align myself to that vision. I get what he's after, especially when trying to make sure documents always reflect the latest data. After using PipeDream for a number of weeks, I remain unconvinced that the solution is to graft all software into one uber-application. If we follow that thinking to its logical conclusion, then why not include paint features? Why not include robust desktop publishing features? Where would it stop? Had the amalgamation of these productivity apps birthed something uniquely unachievable by other means, or unlocked some latent potential in the individual apps, I'd be very willing to adapt to this "skew-whiff" (last one, I promise!) approach to application design. As it stands, I ultimately don't see what it does that wouldn't be equally well-served, perhaps better-served, by intelligent file link management with robust publish/subscribe functionality. In fairness, a deep implementation of that would work best as an OS-level feature, and Colton could only control his own works. Paradoxically, the most frustrating aspect in removing the barriers between applications is how we wind up with a slate of new barriers forged in that alliance. Colton said of View Professional that even when the apps are combined, none should feel like a compromised version of that app. Yet, compromises are what I feel with every document I build. Is it worth giving up easy text formatting and basic cut/copy/paste for the off-chance I might need to insert a little spreadsheet table? There's an 80/20 rule being almost willfully ignored here. I love that Colton had a unique vision and stuck to it. I love that someone tried to forge a new path in productivity application design. I love that PipeDream exists, but I don't love it . Ways to improve the experience, notable deficiencies, workarounds, and notes about incorporating the software into modern workflows (if possible). 
Testing Rig
RPCEmu v371 on Windows 11
RISC OS v3.7
1024 x 768 15-bit color
PipeDream v4.13

boot the system
launch my application of interest
make a dummy document
quit the emulator entirely and reboot
load my saved document

Because rows and columns are shared throughout the document, insertions and deletions, or moving things around, creates difficult-to-resolve layout issues. If a spreadsheet sits to the right of a block of text, and we want to insert a row into only the spreadsheet part, that's not possible. Doing so will also insert an empty row into the paragraph, leaving a gap.

PipeDream has a strange concept of "global font" vs. "local font". Local fonts can't be changed until the global font is set to something other than the system font. The global font controls value cells, which cannot be styled individually. Local fonts will style a cell from wherever the cursor is currently located, and it is very easy to target a cell and style its font, but miss the first character or two, even though the entire cell is highlighted as a selection. "What will be the result of my action?" is not always crystal clear.

The controls for styling charts are difficult to understand, and messing up is hard to reverse out. I accidentally added "New Text" to the chart and it took a long time to figure out how to delete it; selecting it and hitting "delete" doesn't work. There is no way to modify the legend. There's no facility for selecting elements for inclusion/exclusion from the graph. In my case, formatting to look good on the printed page meant adding empty columns which wound up in the pie chart. This is very representative of the struggles the layout engine introduces. Making data look good in one context risks "making a shambles of it" (are these working? have I won you over?) in another.

Page layout settings are cryptic. Margins can only be set to the top and left (?!?!) and only in unspecified numeric units. 
I used the template default values, and the page wound up shifted down and to the left. Getting beautiful output is a challenge. How could I forget? There's no UNDO! Some programs, like !Draw (vector illustration) and !Edit (text editor) have undo, and others like !Paint and !PipeDream do not. Getting started with RPCEmu , using a pre-built package, was as dead simple to use as you'd imagine. I experienced no crashes of the emulator, operating system, or PipeDream . It was a very solid experience in that regard. PipeDream itself, at least the version I used, had a ton of annoying bugs and the graphical glitches were even noted in a review by Micro User , February 1992 . But emulator-wise, everything was smooth. I recommend first-time users grab a pre-built image for quickly jumping in and seeing what the fuss is all about. I also do recommend going through the RISC OS Manual. The operating system is almost unusable until you learn its little tricks and nuances of operation. Pre-built images: https://www.marutan.net/rpcemu/easystart.html v3 Manual: https://archive.org/details/ro-3-user-guide v5 Manual: https://archive.org/details/risc-os-5.28-user-guide Technically, I am cheating a bit in this review. RPCEmu doesn't emulate an Archimedes but rather Acorn's later Risc PC. I ran PipeDream from floppy in Arculator, which explicitly emulates Archimedes systems, to compare the experiences. Except for RPCEmu's snappier performance (which I want anyway), RISC OS itself abstracts away the hardware layer so much it didn't seem to matter one emulator over the other. The emulator itself expects some specific keyboard, with the key situated between and . I don't have that, and nothing on my extended keyboard would send the right code to the emulator. is used for logical in PipeDream data queries; I had to use Windows ALT keycodes. I mentioned earlier, but I'll make it explicit here: there is no undo. Fireworkz is available as a native Win32 app. 
It launches without issue on Windows 11 64-bit, and even in Wine on macOS. It looks and feels exactly like Fireworkz on RISC OS, which looks and feels a lot like the latest version of PipeDream (minus the database parts). The list of bug fixes and quality of life enhancements is vast. Scrolling through all changes since Colton passed is kind of pointless due to its scope. I'll say, "a lot has improved" and leave it at that. As a local-only alternative to the Google/Apple/Microsoft hegemony, it's worth checking out. It's free, open source, actively maintained, a mere 2.5MB download, and for God's sake at least it's trying to do something different. Getting documents out of RISC OS into a modern system is easy, but has its caveats. RPCEmu can directly save to the host operating system, so getting files out is a non-issue. PipeDream's options for saving documents will strip the document's uniqueness, however. Saving as ASCII will try to keep text precisely as shown in PipeDream, inserting line breaks at the end of every line of text. Tables are just tab-indented. Any text formatting, fonts, graphs, etc. are stripped, of course. Saving as "Paragraph" is like ASCII, but will keep text together as logical paragraphs. This is much better for pasting the text into new documents. We still lose anything done to make the document look pretty. PDF printing is an option in RISC OS, and proved to be the best way I could find to get PipeDream documents into the real world. This required two parts: activating the PDF printer and running a separate !PrintPDF application. With both active, PipeDream generated PDFs without issue.

Corrode 2 weeks ago

Bugs Rust Won't Catch

In April 2026, Canonical disclosed 44 CVEs in uutils, the Rust reimplementation of GNU coreutils that ships by default since 25.10. Most of them came out of an external audit commissioned ahead of the 26.04 LTS. I read through the list and thought there’s a lot to learn from it. What’s notable is that all of these bugs landed in a production Rust codebase, written by people who knew what they were doing, and none of them were caught by the borrow checker, clippy lints , or cargo audit . I’m not writing this to criticize the uutils team. Quite the contrary; I actually want to thank them for sharing the audit results in such detail so that we can all learn from them. We also had Jon Seager, VP Engineering for Ubuntu, on our ‘Rust in Production’ podcast recently and a lot of listeners appreciated his honesty about the state of Rust at Canonical. If you write systems code in Rust, this is the most concentrated look at where Rust’s safety ends that you’ll likely find anywhere right now. This is the largest cluster of bugs in the audit. It’s also the reason , , and are still GNU in Ubuntu 26.04 LTS. :( The pattern is always the same. You do one syscall to check something about a path, then another syscall to act on the same path. Between those two calls, an attacker with write access to a parent directory can swap the path component for a symbolic link. The kernel re-resolves the path from scratch on the second call, and the privileged action lands on the attacker’s chosen target. Rust’s standard library makes this easy to get wrong. The ergonomic APIs you reach for first ( , , , ) all take a path and re-resolve it every time, rather than taking a file descriptor and operating relative to that. That’s fine for a normal program, but if you’re writing a privileged tool that needs to be secure against local attackers, you have to be careful. Here’s the bug, simplified from . 
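As a rough illustration of that shape (a sketch with hypothetical function names `write_racy` and `write_new`, not the audited code), the check and the write below are two separate path-based calls, and the single-call variant assumes only the standard library’s `OpenOptions::create_new`:

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Racy: two path-based calls, and the kernel re-resolves `dest` for each one.
/// (Hypothetical helper, for illustration only.)
fn write_racy(dest: &Path, data: &[u8]) -> io::Result<()> {
    // Step 1: check. `symlink_metadata` does not follow a final symlink.
    if fs::symlink_metadata(dest).is_ok() {
        return Err(io::Error::new(
            io::ErrorKind::AlreadyExists,
            "destination exists",
        ));
    }
    // Window: `dest` can be swapped for a symlink right here.
    // Step 2: act. `fs::write` follows symlinks and truncates the target.
    fs::write(dest, data)
}

/// Safer: existence check and creation happen in one call.
/// `create_new(true)` fails if anything, including a dangling symlink,
/// already sits at `dest`.
fn write_new(dest: &Path, data: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create_new(true)
        .open(dest)?;
    file.write_all(data)
}
```

Calling `write_new` twice on the same path should fail the second time with `ErrorKind::AlreadyExists` instead of overwriting anything.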
Between step 1 and step 2, anyone with write access to the parent directory can plant as a symlink to, say, . Then follows the symlink and the privileged process happily overwrites with whatever happened to contain. The fix uses : The docs for say (emphasis mine): No file is allowed to exist at the target location, also no (dangling) symlink . In this way, if the call succeeds, the file returned is guaranteed to be new. A in Rust looks like a value, but remember that to the kernel it’s just a name. That name can point to different things from one syscall to the next. Anchor your operations on a file descriptor instead. only helps with that when you’re creating a new file. For everything else, open the parent directory once and work relative to that handle . If you act on the same path twice, assume it’s a TOCTOU (Time Of Check To Time Of Use) bug until you’ve proven otherwise. This is a close relative of TOCTOU. You want a directory with restrictive permissions, so you write something like this. For a brief moment, exists with the default permissions. Any other user on the system can it during that window. Once they have a file descriptor, the later doesn’t take it away from them. Reach for and so the file or directory is born with the permissions you want. The kernel will apply your on top, so set that explicitly too if you really care. The original check in was literally this: That comparison is bypassed by anything that resolves to but isn’t spelled . So , , , or a symlink that points to . Run and see it rip right past your check and lock down the whole system. Here’s the fix : resolves , , and symlinks into a real absolute path. That’s a lot better than string comparison. Oh and if you were wondering about this line: I think that’s just a fancy way of saying In the specific case of , this works because has no parent directory, so there’s nothing for an attacker to swap from underneath you. 
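To make the difference concrete, here is a minimal, Unix-only sketch that puts a spelling-based check next to a resolved one, plus a device-and-inode identity check. All three helper names are hypothetical and this is not the actual patch; it assumes only `std::fs::canonicalize` and `std::os::unix::fs::MetadataExt`.

```rust
use std::ffi::OsStr;
use std::fs::{self, File};
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Spelling check: only the literal string "/" counts as root.
fn is_root_by_spelling(p: &Path) -> bool {
    p.as_os_str() == OsStr::new("/")
}

/// Resolved check: `canonicalize` collapses `.`, `..` and symlinks first.
fn is_root_resolved(p: &Path) -> io::Result<bool> {
    let resolved = fs::canonicalize(p)?;
    Ok(resolved.as_path() == Path::new("/"))
}

/// Identity check: open both paths, keep the handles alive, and compare
/// the (device, inode) pairs reported for the open handles.
fn same_file(a: &Path, b: &Path) -> io::Result<bool> {
    let fa = File::open(a)?;
    let fb = File::open(b)?;
    let (ma, mb) = (fa.metadata()?, fb.metadata()?);
    Ok(ma.dev() == mb.dev() && ma.ino() == mb.ino())
}
```

With these, `/../` fails the spelling check but passes the resolved one, which is exactly the gap a string comparison leaves open.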
In the more general case of comparing two arbitrary paths for filesystem identity, however, you’d want to open both and compare their pairs, the way GNU coreutils does. (Think identity, not string equality.) By the way, my favorite bug in this group is CVE-2026-35363: It refused and but happily accepted and , then deleted the current directory while printing . 😅 Rust’s and are always UTF-8. That’s a great choice in 99% of all cases, but Unix paths, environment variables, arguments, and the inputs flowing through tools like , , and live in the messy world of bytes. Every time a Rust program bridges that gap, it has three options. The audit found bugs in both of the first two categories. Here’s an example. This is the original code, from . GNU works on binary files because it just shuffles bytes around. The uutils version replaced anything that wasn’t valid UTF-8 with , which silently corrupted the output. Here’s the fix: stay in bytes. forces a UTF-8 round-trip through . does not. It writes the raw bytes directly to . For Unix-flavored systems code, use and for filesystem paths, for environment variables, and or for stream contents. It’s tempting to round-trip them through for easier formatting, but that’s where the corruption creeps in. UTF-8 is a great default for application strings, but it’s absolutely, positively the wrong default for the raw byte stuff Unix tools work with. In a CLI, every , every , every slice index, every unchecked arithmetic operation, every is a potential denial of service if an attacker can shape the input. That’s because a unwinds the stack and aborts the process. If your tool is running in a cron job, a CI pipeline, or a shell script, that means the whole thing just stops working. Even worse, you could find yourself in a crash loop that paralyzes the entire system. A canonical case from the audit was ( CVE-2026-35348 ). 
The flag reads a NUL-separated list of filenames from a file, but the parser called on a UTF-8 conversion of each name: GNU treats filenames as raw bytes, the way the kernel does. The uutils version required UTF-8 and aborted the whole process on the first non-UTF-8 path: (I reproduced this against on macOS. The Python one-liner is there because most modern shells refuse to create a non-UTF-8 filename for you.) Your nightly cron job is dead and there goes your weekend. In code that processes untrusted input, treat every , , indexing, or cast as a CVE waiting to be filed. Use , , , , and surface a real error. Push back on the boundary of your application and let the caller deal with the fallout. A good lint baseline to catch this in CI: These are noisy in test code where panicking on bad data is exactly what you want. The cleanest way to scope them to non-test code is to put at the top of each crate root, or to gate on the individual modules. Closely related to the previous point, a few CVEs come from ignoring or losing error information. and returned the exit code of the last file processed instead of the worst one. So could fail on half the files and still exit . Your script thinks everything is fine. called on its call to mimic GNU’s behavior on . The intent was reasonable, but that same code ran for regular files too, so a full disk silently produced a half-written destination. The reason was that someone wanted to throw away a and reached for , , or . Here’s a very simple pattern to avoid that: Also, if you write to discard a , leave a comment that explains why this specific failure is safe to ignore. A surprising number of these CVEs aren’t “the code does something unsafe” but “the code does something different from GNU, and a shell script somewhere relied on the GNU behavior.” The clearest example is (CVE-2026-35369). GNU reads as “signal 1” and asks for a PID. 
uutils read it as “send the default signal to PID -1”, which on Linux means every process you can see . Yikes! A typo becomes a system-wide kill switch. If you reimplement a battle-tested tool, bug-for-bug compatibility on exit codes, error messages, edge cases, and option semantics is a security feature. (Hello, Hyrum’s Law – and obligatory XKCD 1172 !) Anywhere your behavior diverges from the original, somebody’s shell script is making a wrong decision. uutils now runs the upstream GNU coreutils test suite against itself in CI. That’s the right scale of defense for this class of bug. CVE-2026-35368 is the worst single bug in the audit. It’s local root code execution in . The bug is visible if you know what to look for (a followed by a function call that loads a dynamic library), but it’s the kind of thing that doesn’t jump out on a first read. Here’s the pattern, simplified from the utility. Huh. Looks innocent. The trap is that ends up loading shared libraries from the new root filesystem to resolve the username. An attacker who can plant a file in the chroot gets to run code as uid 0. GNU resolves the user before calling . Same fix here. Once you’re across, every library call might run the attacker’s code. And no, static compilation doesn’t help here, because goes through NSS, which s modules at runtime regardless of whether your binary is statically linked. You might have made it this far and thought “Wow, that’s a lot of bugs! Maybe Rust isn’t as safe as I thought?” That would be the wrong conclusion. Keep in mind that none of the following bad things happened: That means, even if the tools were (and probably still are) buggy, they never had a bug that could be exploited to read arbitrary memory. GNU coreutils has shipped CVEs in every single one of those categories. Take a peek at the last few years of the GNU file: …the list goes on and on. The Rust rewrite has shipped zero of these, over a comparable window of activity. 
1 That’s most of what historically goes wrong in a C codebase. What’s left is, frankly, a more interesting class of bug. It lives at the boundary between our controlled Rust environment and the messy, chaotic outside world, where paths, bytes, strings, and syscalls are all tangled up in one eternal ball of sadness. That’s the new security boundary of modern systems code. 2 If you write systems code in Rust, treat this CVE list as a checklist. Grep your own codebase for , stray calls, discarded s, , and string comparisons against . I also wrote a companion post, titled Patterns for Defensive Programming in Rust . When I think of “ idiomatic Rust ”, correctness is not the first thing that comes to mind. After all, isn’t that the compiler’s job? Instead, I think of elegant iterator patterns , ergonomic method signatures, immutability , or clever use of expressions . But none of that matters if the code doesn’t do the right thing, and the compiler is far from perfect at enforcing correctness. That’s why we don’t only have idioms for writing more elegant code; we also have idioms for writing correct code. They are the distilled experience of a community that has learned, often painfully, which shapes of code survive contact with reality and which ones do not. Reality is rarely as tidy as the abstractions we would like to impose on it. The mark of robust systems, in any language, is the willingness to reflect that untidiness rather than paper over it. Rust gives us extraordinary tools to do so, and the compiler will hold a great deal for us. But the part it cannot hold, the boundary between our program and everything else, is still ours to get right. The type system can encode many things, but it cannot encode conditions outside of its control, such as the passage of time between two syscalls. Idiomatic Rust, then, is not just code that the borrow checker accepts or that leaves alone. 
It is code whose types, names, and control flow tell the truth about the system they run in. And that truth is sometimes ugly. It could mean using file descriptors instead of paths, instead of , instead of , and bug-for-bug compatibility over clean semantics. None of it is as pretty as the version you would write on a whiteboard. But it is more honest. Need Help Hardening Your Rust Codebase? Is your team shipping Rust into production and want to make sure you’re not falling into the same traps? I offer Rust consulting services, from code reviews and security-focused audits to training your team on the patterns that the compiler won’t enforce for you. Get in touch to learn more. To be fair to GNU: GNU coreutils is 40 years old and has had a very long time to surface and fix this class of bug. And we don’t know there are no memory-safety bugs in the Rust rewrite, only that the audit didn’t find any. Still, the difference is noticeable when comparing the same duration of development activity. ↩ It’s worth noting that the / TOCTOU class of bug is in some ways easier to avoid in C than in Rust. C code naturally reaches for an open file descriptor and the family of syscalls ( , , , ), and most creation syscalls take a argument directly. Rust’s high-level APIs abstract over the file descriptor and operate on values, which makes the path-based, re-resolving call the path of least resistance. The handle-based APIs exist on every Unix platform; Rust just doesn’t put them front and center. ↩ 🫩 Lossy conversion with silently rewrites invalid bytes to U+FFFD. That’s just fancy data corruption. 🫤 Strict conversion with or crashes or refuses to operate. 😚 Staying in bytes with or is what you should usually do. No buffer overflows. No use-after-free. No double-free. No data races on shared mutable state. No null-pointer dereferences. No uninitialized memory reads. 
buffer overflow on deep paths longer than (9.11, 2026) out-of-bounds read on trailing blanks (9.9, 2025) heap buffer overflow (9.9, 2025) writes a NUL byte past a heap buffer (9.8, 2025) 1-byte read before a heap buffer with a key offset (9.8, 2025) and crashes with SELinux but no xattr support (9.7, 2025) heap overwrite ( CVE-2024-0684 , 9.5, 2024) reads unallocated memory on malformed input (9.4, 2023) stack buffer overrun with many files and a high (9.0, 2021)
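To make the handle-based idea from the TOCTOU footnote concrete, here is a minimal sketch. It is written in Python only because its `os` module exposes the same openat-style calls directly through the `dir_fd` parameter; it is not code from uutils or GNU coreutils, it assumes a Unix platform, and the names `read_regular_file_at` and `demo` are made up for illustration. The point is that the path is resolved once, and every later check goes through the descriptor.

```python
import os
import stat
import tempfile


def read_regular_file_at(dir_fd: int, name: str) -> bytes:
    """Open `name` relative to an already-open directory and read it.

    The file is opened once; the type check uses the descriptor, so the
    path is never resolved a second time.
    """
    # O_NOFOLLOW makes the open fail if the final path component is a symlink.
    fd = os.open(name, os.O_RDONLY | os.O_NOFOLLOW, dir_fd=dir_fd)
    try:
        info = os.fstat(fd)  # inspects the opened file, not the path
        if not stat.S_ISREG(info.st_mode):
            raise ValueError(f"{name!r} is not a regular file")
        chunks = []
        while True:
            chunk = os.read(fd, 65536)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)


def demo() -> bytes:
    """Hypothetical usage: write a file, then read it back through a directory handle."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "data.txt"), "wb") as f:
            f.write(b"hello")
        dir_fd = os.open(tmp, os.O_RDONLY | os.O_DIRECTORY)
        try:
            return read_regular_file_at(dir_fd, "data.txt")
        finally:
            os.close(dir_fd)
```

The path-based version of this (check the path with a stat call, then open it by name) leaves a window between the two calls in which the path can be swapped for something else. Here there is no second lookup to race against.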


Debugging WASM in Chrome DevTools

When I was working on the WASM backend for my Scheme compiler , I ran into several tricky situations with debugging generated WASM code. It turned out that Chrome has a very capable WASM debugger in its DevTools, so in this brief post I want to share how it can be used. I'll be using an example from my wasm-wat-samples project for this post. In fact, everything is already in place in the gc-print-scheme-pairs sample. This sample shows how to construct Scheme-like s-exprs in WASM using gc references and print them out recursively. The sample supports nested pairs of integers, booleans and symbols. To see this in action, we have to first compile the WAT file to WASM, for example using watgo : The browser-loader.html file in that directory already expects to load gc-print-scheme-pairs.wasm . But we can't just open it directly from the file-system; since it loads WASM, this file needs to be served with a local HTTP server. I personally use static-server for this, but you can use anything else - like Python's built-in http.server : Now it can be opened in the browser by following the printed link and selecting the browser-loader.html file. Open the Chrome DevTools, and in Sources , open the Page view on the left. It should have one entry under wasm , which will show the decompiled WAT code for our module. Note: this code is disassembled from the binary WASM, so it will lose some WAT syntactic sugar (like folded instructions): You can set a breakpoint by clicking on the address column to the left of the code, and then refresh the page. The DevTools debugger will run the program again and stop at the breakpoint: Here you can step over, into, see local values and call stack, etc - a real debugger! The most important use case for me while developing the compiler was debugging unexpected exceptions (coming from instructions like ref.cast ). Notice the checkboxes saying "Pause on ... exceptions" on the right-hand side of the previous screenshot. 
With these selected, the DevTools debugger will automatically stop on an exception and show where it is coming from. Let's modify the gc-print-scheme-pairs.wat sample to see this in action. The $emit_value function performs a set of ref.test checks to see which kind of reference it's dealing with before casting; let's add this line at the very start: It's clearly wrong to assume that $v is a bool reference without first testing it; this is just for demonstration purposes. Without setting any breakpoints, recompiling this code with watgo and reloading the page, we get: The debugger stopped at the instruction causing the exception; moreover, in the Scope pane on the right we can see that the actual type of $v is (ref $Pair) , so it's immediately clear what's going on. I've found this capability extremely valuable when writing (or emitting from a compiler) non-trivial chunks of WASM code using gc types and instructions. "Should I use a debugger or just printfs" is a common topic of debate among programmers. While I'm usually in the "printf debugging" camp, I'm not dogmatic, and will certainly reach for a debugger when the situation calls for it. Specifically, when investigating reference exceptions in WASM, two strong factors tilt the decision towards using a debugger: In general, WASM's printf capabilities aren't great. We can import print-like functions from the host (and - in fact - our sample does just that), but they're not very flexible and dealing with strings in WASM is painful in general. This is compounded even more when working with gc types, because these aren't even visible to the host (they're opaque references). If we want to do printf debugging of gc values, we have to build a lot of scaffolding first. Exception debugging - in general - is much easier with a supportive debugger in hand. Our ref.cast exception from the example above could have happened anywhere in the code. 
Imagine having to debug a very large WASM program (emitted by a compiler) to find the source of a failed ref.cast ; the debugger takes you right to the spot! In fact, even for C programming, I've always found gdb most useful for pinpointing the source of segmentation faults and similar crashes.

Simon Willison 1 month ago

Anthropic's Project Glasswing - restricting Claude Mythos to security researchers - sounds necessary to me

Anthropic didn't release their latest model, Claude Mythos ( system card PDF ), today. They have instead made it available to a very restricted set of preview partners under their newly announced Project Glasswing . The model is a general purpose model, similar to Claude Opus 4.6, but Anthropic claim that its cyber-security research abilities are strong enough that they need to give the software industry as a whole time to prepare. Mythos Preview has already found thousands of high-severity vulnerabilities, including some in every major operating system and web browser . Given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely. Project Glasswing partners will receive access to Claude Mythos Preview to find and fix vulnerabilities or weaknesses in their foundational systems—systems that represent a very large portion of the world’s shared cyberattack surface. We anticipate this work will focus on tasks like local vulnerability detection, black box testing of binaries, securing endpoints, and penetration testing of systems. There's a great deal more technical detail in Assessing Claude Mythos Preview’s cybersecurity capabilities on the Anthropic Red Team blog: In one case, Mythos Preview wrote a web browser exploit that chained together four vulnerabilities, writing a complex  JIT heap spray  that escaped both renderer and OS sandboxes. It autonomously obtained local privilege escalation exploits on Linux and other operating systems by exploiting subtle race conditions and KASLR-bypasses. And it autonomously wrote a remote code execution exploit on FreeBSD's NFS server that granted full root access to unauthenticated users by splitting a 20-gadget ROP chain over multiple packets. Plus this comparison with Claude 4.6 Opus: Our internal evaluations showed that Opus 4.6 generally had a near-0% success rate at autonomous exploit development. 
But Mythos Preview is in a different league. For example, Opus 4.6 turned the vulnerabilities it had found in Mozilla’s Firefox 147 JavaScript engine—all patched in Firefox 148—into JavaScript shell exploits only two times out of several hundred attempts. We re-ran this experiment as a benchmark for Mythos Preview, which developed working exploits 181 times, and achieved register control on 29 more. Saying "our model is too dangerous to release" is a great way to build buzz around a new model, but in this case I expect their caution is warranted. Just a few days ago (last Friday) I started a new ai-security-research tag on this blog to acknowledge an uptick in credible security professionals pulling the alarm on how good modern LLMs have got at vulnerability research. Greg Kroah-Hartman of the Linux kernel: Months ago, we were getting what we called 'AI slop,' AI-generated security reports that were obviously wrong or low quality. It was kind of funny. It didn't really worry us. Something happened a month ago, and the world switched. Now we have real reports. All open source projects have real reports that are made with AI, but they're good, and they're real. Daniel Stenberg of : The challenge with AI in open source security has transitioned from an AI slop tsunami into more of a ... plain security report tsunami. Less slop but lots of reports. Many of them really good. I'm spending hours per day on this now. It's intense. And Thomas Ptacek published Vulnerability Research Is Cooked , a post inspired by his podcast conversation with Anthropic's Nicholas Carlini. Anthropic have a 5 minute talking heads video describing the Glasswing project. Nicholas Carlini appears as one of those talking heads, where he said (highlights mine): It has the ability to chain together vulnerabilities. So what this means is you find two vulnerabilities, either of which doesn't really get you very much independently.
But this model is able to create exploits out of three, four, or sometimes five vulnerabilities that in sequence give you some kind of very sophisticated end outcome. [...] I've found more bugs in the last couple of weeks than I found in the rest of my life combined. We've used the model to scan a bunch of open source code, and the thing that we went for first was operating systems, because this is the code that underlies the entire internet infrastructure. For OpenBSD, we found a bug that's been present for 27 years, where I can send a couple of pieces of data to any OpenBSD server and crash it. On Linux, we found a number of vulnerabilities where as a user with no permissions, I can elevate myself to the administrator by just running some binary on my machine. For each of these bugs, we told the maintainers who actually run the software about them, and they went and fixed them and have deployed the patches so that anyone who runs the software is no longer vulnerable to these attacks. I found this on the OpenBSD 7.8 errata page : 025: RELIABILITY FIX: March 25, 2026 All architectures TCP packets with invalid SACK options could crash the kernel. A source code patch exists which remedies this problem. I tracked that change down in the GitHub mirror of the OpenBSD CVS repo (apparently they still use CVS!) and found it using git blame : Sure enough, the surrounding code is from 27 years ago. I'm not sure which Linux vulnerability Nicholas was describing, but it may have been this NFS one recently covered by Michael Lynch . There's enough smoke here that I believe there's a fire. It's not surprising to find vulnerabilities in decades-old software, especially given that it's mostly written in C, but what's new is that coding agents run by the latest frontier LLMs are proving tirelessly capable at digging up these issues.
I actually thought to myself on Friday that this sounded like an industry-wide reckoning in the making, and that it might warrant a huge investment of time and money to get ahead of the inevitable barrage of vulnerabilities. Project Glasswing incorporates "$100M in usage credits ... as well as $4M in direct donations to open-source security organizations". Partners include AWS, Apple, Microsoft, Google, and the Linux Foundation. It would be great to see OpenAI involved as well - GPT-5.4 already has a strong reputation for finding security vulnerabilities and they have stronger models on the near horizon. The bad news for those of us who are not trusted partners is this: We do not plan to make Claude Mythos Preview generally available, but our eventual goal is to enable our users to safely deploy Mythos-class models at scale—for cybersecurity purposes, but also for the myriad other benefits that such highly capable models will bring. To do so, we need to make progress in developing cybersecurity (and other) safeguards that detect and block the model’s most dangerous outputs. We plan to launch new safeguards with an upcoming Claude Opus model, allowing us to improve and refine them with a model that does not pose the same level of risk as Mythos Preview. I can live with that. I think the security risks really are credible here, and having extra time for trusted teams to get ahead of them is a reasonable trade-off.

devansh 1 month ago

On LLMs and Vulnerability Research

I have been meaning to write this for six months. The landscape kept shifting. It has now shifted enough to say something definitive. I work at the intersection of vulnerability triage. I see, every day, how this landscape is changing. These views are personal and do not represent my employer. Take them with appropriate salt. Two things happened in quick succession. Frontier models got dramatically better (Opus 4.6, GPT 5.4). Agentic toolkits (Claude Code, Codex, OpenCode) gave those models hands. The combination produces solid vulnerability research. "LLMs are next-token predictors." This framing was always reductive. It is now actively misleading. The gap between what these models theoretically do (predict the next word) and what they actually do (reason about concurrent thread execution in kernel code to identify use-after-free conditions) has grown too wide for the old frame to hold. Three mechanisms explain why. Implicit structural understanding. Tokenizers know nothing about code. Byte Pair Encoding treats , , and as frequent byte sequences, not syntactic constructs. But the transformer layers above tell a different story. Through training on massive code corpora, attention heads specialise: some track variable identity and provenance, others develop bias toward control flow tokens. The model converges on internal representations that capture semantic properties of code, something functionally equivalent to an abstract syntax tree, built implicitly, never formally. Neural taint analysis. The most security-relevant emergent capability. The model learns associations between sources of untrusted input (user-controlled data, network input, file reads) and dangerous sinks (system calls, SQL queries, memory operations). When it identifies a path from source to sink without adequate sanitisation, it flags a vulnerability. This is not formal taint analysis. There is no dataflow graph either. It is a statistical approximation.
But it works well for intra-procedural bugs where the source-to-sink path is short, and degrades as distance increases across functions, files, and abstraction layers. Test-time reasoning. The most consequential advance. Standard inference is a single forward pass: reactive, fast, fundamentally limited. Reasoning models (o-series, extended thinking, DeepSeek R1) break this constraint by generating internal reasoning tokens, a scratchpad where the model works through a problem step by step before answering. The model traces execution paths, tracks variable values, evaluates branch conditions. Symbolic execution in natural language. Less precise than formal tools but capable of handling what they choke on: complex pointer arithmetic, dynamic dispatch, deeply nested callbacks. It self-verifies, generating a hypothesis ("the lock isn't held across this path"), then testing it ("wait, is there a lock acquisition I missed?"). It backtracks when reasoning hits dead ends. DeepSeek R1 showed these behaviours emerge from pure reinforcement learning with correctness-based rewards. Nobody taught the model to check its own work. It discovered that verification produces better answers. The model is not generating the most probable next token. It is spending variable compute to solve a specific problem. Three advances compound on each other. Mixture of Experts. Every frontier model now uses MoE. A model might contain 400 billion parameters but activate only 17 billion per token. Vastly more encoded knowledge about code patterns, API behaviours, and vulnerability classes without proportional inference cost. Million-token context. In 2023, analysing a codebase required chunking code into a vector database, retrieving fragments via similarity search, and feeding them to the model. RAG is inherently lossy: code split at arbitrary boundaries, cross-file relationships destroyed, critical context discarded. 
For vulnerability analysis, where understanding cross-module data flow is the entire point, this information loss is devastating. At one million tokens, you fit an entire mid-size codebase in a single prompt. The model traces user input from an HTTP handler through three middleware layers into a database query builder and spots a sanitisation gap on line 4,200 exploitable via the endpoint on line 890. No chunking. No retrieval. No information loss. Reinforcement-learned reasoning. Earlier models trained purely on next-token prediction. Modern frontier models add an RL phase: generate reasoning chains, reward correctness of the final answer rather than plausibility of text. Over millions of iterations, this shapes reasoning to produce correct analyses rather than plausible-sounding ones. The strategies transfer across domains. A model that learned to verify mathematical reasoning applies the same verification to code. A persistent belief: truly "novel" vulnerability classes exist, bugs so unprecedented that only human genius could discover them. Comforting. Also wrong. Decompose the bugs held up as examples. HTTP request smuggling: the insight that a proxy and backend might disagree about where one request ends and another begins feels like a creative leap. But the actual bug is the intersection of known primitives: ambiguous protocol specification, inconsistent parsing between components, a security-critical assumption about message boundaries. None novel individually. The "novelty" was in combining them. Prototype pollution RCEs in JavaScript frameworks. Exotic until you realise it is dynamic property assignment in a prototype-based language, unsanitised input reaching object modification, and a rendering pipeline evaluating modified objects in a privileged context. Injection, type confusion, privilege boundary crossing. Taxonomy staples for decades. The pattern holds universally. 
"Novel" vulnerabilities decompose into compositions of known primitives: spec ambiguities, type confusions, missing boundary checks, TOCTOU gaps, trust boundary violations. The novelty is in the composition, not the components. This is precisely what frontier LLMs are increasingly good at. A model that understands protocol ambiguity, inconsistent component behaviour, and security boundary assumptions has all the ingredients to hypothesise a request-smuggling-class vulnerability when pointed at a reverse proxy codebase. It does not need to have seen that exact bug class. It needs to recognise that the conditions for parser disagreement exist and that parser disagreement at a trust boundary has security implications. Compositional reasoning over known primitives. Exactly what test-time reasoning enables. LLMs will not discover the next Spectre tomorrow. Microarchitectural side channels in CPU pipelines are largely absent from code-level training data. But the space of "LLM-inaccessible" vulnerabilities is smaller than the security community assumes, and it shrinks with every model generation. Most of what we call novel vulnerability research is creative recombination within a known search space. That is what these models do best. Effective AI vulnerability research = good scaffolding + adequate tokens. Scaffolding (harness design, prompt engineering, problem framing) is wildly underestimated. Claude Code and Codex are general-purpose coding environments, not optimised for vulnerability research. A purpose-built harness provides threat models, defines trust boundaries, highlights historical vulnerability patterns in the specific technology stack, and constrains search to security-relevant code paths. The operator designing that context determines whether the model spends its reasoning budget wisely or wastes it on dead ends. Two researchers, same model, same codebase, dramatically different results. Token quality beats token quantity. 
A thousand reasoning tokens on the right code path with the right threat model outperform a million tokens sprayed across a repo with "find vulnerabilities." The search space is effectively infinite. You cannot brute-force it. You narrow it with human intelligence encoded as context, directing machine intelligence toward where bugs actually live. "LLMs are non-deterministic, so you can't trust their findings." Sounds devastating. Almost entirely irrelevant. It confuses the properties of the tool with the properties of the target. The bugs are deterministic. They are in the code. A buffer overflow on line 847 is still there whether the model notices it on attempt one or attempt five. Non-determinism in the search process does not make the search less valid. It makes it more thorough under repetition. Each run samples a different trajectory through the hypothesis space. The union of multiple runs covers more search space than any single run. Conceptually identical to fuzzing. Nobody says "fuzzers are non-deterministic so we can't trust them." You run the fuzzer longer, cover more input space, find more bugs. Same principle. Non-determinism under repetition becomes coverage. In 2023 and 2024, the state of the art was architecture. Multi-agent systems, RAG pipelines, tool integration with SMT solvers and fuzzers and static analysis engines. The best orchestration won. That era is ending. A frontier model ingests a million tokens of code in a single prompt. Your RAG pipeline is not an advantage when the model without RAG sees the whole codebase while your pipeline shows fragments selected by retrieval that does not know what is security-relevant. A reasoning model spends thousands of tokens tracing execution paths and verifying hypotheses. Your external solver integration is not a differentiator when the model approximates what the solver does with contextual understanding the solver lacks. Agentic toolkits handle orchestration better than your custom tooling. 
The implication the security industry has not fully processed: vulnerability research is being democratised. When finding a memory safety bug in a C library required a Project Zero-calibre researcher with years of experience, the supply was measured in hundreds worldwide. When it requires a well-prompted API call, the supply is effectively unlimited. What replaces architecture as the competitive advantage? Two things. Domain expertise encoded as context. Not "find bugs in this code" but "this is a TLS implementation; here are three classes of timing side-channel that have affected similar implementations; analyse whether the constant-time guarantees hold across these specific code paths." The human provides the insight. The model does the grunt work. Access to compute. Test-time reasoning scales with inference compute. More tokens means deeper analysis, more self-verification, more backtracking. Teams that let a model spend ten minutes on a complex code path will find bugs that teams limited to five-second responses will miss. The end state: vulnerability discovery for known bug classes becomes a commodity, available to anyone with API access and a credit card. The researchers who thrive will focus where the model cannot: novel vulnerability classes, application-level logic flaws, architectural security review, adversarial creativity. This is not a prediction. It is already happening. The pace is set by model capability, which doubles on a timeline measured in months.
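The "non-determinism under repetition becomes coverage" argument can be put into numbers under one simplifying assumption: if runs are independent and a given bug is noticed with probability p on each run, the chance of seeing it at least once in n runs is 1 - (1 - p)^n. The sketch below simulates that. The bug names, the probabilities, and the independence assumption are all invented for illustration; real model runs are unlikely to be fully independent.

```python
import random


def run_once(bug_probs: dict[str, float], rng: random.Random) -> set[str]:
    """One simulated analysis run: each bug is noticed with its own probability."""
    return {bug for bug, p in bug_probs.items() if rng.random() < p}


def coverage_over_runs(bug_probs: dict[str, float], runs: int, seed: int = 0) -> list[set[str]]:
    """Return the cumulative set of distinct findings after each run."""
    rng = random.Random(seed)  # seeded so the simulation itself is repeatable
    found: set[str] = set()
    history = []
    for _ in range(runs):
        found |= run_once(bug_probs, rng)  # union of everything seen so far
        history.append(set(found))
    return history


def expected_hit_rate(p: float, runs: int) -> float:
    """Chance that a bug with per-run detection probability p is found at least once."""
    return 1.0 - (1.0 - p) ** runs


if __name__ == "__main__":
    # Hypothetical bugs with made-up per-run detection probabilities.
    bugs = {"overflow-line-847": 0.6, "toctou-in-cleanup": 0.2, "parser-disagreement": 0.05}
    for i, seen in enumerate(coverage_over_runs(bugs, runs=10), start=1):
        print(i, sorted(seen))
    for name, p in bugs.items():
        print(name, round(expected_hit_rate(p, 10), 3))
```

The cumulative set can only grow from run to run, which is the same reason a fuzzer left running longer never covers less input space.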

Anton Zhiyanov 1 month ago

Porting Go's strings package to C

Creating a subset of Go that translates to C was never my end goal. I liked writing C code with Go, but without the standard library it felt pretty limited. So, the next logical step was to port Go's stdlib to C. Of course, this isn't something I could do all at once. I started with the io package , which provides core abstractions like and , as well as general-purpose functions like . But isn't very interesting on its own, since it doesn't include specific reader or writer implementations. So my next choices were naturally and — the workhorses of almost every Go program. This post is about how the porting process went. Bits and UTF-8 • Bytes • Allocators • Buffers and builders • Benchmarks • Optimizing search • Optimizing builder • Wrapping up Before I could start porting , I had to deal with its dependencies first: Both of these packages are made up of pure functions, so they were pretty easy to port. The only minor challenge was the difference in operator precedence between Go and C — specifically, bit shifts ( , ). In Go, bit shifts have higher precedence than addition and subtraction. In C, they have lower precedence: The simplest solution was to just use parentheses everywhere shifts are involved: With and done, I moved on to . The package provides functions for working with byte slices: Some of them were easy to port, like . Here's how it looks in Go: And here's the C version: Just like in Go, the ( → ) macro doesn't allocate memory; it just reinterprets the byte slice's underlying storage as a string. The function (which works like in Go) is easy to implement using from the libc API. Another example is the function, which looks for a specific byte in a slice. Here's the pure-Go implementation: And here's the C version: I used a regular C loop to mimic Go's : But and don't allocate memory. What should I do with , since it clearly does? I had a decision to make. The Go runtime handles memory allocation and deallocation automatically. 
In C, I had a few options: An allocator is a tool that reserves memory (typically on the heap) so a program can store its data structures there. See Allocators from C to Zig if you want to learn more about them. For me, the winner was clear. Modern systems programming languages like Zig and Odin clearly showed the value of allocators: An is an interface with three methods: , , and . In C, it translates to a struct with function pointers: As I mentioned in the post about porting the io package , this interface representation isn't as efficient as using a static method table, but it's simpler. If you're interested in other options, check out the post on interfaces . By convention, if a function allocates memory, it takes an allocator as its first parameter. So Go's : Translates to this C code: If the caller doesn't care about using a specific allocator, they can just pass an empty allocator, and the implementation will use the system allocator — , , and from libc. Here's a simplified version of the system allocator (I removed safety checks to make it easier to read): The system allocator is stateless, so it's safe to have a global instance: Here's an example of how to call with an allocator: Way better than hidden allocations! Besides pure functions, and also provide types like , , and . I ported them using the same approach as with functions. For types that allocate memory, like , the allocator becomes a struct field: The code is pretty wordy — most C developers would dislike using instead of something shorter like . My solution to this problem is to automatically translate Go code to C (which is actually what I do when porting Go's stdlib). If you're interested, check out the post about this approach — Solod: Go can be a better C . Types that don't allocate, like , need no special treatment — they translate directly to C structs without an allocator field. The package is the twin of , so porting it was uneventful. 
Here's a usage example in Go and C side by side: Again, the C code is just a more verbose version of Go's implementation, plus explicit memory allocation. What's the point of writing C code if it's slow, right? I decided it was time to benchmark the ported C types and functions against their Go versions. To do that, I ported the benchmarking part of Go's package. Surprisingly, the simplified version was only 300 lines long and included everything I needed: Here's a sample benchmark for the type: Reads almost like Go's benchmarks. To monitor memory usage, I created , a memory allocator that wraps another allocator and keeps track of allocations: The benchmark gets an allocator through the function and wraps it in a to keep track of allocations: There's no auto-discovery, but the manual setup is quite straightforward. With the benchmarking setup ready, I ran benchmarks on the package. Some functions did well, about 1.5-2x faster than their Go equivalents: But (searching for a substring in a string) was a total disaster: it was nearly 20 times slower than in Go: The problem was caused by the function we looked at earlier: This "pure" Go implementation is just a fallback. On most platforms, Go uses a specialized version of written in assembly. For the C version, the easiest solution was to use , which is also optimized for most platforms: With this fix, the benchmark results changed drastically: Still not quite as fast as Go, but it's close. Honestly, I don't know why the -based implementation is still slower than Go's assembly here, but I decided not to pursue it any further. After running the rest of the function benchmarks, the ported versions won all of them except for two: is a common way to compose strings from parts in Go, so I tested its performance too. The results were worse than I expected: Here, the C version performed about the same as Go, but I expected it to be faster.
Unlike , is written entirely in Go, so there's no reason the ported version should lose in this benchmark. The method looked almost identical in Go and C: Go's automatically grows the backing slice, while does it manually ( , on the contrary, doesn't grow the slice — it's merely a wrapper). So, there shouldn't be any difference. I had to investigate. Looking at the compiled binary, I noticed a difference in how the functions returned results. Go returns multiple values in separate registers, so uses three registers: one for 8-byte , two for the interface (implemented as two 8-byte pointers). But in C, was a single struct made up of two unions and a pointer: Of course, this 56-byte monster can't be returned in registers — the C calling convention passes it through memory instead. Since is on the hot path in the benchmark, I figured this had to be the issue. So I switched from a single monolithic type to signature-specific types for multi-return pairs: Now, the implementation in C looked like this: is only 16 bytes — small enough to be returned in two registers. Problem solved! But it wasn't — the benchmark only showed a slight improvement. After looking into it more, I finally found the real issue: unlike Go, the C compiler wasn't inlining calls. Adding and moving to the header file made all the difference: 2-4x faster. That's what I was hoping for! Porting and was a mix of easy parts and interesting challenges. The pure functions were straightforward — just translate the syntax and pay attention to operator precedence. The real design challenge was memory management. Using allocators turned out to be a good solution, making memory allocation clear and explicit without being too difficult to use. The benchmarks showed that the C versions outperformed Go in most cases, sometimes by 2-4x. The only exceptions were and , where Go relies on hand-written assembly. 
The optimization was an interesting challenge: what seemed like a return-type issue was actually an inlining problem, and fixing it gave a nice speed boost. There's a lot more of Go's stdlib to port. In the next post, we'll cover — a very unique Go package. In the meantime, if you'd like to write Go that translates to C — with no runtime and manual memory management — I invite you to try Solod . The and packages are included, of course. implements bit counting and manipulation functions. implements functions for UTF-8 encoded text. Loop over the slice indexes with ( is a macro that returns , similar to Go's built-in). Access the i-th byte with (a bounds-checking macro that returns ). Use a reliable garbage collector like Boehm GC to closely match Go's behavior. Allocate memory with libc's and have the caller free it later with . Introduce allocators. It's obvious whether a function allocates memory or not: if it has an allocator as a parameter, it allocates. It's easy to use different allocation methods: you can use for one function, an arena for another, and a stack allocator for a third. It helps with testing and debugging: you can use a tracking allocator to find memory leaks, or a failing allocator to test error handling. Figuring out how many iterations to run. Running the benchmark function in a loop. Recording metrics (ns/op, MB/s, B/op, allocs/op). Reporting the results.
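The tracking allocator mentioned in this post (an allocator that wraps another allocator and counts what passes through it) can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the author's code: the names `Allocator`, `Tracker`, `alloc_fn`, `free_fn`, `libc_allocator` and `tracker_allocator` are all hypothetical, and the real interface may look different.

```c
#include <stddef.h>
#include <stdlib.h>

// Hypothetical allocator interface: a context pointer plus two function pointers.
typedef struct Allocator {
    void *ctx;
    void *(*alloc_fn)(void *ctx, size_t size);
    void (*free_fn)(void *ctx, void *ptr);
} Allocator;

// Base allocator backed by libc's malloc/free.
static void *libc_alloc(void *ctx, size_t size) { (void)ctx; return malloc(size); }
static void libc_free(void *ctx, void *ptr) { (void)ctx; free(ptr); }

static Allocator libc_allocator(void) {
    return (Allocator){ .ctx = NULL, .alloc_fn = libc_alloc, .free_fn = libc_free };
}

// Tracker wraps a parent allocator and records allocation statistics.
typedef struct Tracker {
    Allocator parent;
    size_t n_allocs; // successful allocations
    size_t n_frees;  // non-NULL frees
    size_t bytes;    // total bytes requested
} Tracker;

static void *tracker_alloc(void *ctx, size_t size) {
    Tracker *t = ctx;
    void *p = t->parent.alloc_fn(t->parent.ctx, size);
    if (p != NULL) {
        t->n_allocs++;
        t->bytes += size;
    }
    return p;
}

static void tracker_free(void *ctx, void *ptr) {
    Tracker *t = ctx;
    if (ptr != NULL) t->n_frees++;
    t->parent.free_fn(t->parent.ctx, ptr);
}

// Exposes the tracker through the same interface as any other allocator.
static Allocator tracker_allocator(Tracker *t) {
    return (Allocator){ .ctx = t, .alloc_fn = tracker_alloc, .free_fn = tracker_free };
}
```

A benchmark would pass `tracker_allocator(&t)` to the code under test and read `t.n_allocs` and `t.bytes` afterwards to report allocs/op and B/op. A leak check is then just comparing `n_allocs` with `n_frees`.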

Maurycy 1 months ago

GopherTree

While gopher is usually seen as a proto-web, it's really closer to FTP. It has no markup format, no links and no URLs. Files are arranged hierarchically, and can be in any format. This rigid structure allows clients to get creative with how it's displayed ... which is why I'm extremely disappointed that everyone renders gopher menus like shitty websites: You see all that text mixed into the menu? Those are informational selectors: a non-standard feature that's often used to recreate hypertext. I know this "limited web" aesthetic appeals to certain circles, but it removes the things that make the protocol interesting. It would be nice to display gopher menus like what they are, a directory tree: This makes it easy to browse collections of files, and helps avoid the Wikipedia problem: Absentmindedly clicking links until you realize it's 3 AM and you have a thousand tabs open... and that you never finished what you wanted to read in the first place. I've made the decision to hide informational selectors by default. These have two main uses: creating faux hypertext and adding ASCII art banners. ASCII art banners are simply annoying: Having one in each menu looks cute in a web browser, but having 50 copies cluttering up the directory tree is... not great. Hypertext doesn't work well. In the strict sense, looking ugly is better than not working at all — but almost everyone who does this also hosts on the web, so it's not a huge loss. The client also has a built-in text viewer, with pagination and proper word-wrap. It supports both UTF-8 and Latin-1 text encodings, but this has to be selected manually: gopher has no mechanism to indicate encoding. (Most text looks the same in both.) Bookmarks work by writing items to a locally stored gopher menu, which also serves as a "homepage" of sorts. Because it's just a file, I didn't bother implementing any advanced editing features: any text editor works fine for that. 
The bookmark code is UNIX/Linux specific, but porting should be possible. All this fits within a thousand lines of C code , the same as my ultra-minimal web browser. While arguably a browser, it was practically unusable: lacking basic features like a back button or pagination. The gopher version of the same size is complete enough to replace Lynx as my preferred client. Usage instructions can be found at the top of the source file. /projects/gopher/gophertree.c : Source and instructions /projects/tinyweb/ : 1000 line web browser https://datatracker.ietf.org/doc/html/rfc1436 : Gopher RFC
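Hiding informational selectors, as described in this post, amounts to dropping every menu line whose type character is `i`. The sketch below shows one way to do that in C. It is not taken from gophertree.c: the function name `hide_info_lines` and its buffer-based interface are assumptions made for illustration. It also stops at the lone `.` line that RFC 1436 uses to terminate a menu.

```c
#include <stddef.h>
#include <string.h>

// Copies every menu line that is not an informational ('i') selector from
// `menu` into `out`, stopping at the "." terminator line if there is one.
// Returns the number of lines kept. `out` is always NUL-terminated, and a
// line that does not fit in the remaining space is dropped.
static size_t hide_info_lines(const char *menu, char *out, size_t out_size) {
    size_t kept = 0, used = 0;
    if (out_size == 0) return 0;
    out[0] = '\0';
    while (*menu != '\0') {
        // Length of this line including its line ending, if any.
        const char *eol = strchr(menu, '\n');
        size_t len = eol ? (size_t)(eol - menu) + 1 : strlen(menu);
        // Length of the line content without trailing CR/LF.
        size_t clen = len;
        while (clen > 0 && (menu[clen - 1] == '\n' || menu[clen - 1] == '\r')) clen--;
        if (clen == 1 && menu[0] == '.') break; // end of menu
        if (clen > 0 && menu[0] != 'i' && used + len < out_size) {
            memcpy(out + used, menu, len);
            used += len;
            out[used] = '\0';
            kept++;
        }
        menu += len;
    }
    return kept;
}
```

Filtering before rendering keeps the tree view free of banner text while leaving the raw menu untouched, so a "show everything" toggle only has to skip this step.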

Maurycy 1 months ago

My ramblings are available over gopher

It has recently come to my attention that people need a thousand lines of C code to read my website. This is unacceptable. For simpler clients, my server supports gopher: The response is just a text file: it has no markup, no links and no embedded content. For navigation, gopher uses specially formatted directory-style menus: The first character on a line indicates the type of the linked resource: The type is followed by a tab-separated list containing a display name, file path, hostname and port. Lines beginning with an "i" are purely informational and do not link to anything. (This is non-standard, but widely used.) Storing metadata in links is weird to modern sensibilities, but it keeps the protocol simple. Menus are the only thing that the client has to understand: there are no URLs, no headers, no mime types — the only thing sent to the server is the selector (file path), and the only thing received is the file. ... as a bonus, this one-liner can download files: That's quite clunky, but there are lots of programs that support it. If you have Lynx installed, you should be able to just point it at this URL: ... although you will want to put in because it's not 1991 anymore [Citation Needed] I could use informational lines to replicate the web's navigation by making everything a menu — but that would be against the spirit of the thing: gopher is a document retrieval protocol, not a hypertext format. Instead, I converted all my blog posts to plain text and set up some directory-style navigation. I've actually been moving away from using inline links anyways because they have two opposing design goals: While reading, links must be normal text. When you're done, links must be distinct clickable elements. I've never been able to find a good compromise: Links are always either distracting to the reader, annoying to find/click, or both. Also, to preempt all the emails: ... what about Gemini? (The protocol, not the autocomplete from Google.) 
Gemini is the popular option for non-web publishing... but honestly, it feels like someone took HTTP and slapped markdown on top of it. This is a Gemini request... ... and this is an HTTP request: For both protocols, the server responds with metadata followed by hypertext. It's true that HTTP is more verbose, but 16 extra bytes doesn't create a noticeable difference. Unlike gopher, which has a unique navigation model and is of historical interest , Gemini is just the web but with limited features... so what's the point? I can already write websites that don't have ads or autoplaying videos, and you can already use browsers that don't support features you don't like. After stripping away all the fluff (CSS, JS, etc) the web is quite simple: a functional browser can be put together in a weekend. ... and unlike gemini, doing so won't throw out 35 years of compatibility: Someone with Chrome can read a barebones website, and someone with Lynx can read normal sites. Gemini is a technical solution to an emotional problem . Most people have a bad taste for HTTP due to the experience of visiting a commercial website. Gemini is the obvious choice for someone looking for "the web but without VC types". It doesn't make any sense when I'm looking for an interesting (and humor­ously outdated) protocol. /projects/tinyweb/ : A browser in 1000 lines of C ... /about.html#links : ... and thoughts on links for navigation. https://www.rfc-editor.org/rfc/rfc1436.html : Gopher RFC https://lynx.invisible-island.net/ : Feature complete text-based web browser
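Since menus are the only thing a gopher client has to understand, the line format described in this post (a type character followed by tab-separated display name, file path, hostname and port) is worth seeing as code. The following is a minimal sketch of a line parser in C. The `MenuItem` struct and `parse_menu_line` function are hypothetical names for illustration and are not taken from any particular client.

```c
#include <stddef.h>
#include <string.h>

// One parsed menu entry. The string fields point into the caller's line buffer.
typedef struct MenuItem {
    char type;            // e.g. '0' = text file, '1' = directory, 'i' = informational
    const char *name;     // display name
    const char *selector; // file path sent to the server
    const char *host;
    const char *port;     // kept as text, convert with strtol if needed
} MenuItem;

// Splits one menu line in place by overwriting tabs with NUL bytes.
// Returns 0 on success, or -1 if the line has fewer than four fields.
static int parse_menu_line(char *line, MenuItem *item) {
    line[strcspn(line, "\r\n")] = '\0'; // strip the line ending
    if (line[0] == '\0') return -1;
    item->type = line[0];

    char *fields[4];
    char *p = line + 1;
    for (int i = 0; i < 4; i++) {
        fields[i] = p;
        char *tab = strchr(p, '\t');
        if (tab == NULL) {
            if (i < 3) return -1; // ran out of fields early
        } else {
            *tab = '\0'; // also cuts off any extra trailing fields
            p = tab + 1;
        }
    }
    item->name = fields[0];
    item->selector = fields[1];
    item->host = fields[2];
    item->port = fields[3];
    return 0;
}
```

To follow an entry, a client connects to `host:port`, sends `selector` followed by CRLF, and reads until the server closes the connection.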

Max Bernstein 1 months ago

Using Perfetto in ZJIT

Originally published on Rails At Scale . Look! A trace of slow events in a benchmark! Hover over the image to see it get bigger. Now read on to see what the slow events are and how we got this pretty picture. The first rule of just-in-time compilers is: you stay in JIT code. The second rule of JIT is: you STAY in JIT code! When control leaves the compiled code to run in the interpreter—what the ZJIT team calls either a “side-exit” or a “deopt”, depending on who you talk to—things slow down. In a well-tuned system, this should happen pretty rarely. Right now, because we’re still bringing up the compiler and runtime system, it happens more than we would like. We’re reducing the number of exits over time. We can track our side-exit reduction progress with , which, on process exit, prints out a tidy summary of the counters for all of the bad stuff we track. It’s got side-exits. It’s got calls to C code. It’s got calls to slow-path runtime helpers. It’s got everything. Here is a chopped-up sample of stats output for the Lobsters benchmark, which is a large Rails app: (I’ve cut out significant chunks of the stats output and replaced them with because it’s overwhelming the first time you see it.) The first thing you might note is that the thing I just described as terrible for performance is happening over twelve million times . The second thing you might notice is that despite this, we’re staying in JIT code seemingly a high percentage of the time. Or are we? Is 80% high? Is a 4.5% class guard miss ratio high? What about 11% for shapes? It’s hard to say. The counters are great because they’re quick and they’re reasonably stable proxies for performance. There’s no substitute for painstaking measurements on a quiet machine but if the counter for Bad Slow Thing goes down (and others do not go up), we’re probably doing a good job. But they’re not great for building intuition. For intuition, we want more tangible feeling numbers. We want to see things. 
The third thing is that you might ask yourself “self, where are these exits coming from?” Unfortunately, counters cannot tell you that. For that, we want stack traces. This lets us know where in the guest (Ruby) code triggers an exit. Ideally, we would also want some notion of time: we would want to know not just where these events happen but also when. Are the exits happening early, at application boot? At warmup? Even during what should be steady state application time? Hard to say. So we need more tools. Thankfully, Perfetto exists. Perfetto is a system for visualizing and analyzing traces and profiles that your application generates. It has both a web UI and a command-line UI. We can emit traces for Perfetto and visualize them there. Take a look at this sample ZJIT Perfetto trace generated by running Ruby with 1 . What do you see? I see a couple arrows on the left. Arrows indicate “instant” point-in-time events. Then I see a mess of purple to the right of that until the end of the trace. Hover over an arrow. Find out that each arrow is a side-exit. Scream silently. But it’s a friendly arrow. It tells you what the side-exit reason is. If you click it, it even tells you the stack trace in the pop-up panel on the bottom. If we click a couple of them, maybe we can learn more. We can also zoom by mousing over the track, holding Ctrl, and scrolling. That will let us look closer. But there are so many… Fortunately, Perfetto also provides a SQL interface to the traces. We can write a query to aggregate all of the side exit events from the table and line them up with the topmost method from the backtrace arguments in the table: This pulls up a query box at the bottom showing us that there are a couple big hotspots: It even has a helpful option to export the results as a Markdown table so I can paste (an edited version) into this blog post: Looks like we should figure out why we’re having shape misses so much and that will clear up a lot of exits. 
(Hint: it’s because once we make our first guess about what we think the object shape will be, we don’t re-assess… yet.) This has been a taste of Perfetto. There’s probably a lot more to explore. Please join the ZJIT Zulip and let us know if you have any cool tracing or exploring tricks. Now I’ll explain how you too can use Perfetto from your system. Adding support to ZJIT was pretty straightforward. The first thing is that you’ll need some way to get trace data out of your system. We write to a file with a well-known location ( ), but you could do any number of things. Perhaps you can stream events over a socket to another process, or to a server that aggregates them, or store them internally and expose a webserver that serves them over the internet, or… anything, really. Once you have that, you need a couple lines of code to emit the data. Perfetto accepts a number of formats. For example, in his excellent blog post, Tristan Hume opens with such a simple snippet of code for logging Chromium Trace JSON-formatted events (lightly modified by me): This snippet is great. It shows, end-to-end, writing a stream of one event. It is a complete (X) event, as opposed to either: two discrete timestamped begin (B) and end (E) events that book-end something, or an instant (i) event that has no duration, or a couple other event types in the Chromium Trace Event Format doc. It was enough to get me started. Since it’s JSON, and we have a lot of side exits, the trace quickly ballooned to 8GB large for a several second benchmark. Not great. Now, part of this is our fault—we should side exit less—and part of it is just the verbosity of JSON. Thankfully, Perfetto ingests more compact binary formats, such as the Fuchsia trace format. In addition to being more compact, FXT even supports string interning. After modifying the tracer to emit FXT, we ended with closer to 100MB for the same benchmark. We can reduce further by sampling—not writing every exit to the trace, but instead every K exits (for some (probably prime) K). This is why we provide the option. Check out the trace writer implementation from the point this article was written. 
We could trace: when methods get compiled, how big the generated code is, how long each compile phase takes, when (and where) invalidation events happen, when (and where) allocations happen from JITed code, and garbage collection events. Visualizations are awesome. Get your data in the right format so you can ask the right questions easily. Thanks for Perfetto! Also, looks like visualizations are now available in Perfetto canary. Time to go make some fun histograms… This is also sampled/strobed, so not every exit is in there. This is just 1/K of them for some K that I don’t remember.
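Two of the ideas in this post, emitting instant (i) events in the Chromium Trace Event JSON format and only recording every K-th exit, are small enough to sketch. The C below is an illustration only and is not ZJIT's trace writer (which, per the post, emits the Fuchsia trace format instead). `Sampler`, `sampler_should_record` and `format_instant_event` are hypothetical names, and the event name is assumed to need no JSON escaping.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

// Records one event out of every `every` events seen (every must be > 0).
typedef struct Sampler {
    uint64_t every;
    uint64_t seen;
} Sampler;

// Returns true for the 1st, (K+1)th, (2K+1)th, ... call, where K = every.
static bool sampler_should_record(Sampler *s) {
    return (s->seen++ % s->every) == 0;
}

// Formats one thread-scoped instant event as a Chromium Trace Event JSON object.
// `ts_us` is a timestamp in microseconds. `name` must not contain characters
// that need JSON escaping (this sketch does no escaping).
// Returns what snprintf returns: the length the full string needs.
static int format_instant_event(char *buf, size_t size, const char *name,
                                uint64_t ts_us, int pid, int tid) {
    return snprintf(buf, size,
                    "{\"name\":\"%s\",\"ph\":\"i\",\"s\":\"t\","
                    "\"ts\":%llu,\"pid\":%d,\"tid\":%d}",
                    name, (unsigned long long)ts_us, pid, tid);
}
```

A writer would open the file with `[`, append each formatted event separated by commas, and call `sampler_should_record` at every exit site to decide whether to format anything at all.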

Anton Zhiyanov 1 months ago

Porting Go's io package to C

Creating a subset of Go that translates to C was never my end goal. I liked writing C code with Go, but without the standard library it felt pretty limited. So, the next logical step was to port Go's stdlib to C. Of course, this isn't something I could do all at once. So I started with the standard library packages that had the fewest dependencies, and one of them was the package. This post is about how that went. io package • Slices • Multiple returns • Errors • Interfaces • Type assertion • Specialized readers • Copy • Wrapping up is one of the core Go packages. It introduces the concepts of readers and writers , which are also common in other programming languages. In Go, a reader is anything that can read some raw data (bytes) from a source into a slice: A writer is anything that can take some raw data from a slice and write it to a destination: The package defines many other interfaces, like and , as well as combinations like and . It also provides several functions, the most well-known being , which copies all data from a source (represented by a reader) to a destination (represented by a writer): C, of course, doesn't have interfaces. But before I get into that, I had to make several other design decisions. In general, a slice is a linear container that holds N elements of type T. Typically, a slice is a view of some underlying data. In Go, a slice consists of a pointer to a block of allocated memory, a length (the number of elements in the slice), and a capacity (the total number of elements that can fit in the backing memory before the runtime needs to re-allocate): Interfaces in the package work with fixed-length slices (readers and writers should never append to a slice), and they only use byte slices. So, the simplest way to represent this in C could be: But since I needed a general-purpose slice type, I decided to do it the Go way instead: Plus a bound-checking helper to access slice elements: Usage example: So far, so good. 
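For readers who want something concrete, here is one way the Go-style slice (pointer, length, capacity) and a bounds-checking accessor described above could look in C. This is a sketch under assumptions, not the post's actual definitions: the names `Slice`, `slice_at` and `SLICE_AT` are hypothetical, and the choice to abort on an out-of-range index is mine.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

// A Go-style slice: a view over backing memory with a length and a capacity.
typedef struct Slice {
    void *ptr;  // first element
    size_t len; // number of elements in the slice
    size_t cap; // number of elements that fit in the backing memory
} Slice;

// Returns a pointer to the i-th element, or aborts if i is out of range.
static void *slice_at(Slice s, size_t elem_size, size_t i) {
    if (i >= s.len) {
        fprintf(stderr, "index out of range [%zu] with length %zu\n", i, s.len);
        abort();
    }
    return (char *)s.ptr + i * elem_size;
}

// Typed accessor that yields an lvalue, so it works for both reads and writes.
#define SLICE_AT(T, s, i) (*(T *)slice_at((s), sizeof(T), (i)))
```

With this, `SLICE_AT(int, s, 2)` reads the third element and `SLICE_AT(int, s, 0) = 11` writes the first. Indexing at or past `len` stops the program instead of reading stray memory, even when `cap` is larger.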
Let's look at the method again: It returns two values: an and an . C functions can only return one value, so I needed to figure out how to handle this. The classic approach would be to pass output parameters by pointer, like or . But that doesn't compose well and looks nothing like Go. Instead, I went with a result struct: The union can store any primitive type, as well as strings, slices, and pointers. The type combines a value with an error. So, our method (let's assume it's just a regular function for now): Translates to: And the caller can access the result like this: For the error type itself, I went with a simple pointer to an immutable string: Plus a constructor macro: I wanted to avoid heap allocations as much as possible, so decided not to support dynamic errors. Only sentinel errors are used, and they're defined at the file level like this: Errors are compared by pointer identity ( ), not by string content — just like sentinel errors in Go. A error is a pointer. This keeps error handling cheap and straightforward. This was the big one. In Go, an interface is a type that specifies a set of methods. Any concrete type that implements those methods satisfies the interface — no explicit declaration needed. In C, there's no such mechanism. For interfaces, I decided to use "fat" structs with function pointers. That way, Go's : Becomes an struct in C: The pointer holds the concrete value, and each method becomes a function pointer that takes as its first argument. This is less efficient than using a static method table, especially if the interface has a lot of methods, but it's simpler. So I decided it was good enough for the first version. Now functions can work with interfaces without knowing the specific implementation: Calling a method on the interface just goes through the function pointer: Go's interface is more than just a value wrapper with a method table. 
It also stores type information about the value it holds: Since the runtime knows the exact type inside the interface, it can try to "upgrade" the interface (for example, a regular ) to another interface (like ) using a type assertion : The last thing I wanted to do was reinvent Go's dynamic type system in C, so dropping this feature was an easy decision. There's another kind of type assertion, though — when we unwrap the interface to get the value of a specific type: And this kind of assertion is quite possible in C. All we have to do is compare function pointers: If two different types happened to share the same method implementation, this would break. In practice, each concrete type has its own methods, so the function pointer serves as a reliable type tag. After I decided on the interface approach, porting the actual types was pretty easy. For example, wraps a reader and stops with EOF after reading N bytes: The logic is straightforward: if there are no bytes left, return EOF. Otherwise, if the buffer is bigger than the remaining size, shorten it. Then, call the underlying reader, and decrease the remaining size. Here's what the ported C code looks like: A bit more verbose, but nothing special. The multiple return values, the interface call with , and the slice handling are all implemented as described in previous sections. is where everything comes together. Here's the simplified Go version: In Go, allocates its buffer on the heap with . I could take a similar approach in C — make take an allocator and use it to create the buffer like this: But since this is just a temporary buffer that only exists during the function call, I decided stack allocation was a better choice: allocates memory on a stack with a bounds-checking macro that wraps C's . It moves the stack pointer and gives you a chunk of memory that's automatically freed when the function returns. 
People often avoid using because it can cause a stack overflow, but using a bounds-checking wrapper fixes this issue. Another common concern with is that it's not block-scoped — the memory stays allocated until the function exits. However, since we only allocate once, this isn't a problem. Here's the simplified C version of : Here, you can see all the parts from this post working together: a function accepting interfaces, slices passed to interface methods, a result type wrapping multiple return values, error sentinels compared by identity, and a stack-allocated buffer used for the copy. Porting Go's package to C meant solving a few problems: representing slices, handling multiple return values, modeling errors, and implementing interfaces using function pointers. None of this needed anything fancy — just structs, unions, functions, and some macros. The resulting C code is more verbose than Go, but it's structurally similar, easy enough to read, and this approach should work well for other Go packages too. The package isn't very useful on its own — it mainly defines interfaces and doesn't provide concrete implementations. So, the next two packages to port were naturally and — I'll talk about those in the next post. In the meantime, if you'd like to write Go that translates to C — with no runtime and manual memory management — I invite you to try Solod . The package is included, of course.
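To tie the pieces of this post together, the sketch below combines a sentinel error compared by pointer, a two-field result struct, a "fat" interface struct with a function pointer, and a limited reader that wraps another reader. It is a simplified illustration with hypothetical names (`Error`, `Bytes`, `IntResult`, `Reader`, `MemReader`, `LimitedReader`), not the ported library's code, and it uses a plain pointer-plus-length byte view rather than a full slice.

```c
#include <stddef.h>
#include <string.h>

// Sentinel errors are pointers to immutable strings, compared by identity.
typedef const char *Error;
static const char ErrEOF_text[] = "EOF";
static const Error ErrEOF = ErrEOF_text;

typedef struct Bytes { unsigned char *ptr; size_t len; } Bytes;

// Result of a read: byte count plus error (NULL means no error).
typedef struct IntResult { size_t n; Error err; } IntResult;

// "Fat" interface: the concrete value plus one function pointer per method.
typedef struct Reader {
    void *self;
    IntResult (*read)(void *self, Bytes p);
} Reader;

// Concrete reader over a block of memory.
typedef struct MemReader { const unsigned char *data; size_t len; size_t pos; } MemReader;

static IntResult mem_read(void *self, Bytes p) {
    MemReader *r = self;
    if (r->pos >= r->len) return (IntResult){ 0, ErrEOF };
    size_t n = r->len - r->pos;
    if (n > p.len) n = p.len;
    memcpy(p.ptr, r->data + r->pos, n);
    r->pos += n;
    return (IntResult){ n, NULL };
}

static Reader mem_reader(MemReader *r) { return (Reader){ r, mem_read }; }

// Wraps a reader and reports EOF after `n` bytes have been read.
typedef struct LimitedReader { Reader r; size_t n; } LimitedReader;

static IntResult limited_read(void *self, Bytes p) {
    LimitedReader *l = self;
    if (l->n == 0) return (IntResult){ 0, ErrEOF };
    if (p.len > l->n) p.len = l->n;          // shorten the buffer
    IntResult res = l->r.read(l->r.self, p); // call through the interface
    l->n -= res.n;
    return res;
}

static Reader limited_reader(LimitedReader *l) { return (Reader){ l, limited_read }; }
```

The function pointer also doubles as a type tag, as the post describes: `r.read == limited_read` tells you that `r.self` can be cast to `LimitedReader *`.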

Anton Zhiyanov 1 months ago

Solod: Go can be a better C

I'm working on a new programming language named Solod ( So ). It's a strict subset of Go that translates to C, without hidden memory allocations and with source-level interop. Highlights: So supports structs, methods, interfaces, slices, multiple returns, and defer. To keep things simple, there are no channels, goroutines, closures, or generics. So is for systems programming in C, but with Go's syntax, type safety, and tooling. Hello world • Language tour • Compatibility • Design decisions • FAQ • Final thoughts This Go code in a file : Translates to a header file : Plus an implementation file : In terms of features, So is an intersection between Go and C, making it one of the simplest C-like languages out there — on par with Hare. And since So is a strict subset of Go, you already know it if you know Go. It's pretty handy if you don't want to learn another syntax. Let's briefly go over the language features and see how they translate to C. Variables • Strings • Arrays • Slices • Maps • If/else and for • Functions • Multiple returns • Structs • Methods • Interfaces • Enums • Errors • Defer • C interop • Packages So supports basic Go types and variable declarations: is translated to ( ), to ( ), and to ( ). is not treated as an interface. Instead, it's translated to . This makes handling pointers much easier and removes the need for . is translated to (for pointer types). Strings are represented as type in C: All standard string operations are supported, including indexing, slicing, and iterating with a for-range loop. Converting a string to a byte slice and back is a zero-copy operation: Converting a string to a rune slice and back allocates on the stack with : There's a stdlib package for heap-allocated strings and various string operations. Arrays are represented as plain C arrays ( ): on arrays is emitted as compile-time constant. Slicing an array produces a . 
Slices are represented as type in C: All standard slice operations are supported, including indexing, slicing, and iterating with a for-range loop. As in Go, a slice is a value type. Unlike in Go, a nil slice and an empty slice are the same thing: allocates a fixed amount of memory on the stack ( ). only works up to the initial capacity and panics if it's exceeded. There's no automatic reallocation; use the stdlib package for heap allocation and dynamic arrays. Maps are fixed-size and stack-allocated, backed by parallel key/value arrays with linear search. They are pointer-based reference types, represented as in C. No delete, no resize. Only use maps when you have a small, fixed number of key-value pairs. For anything else, use heap-allocated maps from the package (planned). Most of the standard map operations are supported, including getting/setting values and iterating with a for-range loop: As in Go, a map is a pointer type. A map emits as in C. If-else and for come in all shapes and sizes, just like in Go. Standard if-else with chaining: Init statement (scoped to the if block): Traditional for loop: While-style loop: Range over an integer: Regular functions translate to C naturally: Named function types become typedefs: Exported functions (capitalized) become public C symbols prefixed with the package name ( ). Unexported functions are . Variadic functions use the standard syntax and translate to passing a slice: Function literals (anonymous functions and closures) are not supported. So supports two-value multiple returns in two patterns: and . Both cases translate to C type: Named return values are not supported. Structs translate to C naturally: works with types and values: Methods are defined on struct types with pointer or value receivers: Pointer receivers pass in C and cast to the struct pointer. 
Value receivers pass the struct by value, so modifications operate on a copy: Calling methods on values and pointers emits pointers or values as necessary: Methods on named primitive types are also supported. Interfaces in So are like Go interfaces, but they don't include runtime type information. Interface declarations list the required methods: In C, an interface is a struct with a pointer and function pointers for each method (less efficient than using a static method table, but simpler; this might change in the future): Just as in Go, a concrete type implements an interface by providing the necessary methods: Passing a concrete type to functions that accept interfaces: Type assertion works for concrete types ( ), but not for interfaces ( ). Type switch is not supported. Empty interfaces ( and ) are translated to . So supports typed constant groups as enums: Each constant is emitted as a C : is supported for integer-typed constants: Iota values are evaluated at compile time and translated to integer literals: Errors use the type (a pointer): So only supports sentinel errors, which are defined at the package level using (implemented as compiler built-in): Errors are compared using . This is an O(1) operation (compares pointers, not strings): Dynamic errors ( ), local error variables ( inside functions), and error wrapping are not supported. schedules a function or method call to run at the end of the enclosing scope. The scope can be either a function (as in Go): Or a bare block (unlike Go): Deferred calls are emitted inline (before returns, panics, and scope end) in LIFO order: Defer is not supported inside other scopes like or . Include a C header file with : Declare an external C type (excluded from emission) with : Declare an external C function (no body or ): When calling extern functions, and arguments are automatically decayed to their C equivalents: string literals become raw C strings ( ), string values become , and slices become raw pointers. 
This makes interop cleaner: The decay behavior can be turned off with the flag: The package includes helpers for converting C pointers back to So string and slice types. The package is also available and is implemented as compiler built-ins. Each Go package is translated into a single + pair, regardless of how many files it contains. Multiple files in the same package are merged into one file, separated by comments. Exported symbols (capitalized names) are prefixed with the package name: Unexported symbols (lowercase names) keep their original names and are marked : Exported symbols are declared in the file (with for variables). Unexported symbols only appear in the file. Importing a So package translates to a C : Calling imported symbols uses the package prefix: That's it for the language tour! So generates C11 code that relies on several GCC/Clang extensions: You can use GCC, Clang, or to compile the transpiled C code. MSVC is not supported. Supported operating systems: Linux, macOS, and Windows (partial support). So is highly opinionated. Simplicity is key . Fewer features are always better. Every new feature is strongly discouraged by default and should be added only if there are very convincing real-world use cases to support it. This applies to the standard library too — So tries to export as little of Go's stdlib API as possible while still remaining highly useful for real-world use cases. No heap allocations are allowed in language built-ins (like maps, slices, new, or append). Heap allocations are allowed in the standard library, but they must clearly state when an allocation happens and who owns the allocated data. Fast and easy C interop . Even though So uses Go syntax, it's basically C with its own standard library. Calling C from So, and So from C, should always be simple to write and run efficiently. The So standard library (translated to C) should be easy to add to any C project. Readability . 
There are several languages that claim they can transpile to readable C code. Unfortunately, the C code they generate is usually unreadable or barely readable at best. So isn't perfect in this area either (though it's arguably better than others), but it aims to produce C code that's as readable as possible. Go compatibility . So code is valid Go code. No exceptions. Raw performance . You can definitely write C code by hand that runs faster than code produced by So. Also, some features in So, like interfaces, are currently implemented in a way that's not very efficient, mainly to keep things simple. Hiding C entirely . So is a cleaner way to write C, not a replacement for it. You should know C to use So effectively. Go feature parity . Less is more. Iterators aren't coming, and neither are generic methods. I have heard these several times, so it's worth answering. Why not Rust/Zig/Odin/other language? Because I like C and Go. Why not TinyGo? TinyGo is lightweight, but it still has a garbage collector, a runtime, and aims to support all Go features. What I'm after is something even simpler, with no runtime at all, source-level C interop, and eventually, Go's standard library ported to plain C so it can be used in regular C projects. How does So handle memory? Everything is stack-allocated by default. There's no garbage collector or reference counting. The standard library provides explicit heap allocation in the package when you need it. Is it safe? So itself has few safeguards other than the default Go type checking. It will panic on out-of-bounds array access, but it won't stop you from returning a dangling pointer or forgetting to free allocated memory. Most memory-related problems can be caught with AddressSanitizer in modern compilers, so I recommend enabling it during development by adding to your . Can I use So code from C (and vice versa)? Yes. So compiles to plain C, therefore calling So from C is just calling C from C. 
Calling C from So is equally straightforward. Can I compile existing Go packages with So? Not really. Go uses automatic memory management, while So uses manual memory management. So also supports far fewer features than Go. Neither Go's standard library nor third-party packages will work with So without changes. How stable is this? Not for production at the moment. Where's the standard library? There is a growing set of high-level packages ( , , , ...). There are also low-level packages that wrap the libc API ( , , , ...). Check the links below for more details. Even though So isn't ready for production yet, I encourage you to try it out on a hobby project or just keep an eye on it if you like the concept. Further reading: Go in, C out. You write regular Go code and get readable C11 as output. Zero runtime. No garbage collection, no reference counting, no hidden allocations. Everything is stack-allocated by default. Heap is opt-in through the standard library. Native C interop. Call C from So and So from C — no CGO, no overhead. Go tooling works out of the box — syntax highlighting, LSP, linting and "go test". Binary literals ( ) in generated code. Statement expressions ( ) in macros. for package-level initialization. for local type inference in generated code. for type inference in generic macros. for and other dynamic stack allocations. Installation and usage So by example Language description Stdlib description Source code
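The fixed-size maps from the language tour (stack-allocated, backed by parallel key/value arrays with linear search, no delete, no resize) are simple enough to sketch by hand. The C below is not what So emits. It is a hypothetical illustration of the data structure, with made-up names (`Map`, `MAP_CAP`, `map_get`, `map_set`), string keys and int values chosen for the example, and a `false` return when the map is full, which is my own choice.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { MAP_CAP = 8 }; // fixed capacity, chosen for the example

// Parallel arrays: keys[i] maps to vals[i] for i < len.
typedef struct Map {
    const char *keys[MAP_CAP];
    int vals[MAP_CAP];
    size_t len;
} Map;

// Linear search. Stores the value in *out and returns true if the key exists.
static bool map_get(const Map *m, const char *key, int *out) {
    for (size_t i = 0; i < m->len; i++) {
        if (strcmp(m->keys[i], key) == 0) {
            *out = m->vals[i];
            return true;
        }
    }
    return false;
}

// Overwrites an existing key or appends a new one.
// Returns false if the key is new and the map is already full.
static bool map_set(Map *m, const char *key, int val) {
    for (size_t i = 0; i < m->len; i++) {
        if (strcmp(m->keys[i], key) == 0) {
            m->vals[i] = val;
            return true;
        }
    }
    if (m->len == MAP_CAP) return false;
    m->keys[m->len] = key;
    m->vals[m->len] = val;
    m->len++;
    return true;
}
```

Lookups are O(n), which is why the tour recommends this kind of map only for a small, fixed number of key-value pairs. A `Map m = {0};` declared inside a function lives entirely on the stack.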

daniel.haxx.se 2 months ago

Dependency tracking is hard

curl and libcurl are written in C. They are rather low-level components present in many software systems. They are typically not part of any ecosystem at all. They’re just a tool and a library. In lots of places on the web when you mention an Open Source project, you will also get the option to mention in which ecosystem it belongs. npm, go, rust, python etc. There are easily at least a dozen well-known and large ecosystems. curl is not part of any of those. Recently there’s been a push for PURLs ( Package URLs ), for example when describing your specific package in a CVE. A package URL only works when the component is part of an ecosystem. curl is not. We can’t specify curl or libcurl using a PURL. SBOM generators and related scanners use package managers to generate lists of used components and their dependencies. As a result, these tools quite frequently just miss and ignore libcurl. It’s not listed by the package managers. It’s just in there, ready to be used. Like magic. It is similarly hard for these tools to figure out that curl in turn also depends on and uses other libraries. At build-time you select which – but as we in the curl project primarily just ship tarballs with source code we cannot tell anyone what dependencies their builds have. The additional libraries libcurl itself uses are all similarly outside of the standard ecosystems. Part of the explanation for this is also that libcurl and curl are often shipped bundled with the operating system, or sometimes perceived to be part of the OS. Most graphs, SBOM tools and dependency trackers therefore stop at the binding or system that uses curl or libcurl, but without including curl or libcurl. The layer above so to speak. This makes it hard to figure out exactly how many components and how much software is depending on libcurl. A perfect way to illustrate the problem is to check GitHub and see how many among its vast collection of many millions of repositories depend on curl.
After all, curl is installed in some thirty billion installations, so clearly it is used a lot. (Most of them being libcurl of course.) GitHub lists one dependency for curl. Repositories that depend on curl/curl: one. (Screenshot taken on March 9, 2026.) What makes this even more amusing is that it looks like this single dependent repository ( Pupibent/spire ) lists curl as a dependency by mistake.

(think) 2 months ago

Building Emacs Major Modes with TreeSitter: Lessons Learned

Over the past year I’ve been spending a lot of time building TreeSitter-powered major modes for Emacs – clojure-ts-mode (as co-maintainer), neocaml (from scratch), and asciidoc-mode (also from scratch). Between the three projects I’ve accumulated enough battle scars to write about the experience. This post distills the key lessons for anyone thinking about writing a TreeSitter-based major mode, or curious about what it’s actually like. Before TreeSitter, Emacs font-locking was done with regular expressions and indentation was handled by ad-hoc engines (SMIE, custom indent functions, or pure regex heuristics). This works, but it has well-known problems: Regex-based font-locking is fragile. Regexes can’t parse nested structures, so they either under-match (missing valid code) or over-match (highlighting inside strings and comments). Every edge case is another regex, and the patterns become increasingly unreadable over time. Indentation engines are complex. SMIE (the generic indentation engine for non-TreeSitter modes) requires defining operator precedence grammars for the language, which is hard to get right. Custom indentation functions tend to grow into large, brittle state machines. Tuareg’s indentation code, for example, is thousands of lines long. TreeSitter changes the game because you get a full, incremental, error-tolerant syntax tree for free. Font-locking becomes “match this AST pattern, apply this face”: And indentation becomes “if the parent node is X, indent by Y”: The rules are declarative, composable, and much easier to reason about than regex chains. In practice, ’s entire font-lock and indentation logic fits in about 350 lines of Elisp. The equivalent in tuareg is spread across thousands of lines. That’s the real selling point: simpler, more maintainable code that handles more edge cases correctly . That said, TreeSitter in Emacs is not a silver bullet. Here’s what I ran into. 
TreeSitter grammars are written by different authors with different philosophies. The tree-sitter-ocaml grammar provides a rich, detailed AST with named fields. The tree-sitter-clojure grammar, by contrast, deliberately keeps things minimal – it only models syntax, not semantics, because Clojure’s macro system makes static semantic analysis unreliable. 1 This means font-locking forms in Clojure requires predicate matching on symbol text, while in OCaml you can directly match nodes with named fields. To illustrate: here’s how you’d fontify a function definition in OCaml, where the grammar gives you rich named fields: And here’s the equivalent in Clojure, where the grammar only gives you lists of symbols and you need predicate matching: You can’t learn “how to write TreeSitter queries” generically – you need to learn each grammar individually. The best tool for this is (to visualize the full parse tree) and (to see the node at point). Use them constantly. You’re dependent on someone else providing the grammar, and quality is all over the map. The OCaml grammar is mature and well-maintained – it’s hosted under the official tree-sitter GitHub org. The Clojure grammar is small and stable by design. But not every language is so lucky. asciidoc-mode uses a third-party AsciiDoc grammar that employs a dual-parser architecture – one parser for block-level structure (headings, lists, code blocks) and another for inline formatting (bold, italic, links). This is the same approach used by Emacs’s built-in , and it makes sense for markup languages where block and inline syntax are largely independent. The problem is that the two parsers run independently on the same text, and they can disagree . The inline parser misinterprets and list markers as emphasis delimiters, creating spurious bold spans that swallow subsequent inline content. 
The workaround is to use on all block-level font-lock rules so they win over the incorrect inline faces: This doesn’t fix inline elements consumed by the spurious emphasis – that requires an upstream grammar fix. When you hit grammar-level issues like this, you either fix them yourself (which means diving into the grammar’s JavaScript source and C toolchain) or you live with workarounds. Either way, it’s a reminder that your mode is only as good as the grammar underneath it. Getting the font-locking right in was probably the most challenging part of all three projects, precisely because of these grammar quirks. I also ran into a subtle behavior: the default font-lock mode ( ) skips an entire captured range if any position within it already has a face. So if you capture a parent node like and a child was already fontified, the whole thing gets skipped silently. The fix is to capture specific child nodes instead: These issues took a lot of trial and error to diagnose. The lesson: budget extra time for font-locking when working with less mature grammars . Grammars evolve, and breaking changes happen. switched from the stable grammar to the experimental branch because the stable version had metadata nodes as children of other nodes, which caused and to behave incorrectly. The experimental grammar makes metadata standalone nodes, fixing the navigation issues but requiring all queries to be updated. pins to v0.24.0 of the OCaml grammar. If you don’t pin versions, a grammar update can silently break your font-locking or indentation. The takeaway: always pin your grammar version , and include a mechanism to detect outdated grammars. tests a query that changed between versions to detect incompatible grammars at startup. Users shouldn’t have to manually clone repos and compile C code to use your mode. Both and include grammar recipes: On first use, the mode checks and offers to install missing grammars via . 
This works, but requires a C compiler and Git on the user’s machine, which is not ideal. 2 The TreeSitter support in Emacs has been improving steadily, but each version has its quirks: Emacs 29 introduced TreeSitter support but lacked several APIs. For instance, (used for structured navigation) doesn’t exist – you need a fallback: Emacs 30 added , sentence navigation, and better indentation support. But it also had a bug in offsets ( #77848 ) that broke embedded parsers, and another in that required to disable its TreeSitter-aware version. Emacs 31 has a bug in where an off-by-one error causes to leave ` *)` behind on multi-line OCaml comments. I had to skip the affected test with a version check: The lesson: test your mode against multiple Emacs versions , and be prepared to write version-specific workarounds. CI that runs against Emacs 29, 30, and snapshot is essential. Most TreeSitter grammars ship with query files for syntax highlighting ( ) and indentation ( ). Editors like Neovim and Helix use these directly. Emacs doesn’t – you have to manually translate the patterns into and calls in Elisp. This is tedious and error-prone. For example, here’s a rule from the OCaml grammar’s : And here’s the Elisp equivalent you’d write for Emacs: The query syntax is nearly identical, but you have to wrap everything in calls, map upstream capture names ( ) to Emacs face names ( ), assign features, and manage behavior. You end up maintaining a parallel set of queries that can drift from upstream. Emacs 31 will introduce which will make it possible to use files for font-locking, which should help significantly. But for now, you’re hand-coding everything. When a face isn’t being applied where you expect: TreeSitter modes define four levels of font-locking via , and the default level in Emacs is 3. It’s tempting to pile everything into levels 1–3 so users see maximum highlighting out of the box, but resist the urge. 
When every token on the screen has a different color, code starts looking like a Christmas tree and the important things – keywords, definitions, types – stop standing out. Less is more here. Here’s how distributes features across levels: And follows the same philosophy: The pattern is the same: essentials first, progressively more detail at higher levels. This way the default experience (level 3) is clean and readable, and users who want the full rainbow can bump to 4. Better yet, they can use to cherry-pick individual features regardless of level: This gives users fine-grained control without requiring mode authors to anticipate every preference. Indentation issues are harder to diagnose because they depend on tree structure, rule ordering, and anchor resolution: Remember that rule order matters for indentation too – the first matching rule wins. A typical set of rules reads top to bottom from most specific to most general: Watch out for the empty-line problem : when the cursor is on a blank line, TreeSitter has no node at point. The indentation engine falls back to the root node as the parent, which typically matches the top-level rule and gives column 0. In neocaml I solved this with a rule that looks at the previous line’s last token to decide indentation: This is the single most important piece of advice. Font-lock and indentation are easy to break accidentally, and manual testing doesn’t scale. Both projects use Buttercup (a BDD testing framework for Emacs) with custom test macros. Font-lock tests insert code into a buffer, run , and assert that specific character ranges have the expected face: Indentation tests insert code, run , and assert the result matches the expected indentation: Integration tests load real source files and verify that both font-locking and indentation survive on the full file. This catches interactions between rules that unit tests miss. has 200+ automated tests and has even more. 
Investing in test infrastructure early pays off enormously – I can refactor indentation rules with confidence because the suite catches regressions immediately. When I became the maintainer of clojure-mode many years ago, I really struggled with making changes. There were no font-lock or indentation tests, so every change was a leap of faith – you’d fix one thing and break three others without knowing until someone filed a bug report. I spent years working on a testing approach I was happy with, alongside many great contributors, and the return on investment was massive. The same approach – almost the same test macros – carried over directly to when we built the TreeSitter version. And later I reused the pattern again in and . One investment in testing infrastructure, four projects benefiting from it. I know that automated tests, for whatever reason, never gained much traction in the Emacs community. Many popular packages have no tests at all. I hope stories like this convince you that investing in tests is really important and pays off – not just for the project where you write them, but for every project you build after. This one is specific to but applies broadly: compiling TreeSitter queries at runtime is expensive. If you’re building queries dynamically (e.g. with called at mode init time), consider pre-compiling them as values. This made a noticeable difference in ’s startup time. The Emacs community has settled on a suffix convention for TreeSitter-based modes: , , , and so on. This makes sense when both a legacy mode and a TreeSitter mode coexist in Emacs core – users need to choose between them. But I think the convention is being applied too broadly, and I’m afraid the resulting name fragmentation will haunt the community for years. For new packages that don’t have a legacy counterpart, the suffix is unnecessary. I named my packages (not ) and (not ) because there was no prior or to disambiguate from. 
The infix is an implementation detail that shouldn’t leak into the user-facing name. Will we rename everything again when TreeSitter becomes the default and the non-TS variants are removed? Be bolder with naming. If you’re building something new, give it a name that makes sense on its own merits, not one that encodes the parsing technology in the package name. I think the full transition to TreeSitter in the Emacs community will take 3–5 years, optimistically. There are hundreds of major modes out there, many maintained by a single person in their spare time. Converting a mode from regex to TreeSitter isn’t just a mechanical translation – you need to understand the grammar, rewrite font-lock and indentation rules, handle version compatibility, and build a new test suite. That’s a lot of work. Interestingly, this might be one area where agentic coding tools can genuinely help. The structure of TreeSitter-based major modes is fairly uniform: grammar recipes, font-lock rules, indentation rules, navigation settings, imenu. If you give an AI agent a grammar and a reference to a high-quality mode like , it could probably scaffold a reasonable new mode fairly quickly. The hard parts – debugging grammar quirks, handling edge cases, getting indentation just right – would still need human attention, but the boilerplate could be automated. Still, knowing the Emacs community, I wouldn’t be surprised if a full migration never actually completes. Many old-school modes work perfectly fine, their maintainers have no interest in TreeSitter, and “if it ain’t broke, don’t fix it” is a powerful force. And that’s okay – diversity of approaches is part of what makes Emacs Emacs. TreeSitter is genuinely great for building Emacs major modes. The code is simpler, the results are more accurate, and incremental parsing means everything stays fast even on large files. I wouldn’t go back to regex-based font-locking willingly. But it’s not magical. 
Grammars are inconsistent across languages, the Emacs APIs are still maturing, you can’t reuse files (yet), and you’ll hit version-specific bugs that require tedious workarounds. The testing story is better than with regex modes – tree structures are more predictable than regex matches – but you still need a solid test suite to avoid regressions. If you’re thinking about writing a TreeSitter-based major mode, do it. The ecosystem needs more of them, and the experience of working with syntax trees instead of regexes is genuinely enjoyable. Just go in with realistic expectations, pin your grammar versions, test against multiple Emacs releases, and build your test suite early. Anyways, I wish there was an article like this one when I was starting out with and , so there you have it. I hope that the lessons I’ve learned along the way will help build better modes with TreeSitter down the road. That’s all I have for you today. Keep hacking!

1. See the excellent scope discussion in the tree-sitter-clojure repo for the rationale.
2. There’s ongoing discussion in the Emacs community about distributing pre-compiled grammar binaries, but nothing concrete yet.

Font-lock debugging tips (for when a face isn’t being applied where you expect):

Use to verify the node type at point matches your query.
Set to to see which rules are firing.
Check the font-lock feature level – your rule might be in level 4 while the user has the default level 3. The features are assigned to levels via .
Remember that rule order matters . Without , an earlier rule that already fontified a region will prevent later rules from applying. This can be intentional (e.g. builtin types at level 3 take precedence over generic types) or a source of bugs.

Indentation debugging tips:

Set to – this logs which rule matched for each line, what anchor was computed, and the final column.
Use to understand the parent chain. The key question is always: “what is the parent node, and which rule matches it?”

<antirez> 2 months ago

Implementing a clean room Z80 / ZX Spectrum emulator with Claude Code

Anthropic recently released a blog post describing an experiment in which the latest version of Opus, 4.6, was instructed to write a C compiler in Rust, in a “clean room” setup. The experiment methodology left me dubious about the kind of point they wanted to make. Why not provide the agent with the ISA documentation? Why Rust? Writing a C compiler is exactly a giant graph manipulation exercise: the kind of program that is harder to write in Rust. Also, in a clean room experiment, the agent should have access to all the information about well established computer science advances related to optimizing compilers: there are a number of papers that could be easily synthesized in a number of markdown files. SSA, register allocation, instruction selection and scheduling. Those things needed to be researched *first*, as a prerequisite, and the implementation would still be “clean room”.

Not allowing the agent to access the Internet, nor any other compiler source code, was certainly the right call. Less understandable is the almost-zero steering principle, but this is coherent with a certain kind of experiment, if the goal was showcasing the completely autonomous writing of a large project. Yet, we all know that this is not how coding agents are used in practice, most of the time. Whoever uses coding agents extensively knows very well how, even without ever touching the code, a few hints here and there completely change the quality of the result.

# The Z80 experiment

I thought it was time to try a similar experiment myself, one that would take one or two hours at max, and that was compatible with my Claude Code Max plan: I decided to write a Z80 emulator, and then a ZX Spectrum emulator (and even more, a CP/M emulator, see later) in a condition that I believe makes more sense as a “clean room” setup. The result can be found here: https://github.com/antirez/ZOT.

# The process I used

1. I wrote a markdown file with the specification of what I wanted to do.
Just English, high level ideas about the scope of the Z80 emulator to implement. I said things like: it should execute a whole instruction at a time, not a single clock step, since this emulator must be runnable on things like an RP2350 or similarly limited hardware. The emulator should correctly track the clock cycles elapsed (and I specified we could use this feature later in order to implement the ZX Spectrum contention with ULA during memory accesses), provide memory access callbacks, and should emulate all the known official and unofficial instructions of the Z80. For the Spectrum implementation, performed as a successive step, I provided much more information in the markdown file, like, the kind of rendering I wanted in the RGB buffer, and how it needed to be optional so that embedded devices could render the scanlines directly as they transferred them to the ST77xx display (or similar), how it should be possible to interact with the I/O port to set the EAR bit to simulate cassette loading in a very authentic way, and many other desiderata I had about the emulator. This file also included the rules that the agent needed to follow, like: * Accessing the internet is prohibited, but you can use the specification and test vectors files I added inside ./z80-specs. * Code should be simple and clean, never over-complicate things. * Each solid progress should be committed in the git repository. * Before committing, you should test that what you produced is high quality and that it works. * Write a detailed test suite as you add more features. The test must be re-executed at every major change. * Code should be very well commented: things must be explained in terms that even people not well versed with certain Z80 or Spectrum internals details should understand. * Never stop for prompting, the user is away from the keyboard. * At the end of this file, create a work in progress log, where you note what you already did, what is missing. Always update this log. 
* Read this file again after each context compaction. 2. Then, I started a Claude Code session, and asked it to fetch all the useful documentation on the internet about the Z80 (later I did this for the Spectrum as well), and to extract only the useful factual information into markdown files. I also provided the binary files for the most ambitious test vectors for the Z80, the ZX Spectrum ROM, and a few other binaries that could be used to test if the emulator actually executed the code correctly. Once all this information was collected (it is part of the repository, so you can inspect what was produced) I completely removed the Claude Code session in order to make sure that no contamination with source code seen during the search was possible. 3. I started a new session, and asked it to check the specification markdown file, and to check all the documentation available, and start implementing the Z80 emulator. The rules were to never access the Internet for any reason (I supervised the agent while it was implementing the code, to make sure this didn’t happen), to never search the disk for similar source code, as this was a “clean room” implementation. 4. For the Z80 implementation, I did zero steering. For the Spectrum implementation I used extensive steering for implementing the TAP loading. More about my feedback to the agent later in this post. 5. As a final step, I copied the repository in /tmp, removed the “.git” repository files completely, started a new Claude Code (and Codex) session and claimed that the implementation was likely stolen or too strongly inspired from somebody else's work. The task was to check with all the major Z80 implementations if there was evidence of theft. The agents (both Codex and Claude Code), after extensive search, were not able to find any evidence of copyright issues. 
The only similar parts were about well established emulation patterns and things that are Z80 specific and can’t be made differently, the implementation looked distinct from all the other implementations in a significant way. # Results Claude Code worked for 20 or 30 minutes in total, and produced a Z80 emulator that was able to pass ZEXDOC and ZEXALL, in 1200 lines of very readable and well commented C code (1800 lines with comments and blank spaces). The agent was prompted zero times during the implementation, it acted absolutely alone. It never accessed the internet, and the process it used to implement the emulator was of continuous testing, interacting with the CP/M binaries implementing the ZEXDOC and ZEXALL, writing just the CP/M syscalls needed to produce the output on the screen. Multiple times it also used the Spectrum ROM and other binaries that were available, or binaries it created from scratch to see if the emulator was working correctly. In short: the implementation was performed in a very similar way to how a human programmer would do it, and not outputting a complete implementation from scratch “uncompressing” it from the weights. Instead, different classes of instructions were implemented incrementally, and there were bugs that were fixed via integration tests, debugging sessions, dumps, printf calls, and so forth. # Next step: the ZX Spectrum I repeated the process again. I instructed the documentation gathering session very accurately about the kind of details I wanted it to search on the internet, especially the ULA interactions with RAM access, the keyboard mapping, the I/O port, how the cassette tape worked and the kind of PWM encoding used, and how it was encoded into TAP or TZX files. 
As I said, this time the design notes were extensive since I wanted this emulator to be specifically designed for embedded systems, so only 48k emulation, optional framebuffer rendering, very little additional memory used (no big lookup tables for ULA/Z80 access contention), ROM not copied in the RAM to avoid using additional 16k of memory, but just referenced during the initialization (so we have just a copy in the executable), and so forth. The agent was able to create a very detailed documentation about the ZX Spectrum internals. I provided a few .z80 images of games, so that it could test the emulator in a real setup with real software. Again, I removed the session and started fresh. The agent started working and ended 10 minutes later, following a process that really fascinates me, and that probably you know very well: the fact is, you see the agent working using a number of diverse skills. It is expert in everything programming related, so as it was implementing the emulator, it could immediately write a detailed instrumentation code to “look” at what the Z80 was doing step by step, and how this changed the Spectrum emulation state. In this respect, I believe automatic programming to be already super-human, not in the sense it is currently capable of producing code that humans can’t produce, but in the concurrent usage of different programming languages, system programming techniques, DSP stuff, operating system tricks, math, and everything needed to reach the result in the most immediate way. When it was done, I asked it to write a simple SDL based integration example. The emulator was immediately able to run the Jetpac game without issues, with working sound, and very little CPU usage even on my slow Dell Linux machine (8% usage of a single core, including SDL rendering). Once the basic stuff was working, I wanted to load TAP files directly, simulating cassette loading. 
This was the first time the agent missed a few things, specifically about the timing the Spectrum loading routines expected, and here we are in the territory where LLMs start to perform less efficiently: they can’t easily run the SDL emulator and see the border changing as data is received and so forth. I asked Claude Code to do a refactoring so that zx_tick() could be called directly and was not part of zx_frame(), and to make zx_frame() a trivial wrapper. This way it was much simpler to sync EAR with what it expected, without callbacks or the wrong abstractions that it had implemented. After this change, a few minutes later the emulator could load a TAP file emulating the cassette without problems. This is how it works now:

    do { zx_set_ear(zx, tzx_update(&tape, zx->cpu.clocks)); } while (!zx_tick(zx, 0));

I continued prompting Claude Code in order to make the key bindings more useful, plus a few more things.

# CP/M

One thing that I found really interesting was the ability of the LLM to inspect the COM files for the ZEXDOC / ZEXALL tests for the Z80, easily spot the CP/M syscalls that were used (a total of three), and implement them for the extended Z80 test (executed by make fulltest). So, at this point, why not implement a full CP/M environment? Same process again, same good result in a matter of minutes. This time I interacted with it a bit more for the VT100 / ADM3 terminal escape conversions, reported things not working in WordStar initially, and in a few minutes everything I tested was working well enough (but there are still fixes to do, like simulating a 2 MHz clock: right now it runs at full speed, making CP/M games impossible to use).

# What is the lesson here?

The obvious lesson is: always provide your agents with design hints and extensive documentation about what they are going to do. Such documentation can be obtained by the agent itself.
And, also, make sure the agent has a markdown file with the rules of how to perform the coding tasks, and a trace of what it is doing, that is updated and read again quite often. But those tricks, I believe, are quite clear to everybody that has worked extensively with automatic programming in recent months. To think in terms of “what a human would need” is often the best bet, plus a few LLM-specific things, like the forgetting issue after context compaction, the continuous ability to verify it is on the right track, and so forth.

Returning to the Anthropic compiler attempt: one of the steps that the agent failed was the one most strongly related to the idea of memorization of what is in the pretraining set: the assembler. With extensive documentation, I can’t see any way Claude Code (and, even more, GPT5.3-codex, which is in my experience, for complex stuff, more capable) could fail at producing a working assembler, since it is quite a mechanical process. This is, I think, in contradiction with the idea that LLMs are memorizing the whole training set and uncompressing what they have seen. LLMs can memorize certain over-represented documents and code, but while they can extract such verbatim parts of the code if prompted to do so, they don’t have a copy of everything they saw during training, nor do they spontaneously emit copies of already seen code in their normal operation. We mostly ask LLMs to create work that requires assembling different knowledge they possess, and the result is normally something that uses known techniques and patterns, but that is new code, not constituting a copy of some pre-existing code.
It is worth noting, too, that humans often follow a less rigorous process compared to the clean room rules detailed in this blog post, that is: humans often download the code of different implementations related to what they are trying to accomplish, read them carefully, then try to avoid copying stuff verbatim, but often they take strong inspiration. This is a process that I find perfectly acceptable, but it is important to keep in mind what happens in the reality of code written by humans. After all, information technology evolved so fast also thanks to this massive cross-pollination effect. For all the above reasons, when I implement code using automatic programming, I don’t have problems releasing it MIT licensed, like I did with this Z80 project. In turn, this code base will constitute quality input for training the next LLMs, including open weights ones.

# Next steps

To make my experiment more compelling, one should try to implement a Z80 and ZX Spectrum emulator without providing any documentation to the agent, and then compare the results of the two implementations. I didn’t find the time to do it, but it could be quite informative.
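The tick-level tape synchronization described earlier in this post can be illustrated with a small self-contained sketch. Everything here is a hypothetical stand-in (the `Machine` and `Tape` types, `machine_tick`, `tape_level`, and the fixed 4 T-state step are assumptions for illustration, not the project's real API); only the loop shape mirrors the one quoted in the post: refresh the EAR level from the tape before every step, and stop when the step completes a frame.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the project's real types. */
#define FRAME_TSTATES 69888u /* T-states per 48K Spectrum frame (224 * 312) */
#define TICK_TSTATES 4u      /* assumed cost of one emulated step */

typedef struct {
    uint64_t clocks;     /* T-states elapsed since power-on */
    uint64_t next_frame; /* clock value at which the current frame ends */
    int ear;             /* current level of the EAR input bit */
    unsigned ear_edges;  /* number of level changes seen so far */
} Machine;

typedef struct {
    uint64_t half_period; /* T-states between level flips (square wave) */
} Tape;

void machine_init(Machine *m) {
    m->clocks = 0;
    m->next_frame = FRAME_TSTATES;
    m->ear = 0;
    m->ear_edges = 0;
}

/* Tape signal level at a given clock: a plain square wave. */
int tape_level(const Tape *t, uint64_t clocks) {
    assert(t->half_period > 0);
    return (int)((clocks / t->half_period) & 1u);
}

/* Latch the EAR level and count transitions. */
void machine_set_ear(Machine *m, int level) {
    if (level != m->ear) m->ear_edges++;
    m->ear = level;
}

/* Advance one step; return 1 when the step completes a frame. */
int machine_tick(Machine *m) {
    m->clocks += TICK_TSTATES;
    if (m->clocks >= m->next_frame) {
        m->next_frame += FRAME_TSTATES;
        return 1;
    }
    return 0;
}

/* Same shape as the loop in the post: refresh EAR before every step. */
void run_frame(Machine *m, const Tape *t) {
    do {
        machine_set_ear(m, tape_level(t, m->clocks));
    } while (!machine_tick(m));
}
```

Because the tape is sampled against the CPU clock on every step, the loading routine sees edges at the T-state where they occur, with no callbacks involved. With a half period of 2168 T-states, one frame of this toy model observes 32 level changes.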

Anton Zhiyanov 3 months ago

Allocators from C to Zig

An allocator is a tool that reserves memory (typically on the heap) so a program can store its data structures there. Many C programs use the standard libc allocator, or at best, let you switch it out for another one like jemalloc or mimalloc. Unlike C, modern systems languages usually treat allocators as first-class citizens. Let's look at how they handle allocation and then create a C allocator following their approach. Rust • Zig • Odin • C3 • Hare • C • Final thoughts Rust is one of the older languages we'll be looking at, and it handles memory allocation in a more traditional way. Right now, it uses a global allocator, but there's an experimental Allocator API implemented behind a feature flag (issue #32838 ). We'll set the experimental API aside and focus on the stable one. The documentation begins with a clear statement: In a given program, the standard library has one "global" memory allocator that is used for example by and . Followed by a vague one: Currently the default global allocator is unspecified. It doesn't mean that a Rust program will abort an allocation, of course. In practice, Rust uses the system allocator as the global default (but the Rust developers don't want to commit to this, hence the "unspecified" note): The global allocator interface is defined by the trait in the module. It requires the implementor to provide two essential methods — and , and provides two more based on them — and : The struct describes a piece of memory we want to allocate — its size in bytes and alignment: Memory alignment Alignment restricts where a piece of data can start in memory. The memory address for the data has to be a multiple of a certain number, which is always a power of 2. Alignment depends on the type of data: CPUs are designed to read "aligned" memory efficiently. 
For example, if you read a 4-byte integer starting at address 0x03 (which is unaligned), the CPU has to do two memory reads — one for the first byte and another for the other three bytes — and then combine them. But if the integer starts at address 0x04 (which is aligned), the CPU can read all four bytes at once. Aligned memory is also needed for vectorized CPU operations (SIMD), where one processor instruction handles a group of values at once instead of just one. The compiler knows the size and alignment for each type, so we can use the constructor or helper functions to create a valid layout: Don't be surprised that a takes up 32 bytes. In Rust, the type can grow, so it stores a data pointer, a length, and a capacity (3 × 8 = 24 bytes). There's also 1 byte for the boolean and 7 bytes of padding (because of 8-byte alignment), making a total of 32 bytes. is the default memory allocator provided by the operating system. The exact implementation depends on the platform . It implements the trait and is used as the global allocator by default, but the documentation does not guarantee this (remember the "unspecified" note?). If you want to explicitly set as the global allocator, you can use the attribute: You can also set a custom allocator as global, like in this example: To use the global allocator directly, call the and functions: In practice, people rarely use or directly. Instead, they work with types like , or that handle allocation for them: The allocator doesn't abort if it can't allocate memory; instead, it returns (which is exactly what recommends): The documentation recommends using the function to signal out-of-memory errors. It immediately aborts the process, or panics if the binary isn't linked to the standard library. Unlike the low-level function, types like or call if allocation fails, so the program usually aborts if it runs out of memory: Allocator API • Memory allocation APIs Memory management in Zig is explicit. 
There is no default global allocator, and any function that needs to allocate memory accepts an allocator as a separate parameter. This makes the code a bit more verbose, but it matches Zig's goal of giving programmers as much control and transparency as possible. An allocator in Zig is a struct with an opaque self-pointer and a method table with four methods: Unlike Rust's allocator methods, which take a raw pointer and a size as arguments, Zig's allocator methods take a slice of bytes ( ) — a type that combines both a pointer and a length. Another interesting difference is the optional parameter, which is the first return address in the allocation call stack. Some allocators, like the , use it to keep track of which function requested memory. This helps with debugging issues related to memory allocation. Just like in Rust, allocator methods don't return errors. Instead, and return if they fail. Zig also provides type-safe wrappers that you can use instead of calling the allocator methods directly: Unlike the allocator methods, these allocation functions return an error if they fail. If a function or method allocates memory, it expects the developer to provide an allocator instance: Zig's standard library includes several built-in allocators in the namespace. asks the operating system for entire pages of memory, each allocation is a syscall: allocates memory into a fixed buffer and doesn't make any heap allocations: wraps a child allocator and allows you to allocate many times and only free once: The call frees all memory. Individual calls are no-ops. (aka ) is a safe allocator that can prevent double-free, use-after-free and can detect leaks: is a general-purpose thread-safe allocator designed for maximum performance on multithreaded machines: is a wrapper around the libc allocator: Zig doesn't panic or abort when it can't allocate memory. 
An allocation failure is just a regular error that you're expected to handle: Allocators • std.mem.Allocator • std.heap Odin supports explicit allocators, but, unlike Zig, it's not the only option. In Odin, every scope has an implicit variable that provides a default allocator: If you don't pass an allocator to a function, it uses the one currently set in the context. An allocator in Odin is a struct with an opaque self-pointer and a single function pointer: Unlike other languages, Odin's allocator uses a single procedure for all allocation tasks. The specific action — like allocating, resizing, or freeing memory — is decided by the parameter. The allocation procedure returns the allocated memory (for and operations) and an error ( on success). Odin provides low-level wrapper functions in the package that call the allocator procedure using a specific mode: There are also type-safe builtins like / (for a single object) and / (for multiple objects) that you can use instead of the low-level interface: By default, all builtins use the context allocator, but you can pass a custom allocator as an optional parameter: To use a different allocator for a specific block of code, you can reassign it in the context: Odin's provides two different allocators: When using the temp allocator, you only need a single call to clear all the allocated memory. Odin's standard library includes several allocators, found in the and packages. The procedure returns a general-purpose allocator: uses a single backing buffer for allocations, allowing you to allocate many times and only free once: detects leaks and invalid memory access, similar to in Zig: There are also others, such as or . Like Zig, Odin doesn't panic or abort when it can't allocate memory. Instead, it returns an error code as the second return value: Allocators • base:runtime • core:mem Like Zig and Odin, C3 supports explicit allocators. Like Odin, C3 provides two default allocators: heap and temp. 
An allocator in C3 is a interface with an additional option of zeroing or not zeroing the allocated memory: Unlike Zig and Odin, the and methods don't take the (old) size as a parameter, neither directly like Odin nor through a slice like Zig. This makes it a bit harder to create custom allocators because the allocator has to keep track of the size along with the allocated memory. On the other hand, this approach makes C interop easier (if you use the default C3 allocator): data allocated in C can be freed in C3 without needing to pass the size parameter from the C code. Like in Odin, allocator methods return an error if they fail. C3 provides low-level wrapper macros in the module that call allocator methods: These either return an error (the -suffix macros) or abort if they fail. There are also functions and macros with similar names in the module that use the global allocator instance: If a function or method allocates memory, it often expects the developer to provide an allocator instance: C3 provides two thread-local allocator instances: There are functions and macros in the module that use the temporary allocator: The macro releases all temporary allocations when leaving the scope: Some types, like or , use the temp allocator by default if they are not initialized: C3's standard library includes several built-in allocators, found in the module. is a wrapper around libc's malloc/free: uses a single backing buffer for allocations, allowing you to allocate many times and only free once: detects leaks and invalid memory access: There are also others, such as or . Like Zig and Odin, C3 can return an error in case of allocation failure: C3 can also abort in case of allocation failure: Since the functions and macros in the module use instead of , it looks like aborting on failure is the preferred approach. Memory Handling • core::mem::allocator • core::mem Unlike other languages, Hare doesn't support explicit allocators. 
The standard library has multiple allocator implementations, but only one of them is used at runtime. Hare's compiler expects the runtime to provide and implementations: The programmer isn't supposed to access them directly (although it's possible by importing and calling or ). Instead, Hare uses them to provide higher-level allocation helpers. Hare offers two high-level allocation helpers that use the global allocator internally: and . can allocate individual objects. It takes a value, not a type: can also allocate slices if you provide a second parameter (the number of items): works correctly with both pointers to single objects (like ) and slices (like ). Hare's standard library has three built-in memory allocators: The allocator that's actually used is selected at compile time. Like other languages, Hare returns an error in case of allocation failure: You can abort on error with : Or propagate the error with : Dynamic memory allocation • malloc.ha Many C programs use the standard libc allocator, or at most, let you swap it out for another one using macros: Or using a simple setter: While this might work for switching the libc allocator to jemalloc or mimalloc, it's not very flexible. For example, trying to implement an arena allocator with this kind of API is almost impossible. Now that we've seen the modern allocator design in Zig, Odin, and C3 — let's try building something similar in C. There are a lot of small choices to make, and I'm going with what I personally prefer. I'm not saying this is the only way to design an allocator — it's just one way out of many. Our allocator should return an error instead of if it fails, so we'll need an error enum: The allocation function needs to return either a tagged union (value | error) or a tuple (value, error). Since C doesn't have these built in, let's use a custom tuple type: The next step is the allocator interface. 
I think Odin's approach of using a single function makes the implementation more complicated than it needs to be, so let's create separate methods like Zig does: This approach to interface design is explained in detail in a separate post: Interfaces in C . Zig uses byte slices ( ) instead of raw memory pointers. We could make our own byte slice type, but I don't see any real advantage to doing that in C — it would just mean more type casting. So let's keep it simple and stick with like our ancestors did. Now let's create generic and wrappers: I'm taking for granted here to keep things simple. A more robust implementation should properly check if it is available or pass the type to directly. We can even create a separate pair of helpers for collections: We could use some macro tricks to make and work for both a single object and a collection. But let's not do that — I prefer to avoid heavy-magic macros in this post. As for the custom allocators, let's start with a libc wrapper. It's not particularly interesting, since it ignores most of the parameters, but still: Usage example: Now let's use that field to implement an arena allocator backed by a fixed-size buffer: Usage example: As shown in the examples above, the allocation method returns an error if something goes wrong. While checking for errors might not be as convenient as it is in Zig or Odin, it's still pretty straightforward: Here's an informal table comparing allocation APIs in the languages we've discussed: In Zig, you always have to specify the allocator. In Odin, passing an allocator is optional. In C3, some functions require you to pass an allocator, while others just use the global one. In Hare, there's a single global allocator. As we've seen, there's nothing magical about the allocators used in modern languages. While they're definitely more ergonomic and safe than C, there's nothing stopping us from using the same techniques in plain C. on Unix platforms; on Windows; : alignment = 1. 
Can start at any address (0, 1, 2, 3...). : alignment = 4. Must start at addresses divisible by 4 (0, 4, 8, 12...). : alignment = 8. Must start at addresses divisible by 8 (0, 8, 16...). is for general-purpose allocations. It uses the operating system's heap allocator. is for short-lived allocations. It uses a scratch allocator (a kind of growing arena). is for general-purpose allocations. It uses the operating system's heap allocator (typically a libc wrapper). is for short-lived allocations. It uses an arena allocator. The default allocator is based on the algorithm from the Verified sequential malloc/free paper. The libc allocator uses the operating system's malloc and free functions from libc. The debug allocator uses a simple mmap-based method for memory allocation.
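The C design walked through in this post (an error enum, a (value, error) tuple, an interface made of an opaque self pointer plus a method table, and an arena backed by a fixed-size buffer) might look roughly like the sketch below. All type and function names here are hypothetical and chosen for illustration; they are not the post's original listings, and the interface is trimmed to two methods to keep it short. The arena rounds each allocation up to the requested power-of-two alignment, as described in the alignment note.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; a sketch, not the post's original code. */
typedef enum { ERR_NONE = 0, ERR_OUT_OF_MEMORY } Error;

/* (value, error) tuple returned by allocation calls. */
typedef struct {
    void *ptr;
    Error err;
} AllocResult;

/* Method table: one function pointer per operation. */
typedef struct {
    AllocResult (*alloc)(void *self, size_t size, size_t align);
    void (*dealloc)(void *self, void *ptr, size_t size, size_t align);
} AllocatorVTable;

/* Interface value: opaque self pointer plus method table. */
typedef struct {
    void *self;
    const AllocatorVTable *vt;
} Allocator;

/* Arena over a caller-provided fixed buffer. */
typedef struct {
    uint8_t *buf;
    size_t cap;
    size_t used;
} Arena;

static AllocResult arena_alloc(void *self, size_t size, size_t align) {
    Arena *a = self;
    assert(align != 0 && (align & (align - 1)) == 0); /* power of two */
    uintptr_t base = (uintptr_t)a->buf;
    /* Round the next free address up to a multiple of align. */
    uintptr_t aligned = (base + a->used + (align - 1)) & ~(uintptr_t)(align - 1);
    size_t offset = (size_t)(aligned - base);
    if (offset > a->cap || size > a->cap - offset) {
        return (AllocResult){ NULL, ERR_OUT_OF_MEMORY };
    }
    a->used = offset + size;
    return (AllocResult){ a->buf + offset, ERR_NONE };
}

/* Individual frees are no-ops; memory is reclaimed all at once. */
static void arena_dealloc(void *self, void *ptr, size_t size, size_t align) {
    (void)self; (void)ptr; (void)size; (void)align;
}

static const AllocatorVTable ARENA_VTABLE = { arena_alloc, arena_dealloc };

/* Wrap an arena in the generic interface. */
Allocator arena_allocator(Arena *a) {
    return (Allocator){ a, &ARENA_VTABLE };
}

/* Release everything allocated from the arena in one step. */
void arena_reset(Arena *a) {
    a->used = 0;
}
```

A caller would declare a buffer, build an `Arena` over it, obtain an `Allocator` with `arena_allocator`, and then check `err` on every `AllocResult` before touching `ptr`. Passing the size and alignment to `dealloc` follows the Zig and Odin style described above, even though this arena ignores them.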


Rewriting pycparser with the help of an LLM

pycparser is my most widely used open source project (with ~20M daily downloads from PyPI [1] ). It's a pure-Python parser for the C programming language, producing ASTs inspired by Python's own . Until very recently, it's been using PLY: Python Lex-Yacc for the core parsing. In this post, I'll describe how I collaborated with an LLM coding agent (Codex) to help me rewrite pycparser to use a hand-written recursive-descent parser and remove the dependency on PLY. This has been an interesting experience and the post contains lots of information and is therefore quite long; if you're just interested in the final result, check out the latest code of pycparser - the main branch already has the new implementation. While pycparser has been working well overall, there were a number of nagging issues that persisted over years. I began working on pycparser in 2008, and back then using a YACC-based approach for parsing a whole language like C seemed like a no-brainer to me. Isn't this what everyone does when writing a serious parser? Besides, the K&R2 book famously carries the entire grammar of the C99 language in an appendix - so it seemed like a simple matter of translating that to PLY-yacc syntax. And indeed, it wasn't too hard, though there definitely were some complications in building the ASTs for declarations (C's gnarliest part ). Shortly after completing pycparser, I got more and more interested in compilation and started learning about the different kinds of parsers more seriously. Over time, I grew convinced that recursive descent is the way to go - producing parsers that are easier to understand and maintain (and are often faster!). It all ties in to the benefits of dependencies in software projects as a function of effort . Using parser generators is a heavy conceptual dependency: it's really nice when you have to churn out many parsers for small languages. 
But when you have to maintain a single, very complex parser, as part of a large project - the benefits quickly dissipate and you're left with a substantial dependency that you constantly grapple with. And then there are the usual problems with dependencies; dependencies get abandoned, and they may also develop security issues. Sometimes, both of these become true. Many years ago, pycparser forked and started vendoring its own version of PLY. This was part of transitioning pycparser to a dual Python 2/3 code base when PLY was slower to adapt. I believe this was the right decision, since PLY "just worked" and I didn't have to deal with active (and very tedious in the Python ecosystem, where packaging tools are replaced faster than dirty socks) dependency management. A couple of weeks ago this issue was opened for pycparser. It turns out that some old PLY code triggers security checks used by some Linux distributions; while this code was fixed in a later commit of PLY, PLY itself was apparently abandoned and archived in late 2025. And guess what? That happened in the middle of a large rewrite of the package, so re-vendoring the pre-archiving commit seemed like a risky proposition. On the issue it was suggested that "hopefully the dependent packages move on to a non-abandoned parser or implement their own"; I originally laughed this idea off, but then it got me thinking... which is what this post is all about. The original K&R2 grammar for C99 had - famously - a single shift-reduce conflict having to do with dangling elses belonging to the most recent if statement. And indeed, other than the famous lexer hack used to deal with C's type name / ID ambiguity, pycparser only had this single shift-reduce conflict. But things got more complicated. Over the years, features were added that weren't strictly in the standard but were supported by all the industrial compilers. 
The more advanced C11 and C23 standards weren't beholden to the promises of conflict-free YACC parsing (since almost no industrial-strength compilers use YACC at this point), so all caution went out of the window. The latest (PLY-based) release of pycparser has many reduce-reduce conflicts [2] ; these are a severe maintenance hazard because it means the parsing rules essentially have to be tie-broken by order of appearance in the code. This is very brittle; pycparser has only managed to maintain its stability and quality through its comprehensive test suite. Over time, it became harder and harder to extend, because YACC parsing rules have all kinds of spooky-action-at-a-distance effects. The straw that broke the camel's back was this PR which again proposed to increase the number of reduce-reduce conflicts [3] . This - again - prompted me to think "what if I just dump YACC and switch to a hand-written recursive descent parser", and here we are. None of the challenges described above are new; I've been pondering them for many years now, and yet biting the bullet and rewriting the parser didn't feel like something I'd like to get into. By my private estimates it'd take at least a week of deep heads-down work to port the gritty 2000 lines of YACC grammar rules to a recursive descent parser [4] . Moreover, it wouldn't be a particularly fun project either - I didn't feel like I'd learn much new and my interests have shifted away from this project. In short, the Potential well was just too deep. I've definitely noticed the improvement in capabilities of LLM coding agents in the past few months, and many reputable people online rave about using them for increasingly larger projects. That said, would an LLM agent really be able to accomplish such a complex project on its own? This isn't just a toy, it's thousands of lines of dense parsing code. What gave me hope is the concept of conformance suites mentioned by Simon Willison . 
Agents seem to do well when there's a very clear and rigid goal function - such as a large, high-coverage conformance test suite. And pycparser has a very extensive one. Over 2500 lines of test code parsing various C snippets to ASTs with expected results, grown over a decade and a half of real issues and bugs reported by users. I figured the LLM can either succeed or fail and throw its hands up in despair, but it's quite unlikely to produce a wrong port that would still pass all the tests. So I set it to run. I fired up Codex in pycparser's repository, and wrote this prompt just to make sure it understands me and can run the tests: Codex figured it out (I gave it the exact command, after all!); my next prompt was the real thing [5] : Here Codex went to work and churned for over an hour. Having never observed an agent work for nearly this long, I kind of assumed it had gone off the rails and would fail sooner or later. So I was rather surprised and skeptical when it eventually came back with: It took me a while to poke around the code and run it until I was convinced - it had actually done it! It wrote a new recursive descent parser with only ancillary dependencies on PLY, and that parser passed the test suite. After a few more prompts, we've removed the ancillary dependencies and made the structure clearer. I hadn't looked too deeply into code quality at this point, but at least on the functional level - it succeeded. This was very impressive! A change like the one described above is impossible to code-review as one PR in any meaningful way; so I used a different strategy. Before embarking on this path, I created a new branch and once Codex finished the initial rewrite, I committed this change, knowing that I will review it in detail, piece-by-piece later on. Even though coding agents have their own notion of history and can "revert" certain changes, I felt much safer relying on Git. 
In the worst case if all of this goes south, I can nuke the branch and it's as if nothing ever happened. I was determined to only merge this branch onto main once I was fully satisfied with the code. In what follows, I had to git reset several times when I didn't like the direction in which Codex was going. In hindsight, doing this work in a branch was absolutely the right choice. Once I've sufficiently convinced myself that the new parser is actually working, I used Codex to similarly rewrite the lexer and get rid of the PLY dependency entirely, deleting it from the repository. Then, I started looking more deeply into code quality - reading the code created by Codex and trying to wrap my head around it. And - oh my - this was quite the journey. Much has been written about the code produced by agents, and much of it seems to be true. Maybe it's a setting I'm missing (I'm not using my own custom AGENTS.md yet, for instance), but Codex seems to be that eager programmer that wants to get from A to B whatever the cost. Readability, minimalism and code clarity are very much secondary goals. Using raise...except for control flow? Yep. Abusing Python's weak typing (like having None , false and other values all mean different things for a given variable)? For sure. Spreading the logic of a complex function all over the place instead of putting all the key parts in a single switch statement? You bet. Moreover, the agent is hilariously lazy . More than once I had to convince it to do something it initially said is impossible, and even insisted again in follow-up messages. The anthropomorphization here is mildly concerning, to be honest. I could never imagine I would be writing something like the following to a computer, and yet - here we are: "Remember how we moved X to Y before? You can do it again for Z, definitely. Just try". My process was to see how I can instruct Codex to fix things, and intervene myself (by rewriting code) as little as possible. 
I've mostly succeeded in this, and did maybe 20% of the work myself. My branch grew dozens of commits, falling into roughly these categories: Interestingly, after doing (3), the agent was often more effective in giving the code a "fresh look" and succeeding in either (1) or (2). Eventually, after many hours spent in this process, I was reasonably pleased with the code. It's far from perfect, of course, but taking the essential complexities into account, it's something I could see myself maintaining (with or without the help of an agent). I'm sure I'll find more ways to improve it in the future, but I have a reasonable degree of confidence that this will be doable. It passes all the tests, so I've been able to release a new version (3.00) without major issues so far. The only issue I've discovered is that some of CFFI's tests are overly precise about the phrasing of errors reported by pycparser; this was an easy fix . The new parser is also faster, by about 30% based on my benchmarks! This is typical of recursive descent when compared with YACC-generated parsers, in my experience. After reviewing the initial rewrite of the lexer, I've spent a while instructing Codex on how to make it faster, and it worked reasonably well. While working on this, it became quite obvious that static typing would make the process easier. LLM coding agents really benefit from closed loops with strict guardrails (e.g. a test suite to pass), and type-annotations act as such. For example, had pycparser already been type annotated, Codex would probably not have overloaded values to multiple types (like None vs. False vs. others). In a followup, I asked Codex to type-annotate pycparser (running checks using ty ), and this was also a back-and-forth because the process exposed some issues that needed to be refactored. Time will tell, but hopefully it will make further changes in the project simpler for the agent. 
Based on this experience, I'd bet that coding agents will be somewhat more effective in strongly typed languages like Go, TypeScript and especially Rust. Overall, this project has been a really good experience, and I'm impressed with what modern LLM coding agents can do! While there's no reason to expect that progress in this domain will stop, even if it does - these are already very useful tools that can significantly improve programmer productivity. Could I have done this myself, without an agent's help? Sure. But it would have taken me much longer, assuming that I could even muster the will and concentration to engage in this project. I estimate it would take me at least a week of full-time work (so 30-40 hours) spread over who knows how long to accomplish. With Codex, I put in an order of magnitude less work into this (around 4-5 hours, I'd estimate) and I'm happy with the result. It was also fun . At least in one sense, my professional life can be described as the pursuit of focus, deep work and flow . It's not easy for me to get into this state, but when I do I'm highly productive and find it very enjoyable. Agents really help me here. When I know I need to write some code and it's hard to get started, asking an agent to write a prototype is a great catalyst for my motivation. Hence the meme at the beginning of the post. One can't avoid a nagging question - does the quality of the code produced by agents even matter? Clearly, the agents themselves can understand it (if not today's agent, then at least next year's). Why worry about future maintainability if the agent can maintain it? In other words, does it make sense to just go full vibe-coding? This is a fair question, and one I don't have an answer to. Right now, for projects I maintain and stand behind , it seems obvious to me that the code should be fully understandable and accepted by me, and the agent is just a tool helping me get to that state more efficiently. 
It's hard to say what the future holds here; it's going to be interesting, for sure.

There was also the lexer to consider, but this seemed like a much simpler job. My impression is that in the early days of computing, lex gained prominence because of strong regexp support which wasn't very common yet. These days, with excellent regexp libraries existing for pretty much every language, the added value of lex over a custom regexp-based lexer isn't very high. That said, it wouldn't make much sense to embark on a journey to rewrite just the lexer; the dependency on PLY would still remain, and besides, PLY's lexer and parser are designed to work well together. So it wouldn't help me much without tackling the parser beast.

1. The code in X is too complex; why can't we do Y instead?
2. The use of X is needlessly convoluted; change Y to Z, and T to V in all instances.
3. The code in X is unclear; please add a detailed comment - with examples - to explain what it does.
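As a small illustration of why the dangling-else shift-reduce conflict mentioned in this post is a non-issue for a hand-written recursive descent parser, here is a toy sketch in C. The grammar, the one-character tokens, and every name in it are hypothetical and have nothing to do with pycparser's actual code: `i` stands for `if (cond)`, `e` for `else`, and `s` for a simple statement. The innermost active call checks for an `else` first, so the `else` binds to the nearest `if` without any tie-breaking rule.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical toy grammar (not pycparser's):
 *   stmt := 's' | 'i' stmt [ 'e' stmt ]
 */
typedef struct {
    const char *src; /* token stream, one character per token */
    size_t pos;      /* index of the next token */
    char out[128];   /* bracketed rendering of the parse tree */
    size_t len;      /* bytes used in out, excluding the terminator */
} Parser;

/* Append text to the rendering, silently dropping it if it would overflow. */
static void emit(Parser *p, const char *text) {
    size_t n = strlen(text);
    if (p->len + n < sizeof p->out) {
        memcpy(p->out + p->len, text, n);
        p->len += n;
        p->out[p->len] = '\0';
    }
}

/* Parse one statement; returns 1 on success, 0 on a syntax error. */
static int parse_stmt(Parser *p) {
    char tok = p->src[p->pos];
    if (tok == 's') {
        p->pos++;
        emit(p, "s");
        return 1;
    }
    if (tok == 'i') {
        p->pos++;
        emit(p, "(if ");
        if (!parse_stmt(p)) return 0;
        /* Greedy check: the innermost active call sees the 'e' first. */
        if (p->src[p->pos] == 'e') {
            p->pos++;
            emit(p, " else ");
            if (!parse_stmt(p)) return 0;
        }
        emit(p, ")");
        return 1;
    }
    return 0;
}

/* Parse a whole token string and copy the rendering into out. */
int parse(const char *src, char *out, size_t cap) {
    assert(src != NULL && out != NULL);
    Parser p = { .src = src };
    int ok = parse_stmt(&p) && p.src[p.pos] == '\0';
    snprintf(out, cap, "%s", p.out);
    return ok;
}
```

Parsing the token string `iises` (that is, `if if s else s`) renders as `(if (if s else s))`: the `else` attaches to the inner `if`, which is the resolution C requires and the one a YACC grammar gets only by preferring shift over reduce.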
