Neil Madden 3 days ago

Fluent Visitors: revisiting a classic design pattern

It’s been a while since I’ve written a pure programming post. I was recently implementing a specialist collection class that contained items of a number of different types. I needed to be able to iterate over the collection performing different actions depending on the specific type. There are lots of different ways to do this, depending on the school of programming you prefer. In this article, I’m going to take a look at a classic “Gang of Four” design pattern: the Visitor Pattern. I’ll describe how it works, provide some modern spins on it, and compare it to other ways of implementing the same functionality. Hopefully even the most die-hard anti-OO/patterns reader will come away thinking that there’s something worth knowing here after all. (Design Patterns? In this economy?)

The example I’ll use in this post is a simple arithmetic expression language. It’s the kind of boring and not very realistic example you see all the time in textbooks, but the more realistic examples I have to hand have too many weird details, so this’ll do. I’m going to write everything in Java 25. Java because, after Smalltalk, it’s probably the language most associated with design patterns. And Java 25 specifically because it makes this example really nice to write.

OK, our expression language just has floating-point numbers, addition, and multiplication. So we start by defining datatypes to represent these. If you’re familiar with a functional programming language, this is effectively the same as a datatype definition in such a language.

Now we want to define a bunch of different operations over these expressions: evaluation, pretty-printing, maybe type-checking or some other kinds of static analysis. We could just directly expose the Expression sub-classes and let each operation directly traverse the structure using pattern matching. For example, we can add an eval() method directly to the expression class that evaluates the expression via a pattern-matching switch. (Incidentally, isn’t this great? It’s taken a long time, but I really like how clean this is in modern Java.) We can then try out an example and see what it gives us.

There are some issues with this though. Firstly, there’s no encapsulation: if we want to change the way expressions are represented then we have to change eval() and any other function that’s been defined in this way. Secondly, although it’s straightforward for this small expression language, there can be a lot of duplication in operations over a complex structure, dealing with details of traversing that structure. The Visitor Pattern solves both of these issues, as we’ll show now.

The basic Visitor Pattern involves creating an interface with callback methods for each type of object you might encounter when traversing a structure. For our example, it looks like the Visitor interface in the sketch a little further down. A few things to note here:

- We use a generic type parameter <T> to allow operations to return different types of results depending on what they do. We’ll see how this works in a bit.
- In keeping with the idea of encapsulating details, we use the more abstract type rather than the concrete type we’re using under the hood. (We could also have done this before, but I’m doing it here to illustrate that the Visitor interface doesn’t have to exactly represent the underlying data structures.)

The next part of the pattern is to add a method to the Expression class, which then traverses the data structure, invoking the callbacks as appropriate. In the traditional implementation, this method is implemented on each concrete sub-class using a technique known as “double-dispatch”. For example, we could add an implementation of it to the Add class that calls the corresponding callback on the visitor. This technique is still sometimes useful, but I find it’s often clearer to just inline all that into the top-level Expression implementation (as a default method implementation, because Expression is an interface):
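A minimal sketch of the shape being described — a sealed Expression type with one record per case, a Visitor interface with one callback per case, and the traversal as a generic default method. The record and method names (Num, visitNum, and so on) are illustrative guesses, not the original listing:

```java
// Sketch only: names such as Num and visitNum are assumptions reconstructed from the prose.
sealed interface Expression permits Num, Add, Mul {

    // One callback per kind of expression. Add/Mul receive the already-visited
    // results of their sub-expressions rather than the raw Expression objects.
    interface Visitor<T> {
        T visitNum(double value);
        T visitAdd(T left, T right);
        T visitMul(T left, T right);
    }

    // The traversal, written once as a default method on the interface instead
    // of via double-dispatch on every concrete subclass.
    default <T> T accept(Visitor<T> visitor) {
        return switch (this) {
            case Num(double value) -> visitor.visitNum(value);
            case Add(Expression l, Expression r) ->
                    visitor.visitAdd(l.accept(visitor), r.accept(visitor));
            case Mul(Expression l, Expression r) ->
                    visitor.visitMul(l.accept(visitor), r.accept(visitor));
        };
    }
}

record Num(double value) implements Expression {}
record Add(Expression left, Expression right) implements Expression {}
record Mul(Expression left, Expression right) implements Expression {}
```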
What’s going on here? Firstly, the method is parameterised to accept any type of return value. Again, we’ll see why in a moment. It then inspects the specific type of expression of this object and calls the appropriate callback on the visitor. Note that in the Add/Mul cases we also recursively visit the left-hand-side and right-hand-side expressions first, similarly to how we called .eval() on those in the earlier listing.

We can then re-implement our expression evaluator in terms of the visitor. OK, that works. But it’s kinda ugly compared to what we had. Can we improve it? Yes, we can. The Visitor is really just a set of callback functions, one for each type of object in our data structure. Rather than defining these callbacks as an implementation of the interface, we could instead define them as three separate lambda functions. We can then invoke these instead:
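One possible shape for this fluent, lambda-based visitor, building on the Expression and Visitor types sketched above. The builder-style method names (onNum, onAdd, onMul) and the visit entry point are my own guesses rather than the article’s exact API:

```java
import java.util.function.BinaryOperator;
import java.util.function.DoubleFunction;

// Sketch: a fluent visitor that collects one lambda per case and then walks an expression.
class FluentVisitor<T> implements Expression.Visitor<T> {
    private DoubleFunction<T> onNum;
    private BinaryOperator<T> onAdd;
    private BinaryOperator<T> onMul;

    FluentVisitor<T> onNum(DoubleFunction<T> fn) { this.onNum = fn; return this; }
    FluentVisitor<T> onAdd(BinaryOperator<T> fn) { this.onAdd = fn; return this; }
    FluentVisitor<T> onMul(BinaryOperator<T> fn) { this.onMul = fn; return this; }

    // If a callback was never registered, these throw a NullPointerException at runtime.
    @Override public T visitNum(double value)    { return onNum.apply(value); }
    @Override public T visitAdd(T left, T right) { return onAdd.apply(left, right); }
    @Override public T visitMul(T left, T right) { return onMul.apply(left, right); }

    T visit(Expression expr) { return expr.accept(this); }
}

class FluentVisitorDemo {
    public static void main(String[] args) {
        Expression expr = new Add(new Num(1), new Mul(new Num(2), new Num(3)));

        double result = new FluentVisitor<Double>()
                .onNum(v -> v)
                .onAdd((l, r) -> l + r)
                .onMul((l, r) -> l * r)
                .visit(expr);                      // 7.0

        String pretty = new FluentVisitor<String>()
                .onNum(String::valueOf)
                .onAdd((l, r) -> "(" + l + " + " + r + ")")
                .onMul((l, r) -> "(" + l + " * " + r + ")")
                .visit(expr);                      // (1.0 + (2.0 * 3.0))

        System.out.println(result + " = " + pretty);
    }
}
```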
We can then use this to reimplement our expression evaluator again, and that’s a lot nicer to look at. We can then call it as before, and we can also use the fluent visitor to define operations on the fly, such as printing a nicer string representation.

There are some potential drawbacks to this approach, but overall I think it’s really clean and nice. One drawback is that you lose compile-time checking that all the cases have been handled: if you forget to register one of the callbacks you’ll get a runtime NullPointerException instead. There are ways around this, such as using multiple FluentVisitor types that incrementally construct the callbacks, but that’s more work. That ensures that every callback has to be provided before you can call it, at the cost of needing many more classes. This is the sort of thing where good IDE support would really help (IntelliJ plugin anyone?). Another easy-to-fix nit is that, if you don’t care about the result, it is easy to forget to call it and thus not actually do anything at all. This can be fixed by changing the method to accept a function rather than returning a FluentVisitor.

The encapsulation that the Visitor provides allows us to quite radically change the underlying representation, while still preserving the same logical view of the data. For example, an alternative implementation might work only on positive integers and store expressions in a compact reverse Polish notation (RPN), and the exact same visitors we defined for the previous expression evaluator will also work for it.

Hopefully this article has shown you that there is still something interesting about old patterns like the Visitor, especially if you adapt them a bit to modern programming idioms. I often hear fans of functional programming stating that the Visitor pattern only exists to make up for the lack of pattern matching in OO languages like Java. In my opinion, this is the wrong way to think about things. Even when you have pattern matching (as Java now does), the Visitor pattern is still useful due to the increased encapsulation it provides, hiding details of the underlying representation.

The correct way to think about the Visitor pattern is as a natural generalisation of the reduce/fold operation common in functional programming languages. Consider an imperative implementation of a left-fold operation over a list (there’s a sketch of one at the end of this post). We can think of a linked list as a data structure with two constructors: Nil (the empty list) and Cons (a head element plus the rest of the list). In this case, the reduce operation is essentially a Visitor pattern where the initial accumulator value corresponds to the Nil case and the combining function corresponds to the Cons case. So, far from being a poor man’s pattern matching, the true essence of the Visitor is a generalised fold operation, which is why it’s so useful.

Maybe this old dog still has some nice tricks, eh?
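For reference, a minimal sketch of the kind of imperative left-fold meant above, with the Nil/Cons correspondence spelled out in the comments (again, an illustration rather than the original listing):

```java
import java.util.List;
import java.util.function.BiFunction;

class Reduce {
    // The initial value plays the role of the Nil callback; the combining
    // function plays the role of the Cons callback -- i.e. reduce is a Visitor
    // over the two list constructors.
    static <T, R> R reduce(List<T> list, R initial, BiFunction<R, T, R> combine) {
        R acc = initial;                        // "visit Nil": start from the initial value
        for (T element : list) {
            acc = combine.apply(acc, element);  // "visit Cons": fold in the next element
        }
        return acc;
    }

    public static void main(String[] args) {
        System.out.println(reduce(List.of(1, 2, 3, 4), 0, Integer::sum));   // 10
    }
}
```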


Gatekeepers vs. Matchmakers

I estimate I’ve conducted well over 1,000 job interviews for developers and managers in my career. This has caused me to form opinions about what makes a good interview. I’ve spent the majority of it in fast-growing companies and, with the exception of occasional pauses here and there, we were always hiring. I’ve interviewed at every level from intern (marathon sessions at the University of Waterloo campus interviewing dozens of candidates over a couple of days) to VP and CTO level (my future bosses in some cases, my successor in roles I was departing in others). Probably the strongest opinion that I hold after all that is: adopting a Matchmaker approach builds much better teams than falling into Gatekeeper mode.

Once the candidate has passed some sort of initial screen, either with a recruiter, the hiring manager, or both, most "primary" interviews are conducted with two to three employees interviewing the candidate—often the hiring manager and an individual contributor. (Of course, there are innumerable ways you can structure this, but that structure is what I’ve seen to be the most common.) Interviewers usually start with one of two postures when interviewing: the Gatekeeper or the Matchmaker:

- I don’t want to hire this person unless they prove themselves worthy. It’s my job to keep the bad and fake programmers out. (Gatekeeper)
- I want to hire this person unless they disqualify themselves somehow. It’s my job to find a good match between our needs and the candidate’s skills and interests. (Matchmaker)

The former, the Gatekeeper, I would say is more common overall and certainly more common among individual contributors and people earlier in their career. It’s also a big driver of why a lot of interview processes include some sort of coding "test" meant to expose the fraudulent scammers pretending to be "real" programmers. All of that dates back to the early 2000s and the post-dotcom crash. Pre-crash, anyone with a pulse who could string together some HTML could get a "software developer" job, so there were a lot of people with limited experience and skills on the job market. Nowadays, aside from outright fraudsters (which are rare) I haven’t observed many wholly unqualified people getting past the résumé review or initial screen.

If you let Gatekeepers design your interview process, you’ll often get something that I refer to as "programmer Jeopardy!" The candidate is peppered with what amount to trivia questions:

- What’s a deadlock? How do you resolve it?
- Oh, you know Java? Explain how the garbage collector works!
- What’s the CAP theorem?

…and so on. For most jobs where you’re building commercial software by gluing frameworks and APIs together, having vague (or even no) knowledge of those concepts is going to be plenty. Most devs can go a long time using Java or C# before getting into some sort of jam where learning intimate details of the garbage collector’s operation gets them out of it. (This wasn’t always true, but things have improved.) Of course, if the job you’re hiring for is some sort of specialist role around your databases, queuing systems, or infrastructure in general, you absolutely should probe for specialist knowledge of those things. But if the job is "full stack web developer," where they’re mostly going to be writing business logic and user interface code, they may have plenty of experience and be very good at those things without ever having needed to learn about consensus algorithms and the like.

Then, of course, there’s the much-discussed "coding challenge," the worst versions of which involve springing a problem on someone, giving them a generous four or five minutes to read it, then expecting them to code a solution with a countdown timer running and multiple people watching them.
Not everyone can put their best foot forward in those conditions, and after you’ve run the same exercise more than a few times with candidates, it’s easy to forget what the "first-look" experience is like for candidates. Maybe I’ll write a full post about it someday, but it’s my firm conviction that these types of tests have a false-negative rate so high that they’re counterproductive. Gatekeeper types often over-rotate on the fear of "fake" programmers getting hired and use these trivia-type questions and high-pressure exercises to disqualify people who would be perfectly capable of doing the job you need them to do and perfectly capable of learning any of that other stuff quickly, on an as-needed basis. If your interview process feels a bit like an elimination game show, you can probably do better.

You, as a manager, are judged both on the quality of your hires and your ability to fill open roles. When the business budgets for a role to be filled, they do so because they expect a business outcome from hiring that person. Individual contributors are not generally rewarded or punished for hiring decisions, so their incentive is to avoid bringing in people who make extra work for them. Hiring an underskilled person onto the team is a good way to drag down productivity rather than improve it, as everyone has to spend some of their time carrying that person.

Additionally, the absence of any kind of licensing or credentialing structure✳️ in programming creates a vacuum that the elimination game show tries to fill. In medicine, law, aviation, or the trades, there’s an external gatekeeper that ensures a baseline level of competence before anyone can even apply for a job. In software, there’s no equivalent, so it makes sense that some interviewers take a "prove to me you can do this job" approach out of the gate. But there’s a better way.

"Matchmaking" in the romantic sense tries to pair up people with mutual compatibilities in the hopes that their relationship will also be mutually beneficial to both parties—a real "whole is greater than the sum of its parts" scenario. This should also be true of hiring. You have a need for some skills that will elevate and augment your team; candidates have a desire to do work that means something to them with people they like being around (and yes, money to pay the bills, of course). When people date each other, they’re usually not looking to reject someone based on a box-checking exercise. Obviously, some don’t make it past the initial screen for various reasons, but if you’re going on a second date and looking for love, you’re probably doing it because you want it to work out. Same goes for hiring. If you take the optimistic route, you can let go of some of the pass/fail and one-size-fits-all approaches to candidate evaluation and spend more time trying to find a love match.

For all but the most junior roles, I’m confident you can get a strong handle on a candidate’s technical skills by exploring their work history in depth. I’m a big fan of "behavioural" interviewing, where you ask about specific things the candidate has done. I start with a broad opening question and then use ad hoc follow-ups in as conversational a manner as I can muster. I want to have a discussion, not an interrogation. Start with questions like:

- What’s the work you’ve done that you’re the most proud of?✳️
- What’s the hardest technical problem you’ve encountered? What was the resolution?
- Tell me about your favourite team member to work with. Least favourite?

If you practice, or watch someone who is good at this type of interview, you can easily fill a 45–60 minute interview slot with a couple of those top-level questions and some ad hoc follow-ups based on their answers.
Of the three examples I gave, two are a good starting place for assessing technical skills. Most developers will give you a software project as the work they’re the most proud of (if they say "I raised hamsters as a kid," feel free to ask them to limit it to the realm of software development and try again). This is your opportunity to dig in on the technical details:

- What language or frameworks did they use? Did they choose, or did someone else?
- How was it deployed?
- What sort of testing strategy did they use?
- What databases were involved?
- How did their stuff fit into the architecture of the broader system?

Questions like that will give you a much stronger signal on their technical skills and, importantly, experience. You should be able to easily tell how much the candidate was a driver vs. a passenger on the project, whether or not they thought about the bigger picture, and how deep or shallow their knowledge was. And, of course, you can keep asking follow-up questions to the follow-up questions until you’ve got a good sense.

Interviewing and taking a job does have a lot of parallels to dating and getting married. There are emotional and financial implications for both parties if it doesn’t work out, there’s always a degree of risk involved, and there’s sometimes a degree of asymmetry between the parties. In the job market, the asymmetry is nearly always in favour of the employer. They hold most of the cards and can dictate the terms of the process completely. You have a choice as a leader how much you want to wield that power. My advice is to wield it sparingly—try to give candidates the kind of experience where even if you don’t hire them, they had a good enough time in the process that they’d still recommend you to a friend.

Taking an interest in their experience, understanding what motivates them, and fitting candidates to the role that maximizes their existing skills, challenges them in the right ways, and takes maximum advantage of their intrinsic motivations will produce much better results than making them run through the gauntlet like a contestant on Survivor.

"Old Royal Naval College, Greenwich - King William Court and Queen Mary Court - gate" by ell brown is licensed under CC BY 2.0.

マリウス 2 weeks ago

A Word on Omarchy

Pro tip: If you’ve arrived here via a link aggregator, feel free to skip ahead to the Summary for a conveniently digestible tl;dr that spares you all the tedious details, yet still provides enough ammunition to trash-talk this post in the comments of whatever platform you stumbled upon it. In the recent months, there has been a noticeable shift away from the Windows desktop, as well as from macOS , to Linux, driven by various frustrations, such as the Windows 11 Recall feature. While there have historically been more than enough Linux distributions to choose from, for each skill level and amount of desired pain, a recent Arch -based configuration has seemingly made strides across the Linux landscape: Omarchy . This pre-configured Arch system is the brainchild of David Heinemeier Hansson , a Danish web developer and entrepreneur known as one of the co-founders of 37signals and for developing the Ruby on Rails framework. The name Omarchy appears to be a portmanteau of Arch , the Linux distribution that Hansson ’s configuration is based upon, and お任せ, which translates to omakase and means to leave something up to someone else (任せる, makaseru, to entrust ). When ordering omakase in a restaurant, you’re leaving it up to the chef to serve you whatever they think is best. Oma(kase) + (A)rch + y is supposedly where the name comes from. It’s important to note that, contrary to what Hansson says in the introduction video , Omarchy is not an actual Linux distribution . Instead, it’s an opinionated installation of Arch Linux that aims to make it easy to set up and run an Arch desktop, seemingly with as much TUI-hacker-esque aesthetic as possible. Omarchy comes bundled with Hyprland , a tiling window manager that focuses on customizability and graphic effects, but apparently not as much on code quality and safety . However, the sudden hype around Omarchy , which at this point has attracted attention and seemingly even funding from companies like Framework (Computer Inc.) ( attention ) and Cloudflare ( attention and seemingly funding ), made me want to take a closer look at the supposed cool kid on the block to understand what it was all about. Omarchy is a pre-configured installation of the Arch distribution that comes with a TUI installer on a 6.2GB ISO. It ships with a collection of shell scripts that use existing FOSS software (e.g. walker ) to implement individual features. The project is based on the work that the FOSS community, especially the Arch Linux maintainers, have done over the years, and ties together individual components to offer a supposed ready-to-use desktop experience. Omarchy also adds some links to different websites, disguised as “Apps” , but more on that later. This, however, seems to be enough to spark an avalanche of attention and, more importantly, financial support for the project. Anyway, let’s give Omarchy an actual try, and see what chef Hansson recommended to us. The Omarchy installer is a simple text user interface that tries to replicate what Charm has pioneered with their TUI libraries: A smooth command-line interface that preserves the simplicity of the good old days , yet enhances the experience with playful colors, emojis, and animations for the younger, future generation of users. Unlike mature installers, Omarchy ’s installer script doesn’t allow for much customization, which is probably to be expected with an “Opinionated Arch/Hyprland Setup” . Info: Omarchy uses gum , a Charm tool, under the hood. 
One of the first things that struck me as unexpected was the fact that I was able to use as my user password, an easy-to-guess word that Omarchy will also use for the drive encryption, without any resistance from the installer. Most modern Linux distributions actively prevent users from setting easily guessable or brute-forceable passwords. Moreover, considering that the system relies heavily on sudo (instead of the more modern doas), and that the default installation configures the maximum number of password retries to 10 (instead of the more cautious limit of three), an important question arises: Does Omarchy care about security? Let’s take a look at the Omarchy manual to find out:

Omarchy takes security extremely seriously. This is meant to be an operating system that you can use to do Real Work in the Real World. Where losing a laptop can’t lead to a security emergency.

According to the manual, taking security extremely seriously means enabling full-disk encryption (but without rejecting simple keys), blocking all ports except for 22 (SSH, on a desktop) and 53317 (LocalSend), continuously running (even though staying bleeding-edge has repeatedly proven to be an insufficient security measure in the past) and maintaining a Cloudflare-protected package mirror. That’s seemingly all. Hm.

Proceeding with the installation, the TUI prompts for an email address, which makes the whole process feel a bit like the Windows setup routine. While one might assume Omarchy is simply trying to accommodate its new user base, the actual reason appears to be much simpler: . If, however, you were expecting Omarchy to set up GPG with proper defaults, configure SSH with equally secure defaults, and perhaps offer an option to create new GPG/SSH keys or import existing ones, in order to enable proper commit and push signing for Git, you will be left disappointed. Unfortunately, none of this is the case. The Git config doesn’t enable commit or push signing, neither the GPG nor the SSH client configurations set secure defaults, and the user isn’t offered a way to import existing keys or create new ones. Given that Hansson himself usually does not sign his commits, it seems that these aspects are not particularly high on the project’s list of priorities.

The rest of the installer routine is fairly straightforward and offers little customization, so I won’t bore you with the details, but you can check the screenshots below. After initially downloading the official ISO file, the first boot of the system greets you with a terminal window informing you that it needs to update a few packages. And by "a few" it means another 1.8GB. I’m still not entirely sure why the v3.0.2 ISO is a hefty 6.2GB, or why it requires downloading an additional 1.8GB after installation on a system with internet access. For comparison, the official Arch installer image is just 1.4GB in size. While downloading the updates (which took over an hour for me), and with over 15GB of storage consumed on my hard drive, I set out to experience the full Omarchy goodness!

After hovering over a few icons on the Waybar, I discovered the menu button on the very left. It’s not a traditional menu, but rather a shortcut to the aforementioned walker launcher tool, which contains a few submenus. The menu reads: Apps, Learn, Trigger, Style, Setup, Install, Remove, Update, About, System. It feels like a random assortment of categories, settings, package manager subcommands, and actions.
From a UX perspective, this main menu doesn’t make much sense to me. But I’m feeling lucky, so let’s just go ahead and type “Browser” ! Hm, nothing. “Firefox” , maybe? Nope. “Chrome” ? Nah. “Chromium” ? No. Unfortunately the search in the menu is not universal and requires you to first click into the Apps category. The Apps category seems to list all available GUI (and some TUI) applications. Let’s take a look at the default apps that Omarchy comes with: The bundled “apps” are: 1Password, Alacritty, Basecamp, Bluetooth, Calculator, ChatGPT, Chromium, Discord, Disk Usage, Docker, Document Viewer, Electron 37, Figma, Files, GitHub, Google Contacts, Google Messages, Google Photos, HEY, Image Viewer, Kdenlive, LibreOffice, LibreOffice Base, LibreOffice Calc, LibreOffice Draw, LibreOffice Impress, LibreOffice Math, LibreOffice Writer, Limine-snapper-restore, LocalSend, Media Player, Neovim, OBS Studio, Obsidian, OpenJDK Java 25 Console, OpenJDK Java 25 Shell, Pinta, Print Settings, Signal, Spotify, Typora, WhatsApp, X, Xournal++, YouTube, Zoom; Aside from the fact that nearly a third of the apps are essentially just browser windows pointing to websites , which leaves me wondering where the 15GB of used storage went, the selection of apps is also… well, let’s call it opinionated , for now at least. Starting with the browser, Omarchy comes with Chromium by default, specifically version 141.0.7390.107 in my case, which, unlike, for example, ungoogled-chromium , has disabled support for manifest v2 and thus doesn’t include extensions like uBlock Origin or any other advanced add-ons. In fact, the browser is completely vanilla, with no decent configuration. The only extension it includes is the copy-url extension, which serves a rather obscure purpose: Providing a non-intuitive way to copy the current page’s URL to your clipboard using an even less intuitive shortcut ( ) while using any of the “Apps” that are essentially just browser windows without browser controls. Other than that, it’s pretty much stock Chromium. It allows all third-party cookies, doesn’t send “Do Not Track” requests, sends browsing data to Google Safe Browsing , but doesn’t enforce HTTPS. It has JavaScript optimization enabled for all websites, which increases the attack surface, and it uses Google as the default search engine. There’s not a single opinionated setting in the configuration of the default browser on Omarchy , let alone in the choice of browser itself. And the fact that the only extension installed and active by default is an obscure workaround for the lack of URL bars in “App” windows doesn’t exactly make this first impression of what is likely one of the most important components for the typical Omarchy user very appealing. Alright, let’s have a look at what is probably the second most important app after the browser for many people in the target audience: Basecamp ! Just kidding. Obviously, it’s the terminal. Omarchy comes with Alacritty by default, which is a bit of an odd choice in 2025, especially for a desktop that seemingly prioritizes form over function, given the ultra-conservative approach the Alacritty developers take toward anything related to form and sometimes even function. I would have rather expected Kitty , WezTerm , or Ghostty . That said, Alacritty works and is fairly configurable. Unfortunately, like the browser and various other tools such as Git, there’s little to no opinionated configuration happening, especially one that would enhance integration with the Omarchy ecosystem. 
Omarchy seemingly highlights the availability of NeoVim by default, yet doesn’t explicitly configure Alacritty’s vi mode , leaving it at its factory defaults . In fact, aside from the keybinding for full-screen mode, which is a less-than-ideal shortcut for anyone with a keyboard smaller than 100% (unless specifically mapped), the Alacritty config doesn’t define any other shortcuts to integrate the terminal more seamlessly into the supposed opinionated workflow. Not even the desktop’s key-repeat rate is configured to a reasonable value, as it takes about a second for it to kick in. Fun fact: When you leave your computer idling on your desk, the screensaver you’ll encounter isn’t an actual hyprlock that locks your desktop and uses PAM authentication to prevent unauthorized access. Instead, it’s a shell script that launches a full-screen Alacritty window to display a CPU-intensive ASCII animation. While Omarchy does use hyprlock , its timeout is set longer than that of the screensaver. Because you can’t dismiss the screensaver with your mouse (only with your keyboard) it might give inexperienced users a false sense of security. This is yet another example of prioritizing gimmicky animations over actual functionality and, to some degree, security. Like the browser and the terminal emulator, the default shell configuration is a pretty basic B….ash , and useful extensions like Starship are barely configured. For example, I ed into a boilerplate Python project directory, activated its venv , and expected Starship to display some useful information, like the virtual environment name or the Python version. However, none of these details appeared in my prompt. “Surely if I do the same in a Ruby on Rails project, Starship will show me some useful info!” I thought, and ed into a Rails boilerplate project. Nope. In fact… Omarchy doesn’t come with Rails pre-installed. I assume Hansson ’s target audience doesn’t primarily consist of Rails developers, despite the unconditional , but let’s not get ahead of ourselves. It is nevertheless puzzling that Omarchy doesn’t come with at least Ruby pre-installed. I find it a bit odd that the person who literally built the most successful Ruby framework on earth is pre-installing “Apps” like HEY , Spotify , and X , but not his own FOSS creation or even just the Ruby interpreter. If you want Rails , you have to navigate through the menu to “Install” , then “Development” , and finally select "‘Ruby on Rails" to make RoR available on your system. Not just Ruby , though. And even going the extra mile to do so still won’t make Starship display any additional useful info when inside a Rails project folder. PS: The script that installs these development tools bypasses the system’s default package manager and repository, opting instead to use mise to install interpreters and compilers. This is yet another example of security not being taken quite as seriously as it should be. At the very least, the script should inform the user that this is about to happen and offer the option to use the package manager instead, if the distributed version meets the user’s needs. Fun fact: At the time of writing, mise installed Ruby 3.4.7. The latest package available through the package manager is – you guessed it – 3.4.7. As mentioned earlier, Omarchy is built entirely using Bash scripts, and there’s nothing inherently wrong with that. When done correctly and kept at a sane limit, Bash scripts are powerful and relatively easy to maintain. 
However, the scripts in Omarchy are unfortunately riddled with little oversights that can cause issues. Those scripts are also used in places where a proper software implementation would have made more sense.

Take the theme scripts, for example. If you go ahead and create a new theme under and name it , and then run a couple of times until the tool hits your new theme, you can see one effect of these oversights. Nothing catastrophic happened, except that now won’t work anymore. If you wanted to annoy an unsuspecting Omarchy user, you could do this: While this is a tiny detail to complain about, writing the scripts in a way in which this can’t happen is equally low-hanging fruit.

Apart from the numerous places where globbing and word splitting can occur, there are other instances of code that could have been written a little more elegantly. Take this line, for example: To drop and from the , you don’t have to call and pipe to . Instead, you can simply use Bash’s built-in regex matching to do so. Similarly, in this line there’s no need to test for a successful exit code with a dedicated check when you can simply make the call from within the condition. And frankly, I have no idea what this line is supposed to be: What are you doing, Hansson? Are you alright?

Don’t mistake the remarks made above for the only issues with Hansson’s scripts in Omarchy. While these specific examples are nitpicks, they paint a picture that only gets less colorful the more we look into the details. We can continue to gauge the quality of the scripts by looking beyond just syntax.

Take, for example, the migration : This script runs five commands in sequence within an if condition: first , followed by two invocations, then again, and finally . While this might work as expected "on a sunny day", the first command could fail for various reasons. If it does, the subsequent commands may encounter issues that the script doesn’t account for, and the outcome of the migration will differ from what the author anticipated. For experienced users, the impact in such a case may be minimal, but for others, it may present a more significant hurdle. Furthermore, as can be seen here, the invoking process cannot detect if only one of the five commands failed. As a result, the entire migration might be marked as skipped, despite changes being made to the system. But we’ll look more closely at the migrations in just a moment.

The real concern here, however, is the widespread absence of exception handling, either through status-code checks for previously executed commands or via dependent executions (e.g., ). In most scripts, there is no validation to ensure that actions have the desired effect and that the current state actually represents the desired outcome. Almost all sequentially executed commands depend upon one another, yet the author doesn’t make sure that if fails the script won’t just blindly run .

Note: Although sets , which would cause a script like the one presented above to fail when the first command fails, the migrations are invoked by sourcing the script. This script, in turn, invokes the script using the helper function . However, this function executes the script in the following way: In this case, the options are not inherited by the actual migration, meaning it won’t stop immediately when an error occurs. This behavior makes sense, as abruptly stopping the installation would leave the system in an undefined state.
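To make that inheritance point concrete, here is a tiny, generic demonstration — not Omarchy’s actual installer or migration code — of how set -e in a parent script does nothing for a script that is executed in a child shell rather than sourced:

```bash
#!/usr/bin/env bash
# Generic demo, not Omarchy's code: 'set -e' is not passed on to child processes.
set -e

cat > /tmp/fake-migration.sh <<'EOF'
false                                   # this step fails...
echo "...but the next step ran anyway"  # ...and execution continues regardless
EOF

bash /tmp/fake-migration.sh     # child shell starts with default options: prints the message
# source /tmp/fake-migration.sh # same shell: -e applies and the failing 'false' aborts everything
```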
But even if we ignored that and assumed that migrations would stop when the first command would fail, it still wouldn’t actually handle the exception, but merely stop the following commands from performing actions on an unexpected state. To understand the broader issue and its impact on security, we need to dive deeper into the system’s functioning, and especially into migrations . This helps illustrate how the fragile nature of Omarchy could take a dangerous turn, especially considering the lack of tests, let alone any dedicated testing infrastructure. Let’s start by adding some context and examining how configurations are applied in Omarchy . Inspired by his work as a web developer, Hansson has attempted to bring concepts from his web projects into the scripts that shape his Linux setup. In Omarchy , configuration changes are handled through migration scripts, as we just saw, which are in principle similar to the database migrations you might recall from Rails projects. However, unlike SQL or the Ruby DSL used in Active Record Migrations , these Bash scripts do not merely contain a structured query language; They execute actual system commands during installation. More importantly: They are not idempotent by default! While the idea of migrations isn’t inherently problematic, in this case, it can (and has) introduce(d) issues that go/went unnoticed by the Omarchy maintainers for extended periods, but more on that in a second. The migration files in Omarchy are a collection of ambiguously named scripts, each containing a set of changes to the system. These changes aren’t confined to specific configuration files or components. They can be entirely arbitrary, depending on what the migration is attempting to implement at the time it is written. To modify a configuration file, these migrations typically rely on the command. For instance, the first migration intended to change from to might execute something like . The then following one would have to account for the previous change: . Another common approach involves removing a specific line with and appending the new settings via . However, since multiple migrations are executed sequentially, often touching the same files and running the same commands, determining the final state of a configuration file can become a tedious process. There is no clear indication of which migration modifies which file, nor any specific keywords (e.g., ) to grep for and help identify the relevant migration(s) when searching through the code. Moreover, because migrations rely on fixed paths and vary in their commands, it’s impossible to test them against mock files/folders, to predict their outcome. These scripts can invoke anything from sourcing other scripts to running commands, with no restrictions on what they can or cannot do. There’s no “framework” or API within which these scripts operate. To understand what I mean by that, let’s take a quick look at a fairly widely used pile of scripts that is of similar importance to a system’s functionality: OpenRC . While the init.d scripts in OpenRC are also just that, namely scripts, they follow a relatively well-defined API : Note: I’m not claiming that OpenRC ’s implementation is flawless or the ultimate solution, far from it. However, given the current state of the Omarchy project, it’s fair to say that OpenRC is significantly better within its existing constraints. Omarchy , however, does not use any sort of API for that matter. 
Instead, scripts can basically do whatever they want, in whichever way they deem adequate. Without such well-defined interfaces, it is hard to understand the effects that migrations will have, especially when changes to individual services are split across a number of different migration scripts. Here’s a fun challenge: Try to figure out how your folder looks after installation by only inspecting the migration files. To make matters worse, other scripts (outside the migration folder) may also modify configurations that were previously altered by migrations, at runtime, such as .

Note: To the disappointment of every NixOS user, unlike database migrations in Rails, the migrations in Omarchy don’t support rollbacks and, judging by their current structure, are unlikely to do so moving forward. The only chance Omarchy users have, in case a migration should ever brick their existing system, is to make use of the available snapshots.

All of this (the lack of interfaces, the missing exception handling and checks for desired outcomes, the overlapping modifications, etc.) creates a chaotic environment that is hard to oversee and maintain, which can severely compromise system integrity and, by extension, security. Want an example? On my fresh installation, I wanted to validate the following claim from the manual:

Firewall is enabled by default: All incoming traffic by default except for port 22 for ssh and port 53317 for LocalSend. We even lock down Docker access using the ufw-docker setup to prevent that your containers are accidentally exposed to the world.

What I discovered upon closer inspection, however, is that Omarchy’s firewall doesn’t actually run, despite its pre-configured ruleset. Yes, you read that right: everyone installing the v3.0.2 ISO (and presumably earlier versions) of Omarchy is left with a system that doesn’t block any of the ports that individual software might open during runtime. Please bear in mind that, apart from the full-disk encryption, the firewall is the only security measure that Omarchy puts in place. And it’s off by default. Only once I manually enabled and started it using / did it activate the rules mentioned in the handbook.

As highlighted in the original issue, it appears that, amid the chaos that is the migration, preflight, and first-run scripts, no one ever realized that a service needs to be explicitly enabled for it to actually run. And because it’s all made up of Bash scripts that can do whatever they want, you cannot easily test these things to notice that the expected state for a specific service was never reached. Unlike in Rails, where you can initialize your (test) database and run each migration manually if necessary to make sure that the schema reaches the desired state and that the database is seeded correctly, this agglomeration of Bash scripts is not structured data. Hence, applying the same principle to something as arbitrary as a Bash script is not as easily possible, at least not without clearly defined structures and interfaces.

As a user who trusted Omarchy to secure their installation, I would be upset, to say the least. The system failed to keep users safe and, more importantly, nobody noticed for a long time. There was no hotfix ISO issued, nor even a heads-up to existing users alongside the implemented fix ( e.g. ). While mistakes happen, simply brushing them under the rug feels like rather negligent behavior.
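For anyone running one of the affected images and wanting to check for themselves, something along these lines should do it — assuming the stock ufw/systemd setup and that the ruleset itself is already in place, as described above:

```bash
# Check whether the firewall is actually up (stock ufw/systemd setup assumed):
sudo ufw status verbose            # "Status: inactive" means you are affected

# Make sure the service starts at boot and activate the pre-configured ruleset:
sudo systemctl enable --now ufw
sudo ufw enable                    # no-op if the firewall is already active
sudo ufw status verbose            # should now list the port 22 and 53317 rules
```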
When looking into the future, the mess that is the Bash scripts certainly won’t decrease in complexity, making me doubt that things like these won’t happen again.

Note: The firewall fix was listed in v2.1.1. However, on my installation of v3.0.2 the firewall would still not come up automatically. I double-checked this by running the installation of v3.0.2 twice, and both times the firewall would not autostart after the second reboot. While writing this post, v3.1.0 (update: v3.1.1) was released and I also checked the issue there. v3.1.0 appears to have finally fixed the firewall issue. That said, it shows how much of a mess the whole system is when things that were identified and supposedly fixed multiple versions ago still don’t work in newer releases weeks later. Tl;dr: v3.1.0 appears to be the first release to actually fix the firewall issue, even though it was identified and presumably fixed in v2.1.1, according to the changelog.

With the firewall active, it becomes apparent that Omarchy’s configuration does indeed leave port 22 (SSH) open, even though the SSH daemon is not running by default. While I couldn’t find a clear explanation for why this port is left open on a desktop system without an active SSH server, my assumption is that it’s intended to allow the user to remotely access their workstation should they ever need to. It’s important to note that the file in Omarchy, like many other system files, remains unchanged. Users might reasonably assume that, since Omarchy intentionally leaves the SSH port open, it must have also configured the SSH server with sensible defaults. Unfortunately, this is not the case. In a typical Arch installation, users would eventually come across the "Protection" section on the OpenSSH wiki page, where they would learn about the crucial settings that should be adjusted for security reasons. However, when using a system like Omarchy, which is marketed as an opinionated setup that takes security seriously, users might expect these considerations to be handled for them, making it all the more troubling that no sensible configuration is in place, despite the deliberate decision to leave the SSH port open for future use.

Hansson seemingly struggles to get even basics like right. The fact that there’s so little oversight, that users are allowed to set weak passwords for both their account and their drive encryption, and that the only other security measure put in place, the firewall, simply hasn’t been working, does not speak in favor of Omarchy.

Info: ufw is an abstraction layer that simplifies managing the underlying firewall, and it stands for "uncomplicated firewall".

Going into this review I wasn’t expecting a hardened Linux installation with SELinux, intrusion detection mechanisms, and all these things. But Hansson repeatedly addresses users of Windows and macOS (operating systems with working firewalls and notably more security measures in place) who are frustrated with their OS as a target audience. At this point, however, Omarchy is a significantly worse option for those users. Not only does Omarchy give a hard pass on Linux Security Modules, linux-hardened, musl, hardened_malloc, or tools like OpenSnitch, and fails to properly address security-related topics like SSH, GPG, or maybe even AGE and AGE/Yubikey, but it in fact weakens the system’s security with changes like the increase of and login password retries and the decrease of faillock timeouts.
Omarchy appears to be undoing security measures that were put in place by the software- and by the Arch -developers, while the basis it uses for building the system does not appear to be reliable enough to protect its users from future mishaps. Then there is the big picture of Omarchy that Hansson tries to curate, which is that of a TUI-centered, hacker -esque desktop that promises productivity and so on. He even goes as far as calling it “a pro system” . However, as we clearly see from the implementation, configuration and the project’s approach to security, this is unlike anything you would expect from a pro system . The entire image of a TUI-centered productivity environment is further contradicted in many different places, primarily by the lack of opinions and configuration . If the focus is supposed to be on “pro” usage, and especially the command-line, then… The configuration doesn’t live up to its sales pitch, and there are many aspects that either don’t make sense or aren’t truly opinionated , meaning they’re no different from a standard Arch Linux installation. In fact, I would go as far as to say that Omarchy is barely a ready-to-use system at all out of the box and requires a lot of in-depth configuration of the underlying Arch distribution for it to become actually useful. Let’s look at only a few details. There are some fairly basic things you’ll miss on the “lightweight” 15GB installation of Omarchy : With the attention Omarchy is receiving, particularly from Framework (Computer Inc.) , it is surprising that there is no option to install the system on RAID1 hardware: I would argue that RAID1 is a fairly common use case, especially with Framework (Computer Inc.) 16" laptops, which support a secondary storage device. Considering that Omarchy is positioning itself to compete against e.g. macOS with TimeMachine , yet it does not include an automated off-drive backup solution for user data by default – which by the way is just another notable shortcoming we could discuss – and given that configuring a RAID1 root with encryption is notoriously tedious on Linux, even for advanced users, the absence of this option is especially disappointing for the intended audience. Even moreso when neither the installer nor the post-installation process provides any means to utilize the additional storage device, leaving inexperienced users seemingly stuck with the command. Omarchy does not come with a dedicated swap partition, leaving me even more puzzled about its use of 15GB of disk space. I won’t talk through why having a dedicated swap partition that is ideally encrypted using the same mechanisms already in place is a good idea. This topic has been thoroughly discussed and written about countless times. However, if you, like seemingly the Omarchy author, are unfamiliar with the benefits of having swap on Linux, I highly recommend reading this insightful write-up to get a better understanding. What I will note, however, is that the current configuration does not appear to support hibernation via the command through the use of a dynamic swap file . This leads me to believe that hibernation may not function on Omarchy . Given the ongoing battery drain issues with especially Framework (Computer Inc.) laptops while in suspend mode, it’s clear that hibernation is an essential feature for many Linux laptop users. 
Additionally, it’s hard to believe that Hansson , a former Apple evangelist , wouldn’t be accustomed to the simple act of closing the lid on his laptop and expecting it to enter a light sleep mode, and eventually transitioning into deep sleep to preserve battery life. If he had ever used Omarchy day-to-day on a laptop in the same way most people use their MacBooks , he would almost certainly have noticed the absence of these features. This further reinforces the impression that Omarchy is a project designed to appear robust at first glance, but reveals a surprisingly hollow foundation upon closer inspection. Let’s keep our focus on laptop use. We’ve seen Hansson showcasing his Framework (Computer Inc.) laptop on camera, so it’s reasonable to assume he’s using Omarchy on a laptop. It’s also safe to say that many users who might genuinely want to try Omarchy will likely do so on a laptop as well. That said, as we’ve established before, closing the laptop lid doesn’t seem to trigger hibernate mode in Omarchy . But if you close the lid and slip the laptop into your backpack, surely it would activate some power-saving measures, right? At the very least, it should blank the screen, switch the CPU governor to powersaving , or perhaps even initiate suspend to RAM ? Well… Of course, I can’t test these scenarios firsthand, as I’m evaluating Omarchy within a securely confined virtual machine, where any unintended consequences are contained. Still, based on the system’s configuration, or more accurately the lack thereof, it seems unlikely that an Omarchy laptop will behave as expected. The system might switch power profiles due to the power-profiles-daemon when not plugged in, yet its functionality is not comparable to a properly configured or similar. It seems improbable that it will enter suspend to RAM or hibernate mode, and it’s doubtful any other power-saving measures (like temporarily halting non-essential background processes) will be employed to conserve battery life. Although the configuration comes with an “app” for mail, namely HEY , that platform does not support standard mail protocols . I don’t think it’s a hot take to say that probably 99% of Omarchy ’s potential users will need to work with an email system that does support IMAP and SMTP, however. Yet, the base system offers zero tools for that. I’m not even asking for anything “fancy” like ; Omarchy unfortunately doesn’t even come with the most basic tools like the command out of the box. Whether you want to send email through your provider, get a simple summary for a scheduled Cron job delivered to your local mailbox, or just debug some mail-related issue, the command is relatively essential, even on a desktop system, but it is nowhere to be found on Omarchy . Speaking of which: Cron jobs? Not a thing on Omarchy . Want to automate backing up some files to remote storage? Get ready to dive into the wonderful world of timers , where you’ll spend hours figuring out where to create the necessary files, what they need to contain, and how to activate them. Omarchy could’ve easily included a Cron daemon or at least for the sake of convenience. But I guess this is a pro system , and if the user needs periodic jobs, they will have to figure out . Omarchy is, after all, -based … … and that’s why it makes perfect sense for it to use rootless Podman containers instead of Docker. That way, users can take advantage of quadlets and all the glorious integration. Unfortunately, Omarchy doesn’t actually use Podman . It uses plain ol’ Docker instead. 
Like most things in Omarchy , power monitoring and alerting are handled through a script , which is executed every 30 seconds via a timer. That’s your crash course on timers right there, Omarchy users! This script queries and then uses to parse the battery percentage and state. It’s almost comical how hacky the implementation is. Given that the system is already using UPower , which transmits power data via D-Bus , there’s a much cleaner and more efficient way to handle things. You could simply use a piece of software that connects to D-Bus to continuously monitor the power info UPower sends. Since it’s already dealing with D-Bus , it can also send a desktop notification directly to whatever notification service you’re using (like in Omarchy ’s case). No need for , , or a periodic Bash script triggered by a timer. “But where could I possibly find such a piece of software?” , you might ask. Worry not, Hr. Hansson , I have just the thing you need ! That said, I can understand that you, Hr. Hansson , might be somewhat reluctant to place your trust in software created by someone who is actively delving into the intricacies of your project, rather than merely offering a superficial YouTube interview to casually navigate the Hyprland UI for half an hour. Of course, Hr. Hansson , you could have always taken the initiative to develop a more robust solution yourself, in a proper, lower-level language, and neatly integrated it into your Omarchy repository. But we will explore why this likely hasn’t been a priority for you, Hr. Hansson , in just a moment. While the author’s previous attempt for a developer setup still came with Zellij , this time his opinions seemingly changed and Omarchy doesn’t include Zellij , or Tmux or even screen anymore. And nope, picocom isn’t there either, so good luck reading that Arduino output from . That moment, when you realize that you’ve spent hours figuring out timers , only to find out that you can’t actually back up those files to a remote storage because there’s no , let alone or . At least there is the command. :-) Unfortunately not, but Omarchy comes with and by default. I could go on and on, and scavenge through the rest of the unconfigured system and the scripts, like for example the one, where Omarchy once again seems to prefer -ing random scripts from the internet (or anyone man-in-the-middle -ing it) rather than using the system package manager to install Tailscale . But, for the sake of both your sanity and mine, I’ll stop here. As we’ve seen, Omarchy is more unconfigured than it is opinionated . Can you simply install all the missing bits and piece and configure them yourself? Sure! But then what is the point of this supposed “perfect developer setup” or “pro system” to begin with? In terms of the “opinionated” buzzword, most actual opinions I’ve come across so far are mainly about colors, themes, and security measures. I won’t dare to judge the former two, but as for the latter, well, unfortunately they’re the wrong opinions . In terms of implementation: Omarchy is just scripts, scripts, and more scripts, with no proper structure or (CI) tests. BTW: A quick shout out to your favorite tech influencer , who probably has at least one video reviewing the Omarchy project without mentioning anything along these lines. 
It is unfortunate that these influential people barely scratch the surface on a topic like this, and it is even more saddening that recording a 30-minute video of someone clicking around on a UI seemingly counts as a legitimate "review" these days. The primary focus for many of these people is seemingly on pumping out content and generating hype for views and attention rather than providing a thoughtful, thorough analysis.

(Alright, we’re almost there. Stick with me, we’re in the home stretch.)

The Omarchy manual: The ultimate repository of Omarchy wisdom, all packed into 33 pages, clocking in at a little over 10,000 words. For context, this post on Omarchy alone is almost 10,000 words long. As is the case with the rest of the system, the documentation also adheres to Hansson’s form-over-function approach. I’ve mentioned this before, but it bears repeating: Omarchy doesn’t offer any built-in for its scripts, let alone auto-completion, nor does it come with traditional pages. The documentation is tucked away in yet another SaaS product from Hansson’s company (Writebook) and its focus is predominantly on themes, more themes, creating your own themes, and of course, the ever-evolving hotkeys. Beyond that, the manual mostly covers how to locate configuration files for individual UI components and offers guidance on how to configure Hyprland for a range of what feels like outrageously expensive peripherals. For the truly informative content, look no further than the shell function guide, with gems such as:

: Format an entire disk with a single ext4 partition. Be careful!

Wow, thanks, Professor Oak, I will be! :-)

On a more serious note, though, the documentation leaves much to be desired, as evidenced by the user questions over on the GitHub discussions page. Take this question, which unintentionally sums up the Omarchy experience for probably many inexperienced users:

I installed this from github without knowing what I was getting into (the page is very minimal for a project of this size, and I forgot there was a link in the footnotes). Please tell me there’s a way to remove Omarchy without wiping my entire computer. I lost my flashdrive, and don’t have a way to back up all my important files anymore.

While this may seem comical on the surface, it’s a sad testament to how Omarchy appears to have a knack for luring in unsuspecting users with flashy visuals and so-called "reviews" on YouTube, only to leave them stranded without adequate documentation. The only recourse? Relying on the solid Arch docs, which is an abrupt plunge into the deep end, given that Arch assumes you’re at least familiar with its very basics and that you know how you set up your own system. Maybe GitHub isn’t the most representative forum for the project’s support; I haven’t tried Discord, for example. But no matter where the community is, users should be able to fend for themselves with proper documentation, turning to others only as a last resort.

It’s difficult to compile a list of things that could have made Omarchy a reasonable setup for people to consider, mainly because, in my opinion, the core of the setup – scripts doing things they shouldn’t or that should have been handled by other means (e.g., the package manager) – is fundamentally flawed. That said, I do think it’s worth mentioning a few improvements that, if implemented, could have made Omarchy a less bad option.

Configuration files should not be altered through loose migration scripts.
Instead, updated configuration files should be provided directly (ideally via packages, see below) and applied as patches using a mechanism similar to etc-update or dpkg . This approach ensures clarity and reduces confusion, preserves user modifications, and aligns with established best practices. Improve on the user experience where necessary and maybe even contribute improvements back. Use proper software implementations where appropriate. Want a fancy screensaver? Extend Hyprlock instead of awkwardly repurposing a fullscreen terminal window to mimic one. Need to display power status notifications without relying on GNOME or KDE components? Develop a lightweight solution that integrates cleanly with the desktop environment, or extend the existing Waybar battery widget to send notifications (a rough sketch of what such a D-Bus-based watcher could look like follows at the end of this list of suggestions). Don’t like existing Linux “App Store” options? Build your own, rather than diverting a launcher from its intended use only to run Bash scripts that install packages from third-party sources on a system that has a perfectly good package manager in place. Arguably the most crucial improvement: Package the required software and install it via the system’s package manager. Avoid relying on brittle scripts, third-party tools like mise , or worse, piping scripts directly into . I understand that the author is coming from an operating system where it’s sort of fine to and use software like to manage individual Ruby versions. However, we have to take into consideration that macOS specifically has a significantly more advanced security architecture in place than (unfortunately) most out-of-the-box Linux installations have, let alone Omarchy . On Hansson’s setup the approach is neither sensible nor advisable, especially given that it’s ultimately a system that is built around a proper package manager. If you want multiple versions of Ruby, package them and use slotting (or the equivalent of it on the distribution that you’re using, e.g. installation to version-specific directories on Arch ). Much of what the migrations and other scripts attempt to do could, and should, have been achieved through well-maintained packages and the proven mechanisms of a package manager. Whether it’s Gentoo , NixOS , or Ubuntu , each distribution operates in its own unique way, offering users a distinct set of tools and defaults. Yet, they all share one common trait: A set of strong, well-defined opinions that shape the system. Omarchy , in contrast, feels like little more than a glorified collection of Hyprland configurations atop an unopinionated, barebones foundation. If you’re going to have opinions, don’t limit them to just nice colors and cute little wallpapers. Form opinions on the tools that truly matter, on how those tools should be configured, and on the more intricate, challenging aspects of the system, not just the surface-level, easy choices. Have opinions on the really sticky and complicated stuff, like power-saving modes, redundant storage, critical system functionality, and security. Above all, cultivate reasonable opinions, ones that others can get behind, and build a system that reflects those. Comprehensive documentation is essential to help users understand how the system works. Currently, there’s no clear explanation for the myriad Bash scripts, nor is there any user-facing guidance on how global system updates affect individual configuration files.
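To make the earlier power-notification suggestion a bit more concrete, here is a rough TypeScript sketch of what such a lightweight, event-driven battery watcher could look like, written against the dbus-next npm package. The threshold, the notification text, and frankly everything else here is illustrative only; it is not part of Omarchy and not a drop-in replacement for its script:

```typescript
// Sketch only: listen to UPower over D-Bus and fire a desktop notification,
// instead of polling `upower` from a systemd timer and parsing its output.
// Assumes the `dbus-next` npm package; threshold and wording are made up.
import dbus from 'dbus-next';
import { execFile } from 'node:child_process';

const UPOWER_NAME = 'org.freedesktop.UPower';
const DISPLAY_DEVICE = '/org/freedesktop/UPower/devices/DisplayDevice';
const DEVICE_IFACE = 'org.freedesktop.UPower.Device';

async function main() {
  const bus = dbus.systemBus();
  const proxy = await bus.getProxyObject(UPOWER_NAME, DISPLAY_DEVICE);
  const props = proxy.getInterface('org.freedesktop.DBus.Properties');

  let warned = false;
  // UPower emits PropertiesChanged whenever the battery state updates,
  // so there is nothing to poll and nothing to parse with grep/cut.
  props.on('PropertiesChanged', async () => {
    const pct = (await props.Get(DEVICE_IFACE, 'Percentage')).value as number;
    const state = (await props.Get(DEVICE_IFACE, 'State')).value as number; // 2 = discharging
    if (state === 2 && pct <= 10 && !warned) {
      warned = true;
      execFile('notify-send', ['-u', 'critical', 'Battery low', `${Math.round(pct)}% remaining`]);
    } else if (state !== 2 || pct > 10) {
      warned = false; // re-arm once we are charging or back above the threshold
    }
  });
}

main().catch(console.error);
```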
( finally… ) Omarchy feels like a project created by a Linux newcomer, utterly captivated by all the cool things that Linux can do , but lacking the architectural knowledge to get the basics right, and the experience to give each tool a thoughtful review. Instead of carefully selecting software and ensuring that everything works as promised, the approach seems to be more about throwing everything that somehow looks cool into a pile. There’s no attention to sensible defaults, no real quality control, and certainly no verification that the setup won’t end up causing harm or, at the very least, frustration for the user. The primary focus seems to be on creating a visually appealing but otherwise hollow product . Moreover, the entire Omarchy ecosystem is held together by often poorly written Bash scripts that lack any structure, let alone properly defined interfaces . Software packages are being installed via or similar mechanisms, rather than provided as properly packaged solutions via a package manager. Hansson is quick to label Omarchy a Linux distribution , yet he seems reluctant to engage with the foundational work that defines a true distribution: The development and proper packaging (“distribution”) of software . Whenever Hansson seeks software (or a software version) that is unavailable in the Arch package repositories, he bypasses the proper process of packaging it for the system. Instead, he resorts to running arbitrary scripts or tools that download the required software from third-party sources, rather than offering the desired versions through a more standardized package repository. Hansson also appears to avoid, at all costs, using lower-level programming languages to implement features in a more robust and maintainable manner, often opting instead for makeshift solutions, such as executing “hacky” Bash scripts through timers. A closer look at his GitHub profile and Basecamp’s repositories reveals that Hansson has seemingly worked exclusively with Ruby and JavaScript , with most contributions to more complex projects, like or , coming from other developers. This observation is not meant to diminish the author’s profession and accomplishments as a web developer, but it highlights the lack of experience in areas such as systems programming, which are crucial for the type of work required to build and maintain a proper Linux distribution. Speaking of packages, the system gobbles up 15GB of storage on a basic install, yet fails to deliver truly useful or high-quality software. It includes a hodgepodge of packages, like OpenJDK and websites of paid services in “App” disguise, but lacks any real optimization for specific use cases. Despite Omarchy claiming to be opinionated, most of the included software is left at its default settings, straight from the developers. Given Hansson ’s famously strong opinions on everything, it makes me wonder if the Omarchy author simply hasn’t yet gained the experience necessary to develop clear, informed stances on individual configurations. Moreover, his prioritization of his paid products like Basecamp and HEY over his own free software like Rails leaves a distinctly bitter aftertaste when considering Omarchy . What’s even more baffling is that seemingly no one at Framework (Computer Inc.) or Cloudflare appears to have properly vetted the project they’re directing attention (and sometimes financial support) to.
I find it hard to believe that knowledgeable people at either company have looked at Omarchy and thought, “Out of all the Linux distributions out there, this barely configured stack of poorly written Bash scripts on top of Arch is clearly the best choice for us to support!” In fact, I would go as far as to call it a slap in the face to each and every proper distro maintainer and FOSS developer. Furthermore, I fail to see the supposed gap Omarchy is trying to fill. A fresh installation of Arch Linux, or any of its established derivatives like Manjaro , is by no means more complicated or time-consuming than Omarchy . In fact, it is Omarchy that complicates things further down the line, by including a number of unnecessary components and workarounds, especially when it comes to its chosen desktop environment. The moment an inexperienced user wants or needs to change anything, they’ll be confronted with a jumbled mess that’s difficult to understand and even harder to manage. If you want Arch but are too lazy to read through its fantastic Wiki , then look at Manjaro ; it’ll take care of you. If that’s still not to your liking, maybe explore something completely different . On the other hand, if you’re just looking to tweak your existing desktop, check out other people’s dotfiles and dive into the unixporn communities for inspiration. As boring as Fedora Workstation or Ubuntu Desktop might sound, these are solid choices for anyone who doesn’t want to waste time endlessly configuring their OS and, more importantly, wants something that works right out of the box and actually keeps them safe. Fedora Workstation comes with SELinux enabled in “enforcing” mode by default, and Ubuntu Desktop utilizes AppArmor out of the box. Note: Yes, I hear you loud and clear, SuSE fans. The moment your favorite distro gets its act together with regard to the AppArmor-SELinux transition and actually enables SELinux in enforcing mode across all its different products and versions, I will include it here as well. Omarchy is essentially an installation routine for someone else’s dotfiles slapped on top of an otherwise barebones Linux desktop. Although you could simply run its installation scripts on your existing, fully configured Arch system, it doesn’t seem to make much sense and it’s definitely not the author’s primary objective. If this was just Hansson’s personal laptop setup, nobody, including myself, would care about the oversights or eccentricities, but it is not. In fact, this project is clearly marketed to the broader, less experienced user base, with Hansson repeatedly misrepresenting Omarchy as being “for developers or anyone interested in a pro system” . I emphasize marketed here, because Hansson is using his reach and influence in every possible way to advertise and seemingly monetize Omarchy ; apart from the corporate financial support, the project even has its own merch that people can spend money on. Given that numerous YouTubers have been heavily promoting the project over the past few weeks, often in the same breath as Framework (Computer Inc.) , it wouldn’t be surprising to see the company soon offering it as a pre-installation option on their hardware. If you’re serious about Linux, you’re unlikely to fall for the Omarchy sales pitch.
However, if you’re an inexperienced user who’s heard about Omarchy from a tech-influencer raving about it, I strongly recommend starting your Linux journey elsewhere, with a distribution that actually prioritizes your security and system integrity, and is built and maintained by people who live and breathe systems, and especially Linux. Alright, that’s it. Why don’t any of the Bash scripts and functions provide a flag or maybe even autocompletions? Why are there no Omarchy -related pages? Why does the system come with GNOME Files , which requires several gvfs processes running in the background, yet it lacks basic command-line file managers like or ? Why would you define as an for unconditionally, but not install Rails by default? Why bother shipping tools like and but fail to provide aliases for , , etc to make use of these tools by default? Why wouldn’t you set up an O.G. alias like in your defaults ? Why ship the GNOME Calculator but not include any command-line calculators (e.g., , ), forcing users to rely on basics like ? Why ship the full suite of LibreOffice, but not a single useful terminal tool like , , , etc.? Why define functions like with and without an option to enable encryption, when the rest of the system uses and ? And if it’s intended for use by inexperienced users primarily for things like USB sticks, why not make it instead of so the drive works across most operating systems? Why not define actually useful functions like or / ? Why doesn’t your Bash configuration include history- and command-flag-based auto-suggestions? Or a terminal-independent vi mode ? Or at least more consistent Emacs-style shortcuts? Why don’t you include some quality-of-life tools like or some other command-line community favorites? If you had to squeeze in ChatGPT , why not have Crush available by default? Why does the base install with a single running Alacritty window occupy over 2.2GB of RAM right after booting? For comparison: My Gentoo system with a single instance of Ghostty ends up at around half of that. Why set up NeoVim but not define as an alias for , or even create a symlink? And speaking of NeoVim , why does the supposedly opinionated config make NeoVim feel slower than VSCode ?

mcyoung 2 weeks ago

Why SSA?

If you’ve read anything about compilers in the last two decades or so, you have almost certainly heard of SSA compilers , a popular architecture featured in many optimizing compilers, including ahead-of-time compilers such as LLVM, GCC, Go, CUDA (and various shader compilers), Swift 1 , and MSVC 2 , and just-in-time compilers such as HotSpot C2 3 , V8 4 , SpiderMonkey 5 , LuaJIT, and the Android Runtime 6 . SSA is hugely popular, to the point that most compiler projects no longer bother with other IRs for optimization 7 . This is because SSA is incredibly nimble at the types of program analysis and transformation that compiler optimizations want to do on your code. But why ? Many of my friends who don’t do compilers often say that compilers seem like opaque magical black boxes, and SSA, as it often appears in the literature, is impenetrably complex. But it’s not! SSA is actually very simple once you forget everything you think your programs are actually doing. We will develop the concept of SSA form, a simple SSA IR, prove facts about it, and design some optimizations on it. I have previously written about the granddaddy of all modern SSA compilers, LLVM. This article is about SSA in general, and won’t really have anything to do with LLVM. However, it may be helpful to read that article to make some of the things in this article feel more concrete. SSA is a property of intermediate representations (IRs), primarily used by compilers for optimizing imperative code that target a register machine . Register machines are computers that feature a fixed set of registers that can be used as the operands for instructions: this includes virtually all physical processors, including CPUs, GPUs, and weird things like DSPs. SSA is most frequently found in compiler middle-ends , the optimizing component between the frontend (which deals with the surface language programmers write, and lowers it into the middle-end’s IR), and the backend (which takes the optimized IR and lowers it into the target platform’s assembly). SSA IRs, however, often have little resemblance to the surface language they lower out of, or the assembly language they target. This is because neither of these representations makes it easy for a compiler to intuit optimization opportunities. Imperative code consists of a sequence of operations that mutate the executing machine’s state to produce a desired result. For example, consider the following C program: This program returns no matter what its input is, so we can optimize it down to this: But, how would you write a general algorithm to detect that all of the operations cancel out? You’re forced to keep in mind program order to perform the necessary dataflow analysis, following mutations of and through the program. But this isn’t very general, and traversing all of those paths makes the search space for large functions very big. Instead, you would like to rewrite the program such that and gradually get replaced with the expression that calculates the most recent value, like this: Then we can replace each occurrence of a variable with its right-hand side recursively… Then fold the constants together… And finally, we see that we’re returning , and can replace it with . All the other variables are now unused, so we can delete them.
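The substitute-and-fold process just described is easy to express once every variable has exactly one definition. Here is a minimal TypeScript sketch on a made-up two-variable program; the expression shapes and the example are invented and are not the article’s own C example:

```typescript
// A toy straight-line program: each variable is defined exactly once.
type Expr =
  | { kind: 'const'; value: number }
  | { kind: 'var'; name: string }
  | { kind: 'add'; lhs: Expr; rhs: Expr }
  | { kind: 'mul'; lhs: Expr; rhs: Expr };

// x = 2 + 3; y = x * 4; return y
const defs = new Map<string, Expr>([
  ['x', { kind: 'add', lhs: { kind: 'const', value: 2 }, rhs: { kind: 'const', value: 3 } }],
  ['y', { kind: 'mul', lhs: { kind: 'var', name: 'x' }, rhs: { kind: 'const', value: 4 } }],
]);

// Replace each variable with its (single) definition, recursively.
function inline(e: Expr): Expr {
  switch (e.kind) {
    case 'var': return inline(defs.get(e.name)!);
    case 'add': return { kind: 'add', lhs: inline(e.lhs), rhs: inline(e.rhs) };
    case 'mul': return { kind: 'mul', lhs: inline(e.lhs), rhs: inline(e.rhs) };
    default: return e;
  }
}

// Fold operations whose operands are all constants.
function fold(e: Expr): Expr {
  if (e.kind === 'add' || e.kind === 'mul') {
    const l = fold(e.lhs), r = fold(e.rhs);
    if (l.kind === 'const' && r.kind === 'const') {
      return { kind: 'const', value: e.kind === 'add' ? l.value + r.value : l.value * r.value };
    }
    return { ...e, lhs: l, rhs: r };
  }
  return e;
}

console.log(fold(inline({ kind: 'var', name: 'y' }))); // { kind: 'const', value: 20 }
```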
The reason this works so well is that we took a function with mutation, and converted it into a combinatorial circuit , a type of digital logic circuit that has no state, and which is very easy to analyze. The dependencies between nodes in the circuit (corresponding to primitive operations such as addition or multiplication) are obvious from its structure. For example, consider the following circuit diagram for a one-bit multiplier: This graph representation of a program has two huge benefits: The powerful tools of graph theory can be used to algorithmically analyze the program and discover useful properties, such as operations that are independent of each other or whose results are never used. The operations are not ordered with respect to each other except when there is a dependency; this is useful for reordering operations, something compilers really like to do. The reason combinatorial circuits are the best circuits is that they are directed acyclic graphs (DAGs) which admit really nice algorithms. For example, longest path in a graph is NP-hard (and, because P ≠ NP 8 , has complexity O(2^n)). However, if the graph is a DAG, it admits an O(n) solution! To understand this benefit, consider another program: Suppose we wanted to replace each variable with its definition like we did before. We can’t just replace each constant variable with the expression that defines it though, because we would wind up with a different program! Now, we pick up an extra term because the squaring operation is no longer unused! We can put this into circuit form, but it requires inserting new variables for every mutation. But we can’t do this when complex control flow is involved! So all of our algorithms need to carefully account for mutations and program order, meaning that we don’t get to use the nice graph algorithms without careful modification. SSA stands for “static single assignment”, and was developed in the 80s as a way to enhance the existing three-argument code (where every statement is in the form ) so that every program was circuit-like, using a very similar procedure to the one described above. The SSA invariant states that every variable in the program is assigned to by precisely one operation. If every operation in the program is visited once, they form a combinatorial circuit. Transformations are required to respect this invariant. In circuit form, a program is a graph where operations are nodes, and “registers” (which is what variables are usually called in SSA) are edges (specifically, each output of an operation corresponds to a register). But, again, control flow. We can’t hope to circuitize a loop, right? The key observation of SSA is that most parts of a program are circuit-like. A basic block is a maximal circuital component of a program. Simply put, it is a sequence of non-control flow operations, and a final terminator operation that transfers control to another basic block. The basic blocks themselves form a graph, the control flow graph , or CFG. This formulation of SSA is sometimes called SSA-CFG 9 . This graph is not a DAG in general; however, separating the program into basic blocks conveniently factors out the “non-DAG” parts of the program, allowing for simpler analysis within basic blocks. There are two equivalent formalisms for SSA-CFG. The traditional one uses special “phi” operations (often called phi nodes , which is what I will call them here) to link registers across basic blocks. This is the formalism LLVM uses.
A more modern approach, used by MLIR, is block arguments : each basic block specifies parameters, like a function, and blocks transferring control flow to it must pass arguments of those types to it. Let’s look at some code. First, consider the following C function which calculates Fibonacci numbers using a loop. How might we express this in an SSA-CFG IR? Let’s start inventing our SSA IR! It will look a little bit like LLVM IR, since that’s what I’m used to looking at. Every block ends in a , which transfers control to one of several possible blocks. In the process, it calls that block with the given arguments. One can think of a basic block as a tiny function which tails 10 into other basic blocks in the same function. aside Phi Nodes LLVM IR is… older, so it uses the older formalism of phi nodes. “Phi” comes from “phony”, because it is an operation that doesn’t do anything; it just links registers from predecessors. A operation is essentially a switch-case on the predecessors, each case selecting a register from that predecessor (or an immediate). For example, has two predecessors, the implicit entry block , and . In a phi node IR, instead of taking a block argument for , it would specify The value of the operation is the value from whichever block jumped to this one. This can be awkward to type out by hand and read, but is a more convenient representation for describing algorithms (just “add a phi node” instead of “add a parameter and a corresponding argument”) and for the in-memory representation, but is otherwise completely equivalent. It’s a bit easier to understand the transformation from C to our IR if we first rewrite the C to use goto instead of a for loop: However, we still have mutation in the picture, so this isn’t SSA. To get into SSA, we need to replace every assignment with a new register, and somehow insert block arguments… The above IR code is already partially optimized; the named variables in the C program have been lifted out of memory and into registers. If we represent each named variable in our C program with a pointer, we can avoid needing to put the program into SSA form immediately. This technique is used by frontends that lower into LLVM, like Clang. We’ll enhance our IR by adding a declaration for functions, which defines scratch space on the stack for the function to use. Each stack slot produces a pointer that we can from and to. Our Fibonacci function would now look like so: Any time we reference a named variable, we load from its stack slot, and any time we assign it, we store to that slot. This is very easy to get into from C, but the code sucks because it’s doing lots of unnecessary pointer operations. How do we get from this to the register-only function I showed earlier? aside Program Order We want program order to not matter for the purposes of reordering, but as we’ve written code here, program order does matter: loads depend on prior stores but stores don’t produce a value that can be used to link the two operations. We can restore not having program order by introducing operands representing an “address space”; loads and stores take an address space as an argument, and stores return a new address space. An address space, or , represents the state of some region of memory. Loads and stores are independent when they are not connected by a argument. This type of enhancement is used by Go’s SSA IR, for example. However, it adds a layer of complexity to the examples, so instead I will hand-wave this away. 
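Before moving on to the graph theory, here is a rough sketch of how the IR described so far (blocks with parameters, straight-line operations, and a terminator that passes arguments to its successors) might be modelled as data. All names and shapes here are invented for illustration; this is not the article’s actual IR:

```typescript
// A toy model of an SSA-CFG IR with block arguments. Invented for illustration.
type Reg = string;

interface Op {
  result?: Reg;      // at most one result, defined exactly once (the SSA invariant)
  opcode: string;    // e.g. 'add', 'mul', 'load', 'store'
  operands: Reg[];
}

interface Goto {
  target: string;    // name of the destination block
  args: Reg[];       // values bound to the destination's parameters
}

interface Terminator {
  cond?: Reg;        // a conditional terminator picks one of several cases;
  cases: Goto[];     // an unconditional one has a single case; a return has none
}

interface Block {
  name: string;
  params: Reg[];     // block arguments (the modern alternative to phi nodes)
  ops: Op[];         // the straight-line, circuit-like part
  term: Terminator;
}

interface Func {
  entry: string;
  blocks: Map<string, Block>;
}
```

The later sketches in this section only need the block-and-successor structure, so they treat blocks more loosely than this.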
Now we need to prove some properties about CFGs that are important for the definition and correctness of our optimization passes. First, some definitions. The predecessors (or “preds”) of a basic block are the set of blocks with an outgoing edge to that block. A block may be its own predecessor. Some literature calls the above “direct” or immediate predecessors. For example, the preds of in our example are (the special name for the function entry-point) . The successors (no, not “succs”) of a basic block are the set of blocks with an outgoing edge from that block. A block may be its own successor. The successors of are and . The successors are listed in the loop’s . If a block is a transitive pred of a block , we say that weakly dominates , or that it is a weak dominator of . For example, , and both weakly dominate . However, this is not usually an especially useful relationship. Instead, we want to speak of dominators: A block is a dominator (or dominates ) if every pred of is dominated by , or if is itself. Equivalently, the dominator set of is the intersection of the dominator sets of its preds, plus . The dominance relation has some nice order properties that are necessary for defining the core graph algorithms of SSA. We only consider CFGs which are flowgraphs, that is, all blocks are reachable from the root block , which has no preds. This is necessary to eliminate some pathological graphs from our proofs. Importantly, we can always ask for an acyclic path 11 from to any block . An equivalent way to state the dominance relationship is that every path from to contains all of ’s dominators. proposition dominates iff every path from to contains . First, assume every to path contains . If is , we’re done. Otherwise we need to prove each predecessor of is dominated by ; we do this by induction on the length of acyclic paths from to . Consider preds of that are not , and consider all acyclic paths p from to ; by appending to them, we have an acyclic path p′ from to , which must contain . Because both the last and second-to-last elements of this path are not , it must lie within the path p , which is shorter than p′ . Thus, by induction, dominates , and the claim follows. Going the other way, suppose dominates , and consider a path p from to . The second-to-last element of p is a pred of ; if it is we are done. Otherwise, we can consider the path p′ made by deleting at the end. is dominated by , and p′ is shorter than p , so we can proceed by induction as above. Onto those nice properties. Dominance allows us to take an arbitrarily complicated CFG and extract from it a DAG, composed of blocks ordered by dominance. The dominance relation is a partial order. Dominance is reflexive and transitive by definition, so we only need to show that two distinct blocks can’t dominate each other. Suppose distinct and dominate each other. Pick an acyclic path p from to . Because dominates , there is a prefix p′ of this path ending in . But because dominates , some prefix p′′ of p′ ends in . But now p must contain twice, contradicting that it is acyclic. This allows us to write when dominates . There is an even more refined graph structure that we can build out of dominators, which follows immediately from the partial order theorem. The dominators of a basic block are totally ordered by the dominance relation. Suppose and , but neither dominates the other. Then, there must exist acyclic paths from to which contain both, but in different orders.
Take the subpaths of those paths which follow , and , neither of which contains . Concatenating these paths yields a path from to that does not contain , a contradiction. This tells us that the DAG we get from the dominance relation is actually a tree, rooted at . The parent of a node in this tree is called its immediate dominator . Computing dominators can be done iteratively: the dominator set of a block is the intersection of the dominator sets of its preds, plus . This algorithm runs in quadratic time. A better algorithm is the Lengauer-Tarjan algorithm. It is relatively simple, but explaining how to implement it is a bit out of scope for this article. I found a nice treatment of it here . What’s important is that we can compute the dominator tree without breaking the bank, and given any node, we can ask for its immediate dominator. Using immediate dominators, we can introduce the final, important property of dominators. The dominance frontier of a block is the set of all blocks not dominated by with at least one pred which dominates. These are points where control flow merges from distinct paths: one containing and one not. The dominance frontier of is , whose preds are and . There are many ways to calculate dominance frontiers, but with a dominator tree in hand, we can do it like this: algorithm Dominance Frontiers. For each block with more than one pred, for each of its preds, let be that pred. Add to the dominance frontier of and all of its dominators, stopping when encountering ’s immediate dominator. We need to prove that every block examined by the algorithm winds up in the correct frontiers. First, we check that every examined block is added to the correct frontier. If , where is a pred of , and a is ’s immediate dominator, then if , is not in its frontier, because must dominate . Otherwise, must be in ’s frontier, because dominates a pred but it cannot dominate , because then it would be dominated by , a contradiction. Second, we check that every frontier is complete. Consider a block . If an examined block is in its frontier, then must be among the dominators of some pred , and it must be dominated by ’s immediate dominator; otherwise, would dominate (and thus would not be in its frontier). Thus, gets added to ’s frontier. You might notice that all of these algorithms are quadratic. This is actually a very good time complexity for a compilers-related graph algorithm. Cubic and quartic algorithms are not especially uncommon, and yes, your optimizing compiler’s time complexity is probably cubic or quartic in the size of the program! (Both the iterative dominator computation and the frontier construction above are sketched in code below.)
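Here is a self-contained TypeScript sketch of both computations on a small, made-up loop-shaped CFG: the quadratic fixed-point dominator computation, immediate dominators read off the dominator sets, and the dominance-frontier walk described above. The block names are invented and nothing here is tied to the article’s Fibonacci example:

```typescript
// A made-up loop-shaped CFG: entry -> head; head -> body | exit; body -> head.
const succs = new Map<string, string[]>([
  ['entry', ['head']],
  ['head', ['body', 'exit']],
  ['body', ['head']],
  ['exit', []],
]);
const entry = 'entry';
const blocks = [...succs.keys()];

const preds = new Map<string, string[]>(blocks.map(b => [b, []]));
for (const [b, ss] of succs) for (const s of ss) preds.get(s)!.push(b);

// dom(b) = {b} ∪ ⋂ dom(p) over preds p of b; iterate to a fixed point.
// This is the simple quadratic scheme; Lengauer-Tarjan is faster in practice.
const dom = new Map<string, Set<string>>(
  blocks.map(b => [b, b === entry ? new Set([entry]) : new Set(blocks)]),
);
let changed = true;
while (changed) {
  changed = false;
  for (const b of blocks) {
    if (b === entry) continue;
    const inter = preds.get(b)!
      .map(p => dom.get(p)!)
      .reduce((acc, s) => new Set([...acc].filter(x => s.has(x))), new Set(blocks));
    inter.add(b);
    if (inter.size !== dom.get(b)!.size) { dom.set(b, inter); changed = true; }
  }
}

// The immediate dominator is the strict dominator that is dominated by all
// the other strict dominators (the parent in the dominator tree).
function idom(b: string): string | undefined {
  const strict = [...dom.get(b)!].filter(d => d !== b);
  return strict.find(d => strict.every(o => dom.get(d)!.has(o)));
}

// Dominance frontiers, as in the algorithm above: for each join block, walk up
// from each of its preds, stopping at the join block's immediate dominator.
const frontier = new Map<string, Set<string>>(blocks.map(b => [b, new Set()]));
for (const b of blocks) {
  if (preds.get(b)!.length < 2) continue;
  for (const p of preds.get(b)!) {
    let runner: string | undefined = p;
    while (runner !== undefined && runner !== idom(b)) {
      frontier.get(runner)!.add(b);
      runner = idom(runner);
    }
  }
}

console.log(idom('head'));          // 'entry'
console.log(frontier.get('body'));  // Set { 'head' }  (head is a join point)
console.log(frontier.get('head'));  // Set { 'head' }  (a loop header sits in its own frontier)
```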
Ok. Let’s construct an optimization. We want to figure out if we can replace a load from a pointer with the most recent store to that pointer. This will allow us to fully lift values out of memory by cancelling out store/load pairs. This will make use of yet another implicit graph data structure. The dataflow graph is the directed graph made up of the internal circuit graphs of each basic block, connected along block arguments. To follow a use-def chain is to walk this graph forward from an operation to discover operations that potentially depend on it, or backwards to find operations it potentially depends on. It’s important to remember that the dataflow graph, like the CFG, does not have a well-defined “up” direction. Navigating it and the CFG requires the dominator tree. One other important thing to remember here is that every instruction in a basic block always executes if the block executes. In much of this analysis, we need to appeal to “program order” to select the last load in a block, but we are always able to do so. This is an important property of basic blocks that makes them essential for constructing optimizations. For a given , we want to identify all loads that depend on it. We can follow the use-def chain of to find which blocks contain loads that potentially depend on the store (call it ). First, we can eliminate loads within the same basic block (call it ). Replace all instructions after (but before any other s, in program order) with ’s def. If is not the last store in this block, we’re done. Otherwise, follow the use-def chain of to successors which use , i.e., successors whose case has as at least one argument. Recurse into those successors, now replacing the pointer of interest with the parameters of the successor which were set to (more than one argument may be ). If successor loads from one of the registers holding , replace all such loads before a store to . We also now need to send into somehow. This is where we run into something of a wrinkle. If has exactly one predecessor, we need to add a new block argument to pass whichever register is holding (which exists by induction). If is already passed into by another argument, we can use that one. However, if has multiple predecessors, we need to make sure that every path from to sends , and canonicalizing those will be tricky. Worse still, if is in ’s dominance frontier, a different store could be contributing to that load! For this reason, dataflow from stores to loads is not a great strategy. Instead, we’ll look at dataflow from loads backwards to stores (in general, dataflow from uses to defs tends to be more useful), which we can use to augment the above forward dataflow analysis to remove the complex issues around dominance frontiers. Let’s analyze loads instead. For each in , we want to determine all stores that could potentially contribute to its value. We can find those stores as follows: We want to be able to determine which register in a given block corresponds to the value of , and then find its last store in that block. To do this, we’ll flood-fill the CFG backwards in BFS order. This means that we’ll follow preds (through the use-def chain) recursively, visiting each pred before visiting its preds, and never revisiting a basic block (except we may need to come back to at the end). Determining the “equivalent” 12 of in (we’ll call it ) can be done recursively: while examining , follow the def of . If is a block parameter, for each pred , set to the corresponding argument in the case in ’s . Using this information, we can collect all stores that the load potentially depends on. If a predecessor stores to , we add the last such store in (in program order) to our set of stores, and do not recurse to ’s preds (because this store overwrites all past stores). Note that we may revisit in this process, and collect a store to from it if one occurs in the block. This is necessary in the case of loops. The result is a set of pairs. In the process, we also collected a set of all blocks visited, , which are dominators of which we need to plumb a through. This process is called memory dependency analysis , and is a key component of many optimizations. Not all contributing operations are stores. Some may be references to globals (which we’re disregarding), or function arguments or the results of a function call (which means we probably can’t lift this load).
For example, if gets traced all the way back to a function argument, then there is a code path which loads from a pointer whose stores we can’t see. It may also trace back to a stack slot that is potentially not stored to. This means there is a code path that can potentially load uninitialized memory. Like LLVM, we can assume this is not observable behavior, so we can discount such dependencies. If all of the dependencies are uninitialized loads, we can potentially delete not just the load, but operations which depend on it (reverse dataflow analysis is the origin of so-called “time-traveling” UB). Now that we have the full set of dependency information, we can start lifting loads. Loads can be safely lifted when all of their dependencies are stores in the current function, or dependencies we can disregard thanks to UB in the surface language (such as loads or uninitialized loads). There is a lot of fuss in this algorithm about plumbing values through block arguments. A lot of IRs make a simplifying change, where every block implicitly receives the registers from its dominators as block arguments. I am keeping the fuss because it makes it clearer what’s going on, but in practice, most of this plumbing, except at dominance frontiers, would be happening in the background. Suppose we can safely lift some load. Now we need to plumb the stored values down to the load. For each block in (all other blocks will now be in unless stated otherwise). We will be building two mappings: one , which is the register equivalent to in that block. We will also be building a map , which is the value that must have in that block. Prepare a work queue, with each in it initially. Pop a block from the queue. For each successor (in ): If isn’t already defined, add it as a block argument. Have pass to that argument. If hasn’t been visited yet, and isn’t the block containing the load we’re deleting, add it to the queue. Once we’re done, if is the block that contains the load, we can now replace all loads to before any stores to with . There are cases where this whole process can be skipped, by applying a “peephole” optimization. For example, stores followed by loads within the same basic block can be optimized away locally, leaving the heavyweight analysis for cross-block store/load pairs. Here’s the result of doing dependency analysis on our Fibonacci function. Each load is annotated with the blocks and stores in . Let’s look at . Its contributing loads are in and . So we add a new parameter : in , we call that parameter with (since that’s stored to it in ), while in , we pass . What about L4? The contributing loads are also in and , but one of those isn’t a pred of . is also in the subgraph for this load, though. So, starting from , we add a new parameter to and feed (the stored value, an immediate this time) through it. Now looking at , we see there is already a parameter for this load ( ), so we just pass as that argument. Now we process , which pushed onto the queue. gets a new parameter , which is fed ’s own . We do not re-process , even though it also appears in ’s gotos, because we already visited it. After doing this for the other two loads, we get this: After lifting, if we know that a stack slot’s pointer does not escape (i.e., none of its uses wind up going into a function call 13 ) or a write to a global (or a pointer that escapes), we can delete every store to that pointer.
If we delete every store to a stack slot, we can delete the stack slot altogether (there should be no loads left for that stack slot at this point). This analysis is simple, because it assumes pointers do not alias in general. Alias analysis is necessary for more accurate dependency analysis. This is necessary, for example, for lifting loads of fields of structs through subobject pointers, and dealing with pointer arithmetic in general. However, our dependency analysis is robust to passing different pointers as arguments to the same block from different predecessors. This is the case that is specifically handled by all of the fussing about with dominance frontiers. This robustness ultimately comes from SSA’s circuital nature. Similarly, this analysis needs to be tweaked to deal with something like (a ternary, essentially). s of pointers need to be replaced with s of the loaded values, which means we need to do the lifting transformation “all at once”: lifting some liftable loads will leave the IR in an inconsistent state, until all of them have been lifted. Many optimizations will make a mess of the CFG, so it’s useful to have simple passes that “clean up” the mess left by transformations. Here are some easy examples. If an operation’s result has zero uses, and the operation has no side effects, it can be deleted. This allows us to then delete operations that it depended on that now have no uses (and no side effects). Doing this is very simple, due to the circuital nature of SSA: collect all instructions whose outputs have zero uses, and delete them. Then, examine the defs of their operands; if those operations now have no uses, delete them, and recurse. This bubbles up all the way to block arguments. Deleting block arguments is a bit trickier, but we can use a work queue to do it. Put all of the blocks into a work queue. Pop a block from the queue. Run unused result elimination on its operations. If it now has parameters with no uses, remove those parameters. For each pred, delete the corresponding arguments to this block. Then place those preds into the work queue (since some of their operations may have lost their last use). If there is still work left, go to 1. (The per-operation part of this cleanup is sketched in code below.)
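Here is a minimal TypeScript sketch of the per-operation unused-result elimination just described. The op shape is invented, and block-parameter deletion (the trickier work-queue variant) is left out:

```typescript
// Sketch: unused-result elimination over the circuit-like ops of one function.
// "pure" means the op has no side effects and may be deleted when unused.
interface Op {
  result?: string;
  operands: string[];
  pure: boolean;
}

function eliminateUnusedResults(ops: Op[]): Op[] {
  const live = new Set(ops);

  // Count how many ops use each register.
  const uses = new Map<string, number>();
  for (const op of ops)
    for (const r of op.operands) uses.set(r, (uses.get(r) ?? 0) + 1);

  // Seed with pure ops whose result is never used, then keep deleting:
  // removing an op decrements its operands' use counts, which may expose
  // further dead ops. This "bubbles up" along the use-def chains.
  const queue = ops.filter(op => op.pure && op.result !== undefined && !(uses.get(op.result) ?? 0));
  while (queue.length > 0) {
    const dead = queue.pop()!;
    if (!live.has(dead)) continue;
    live.delete(dead);
    for (const r of dead.operands) {
      const n = (uses.get(r) ?? 1) - 1;
      uses.set(r, n);
      if (n === 0) {
        const def = ops.find(op => op.result === r);
        if (def && def.pure && live.has(def)) queue.push(def);
      }
    }
  }
  return ops.filter(op => live.has(op));
}
```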
There are many CFG configurations that are redundant and can be simplified to reduce the number of basic blocks. For example, detecting unreachable code can help delete blocks. Other optimizations may cause the at the end of a function to be empty (because all of its successors were optimized away). We treat an empty as being unreachable (since it has no cases!), so we can delete every operation in the block up to the last non-pure operation. If we delete every instruction in the block, we can delete the block entirely, and delete it from its preds’ s. This is a form of dead code elimination , or DCE, which combines with the previous optimization to aggressively delete redundant code. Some jumps are redundant. For example, if a block has exactly one pred and one successor, the pred’s case for that block can be wired directly to the successor. Similarly, if two blocks are each other’s unique predecessor/successor, they can be fused , creating a single block by connecting the input blocks’ circuits directly, instead of through a . If we have a ternary operation, we can do more sophisticated fusion. If a block has two successors, both of which have the same unique successor, and those successors consist only of gotos, we can fuse all four blocks, replacing the CFG diamond with a . In terms of C, this is the following transformation: LLVM’s CFG simplification pass is very sophisticated and can eliminate complex forms of control flow. I am hoping to write more about SSA optimization passes. This is a very rich subject, and viewing optimizations in isolation is a great way to understand how a sophisticated optimization pipeline is built out of simple, dumb components. It’s also a practical application of graph theory that shows just how powerful it can be, and (at least in my opinion), is an intuitive setting for understanding graph theory, which can feel very abstract otherwise. In the future, I’d like to cover CSE/GVN, loop optimizations, and, if I’m feeling brave, getting out of SSA into a finite-register machine (backends are not my strong suit!). Specifically the Swift frontend before lowering into LLVM IR.  ↩ Microsoft Visual C++, a non-conforming C++ compiler sold by Microsoft  ↩ HotSpot is the JVM implementation provided by OpenJDK; C2 is the “second compiler”, which has the best performance among HotSpot’s Java execution engines.  ↩ V8 is Chromium’s JavaScript runtime.  ↩ SpiderMonkey is Firefox’s JavaScript runtime.  ↩ The Android Runtime (ART) is the “JVM” (scare quotes) on the Android platform.  ↩ The Glasgow Haskell Compiler (GHC) does not use SSA; it (like some other pure-functional languages) uses a continuation-oriented IR (compare to Scheme’s ).  ↩ Every compiler person firmly believes that P ≠ NP , because program optimization is full of NP-hard problems and we would have definitely found polynomial-time ideal register allocation by now if it existed.  ↩ Some more recent IRs use a different version of SSA called “structured control flow”, or SCF. Wasm is a notable example of an SCF IR. SSA-SCF is equivalent to SSA-CFG, and polynomial-time algorithms exist for losslessly converting between them (LLVM compiling Wasm, for example, converts its CFG into SCF using a “relooping algorithm”). In SCF, operations like switch statements and loops are represented as macro operations that contain basic blocks. For example, a operation might take a value as input, select a basic block to execute based on that, and return the value that basic block evaluates to as its output. RVSDG is a notable innovation in this space, because it allows circuit analysis of entire imperative programs. I am covering SSA-CFG instead of SSA-SCF simply because it’s more common, and because it’s what LLVM IR is. See also this MLIR presentation for converting between the two.  ↩ Tail calling is when a function call is the last operation in a function; this allows the caller to jump directly to the callee, recycling its own stack frame for it instead of requiring it to allocate its own.  ↩ Given any path from to , we can make it acyclic by replacing each subpath from to with a single node.  ↩ When moving from a basic block to a pred, a register in that block which is defined as a block parameter corresponds to some register (or immediate) in each predecessor. That is the “equivalent” of . One possible option for the “equivalent” is an immediate: for example, or the address of a global. In the case of a global , assuming no data races, we would instead need alias information to tell if stores to this global within the current function (a) exist and (b) are liftable at all. If the equivalent is , we can proceed in one of two ways depending on optimization level. If we want loads of to trap (as in Go), we need to mark this load as not being liftable, because it may trap.
If we want loads of to be UB, we simply ignore that pred, because we can assume (for our analysis) that if the pointer is , it is never loaded from.  ↩ Returned stack pointers do not escape: stack slots’ lifetimes end at function exit, so we return a dangling pointer, which we assume are never loaded. So stores to that pointer before returning it can be discarded.  ↩

Brain Baking 3 weeks ago

My (Retro) Desk Setup in 2025

A lot has happened since the desk setup post from March 2024 —that being I got kicked out of my usual cosy home office upstairs as it was being rebranded into our son’s bedroom. We’ve been trying to fit the office space into the rest of the house by exploring different alternatives: clear a corner of our bedroom and shove everything in there, cut down on stuff and integrate it into the living room, … None of the options felt particularly appealing to me. I grew attached to the upstairs place and didn’t want to lose the skylight. And then we renovated our home, resulting in more shuffling around of room designations: the living room migrated to the new section with high glass windows to better connect with the back garden. That logically meant I could claim the vacant living room space. Which I did: My home office setup since May 2025. Compared to the old setup, quite a few things changed. First, it’s clear that the new space is much more roomy. But that doesn’t automatically mean I’m able to fit more stuff into it. After comparing both setups, you’ll probably wonder where most of my retro hardware went off to: only the 486 made it into the corner on the left. I first experimented with replicating the same setup downstairs, resulting in a very long desk shoved under the window containing the PC towers and screens. That worked, as again there’s enough space, but at the same time, it didn’t at all: putting a lot of stuff in front of the window not only blocks the view, it also makes the office feel cramped and cluttered. That is why the desk is now split into two. The WinXP and Win98 machines have been temporarily stashed away in a closet as I still have to find a way to fit the third desk somewhere at the back (not pictured). Currently, a stray cupboard from the old living room is refusing to let go. We have some ideas to better organize the space but at the moment I can’t find the energy to make it happen. I haven’t even properly reconnected the 486 tower. The messy cables in the photo have been neatly tucked away by now, at least that’s something. Next, since I also have more wall space, I moved all board games into a new Kallax in the new space (pictured on the left). There’s still ample space left to welcome new board games, which was becoming a big problem in the old shelf in the hallway that now holds the games of the kids. On the opposite side of the wall (not pictured), I’ve mounted the Billy bookcases from upstairs that now bleed into the back wall (pictured on the right). These two components are new: the small one is currently holding Switch games and audio CDs and the one on the far right is still mostly empty except for fountain pen ink on the top shelf. The problem with filling all that wall space is that there’s almost none left to decorate with a piece of art. Fortunately, the Monkey Island posters survived the move, but I was hoping to be able to put up something else. The big window doesn’t help here: the old space’s skylight allowed me to optimize the wall space. The window is both a blessing and a curse. Admittedly, it’s very nice to be able to stare outside in-between the blue screen sessions, especially if it’s spring/summer when everything is bright green. The new space is far from finished. I intend to put a table down there next to the board game shelf so that noisy gaming sessions don’t bother the people in the living room. The retro hardware pieces deserve a permanent spot and I’m bummed out that some of them had to be (hopefully temporarily) stowed away.
A KVM switch won’t help here as I already optimized the monitor usage (see the setup of previous years ). My wife suggested throwing a TV in there to connect the SNES and GameCube but the books are eating up all the wall space and I don’t want the office to degrade into a cluttered mess. I’m not even sure whether the metre-long desk is worth it for just a laptop and a second screen compared to the one I used before. The relax chair now used for nightly baby feeds still needs to find its way back here as well. I imagine that in a year things will look different yet again. Hopefully, by then, it will feature more retroness . Related topics: / setup / By Wouter Groeneveld on 12 October 2025.  Reply via email .


What Dynamic Typing Is For

Unplanned Obsolescence is a blog about writing maintainable, long-lasting software. It also frequently touts—or is, at the very least, not inherently hostile to—writing software in dynamically-typed programming languages. These two positions are somewhat at odds. Dynamically-typed languages encode less information. That’s a problem for the person reading the code and trying to figure out what it does. This is a simplified version of an authentication middleware that I include in most of my web services: it checks an HTTP request to see if it corresponds to a logged-in user’s session. Pretty straightforward stuff. The function gets a cookie from the HTTP request, checks the database to see if that token corresponds to a user, and then returns the user if it does. Line 2 fetches the cookie from the request, line 3 gets the user from the database, and the rest either returns the user or throws an error. There are, however, some problems with this. What happens if there’s no cookie included in the HTTP request? Will it return or an empty string? Will even exist if there are no cookies at all? There’s no way to know without looking at the implementation (or, less reliably, the documentation). That doesn’t mean there isn’t an answer! A request with no cookie will return . That results in a call, which returns (the function checks for that). is a falsy value in JavaScript, so the conditional evaluates to false and throws an . The code works and it’s very readable, but you have to do a fair amount of digging to ensure that it works reliably. That’s a cost that gets paid in the future, anytime the “missing token” code path needs to be understood or modified. That cost reduces the maintainability of the service. Unsurprisingly, the equivalent Rust code is much more explicit. In Rust, the tooling can answer a lot more questions for me. What type is ? A simple hover in any code editor with an LSP tells me, definitively, that it’s . Because it’s Rust, you have to explicitly check if the token exists; ditto for whether the user exists. That’s better for the reader too: they don’t have to wonder whether certain edge cases are handled. Rust is not the only language with strict, static typing. At every place I’ve ever worked, the longest-running web services have all been written in Java. Java is not as good as Rust at forcing you to show your work and handle edge cases, but it’s much better than JavaScript. Putting aside the question of which one I prefer to write, if I find myself in charge of a production web service that someone else wrote, I would much prefer it to be in Java or Rust than JavaScript or Python. Conceding that, ceteris paribus , static typing is good for software maintainability, one of the reasons that I like dynamically-typed languages is that they encourage a style I find important for web services in particular: writing to the DSL. A DSL (domain-specific language) is a programming language that’s designed for a specific problem area. This is in contrast to what we typically call “general-purpose programming languages” (e.g. Java, JavaScript, Python, Rust), which can reasonably be applied to most programming tasks. Most web services have to contend with at least three DSLs: HTML, CSS, and SQL. A web service with a JavaScript backend has to interface with, at a minimum , four programming languages: one general-purpose and three DSLs.
If you have the audacity to use something other than JavaScript on the server, then that number goes up to five, because you still need JavaScript to augment HTML. That’s a lot of languages! How are we supposed to find developers who can do all this stuff ? The answer that a big chunk of the industry settled on is to build APIs so that the domains of the DSLs can be described in the general-purpose programming language. Instead of writing HTML… …you can write JSX, a JavaScript syntax extension that supports tags. This has the important advantage of allowing you to include dynamic JavaScript expressions in your markup. And now we don’t have to kick out to another DSL to write web pages. Can we start abstracting away CSS too? Sure can! This example uses styled-components . This is a tactic I call “expanding the bounds” of the programming language. In an effort to reduce complexity, you try to make one language express everything about the project. In theory, this reduces the number of languages that one needs to learn to work on it. The problem is that it usually doesn’t work. Expressing DSLs in general-purpose programming syntax does not free you from having to understand the DSL—you can’t actually use styled-components without understanding CSS. So now a prospective developer has to both understand CSS and a new CSS syntax that only applies to the styled-components library. Not to mention, it is almost always a worse syntax. CSS is designed to make expressing declarative styles very easy, because that’s the only thing CSS has to do. Expressing this in JavaScript is naturally way clunkier. Plus, you’ve also tossed the web’s backwards compatibility guarantees. I picked styled-components because it’s very popular. If you built a website with styled-components in 2019 , didn’t think about the styles for a couple years, and then tried to upgrade it in 2023 , you would be two major versions behind. Good luck with the migration guide . CSS files, on the other hand, are evergreen . Of course, one of the reasons for introducing JSX or CSS-in-JS is that they add functionality, like dynamic population of values. That’s an important problem, but I prefer a different solution. Instead of expanding the bounds of the general-purpose language so that it can express everything, another strategy is to build strong and simple API boundaries between the DSLs. Some benefits of this approach include: The following example uses a JavaScript backend. A lot of enthusiasm for htmx (the software library I co-maintain) is driven by communities like Django and Spring Boot developers, who are thrilled to no longer be bolting on a JavaScript frontend to their website; that’s a core value proposition for hypermedia-driven development . I happen to like JavaScript though, and sometimes write services in NodeJS, so, at least in theory, I could still use JSX if I wanted to. What I prefer, and what I encourage hypermedia-curious NodeJS developers to do, is use a template engine . This bit of production code I wrote for an events company uses Nunjucks , a template engine I once (fondly!) called “abandonware” on stage . Other libraries that support Jinja -like syntax are available in pretty much any programming language. This is just HTML with basic loops ( ) and data access ( ). I get very frustrated when something that is easy in HTML is hard to do because I’m using some wrapper with inferior semantics; with templates, I can dynamically build content for HTML without abstracting it away. 
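To make the shape of this concrete, here is a rough sketch of a Jinja-style template being rendered from a plain object, written as TypeScript against the nunjucks package. The event data and markup are invented; this is not the article’s production code:

```typescript
// Illustrative only: a Jinja-style template rendered server-side from plain data.
// Assumes the `nunjucks` npm package; names and data are made up.
import nunjucks from 'nunjucks';

const template = `
<ul class="event-list">
  {% for event in events %}
    <li>
      <a href="/events/{{ event.id }}">{{ event.name }}</a>
      <time>{{ event.date }}</time>
    </li>
  {% endfor %}
</ul>`;

const html = nunjucks.renderString(template, {
  events: [
    { id: 1, name: 'Launch Party', date: '2025-11-01' },
    { id: 2, name: 'Board Game Night', date: '2025-11-08' },
  ],
});

console.log(html); // plain HTML, ready to send as the HTTP response
```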
Populating this template in JavaScript is so easy. You just give it a JavaScript object with an field. That's not particularly special on its own—many languages support serialized key-value pairs. This strategy really shines when you start stringing it together with SQL. Let's replace that database function call with an actual query, using an interface similar to . I know the above code is not everybody's taste, but I think it's marvelous. You get to write all parts of the application in the language best suited to each: HTML for the frontend and SQL for the queries. And if you need to do any additional logic between the database and the template, JavaScript is still right there. One result of this style is that it increases the percentage of your service that is specified declaratively. The database schema and query are declarative, as is the HTML template. The only imperative code in the function is the glue that moves that query result into the template: two statements in total. Debugging is also dramatically easier. I typically do two quick things to narrow down the location of the bug: CMD+U to view the source (if the missing data is in the HTML, it's a frontend problem), and run the query in the database (if the missing data is in the SQL, it's a problem with the GET route). Those two steps are easy, can be done in production with no deployments, and provide excellent signal on the location of the error. Fundamentally, what's happening here is a quick check at the two hard boundaries of the system: the one between the server and the client, and the one between the server and the database. Similar tools are available to you if you abstract over those layers, but they are lessened in usefulness. Every web service has network requests that can be inspected, but putting most frontend logic in the template means that the HTTP response's data ("does the date ever get sent to the frontend?") and functionality ("does the date get displayed in the right HTML element?") can be inspected in one place, with one keystroke. Every database can be queried, but using the database's native query language in your server means you can validate both the stored data ("did the value get saved?") and the query ("does the code ask for the right value?") independent of the application. By pushing so much of the business logic outside the general-purpose programming language, you reduce the likelihood that a bug will exist in the place where it is hardest to track down—runtime server logic. You'd rather the bug be a malformed SQL query or HTML template, because those are easy to find and easy to fix. When combined with the router-driven style described in Building The Hundred-Year Web Service, you get simple and debuggable web systems. Each HTTP request is a relatively isolated function call: it takes some parameters, runs an SQL query, and returns some HTML. In essence, dynamically-typed languages help you write the least amount of server code possible, leaning heavily on the DSLs that define web programming while validating small amounts of server code via means other than static type checking.
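A rough JVM equivalent of that glue, with an invented schema and assuming a SQLite JDBC driver on the classpath: the query stays in SQL, and the only imperative code maps rows into a template model like the one in the previous sketch.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class QueryToTemplate {
    // Runs the declarative SQL and returns plain rows; table and column names are hypothetical.
    static List<Map<String, String>> upcomingEvents(Connection db) throws Exception {
        String sql = "SELECT name, starts_at FROM events WHERE starts_at > CURRENT_TIMESTAMP ORDER BY starts_at";
        List<Map<String, String>> rows = new ArrayList<>();
        try (PreparedStatement stmt = db.prepareStatement(sql);
             ResultSet rs = stmt.executeQuery()) {
            while (rs.next()) {
                // Assumes non-null columns; a real handler would be more defensive.
                rows.add(Map.of("name", rs.getString("name"), "date", rs.getString("starts_at")));
            }
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        try (Connection db = DriverManager.getConnection("jdbc:sqlite:events.db")) {
            // The only imperative glue: run the query, bind the rows into the template model.
            Map<String, Object> model = Map.of("events", upcomingEvents(db));
            // new Jinjava().render(template, model) as in the previous sketch would produce the HTML.
            System.out.println(model);
        }
    }
}
```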
To finish, let's take a look at the equivalent code in Rust, using rusqlite, minijinja, and a quasi-hypothetical server implementation: I am again obfuscating some implementation details (Are we storing human-readable dates in the database? What's that universal result type?). The important part is that this blows. Most of the complexity comes from the need to tell Rust exactly how to unpack that SQL result into a typed data structure, and then into an HTML template. The struct is declared so that Rust knows to expect a for . The derive macros create a representation that minijinja knows how to serialize. It's tedious. Worse, after all that work, the compiler still doesn't do the most useful thing: check whether is the correct type for . If it turns out that can't be represented as a (maybe it's a blob), the query will compile correctly and then fail at runtime. From a safety standpoint, we're not really in a much better spot than we were with JavaScript: we don't know if it works until we run the code. Speaking of JavaScript, remember that code? That was great! Now we have no idea what any of these types are, but if we run the code and we see some output, it's probably fine. By writing the JavaScript version, you are banking that you've made the code so highly auditable by hand that the compile-time checks become less necessary. In the long run, this is always a bad bet, but at least I'm not writing 150% more code for 10% more compile-time safety. The "expand the bounds" solution to this is to pull everything into the language's type system: the database schema, the template engine, everything. Many have trod that path; I believe it leads to madness (and toolchain lock-in). Is there a better one? I believe there is. The compiler should understand the DSLs I'm writing and automatically map them to types it understands. If it needs more information—like a database schema—to figure that out, that information can be provided. Queries correspond to columns with known types—the programming language can infer that is of type . HTML has context-dependent escaping rules—the programming language can validate that is being used in a valid element and escape it correctly. With this functionality in the compiler, if I make a database migration that would render my usage of a dependent variable in my HTML template invalid, the compiler will show an error. All without losing the advantages of writing the expressive, interoperable, and backwards-compatible DSLs that comprise web development. Dynamically-typed languages show us how easy web development can be when we ditch the unnecessary abstractions. Now we need tooling to make it just as easy in statically-typed languages too. Thanks to Meghan Denny for her feedback on a draft of this blog. Language extensions that just translate the syntax are alright by me, like generating HTML with s-expressions, ocaml functions, or zig comptime functions. I tend to end up just using templates, but language-native HTML syntax can be done tastefully, and they are probably helpful on the road to achieving the DX I'm describing; I've never seen them done well for SQL. Sqlx and sqlc seem to have the right idea, but I haven't used either because I stick to SQLite-specific libraries to avoid async database calls. I don't know as much about compilers as I'd like to, so I have no idea what kind of infrastructure would be required to make this work with existing languages in an extensible way. I assume it would be hard.

0 views
Kix Panganiban 1 months ago

Python feels sucky to use now

I've been writing software for over 15 years at this point, and most of that time has been in Python. I've always been a Python fan. When I first picked it up in uni, I felt it was fluent, easy to understand, and simple to use -- at least compared to other languages I was using at the time, like Java, PHP, and C++. I've kept myself mostly up to date with "modern" Python -- think pure tooling, , and syntax, and strict almost everywhere. For the most part, I've been convinced that it's fine. But lately, I've been running into frustrations, especially with async workflows and type safety, that made me wonder if there's a better tool for some jobs. And then I had to help rewrite a service from Python to Typescript + Bun. I'd stayed mostly detached from Typescript before, only dabbling in non-critical path code, but oh, what a different and truly joyful world it turned out to be to write code in. Here are some of my key observations: Bun is fast. It builds fast -- including installing new dependencies -- and runs fast, whether we're talking runtime performance or the direct loading of TS files. Bun's speed comes from its use of JavaScriptCore instead of V8, which cuts down on overhead, and its native bundler and package manager are written in Zig, making dependency resolution and builds lightning-quick compared to or even Python's with . When I'm iterating on a project, shaving off seconds (or minutes) on installs and builds is a game-changer -- no more waiting around for to resolve or virtual envs to spin up. And at runtime, Bun directly executes Typescript without a separate compilation step. This just feels like a breath of fresh air for developer productivity. Type annotations and type-checking in Python still feel like mere suggestions, whereas they're fundamental in Typescript. This is especially true when defining interfaces or using inheritance -- compared to ABCs (Abstract Base Classes) and Protocols in Python, which can feel clunky. In Typescript, type definitions are baked into the language - I can define an or with precise control over shapes of data, and the compiler catches mismatches while I'm writing (provided that I've enabled it in my editor). Tools like enforce this rigorously. In Python, even with strict , type hints are optional and often ignored by the runtime, leading to errors that only surface when the code runs. Plus, Python's approach to interfaces via or feels verbose and less intuitive -- while Typescript's type system feels like a better mental model for reasoning about code. About 99% of web-related code is async. Async is first-class in Typescript and Bun, while it's still a mess in Python. Sure -- Python's and the list of packages supporting it have grown, but it often feels forced and riddled with gotchas and pitfalls. In Typescript, / is a core language feature, seamlessly integrated with the event loop in environments like Node.js or Bun. Promises are a natural part of the ecosystem, and most libraries are built with async in mind from the ground up. Compare that to Python, where was bolted on later (introduced in 3.5), and the ecosystem (in 2025!) is still only slowly catching up. I've run into issues with libraries that don't play nicely with , forcing me to mix synchronous and asynchronous code in awkward ways. Sub-point: Many Python patterns still push for workers and message queues -- think RQ and Celery -- when a simple async function in Typescript could handle the same task with less overhead. In Python, if I need to handle background tasks or I/O-bound operations, the go-to solution often involves spinning up a separate worker process with something like Celery, backed by a broker like Redis or RabbitMQ. This adds complexity -- now I'm managing infrastructure, debugging message serialization, and dealing with potential failures in the queue. In Typescript with Bun, I can often just write an function, maybe wrap it in a or use a lightweight library like if I need queuing, and call it a day. For a recent project, I replaced a Celery-based task system with a simple async setup in Typescript, cutting down deployment complexity and reducing latency since there's no broker middleman. It's not that Python can't do async -- it's that the cultural and technical patterns around it often lead to over-engineering for problems that Typescript, in my opinion, solves more elegantly. This experience has me rethinking how I approach projects. While I'm not abandoning Python -- it's still my go-to for many things -- I'm excited to explore more of what Typescript and Bun have to offer.
It's like discovering a new favorite tool in the shed, and I can't wait to see what I build with it next.

0 views
Dangling Pointers 1 months ago

Principles and Methodologies for Serial Performance Optimization

Principles and Methodologies for Serial Performance Optimization Sujin Park, Mingyu Guan, Xiang Cheng, and Taesoo Kim OSDI'25 This paper is a psychological trojan horse for computer nerds of a certain vintage. Every paragraph of sections 3 and 4 inflates the ego a bit more. One arrives at section 5 feeling good about their performance optimization skillset, and then one learns that these skills can be replaced by an LLM. A faint hissing sound reaches one's ears as one's ego deflates into a shriveled piece of plastic on the ground. Eight Methodologies The authors reviewed 477 papers containing specific instances of serial performance optimization and boiled the techniques down into eight categories. Serial is the operative word here: this paper is about optimizing portions of code that cannot be parallelized. However, some of the techniques are applicable in computations that comprise a serial portion (the critical path) and a parallel portion. Here are the techniques: Batching: amortizing a fixed cost over many items. For example, refactoring code so that some computation can be hoisted out of a (per-item) loop. Caching: store the results of a computation in memory so that it can be used later on. Memoization is a good example. Precomputing: compute something earlier than is necessary (possibly speculatively). This works in programs which alternate between serial and parallel portions. If the precomputation can be done during a parallel portion, then it can be thought of as "free" because it is off of the critical path. Deferring: act like one of my high schoolers: don't do work now when it could be done earlier. This works great in late spring, when a teacher realizes they have assigned too much work for the year so they cancel an assignment that most students (not mine) have already completed. Deferring interacts with other techniques: similar to precomputing, deferring can move work off of the critical path, and deferring can enable batching by deferring work until a large batch has been accumulated. The remaining techniques: compute a quick and dirty approximation to the right answer rather than the exactly right answer. Make a generic component more efficient for a specific use case. For example, a library user could pass hints at runtime which gives the library more information to make tradeoffs. Or profile guided optimization to enable the compiler to make better decisions when compiling the generic library. Use a hardware accelerator, to avoid paying the Turing Tax. Chip away at performance inefficiencies caused by the abstractions in a layered software architecture. For example, DPDK allows applications to bypass many networking abstractions.
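To make the caching technique concrete in code (my own generic illustration, not an example from the paper): a small Java memoizer that stores the results of an expensive pure function so repeated calls on the hot path pay the cost only once.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class Memoizer {
    // Wraps a pure function with a cache keyed by its argument.
    static <K, V> Function<K, V> memoize(Function<K, V> f) {
        Map<K, V> cache = new ConcurrentHashMap<>();
        return key -> cache.computeIfAbsent(key, f);
    }

    public static void main(String[] args) {
        Function<Integer, Double> slow = n -> {
            // Stand-in for an expensive computation.
            double acc = 0;
            for (int i = 0; i < 10_000_000; i++) acc += Math.sqrt(n + i);
            return acc;
        };
        Function<Integer, Double> fast = memoize(slow);
        System.out.println(fast.apply(42)); // computed
        System.out.println(fast.apply(42)); // served from the cache
    }
}
```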
After finishing section 4 of the paper, I felt pretty good. My ego happily agreed with a world view that says that serial performance optimization can be expressed in terms of 8 "basis vectors", all of which I had extensive experience with. And then came section 5. The authors fine-tuned GPT-4o with a dataset derived from analyzing SOSP and OSDI papers. Each item in the dataset comprises a problem description, observations, and solutions (in terms of the 8 techniques described earlier). The authors made data and scripts available here. The fine-tuned model is called SysGPT. Table 4 shows example inputs (problems + observations), the output from GPT-4, the output from SysGPT, and the actual solution from a real-world paper. Here is an example: Source: https://www.usenix.org/conference/osdi25/presentation/park-sujin Table 5 has quantitative results, where each model is only asked to choose a subset of the 8 optimization strategies for a given problem (N represents the number of example problems given to the baseline GPT-4o model in a prompt): Source: https://www.usenix.org/conference/osdi25/presentation/park-sujin Dangling Pointers: It would be interesting to extend these recipes to also include problems which can be parallelized. This paper assumes that the problem and observations are produced by humans. It would be fascinating to see how much of that can be automated. For example, a model could have access to source code and profiling information, and output a set of observations about system performance before optimization.

0 views
Cassidy Williams 1 months ago

2000 Poops

Flash back to Spring 2020, when we were all confused and uncertain about what the world was going to look like, and unsure of how we would stay connected to each other. One of my cousins texted our cousin group chat mentioning the app Poop Map as a cheeky (heh) way of keeping up with the fam. We started a family league, and it was honestly pretty great. We’d congratulate each other on our 5-star poops, and mourn the 1-stars. Over time I made other leagues with friends online and offline, and it was really fun. I even talked about it on Scott Hanselman’s podcast when he asked about how to maintain social connections online (if you wanna hear about it, listen at the 11 minute mark in the episode). Eventually, people started to drop off the app, because… it’s dumb? Which is fair. It’s pretty dumb. But alas, I pride myself in being consistent, so I kept at it. For years. The last person I know on the app is my sister-in-law’s high school friend, also known by her very apt username, . She and I have pretty much no other contact except for this app, and yet we’ve bonded. 2000 poops feels like a good place to stop. With 12 countries covered around the world and 45 achievements in the app (including “Are you OK?” courtesy of norovirus, and “Punctuate Pooper” for going on the same day for 12 months in a row), I feel good about saying goodbye. My mom is also really happy I’m stopping. Wonder why? Anyway, goodbye, Poop Map, and goodbye to the fun usernames for the friends along the way: (that’s me), , , , , , , , , , , , , , , , , and of course, . Also, before you go, here’s a fun data visualization I made of all my entries ! Smell ya later!

0 views
Stone Tools 1 months ago

Electric Pencil on the TRS-80

As the story goes, 40-year-old filmmaker Michael Shrayer was producing a Pepsi-Cola commercial in 1974 when he had a crisis of conscience on set. He looked around at the shoot, the production, the people, the apparatus and found a kind of veil lifted from his eyes. Prior to that moment, it was work. Suddenly it was insanity in service to a "dancing bottle cap." Almost on-the-spot, he swore off Hollywood for good, tore up his union card (forfeiting union pensions in the process!), and moved to California to semi-retire as a beach bum. After moving to California, with time on his hands, he found a distraction in a new-ish gizmo on the market, the MITS Altair 8800. Once a creator, always a creator, Shrayer put his efforts into learning how the machine worked, and more importantly how to make things with it. That was not an easy task back then on a machine whose base model used physical toggle switches to input byte code. Undeterred, he put together Software Package 1, a collection of public domain development tools, then started work on ESP-1 ("Extended" Software Package 1). Tired of using a typewriter to write documentation, he wondered if he could use the Altair itself to help him compose the document. That software didn't exist, so he wrote it himself. Dubbed the Electric Pencil by him and his wife, it was used to write its own manual, and began its spread through the home computing scene. Landing on the TRS-80 series in 1978 opened it to a mass market, and the word processor genre was well and truly born. For the first couple of years Electric Pencil was the only word processing option for home microcomputers. This gave Michael Shrayer the license (some may call it arrogance, but I'll call it "work/life balance") to utterly ignore the support phone line come 5pm sharp. Customers literally had no choice but to wait until morning to call again. Shrayer was living the dream, I guess is what I'm saying. (He also said he sold 10,000 copies. At ~$200 per copy. In late 70s money. Again, "living the dream.") In 1982, James Fallows called Electric Pencil "satisfying to the soul." My expectations are considerably lower than having my soul soothed, but maybe I can at least find a place for it in my heart? I can tell already this is going to be a little painful. No undo. A bit of a clunky UX with the underlying system. No spell checking. Limited number of visible lines of text. Basic cut-copy-paste is nowhere to be seen. But maybe I don't rely on those things as much as I assume I do in this context? The primary work I'm committing to is to write this very section, the bulk of the review, within Electric Pencil itself. I'll also use it to revise the document as much as possible, but this blog platform has unique tools I need to finalize the piece. Once I feel comfortable using Pencil, I think I'll take a stab at writing some fan-fiction in it. Any Mork & Mindy fans out there? I am off to a rough start. The simple fact of the matter is that the interaction between a modern keyboard, an emulator, and Electric Pencil is not strictly harmonious. One of the primary interactions with the software worked best with a key that wasn't even part of the original TRS-80 hardware. I monkey-paw the keyboard, trying to deduce which modern key maps to which retro key and finally figure it out. I write it down on my memo pad and refer to it constantly during my evaluation. Once I'm into Electric Pencil though, I'm rather struck by its stark simplicity.
I am here to write, just me and my words; alone together with no external pressure to do anything but the task at hand. It's nice! I'm definitely reminded of any number of "distraction free word processors" in recent years and dozens of attempts to return to a clean, uncluttered interface for writing. I suppose, like Pepsi executives reinventing soulless product shoots over and over, we are similarly doomed to reinvent "a blank screen with a blinking cursor" from time to time. Once I start typing though, I realize the blank screen is less a great, untapped plain of fruitful potential and more of a Mad Max style desert. War Boys are at the ready, poised to steal my blood and guzzoline. Initially I was concerned about how I would translate my writing into something I can use on the blog. It struck me that I kind of don't need to worry about it. Electric Pencil saves in plain ASCII format so Markdown is a realistic option. Rather than bring esoteric file formats of the past to the present, we'll instead bring text formatting of the present into the past. Well, that was the working theory until I learned that there is simply no way to type things like or or and even requires a weird combination. I hadn't mentally prepared myself for the idea that the physical keyboards of the past might present their own writing challenges unique from the software. Still, and exist, making HTML input an option for the missing characters. I move on and discover organically that deletes a character. The brief adrenaline rush of feeling attuned to the vibe of the program is nice, until I saved the document. Until I tried to save the document. No lie, this is my sixth attempt at getting things working. Look, I do not claim to be a smart person. I overlooked an obvious warning message or two, simply because I didn't expect them. I was looking to the left for cross-traffic when I should have looked to the right for the coming train. A fun fact about Electric Pencil is that when you save your work, you are saving from the current location of the cursor to the end of the document. Think for a moment about the implications of that statement. There will be no advance warning when you do this, only an after-the-fact message about what was done. It will obediently wipe out large chunks, or even all, of your document, depending on the current cursor position. If you make a mistake and lose your work, the manual explicitly puts the blame on you: "The truth of the matter is that...you screwed up." It then says there are recovery techniques you can learn if you buy a copy of the book TRS-80 Disk & Other Mysteries by Harvard Pennington, the same guy who bought the rights to Electric Pencil from Shrayer. The same guy who published this very version of the program. Hate the game, not the player? But wait, the emulator itself has its own quirks. With the various things that could go wrong having actually gone wrong, dear reader, I doubt you'll be surprised to learn that I lost my work multiple times. I cannot blame Electric Pencil for everything, as the emulator kind of worked against me as well, but the total experience of using this in 2025 was quite frustrating. I persevered for your entertainment. The tip jar is to your right. Back in the day, Electric Pencil was notorious for losing the user's typing. In Whole Earth Software Review, Spring 1984, Tony Bove and Cheryl Rhodes wrote, "(It) dropped characters if you typed too fast. During "wraparound" it nibbles at your keystrokes while it does what it wants."
I did not personally encounter this in my time with the program. It may have been addressed as a bug fix in v2.0z, though I can't find evidence that it was. It may be that the emulator itself provides more consistent keyboard polling than the original hardware did and keeps up with my typing. Or maybe I didn't flex my superior Typing of the Dead skills hard enough? No, for me the most immediate and continually frustrating aspect of using Electric Pencil is its "overtype" mode. This is a feature you still see in text editors and word processors, maybe hidden under the "Advanced" preferences, requiring a conscious choice to enable it. The modern default is to "insert" when typing. Place a cursor, start typing, and your words are injected at the point of the cursor. The text to the right moves to accommodate the new text. Overtype, as the name suggests, types over existing words, replacing them. The amount of time I've spent reversing out inadvertent overtyping when I just wanted to add a space, or especially a line break, must surely have subtracted a full hour of my life by now. I have to remember to jump into "insert mode" when I want to retroactively add more space between paragraphs. I'm not one to suggest it is without its merits, though if my life were dependent on finding a reason for its existence my life would be forfeit. But what I can say absolutely is that losing your words to overtype really sucks in a program with no "undo" option. I mentioned Harvard Pennington earlier, and I want to spend a little time talking about the transfer of Electric Pencil from Shrayer to Pennington. I'm using v2.0z of Electric Pencil and it would be unfair to fail to note Michael Shrayer's absence in its development. According to the v2.0z manual, "Shrayer continued to sell ( Electric Pencil ) until January of 1981. By this time (it) was losing ground...Michael was not ready to devote his life to the daily chore of...doing business around the world." Harvard Pennington had already published a book written in Electric Pencil and wanted to keep his beloved software alive. Shrayer sold Pencil to Pennington's company International Jewelry Group. Yes, you read that right, "jewelry." But it's OK, they just called themselves IJG and their hard pivot was papered over nicely. "Pennington got together with...fans and programmers" to create a patched, improved version. Now I say "improved" because I read a review or two that suggested that. I can say absolutely that the command menu in v2.0z is a marked improvement over v1. It helps push some nuclear options (like to clear all text) behind a much-needed protective wall. But it also has, in hindsight perhaps, a couple of baffling decisions. First, if IJG was trying to improve the product, I don't really understand why the save file mechanism remains so inextricably linked to cursor position. Rather than fix it, they added a slightly snarky lambasting to the manual. Good job team, problem solved? The biggest change is the utter removal of WordStar-esque cursor controls. v1 had it, v2 doesn't. The cursor control keys are "unused" according to the v2.0z manual. The functionality was removed and replaced with nothing. Why concede a feature to your competitor that is so useful they literally named their own product for it? Just above I even called it WordStar-esque despite Pencil being first-to-market! In Creative Computing Magazine, November 1984, Pennington, newly elected chairman of the board at IJG, wrote an article about the future of computing.
A jewelry salesman turned author and Pencil fanboy, now in charge of stewarding Pencil's future, saw the coming wave of Macintosh-inspired GUIs. What did he think of this sea-change? "So what is in our future? Windows? A mouse? Icons? This is the hot stuff for 1984. How do we know it is the hot stuff? Because the computer press tells us. And how do they know? Because the marketing people tell them (and they know) because the finance people have determined the next "hot" item. How does the financial community know? They don't. However, no one is going to tell them to take their money elsewhere ... If you can come up with an idea that is bizarre enough, you can probably raise (millions) to bring it to market ... It is hype." Within two years of this bold, anti-GUI stance, Pennington would sell Electric Pencil to Electric Software Corporation. Later, PC Magazine would review ESC's v3.0 in the February 24, 1987 edition. They praised how much you got for $50, but also called it "not easy to learn" and "not for beginners." By then, with so much competition in the genre it had birthed, Electric Pencil was effectively dead. As the v1 manual states, "THE BEST WAY TO LEARN TO OPERATE THIS SYSTEM IS TO USE IT !!!" That's proving true! Why did the v2.0z manual remove that statement?! The learning curve is relatively shallow, benefitting from the software being sparse; there's just not that much to learn. From the perspective of a "daily driver," it's not growing on me. Getting into a flow is proving difficult. (future Chris here; as I come back to edit this document later, the editing flows more freely though I can't claim it is "natural." More like I'm just better at anticipating quirks, and there are plenty.) Part of editing is rearranging, but we need to forget what we know about "cut, copy, and paste." Those words had not yet been commonly adopted to describe those actions. Instead, we have the "block" tool. adds a visual marker to indicate the start of a block. Move the cursor and do that again to add a second marker at the end of a block. The text between markers is your block. Place the cursor elsewhere in the document and will clone the delimited block into the current cursor position; the original block stays as-is. You can clone as much as you like. "Cut" as we understand it today doesn't exist. deletes the block. It is GONE, no longer in memory at all. Remember also, there is no "undo" in this program, so gone really does mean GONE. Good enough for government work, I guess, just watch your step. The only feature left worth discussing is find/replace. brings up a modal screen for searching for a word. In James Fallows's discussion of it, he noted that he would use it for complicated words. He would insert a or some other unusual character to stand in for a complex name, for example the Russian surname . Then, when he was done editing, he would do a find-and-replace on the character for the surname. It only looks for the literal, case-sensitive text you type, although wild cards may be used. This is also your only method for jumping to known parts of the document. Search for a word and replace it with nothing to jump from search result to search result. Aside from some era-specific, esoteric commands, that's basically all Electric Pencil has to offer. It would have been fun if I could have tried the tape-dictation controls to transcribe a conversation. Spell-check isn't part of the base system, though it was available as an add-on.
It is bare-bones, utilitarian, and sometimes frustrating for a modern user. It's forcing me to evaluate my personal threshold between "just enough" and "not enough" in such software. For me, this one is "not enough." With a few quality of life additions I suppose it could be sharpened up for the "distraction free" crowd. Maybe v3.0 of Electric Pencil PC is just right, if overtype is your jam? As-is, it is hard to recommend it on the TRS-80 for much more than writing a review of it. But don't worry, I teased some fan-fiction and I am a man of my word. I'd like to return to the Creative Computing Magazine issue where Pennington kind of poo-pooed windows, mice, and icons, "What is their future? They are here to stay. That does not mean that they will be used." In that same issue, Shrayer also gave his thoughts about the future of computing. Michael Shrayer died October 19, 2006 in Arlington, Texas. He was 72. Ways to improve the experience, notable deficiencies, workarounds, and notes about incorporating the software into modern workflows (if possible). Speaking honestly, too much is missing to recommend Electric Pencil on the TRS-80 for a modern workflow. trs80gp v2.5.5 on Windows 11 Emulating the default TRS-80 Model III 48K RAM 4 floppy drives TRSDOS v1.3 Electric Pencil 2.0z Move the cursor to the start of the document with (move to the "beginning") to open the command tool screen. type as Can't remember the filename? type Don't see your file listed? That's because only lists /pcl files. will list everything on disk, including /txt files like I'm using. In the menu, go to an empty drive number Select to get a blank, formatted disk for your work See how the disk name has around it? That means it is referencing an internal copy of a master blank disk. Your changes to it will not be saved to this disk In the drive menu for this disk, select and give your disk a name. In the drive menu for this disk, select for the named disk Now and select the disk you exported above. This disk is your personal copy, saved to the host operating system, ready to accept your writes. When you finish working and want to shut down the emulator, check the menu to see if any disks are prefixed with . If so, that disk has changes in memory that have not yet been written to the filesystem. Those changes will be lost if you shut down. Use on that diskette to save its changes out. trs80gp offers relatively minimal options for speeding up performance. It automatically kicks in "Auto Turbo" mode when accessing the disk drive, so I didn't find it annoyingly slow to read/write even though I'm using virtual floppies. A virtual hard disk is an option, but configuration looks... complex . I'll dig into that later. mode makes a noticeable difference in input and scroll speed. I didn't notice any troubles using that mode with this software. It was definitely a time-saver to set up the Windows shortcut with launch parameters (Properties > Shortcut > Target) to auto-insert program and save disks on launch, enable "Turbo", and set the emulator's typing mode to "Typing Layout" (see Troubleshooting, below) Options for those not running on Windows The TRS-80 emulator scene itself seems fairly anemic, especially outside of Windows. There is a Javascript emulator , but it feels a little heavy for my lightweight needs, and the hosted online versions seem to disallow arbitrary disk insertion. I'm completely unclear how to get at my personal data even if I did manage to save my work. 
That said, it is open source on Github and may be a better option than my initial tests indicated. So I suppose you could run a Node server, or run it in a web browser, to interface with an emulator which runs the software. How many abstraction layers are you willing to put up with? For a native macOS app, the only option I can recommend is kind of non-native: run trs80gp and trstools in WINE. No other app is maintained to work on modern macOS, or if it runs, it's "experimental" or broken in some fundamental way that renders it unusable. On Linux, I'm still investigating. Remember: in trs80gp's Diskette menu, if you see beside a drive, that means it has been written to virtually but has not been written to the host operating system yet. This can happen if you have a diskette with brackets, indicating a virtual copy of an internally defined master diskette. Export that diskette, stat! Keyboard input has three options. One is notably called "Typing Layout" and it addresses the issues I encountered with certain character inputs doing a kind of double-input. For example typing always resulted in printing to screen. "Typing Layout" felt much more stable and behaved as expected, though it had its own quirks (see What's Lacking, below). If you get an error saying , you probably have the cursor at the end of the file and are trying to save. to move to the start of the file before saving. Cannot stress this enough. In a sense it is easy, and in a sense it is annoying. The easiest way I found to get my data out of TRSDOS world is through the utility trstools. Use it to browse your TRSDOS disk, then simply open a file and copy out the contents. It's just plain ASCII; there are no special word processing code embeds. Caveat! There are no embeds unless you use the printer formatting functions! Then there are absolutely embedded codes and they're unfriendly to a modern workflow. They only apply to doing real printing to real vintage printers, so I recommend ignoring those features. No undo whatsoever. This bit me more than once thanks to (delete TO end of line) and (delete ENTIRE line) being right next to each other. You are essentially restricted to no formatting or a subset of markdown formatting. Getting the emulator keyboard and Electric Pencil to be happy together has simply not panned out for me. If I use "Logical Layout" I get double-input on many keys. If I use "Physical Layout" my muscle memory of where lives (for example) betrays me every time. If I use "Typing Layout" keys like stop working and keyboard commands for marking blocks of text don't work any more. There is no perfect keyboard choice for this program that I can find. No spell-checking without a secondary package like Blue Pencil or Electric Webster's. Search is strictly case-sensitive. For writing a basic skeleton of a document for this very blog, it worked well enough. But to restrict all editing to Electric Pencil means not touching a thing within the Ghost blog platform. That is hard to resist. Limited keyboard support means writing without certain characters that come up in a modern context quite a lot, like The default "overtype" mode definitely has an adjustment period to get through, and will surprise you with how often it deletes the first character of a line of text when all you wanted to do was insert a new line. Getting the data out isn't a horrible process, but adds enough friction to the process to make it frustrating in a rapid write-edit-publish cycle.
The small amount of text on screen at one time makes it difficult to read and scan through long text to find specific passages for editing. If you're a visual editor, it's going to be a rough ride. This predates Unicode and software-based fonts, so no international writing!

0 views
Neil Madden 1 months ago

Rating 26 years of Java changes

I first started programming Java at IBM back in 1999 as a Pre-University Employee. If I remember correctly, we had Java 1.1.8 installed at that time, but were moving to Java 1.2 ("Java 2"), which was a massive release—I remember engineers at the time grumbling that the ever-present "Java in a Nutshell" book had grown to over 600 pages. I thought I'd take a look back at 26 years of Java releases and rate some of the language and core library changes (Java SE only) that have occurred over this time. It's a very different language to what I started out with! I can't possibly cover every feature of those releases, as there are just way too many. So I'm just going to cherry-pick some that seemed significant at the time, or have been in retrospect. I'm not going to cover UI- or graphics-related stuff (Swing, Java2D etc), or VM/GC improvements. Just language changes and core libraries. And obviously this is highly subjective. Feel free to put your own opinions in the comments! The descriptions are brief and not intended as an introduction to the features in question: see the links from the Wikipedia page for more background. NB: later features are listed from when they were first introduced as a preview. The Collections Framework: before the collections framework, there were just raw arrays, Vector, and Hashtable. It gets the job done, but I don't think anyone thinks the Java collections framework is particularly well designed. One of the biggest issues was a failure to distinguish between mutable and immutable collections, strange inconsistencies like why Iterator has a remove() method (but not, say, update or insert), and so on. Various improvements have been made over the years, and I do still use it in preference to pulling in a better alternative library, so it has stood the test of time in that respect. 4/10 The assert keyword: I remember being somewhat outraged at the time that they could introduce a new keyword! I'm personally quite fond of asserts as an easy way to check invariants without having to do complex refactoring to make things unit-testable, but that is not a popular approach. I can't remember the last time I saw an assert in any production Java code. 3/10 Regular expressions: Did I really have to wait 3 years to use regex in Java? I don't remember ever having any issues with the implementation they finally went for. The Matcher class is perhaps a little clunky, but gets the job done. Good, solid, essential functionality. 9/10 "New" I/O (NIO): Provided non-blocking I/O for the first time, but really just a horrible API (still inexplicably using 32-bit signed integers for file sizes, limiting files to 2GB, confusing interface). I still basically never use these interfaces except when I really need to. I learnt Tcl/Tk at the same time that I learnt Java, and Java's I/O always just seemed extraordinarily baroque for no good reason. Has barely improved in 2 and a half decades. 0/10 Also notable in this release were the new crypto APIs: the Java Cryptography Extensions (JCE) added encryption and MAC support to the existing signatures and hashes, and we got JSSE for SSL. Useful functionality, dreadful, error-prone APIs. 1/10
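For reference, a tiny self-contained example of the java.util.regex API being rated above: compile a Pattern once, then reuse the Matcher.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexExample {
    public static void main(String[] args) {
        // The regex support that arrived in Java 1.4.
        Pattern version = Pattern.compile("(\\d+)\\.(\\d+)\\.(\\d+)");
        Matcher m = version.matcher("Upgrading from 1.1.8 to 21.0.3");
        while (m.find()) {
            System.out.printf("major=%s minor=%s patch=%s%n", m.group(1), m.group(2), m.group(3));
        }
    }
}
```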
Absolutely loads of changes in this release. This feels like the start of modern Java to me. Generics: as Go discovered on its attempt to speed-run Java's mistakes all over again, if you don't add generics from the start then you'll have to retrofit them later, badly. I wouldn't want to live without them, and the rapid and universal adoption of them shows what a success they've been. They certainly have complicated the language, and there are plenty of rough edges (type erasure, reflection, etc), but God I wouldn't want to live without them. 8/10. Annotations: sometimes useful, sometimes overused. I know I've been guilty of abusing them in the past. At the time it felt like they were ushering in a new age of custom static analysis, but that doesn't really seem to be used much. Mostly just used to mark things as deprecated or when overriding a method. Meh. 5/10 Autoboxing: there was a time when, if you wanted to store an integer in a collection, you had to manually convert to and from the primitive int type and the Integer "boxed" class. Such conversion code was everywhere. Java 5 got rid of that, by getting the compiler to insert those conversions for you. Brevity, but no less inefficient. 7/10 Enums: I'd learned Haskell by this point, so I couldn't see the point of introducing enums without going the whole hog and doing algebraic datatypes and pattern-matching. (Especially as Scala launched about this time). Decent feature, and a good implementation, but underwhelming. 6/10 Vararg methods: these have done quite a lot to reduce verbosity across the standard library. A small change that's been a real quality-of-life enhancement. I still never really know when to put @SafeVarargs annotations on things though. 8/10 The for-each loop: cracking, use it all the time. Still not a patch on Tcl's foreach (which can loop over multiple collections at once), but still very good. Could be improved and has been somewhat replaced by Streams. 8/10 Static imports: Again, a good simple change. I probably would have avoided adding * imports for statics, but it's quite nice for DSLs. 8/10 Doug Lea's java.util.concurrent etc: these felt really well designed. So well designed that everyone started using them in preference to the core collection classes, and they ended up back-porting a lot of the methods. 10/10 After the big bang of Java 5, Java 6 was mostly performance and VM improvements, I believe, so we had to wait until 2011 for more new language features. Strings in switch: seems like a code smell to me. Never use this, and never see it used. 1/10 Try-with-resources: made a huge difference in exception safety. Combined with the improvements in exception chaining (so root cause exceptions are not lost), this was a massive win. Still use it everywhere. 10/10 Diamond operator for type parameter inference: a good minor syntactic improvement to cut down the visual noise. 6/10 Binary literals and underscores in literals: again, minor syntactic sugar. Nice to have, rarely something I care about much. 4/10 Path and Filesystem APIs: I tend to use these over the older File APIs, but just because it feels like I should. I couldn't really tell you if they are better or not. Still overly verbose. Still insanely hard to set file permissions in a cross-platform way. 3/10 Lambdas: somewhat controversial at the time. I was very in favour of them, but only use them sparingly these days, due to ugly stack traces and other drawbacks. Named method references provide most of the benefit without being anonymous. Deciding to exclude checked exceptions from the various standard functional interfaces was understandable, but also regularly a royal PITA. 4/10
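A quick sketch of the lambda-versus-method-reference point just made: the two forms below do the same thing, but the named reference tends to read (and stack-trace) a little better.

```java
import java.util.List;
import java.util.function.Consumer;

final class LambdaExample {
    public static void main(String[] args) {
        List<String> releases = List.of("Java 8", "Java 11", "Java 17", "Java 21");

        // An anonymous lambda...
        Consumer<String> printLambda = r -> System.out.println(r);
        // ...and the equivalent named method reference.
        Consumer<String> printRef = System.out::println;

        releases.forEach(printLambda);
        releases.forEach(printRef);
    }
}
```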
Streams: Ah, streams. So much potential, but so frustrating in practice. I was hoping that Java would just do the obvious thing and put filter/map/reduce methods onto Collection and Map, but they went with this instead. The benefits of functional programming weren't enough to carry the feature, I think, so they had to justify it by promising easy parallel computing. This scope creep enormously over-complicated the feature, makes it hard to debug issues, and yet I almost never see parallel streams being used. What I do still see quite regularly is resource leaks from people not realising that the stream returned from Files.lines() has to be close()d when you're done—but doing so makes the code a lot uglier. Combine that with ugly hacks around callbacks that throw checked exceptions, the non-discoverable API (where are the static helper functions I need for this method again?), and the large impact on lots of very common code, and I have to say I think this was one of the largest blunders in modern Java. I blogged what I thought was a better approach 2 years earlier, and I still think it would have been better. There was plenty of good research that different approaches were better, since at least Oleg Kiselyov's work in the early noughties. 1/10 Java Time: Much better than what came before, but I have barely had to use much of this API at all, so I'm not in a position to really judge how good this is. Despite knowing how complex time and dates are, I do have a nagging suspicion that surely it doesn't all need to be this complex? 8/10 Modules: I still don't really know what the point of all this was. Enormous upheaval for minimal concrete benefit that I can discern. The general advice seems to be that modules are (should be) an internal detail of the JRE and best ignored in application code (apart from when they spuriously break things). Awful. -10/10 (that's minus 10!) jshell: cute! A REPL! Use it sometimes. Took them long enough. 6/10 The start of time-based releases, and a distinct ramp-up of features from here on, trying to keep up with the kids. Local type inference ("var"): Some love this, some hate it. I'm definitely in the former camp. 9/10 New HTTP Client: replaced the old URL.openStream() approach by creating something more like Apache HttpClient. It works for most purposes, but I do find the interface overly verbose. 6/10 This release also added TLS 1.3 support, along with djb-suite crypto algorithms. Yay. 9/10 Switch expressions: another nice mild quality-of-life improvement. Not world changing, but occasionally nice to have. 6/10 Text blocks: on the face of it, what's not to like about multi-line strings? Well, apparently there's a good reason that injection attacks remain high on the OWASP Top 10, as the JEP introducing this feature seemed intent on getting everyone writing SQL, HTML and JavaScript using string concatenation again. Nearly gave me a heart attack at the time, and still seems like a pointless feature. Text templates (later) are trying to fix this, but seem to be currently in limbo. 3/10 Pattern matching in instanceof: a little bit of syntactic sugar to avoid an explicit cast. But didn't we all agree that using instanceof was a bad idea decades ago? I'm really not sure who was doing the cost/benefit analysis on these kinds of features. 4/10 Records: about bloody time! Love 'em. 10/10 Better error messages for NullPointerExceptions: lovely. 8/10 Sealed classes: in principle I like these a lot. We're slowly getting towards a weird implementation of algebraic datatypes. I haven't used them very much so far. 8/10
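Putting the records, sealed classes, and pattern-matching pieces together, this is roughly what that "weird implementation of algebraic datatypes" looks like in practice (my own minimal example, not taken from the post):

```java
// Records + a sealed interface + pattern-matching switch: an ADT in modern Java.
sealed interface Shape permits Circle, Rect {}
record Circle(double radius) implements Shape {}
record Rect(double w, double h) implements Shape {}

final class AdtExample {
    static double area(Shape s) {
        // No default branch needed: the compiler checks exhaustiveness because Shape is sealed.
        return switch (s) {
            case Circle c -> Math.PI * c.radius() * c.radius();
            case Rect r -> r.w() * r.h();
        };
    }

    public static void main(String[] args) {
        System.out.println(area(new Circle(1.0)));
        System.out.println(area(new Rect(2.0, 3.0)));
    }
}
```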
EdDSA signatures: again, a nice little improvement in the built-in cryptography. Came with a rather serious bug though… 8/10 Vector (SIMD) API: this will be great when it is finally done, but still baking several years later. ?/10 Pattern matching switch: another piece of the algebraic datatype puzzle. Seems somehow more acceptable than instanceof, despite being largely the same idea in a better form. 7/10 UTF-8 by default: Fixed a thousand encoding errors in one fell swoop. 10/10 Record patterns: an obvious extension, and I think we're now pretty much there with ADTs? 9/10 Virtual threads: being someone who never really got on with async/callback/promise/reactive stream-based programming in Java, I was really happy to see this feature. I haven't really had much reason to use them in anger yet, so I don't know how well they've been done. But I'm hopeful! ?/10 String templates: these are exactly what I asked for in A few programming language features I'd like to see, based on E's quasi-literal syntax, and they fix the issues I had with text blocks. Unfortunately, the first design had some issues, and so they've gone back to the drawing board. Hopefully not for too long. I really wish they'd not released text blocks without this feature. 10/10 (if they ever arrive). Sequenced collections: a simple addition that adds a common super-type to all collections that have a defined "encounter order": lists, deques, sorted sets, etc. It defines convenient getFirst() and getLast() methods and a way to iterate items in the defined order or in reverse order. This is a nice unification, and plugs what seems like an obvious gap in the collections types, if perhaps not the most pressing issue? 6/10 Wildcards in patterns: adds the familiar syntax from Haskell and Prolog etc of using _ as a non-capturing wildcard variable in patterns when you don't care about the value of that part. 6/10 Simplified console applications: Java finally makes simple programs simple for beginners, about a decade after universities stopped teaching Java to beginners… Snark aside, this is a welcome simplification. 8/10 This release also adds support for KEMs, although in the simplest possible form only. Meh. 4/10 The only significant change in this release is the ability to have statements before a call to super() in a constructor. Fine. 5/10 Primitive types in patterns: plugs a gap in pattern matching. 7/10 Markdown javadoc comments: Does anyone really care about this? 1/10 The main feature here from my point of view as a crypto geek is the addition of post-quantum cryptography in the form of the newly standardised ML-KEM and ML-DSA algorithms, and support in TLS. Stable values: this is essentially support for lazily-initialised final variables. Lazy initialisation is often trickier than it should be in Java, so this is a welcome addition. Remembering Alice ML, I wonder if there is some overlap between the proposed StableValue and a Future? 7/10 ? PEM encoding of cryptographic objects is welcome from my point of view, but someone will need to tell me why this is not just ? Decoding support is useful though, as that's a frequent reason I have to grab Bouncy Castle still. 7/10 Well, that brings us pretty much up to date. What do you think? Agree, disagree? Are you a passionate defender of streams or Java modules? Have at it in the comments.

0 views
Dan Moore! 2 months ago

Career Leverage as a Developer

I was recently on the “I’m a Software Engineer, What’s Next?” podcast. You can view the whole podcast episode and you can subscribe and learn more about the podcast as well. (You can see all my podcast appearances .) We covered a lot of interesting ground, but one thing we talked about was undifferentiated but scary problems. When you find one of these in the business world, that makes for a good software company. FusionAuth is one example of this. There, we focus on authentication. Authentication is undifferentiated because: most online apps need it; it’s not a competitive advantage for most applications; there are well known standards (OIDC, SAML, OAuth). Authentication is scary and risky because: it impacts conversion and user experience; the risk of user data being exposed impacts reputation and bottom line; there’s jargon; there’s security risk. Of course the deeper you get into any area, the less scary it seems, but for the average developer, I think authentication is imposing. There are other undifferentiated but scary areas of software development, including: performance, legacy code upkeep, real time systems, and distributed systems. But one insight that came out of the discussion is that this applies to your career as well. If you focus on undifferentiated scary problems, then you have a lucrative career ahead of you, because the problem is important to solve (because it is scary) and transfers between companies (because it is undifferentiated). If you focus on differentiated problems, such as a scary area of the code base that is unique to the project, you’ll be tied to a particular company. If you focus on safe problems, you can switch between companies but you’re unlikely to make a lot of money, because the problems you are working on won’t be that important. For new developers, I wouldn’t recommend immediate specialization into a scary, undifferentiated problem. There’s an explore-versus-exploit continuum in careers, and early on, exploration is crucial. You have to find what you are interested in. But at some point, choosing an undifferentiated scary problem and solving it in a relatively unique way gives you significant career leverage. And leverage makes you valuable in the workplace. It also helps you avoid being a cog. Every employer wants fungible employees, but every employee should resist being fungible. Don’t be “Java Engineer 2” or “React Developer with 3 years experience.” Be someone people want to work with. The goal is for people to say, “I want to work with [your name],” not “I want to work with any React developer.” By tackling problems that are both scary (high-impact) and undifferentiated (universally applicable), you build expertise that travels with you while positioning yourself as someone who can handle what others avoid.

Farid Zakaria 2 months ago

Writing a protoc plugin in Java

Know thy enemy. – Sun Tzu (or anyone who’s used Protocol Buffers) We use Protocol Buffers heavily at $DAYJOB$ and it’s increasingly becoming a large pain point, most notably due to challenges with coercing multiple versions in a dependency graph. Recently, a team wanted to augment the generated Java code protoc (Protobuf compiler) emits. I was aware that the compiler had a “plugin” architecture but had never looked deeper into it. Let’s explore writing a Protocol Buffer plugin, in Java and for the Java generated code. 🤓 If you’d like to see the end result check out github.com/fzakaria/protoc-plugin-example Turns out that plugins are simple in that they operate solely over standard input & output and unsurprisingly marshal protobuf over them. A plugin is just a program which reads a protocol buffer from standard input and then writes a protocol buffer to standard output. [ ref ] The request & response protos are described in plugin.proto . Here is a dumb plugin that emits a fixed class to demonstrate. We can run this and see that the expected file is produced. Let’s now look at an example in . You can generate the traditional Java code for this using which by default includes the capability to output Java. Nothing out of the ordinary here, we are merely baselining our knowledge. 👌 How can I now modify this code? If you audit the generated code you will see comments that contain such as: Insertion points are markers within the generated source that allow other plugins to include additional content. We have to modify our that we include in the response to specify the insertion point and instead of a new file being created, the contents of files will be merged. ✨ Our example plugin would like to add the function to every message type described in the proto file. We do this by setting the appropriate insertion point which we found from auditing the original generated code. In this particular example, we want to add our new function to the Class definition and pick as our insertion point. We now run both the Java generator alongside our custom plugin. We can audit the generated source and we see that our new method is now included! 🔥 Note: The plugin must be listed after as the order matters on the command-line. While we are limited by the insertion points previously defined in the open-source implementation of the Java protobuf generator, it does provide a convenient way to augment the generated files. We can also include additional source files that may wrap the original files for cases where the insertion points may not suffice.
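For reference, a plugin of the “emits a fixed class” variety can be sketched in a few lines of Java using the CodeGeneratorRequest and CodeGeneratorResponse messages generated from plugin.proto (they ship with protobuf-java under com.google.protobuf.compiler). This is my own sketch, not the code from the linked repository, and the Hello.java file name is a placeholder:

```java
import com.google.protobuf.compiler.PluginProtos.CodeGeneratorRequest;
import com.google.protobuf.compiler.PluginProtos.CodeGeneratorResponse;

public final class DumbPlugin {
    public static void main(String[] args) throws Exception {
        // protoc hands the request to the plugin on stdin...
        CodeGeneratorRequest request = CodeGeneratorRequest.parseFrom(System.in);

        CodeGeneratorResponse response = CodeGeneratorResponse.newBuilder()
            .addFile(CodeGeneratorResponse.File.newBuilder()
                // Without an insertion point this creates a brand-new file.
                // Calling setInsertionPoint(...) with a marker found in the
                // stock Java generator's output merges the content into that
                // existing file instead, as described above.
                .setName("Hello.java")
                .setContent("// files to generate: " + request.getFileToGenerateList() + "\n"
                    + "final class Hello {}\n"))
            .build();

        // ...and expects the response on stdout; stdout must carry nothing else.
        response.writeTo(System.out);
        System.out.flush();
    }
}
```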

matklad 2 months ago

Look Out For Bugs

One of my biggest mid-career shifts in how I write code was internalizing the idea from this post: Don’t Write Bugs Historically, I approached coding with an iteration-focused mindset — you write a draft version of a program, you set up some kind of a test to verify that it does what you want it to do, and then you just quickly iterate on your draft until the result passes all the checks. This was a great approach when I was only learning to code, as it allowed me to iterate past the things which were not relevant for me at that point, and focus on what matters. Who cares if it is or in the “паблик статик войд мэйн стринг а-эр-джи-эс”, it’s just some obscure magic spell anyway, and completely irrelevant to the maze-traversing thingy I am working on! Carrying over this approach past the learning phase was a mistake. As Lawrence points out, while you can spend time chasing bugs in the freshly written code, it is possible to dramatically cut the amount of bugs you introduce in the first place, if you focus on optimizing that (and not just the iteration time). It felt (and still feels) like a superpower! But there’s already a perfectly fine article about not making bugs, so I am not going to duplicate it. Instead, I want to share a related, but different super power: You can find bugs by just reading code. I remember feeling this superpower for the first time. I was investigating various rope implementations, and, as a part of that, I looked at the , the implementation powering IntelliJ, very old and battle tested code. And, by just reading the code, I found a bug, since fixed . It wasn’t hard, the original code is just 500 lines of verbose Java (yup, that’s all that you need for a production rope). And I wasn’t even trying to find a bug, it just sort-of jumped out at me while I was trying to understand how the code works. That is, you can find some existing piece of software, carefully skim through implementation, and discover real problems that can be fixed. You can do this to your software as well! By just re-reading a module you wrote last year, you might find subtle problems. I regularly discover TigerBeetle issues by just covering this or that topic on IronBeetle : bug discovered live , fixed , and PR merged . Here are some tips for getting better at this: The key is careful, slow reading. What you actually are doing is building the mental model of a program inside your head. Reading the source code is just an instrument for achieving that goal. I can’t emphasize this enough: programming is all about building a precise understanding inside your mind, and then looking for the diff between your brain and what’s in git. Don’t dodge an opportunity to read more of the code. If you are reviewing a PR, don’t review just the diff, review the entire subsystem. When writing code, don’t hesitate to stop and to probe and feel the context around. Go for or to understand the historical “why” of the code. When reading, mostly ignore the textual order, don’t just read each source file top-down. Instead, use these two other frames: Start at or subsystem equivalent, and use “goto definition” to follow an imaginary program counter. Identify the key data structures and fields, and search for all places where they are created and modified. You want to see a slice across space and time, state and control flow (c.f. Concurrent Expression Problem ). Just earlier today I used the second trick to debug an issue for which I haven’t got a repro. 
I identified as the key assignment that was recently introduced, then ctrl + f for , and that immediately revealed a gap in my mental model. Note how this was helped by the fact that the thing in question, , was always called that in the source code! If your language allows it, avoid , use proper names. Identify and collect specific error-prone patterns or general smells in the code. In Zig, if there’s an allocator and a in the same scope, you need to be very careful . If there’s an isolated tricky function, it’s probably fine. If there’s a tricky interaction between functions, it is a smell, and some bugs are lurking there. Bottom line: reading the code is surprisingly efficient at proactively revealing problems. Create space for calm reading. When reading, find ways to build mental models quickly, this is not entirely trivial.

Farid Zakaria 2 months ago

Bazel Knowledge: Testing for clean JVM shutdown

Ever run into the issue where you exit your method in Java but the application is still running? That can happen if you have non-daemon threads still running. 🤔 The JVM specification specifically states the conditions under which the JVM may exit [ ref ]: A program terminates all its activity and exits when one of two things happens: all the threads that are not daemon threads terminate, or some thread invokes the exit method of Runtime or System, and the exit operation is not forbidden by the security manager. What are daemon threads? They are effectively background threads that you might spin up for tasks such as garbage collection, where you explicitly don’t want them to inhibit the JVM from shutting down. A common problem however is that if you have code-paths on exit that fail to stop all non-daemon threads, the JVM process will fail to exit which can cause problems if you are relying on this functionality for graceful restarts or shutdown. Let’s observe a simple example. If we run this, although we exit the main thread, we observe that the JVM does not exit and the thread continues to do its “work”. Often you will see classes implement or so that an orderly shutdown of these sorts of resources can occur. It would be great however to test that such graceful cleanup is done appropriately for our codebases. Is this possible in Bazel? If we run this test however we notice the test PASSES 😱 Turns out that Bazel’s JUnit test runner uses after running the tests, which according to the JVM specification allows the runtime to shut down irrespective of active non-daemon threads. [ ref ] From discussion with others in the community, this explicit shutdown was added specifically because many tests would hang due to improper non-daemon thread cleanup. 🤦 How can we validate graceful shutdown then? Well, we can leverage and startup our and validate that the application exits within a specific timeout. Additionally, I’ve put forward a pull-request PR#26879 which adds a new system property that can be added to a such that the test runner validates that there are no non-daemon threads running before exiting. It would have been great to remove the call completely when the property is present; however I could not find a way to then set the exit value of the test. Turns out that even simple things can be a little complicated and it was a bit of a headscratcher to see why our tests were passing despite our failure to properly tear down resources.
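The simple example from the post isn’t reproduced here, but the situation it describes looks roughly like this (my own reconstruction): main returns, yet the process keeps running because a non-daemon thread is still alive.

```java
public final class LingeringThread {
    public static void main(String[] args) {
        Thread worker = new Thread(() -> {
            while (true) {
                System.out.println("still working...");
                try {
                    Thread.sleep(1_000);
                } catch (InterruptedException e) {
                    return; // an orderly shutdown would interrupt us
                }
            }
        });
        // worker.setDaemon(true); // with this line the JVM would exit normally
        worker.start();
        System.out.println("main() is done");
        // main returns here, but the JVM stays up until the worker stops
        // (or something calls System.exit / Runtime.halt, as Bazel's test runner does).
    }
}
```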

Farid Zakaria 2 months ago

Bazel Knowledge: dive into unused_deps

The Java language implementation for Bazel has a great feature called strict dependencies – the feature enforces that all directly used classes are loaded from jars provided by a target’s direct dependencies. If you’ve ever seen the following message from Bazel, you’ve encountered the feature. The analog tool for removing dependencies which are not directly referenced is unused_deps . You can run this on your Java codebase to prune your dependencies to those only strictly required. That’s a pretty cool feature, but how does it work? 🤔 Turns out the Go code for the tool is relatively short, let’s dive in! I love learning the inner machinery of how the tools I leverage work. 🤓 Let’s use a simple example to explore the tool. First thing the tool does is query which targets to look at , and it emits this to stderr so that part is a little obvious. It performs a query searching for any rules that start with , or . This would catch our common rules such as or . Here is where things get a little more interesting . The tool emits an ephemeral Bazel in a temporary directory that contains a Bazel aspect . What is the aspect the tool injects into our codebase? The aspect is designed to emit additional files that contain the arguments to the compilation actions. If we inspect what this file looks like for the simple I created , we see it’s the arguments to itself. If you are wondering what is: Bazel uses a custom compiler plugin that will be relevant shortly. ☝️ How does the aspect get injected into our project? Well, after figuring out which targets to build via the , will your target pattern and specify to include this additional dependency and enable the aspect via the flag. If you are using Bazel 8+ and have disabled, which is the default, you will need my PR#1387 to make it work. The end result after the is that every Java target (i.e. ) will have produced a file in the directory. Why did it go to such lengths to produce this file? The tool is trying to find the direct dependencies of each Java target. The tool searches for the line for each target to see the dependencies that were needed to build it. QUESTION #1 : Why does the tool need to set up this aspect anyways? Bazel will already emit param files for each Java target that contain nearly identical information. The tool will then iterate through all these JAR files, open them up and look at the file within it for the value of which is the Bazel target expression for this dependency. In this case we can see the desired value is . If you happen to use rules_jvm_external to pull in Maven dependencies, the ruleset will “stamp” the downloaded JARs which means injecting them with the entry in their specifically to work with [ ref ]. QUESTION #2 Why does go to such lengths to discover the labels of the direct dependencies of a particular target? Could this be replaced with a command as well ? 🕵️ For our target we have the following After the labels of all the direct dependencies are known for each target, will parse the jdeps file, , of each target which is a binary protocol serialization of found in deps.go . Using we can inspect and explore the file. This is the super cool feature of Bazel and integrating into the Java compiler. 🔥 Bazel invokes the Java compiler itself and will then iterate through all the symbols, via a provided symbol table, the compiler had to resolve. For each symbol, if the dependency is not from the list then it must have been provided through a transitive dependency. [ ref ].
The presence of kind would actually trigger a failure for the strict Java dependency check if enabled. then takes the list of the direct dependencies and keeps only the dependencies the compiler reported back as actually required to perform the compilation. The set difference represents the set of targets that are effectively unused and can be reported back to the user for removal! ✨ QUESTION #3 : There is a third type of dependency kind which I saw when investigating our codebase. I was unable to discern how to trigger it and what it represents. What I enjoy about Bazel is learning how you can improve developer experience and provide insightful tools when you integrate the build system deeply with the underlying language; this tool is a great example of that.
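As an illustration of the manifest lookup described above (the tool itself is written in Go; this is my own Java rendering, and Target-Label is, as I understand it, the manifest attribute that rules_jvm_external stamps):

```java
import java.util.jar.JarFile;
import java.util.jar.Manifest;

/** Prints the Bazel label stamped into a jar's META-INF/MANIFEST.MF, if any. */
public final class TargetLabelReader {
    public static void main(String[] args) throws Exception {
        try (JarFile jar = new JarFile(args[0])) {
            Manifest manifest = jar.getManifest();
            String label = (manifest == null)
                ? null
                : manifest.getMainAttributes().getValue("Target-Label");
            // e.g. "@maven//:com_google_guava_guava" (hypothetical output)
            System.out.println(args[0] + " -> " + label);
        }
    }
}
```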

mcyoung 2 months ago

Default Methods in Go

Go’s interfaces are very funny. Rather than being explicitly implemented, like in Java or Rust, they are simply a collection of methods (a “method set”) that the concrete type must happen to have. This is called structural typing, which is the opposite of nominal typing. Go interfaces are very cute, but this conceptual simplicity leads to a lot of implementation problems (a theme with Go, honestly). It removes a lot of intentionality from implementing interfaces, and there is no canonical way to document that satisfies 1 , nor can you avoid conforming to interfaces, especially if one forces a particular method on you. It also has very quirky results for the language runtime. To cast an interface value to another interface type (via the type assertion syntax ), the runtime essentially has to use reflection to go through the method set of the concrete type of . I go into detail on how this is implemented here . Because of their structural nature, this also means that you can’t add new methods to an interface without breaking existing code, because there is no way to attach default implementations to interface methods. This results in very silly APIs because someone screwed up an interface. For example, in the standard library’s package , the interface represents a value which can be parsed as a CLI flag. It looks like this: also has an optional method, which is only specified in the documentation. If the concrete type happens to provide , it will be queries for determining if the flag should have bool-like behavior. Essentially, this means that something like this exists in the flag library: The package already uses reflection, but you can see how it might be a problem if this interface-to-interface cast happens regularly, even taking into account Go’s caching of cast results. There is also , which exists because they messed up and didn’t provide a way for a to unwrap into the value it contains. For example, if a flag is defined with , and then that flag is looked up with , there’s no straightforward way to get the int out of the returned . Instead, you have to side-cast to : As a result, needs to do a lot more work than if had just added , with a default return value of . It turns out that there is a rather elegant workaround for this. Go has this quirky feature called embedding , where a a field in a struct is declared without a name: The -typed embedded field behaves as if we had declared the field , but selectors on will search in if they do not match something on the level. For example, if has a method , and does not, will resolve to . However, if has a method , resolves to , not , because has a field . Importantly, any methods from which does not already have will be added to ’s method set. So this works: Now, suppose that we were trying to add to . Let’s suppose that we had also defined , a type that all satisfiers of must embed. Then, we can write the following: Then, no code change is required for all clients to pick up the new implementation of . Now, this only works if we had required in the first place that anyone satisfying embeds . How can we force that? A little-known Go feature is that interfaces can have unexported methods. The way these work, for the purposes of interface conformance, is that exported methods are matched just by their name, but unexported methods must match both name and package. So, if we have an interface like , then will only match methods defined in the same package that this interface expression appears. 
This is useful for preventing satisfaction of interfaces. However, there is a loophole: embedding inherits the entire method set, including unexported methods. Therefore, we can enhance to account for this: Now, it’s impossible for any type defined outside of this package to satisfy , without embedding (either directly or through another embedded ). Now, another problem is that you can’t control the name of embedded fields. If the embedded type is , the field’s name is . Except, it’s not based on the name of the type itself; it will pick up the name of a type alias. So, if you want to unexport the defaults struct, you can simply write: This also has the side-effect of hiding all of ’ methods from ’s documentation, despite the fact that exported and fields methods are still selectable and callable by other packages (including via interfaces). As far as I can tell, this is simply a bug in , since this behavior is not documented. There is still a failure mode: if a user type satisfying happened to define a method with a different interface. In this case, that takes precedence, and changes to will break users. There are two workarounds: Tell people not to define methods on their satisfying type, and if they do, they’re screwed. Because satisfying is now explicit, this is not too difficult to ask for. Pick a name for new methods that is unlikely to collide with anything. Unfortunately, this runs into a big issue with structural typing, which is that it is very difficult to avoid making mistakes when making changes, due to the lack of intent involved. A similar problem occurs with C++ templates, where the interfaces defined by concepts are implicit, and can result in violating contract expectations. Go has historically be relatively cavalier about this kind of issue, so I think that breaking people based on this is fine. And of course, you cannot retrofit a default struct into a interface; you have to define it from day one. Now that we have defaults, we can also enhance with bool flag detection: Now is more than just a random throw-away comment on a type. We can also use defaults to speed up side casts. Many functions around the package will cast an into an or to perform more efficient I/O. In a hypothetical world where we had defaults structs for all the interfaces, we can enhance with a default method that by default returns an error. We can do something similar for , but because it’s a rather general interface, it’s better to keep as-is. So, we can add a conversion method: Here, converts to an , returning if that’s not possible. How is this faster than ? Well, consider what this would look like in user code: Calling , if is an containing a , lowers to the following machine code: The main cost of this conversion is the indirect jump, compared to, at minimum, hitting a hashmap lookup loop for the cache for . Does this performance matter? Not for I/O interfaces, probably, but it can matter for some uses! Yes, it should, but here we are. Although making it a language feature has a few rather unfortunate quirks that we need to keep in mind. Suppose we can define defaults on interface methods somehow, like this: Then, any type which provides automatically satisfies . Suppose satisfies , but does not provide . Then we have a problem: Now, we might consider looking past that, but it becomes a big problem with reflection. If we passed into , the resulting conversion would discard the defaulted method, meaning that it would not be findable by . Oops. So we need to somehow add to ’s method set. 
Maybe we say that if is ever converted into , it gets the method. But this doesn’t work, because the compiler might not be able to see through something like . This means that must be applied unconditionally. But, now we have the problem that if we have another interface , would need to receive incompatible signatures for . Again, we’re screwed by the non-intentionality of structural typing. Ok, let’s forget about default method implementations, that doesn’t seem to be workable. What if we make some methods optional, like earlier? Let’s invent some syntax for it. Then, suppose that provides but not (or with the wrong signature). Then, the entry in the itab for would contain a nil function pointer, such that panics! To determine if is safe to call, we would use the following idiom: The compiler is already smart enough to elide construction of funcvals for cases like this, although it does mean that in general, for an interface value , requires an extra or similar to make sure that is nil when it’s a missing method. All of the use cases described above would work Just Fine using this construction, though! However, we run into the same issue that appears to have a larger method set than . It is not clear if should conform to , where is required. My intuition would be no: is a strictly weaker interface. Perhaps it might be necessary to avoid the method access syntax for optional methods, but that’s a question of aesthetics. This idea of having nulls in place of function pointers in a vtable is not new, but to my knowledge is not used especially widely. It would be very useful in C++, for example, to be able to determine if no implementation was provided for a non-pure virtual function. However, the nominal nature of C++’s virtual functions does not make this as big of a need. Another alternative is to store a related interfaces’ itabs on in an itab. For example, suppose that we invent the syntax within an to indicate that that interface will likely get cast to . For example: Satisfying does not require satisfying . However, the must be part of public API, because a cannot be used in place of an Within ’s itab, after all of the methods, there is a pointer to an itab for , if the concrete type for this itab also happens to satisfy . Then, a cast from to is just loading a pointer from the itab. If the cast would fail, the loaded pointer will be . I had always assumed that Go did an optimization like this for embedding interfaces, but no! Any inter-interface conversion, including upcasts, goes through the whole type assertion machinery! Of course, Go cannot hope to generate an itab for every possible subset of the method set of an interface (exponential blow-up), but it’s surprising that they don’t do this for embedded interfaces, which are Go’s equivalent of superinterfaces (present in basically every language with interfaces). Using this feature, we can update to look like this: Unfortunately, because changes the ABI of an interface, it does not seem possible to actually add this to existing interfaces, because the following code is valid: Even though this fix seems really clean, it doesn’t work! The only way it could work is if PGO determines that a particular interface conversion to happens a lot, and updates the ABI of all interfaces with the method set of , program-globally, to contain a pointer to a itab if available. Go’s interfaces are pretty bad; in my opinion, a feature that looks good on a slide, but which results in a lot of mess due to its granular and intention-less nature. 
We can sort of patch over it with embeds, but there are still problems. Due to how method sets work in Go, it’s very hard to “add” methods through an interface, and honestly at this point, any interface mechanism that makes it impossible (or expensive) to add new functions is going to be a huge problem. Missing methods seems like the best way out of this problem, but for now, we can stick to the janky embedded structs. Load the function pointer for out of ’s itab. Perform an indirect jump on that function pointer. Inside of , load a pointer to the itab symbol and into the return slots. Go uses the term “implements” to say that a type satisfies an interface. I am instead intentionally using the term “satisfies”, because it makes the structural, passive nature of implementing an interface clearer. This is also more in-line with interfaces’ use as generic constraints. Swift uses the term “conform” instead, which I am avoiding for this reason.  ↩
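For contrast with the embedded-struct workaround (my own illustration, not from the post): this is the evolution that Java-style nominal interfaces get essentially for free via default methods. The Value/isBoolFlag names below just mirror the flag.Value/IsBoolFlag example; they are not a real API.

```java
interface Value {
    String asString();
    void set(String text);

    // Added after the interface shipped; existing implementations keep
    // compiling and pick up this behavior automatically.
    default boolean isBoolFlag() {
        return false;
    }
}

final class BoolValue implements Value {
    private boolean value;

    @Override public String asString() { return Boolean.toString(value); }
    @Override public void set(String text) { value = Boolean.parseBoolean(text); }
    @Override public boolean isBoolFlag() { return true; } // opt in explicitly
}
```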

neilzone 2 months ago

My third Airsoft game day and perhaps I am finally getting the hang of it

I played my third Airsoft game day today, at Red Alert, near Thatcham, again. It was great fun, and, for the first time, I felt that I might be getting the hang of Airsoft. Sure, it is just running around and shooting toy guns at each other, but the first couple of times, I really had no clue what was going on, or what to do. This time was a lot better. I did have to fight a bit with my safety glasses sweating up today, and I spent part of one of the games with less than ideal vision, but I was still reasonably effective. I resorted to a sort of headband, but over my forehead, and that worked pretty well. As it gets cooler, perhaps this will become less of a problem. I played more aggressively than before, in terms of running up and taking on the opposition. I did this whether I was attacking or defending, so more of the “front line” than hanging around at the back. I guess that I was less worried about being hit, and more keen to be involved. It doesn’t hurt too much, and I go back to respawn and start again. I think that not having to think quite so much about the mechanics of the rifle helped, as I could just get on and use it, and focus on other things. Getting used to the site layout is helpful. I also tried to make use of some of the things that I had been taught in the practice evenings, especially use of cover, which definitely helped. I spent some time being sneaky and taking the long way round to flank the enemy to attack from their rear, which was also fun, but it takes a long time to walk back to respawn, which (especially on a hot day, as today was) was a bit of a pain. But I got some sneaky kills in that way. I’m still getting used to the range of my rifle, which is a lot less than I had expected. I don’t think that it is a particularly bad rifle / range - it is not noticeably worse than other people’s similar rifles - but it is just less than I would have thought. I did pretty well with it, in terms of the number of kills, so I have no real complaints. I am not looking to spend much more money on a nascent hobby at the moment, but I could be tempted to upgrade the spring and see if that has a positive effect (within chrono limits for the site). The first two times, I played on semi-automatic the whole time (one BB per pull of the trigger). This time, I experimented with full auto, so BBs firing for as long as I keep the trigger depressed. I firing no more than three or four rounds at a time (short bursts), and that worked quite well. It did mean that I got through a lot more ammunition - about £10 worth, by my estimation. Some games, I got through three hicap magazines, and into a fourth. A sling has made a massive difference, in terms of comfort, and I’ve experimented with the attachment points. This has been a good additional purchase. I think I’d like to give pyrotechnics a go at some point. Smoke grenades, or perhaps a frag grenade. But that feels like an unnecessary distraction at the moment, and I should get better with my rifle first. Not terrible, but definitely room for improvement. I did quite a bit of running, and sprinting between cover, but by the end of the day, I was definitely feeling it.

Filippo Valsorda 2 months ago

Cross-Site Request Forgery

Cross-Site Request Forgery (CSRF) is a confused deputy attack where the attacker causes the browser to send a request to a target using the ambient authority of the user’s cookies or network position. 1 For example, can serve the following HTML to a victim and the browser will send a POST request to using the victim’s cookies. Essentially all applications that use cookies for authentication need to protect against CSRF. Importantly, this is not about protecting against an attacker that can make arbitrary requests 2 (as an attacker doesn’t know the user’s cookies), but about working with browsers to identify authenticated requests initiated from untrusted sources. Unlike Cross-Origin Resource Sharing (CORS) , which is about sharing responses across origins, CSRF is about accepting state-changing requests, even if the attacker will not see the response. Defending against leaks is significantly more complex and nuanced , especially in the age of Spectre. Why do browsers allow these requests in the first place? Like anything in the Web platform, primarily for legacy reasons: that’s how it used to work and changing it breaks things. Importantly, disabling these third-party cookies breaks important Single-Sign On (SSO) flows. All CSRF solutions need to support a bypass mechanism for those rare exceptions. (There are also complex intersections with cross-site tracking and privacy concerns, which are beyond the scope of this article.) To protect against CSRF, it’s important to first define what is a cross-site or cross-origin request, and which should be allowed. , , and even (depending on the definition) are all same-site but not same-origin. It’s tempting to declare the goal as ensuring requests are simply from the same site, but different origins in the same site can actually sit at very different trust levels: for example it might be much easier to get XSS into an old marketing blog than in the admin panel. The starkest difference in trust though is between an HTTPS and an HTTP origin, since a network attacker can serve anything it wants on the latter. This is sometimes referred to as the MitM CSRF bypass, but really it’s just a special case of a schemelessly same-site cross-origin CSRF attack. Some parts of the Web platform apply a schemeful definition of same-site, where and are not same-site: Using HTTP Strict Transport Security (HSTS) , if possible, is a potential mitigation for HTTP→HTTPS issues. There are a number of potential countermeasures to CSRF, some of which have been available only for a few years. The “classic” countermeasure is a CSRF token , a large random value submitted in the request (e.g. as a hidden ) and compared against a value stored in a cookie ( double-submit ) or in a stateful server-side session ( synchronized tokens ). Normally, double-submit is not a same-origin countermeasure, because same-site origins can set cookies on each other by “cookie tossing”. This can be mitigated with the cookie prefix , or by binding the token to the session/user with signed metadata. The former makes it impossible for the attacker to set the cookie, the latter ensures the attacker doesn’t know a valid value to set it to. Note that signing the cookies or tokens is unnecessary and ineffectual, unless it is binding the token to a user: an attacker that’s cookie tossing can otherwise obtain a valid signed pair by logging into the website themselves and then use that for the attack. 
This countermeasure turns a cross-origin forgery problem into a cross-origin leak problem: if the attacker can obtain a token from a cross-origin response, it can forge a valid request. The token in the HTML body should be masked as a countermeasure against the BREACH compression attack . The primary issue with CSRF tokens is that they require developers to instrument all their forms and other POST requests. Browsers send the source of a request in the Origin header, so CSRF can be mitigated by rejecting non-safe requests from other origins. The main issue is knowing the application’s own origin. One option obviously is asking the developer to configure it, but that’s friction and might not always be easy (such as for open source projects and proxied setups). The closest readily available approximation of the application’s own origin is the Host header. This has two issues: Some older (pre-2020) browsers didn’t send the Origin header for POST requests . The value can be in a variety of cases, such as due to or following cross-origin redirects. must be treated as an indication of a cross-origin request. Some privacy extensions remove the Origin header instead of setting it to . This should be considered a security vulnerability introduced by the extension, since it removes any reliable indication of a browser cross-origin request. If authentication cookies are explicitly set with the SameSite attribute Lax or Strict, they will not be sent with non-safe cross-site requests. This is, by design, not a cross-origin protection, and it can’t be fixed with the prefix (or Secure attribute), since that’s about who can set and read cookies, not about where the requests originate. (This difference is reflected in the difference between Scheme-Bound Cookies and Schemeful Same-Site .) The risk of same-site HTTP origins is still present, too, in browsers that don’t implement Schemeful Same-Site. Note that the rollout of SameSite Lax by default has mostly failed due to widespread breakage, especially in SSO flows. Some browsers now default to Lax-allowing-unsafe , while others default(ed) to None for the first two minutes after the cookie was set. These defaults are not effective CSRF countermeasures. Although CORS is not designed to protect against CSRF, “ non-simple requests ” which for example set headers that a simple couldn’t set are preflighted by an OPTIONS request. An application could choose to allow only non-simple requests, but that is fairly limiting precisely because “simple requests” includes all the ones produced by . To provide a reliable cross-origin signal to websites, browsers introduced Fetch metadata . In particular, the Sec-Fetch-Site header is set to / / / 3 and is now the recommended method to mitigate CSRF . The header has been available in all major browsers since 2023 (and earlier for all but Safari). One limitation is that it is only sent to “ trustworthy origins ”, i.e. HTTPS and localhost. Note that this is not about the scheme of the initiator origin, but of the target, so it is sent for HTTP→HTTPS requests, but not for HTTPS→HTTP or HTTP→HTTP requests (except localhost→localhost). If Sec-Fetch-Site is missing, a lax fallback on Origin=Host is an option, since HTTP→HTTPS requests are not a concern. In summary, to protect against CSRF applications (or, rather, libraries and frameworks) should reject cross-origin non-safe browser requests. The most developer-friendly way to do so is using primarily Fetch metadata, which requires no extra instrumentation or configuration. 
Allow all GET, HEAD, or OPTIONS requests. These are safe methods, and are assumed not to change state at various layers of the stack already. If the Origin header matches an allow-list of trusted origins, allow the request. Trusted origins should be configured as full origins (e.g. ) and compared by simple equality with the header value. If the Sec-Fetch-Site header is present: This secures all major up-to-date browsers for sites hosted on trustworthy (HTTPS or localhost) origins. If neither the Sec-Fetch-Site nor the Origin headers are present, allow the request. These requests are not from (post-2020) browsers, and can’t be affected by CSRF. If the Origin header’s host (including the port) matches the Host header, allow the request, otherwise reject it. This is either a request to an HTTP origin, or by an out-of-date browser. The only false positives (unnecessary blocking) of this algorithm are requests to non-trustworthy (plain HTTP) origins that go through a reverse proxy that changes the Host header. That edge case can be worked around by adding the origin to the allow-list. There are no false negatives in modern browsers, but pre-2023 browsers will be vulnerable to HTTP→HTTPS requests, because the Origin fallback is scheme-agnostic. HSTS can be used to mitigate that (in post-2020 browsers), but note that out-of-date browsers are likely to have more pressing security issues. Finally, there should be a tightly scoped bypass mechanism for e.g. SSO edge cases, with the appropriate safety placards . For example, it could be route-based, or require manual tagging of requests before the CSRF middleware. Go 1.25 introduces a CrossOriginProtection middleware in which implements this algorithm . (This research was done as background for that proposal.) Thank you to Roberto Clapis for helping with this analysis, and to Patrick O’Doherty for setting in motion and testing this work. For more, follow me on Bluesky at @filippo.abyssdomain.expert or on Mastodon at @[email protected] . Back to Rome photoblogging. This was taken from the municipal rose garden, which opens for a couple weeks every spring and fall. This work is made possible by Geomys , my Go open source maintenance organization, which is funded by Smallstep , Ava Labs , Teleport , Tailscale , and Sentry . Through our retainer contracts they ensure the sustainability and reliability of our open source maintenance work and get a direct line to my expertise and that of the other Geomys maintainers. (Learn more in the Geomys announcement .) Here are a few words from some of them! Teleport — For the past five years, attacks and compromises have been shifting from traditional malware and security breaches to identifying and compromising valid user accounts and credentials with social engineering, credential theft, or phishing. Teleport Identity is designed to eliminate weak access patterns through access monitoring, minimize attack surface with access requests, and purge unused permissions via mandatory access reviews. Ava Labs — We at Ava Labs , maintainer of AvalancheGo (the most widely used client for interacting with the Avalanche Network ), believe the sustainable maintenance and development of open source cryptographic protocols is critical to the broad adoption of blockchain technology. We are proud to support this necessary and impactful work through our ongoing sponsorship of Filippo and his team. Abuse of the ambient authority of network position, often through DNS rebinding, is being addressed by Private Network Access . 
The rest of this post will focus on abuse of cookie authentication.  ↩ This is why API traffic generally doesn’t need to be protected against CSRF. If it looks like it’s not from a browser, it can’t be a CSRF.  ↩ means the request was directly user-initiated, e.g. a bookmark.  ↩ Cookies in general apply the schemeless definition (HTTP = HTTPS). There is a proposal to address this, Origin-Bound-Cookies (and specifically its lack of opt-out for scheme binding, which subsumes the earlier Scheme-Bound Cookies proposal), which however hasn’t shipped yet . The SameSite cookie attribute used to apply the schemeless definition (HTTP = HTTPS). Chrome changed that with Schemeful Same-Site in 2020, but Firefox and Safari never implemented it. Sec-Fetch-Site (and the HTML and Fetch specifications in general) apply the schemeful definition (HTTP ≠ HTTPS). it may be different from the browser origin if a reverse proxy is involved; it does not include the scheme, so there is no way to know if an Origin is a cross-origin HTTP→HTTPS request or a same-origin HTTP request. Allow all GET, HEAD, or OPTIONS requests. These are safe methods, and are assumed not to change state at various layers of the stack already. If the Origin header matches an allow-list of trusted origins, allow the request. Trusted origins should be configured as full origins (e.g. ) and compared by simple equality with the header value. If the Sec-Fetch-Site header is present: if its value is or , allow the request; otherwise, reject the request. If neither the Sec-Fetch-Site nor the Origin headers are present, allow the request. These requests are not from (post-2020) browsers, and can’t be affected by CSRF. If the Origin header’s host (including the port) matches the Host header, allow the request, otherwise reject it. This is either a request to an HTTP origin, or by an out-of-date browser. Abuse of the ambient authority of network position, often through DNS rebinding, is being addressed by Private Network Access . The rest of this post will focus on abuse of cookie authentication.  ↩ This is why API traffic generally doesn’t need to be protected against CSRF. If it looks like it’s not from a browser, it can’t be a CSRF.  ↩ means the request was directly user-initiated, e.g. a bookmark.  ↩
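To make the decision procedure above concrete, here is a sketch of it as a small Java predicate. This is my own translation for illustration, not the Go net/http implementation, and it assumes the Sec-Fetch-Site values to allow are same-origin and none (same-origin or directly user-initiated requests):

```java
import java.util.Set;

final class CrossOriginCheck {
    private final Set<String> trustedOrigins; // full origins, e.g. "https://sso.example.com"

    CrossOriginCheck(Set<String> trustedOrigins) {
        this.trustedOrigins = trustedOrigins;
    }

    /** Returns true if the (possibly cross-origin) request should be allowed. */
    boolean allow(String method, String secFetchSite, String origin, String host) {
        // 1. Safe methods are always allowed; they should not change state anyway.
        if (method.equals("GET") || method.equals("HEAD") || method.equals("OPTIONS")) {
            return true;
        }
        // 2. Explicitly trusted origins (e.g. an SSO flow) bypass the check.
        if (origin != null && trustedOrigins.contains(origin)) {
            return true;
        }
        // 3. Fetch metadata, when present, is authoritative.
        if (secFetchSite != null) {
            return secFetchSite.equals("same-origin") || secFetchSite.equals("none");
        }
        // 4. Neither Sec-Fetch-Site nor Origin: not a modern browser request.
        if (origin == null) {
            return true;
        }
        // 5. Older-browser fallback: compare the Origin host and port against the
        //    Host header. The scheme is unavailable here, hence the HTTP->HTTPS caveat.
        return origin.endsWith("://" + host);
    }
}
```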

matklad 3 months ago

Zig's Lovely Syntax

It’s a bit of a silly post, because syntax is the least interesting detail about the language, but, still, I can’t stop thinking how Zig gets this detail just right for the class of curly-braced languages, and, well, now you’ll have to think about that too. On the first glance, Zig looks almost exactly like Rust, because Zig borrows from Rust liberally. And I think that Rust has great syntax, considering all the semantics it needs to express (see “Rust’s Ugly Syntax” ). But Zig improves on that, mostly by leveraging simpler language semantics, but also through some purely syntactical tasteful decisions. How do you spell a number ninety-two? Easy, . But what type is that? Statically-typed languages often come with several flavors of integers: , , . And there’s often a syntax for literals of a particular types: , , . Zig doesn’t have suffixes, because, in Zig, all integer literals have the same type: : The value of an integer literal is known at compile time and is coerced to a specific type on assignment or ascription: To emphasize, this is not type inference, this is implicit comptime coercion. This does mean that code like generally doesn’t work, and requires an explicit type. Raw or multiline strings are spelled like this: This syntax doesn’t require a special form for escaping itself: It nicely dodges indentation problems that plague every other language with a similar feature. And, the best thing ever: lexically, each line is a separate token. As Zig has only line-comments, this means that is always whitespace. Unlike most other languages, Zig can be correctly lexed in a line-by-line manner. Raw strings is perhaps the biggest improvement of Zig over Rust. Rust brute-forces the problem with syntax, which does the required job, technically, but suffers from the mentioned problems: indentation is messy, nesting quotes requires adjusting hashes, unclosed raw literal breaks the following lexical structure completely, and rustfmt’s formatting of raw strings tends to be rather ugly. On the plus side, this syntax at least cannot be expressed by a context-free grammar! For the record, Zig takes C syntax (not that C would notice): The feels weird! It will make sense by the end of the post. Here, I want only to note part, which matches the assignment syntax . This is great! This means that grepping for gives you all instances where a field is written to. This is hugely valuable: most of usages are reads, but, to understand the flow of data, you only need to consider writes. Ability to mechanically partition the entire set of usages into majority of boring reads and a few interesting writes does wonders for code comprehension. Where Zig departs from C the most is the syntax for types. C uses a needlessly confusing spiral rule. In Zig, all types are prefix: While pointer type is prefix, pointer dereference is postfix, which is a more natural subject-verb order to read: Zig has general syntax for “raw” identifiers: It is useful to avoid collisions with keywords, or for exporting a symbol whose name is otherwise not a valid Zig identifier. It is a bit more to type than Kotlin’s delightful , but manages to re-use Zig’s syntax for built-ins ( ) and strings. Like, Rust, Zig goes for function declaration syntax. This is such a massive improvement over C/Java style function declarations: it puts token (which is completely absent in traditional C family) and function name next to each other, which means that textual search for allows you to quickly find the function. Then Zig adds a little twist. 
While in Rust we write The arrow is gone! Now that I’ve used this for some time, I find arrow very annoying to type, and adding to the visual noise. Rust needs the arrow: Rust has lambdas with an inferred return type, and, in a lambda, the return type is optional. So you need some sort of an explicit syntax to tell the parser if there is return type: And it’s understandable that lambdas and functions would want to use compatible syntax. But Zig doesn’t have lambdas, so it just makes the type mandatory. So the main is Related small thing, but, as name of the type, I think I like more than . Zig is using and for binding values to names: This is ok, a bit weird after Rust’s, whose would be in Zig, but not really noticeable after some months. I do think this particular part is not great, because , the more frequent one, is longer. I think Kotlin nails it: , , . Note all three are monosyllable, unlike and ! Number of syllables matters more than the number of letters! Like Rust, Zig uses syntax for ascribing types, which is better than because optional suffixes are easier to parse visually and mechanically than optional prefixes. Zig doesn’t use and and spells the relevant operators as and : This is easier to type and much easier to read, but there’s also a deeper reason why they are not sigils. Zig marks any control flow with a keyword. And, because boolean operators short-circuit, they are control flow! Treating them as normal binary operator leads to an entirely incorrect mental model. For bitwise operations, Zig of course uses and . Both Zig and Rust have statements and expressions. Zig is a bit more statement oriented, and requires explicit returns: Furthermore, because there are no lambdas, scope of return is always clear. Relatedly, the value of a block expression is void. A block is a list of statements, and doesn’t have an optional expression at the end. This removes the semicolon problem — while Rust rules around semicolons are sufficiently clear (until you get to macros), there’s some constant mental overhead to getting them right all the time. Zig is more uniform and mechanical here. If you need a block that yields a value, Zig supports a general syntax for breaking out of a labeled block: Rust makes pedantically correct choice regarding s: braces are mandatory: This removes the dreaded “dangling else” grammatical ambiguity. While theoretically nice, it makes -expression one-line feel too heavy. It’s not the braces, it’s the whitespace around them: But the ternary is important! Exploding a simple choice into multi-line condition hurts readability. Zig goes with the traditional choice of making parentheses required and braces optional: By itself, this does create a risk of style bugs. But in Zig formatter (non-configurable, user-directed) is a part of the compiler, and formatting errors that can mask bugs are caught during compilation. For example, is an error due to inconsistent whitespace around the minus sign, which signals a plausible mixup of infix and binary minus. No such errors are currently produced for incorrect indentation (the value add there is relatively little, given ), but this is planned. NB: because Rust requires branches to be blocks, it is forced to make synonym with . Otherwise, the ternary would be even more unusable! Syntax design is tricky! Whether you need s and whether you make or mandatory in ifs are not orthogonal! Like Python, Zig allows on loops. 
Unlike Python, loops are expressions, which leads to a nicely readable imperative searches: Zig doesn’t have syntactically-infinite loop like Rust’s or Go’s . Normally I’d consider that a drawback, because these loops produce different control flow, affecting reachability analysis in the compiler, and I don’t think it’s great to make reachability dependent on condition being visibly constant. But! As Zig places semantics front and center, and the rules for what is and isn’t a comptime constant are a backbone of every feature, “anything equivalent to ” becomes sufficiently precise. Incidentally, these days I tend to write “infinite” loops as Almost always there is an up-front bound for the number of iterations until the break, and its worth asserting this bound, because debugging crashes is easier than debugging hangs. , , , , and all use the same Ruby/Rust inspired syntax for naming captured values: I like how the iterator comes first, and then the name of an item follows, logically and syntactically. I have a very strong opinion about variable shadowing. It goes both ways: I spent hours debugging code which incorrectly tried to use a variable that was shadowed by something else, but I also spent hours debugging code that accidentally used a variable that should have been shadowed! I really don’t know whether on balance it is better to forbid or encourage shadowing! Zig of course forbids shadowing, but what’s curious is that it’s just one episode of the large crusade against any complexity in name resolution. There’s no “prelude”, if you want to use anything from std, you need to import it: There are no glob imports, if you want to use an item from std, you need to import it: Zig doesn’t have inheritance, mixins, argument-dependent lookup, extension functions, implicit or traits, so, if you see , that is guaranteed to be a boring method declared on type. Similarly, while Zig has powerful comptime capabilities, it intentionally disallows declaring methods at compile time. Like Rust, Zig used to allow a method and a field to share a name, because it actually is syntactically clear enough at the call site which is which. But then this feature got removed from Zig. More generally, Zig doesn’t have namespaces. There can be only one kind of in scope, while Rust allows things like I am astonished at the relative lack of inconvenience in Zig’s approach. Turns out that is all the syntax you’ll ever need for accessing things? For the historically inclined, see “The module naming situation” thread in the rust mailing list archive to learn the story of how rust got its syntax. The lack of namespaces touches on the most notable (by its absence) feature of Zig syntax, which deeply relates to the most profound aspect of Zig’s semantics. Everything is an expression. By which I mean, there’s no separate syntactic categories of values, types, and patterns. Values, types, and patterns are of course different things. And usually in the language grammar it is syntactically obvious whether a particular text fragment refers to a type or a value: So the standard way is to have separate syntax families for the three categories, which need to be internally unambiguous, but can be ambiguous across the categories because the place in the grammar dictates the category: when parsing , everything until is a pattern, stuff between and is a type, and after we have a value. There are two problems here. 
First, there’s a combinatorial explosion of sorts in the syntax, because, while three categories describe different things, it turns out that they have the same general tree-ish shape. The second problem is that it might be hard to maintain category separation in the grammar. Rust started with the three categories separated by a bright line. But then, changes happen. Originally, Rust only allowed syntax for assignment. But today you can also write to do unpacking like Similarly, the turbofish used to move the parser from the value to the type mode, but now const parameters are values that can be found in the type position! The alternative is not to pick this fight at all. Rather than trying to keep the categories separately in the syntax, use the same surface syntax to express all three, and categorize later, during semantic analysis. In fact, this is already happens in the example — these are different things! One is a place (lvalue) and another is a “true” value (rvalue), but we use the same syntax for both. I don’t think such syntactic unification necessarily implies semantic unification, but Zig does treat everything uniformly, as a value with comptime and runtime behavior (for some values, runtime behavior may be missing, for others — comptime): The fact that you can write an where a type goes is occasionally useful. But the fact that simple types look like simple values syntactically consistently make the language feel significantly less busy. As a special case of everything being an expression, instances of generic types look like this: Just a function call! Though, there’s some resistance to trickery involved to make this work. Usually, languages rely on type inference to allow eliding generic arguments. That in turn requires making argument syntax optional, and that in turn leads to separating generic and non-generic arguments into separate parameter lists and some introducer sigil for generics, like or . Zig solves this syntactic challenge in the most brute-force way possible. Generic parameters are never inferred, if a function takes 3 comptime arguments and 2 runtime arguments, it will always be called with 5 arguments syntactically. Like with the (absence of) importing flourishes, a reasonable reaction would be “wait, does this mean that I’ll have to specify the types all the time?” And, like with import, in practice this is a non-issue. The trick are comptime closures. Consider a generic : We have to specify type when creating an instance of an . But subsequently, when we are using the array list, we don’t have to specify the type parameter again, because the type of variable already closes over . This is the major truth of object-orienting programming, the truth so profound that no one even notices it: in real code, 90% of functions are happiest as (non-virtual) methods. And, because of that, the annotation burden in real-world Zig programs is low. While Zig doesn’t have Hindley-Milner constraint-based type inference, it relies heavily on one specific way to propagate types. Let’s revisit the first example: This doesn’t compile: and are different values, we can’t select between two at runtime because they are different. We need to coerce the constants to a specific runtime type: But this doesn’t kick the can sufficiently far enough and essentially reproduces the with two incompatible branches. We need to sink coercion down the branches: And that’s exactly how Zig’s “Result Location Semantics” works. 
Type “inference” runs as a simple left-to-right tree-walking algorithm, which resembles an interpreter’s eval. In fact, that is exactly what happens: Zig is not a compiler, it is an interpreter. When Zig evaluates an expression, it gets, alongside the expression itself, a result location and a result type. When interpreting code like an assignment of an `if` expression to a struct field, the interpreter passes the result location (the field) and the result type down the tree of subexpressions. The `if` branches store their result directly into the field (there’s a store inside each branch, as opposed to a single store after the `if`), and each branch coerces its comptime constant to the appropriate runtime type of the result.

This mechanism enables the concise syntax for specifying enum values. When Zig evaluates a switch over an enum, it first evaluates the scrutinee and realizes that it has an enum type. When evaluating an arm, it sets the result type of the arm’s condition to that enum type, and a bare dot literal gets coerced to it. The same happens for the second arm, where the result type sinks further down the expression.

Result type semantics also explains the leading dot in the record literal syntax. Syntactically, we just want to disambiguate records from blocks. But, semantically, we want to coerce the literal to whatever type we want to get out of this expression: in Zig, the dotted literal is a shorthand for spelling the literal out with its type, with the type supplied by the result location. I must confess that the leading dot did weird me out a lot at first when writing code (I don’t mind reading the dot). It’s not the easiest thing to type! But that was fixed once I added an editor snippet expanding to it.

The benefits of lightweight record literal syntax are huge, as it allows for some pretty nice APIs. In particular, you get named and default arguments for free (there’s a sketch of this at the end of this section). I don’t really miss named arguments in Rust; you can always design APIs without them. But they are free in Zig, so I use them liberally. Syntax-wise, we get two features (calling functions and initializing objects) for the price of one!

Finally, the thing that weirds out some people when they see Zig code, and makes others reconsider their choice of GitHub handle even when they haven’t seen any Zig: the `@` syntax for built-in functions. Every language needs to glue “userspace” code to the primitive operations supported by the compiler. Usually, the gluing is achieved by making the standard library privileged and allowing it to define intrinsic functions without bodies, or by adding ad-hoc operators directly to the language (like Rust’s `as`). And Zig does have a fair number of operators of its own. But the release valve for a lot of functionality is built-in functions in a distinct syntactic namespace, so Zig separates the various casts out into individually named builtins (`@intCast`, `@floatCast`, `@ptrCast`, `@bitCast`, `@truncate`, `@enumFromInt`, and so on). There’s no need to overload casting when you can give each variant a name.

There’s also `@as` for type ascription. The type goes first, because the mechanism here is result type semantics: `@as` evaluates its first argument as a type, and then uses that as the result type for the second argument. Curiously, I think `@as` could actually be implemented in userspace: in Zig, the type of a function parameter may depend on the values of preceding (comptime) ones!
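Here’s a hedged sketch of that userspace idea (my own illustration, not the post’s code): the type of the second parameter depends on the value of the first, comptime, parameter, so the argument is coerced by ordinary parameter-passing rules.

```zig
const std = @import("std");

fn as(comptime T: type, value: T) T {
    return value;
}

test "as" {
    const x = as(u32, 1); // the comptime_int literal is coerced to u32
    try std.testing.expect(@TypeOf(x) == u32);
    try std.testing.expect(x == 1);
}
```

And, for the record-literal niceties mentioned above, here’s a hedged sketch of the options-struct idiom (hypothetical names) that gives named and default arguments for free:

```zig
const std = @import("std");

// Default field values play the role of default arguments.
const ClampOptions = struct {
    min: i64 = 0,
    max: i64 = 100,
};

fn clamp(value: i64, opts: ClampOptions) i64 {
    return @max(opts.min, @min(opts.max, value));
}

test "clamp" {
    // The anonymous `.{ ... }` literal is coerced to ClampOptions by the
    // parameter's type; fields read like named arguments, and omitted
    // fields fall back to their defaults.
    try std.testing.expectEqual(@as(i64, 100), clamp(250, .{}));
    try std.testing.expectEqual(@as(i64, 10), clamp(3, .{ .min = 10 }));
}
```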
My favorite builtin is `@import`. First, it’s the most obvious way to import code: it’s crystal clear where the file comes from. But, second, it is an instance of reverse syntax sugar. You see, `@import` isn’t really a function: you can’t call it with a computed value, because the argument of `@import` has to be a string, syntactically. It really is syntax, except that the function-call form is re-used, because it already has the right shape.

So, this is it. Just a bunch of silly syntactical decisions which add up to a language that is positively enjoyable to read. As for big lessons: obviously, the fewer features your language has, the less syntax you’ll need. And less syntax is generally good, because varied syntactic constructs tend to step on each other’s toes. Languages are not combinations of orthogonal aspects: features tug and pull the language in different directions, and their combinations might turn out to be miraculous features in their own right, or might drag the language down. Even with a small feature set fixed, there’s still a lot of work to pick a good concrete syntax: unambiguous to parse, useful to grep, easy to read, and not too painful to write. A smart thing, of course, is to steal and borrow solutions from other languages, not because of familiarity, but because ruthless natural selection tends to weed out poor ideas. But there’s a lot of inertia in languages, so there’s no need to fear innovation. If an odd-looking syntax is actually good, people will take to it.

Is there anything about Zig’s syntax I don’t like? I thought not, when starting this post. But in the process of writing it I did discover one form that annoys me. It is the `while` loop with an increment, `while (cond) : (i += 1)`. This is two-thirds of a C-style for loop (without the declarator), and it sucks for the same reason: control flow jumps all over the place and is unrelated to the source code order. We go from the condition, to the body, to the increment, but in the source order the increment sits between the condition and the body. In Zig, this loop sucks for one additional reason: the `:` separating the increment is, I think, the single example of control flow in Zig that is expressed by a sigil rather than a keyword.

This form used to be rather important, as Zig lacked a counting loop. It has the ranged `for (0..n)` form now, so I am tempted to call the while-with-increment redundant. Annoyingly, the while-with-increment is almost, but not exactly, equivalent to the counting `for`: with certain control flow in the body, one of the two runs the increment one extra time, which is useless and might be outright buggy. Oh well.
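To make the comparison concrete, here’s a hedged sketch of the two counting-loop forms side by side (trivial summing helpers of my own, not code from the post):

```zig
const std = @import("std");

fn sumWhile(xs: []const u32) u64 {
    var total: u64 = 0;
    var i: usize = 0;
    // Source order: condition, then increment, then body.
    while (i < xs.len) : (i += 1) {
        total += xs[i];
    }
    return total;
}

fn sumFor(xs: []const u32) u64 {
    var total: u64 = 0;
    // The ranged for loop keeps the index bookkeeping implicit.
    for (0..xs.len) |i| {
        total += xs[i];
    }
    return total;
}

test "counting loops" {
    const xs = [_]u32{ 1, 2, 3 };
    try std.testing.expectEqual(@as(u64, 6), sumWhile(&xs));
    try std.testing.expectEqual(@as(u64, 6), sumFor(&xs));
}
```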
