Latest Posts (15 found)
Julia Evans 3 months ago

New zine: The Secret Rules of the Terminal

Hello! After many months of writing deep dive blog posts about the terminal, on Tuesday I released a new zine called “The Secret Rules of the Terminal”! You can get it for $12 here: https://wizardzines.com/zines/terminal , or get a 15-pack of all my zines here . Here’s the cover: Here’s the table of contents:

I’ve been using the terminal every day for 20 years but even though I’m very confident in the terminal, I’ve always had a bit of an uneasy feeling about it. Usually things work fine, but sometimes something goes wrong and it just feels like investigating it is impossible, or at least like it would open up a huge can of worms.

So I started trying to write down a list of weird problems I’ve run into in the terminal and I realized that the terminal has a lot of tiny inconsistencies like: If you use the terminal daily for 10 or 20 years, even if you don’t understand exactly why these things happen, you’ll probably build an intuition for them. But having an intuition for them isn’t the same as understanding why they happen.

When writing this zine I actually had to do a lot of work to figure out exactly what was happening in the terminal to be able to talk about how to reason about it. It turns out that the “rules” for how the terminal works (how do you edit a command you type in? how do you quit a program? how do you fix your colours?) are extremely hard to fully understand, because “the terminal” is actually made of many different pieces of software (your terminal emulator, your operating system, your shell, the core utilities, and every other random terminal program you’ve installed) which are written by different people with different ideas about how things should work.

So I wanted to write something that would explain: Terminal internals are a mess. A lot of it is just the way it is because someone made a decision in the 80s and now it’s impossible to change, and honestly I don’t think learning everything about terminal internals is worth it. But some parts are not that hard to understand and can really make your experience in the terminal better, like:

When I wrote How Git Works , I thought I knew how Git worked, and I was right. But the terminal is different. Even though I feel totally confident in the terminal and even though I’ve used it every day for 20 years, I had a lot of misunderstandings about how the terminal works and (unless you’re the author of one of these programs or something) I think there’s a good chance you do too. A few things I learned that are actually useful to me:

As usual these days I wrote a bunch of blog posts about various side quests:

A long time ago I used to write zines mostly by myself but with every project I get more and more help. I met with Marie Claire LeBlanc Flanagan every weekday from September to June to work on this one. The cover is by Vladimir Kašiković, Lesley Trites did copy editing, Simon Tatham (who wrote PuTTY ) did technical review, our Operations Manager Lee did the transcription as well as a million other things, and Jesse Luehrs (who is one of the very few people I know who actually understands the terminal’s cursed inner workings) had so many incredibly helpful conversations with me about what is going on in the terminal.

Here are some links to get the zine again: As always, you can get either a PDF version to print at home or a print version shipped to your house. The only caveat is that print orders will ship in August – I need to wait for orders to come in to get an idea of how many I should print before sending it to the printer.

Julia Evans 4 months ago

Using `make` to compile C programs (for non-C-programmers)

I have never been a C programmer but every so often I need to compile a C/C++ program from source. This has been kind of a struggle for me: for a long time, my approach was basically “install the dependencies, run , if it doesn’t work, either try to find a binary someone has compiled or give up”. “Hope someone else has compiled it” worked pretty well when I was running Linux but since I’ve been using a Mac for the last couple of years I’ve been running into more situations where I have to actually compile programs myself.

So let’s talk about what you might have to do to compile a C program! I’ll use a couple of examples of specific C programs I’ve compiled and talk about a few things that can go wrong. Here are three programs we’ll be talking about compiling:

This is pretty simple: on an Ubuntu system if I don’t already have a C compiler I’ll install one with: This installs , , and . The situation on a Mac is more confusing but it’s something like “install xcode command line tools”.

Unlike some newer programming languages, C doesn’t have a dependency manager. So if a program has any dependencies, you need to hunt them down yourself. Thankfully because of this, C programmers usually keep their dependencies very minimal and often the dependencies will be available in whatever package manager you’re using. There’s almost always a section explaining how to get the dependencies in the README, for example in paperjam ’s README, it says:

To compile PaperJam, you need the headers for the libqpdf and libpaper libraries (usually available as libqpdf-dev and libpaper-dev packages). You may need (found in AsciiDoc ) for building manual pages.

So on a Debian-based system you can install the dependencies like this. If a README gives a name for a package (like ), I’d basically always assume that they mean “in a Debian-based Linux distro”: if you’re on a Mac will not work. I still haven’t 100% gotten the hang of developing on a Mac, so I don’t have many tips there yet. I guess in this case it would be if you’re using Homebrew.

Some C programs come with a and some instead come with a script called . For example, if you download sqlite’s source code , it has a script in it instead of a Makefile. My understanding of this script is: I think there might be some options you can pass to get the script to produce a different but I have never done that.

The next step is to run to try to build a program. Some notes about : Here’s an error I got while compiling on my Mac: Over the years I’ve learned it’s usually best not to overthink problems like this: if it’s talking about , there’s a good chance it just means that I’ve done something wrong with how I’m including the dependency. Now let’s talk about some ways to get the dependency included in the right way.

Before we talk about how to fix dependency problems: building C programs is split into 2 steps: It’s important to know this when building a C program because sometimes you need to pass the right flags to the compiler and linker to tell them where to find the dependencies for the program you’re compiling.
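To make those two steps concrete, here’s a minimal sketch of doing them by hand – the file names, the install paths, and the library name `foo` are all made up for illustration:

```bash
# step 1: compile each .c file into an object file
# (-I tells the compiler where to look for header files)
cc -c main.c -I/opt/somewhere/include -o main.o

# step 2: link the object files into an executable
# (-L tells the linker where to look for libraries, -lfoo means "link against libfoo")
cc main.o -L/opt/somewhere/lib -lfoo -o myprogram
```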
If I run on my Mac to install , I get this error: This is not because is not installed on my system (it actually is!). But the compiler and linker don’t know how to find the library. To fix this, we need to: And we can get to pass those extra parameters to the compiler and linker using environment variables!

To see how this works: inside ’s Makefile you can see a bunch of environment variables, like here: Everything you put into the environment variable gets passed to the linker ( ) as a command line argument. Makefiles sometimes define their own environment variables that they pass to the compiler/linker, but also has a bunch of “implicit” environment variables which it will automatically pass to the C compiler and linker. There’s a full list of implicit environment variables here , but one of them is , which gets automatically passed to the C compiler. (technically it would be more normal to use for this, but this particular hardcodes so setting was the only way I could find to set the compiler flags without editing the )

I learned thanks to @zwol that there are actually two ways to pass environment variables to : The difference between them is that will override the value of set in the but won’t. I’m not sure which way is the norm but I’m going to use the first way in this post.

Now that we’ve talked about how and get passed to the compiler and linker, here’s the final incantation that I used to get the program to build successfully! This passes to the compiler and to the linker. Also I don’t want to pretend that I “magically” knew that those were the right arguments to pass, figuring them out involved a bunch of confused Googling that I skipped over in this post. I will say that:

Yesterday I discovered this cool tool called qf which you can use to quickly open files from the output of . is in a big directory of various tools, but I only wanted to compile . So I just compiled , like this: Basically if you know (or can guess) the output filename of the file you’re trying to build, you can tell to just build that file by running

I sometimes write 5-line C programs with no dependencies, and I just learned that if I have a file called , I can just compile it like this without creating a : It gets automatically expanded to , which saves a bit of typing. I have no idea if I’m going to remember this (I might just keep typing anyway) but it seems like a fun trick.

If you’re having trouble building a C program, maybe other people had problems building it too! Every Linux distribution has build files for every package that they build, so even if you can’t install packages from that distribution directly, maybe you can get tips from that Linux distro for how to build the package. Realizing this (thanks to my friend Dave) was a huge ah-ha moment for me. For example, this line from the nix package for says: This is basically saying “pass the linker flag to build this on a Mac”, so that’s a clue we could use to build it. That same file also says . I’m not sure what this means, but when I try to build the package I do get an error about something called a , so I guess that’s somehow related to the “PointerHolder transition”.

Once you’ve managed to compile the program, probably you want to install it somewhere! Some Makefiles have an install target that lets you install the tool on your system with . I’m always a bit scared of this (where is it going to put the files? what if I want to uninstall them later?), so if I’m compiling a pretty simple program I’ll often just manually copy the binary to install it instead, like this:

Once I figured out how to do all of this, I realized that I could use my new knowledge to contribute a package to Homebrew! Then I could just on future systems. The good thing is that even if the details of all of the different packaging systems differ, they fundamentally all use C compilers and linkers.
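Pulling a few of the tricks from this section together in one place – a sketch, with made-up paths, and with the qf and hello target names only as examples:

```bash
# pass extra flags through to the C compiler and linker;
# `make CFLAGS=...` overrides a CFLAGS= line inside the Makefile,
# while `CFLAGS=... make` does not
make CFLAGS="-I/opt/somewhere/include" LDFLAGS="-L/opt/somewhere/lib"

# build just one target out of a big Makefile
make qf

# make's built-in implicit rule: compile hello.c into ./hello
# with no Makefile at all
make hello
```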
I think all of this is an interesting example of how it can be useful to understand some basics of how C programs work (like “they have header files”) even if you’re never planning to write a nontrivial C program in your life. It feels good to have some ability to compile C/C++ programs myself, even though I’m still not totally confident about all of the compiler and linker flags and I still plan to never learn anything about how autotools works other than “you run to generate the ”. Two things I left out of this post:

Julia Evans 7 months ago

Standards for ANSI escape codes

Hello! Today I want to talk about ANSI escape codes. For a long time I was vaguely aware of ANSI escape codes (“that’s how you make text red in the terminal and stuff”) but I had no real understanding of where they were supposed to be defined or whether or not there were standards for them. I just had a kind of vague “there be dragons” feeling around them. While learning about the terminal this year, I’ve learned that: So I wanted to put together a list for myself of some standards that exist around escape codes, because I want to know if they have to feel unreliable and frustrating, or if there’s a future where we could all rely on them with more confidence.

Have you ever pressed the left arrow key in your terminal and seen ? That’s an escape code! It’s called an “escape code” because the first character is the “escape” character, which is usually written as , , , , or . Escape codes are how your terminal emulator communicates various kinds of information (colours, mouse movement, etc) with programs running in the terminal. There are two kinds of escape codes:

Now let’s talk about standards! The first standard I found relating to escape codes was ECMA-48 , which was originally published in 1976. ECMA-48 does two things: The formats are extensible, so there’s room for others to define more escape codes in the future. Lots of escape codes that are popular today aren’t defined in ECMA-48: for example it’s pretty common for terminal applications (like vim, htop, or tmux) to support using the mouse, but ECMA-48 doesn’t define escape codes for the mouse.

There are a bunch of escape codes that aren’t defined in ECMA-48, for example: I believe (correct me if I’m wrong!) that these and some others came from xterm, are documented in XTerm Control Sequences , and have been widely implemented by other terminal emulators. This list of “what xterm supports” is not a standard exactly, but xterm is extremely influential and so it seems like an important document.

In the 80s (and to some extent today, but my understanding is that it was MUCH more dramatic in the 80s) there was a huge amount of variation in what escape codes terminals actually supported. To deal with this, there’s a database of escape codes for various terminals called “terminfo”. It looks like the standard for terminfo is called X/Open Curses , though you need to create an account to view that standard for some reason. It defines the database format as well as a C library interface (“curses”) for accessing the database. For example you can run a short bash snippet to see every possible escape code for “clear screen” for all of the different terminals your system knows about (there’s a sketch of one way to do this at the end of this post). On my system (and probably every system I’ve ever used?), the terminfo database is managed by ncurses.

I think it’s interesting that there are two main approaches that applications take to handling ANSI escape codes: Some examples of programs/libraries that take approach #2 (“don’t use terminfo”) include: I got curious about why folks might be moving away from terminfo and I found this very interesting and extremely detailed rant about terminfo from one of the fish maintainers , which argues that:

[the terminfo authors] have done a lot of work that, at the time, was extremely important and helpful. My point is that it no longer is.

I’m not going to do it justice so I’m not going to summarize it, I think it’s worth reading.

I was just talking about the idea that you can use a “common set” of escape codes that will work for most people. But what is that set?
Is there any agreement? I really do not know the answer to this at all, but from doing some reading it seems like it’s some combination of: and maybe ultimately “identify the terminal emulators you think your users are going to use most frequently and test in those”, the same way web developers do when deciding which CSS features are okay to use.

I don’t think there are any resources like Can I use…? or Baseline for the terminal though. (in theory terminfo is supposed to be the “caniuse” for the terminal but it seems like it often takes 10+ years to add new terminal features when people invent them which makes it very limited)

I also asked on Mastodon why people found terminfo valuable in 2025 and got a few reasons that made sense to me:

The way that ncurses uses the environment variable to decide which escape codes to use reminds me of how webservers used to sometimes use the browser user agent to decide which version of a website to serve. It also seems like it’s had some of the same results – the way iTerm2 reports itself as being “xterm-256color” feels similar to how Safari’s user agent is “Mozilla/5.0 (Macintosh; Intel Mac OS X 14_7_4) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/18.3 Safari/605.1.15”. In both cases the terminal emulator / browser ends up changing its user agent to get around user agent detection that isn’t working well.

On the web we ended up deciding that user agent detection was not a good practice and to instead focus on standardization so we can serve the same HTML/CSS to all browsers. I don’t know if the same approach is the future in the terminal though – I think the terminal landscape today is much more fragmented than the web ever was as well as being much less well funded.

A few more documents and standards related to escape codes, in no particular order:

I sometimes see people saying that the unix terminal is “outdated”, and since I love the terminal so much I’m always curious about what incremental changes might make it feel less “outdated”. Maybe if we had a clearer standards landscape (like we do on the web!) it would be easier for terminal emulator developers to build new features and for authors of terminal applications to more confidently adopt those features so that we can all benefit from them and have a richer experience in the terminal.

Obviously standardizing ANSI escape codes is not easy (ECMA-48 was first published almost 50 years ago and we’re still not there!). I don’t even know what all of the challenges are. But the situation with HTML/CSS/JS used to be extremely bad too and now it’s MUCH better, so maybe there’s hope.
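Here’s a sketch of the kind of terminfo query mentioned above – it assumes ncurses’ `toe` and `tput` are installed, and prints each known terminal type’s “clear screen” sequence with the control characters made visible:

```bash
# toe -a lists every terminal type in the terminfo database;
# tput -T <term> clear prints that terminal's "clear screen" escape code,
# and cat -v makes the control characters visible
for term in $(toe -a | awk '{print $1}'); do
  printf '%-20s ' "$term"
  tput -T "$term" clear 2>/dev/null | cat -v
  echo
done
```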

Julia Evans 8 months ago

How to add a directory to your PATH

I was talking to a friend about how to add a directory to your PATH today. It’s something that feels “obvious” to me since I’ve been using the terminal for a long time, but when I searched for instructions for how to do it, I actually couldn’t find something that explained all of the steps – a lot of them just said “add this to ”, but what if you’re not using bash? What if your bash config is actually in a different file? And how are you supposed to figure out which directory to add anyway? So I wanted to try to write down some more complete directions and mention some of the gotchas I’ve run into over the years. Here’s a table of contents: If you’re not sure what shell you’re using, here’s a way to find out. Run this: Also bash is the default on Linux and zsh is the default on Mac OS (as of 2024). I’ll only cover bash, zsh, and fish in these directions. Bash has three possible config files: , , and . If you’re not sure which one your system is set up to use, I’d recommend testing this way: (there are a lot of elaborate flow charts out there that explain how bash decides which config file to use but IMO it’s not worth it to internalize them and just testing is the fastest way to be sure) Let’s say that you’re trying to install and run a program called and it doesn’t work, like this: How do you find what directory is in? Honestly in general this is not that easy – often the answer is something like “it depends on how npm is configured”. A few ideas: Once you’ve found a directory you think might be the right one, make sure it’s actually correct! For example, I found out that on my machine, is in . I can make sure that it’s the right directory by trying to run the program in that directory like this: It worked! Now that you know what directory you need to add to your , let’s move to the next step! Now we have the 2 critical pieces of information we need: Now what you need to add depends on your shell: bash instructions: Open your shell’s config file, and add a line like this: (obviously replace with the actual directory you’re trying to add) zsh instructions: You can do the same thing as in bash, but zsh also has some slightly fancier syntax you can use if you prefer: fish instructions: In fish, the syntax is different: (in fish you can also use , some notes on that further down ) Now, an extremely important step: updating your shell’s config won’t take effect if you don’t restart it! Two ways to do this: I’ve found that both of these usually work fine. And you should be done! Try running the program you were trying to run and hopefully it works now. If not, here are a couple of problems that you might run into: If the wrong version of a program is running, you might need to add the directory to the beginning of your PATH instead of the end. For example, on my system I have two versions of installed, which I can see by running : The one your shell will use is the first one listed . If you want to use the Homebrew version, you need to add that directory ( ) to the beginning of your PATH instead, by putting this in your shell’s config file (it’s instead of the usual ) or in fish: All of these directions only work if you’re running the program from your shell . If you’re running the program from an IDE, from a GUI, in a cron job, or some other way, you’ll need to add the directory to your PATH in a different way, and the exact details might depend on the situation. 
In a cron job, some options: I’m honestly not sure how to handle it in an IDE/GUI because I haven’t run into that in a long time, will add directions here if someone points me in the right direction.

If you edit your path and start a new shell by running (or , or ), you’ll often end up with duplicate entries, because the shell keeps adding new things to your every time you start your shell. Personally I don’t think I’ve run into a situation where this kind of duplication breaks anything, but the duplicates can make it harder to debug what’s going on with your if you’re trying to understand its contents. Some ways you could deal with this: How to deduplicate your is shell-specific and there isn’t always a built-in way to do it, so you’ll need to look up how to accomplish it in your shell.

Here’s a situation that’s easy to get into in bash or zsh: This happens because in bash, by default, history is not saved until you exit the shell. Some options for fixing this:

When you install (Rust’s installer) for the first time, it gives you these instructions for how to set up your PATH, which don’t mention a specific directory at all. The idea is that you add that line to your shell’s config, and their script automatically sets up your (and potentially other things) for you. This is pretty common (for example Homebrew suggests you eval ), and there are two ways to approach this: I don’t think there’s anything wrong with doing what the tool suggests (it might be the “best way”!), but personally I usually use the second approach because I prefer knowing exactly what configuration I’m changing.

fish has a handy function called that you can run to add a directory to your like this: This is cool (it’s such a simple command!) but I’ve stopped using it for a couple of reasons:

Hopefully this will help some people. Let me know (on Mastodon or Bluesky) if there are other major gotchas that have tripped you up when adding a directory to your PATH, or if you have questions about this post!
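For reference, here’s roughly what the “add a line like this” step from this post looks like in each shell – the ~/.npm-global/bin directory is just a made-up example:

```bash
# bash (~/.bashrc) or zsh (~/.zshrc): append the directory to PATH
export PATH="$PATH:$HOME/.npm-global/bin"

# to put a directory FIRST so its programs win over other versions:
# export PATH="$HOME/.npm-global/bin:$PATH"

# zsh also has a slightly fancier array syntax:
#   path+=(~/.npm-global/bin)

# fish (~/.config/fish/config.fish):
#   set -gx PATH $PATH ~/.npm-global/bin
```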

Julia Evans 8 months ago

Some terminal frustrations

A few weeks ago I ran a terminal survey (you can read the results here ) and at the end I asked: What’s the most frustrating thing about using the terminal for you? 1600 people answered, and I decided to spend a few days categorizing all the responses. Along the way I learned that classifying qualitative data is not easy but I gave it my best shot. I ended up building a custom tool to make it faster to categorize everything. As with all of my surveys the methodology isn’t particularly scientific. I just posted the survey to Mastodon and Twitter, ran it for a couple of days, and got answers from whoever happened to see it and felt like responding. Here are the top categories of frustrations! I think it’s worth keeping in mind while reading these comments that These comments aren’t coming from total beginners. Here are the categories of frustrations! The number in brackets is the number of people with that frustration. I’m mostly writing this up for myself because I’m trying to write a zine about the terminal and I wanted to get a sense for what people are having trouble with. People talked about struggles remembering: One example comment: There are just so many little “trivia” details to remember for full functionality. Even after all these years I’ll sometimes forget where it’s 2 or 1 for stderr, or forget which is which for and . People talked about struggling with switching systems (for example home/work computer or when SSHing) and running into: as well as differences inside the same system like pagers being not consistent with each other (git diff pagers, other pagers). One example comment: I got used to fish and vi mode which are not available when I ssh into servers, containers. Lots of problems with color, like: This comment felt relatable to me: Getting my terminal theme configured in a reasonable way between the terminal emulator and fish (I did this years ago and remember it being tedious and fiddly and now feel like I’m locked into my current theme because it works and I dread touching any of that configuration ever again). Half of the comments on keyboard shortcuts were about how on Linux/Windows, the keyboard shortcut to copy/paste in the terminal is different from in the rest of the OS. Some other issues with keyboard shortcuts other than copy/paste: Aside from “the keyboard shortcut for copy and paste is different”, there were a lot of OTHER issues with copy and paste, like: There were lots of comments about this, which all came down to the same basic complaint – it’s hard to discover useful tools or features! This comment kind of summed it all up: How difficult it is to learn independently. Most of what I know is an assorted collection of stuff I’ve been told by random people over the years. A lot of comments about it generally having a steep learning curve. A couple of example comments: After 15 years of using it, I’m not much faster than using it than I was 5 or maybe even 10 years ago. That I know I could make my life easier by learning more about the shortcuts and commands and configuring the terminal but I don’t spend the time because it feels overwhelming. Some issues with shell history: One example comment: It wasted a lot of time until I figured it out and still annoys me that “history” on zsh has such a small buffer; I have to type “history 0” to get any useful length of history. People talked about: Here’s a representative comment: Finding good examples and docs. 
Man pages often not enough, have to wade through stack overflow A few issues with scrollback: One example comment: When resizing the terminal (in particular: making it narrower) leads to broken rewrapping of the scrollback content because the commands formatted their output based on the terminal window width. Lots of comments about how the terminal feels hampered by legacy decisions and how users often end up needing to learn implementation details that feel very esoteric. One example comment: Most of the legacy cruft, it would be great to have a green field implementation of the CLI interface. Lots of complaints about POSIX shell scripting. There’s a general feeling that shell scripting is difficult but also that switching to a different less standard scripting language (fish, nushell, etc) brings its own problems. Shell scripting. My tolerance to ditch a shell script and go to a scripting language is pretty low. It’s just too messy and powerful. Screwing up can be costly so I don’t even bother. Some more issues that were mentioned at least 10 times: There were also 122 answers to the effect of “nothing really” or “only that I can’t do EVERYTHING in the terminal” One example comment: Think I’ve found work arounds for most/all frustrations I’m not going to make a lot of commentary on these results, but here are a couple of categories that feel related to me: Trying to categorize all these results in a reasonable way really gave me an appreciation for social science researchers’ skills.

Julia Evans 9 months ago

What's involved in getting a "modern" terminal setup?

Hello! Recently I ran a terminal survey and I asked people what frustrated them. One person commented: There are so many pieces to having a modern terminal experience. I wish it all came out of the box. My immediate reaction was “oh, getting a modern terminal experience isn’t that hard, you just need to….”, but the more I thought about it, the longer the “you just need to…” list got, and I kept thinking about more and more caveats. So I thought I would write down some notes about what it means to me personally to have a “modern” terminal experience and what I think can make it hard for people to get there. Here are a few things that are important to me, with which part of the system is responsible for them: There are a million other terminal conveniences out there and different people value different things, but those are the ones that I would be really unhappy without. My basic approach is: A few things that affect my approach: What if you want a nice experience, but don’t want to spend a lot of time on configuration? Figuring out how to configure vim in a way that I was satisfied with really did take me like ten years, which is a long time! My best ideas for how to get a reasonable terminal experience with minimal config are: Personally I wouldn’t use xterm, rxvt, or Terminal.app as a terminal emulator, because I’ve found in the past that they’re missing core features (like 24-bit colour in Terminal.app’s case) that make the terminal harder to use for me. I don’t want to pretend that getting a “modern” terminal experience is easier than it is though – I think there are two issues that make it hard. Let’s talk about them! bash and zsh are by far the two most popular shells, and neither of them provide a default experience that I would be happy using out of the box, for example: And even though I love fish , the fact that it isn’t POSIX does make it hard for a lot of folks to make the switch. Of course it’s totally possible to learn how to customize your prompt in bash or whatever, and it doesn’t even need to be that complicated (in bash I’d probably start with something like , or maybe use starship ). But each of these “not complicated” things really does add up and it’s especially tough if you need to keep your config in sync across several systems. An extremely popular solution to getting a “modern” shell experience is oh-my-zsh . It seems like a great project and I know a lot of people use it very happily, but I’ve struggled with configuration systems like that in the past – it looks like right now the base oh-my-zsh adds about 3000 lines of config, and often I find that having an extra configuration system makes it harder to debug what’s happening when things go wrong. I personally have a tendency to use the system to add a lot of extra plugins, make my system slow, get frustrated that it’s slow, and then delete it completely and write a new config from scratch. In the terminal survey I ran recently, the most popular terminal text editors by far were , , and . I think the main options for terminal text editors are: The last issue is that sometimes individual programs that I use are kind of annoying. For example on my Mac OS machine, doesn’t support the keyboard shortcut. 
Fixing this to get a reasonable terminal experience in SQLite was a little complicated, I had to:

I find that debugging application-specific issues like this is really not easy and often it doesn’t feel “worth it” – often I’ll end up just dealing with various minor inconveniences because I don’t want to spend hours investigating them. The only reason I was even able to figure this one out at all is that I’ve been spending a huge amount of time thinking about the terminal recently.

A big part of having a “modern” experience using terminal programs is just using newer terminal programs, for example I can’t be bothered to learn a keyboard shortcut to sort the columns in , but in I can just click on a column heading with my mouse to sort it. So I use htop instead! But discovering new more “modern” command line tools isn’t easy (though I made a list here ), finding ones that I actually like using in practice takes time, and if you’re SSHed into another machine, they won’t always be there.

Something I find tricky about configuring my terminal to make everything “nice” is that changing one seemingly small thing about my workflow can really affect everything else. For example right now I don’t use tmux. But if I needed to use tmux again (for example because I was doing a lot of work SSHed into another machine), I’d need to think about a few things, like: and probably more things I haven’t thought of.

“Using tmux means that I have to change how I manage my colours” sounds unlikely, but that really did happen to me and I decided “well, I don’t want to change how I manage colours right now, so I guess I’m not using that feature!”. It’s also hard to remember which features I’m relying on – for example maybe my current terminal does have OSC 52 support and because copying from tmux over SSH has always Just Worked I don’t even realize that that’s something I need, and then it mysteriously stops working when I switch terminals.

Personally even though I think my setup is not that complicated, it’s taken me 20 years to get to this point! Because terminal config changes are so likely to have unexpected and hard-to-understand consequences, I’ve found that if I change a lot of terminal configuration all at once it makes it much harder to understand what went wrong if there’s a problem, which can be really disorienting. So I usually prefer to make pretty small changes, and accept that changes might take me a REALLY long time to get used to. For example I switched from using to eza a year or two ago and while I like it (because prints human-readable file sizes by default) I’m still not quite sure about it. But also sometimes it’s worth it to make a big change, like I made the switch to fish (from bash) 10 years ago and I’m very happy I did.

Trying to explain how “easy” it is to configure your terminal really just made me think that it’s kind of hard and that I still sometimes get confused. I’ve found that there’s never one perfect way to configure things in the terminal that will be compatible with every single other thing. I just need to try stuff, figure out some kind of locally stable state that works for me, and accept that if I start using a new tool it might disrupt the system and I might need to rethink things.
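As one small example of the kind of “not complicated, but it adds up” shell configuration mentioned above – a sketch, where the prompt string is just an illustration rather than a recommendation of any particular setup:

```bash
# a minimal coloured bash prompt (user@host:directory), added to ~/.bashrc;
# the \[ \] markers tell bash the colour codes take up no width on screen
PS1='\[\e[32m\]\u@\h\[\e[0m\]:\[\e[34m\]\w\[\e[0m\]\$ '

# or hand the whole problem to a prompt generator like starship:
# eval "$(starship init bash)"
```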

Julia Evans 10 months ago

"Rules" that terminal programs follow

Recently I’ve been thinking about how everything that happens in the terminal is some combination of: The first three (your operating system, shell, and terminal emulator) are all kind of known quantities – if you’re using bash in GNOME Terminal on Linux, you can more or less reason about how all of those things interact, and some of their behaviour is standardized by POSIX. But the fourth one (“whatever program you happen to be running”) feels like it could do ANYTHING. How are you supposed to know how a program is going to behave?

This post is kind of long so here’s a quick table of contents:

As far as I know, there are no real standards for how programs in the terminal should behave – the closest things I know of are: But even though there are no standards, in my experience programs in the terminal behave in a pretty consistent way. So I wanted to write down a list of “rules” that in my experience programs mostly follow.

My goal here isn’t to convince authors of terminal programs that they should follow any of these rules. There are lots of exceptions to these and often there’s a good reason for those exceptions. But it’s very useful for me to know what behaviour to expect from a random new terminal program that I’m using. Instead of “uh, programs could do literally anything”, it’s “ok, here are the basic rules I expect, and then I can keep a short mental list of exceptions”. So I’m just writing down what I’ve observed about how programs behave in my 20 years of using the terminal, why I think they behave that way, and some examples of cases where that rule is “broken”.

There are a bunch of common conventions that I think are pretty clearly the program’s responsibility to implement, like: But in this post I’m going to focus on things that it’s not 100% obvious are the program’s responsibility. For example it feels to me like a “law of nature” that pressing should quit a REPL, but programs often need to explicitly implement support for it – even though doesn’t need to implement support, does . (more about that in “rule 3” below) Understanding which things are the program’s responsibility makes it much less surprising when different programs’ implementations are slightly different.

The main reason for this rule is that noninteractive programs will quit by default on if they don’t set up a signal handler, so this is kind of a “you should act like the default” rule. Something that trips a lot of people up is that this doesn’t apply to interactive programs like or or . This is because in an interactive program, has a different job – if the program is running an operation (like for example a search in or some Python code in ), then will interrupt that operation but not stop the program. As an example of how this works in an interactive program: here’s the code in prompt-toolkit (the library that iPython uses for handling input) that aborts a search when you press .

TUI programs (like or ) will usually quit when you press . This rule doesn’t apply to any program where pressing to quit wouldn’t make sense, like or text editors.

REPLs (like or ) will usually quit when you press on an empty line. This rule is similar to the rule – the reason for this is that by default if you’re running a program (like ) in “cooked mode”, then the operating system will return an when you press on an empty line. Most of the REPLs I use (sqlite3, python3, fish, bash, etc) don’t actually use cooked mode, but they all implement this keyboard shortcut anyway to mimic the default behaviour.
For example, here’s the code in prompt-toolkit that quits when you press Ctrl-D, and here’s the same code in readline . I actually thought that this one was a “Law of Terminal Physics” until very recently because I’ve basically never seen it broken, but you can see that it’s just something that each individual input library has to implement in the links above. Someone pointed out that the Erlang REPL does not quit when you press , so I guess not every REPL follows this “rule”. Terminal programs rarely use colours other than the base 16 ANSI colours. This is because if you specify colours with a hex code, it’s very likely to clash with some users’ background colour. For example if I print out some text as , it would be almost invisible on a white background, though it would look fine on a dark background. But if you stick to the default 16 base colours, you have a much better chance that the user has configured those colours in their terminal emulator so that they work reasonably well with their background color. Another reason to stick to the default base 16 colours is that it makes less assumptions about what colours the terminal emulator supports. The only programs I usually see breaking this “rule” are text editors, for example Helix by default will use a purple background which is not a default ANSI colour. It seems fine for Helix to break this rule since Helix isn’t a “core” program and I assume any Helix user who doesn’t like that colorscheme will just change the theme. Almost every program I use supports keybindings if it would make sense to do so. For example, here are a bunch of different programs and a link to where they define to go to the end of the line: None of those programs actually uses directly, they just sort of mimic emacs/readline keybindings. They don’t always mimic them exactly : for example atuin seems to use as a prefix, so doesn’t go to the beginning of the line. Also all of these programs seem to implement their own internal cut and paste buffers so you can delete a line with and then paste it with . The exceptions to this are: I wrote more about this “what keybindings does a program support?” question in entering text in the terminal is complicated . I’ve never seen a program (other than a text editor) where doesn’t delete the last word. This is similar to the rule – by default if a program is in “cooked mode”, the OS will delete the last word if you press , and delete the whole line if you press . So usually programs will imitate that behaviour. I can’t think of any exceptions to this other than text editors but if there are I’d love to hear about them! Most programs will disable colours when writing to a pipe. For example: Both of those programs will also format their output differently when writing to the terminal: will organize files into columns, and ripgrep will group matches with headings. If you want to force the program to use colour (for example because you want to look at the colour), you can use to force the program’s output to be a tty like this: I’m sure that there are some programs that “break” this rule but I can’t think of any examples right now. Some programs have an flag that you can use to force colour to be on, in the example above you could also do . Usually if you pass to a program instead of a filename, it’ll read from stdin or write to stdout (whichever is appropriate). 
For example, if you want to format the Python code that’s on your clipboard with and then copy it, you could run: ( is a Mac program, you can do something similar on Linux with ) My impression is that most programs implement this if it would make sense and I can’t think of any exceptions right now, but I’m sure there are many exceptions.

These rules took me a long time to learn because I had to: A lot of my understanding of the terminal is honestly still in the “subconscious pattern recognition” stage. The only reason I’ve been taking the time to make things explicit at all is because I’ve been trying to explain how it works to others. Hopefully writing down these “rules” explicitly will make learning some of this stuff a little bit faster for others.
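To see the “most programs disable colours when writing to a pipe” rule from earlier in this post in action yourself, here’s a small sketch (assuming a GNU-style ls with --color support):

```bash
ls --color=auto          # writing to the terminal: coloured, arranged in columns
ls --color=auto | cat    # writing to a pipe: no colours, one file per line

# shell scripts can make the same check that programs do internally (isatty):
if [ -t 1 ]; then
  echo "stdout is a terminal, colours are probably fine"
else
  echo "stdout is a pipe or a file, better to skip the colours"
fi
```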

Julia Evans 10 months ago

Why pipes sometimes get "stuck": buffering

Here’s a niche terminal problem that has bothered me for years but that I never really understood until a few weeks ago. Let’s say you’re running this command to watch for some specific output in a log file: If log lines are being added to the file relatively slowly, the result I’d see is… nothing! It doesn’t matter if there were matches in the log file or not, there just wouldn’t be any output. I internalized this as “uh, I guess pipes just get stuck sometimes and don’t show me the output, that’s weird”, and I’d handle it by just running instead, which would work. So as I’ve been doing a terminal deep dive over the last few months I was really excited to finally learn exactly why this happens. The reason why “pipes get stuck” sometimes is that it’s VERY common for programs to buffer their output before writing it to a pipe or file. So the pipe is working fine, the problem is that the program never even wrote the data to the pipe! This is for performance reasons: writing all output immediately as soon as you can uses more system calls, so it’s more efficient to save up data until you have 8KB or so of data to write (or until the program exits) and THEN write it to the pipe. In this example: the problem is that is saving up all of its matches until it has 8KB of data to write, which might literally never happen. Part of why I found this so disorienting is that will work totally fine, but then when you add the second , it stops working!! The reason for this is that the way handles buffering depends on whether it’s writing to a terminal or not. Here’s how (and many other programs) decides to buffer its output: So if is writing directly to your terminal then you’ll see the line as soon as it’s printed, but if it’s writing to a pipe, you won’t. Of course the buffer size isn’t always 8KB for every program, it depends on the implementation. For the buffering is handled by libc, and libc’s buffer size is defined in the variable. Here’s where that’s defined in glibc . (as an aside: “programs do not use 8KB output buffers when writing to a terminal” isn’t, like, a law of terminal physics, a program COULD use an 8KB buffer when writing output to a terminal if it wanted, it would just be extremely weird if it did that, I can’t think of any program that behaves that way) One annoying thing about this buffering behaviour is that you kind of need to remember which commands buffer their output when writing to a pipe. Some commands that don’t buffer their output: I think almost everything else will buffer output, especially if it’s a command where you’re likely to be using it for batch processing. Here’s a list of some common commands that buffer their output when writing to a pipe, along with the flag that disables block buffering. Those are all the ones I can think of, lots of unix commands (like ) may or may not buffer their output but it doesn’t matter because can’t do anything until it finishes receiving input anyway. Also I did my best to test both the Mac OS and GNU versions of these but there are a lot of variations and I might have made some mistakes. Also, here are a few programming language where the default print statement will buffer output when writing to a pipe, and some ways to disable buffering if you want: I assume that these languages are designed this way so that the default print function will be fast when you’re doing batch processing. Also whether output is buffered or not might depend on how you print, for example in C++ buffers when writing to a pipe but will flush its output. 
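Here’s a tiny way to see this buffering behaviour for yourself – a sketch, and any command that trickles output slowly would work in place of the while loop:

```bash
# grep is writing straight to the terminal here, so each line shows up immediately:
while true; do date; sleep 1; done | grep :

# add one more pipe and grep is now writing to a pipe, so it buffers its output
# (several KB of it) and the pipeline looks "stuck" for a very long time:
while true; do date; sleep 1; done | grep : | cat
```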
Let’s say you’re running this command as a hacky way to watch for DNS requests to , and you forgot to pass to tcpdump: When you press , what happens? In a magical perfect world, what I would want to happen is for to flush its buffer, would search for , and I would see all the output I missed. But in the real world, what happens is that all the programs get killed and the output in ’s buffer is lost. I think this problem is probably unavoidable – I spent a little time with to see how this works and receives the before anyway so even if tried to flush its buffer would already be dead. After a little more investigation, there is a workaround: if you find ’s PID and , then tcpdump will flush the buffer so you can see the output. That’s kind of a pain but I tested it and it seems to work. It’s not just pipes, this will also buffer: Redirecting to a file doesn’t have the same “ will totally destroy the contents of the buffer” problem though – in my experience it usually behaves more like you’d want, where the contents of the buffer get written to the file before the program exits. I’m not 100% sure whether this is something you can always rely on or not. Okay, let’s talk solutions. Let’s say you’ve run this command: I asked people on Mastodon how they would solve this in practice and there were 5 basic approaches. Here they are: Historically my solution to this has been to just avoid the “command writing to pipe slowly” situation completely and instead run a program that will finish quickly like this: This doesn’t do the same thing as the original command but it does mean that you get to avoid thinking about these weird buffering issues. (you could also do but I often prefer to use an “unnecessary” ) You could remember that grep has a flag to avoid buffering and pass it like this: Some people said that if they’re specifically dealing with a multiple greps situation, they’ll rewrite it to use a single instead, like this: Or you would write a more complicated , like this: ( also buffers, so for this to work you’ll want to be the last command in the pipeline) uses LD_PRELOAD to turn off libc’s buffering, and you can use it to turn off output buffering like this: Like any solution it’s a bit unreliable – it doesn’t work on static binaries, I think won’t work if the program isn’t using libc’s buffering, and doesn’t always work on Mac OS. Harry Marr has a really nice How stdbuf works post. will force the program’s output to be a TTY, which means that it’ll behave the way it normally would on a TTY (less buffering, colour output, etc). You could use it in this example like this: Unlike it will always work, though it might have unwanted side effects, for example ’s will also colour matches. If you want to install unbuffer, it’s in the package. It’s a bit hard for me to say which one is “best”, I think personally I’m mostly likely to use because I know it’s always going to work. If I learn about more solutions I’ll try to add them to this post. I think it’s not very common for me to have a program that slowly trickles data into a pipe like this, normally if I’m using a pipe a bunch of data gets written very quickly, processed by everything in the pipeline, and then everything exits. The only examples I can come up with right now are: I think it would be cool if there were a standard environment variable to turn off buffering, like in Python. I got this idea from a couple of blog posts by Mark Dominus in 2018. Maybe like NO_COLOR ? 
The design seems tricky to get right; Mark points out that NETBSD has environment variables called , , etc which gives you a ton of control over buffering but I imagine most developers don’t want to implement many different environment variables to handle a relatively minor edge case. I’m also curious about whether there are any programs that just automatically flush their output buffers after some period of time (like 1 second). It feels like it would be nice in theory but I can’t think of any program that does that so I imagine there are some downsides. Some things I didn’t talk about in this post since these posts have been getting pretty long recently and seriously does anyone REALLY want to read 3000 words about buffering?
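To recap the workarounds above in one runnable form – the log file and search terms are made up; `--line-buffered` is a GNU grep flag, `stdbuf` comes from coreutils, and `unbuffer` comes from the expect package:

```bash
# 1. ask grep itself to flush after every line
tail -f /var/log/app.log | grep --line-buffered thing1 | grep thing2

# 2. use stdbuf (LD_PRELOAD) to force line buffering
tail -f /var/log/app.log | stdbuf -oL grep thing1 | grep thing2

# 3. use unbuffer to make grep think it's writing to a TTY
tail -f /var/log/app.log | unbuffer grep thing1 | grep thing2

# 4. collapse the two greps into a single awk that flushes each match
tail -f /var/log/app.log | awk '/thing1/ && /thing2/ { print; fflush() }'
```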

Julia Evans 11 months ago

Importing a frontend Javascript library without a build system

I like writing Javascript without a build system and for the millionth time yesterday I ran into a problem where I needed to figure out how to import a Javascript library in my code without using a build system, and it took FOREVER to figure out how to import it because the library’s setup instructions assume that you’re using a build system. Luckily at this point I’ve mostly learned how to navigate this situation and either successfully use the library or decide it’s too difficult and switch to a different library, so here’s the guide to importing Javascript libraries that I wish I’d had years ago. I’m only going to talk about using Javascript libraries on the frontend, and only about how to use them in a no-build-system setup. In this post I’m going to talk about:

There are 3 basic types of Javascript files a library can provide: I’m not sure if there’s a better name for the “classic” type but I’m just going to call it “classic”. Also there’s a type called “AMD” but I’m not sure how relevant it is in 2024. Now that we know the 3 types of files, let’s talk about how to figure out which of these the library actually provides!

Every Javascript library has a build which it uploads to NPM. You might be thinking (like I did originally) – Julia! The whole POINT is that we’re not using Node to build our library! Why are we talking about NPM? But if you’re using a link from a CDN like https://cdnjs.cloudflare.com/ajax/libs/Chart.js/4.4.1/chart.umd.min.js , you’re still using the NPM build! All the files on the CDNs originally come from NPM. Because of this, I sometimes like to the library even if I’m not planning to use Node to build my library at all – I’ll just create a new temp folder, there, and then delete it when I’m done. I like being able to poke around in the files in the NPM build on my filesystem, because then I can be 100% sure that I’m seeing everything that the library is making available in its build and that the CDN isn’t hiding something from me. So let’s a few libraries and try to figure out what types of Javascript files they provide in their builds!

First let’s look inside Chart.js , a plotting library. This library seems to have 3 basic options:

Option 1: . The suffix tells me that this is a CommonJS file , for using in Node. This means it’s impossible to use it directly in the browser without some kind of build step.

Option 2: . The suffix by itself doesn’t tell us what kind of file it is, but if I open it up, I see which is an immediate sign that this is an ES module – the syntax is ES module syntax.

Option 3: . “UMD” stands for “Universal Module Definition”, which I think means that you can use this file either with a basic , CommonJS, or some third thing called AMD that I don’t understand.

When I was using Chart.js I picked Option 3. I just needed to add this to my code: and then I could use the library with the global environment variable. Couldn’t be easier. I just copied into my Git repository so that I didn’t have to worry about using NPM or the CDNs going down or anything.

A lot of libraries will put their build in the directory, but not always! The build files’ location is specified in the library’s . For example here’s an excerpt from Chart.js’s . I think this is saying that if you want to use an ES Module ( ) you should use , but the jsDelivr and unpkg CDNs should use . I guess is for Node. ’s also says , which according to this documentation tells Node to treat files as ES modules by default.
I think it doesn’t tell us specifically which files are ES modules and which ones aren’t but it does tell us that something in there is an ES module.

is a library for logging into Bluesky with OAuth in the browser. Let’s see what kinds of Javascript files it provides in its build! It seems like the only plausible root file in here is , which looks something like this: This syntax means it’s an ES module . That means we can use it in the browser without a build step! Let’s see how to do that.

Using an ES module isn’t as easy as just adding a . Instead, if the ES module has dependencies (like does) the steps are: The reason we need an import map instead of just doing something like is that internally the module has more import statements like , and we need to tell the browser where to get the code for and all of its other dependencies. Here’s what the importmap I used looks like for :

Getting these import maps to work is pretty fiddly, I feel like there must be a tool to generate them automatically but I haven’t found one yet. It’s definitely possible to write a script that automatically generates the importmaps using esbuild’s metafile but I haven’t done that and maybe there’s a better way. I decided to set up importmaps yesterday to get github.com/jvns/bsky-oauth-example to work, so there’s some example code in that repo. Also someone pointed me to Simon Willison’s download-esm , which will download an ES module and rewrite the imports to point to the JS files directly so that you don’t need importmaps. I haven’t tried it yet but it seems like a great idea.

I did run into some problems with using importmaps in the browser though – it needed to download dozens of Javascript files to load my site, and my webserver in development couldn’t keep up for some reason. I kept seeing files fail to load randomly and then had to reload the page and hope that they would succeed this time. It wasn’t an issue anymore when I deployed my site to production, so I guess it was a problem with my local dev environment. Also one slightly annoying thing about ES modules in general is that you need to be running a webserver to use them, I’m sure this is for a good reason but it’s easier when you can just open your file without starting a webserver. Because of the “too many files” thing I think actually using ES modules with importmaps in this way isn’t actually that appealing to me, but it’s good to know it’s possible.

If the ES module doesn’t have dependencies then it’s even easier – you don’t need the importmaps! You can just: If you don’t want to use importmaps, you can also use a build system like esbuild . I talked about how to do that in Some notes on using esbuild , but this blog post is about ways to avoid build systems completely so I’m not going to talk about that option here. I do still like esbuild though and I think it’s a good option in this case.

CanIUse says that importmaps are in “Baseline 2023: newly available across major browsers” so my sense is that in 2024 that’s still maybe a little bit too new? I think I would use importmaps for some fun experimental code that I only wanted like myself and 12 people to use, but if I wanted my code to be more widely usable I’d use instead.

Let’s look at one final example library! This is a different Bluesky auth library than . Again, it seems like the only real candidate file here is . But this is a different situation from the previous example library!
Let’s take a look at : There’s a bunch of stuff like this in : This syntax is CommonJS syntax, which means that we can’t use this file in the browser at all, we need to use some kind of build step, and ESBuild won’t work either. Also in this library’s it says which is another way to tell it’s CommonJS. Originally I thought it was impossible to use CommonJS modules without learning a build system, but then someone Bluesky told me about esm.sh ! It’s a CDN that will translate anything into an ES Module. skypack.dev does something similar, I’m not sure what the difference is but one person mentioned that if one doesn’t work sometimes they’ll try the other one. For using it seems pretty simple, I just need to put this in my HTML: and then put this in . It seems to Just Work, which is cool! Of course this is still sort of using a build system – it’s just that esm.sh is running the build instead of me. My main concerns with this approach are: I also learned that you can also use to convert a CommonJS module into an ES module, though there are some limitations – the syntax doesn’t work. Here’s a github issue about that . I think the approach is probably more appealing to me than the approach because it’s a tool that I already have on my computer so I trust it more. I haven’t experimented with this much yet though. Here’s a summary of the three types of JS files you might encounter, options for how to use them, and how to identify them. Unhelpfully a or file extension could be any of these 3 options, so if the file is you need to do more detective work to figure out what you’re dealing with. The main difference between CommonJS modules and ES modules from my perspective is that ES modules are actually a standard. This makes me feel a lot more confident using them, because browsers commit to backwards compatibility for web standards forever – if I write some code using ES modules today, I can feel sure that it’ll still work the same way in 15 years. It also makes me feel better about using tooling like because even if the esbuild project dies, because it’s implementing a standard it feels likely that there will be another similar tool in the future that I can replace it with. A lot of the time when I talk about this stuff I get responses like “I hate javascript!!! it’s the worst!!!”. But my experience is that there are a lot of great tools for Javascript (I just learned about https://esm.sh yesterday which seems great! I love esbuild!), and that if I take the time to learn how things works I can take advantage of some of those tools and make my life a lot easier. So the goal of this post is definitely not to complain about Javascript, it’s to understand the landscape so I can use the tooling in a way that feels good to me. Here are some questions I still have, I’ll add the answers into the post if I learn the answer. Here’s a list of every tool we talked about in this post: Writing this post has made me think that even though I usually don’t want to have a build that I run every time I update the project, I might be willing to have a build step (using or something) that I run only once when setting up the project and never run again except maybe if I’m updating my dependency versions. Thanks to Marco Rogers who taught me a lot of the things in this post. I’ve probably made some mistakes in this post and I’d love to know what they are – let me know on Bluesky or Mastodon!

Julia Evans 11 months ago

New microblog with TILs

I added a new section to this site a couple weeks ago called TIL (“today I learned”). One kind of thing I like to post on Mastodon/Bluesky is “hey, here’s a cool thing”, like the great SQLite repl litecli , or the fact that cross compiling in Go Just Works and it’s amazing, or cryptographic right answers , or this great diff tool . Usually I don’t want to write a whole blog post about those things because I really don’t have much more to say than “hey this is useful!” It started to bother me that I didn’t have anywhere to put those things: for example recently I wanted to use diffdiff and I just could not remember what it was called. So I quickly made a new folder called /til/ , added some custom styling (I wanted to style the posts to look a little bit like a tweet), made a little Rake task to help me create new posts quickly ( ), and set up a separate RSS Feed for it. I think this new section of the blog might be more for myself than anything, now when I forget the link to Cryptographic Right Answers I can hopefully look it up on the TIL page. (you might think “julia, why not use bookmarks??” but I have been failing to use bookmarks for my whole life and I don’t see that changing ever, putting things in public is for whatever reason much easier for me) So far it’s been working, often I can actually just make a quick post in 2 minutes which was the goal. My page is inspired by Simon Willison’s great TIL blog , though my TIL posts are a lot shorter. This came about because I spent a lot of time on Twitter, so I’ve been thinking about what I want to do about all of my tweets. I keep reading the advice to “POSSE” (“post on your own site, syndicate elsewhere”), and while I find the idea appealing in principle, for me part of the appeal of social media is that it’s a little bit ephemeral. I can post polls or questions or observations or jokes and then they can just kind of fade away as they become less relevant. I find it a lot easier to identify specific categories of things that I actually want to have on a Real Website That I Own: and then let everything else be kind of ephemeral. I really believe in the advice to make email lists though – the first two (blog posts & comics) both have email lists and RSS feeds that people can subscribe to if they want. I might add a quick summary of any TIL posts from that week to the “blog posts from this week” mailing list.

Julia Evans 11 months ago

ASCII control characters in my terminal

Hello! I’ve been thinking about the terminal a lot and yesterday I got curious about all these “control codes”, like , , , etc. What’s the deal with all of them? Here’s a table of all 33 ASCII control characters, and what they do on my machine (on Mac OS), more or less. There are about a million caveats, but I’ll talk about what it means and all the problems with this diagram that I know about. You can also view it as an HTML page (I just made it an image so it would show up in RSS). The first surprising thing about this diagram to me is that there are 33 control codes, split into (very roughly speaking) these categories: There’s no real structure to which codes are in which categories, they’re all just kind of randomly scattered because this evolved organically. (If you’re curious about readline, I wrote more about readline in entering text in the terminal is complicated , and there are a lot of cheat sheets out there ) Something else that I find a little surprising is that are only 33 control codes – A to Z, plus 7 more ( ). This means that if you want to have for example as a keyboard shortcut in a terminal application, that’s not really meaningful – on my machine at least is exactly the same thing as just pressing , is the same as , etc. Also isn’t a control code – what it does depends on your terminal emulator. On Linux is often used by the terminal emulator to copy or open a new tab or paste for example, it’s not sent to the TTY at all. Also I use all the time, but that isn’t a control code, instead it sends an ANSI escape sequence ( ) which is a different thing which we absolutely do not have space for in this post. This “there are only 33 codes” thing is totally different from how keyboard shortcuts work in a GUI where you can have for any key you want. Each of these 33 control codes has a name in ASCII (for example is ). When all of these control codes were originally defined, they weren’t being used for computers or terminals at all, they were used for the telegraph machine . Telegraph machines aren’t the same as UNIX terminals so a lot of the codes were repurposed to mean something else. Personally I don’t find these ASCII names very useful, because 50% of the time the name in ASCII has no actual relationship to what that code does on UNIX systems today. So it feels easier to just ignore the ASCII names completely instead of trying to figure which ones still match their original meaning. Another thing that’s a bit weird is that is literally the same as , and is the same as , which makes it hard to use those two as keyboard shortcuts. From some quick research, it seems like some folks do still use and as keyboard shortcuts ( here’s an example ), but to do that you need to configure your terminal emulator to treat them differently than the default. For me the main takeaway is that if I ever write a terminal application I should avoid and as keyboard shortcuts in it. While writing this I needed to do a bunch of experimenting to figure out what various key combinations did, so I wrote this Python script echo-key.py that will print them out. There’s probably a more official way but I appreciated having a script I could customize. Two of these codes ( and ) are labelled in the table as “handled by the OS”, but actually they’re not always handled by the OS, it depends on whether the terminal is in “canonical” mode or in “noncanonical mode”. In canonical mode , programs only get input when you press (and the OS is in charge of deleting characters when you press or ). 
But in noncanonical mode the program gets input immediately when you press a key, and the and codes are passed through to the program to handle any way it wants. Generally in noncanonical mode the program will handle and similarly to how the OS does, but there are some small differences. Some examples of programs that use canonical mode: Examples of programs that use noncanonical mode: I said that sends but technically this is not necessarily true, if you really want to you can remap all of the codes labelled “OS terminal driver”, plus Backspace, using a tool called , and you can view the mappings with . Here are the mappings on my machine right now: I have personally never remapped any of these and I cannot imagine a reason I would (I think it would be a recipe for confusion and disaster for me), but I asked on Mastodon and people said the most common reasons they used were: Two signals caveats: You can see which terminal modes a program is setting using like this, terminal modes are set with the system call: here are the modes sets when it starts ( and are missing!): and it resets the modes when it exits: I think the specific combination of modes vim is using here might be called “raw mode”, man cfmakeraw talks about that. Related to “there are only 33 codes”, there are a lot of conflicts where different parts of the system want to use the same code for different things, for example by default will freeze your screen, but if you turn that off then will use to do a forward search. Another example is that on my machine sometimes will send and sometimes it’ll transpose 2 characters and sometimes it’ll do something completely different depending on: In this diagram I’ve labelled code 127 as “backspace” and 8 as “other backspace”. Uh, what? I think this was the single biggest topic of discussion in the replies on Mastodon – apparently there’s a LOT of history to this and I’d never heard of any of it before. First, here’s how it works on my machine: If I press , it has the same effect as if I’m using readline, but in a program without readline support (like for instance), it just prints out . Apparently Step 2 above is different for some folks – their key sends the byte instead of , and so if they want Backspace to work then they need to configure the OS (using ) to set . There’s an incredible section of the Debian Policy Manual on keyboard configuration that describes how and should work according to Debian policy, which seems very similar to how it works on my Mac today. My understanding (via this mastodon post ) is that this policy was written in the 90s because there was a lot of confusion about what should do in the 90s and there needed to be a standard to get everything to work. There’s a bunch more historical terminal stuff here but that’s all I’ll say for now. I’ve probably missed a bunch more ways that “how it works on my machine” might be different from how it works on other people’s machines, and I’ve probably made some mistakes about how it works on my machine too. But that’s all I’ve got for today. Some more stuff I know that I’ve left out: according to is “discard”, is “reprint”, and is “dsusp”. I have no idea how to make those actually do anything (pressing them does not do anything obvious, and some people have told me what they used to do historically but it’s not clear to me if they have a use in 2024), and a lot of the time in practice they seem to just be passed through to the application anyway so I just labelled and as . 
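If you want to experiment with this yourself, here's a rough sketch of the same idea as that echo-key.py script, but in Go using golang.org/x/term – this is my own reconstruction of the idea, not the original Python script. It puts the terminal into noncanonical ("raw") mode and prints the bytes each keypress sends:

    // keyecho.go: print the raw bytes each keypress sends, until you press 'q'.
    package main

    import (
        "fmt"
        "os"

        "golang.org/x/term"
    )

    func main() {
        fd := int(os.Stdin.Fd())
        // Switch the terminal into raw (noncanonical) mode: we get every byte
        // immediately, and keys like Ctrl-C arrive as bytes instead of signals.
        oldState, err := term.MakeRaw(fd)
        if err != nil {
            panic(err)
        }
        defer term.Restore(fd, oldState)

        buf := make([]byte, 16)
        for {
            n, err := os.Stdin.Read(buf)
            if err != nil {
                return
            }
            // In raw mode "\n" isn't translated to "\r\n", so print both ourselves.
            fmt.Printf("got bytes: %v\r\n", buf[:n])
            if n == 1 && buf[0] == 'q' {
                return
            }
        }
    }

Running it, pressing Ctrl-C should print [3], Backspace should print [127], and an arrow key should print a little escape sequence like [27 91 65].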
Also I want to say that I think the contents of this post are kind of interesting but I don’t think they’re necessarily that useful . I’ve used the terminal pretty successfully every day for the last 20 years without knowing literally any of this – I just knew what , , , , did in practice (plus maybe , and ) and did not worry about the details for the most part, and that was almost always totally fine except when I was trying to use xterm.js . But I had fun learning about it so maybe it’ll be interesting to you too.

Julia Evans 11 months ago

Using less memory to look up IP addresses in Mess With DNS

I’ve been having problems for the last 3 years or so where Mess With DNS periodically runs out of memory and gets OOM killed. This hasn’t been a big priority for me: usually it just goes down for a few minutes while it restarts, and it only happens once a day at most, so I’ve just been ignoring. But last week it started actually causing a problem so I decided to look into it. This was kind of winding road where I learned a lot so here’s a table of contents: I run Mess With DNS on a VM without about 465MB of RAM, which according to (the column) is split up something like: That leaves about 110MB of memory free. A while back I set GOMEMLIMIT to 250MB to try to make sure the garbage collector ran if Mess With DNS used more than 250MB of memory, and I think this helped but it didn’t solve everything. A few weeks ago I started backing up Mess With DNS’s database for the first time using restic . This has been working okay, but since Mess With DNS operates without much extra memory I think sometimes needed more memory than was available on the system, and so the backup script sometimes got OOM killed. This was a problem because There’s probably more than one solution to this, but I decided to try to make Mess With DNS use less memory so that there was more available memory on the system, mostly because it seemed like a fun problem to try to solve. I’d run a memory profile of Mess With DNS a bunch of times in the past, so I knew exactly what was using most of Mess With DNS’s memory: IP addresses. When it starts, Mess With DNS loads this database where you can look up the ASN of every IP address into memory, so that when it receives a DNS query it can take the source IP address like and tell you that IP address belongs to . This database by itself used about 117MB of memory, and a simple told me that was too much – the original text files were only 37MB! The way it worked originally is that I had an array of these: and I searched through it with a binary search to figure out if any of the ranges contained the IP I was looking for. Basically the simplest possible thing and it’s super fast, my machine can do about 9 million lookups per second. I’ve been using SQLite recently, so my first thought was – maybe I can store all of this data on disk in an SQLite database, give the tables an index, and that’ll use less memory. This did solve the initial memory goal (after a GC it now hardly used any memory at all because the table was on disk!), though I’m not sure how much GC churn this solution would cause if we needed to do a lot of queries at once. I did a quick memory profile and it seemed to allocate about 1KB of memory per lookup. Let’s talk about the issues I ran into with using SQLite though. SQLite doesn’t have support for big integers and IPv6 addresses are 128 bits, so I decided to store them as text. I think might have been better, I originally thought s couldn’t be compared but the sqlite docs say they can. I ended up with this schema: Also I learned that Python has an module, so I could use to make sure that the IPv6 addresses were expanded so that a string comparison would compare them properly. I ran a quick microbenchmark, something like this. It printed out that it could look up 17,000 IPv6 addresses per second, and similarly for IPv4 addresses. 
This was pretty discouraging – being able to look up 17k addresses per section is kind of fine (Mess With DNS does not get a lot of traffic), but I compared it to the original binary search code and the original code could do 9 million per second. I’d never really done an EXPLAIN in sqlite, so I thought it would be a fun opportunity to see what the query plan was doing. It looks like it’s just using the index and not the index, so maybe it makes sense that it’s slower than the binary search. I tried to figure out if there was a way to make SQLite use both indexes, but I couldn’t find one and maybe it knows best anyway. At this point I gave up on the SQLite solution, I didn’t love that it was slower and also it’s a lot more complex than just doing a binary search. I felt like I’d rather keep something much more similar to the binary search. A few things I tried with SQLite that did not cause it to use both indexes: My next idea was to use a trie , because I had some vague idea that maybe a trie would use less memory, and I found this library called ipaddress-go that lets you look up IP addresses using a trie. I tried using it here’s the code , but I think I was doing something wildly wrong because, compared to my naive array + binary search: I’m not really sure what went wrong here but I gave up on this approach and decided to just try to make my array use less memory and stick to a simple binary search. One thing I learned about memory profiling is that you can use package to see how much memory is currently allocated in the program. That’s how I got all the memory numbers in this post. Here’s the code: Also I learned that if you use to analyze a heap profile there are two ways to analyze it: you can pass either or to . I don’t know how I didn’t realize this before but will tell you about everything that was allocated, and will just include memory that’s currently in use. Anyway I ran a lot. Also every time I use pprof I find myself referring to my own intro to pprof , it’s probably the blog post I wrote that I use the most often. I should add and to it. I was storing my ip2asn entries like this: I had 3 ideas for ways to improve this: I figured I could store the ASN info in an array, and then just store the index into the array in my struct. Here are the structs so you can see what I mean: This worked! It brought memory usage from 117MB to 65MB – a 50MB savings. I felt good about this. Here’s all of the code for that part . As an aside – I’m storing the ASN in a , is that right? I looked in the ip2asn file and the biggest one seems to be 401307, though there are a few lines that say which is much bigger, but also are just inside the range of a uint32. So I can definitely use a . It turns out that I’m not the only one who felt that was using an unnecessary amount of memory – in 2021 the folks at Tailscale released a new IP address library for Go which solves this and many other issues. They wrote a great blog post about it . I discovered (to my delight) that not only does this new IP address library exist and do exactly what I want, it’s also now in the Go standard library as netip.Addr . Switching to was very easy and saved another 20MB of memory, bringing us to 46MB. I didn’t try my third idea (remove the end IP from the struct) because I’d already been programming for long enough on a Saturday morning and I was happy with my progress. 
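To make that "store an index into an ASN table" layout concrete, here's a rough Go sketch of the idea using netip.Addr – the names are mine, not the actual Mess With DNS code:

    package main

    import (
        "fmt"
        "net/netip"
        "sort"
    )

    // One entry per range in the ip2asn file. Instead of repeating the ASN's
    // name/country in every range, each range stores a small index into a
    // shared table of ASN info.
    type ipRange struct {
        StartIP netip.Addr
        EndIP   netip.Addr
        ASNIdx  uint32 // index into ip2asnDB.asns
    }

    type asnInfo struct {
        ASN     uint32
        Country string
        Name    string
    }

    type ip2asnDB struct {
        ranges []ipRange // sorted by StartIP
        asns   []asnInfo // deduplicated ASN metadata
    }

    // lookup binary-searches for the range containing addr.
    func (db *ip2asnDB) lookup(addr netip.Addr) (asnInfo, bool) {
        // find the first range whose StartIP is after addr...
        i := sort.Search(len(db.ranges), func(i int) bool {
            return addr.Less(db.ranges[i].StartIP)
        })
        if i == 0 {
            return asnInfo{}, false
        }
        // ...then the candidate is the range right before it
        r := db.ranges[i-1]
        if !r.EndIP.Less(addr) { // addr <= EndIP
            return db.asns[r.ASNIdx], true
        }
        return asnInfo{}, false
    }

    func main() {
        db := &ip2asnDB{
            asns: []asnInfo{{ASN: 13335, Country: "US", Name: "CLOUDFLARENET"}},
            ranges: []ipRange{{
                StartIP: netip.MustParseAddr("1.1.1.0"),
                EndIP:   netip.MustParseAddr("1.1.1.255"),
                ASNIdx:  0,
            }},
        }
        info, ok := db.lookup(netip.MustParseAddr("1.1.1.1"))
        fmt.Println(ok, info.Name) // true CLOUDFLARENET
    }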
It’s always such a great feeling when I think “hey, I don’t like this, there must be a better way” and then immediately discover that someone has already made the exact thing I want, thought about it a lot more than me, and implemented it much better than I would have. Even though I tried to explain this in a simple linear way “I tried X, then I tried Y, then I tried Z”, that’s kind of a lie – I always try to take my actual debugging process (total chaos) and make it seem more linear and understandable because the reality is just too annoying to write down. It’s more like: Someone asked why I don’t just give the VM more memory. I could very easily afford to pay for a VM with 1GB of memory, but I feel like 512MB really should be enough (and really that 256MB should be enough!) so I’d rather stay inside that constraint. It’s kind of a fun puzzle. Folks had a lot of good ideas I hadn’t thought of. Recording them as inspiration if I feel like having another Fun Performance Day at some point. I deployed the new version and now Mess With DNS is using less memory! Hooray! A few other notes: I’m honestly not sure if this will solve all my memory problems, probably not! But I had fun, I learned a few things about SQLite, I still don’t know what to think about tries, and it made me love binary search even more than I already did.

Julia Evans 1 year ago

Some notes on upgrading Hugo

This seems to be discussed in the release notes for 0.57.2 I just needed to replace with in the template on the homepage as well as in my RSS feed template. I had this comment in the part of my theme where I link to the next/previous blog post: “next” and “previous” in hugo apparently mean the opposite of what I’d think they’d mean intuitively. I’d expect “next” to mean “in the future” and “previous” to mean “in the past” but it’s the opposite It looks they changed this in ad705aac064 so that “next” actually is in the future and “prev” actually is in the past. I definitely find the new behaviour more intuitive. Figuring out why/when all of these changes happened was a little difficult. I ended up hacking together a bash script to download all of the changelogs from github as text files , which I could then grep to try to figure out what happened. It turns out it’s pretty easy to get all of the changelogs from the GitHub API. So far everything was not so bad – there was also a change around taxonomies that’s I can’t quite explain, but it was all pretty manageable, but then we got to the really tough one: the markdown renderer. The blackfriday markdown renderer (which was previously the default) was removed in v0.100.0 . This seems pretty reasonable: It has been deprecated for a long time, its v1 version is not maintained anymore, and there are many known issues. Goldmark should be a mature replacement by now. Fixing all my Markdown changes was a huge pain – I ended up having to update 80 different Markdown files (out of 700) so that they would render properly, and I’m not totally sure The obvious question here is – why bother even trying to upgrade Hugo at all if I have to switch Markdown renderers? My old site was running totally fine and I think it wasn’t necessarily a good use of time, but the one reason I think it might be useful in the future is that the new renderer (goldmark) uses the CommonMark markdown standard , which I’m hoping will be somewhat more futureproof. So maybe I won’t have to go through this again? We’ll see. Also it turned out that the new Goldmark renderer does fix some problems I had (but didn’t know that I had) with smart quotes and how lists/blockquotes interact. The hard part of this Markdown change was even figuring out what changed. Almost all of the problems (including #2 and #3 above) just silently broke the site, they didn’t cause any errors or anything. So I had to diff the HTML to hunt them down. Here’s what I ended up doing: (the thing is searching for red/green text in the diff) This was very time consuming but it was a little bit fun for some reason so I kept doing it until it seemed like nothing too horrible was left. Here’s a list of every type of Markdown change I had to make. It’s very possible these are all extremely specific to me but it took me a long time to figure them all out so maybe this will be helpful to one other person who finds this in the future. This doesn’t work anymore (it doesn’t expand the link): I need to do this instead: This works too: I didn’t want this so I needed to configure: This doesn’t render as a nested list anymore if I only indent by 2 spaces, I need to put 4 spaces. The problem is that the amount of indent needed depends on the size of the list markers. Here’s a reference in CommonMark for this . Previously the here didn’t render as a blockquote, and with the new renderer it does. 
I found a bunch of Markdown that had been kind of broken (which I hadn’t noticed) that works better with the new renderer, and this is an example of that. Lists inside blockquotes also seem to work better. Previously this didn’t render as a heading, but now it does. So I needed to replace the with . I had something which looked like this: With Blackfriday it rendered like this: and with Goldmark it rendered like this: Same thing if there was an accidental at the beginning of a line, like in this Markdown snippet To fix this I just had to rewrap the line so that the wasn’t the first character. The Markdown is formatted this way because I wrap my Markdown to 80 characters a lot and the wrapping isn’t very context sensitive. There were a bunch of places where the old renderer (Blackfriday) was doing unwanted things in code blocks like replacing with or replacing quotes with smart quotes. I hadn’t realized this was happening and I was very happy to have it fixed. The way this gets rendered got better: Before there were two left smart quotes, now the quotes match. Previously if I had an image like this: it would get wrapped in a tag, now it doesn’t anymore. I dealt with this just by adding a to images in the CSS, hopefully that’ll make them display well enough. Previously this wouldn’t get wrapped in a tag, but now it seems to: I just gave up on fixing this though and resigned myself to maybe having some extra space in some cases. Maybe I’ll try to fix it later if I feel like another yakshave. I also needed to Here’s what I needed to add to my to do all that: Maybe I’ll try to get syntax highlighting working one day, who knows. I might prefer having it off though. I also wrote a little program to compare the Blackfriday and Goldmark output for various markdown snippets, here it is in a gist . It’s not really configured the exact same way Blackfriday and Goldmark were in my Hugo versions, but it was still helpful to have to help me understand what was going on. My approach to themes in Hugo has been: So I just need to edit the theme files to fix any problems. Also I wrote a lot of the theme myself so I’m pretty familiar with how it works. Relying on someone else to keep a theme updated feels kind of scary to me, I think if I were using a third-party theme I’d just copy the code into my site’s github repo and then maintain it myself. I asked on Mastodon if anyone had used a static site generator with good backwards compatibility. The main answers seemed to be Jekyll and 11ty. Several people said they’d been using Jekyll for 10 years without any issues, and 11ty says it has stability as a core goal . I think a big factor in how appealing Jekyll/11ty are is how easy it is for you to maintain a working Ruby / Node environment on your computer: part of the reason I stopped using Jekyll was that I got tired of having to maintain a working Ruby installation. But I imagine this wouldn’t be a problem for a Ruby or Node developer. Several people said that they don’t build their Jekyll site locally at all – they just use GitHub Pages to build it. Overall I’ve been happy with Hugo – I started using it because it had fast build times and it was a static binary, and both of those things are still extremely useful to me. I might have spent 10 hours on this upgrade, but I’ve probably spent 1000+ hours writing blog posts without thinking about Hugo at all so that seems like an extremely reasonable ratio. 
I find it hard to be too mad about the backwards incompatible changes, most of them were quite a long time ago, Hugo does a great job of making their old releases available so you can use the old release if you want, and the most difficult one is removing support for the Markdown renderer in favour of using something CommonMark-compliant which seems pretty reasonable to me even if it is a huge pain. But it did take a long time and I don’t think I’d particularly recommend moving 700 blog posts to a new Markdown renderer unless you’re really in the mood for a lot of computer suffering for some reason. The new renderer did fix a bunch of problems so I think overall it might be a good thing, even if I’ll have to remember to make 2 changes to how I write Markdown (4.1 and 4.3). Also I’m still using Hugo 0.54 for https://wizardzines.com so maybe these notes will be useful to Future Me if I ever feel like upgrading Hugo for that site. Hopefully I didn’t break too many things on the blog by doing this, let me know if you see anything broken!

Julia Evans 1 year ago

Terminal colours are tricky

Yesterday I was thinking about how long it took me to get a colorscheme in my terminal that I was mostly happy with (SO MANY YEARS), and it made me wonder what about terminal colours made it so hard. So I asked people on Mastodon what problems they’ve run into with colours in the terminal, and I got a ton of interesting responses! Let’s talk about some of the problems and a few possible ways to fix them. One of the top complaints was “blue on black is hard to read”. Here’s an example of that: if I open Terminal.app, set the background to black, and run , the directories are displayed in a blue that isn’t that easy to read: To understand why we’re seeing this blue, let’s talk about ANSI colours! Your terminal has 16 numbered colours – black, red, green, yellow, blue, magenta, cyan, white, and “bright” version of each of those. Programs can use them by printing out an “ANSI escape code” – for example if you want to see each of the 16 colours in your terminal, you can run this Python program: This made me wonder – if blue is colour number 5, who decides what hex color that should correspond to? The answer seems to be “there’s no standard, terminal emulators just choose colours and it’s not very consistent”. Here’s a screenshot of a table from Wikipedia , where you can see that there’s a lot of variation: Bright yellow on white is even worse than blue on black, here’s what I get in a terminal with the default settings: That’s almost impossible to read (and some other colours like light green cause similar issues), so let’s talk about solutions! If you’re annoyed by these colour contrast issues (or maybe you just think the default ANSI colours are ugly), you might think – well, I’ll just choose a different “blue” and pick something I like better! There are two ways you can do this: Way 1: Configure your terminal emulator : I think most modern terminal emulators have a way to reconfigure the colours, and some of them even come with some preinstalled themes that you might like better than the defaults. Way 2: Run a shell script : There are ANSI escape codes that you can print out to tell your terminal emulator to reconfigure its colours. Here’s a shell script that does that , from the base16-shell project. You can see that it has a few different conventions for changing the colours – I guess different terminal emulators have different escape codes for changing their colour palette, and so the script is trying to pick the right style of escape code based on the environment variable. I prefer to use the “shell script” method, because: some advantages of configuring colours in your terminal emulator: This is what my shell has looked like for probably the last 5 years (using the solarized light base16 theme), and I’m pretty happy with it. Here’s : Okay, so let’s say you’ve found a terminal colorscheme that you like. What else can go wrong? Here’s what some output of , a alternative, looks like in my colorscheme: The contrast is pretty bad here, and I definitely don’t have that lime green in my normal colorscheme. What’s going on? We can see what color codes is using using the program to capture its output including the color codes: means “set the foreground color to color ”. Terminals don’t only have 16 colours – many terminals these days actually have 3 ways of specifying colours: So is using one of the colours from the extended 256-color set. (a alternative) does something similar – here’s what it looks like by default in my terminal. 
This looks fine though and it really seems like it’s trying to work well with a variety of terminal themes. I think it’s interesting that some of these newer terminal tools ( , , , and probably more) have support for arbitrary custom themes. I guess the downside of this approach is that the default theme might clash with your terminal’s background, but the upside is that it gives you a lot more control over theming the tool’s output than just choosing 16 ANSI colours. I don’t really use , but if I did I’d probably use to just use the ANSI colours that I have set in my normal terminal colorscheme. A bunch of people on Mastodon mentioned a specific issue with grays in the Solarized theme: when I list a directory, the base16 Solarized Light theme looks like this: but iTerm’s default Solarized Light theme looks like this: This is because in the iTerm theme (which is the original Solarized design ), colors 9-14 (the “bright blue”, “bright red”, etc) are mapped to a series of grays, and when I run , it’s trying to use those “bright” colours to color my directories and executables. My best guess for why the original Solarized theme is designed this way is to make the grays available to the vim Solarized colorscheme . I’m pretty sure I prefer the modified base16 version I use where the “bright” colours are actually colours instead of all being shades of gray though. (I didn’t actually realize the version I was using wasn’t the “original” Solarized theme until I wrote this post) In any case I really love Solarized and I’m very happy it exists so that I can use a modified version of it. If I my vim theme has a different background colour than my terminal theme, I get this ugly border, like this: This one is a pretty minor issue though and I think making your terminal background match your vim background is pretty straightforward. A few people mentioned problems with terminal applications setting an unwanted background colour, so let’s look at an example of that. Here has set the background to color #16 (“black”), but the script I use sets color 16 to be bright orange, so I get this, which is pretty bad: I think the intention is for ngrok to look something like this: I think sets color #16 to orange (instead of black) so that it can provide extra colours for use by base16-vim . This feels reasonable to me – I use in the terminal, so I guess I’m using that feature and it’s probably more important to me than (which I rarely use) behaving a bit weirdly. This particular issue is a maybe obscure clash between ngrok and my colorschem, but I think this kind of clash is pretty common when a program sets an ANSI background color that the user has remapped for some reason. A bunch of terminals (iTerm2, tabby , kitty’s text_fg_override_threshold , and folks tell me also Ghostty and Windows Terminal) have a “minimum contrast” feature that will automatically adjust colours to make sure they have enough contrast. Here’s an example from iTerm. This ngrok accident from before has pretty bad contrast, I find it pretty difficult to read: With “minimum contrast” set to 40 in iTerm, it looks like this instead: I didn’t have minimum contrast turned on before but I just turned it on today because it makes such a big difference when something goes wrong with colours in the terminal. A few people mentioned that they’ll SSH into a system that doesn’t support the environment variable that they have set locally, and then the colours won’t work. 
I think the way works is that systems have a database, so if the value of the environment variable isn’t in the system’s terminfo database, then it won’t know how to output colours for that terminal. I don’t know too much about terminfo, but someone linked me to this terminfo rant that talks about a few other issues with terminfo. I don’t have a system on hand to reproduce this one so I can’t say for sure how to fix it, but this stackoverflow question suggests running something like instead of . A couple of problems people mentioned with designing / finding terminal colorschemes: Another problem people mentioned is using a program like nethack or midnight commander which you might expect to have a specific colourscheme based on the default ANSI terminal colours. For example, midnight commander has a really specific classic look: But in my Solarized theme, midnight commander looks like this: The Solarized version feels like it could be disorienting if you’re very used to the “classic” look. One solution Simon Tatham mentioned to this is using some palette customization ANSI codes (like the ones base16 uses that I talked about earlier) to change the color palette right before starting the program, for example remapping yellow to a brighter yellow before starting Nethack so that the yellow characters look better. If I run , I see something like this, with the colours disabled. In general I find this useful – if I pipe a command to , I don’t want it to print out all those color escape codes, I just want the plain text. But what if you want to see the colours? To see the colours, you can run ! I just learned about recently and I think it’s really cool, opens a tty for the command to write to so that it thinks it’s writing to a TTY. It also fixes issues with programs buffering their output when writing to a pipe, which is why it’s called . Here’s what the output of looks like for me: Also some commands (including ) support a flag which will force them to always print out the colours. Some people mentioned that they don’t want to use colour at all, perhaps because uses blue, it’s hard to read on black, and maybe they don’t feel like customizing their terminal’s colourscheme to make the blue more readable or just don’t find the use of colour helpful. Some possible solutions to this one: Here’s an example of running : I used to have a lot of problems with configuring my colours in vim – I’d set up my terminal colours in a way that I thought was okay, and then I’d start vim and it would just be a disaster. I think what was going on here is that today, there are two ways to set up a vim colorscheme in the terminal: 20 years ago when I started using vim, terminals with 24-bit hex color support were a lot less common (or maybe they didn’t exist at all), and vim certainly didn’t have support for using 24-bit colour in the terminal. From some quick searching through git, it looks like vim added support for 24-bit colour in 2016 – just 8 years ago! So to get colours to work properly in vim before 2016, you needed to synchronize your terminal colorscheme and your vim colorscheme. Here’s what that looked like , the colorscheme needed to map the vim color classes like to ANSI colour numbers. But in 2024, the story is really different! Vim (and Neovim, which I use now) support 24-bit colours, and as of Neovim 0.10 (released in May 2024), the setting (which tells Vim to use 24-bit hex colours for colorschemes) is turned on by default in any terminal with 24-bit color support. 
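As an aside, the setting being described here is (I believe) termguicolors. If you're on an older Vim or Neovim where it isn't on by default, turning it on looks like this:

    " in a .vimrc or init.vim
    set termguicolors

    -- or, in Neovim's init.lua
    vim.opt.termguicolors = true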
So this “you need to synchronize your terminal colorscheme and your vim colorscheme” problem is not an issue anymore for me in 2024, since I don’t plan to use terminals without 24-bit color support in the future. The biggest consequence for me of this whole thing is that I don’t need base16 to set colors 16-21 to weird stuff anymore to integrate with vim – I can just use a terminal theme and a vim theme, and as long as the two themes use similar colours (so it’s not jarring for me to switch between them) there’s no problem. I think I can just remove those parts from my shell script and totally avoid the problem with ngrok and the weird orange background I talked about above. I think there are a lot of issues around the intersection of multiple programs, like using some combination tmux/ssh/vim that I couldn’t figure out how to reproduce well enough to talk about them. Also I’m sure I missed a lot of other things too. I’ve personally had a lot of success with using base16-shell with base16-vim – I just need to add a couple of lines to my fish config to set it up (+ a few lines) and then I can move on and accept any remaining problems that that doesn’t solve. I don’t think base16 is for everyone though, some limitations I’m aware of with base16 that might make it not work for you: Apparently there’s a community fork of base16 called tinted-theming , which I haven’t looked into much yet. Just one so far but I’ll link more if people tell me about them: We talked about a lot in this post and while I think learning about all these details is kind of fun if I’m in the mood to do a deep dive, I find it SO FRUSTRATING to deal with it when I just want my colours to work! Being surprised by unreadable text and having to find a workaround is just not my idea of a good day. Personally I’m a zero-configuration kind of person and it’s not that appealing to me to have to put together a lot of custom configuration just to make my colours in the terminal look acceptable. I’d much rather just have some reasonable defaults that I don’t have to change. My one big takeaway from writing this was to turn on “minimum contrast” in my terminal, I think it’s going to fix most of the occasional accidental unreadable text issues I run into and I’m pretty excited about it.

Julia Evans 1 year ago

Some Go web dev notes

I spent a lot of time in the past couple of weeks working on a website in Go that may or may not ever see the light of day, but I learned a couple of things along the way I wanted to write down. Here they are: I’ve never felt motivated to learn any of the Go routing libraries (gorilla/mux, chi, etc), so I’ve been doing all my routing by hand, like this. But apparently as of Go 1.22 , Go now has better support for routing in the standard library, so that code can be rewritten something like this: Though it would also need a login middleware, so maybe something more like this, with a middleware. One annoying gotcha I ran into was: if I make a route for , then a request for will be redirected to . I ran into an issue with this where sending a POST request to redirected to a GET request for , which broke the POST request because it removed the request body. Thankfully Xe Iaso wrote a blog post about the exact same issue which made it easier to debug. I think the solution to this is just to use API endpoints like instead of , which seems like a more normal design anyway. I got a little bit tired of writing so much boilerplate for my SQL queries, but I didn’t really feel like learning an ORM, because I know what SQL queries I want to write, and I didn’t feel like learning the ORM’s conventions for translating things into SQL queries. But then I found sqlc , which will compile a query like this: into Go code like this: What I like about this is that if I’m ever unsure about what Go code to write for a given SQL query, I can just write the query I want, read the generated function and it’ll tell me exactly what to do to call it. It feels much easier to me than trying to dig through the ORM’s documentation to figure out how to construct the SQL query I want. Reading Brandur’s sqlc notes from 2024 also gave me some confidence that this is a workable path for my tiny programs. That post gives a really helpful example of how to conditionally update fields in a table using CASE statements (for example if you have a table with 20 columns and you only want to update 3 of them). Someone on Mastodon linked me to this post called Optimizing sqlite for servers . My projects are small and I’m not so concerned about performance, but my main takeaways were: There are a more tips in that post that seem useful (like “COUNT queries are slow” and “Use STRICT tables”), but I haven’t done those yet. Also sometimes if I have two tables where I know I’ll never need to do a beteween them, I’ll just put them in separate databases so that I can connect to them independently. I run all of my Go projects in VMs with relatively little memory, like 256MB or 512MB. I ran into an issue where my application kept getting OOM killed and it was confusing – did I have a memory leak? What? After some Googling, I realized that maybe I didn’t have a memory leak, maybe I just needed to reconfigure the garbage collector! It turns out that by default (according to A Guide to the Go Garbage Collector ), Go’s garbage collector will let the application allocate memory up to 2x the current heap size. Mess With DNS ’s base heap size is around 170MB and the amount of memory free on the VM is around 160MB right now, so if its memory doubled, it’ll get OOM killed. In Go 1.19, they added a way to tell Go “hey, if the application starts using this much memory, run a GC”. 
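Concretely, the knob here is Go's soft memory limit: you can set it with the GOMEMLIMIT environment variable (e.g. GOMEMLIMIT=250MiB) or from code with runtime/debug.SetMemoryLimit. Here's a minimal sketch – the 250MB value is just the limit I talk about using below, not something special:

    package main

    import (
        "fmt"
        "runtime/debug"
    )

    func main() {
        // Ask the garbage collector to try to keep the Go runtime's total
        // memory use under roughly 250 MiB (same effect as GOMEMLIMIT=250MiB).
        debug.SetMemoryLimit(250 << 20)
        fmt.Println("soft memory limit set")
        // ... rest of the program ...
    }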
So I set the GC memory limit to 250MB and it seems to have resulted in the application getting OOM killed less often: I’ve been making tiny websites (like the nginx playground ) in Go on and off for the last 4 years or so and it’s really been working for me. I think I like it because: In general everything about it feels like it makes projects easy to work on for 5 days, abandon for 2 years, and then get back into writing code without a lot of problems. For contrast, I’ve tried to learn Rails a couple of times and I really want to love Rails – I’ve made a couple of toy websites in Rails and it’s always felt like a really magical experience. But ultimately when I come back to those projects I can’t remember how anything works and I just end up giving up. It feels easier to me to come back to my Go projects that are full of a lot of repetitive boilerplate, because at least I can read the code and figure out how it works. some things I haven’t done much of yet in Go: In general I’m not sure how to implement security-sensitive features so I don’t start projects which need login/CSRF/etc. I imagine this is where a framework would help. Both of the Go features I mentioned in this post ( and the routing) are new in the last couple of years and I didn’t notice when they came out. It makes me think I should pay closer attention to the release notes for new Go versions.
