Posts in Clojure (20 found)
Abhinav Sarkar 5 days ago

Solving Advent of Code 2025 in Janet: Day 1–4

I’m solving the Advent of Code 2025 in Janet. After doing the last five years in Haskell, I wanted to learn a new language this year. I’ve been eyeing the “New Lisps” 1 for a while now, and I decided to learn Janet. Janet is a Clojure-like Lisp that can be interpreted, embedded and compiled, and comes with a large standard library with concurrency, HTTP and PEG parser support. I want to replace Python with Janet as my scripting language. Here are my solutions for Dec 1–4. This post was originally published on abhinavsarkar.net.

All my solutions follow the same structure because I wrote a template to create new empty solutions. I also added a fair bit of automation this time to build, run, test and benchmark the solutions.

Day 1 was a bit mathy, but it didn’t take too long to figure out. I spent more time polishing the solution to be idiomatic Janet code. The PEG grammar to parse the input was the most interesting part of the day for me. If you know Janet, you may notice this is not the cleanest code, but that’s okay, it was my day 1 too.

The most interesting part of the day 2 solution was the macro that reads the input at compile time and creates a custom function to check whether a number is in one of the given ranges. This turned out to be almost 4x faster than writing the same thing as a function. Notice the PEG grammar to parse the input: so short and clean! I also leaned into the imperative and mutable nature of the Janet data structures. The code is still not the cleanest, as I was still learning.

The first part of day 3 was pretty easy to solve, but using the same solution for the second part just ran forever. I realized that this is a Dynamic Programming problem, but I don’t like doing array-based solutions, so I simply rewrote the solution to add caching. And it worked! It is definitely on the slower side, but I’m okay with it. The code has become a little more idiomatic Janet.

Day 4 is when I learned more about Janet control flow structures.
The solution for part 2 is a straightforward breadth-first traversal. The interesting parts are the control-flow statements: so concise and elegant! That’s it for now. The next note will drop after 4 or 5 days. You can browse the code repo to see the full setup. If you have any questions or comments, please leave a comment below. If you liked this post, please share it. Thanks for reading!

The new Lisps that interest me are: Janet, Fennel and Jank. ↩︎
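The day-3 fix described above, rewriting a too-slow recursion to add caching, is classic memoization. Here is a hedged sketch of the idea in Python rather than Janet; the recurrence below is a made-up stand-in, not the actual puzzle.

```python
from functools import cache

# Stand-in recurrence (NOT the real AoC problem): count ways to climb
# n steps taking 1, 2 or 3 at a time. Without @cache the recursion does
# exponential work; with it, each distinct argument is computed once.
@cache
def ways(n: int) -> int:
    if n < 0:
        return 0
    if n == 0:
        return 1
    return ways(n - 1) + ways(n - 2) + ways(n - 3)

print(ways(50))  # instant with caching; hopeless without
```

The same trick works in Janet with a table keyed by the function's arguments, which is presumably what the rewrite amounts to.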

(think) 3 weeks ago

Burst-driven Development: My Approach to OSS Projects Maintenance

I’ve been working on OSS projects for almost 15 years now. Things are simple in the beginning: you’ve got a single project, no users to worry about, and all the time and focus in the world. Things have changed quite a bit for me over the years, and today I’m the maintainer of a couple of dozen OSS projects, mostly in the realms of Emacs, Clojure and Ruby. People often ask me how I manage to work on so many projects besides having a day job, which obviously takes up most of my time. My recipe is quite simple, and I refer to it as “burst-driven development”.

Long ago I realized that it’s totally unsustainable for me to work effectively in parallel on several quite different projects. That’s why I normally keep a closer eye on my bigger projects (e.g. RuboCop, CIDER, Projectile and nREPL), where I try to respond quickly to tickets and PRs, while I typically do (focused) development on only 1-2 projects at a time. There are often (long) periods when I barely check a project, only to suddenly decide to revisit it and hack vigorously on it for several days or weeks. I guess that’s not ideal for the end users, as some of them might feel that I “undermaintain” some (smaller) projects much of the time, but this approach has worked very well for me for quite a while. The time I’ve spent developing OSS projects has taught me that:

- few problems require immediate action
- you can’t always have good ideas for how to improve a project
- sometimes a project is simply mostly done, and that’s OK
- less is more
- “hammock time” is important

To illustrate all of the above with an example, let me tell you a bit about copilot.el 0.3. I became its primary maintainer about 9 months ago. Initially there were many things about the project that frustrated me and that I wanted to fix and improve. After a month of relatively focused work I had mostly achieved my initial goals, and I put the project on the back burner for a while, although I kept reviewing PRs and thinking about it in the background. Today I remembered I hadn’t done a release there in quite a while, and 0.3 was born.

Tomorrow I might remember some features in Projectile that have been in the back of my mind for ages and finally implement them. Or not. I don’t have any planned order in which I revisit my projects; I just go wherever my inspiration (or the current problems related to the projects) takes me.

And that’s a wrap. Nothing novel here, but I hope some of you will find it useful to know how I approach the topic of multi-project maintenance overall. The “job” of the maintainer is sometimes fun, sometimes tiresome and boring, and occasionally quite frustrating. That’s why it’s essential to have a game plan for dealing with it that doesn’t take a heavy toll on you and eventually make you hate the projects that you lovingly developed in the past. Keep hacking!

Evan Hahn 1 month ago

Scripts I wrote that I use all the time

In my decade-plus of maintaining my dotfiles, I’ve written a lot of little shell scripts. Here’s a big list of my personal favorites. and are simple wrappers around system clipboard managers, like on macOS and on Linux. I use these all the time. prints the current state of your clipboard to stdout, and then whenever the clipboard changes, it prints the new version. I use this once a week or so. copies the current directory to the clipboard. Basically . I often use this when I’m in a directory and I want to use that directory in another terminal tab; I copy it in one tab and to it in another. I use this once a day or so. makes a directory and s inside. It’s basically . I use this all the time—almost every time I make a directory, I want to go in there. changes to a temporary directory. It’s basically . I use this all the time to hop into a sandbox directory. It saves me from having to manually clean up my work. A couple of common examples: moves and to the trash. Supports macOS and Linux. I use this every day. I definitely run it more than , and it saves me from accidentally deleting files. makes it quick to create shell scripts. creates , makes it executable with , adds some nice Bash prefixes, and opens it with my editor (Vim in my case). I use this every few days. Many of the scripts in this post were made with this helper! starts a static file server in the current directory. It’s basically but handles cases where Python isn’t installed, falling back to other programs. I use this a few times a week. Probably less useful if you’re not a web developer. uses to download songs, often from YouTube or SoundCloud, in the highest available quality. For example, downloads that video as a song. I use this a few times a week…typically to grab video game soundtracks… similarly uses to download something for a podcast player. There are a lot of videos that I’d rather listen to like a podcast. I use this a few times a month. downloads the English subtitles for a video.
(There’s some fanciness to look for “official” subtitles, falling back to auto-generated subtitles.) Sometimes I read the subtitles manually, sometimes I run , sometimes I just want it as a backup of a video I don’t want to save on my computer. I use this every few days. , , and are useful for controlling my system’s wifi. is the one I use most often, when I’m having network trouble. I use this about once a month. parses a URL into its parts. I use this about once a month to pull data out of a URL, often because I don’t want to click a nasty tracking link. prints line 10 from stdin. For example, prints line 10 of a file. This feels like one of those things that should be built in, like and . I use this about once a month. opens a temporary Vim buffer. It’s basically an alias for . I use this about once a day for quick text manipulation tasks, or to take a little throwaway note. converts “smart quotes” to “straight quotes” (sometimes called “dumb quotes”). I don’t care much about these in general, but they sometimes weasel their way into code I’m working on. It can also make the file size smaller, which is occasionally useful. I use this at least once a week. adds before every line. I use it in Vim a lot; I select a region and then run to quote the selection. I use this about once a week. returns . (I should probably just use .) takes JSON at stdin and pretty-prints it to stdout. I use this a few times a year. and convert strings to upper and lowercase. For example, returns . I use these about once a week. returns . I use this most often when talking to customer service and need to read out a long alphanumeric string, which has only happened a couple of times in my whole life. But it’s sometimes useful! returns . A quick way to do a lookup of a Unicode string. I don’t use this one that often…probably about once a month. cats . I use for , for a quick “not interested” response to job recruiters, to print a “Lorem ipsum” block, and a few others. 
I probably use one or two of these a week. Inspired by Ruby’s built-in REPL, I’ve made: prints the current date in ISO format, like . I use this all the time because I like to prefix files with the current date. starts a timer for 10 minutes, then (1) plays an audible ring sound (2) sends an OS notification (see below). I often use to start a 5 minute timer in the background (see below). I use this almost every day as a useful way to keep on track of time. prints the current time and date using and . I probably use it once a week. It prints something like this: extracts text from an image and prints it to stdout. It only works on macOS, unfortunately, but I want to fix that. (I wrote a post about this script .) (an alias, not a shell script) makes a happy sound if the previous command succeeded and a sad sound otherwise. I do things like which will tell me, audibly, whether the tests succeed. It’s also helpful for long-running commands, because you get a little alert when they’re done. I use this all the time . basically just plays . Used in and above. uses to play audio from a file. I use this all the time , running . uses to show a picture. I use this a few times a week to look at photos. is a little wrapper around some of my favorite internet radio stations. and are two of my favorites. I use this a few times a month. reads from stdin, removes all Markdown formatting, and pipes it to a text-to-speech system ( on macOS and on Linux). I like using text-to-speech when I can’t proofread out loud. I use this a few times a month. is an wrapper that compresses a video a bit. I use this about once a month. removes EXIF data from JPEGs. I don’t use this much, in part because it doesn’t remove EXIF data from other file formats like PNGs…but I keep it around because I hope to expand this one day. is one I almost never use, but you can use it to watch videos in the terminal. It’s cursed and I love it, even if I never use it. is my answer to and , which I find hard to use. 
For example, runs on every file in a directory. I use this infrequently but I always mess up so this is a nice alternative. is like but much easier (for me) to read—just the PID (highlighted in purple) and the command. or is a wrapper around that sends , waits a little, then sends , waits and sends , waits before finally sending . If I want a program to stop, I want to ask it nicely before getting more aggressive. I use this a few times a month. waits for a PID to exit before continuing. It also keeps the system from going to sleep. I use this about once a month to do things like: is like but it really really runs it in the background. You’ll never hear from that program again. It’s useful when you want to start a daemon or long-running process you truly don’t care about. I use and most often. I use this about once a day. prints but with newlines separating entries, which makes it much easier to read. I use this pretty rarely—mostly just when I’m debugging a issue, which is unusual—but I’m glad I have it when I do. runs until it succeeds. runs until it fails. I don’t use this much, but it’s useful for various things. will keep trying to download something. will stop once my tests start failing. is my emoji lookup helper. For example, prints the following: prints all HTTP statuses. prints . As a web developer, I use this a few times a month, instead of looking it up online. just prints the English alphabet in upper and lowercase. I use this surprisingly often (probably about once a month). It literally just prints this: changes my whole system to dark mode. changes it to light mode. It doesn’t just change the OS theme—it also changes my Vim, Tmux, and terminal themes. I use this at least once a day. puts my system to sleep, and works on macOS and Linux. I use this a few times a week. recursively deletes all files in a directory. I hate that macOS clutters directories with these files! I don’t use this often, but I’m glad I have it when I need it. is basically . 
Useful for seeing the source code of a file in your path (I used it for writing up this post, for example!). I use this a few times a month. sends an OS notification. It’s used in several of my other scripts (see above). I also do something like this about once a month: prints a v4 UUID. I use this about once a month.

These are just scripts I use a lot. I hope some of them are useful to you! If you liked this post, you might like “Why ‘alias’ is my last resort for aliases” and “A decade of dotfiles”. Oh, and contact me if you have any scripts you think I’d like.

- to start a Clojure REPL
- to start a Deno REPL (or a Node REPL when Deno is missing)
- to start a PHP REPL
- to start a Python REPL
- to start a SQLite shell (an alias for )
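One of the helpers above, the wrapper that asks a process to stop nicely before getting more aggressive, is a nice pattern on its own. Here is a hedged sketch in Python rather than shell; the original script's exact signal sequence and timings are not shown in the post, so the INT → TERM → KILL escalation and `pause` below are assumptions for illustration.

```python
import os
import signal
import time

def polite_kill(pid: int, pause: float = 2.0) -> None:
    """Ask a process to stop, escalating only if it keeps running.

    Signal order is an assumption (the original script's sequence is
    not given): SIGINT first, then SIGTERM, finally SIGKILL.
    """
    for sig in (signal.SIGINT, signal.SIGTERM, signal.SIGKILL):
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            return  # process already exited; nothing more to do
        if sig != signal.SIGKILL:
            time.sleep(pause)  # give it a chance to clean up
```

Used on a stuck program, this gives well-behaved processes a chance to clean up before the unignorable SIGKILL arrives.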

tonsky.me 1 month ago

I am sorry, but everyone is getting syntax highlighting wrong

Translations: Russian

Syntax highlighting is a tool. It can help you read code faster. Find things quicker. Orient yourself in a large file. Like any tool, it can be used correctly or incorrectly. Let’s see how to use syntax highlighting to help you work.

Most color themes have a unique bright color for literally everything: one for variables, another for language keywords, constants, punctuation, functions, classes, calls, comments, etc. Sometimes it gets so bad one can’t see the base text color: everything is highlighted. What’s the base text color here? The problem with that is, if everything is highlighted, nothing stands out. Your eye adapts and considers it a new norm: everything is bright and shiny, and instead of getting separated, it all blends together. Here’s a quick test. Try to find the function definition here: See what I mean? So yeah, unfortunately, you can’t just highlight everything. You have to make decisions: what is more important, what is less. What should stand out, what shouldn’t. Highlighting everything is like assigning “top priority” to every task in Linear. It only works if most of the tasks have lesser priorities. If everything is highlighted, nothing is highlighted.

There are two main use-cases you want your color theme to address:

1. Look at something and tell what it is by its color (you can tell by reading the text, yes, but then why do you need syntax highlighting?).
2. Search for something. You want to know what to look for (which color).

1 is a direct index lookup: color → type of thing. 2 is a reverse lookup: type of thing → color. Truth is, most people don’t do these lookups at all. They might think they do, but in reality, they don’t. Let me illustrate. Can you see it? I misspelled for and its color switched from red to purple. Here’s another test. Close your eyes (not yet! Finish this sentence first) and try to remember what color your color theme uses for class names. If the answer to both questions is “no”, then your color theme is not functional. It might give you comfort (as in: I feel safe, if it’s highlighted, it’s probably code) but you can’t use it as a tool. It doesn’t help you. What’s the solution?

Have an absolute minimum of colors. So few that they all fit in your head at once. For example, my color theme, Alabaster, only uses four:

- Green for strings
- Purple for constants
- Yellow for comments
- Light blue for top-level definitions

That’s it! And I was able to type it all from memory, too. This minimalism allows me to actually do lookups: if I’m looking for a string, I know it will be green. If I’m looking at something yellow, I know it’s a comment. Limit the number of different colors to what you can remember. If you swap green and purple in my editor, it’ll be a catastrophe. If somebody swapped colors in yours, would you even notice?

What should you highlight? Something there isn’t a lot of. Remember—we want highlights to stand out. That’s why I don’t highlight variables or function calls—they are everywhere; your code is probably 75% variable names and function calls. I do highlight constants (numbers, strings). These are usually used more sparingly and often are reference points—a lot of logic paths start from constants. Top-level definitions are another good idea. They give you an idea of the structure quickly. Punctuation: it helps to separate names from syntax a little bit, and you care about names first, especially when quickly scanning code. Please, please don’t highlight language keywords. You rarely look for them: “where’s that if” is a valid question, but you will be looking not at the keyword, but at the condition after it. The condition is the important, distinguishing part. The keyword is not. Highlight names and constants. Grey out punctuation. Don’t highlight language keywords.

The tradition of using grey for comments comes from the times when people were paid by the line. If a comment merely restates the code, of course you would want to grey it out! This is bullshit text that doesn’t add anything and was written to be ignored. But for good comments, the situation is the opposite. Good comments ADD to the code. They explain something that couldn’t be expressed directly. They are important.

So here’s another controversial idea: comments should be highlighted, not hidden away. Use bold colors, draw attention to them. Don’t shy away. If somebody took the time to tell you something, then you want to read it. Another secret nobody is talking about is that there are two types of comments:

- Explanations
- Disabled code

Most languages don’t distinguish between those, so there’s not much you can do syntax-wise. Sometimes there’s a convention (e.g. vs in SQL); then use it! Here’s a real example from a Clojure codebase that makes perfect use of the two types of comments:

Per statistics, 70% of developers prefer dark themes. Being in the other 30%, that question always puzzled me. Why? And I think I have an answer. Here’s a typical dark theme: and here’s a light one: on the latter, the colors are way less vibrant. This is because dark colors are in general less distinguishable and more muddy. Look at the Hue scale as we move brightness down: basically, in the dark part of the spectrum, you just get fewer colors to play with. There’s no “dark yellow” or good-looking “dark teal”. Nothing can be done here. There are no magic colors hiding somewhere that have both good contrast on a white background and look good at the same time. By choosing a light theme, you are dooming yourself to a very limited, bad-looking, barely distinguishable set of dark colors. So it makes sense. Dark themes do look better. Or rather: light ones can’t look good. Science ¯\_(ツ)_/¯

There is one trick you can do that I don’t see a lot of: use background colors! Compare: the first one has nice colors, but the contrast is too low: letters become hard to read. The second one has good contrast, but you can barely see the colors. The last one has both: high contrast and clean, vibrant colors. Lighter colors are readable even on a white background since they fill a lot more area. Text is the same brightness as in the second example, yet it gives the impression of clearer color. It’s all upside, really. UI designers have known about this trick for a while, but I rarely see it applied in code editors: if your editor supports choosing a background color, give it a try. It might open up light themes for you.

What about bold and italics? Don’t use them. This goes into the same category as too many colors. It’s just another way to highlight something, and you don’t need too many, because you can’t highlight everything. In theory, you might try to replace colors with typography. Would that work? I don’t know. I haven’t seen any examples.

Some themes pay too much attention to being scientifically uniform. Like, all colors have exactly the same lightness, and hues are distributed evenly on a circle. This could be nice (to know, if you have OCD), but in practice, it doesn’t work as well as it sounds: the idea of highlighting is to make things stand out. If you make all colors the same lightness and chroma, they will look very similar to each other, and it’ll be hard to tell them apart. Our eyes are way more sensitive to differences in lightness than in color, and we should use that, not try to negate it.

Let’s apply these principles step by step and see where it leads us. We start with the theme from the start of this post. First, let’s remove highlighting from language keywords and re-introduce the base text color. Next, we remove color from variable usage, and from function/method invocation. The thinking is that your code is mostly references to variables and method invocations. If we highlight those, we’ll have to highlight more than 75% of your code. Notice that we’ve kept variable declarations. These are not as ubiquitous and help you quickly answer a common question: where does this thing come from? Next, let’s tone down punctuation. I prefer to dim it a little bit because it helps names stand out more. Names alone can give you the general idea of what’s going on, and the exact configuration of brackets is rarely equally important. But you might roll with base-color punctuation, too. Okay, getting close.

Let’s highlight comments. We don’t use red here because you usually need it for squiggly lines and errors. This is still one color too many, so I unify numbers and strings to both use green. Finally, let’s rotate the colors a bit. We want to respect the nesting logic, so function declarations should be brighter (yellow) than variable declarations (blue). Compare with what we started with: in my opinion, we got a much more workable color theme. It’s easier on the eyes and helps you find stuff faster.

I’ve been applying these principles for about 8 years now. I call this theme Alabaster, and I’ve built it a couple of times for the editors I used:

- JetBrains IDEs
- Sublime Text (twice)

It’s also been ported to many other editors and terminals; the most complete list is probably here. If your editor is not on the list, try searching for it by name—it might be built in already! I always wondered where these color themes come from, and now I’ve become the author of one (and I still don’t know). Feel free to use Alabaster as is, or build your own theme using the principles outlined in this article—either is fine by me. As for the principles themselves, they have worked out fantastically for me. I’ve never wanted to go back, and just one look at any “traditional” color theme gives me a scare now. I suspect that the only reason we don’t see more restrained color themes is that people never really thought about it. Well, this is your wake-up call. I hope this will inspire people to use color more deliberately and to change the default way we build and use color themes.


Functional Threading “Macros”

Read on the website: Threading macros make Lisp-family languages much more readable. Other languages too, potentially! Except… other languages don’t have macros. How do we go about enabling threading “macros” there?
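One common answer to that question, sketched here in Python purely as an illustration (the article itself may take a different approach, and the names below are made up): since Clojure's -> macro just rewrites nested calls into a pipeline, any language with first-class functions can emulate it with an ordinary fold over a list of functions.

```python
from functools import reduce

def thread_first(value, *fns):
    """Emulate Clojure's -> by piping `value` through each function in turn.

    (-> 5 inc str)  ~  thread_first(5, inc, str)
    """
    return reduce(lambda acc, fn: fn(acc), fns, value)

inc = lambda x: x + 1
print(thread_first(5, inc, str))  # prints 6
```

The result reads top-to-bottom like a threading form, at the cost of wrapping multi-argument steps in lambdas, which is exactly the ergonomic gap real macros close.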

tonsky.me 5 months ago

Podcast: Datomic, the most rock ’n’ roll database @ Тысяча фичей

How Datomic differs from other databases, and why the absence of a query optimizer is sometimes better than its presence

Ludicity 9 months ago

Ludic's Guide To Getting Software Engineering Jobs

The steps in this guide have generated A$1,179,000 in salary (updated 13th April, 2025), measured as the sum of the highest annual salaries friends and readers have reached after following along, where they were willing to attribute their success to the actions in here. If it works for you, email me so I can bump the number up. I currently run my business out of my own pocket. If I don't make sales, I lose savings, and it's as simple as that. I am all-in on creating work I love by force of arms, and I'd sooner leave the industry than be disrespected at a normal workplace again. The impetus to run that risk comes from two places. The first is that my tolerance for middle managers jerking themselves off at my expense is totally eroded, and I realized that I either had to do something about it or stop complaining. I'll happily go to an office if I think it will produce something I care about, but I will not do it because someone wants to impress a withered husk who thinks his sports car makes him attractive to young women. The second, and what this post is about, is that I am really good at getting jobs, and have friends with a very deep understanding of how the job market works. In Australia, when you apply for a job without permanent residency, you are filtered out of all applications immediately. It is the first question on all online application forms, and the reason is that companies do not want to deal with visa renewals and they have far too many candidates. This leads to a situation where any characteristic that is remotely inconvenient, but not noticeably correlated with suitability for the business, is grounds for rejection. It is not uncommon for immigrants to take months, sometimes over a year, to find their first job actually writing code. Despite being non-white with no professional network in the country and an undesirable visa, I had my first paid programming engagement lined up before finalizing the move off my student visa.
I had a full-time job on A$117K lined up for the same day my full work visa kicked in. I continued to dig up work whenever a contract was expiring, even landing a gig mid-COVID, and while most of these jobs left much to be desired , I believe this has more to do with the state of the industry in Australia than anything that I did. And I have only gotten better at this over the past two years, because while I despaired about the state of software in general, I never stopped thinking and experimenting about how to regain some control over how I'm treated. Almost everyone I spend time with now has walked away from a job without flinching. I've done it . I once caught up with a friend, and he said "Work is stupid, I'm going to Valencia for a year." I said, "W-what? Valencia? When? For a year ?" "Yeah, a year. I'm going in two weeks." And then that glorious son of a bitch did it . Came home. Had a job waiting for him. Quit that job, got another job at more money. Quit that job, got one interstate because he felt like it at similar pay for half the work. All in a "weak" market. I get a lot of emails from people who despair about the state of the industry or who otherwise can't find jobs, and I always end up giving the same advice. I don't have the time to keep doing that. So in this post, I'm going to attempt to convince thousands of people that you should have much higher standards for what you tolerate, that you can build up the reserves to do your version of going to Valencia (this could just be staying home and playing with your kids for six months), and that it is immensely risky not to have this ability in your back pocket. Along the way, we will answer questions like "How long should a CV be?", "What should go on it?", and "When will this suffering end?" From Scott Smitelli's phenomenal The Ideal Candidate Will Be Punched In The Stomach : What was the plan here? Why did you leave a perfectly sustainable—if difficult and slightly unrewarding—job to do this crap? 
What even is this crap? You are, and let’s not mince words here, you are a thief. That’s the only way to make sense of this situation: You are stealing money from somebody, somehow. This is money you have not earned. There is no legitimate way that you should be receiving any form of compensation for simply absorbing abuse. These people, maybe the whole company but certainly the people in your immediate management chain, are irredeemably damaged to be using human beings that way. They will take, and take, and smile at you like they’re doing you some kind of favor, and maybe throw you a little perk from time to time to distract you from thinking about it too hard. But you? You can’t stop thinking. You can’t stop thinking. You can’t stop thinking. If you're in this photo and don't like it, this blog post is for you. We have one end-goal. A career where you're paid well, are treated with real respect, and we will not settle for less. And I mean real respect, as in "we will not proceed on this major project without your professional blessing, and you can fire abusive clients", not "you can work from home two days a week if the boss is feeling generous". I had a brief email exchange with Erez Zukerman, the CEO of ZSA last year, and asked how their customer support is so good — it's the best customer support I've ever experienced and there's no close second. He replied: For support, the basic understanding is that support is the heart of the business. It is not an afterthought. Support is a senior position at ZSA, with judgment, power of review over features and website updates before anything is released (nothing major goes out without a green line from every member of support), the power to fire customers (!), real flexibility when it comes to the schedule, etc. 
There are also lots of ways to expand, like writing (Robin has been writing incredible blog posts and creating printables), recording (Tisha recorded Tisha Talks Switches which thousands of people enjoyed), and more. Anything short of that isn't real respect. Not a special parking spot. Not the ability to pick up your kids sometimes. Not a patronizing award on Teams. Most places fall short of this, and because we have all agreed to demand better for ourselves, we are going to consider all of these places as mildly abusive. A lot of office jobs seem like a slow death of the soul — better than the swift death of the body that careers like construction work offer, but that isn't a reason to stop striving. Shoddy work. Hour-long stand-ups. The deadlines are somehow always urgent and must be delivered immediately, but are also always late and everyone knows they'll be late from day one. This is delightful at times — office scenes in improvised theater get funnier the straighter you play them — but many people eventually feel that something vital is missing from their work lives. I really enjoy David Whyte's The Three Marriages as an antidote to the tedious objection of "Work to live, don't live to work". It's a part of life, and while it isn't all of life, being bored and treated like a disposable cog for eight hours a day shouldn't be any part of your life. If you're happy to coast, adieu, catch you later. This is a no-judgement zone for the next five minutes. Here is a quick reality check. I have, by virtue of hundreds of people reaching out to me over this blog, seen the "I want to leave my bad job" story play out far more times than a typical person does in a lifetime. It always plays out in one of two ways. The first is that the person immediately and aggressively looks for new jobs. This usually goes well. If it does not go well, they can always find a new job again.
When the job is pursued through "normal" mechanisms, such as cold Seek applications, these jobs almost never meet the standard I set above: great pay, great team, great interview process, and whatever office arrangement you prefer. But they've always been doing better along at least some of the four measures.

The other story is much more typical, and it goes something like this:

- I'd love to leave, but there's something keeping me.
- One more year and I'll get a new title, and then I'll be so well-placed for a new job.
- I've heard the market is bad, so I should wait until it picks up again.
- I'll get a raise soon, then I'll negotiate for a new job.
- I'm scared of keeping up with mortgage repayments.
- I just need a year to finish up this project, it'll look great on my CV.
- My network is terrible, so I don't have the same options open to me.
- I think I can make a difference if I'm given a few more months.

In two years, this second approach has never gone well. Never, ever, ever. Consider this real exchange, copied verbatim and redacted.

May, 2024:
Me: I'm a little bit concerned that the pathway above leads to delaying indefinitely (there's always going to be a risk of moving then getting laid off - so what risk level do you actually tolerate, and how is that balanced against [COMPANY] being run so badly that you can get laid off there too?) but you know your situation better than me.
Reader: Well, the company was bought out and seems to be stabilizing.

July, 2024:
Reader: Got some fantastic news! Gonna get a raise at [COMPANY], 20%! It came as a surprise, apparently they think I earn too little so they're giving me a raise because of that.

November, 2024:
Reader: Wanted to let you know I got news, I'm gonna be fired next month.

This happens so often that it's actually boring for me. I've had exchanges like the above often enough that I know the person is finished months before they do. Play stupid games, win stupid prizes.
They will either be let go, burn out and quit, or burn out and stay there as their health deteriorates. No one, at any level short of executive, has managed to have the impact that would be required for them to feel it was worth the cost.

The thing that is missing, to my eyes, is some sense of confidence and self-respect. I hear lots of supposed barriers to getting a better job, but almost none of them are convincing, especially from people in the first world, so what I'm actually hearing about are psychological barriers. It takes a certain degree of confidence to know that you have worth, because a great deal of our society, whether by coincidence or design, causes people to feel like they're not desirable. If you don't have confidence, you feel trapped in your current situation, because what if you can't find something else? What if you're not good enough?

This is a real risk, but guess what, life's risky! Two months ago, one of my high school classmates, one of the fittest people I know, died of an aneurysm at age 29. Think about it this way: enough people read this blog that if you are reading this sentence, you have just drawn a ticket in the "heart attack kills me by December" lottery. This isn't hypothetical, this will happen — someone reading this will die having spent a few hundred hours on spreadsheets this year, and perhaps even have time to think "I wish I had listened to Ludic, he is so smart and wise." 1

And on self-respect, I will concede that you're getting the vestiges of my time spent in psychology, but why would anyone respect you if you let someone do Scrum at you for hours? No one respected me when I let people do Scrum at me, and that was my fault.

"It's what der street trolls make when dey is short o' cash an' ... what is it dey's short of, Brick?" The moving spoon paused. "Dey is short o' self-respec', Sergeant," he said, as one might who'd had the lesson shouted into his ear for twenty minutes.

So where do we start off?
Well, the first thing to do is bury the idea that you need this particular job, or that you are otherwise unworthy. And we're going to do that by getting really good at getting mediocre jobs, and we're almost always going to want to be doing day-rate contracts. We are going to do a lot of things that I do not endorse when going for a good job, like sending your CV anywhere, talking to recruiters, etc. Regular full-time jobs obtained through mass-market channels have dysfunctional social dynamics that are too complex to get into here. Patrick McKenzie writes:

Many people think job searches go something like this:
1. See ad for job on Monster.com
2. Send in a resume.
3. Get an interview.
4. Get asked for salary requirements.
5. Get offered your salary requirement plus 5%.
6. Try to negotiate that offer, if you can bring yourself to.
This is an effective strategy for job searching if you enjoy alternating bouts of being unemployed, being poorly compensated, and then treated like a disposable peon.

Working jobs like the above comes at a real cost, even if you can get them at-will. I had an episode of intense burnout which resulted in a year of recovery, and I had to think very hard about how to not feel trapped in a bad situation again, even if the business fails. I do not want to attend hour-long stand-ups anymore.

This section is about how to get the above jobs as effectively and painlessly as possible, but they will still not be great, and if you do them forever then I will be very disappointed in you. In any case, if one must engage with the market in this way to build confidence and a reputation, then day-rate contracts are amazing. I am heavily in favour of contracting. The day rate is much higher. You are forced to continue searching every few months, which means you are also forced to always be aware that you have options, and we will discuss how to minimize the pain of this. You will meet far more people because you will be at a new workplace every few months.
Here in Australia, a weak contracting job will pay A$1K per day, which is approximately double what a permanent employee earns. I.e., for every six months of contracting, you can afford six months of unemployment, and you're still as well off as you would have been if you had been permanently employed over that period. Contracts are terminated more frequently, but you're also in a much better position because you've saved way more money per day worked, you've met tons more people, and your CV is always up-to-date. And you also knew it was going to expire in six months, so having it end three months early isn't a horrible shock to your planning.

You are also excluded from the most mentally draining practices in a corporate environment, and afforded a higher status than regular employees. You will usually not be asked to attend pointless meetings, and instead be left free to execute on technical work, particularly if you indicate that you can manage scope independently. If someone does ask you to attend a pointless meeting, you can recite the Litany of a Thousand Dollars a Day in your head over and over as the project manager attempts to flay your mind.

You know that delightful period after you've submitted a resignation and you're about to get out? That's the whole contract. A six month contract feels like handing in a resignation with six months of notice. When the CEO says "Can we put GenAI and blockchain in the product?", you can close your eyes, my God you are so happy, and whisper "Inshallah, I will not be on this train when it derails".

None of these jobs will be great. This is not a good way to get jobs in the long-term. This is a boring, soulless way for someone that does not have any appreciable career capital or networking ability to generate adequate jobs on high pay.
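The back-of-envelope arithmetic above can be sketched out. The figures here are illustrative assumptions (a A$1K day rate, a rough A$120K permanent salary, about 230 working days a year), not market data:

```python
# Sketch of the contracting-vs-permanent arithmetic. All numbers are
# assumed for illustration; adjust them to your own market.
DAY_RATE = 1000            # A$ per day for a weak contract
PERM_SALARY = 120_000      # assumed permanent salary ("about half the rate")
WORK_DAYS_PER_YEAR = 230   # roughly 46 working weeks of 5 days

# What the permanent employee effectively earns per day worked.
perm_per_day = PERM_SALARY / WORK_DAYS_PER_YEAR

# Six months of contracting (half the working year)...
contract_income = DAY_RATE * (WORK_DAYS_PER_YEAR // 2)
# ...roughly matches a full year of permanent employment,
# leaving the other six months free.
perm_income = PERM_SALARY

print(f"Permanent pay per day worked: A${perm_per_day:,.0f}")
print(f"6 months contracting:         A${contract_income:,.0f}")
print(f"12 months permanent:          A${perm_income:,.0f}")
```

Under these assumptions, half a year on contract (A$115,000) comes within a few thousand dollars of a full permanent year, which is the "six months of work buys six months of unemployment" claim.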
We only bother with this so that you know that if your business explodes, or the cool non-profit you find fires you, or if a new boss comes in and abuses you, you'll know deep down that you can walk right out that door and tell them to get fucked. I should note that the advice in this section was heavily contributed to by a friend who wishes to remain anonymous, but let us all send them silent thanks.

Anyway, we take Sturgeon's Law very seriously on this blog. Ninety percent of everything is crap. It is with this understanding that we must proceed. There is a pathway to navigate that relies entirely on that broad understanding.

Let us begin with recruiters. Recruiters are an unfortunate reality of the industry. I still haven't worked out why they exist when a company can just post a job ad themselves, and their talent team has to filter out the candidates themselves anyway, but whatever. They're here, and I've learned enough about the world to accept that it's 50/50 on whether their existence is economically rational.

In 2019, about a month into my first full-time programming job, I received a call from a recruiter. They were looking for someone with Airflow experience to work a contract with Coles, a massive Australian grocery chain. I had no idea what to really say to this, being inexperienced and hugely underconfident, so I just listened to his questions and answered them. Most of my answers were a sad: "Ah, no, sorry, I know what AWS is but I've never used it before at a real business. I know what Airflow is, but I've never..." Until finally we come across the fateful question: "And do you know Linux?"

Why, yes, I do know Linux. At that stage of my career, it never even occurred to me to ponder what knowing Linux is. Do I even know my keyboard if I can't construct one from scratch? What does he mean, know? How deranged would I have to be to say I know Python, without qualification, without being a core contributor?
But none of that occurred to me, I just said yes. He is delighted, we get to chatting, and we quickly realize that we're both working our first jobs! He is a year younger, also nervous about his job, and is so happy to be talking to someone that just sounds like a normal person. He is soon comfortable enough to ask me a very vulnerable question: "So what is Linux?"

I answer, and I've been doing nothing but teach psychologists-in-training statistics for a year, so the explanation is good. Each good explanation leads to another, until I'm fielding questions like: "What is Airflow? What is AWS?"

We hang up, on good terms, and I stare at the wall for a long moment. There are people out there just, like, calling around and functionally asking "Have you used FeebleGlorp for eight years?" with no internal model of what FeebleGlorp might be? That can't be right. Everyone at school told me that affairs would be very serious in the real world. Affairs are not very serious in the real world. Affairs have never been less serious.

I told myself for a while that this must have been because he was so young, but no, they're actually almost all like this. I have only ever met two recruiters with intact brains. 2 To quote a reader with extensive HR experience who attempted to explain this dysfunction to me:

While there are professionals that specialize in tech and with time develop enough depth to understand the discipline and move the needle in the right direction, for most recruiters it is not an economic advantage to do so; as the winds of the market are ever changing, recruiters are always the first ones to go onto the chopping block when there are layoffs. Better to be a generalist recruiter and keep your job options open.

I.e., the recruiters you are talking to probably go out of their way to avoid learning anything, because they may be recruiting in a different industry next month. This means a few things.
I normally do not send CVs anywhere and decry them, but I've reversed my stance. They're a terrible way to get good jobs, but a heavily optimized CV will demolish most other candidates, who are about as unserious as the recruiters. So how do we optimize? Well, we're trying to get past recruiters. On your CV, quality indicators only matter if the recruiter can understand them, and as per the above they do not understand anything.

At 12:34 PM today, while writing this blog post, I got a call from a recruiter and I asked them a question for blog material. "Hey, question about my CV, would it be better if I mentioned that I'm well-regarded by Linus Torvalds?" (This is not true, we don't know each other.) And they said, "Uh, I'd leave that out, these are very busy people and need technical credentials."

Recruiters are only looking for one thing. They are looking for the number of years of experience that you say you have in buzzword, and possibly that you've worked somewhere like Google — but I've never seen a Googler compete for open-market contracts, so don't feel too disadvantaged. Years of experience with buzzword is the only thing that matters. Delete everything else. Link to your GitHub profile? Goodbye, none of these people are going to read that. I've been assured that the typical talent team spends five to ten seconds per CV. I am a passionate front-end developer with a drive for — no one cares, and if you reflect on how you felt even writing that sentence, of course no one cares. You didn't care.

My CV used to say things like "deployed a deep learning project in collaboration with Google engineers" and it had sections like this:

Some of the most talented people I know in Australia have told me that this would qualify me for an instant interview on their teams, but this CV does not work because the person reading your CV will not care about the craft.
If someone that does care reads it, it will be after four untalented people decided it was allowed to land on their desk, and at that point they're going to interview you anyway so your CV doesn't matter. The ideal CV starts with lines like this:

Five (5) years expert skills in cloud database development and integration (Databricks, Snowflake) using ETL/ELT tools (dbt and Airbyte) and deploying cloud computing with AWS (EC2, RDS) and Azure (VM) cloud platforms

The rest of the CV should be more lines like that, nothing else matters. A senior talent acquisition nerd at McKenzie told me that CVs should be one page, because it shows that the candidate is concise. Their counterpart at another agency said that you need three pages or you can barely get to know the candidate. Which of them is right? Neither of them has any idea what they're talking about, because both of them are just eyeballing it, coming up with post-hoc rationalizations for their behavior that ignore the real hard question of why they specialize in hiring talent in fields that they cannot describe.

I now trend towards a three page CV for no reason other than it looks like I must have more experience if it won't fit on one page, and it gives me more space to put buzzwords in. And when I say buzzwords, I mean you need the room to write things like "Amazon Web Services (AWS)" because some of the people reading the CVs do not know they are the same thing. Act on the principle of minimum charity, and accept that this version of your CV will never get you a great job. We know what we're optimizing for at this stage, and it isn't amazing colleagues, it is the ability to refill your coffers very quickly and with minimal pain.

Okay, but which buzzwords do you pick? If you hop onto a job search platform, you are going to see many jobs that are essentially asking you to cosplay as a software engineer. For example, I have just hopped onto Seek and punched in "data engineer", my own subspecialty.
This immediately yields this job from an international consultancy whose frontpage reads:

GenAI is the most powerful force in business—and society—since the steam engine. As software and code generate more value than ever, every worker, business leader, student, and parent is now asking: Are we ready?

Wow, that sure is something! I think I speak on behalf of all of us when I say "please stop, you're hurting us". Also it looks like there isn't a single mention of AI on their website in 2022, so I'm really impressed that they've become experts in a novel technology just in time to cash in on over-excited executives. But what does the actual job listing entail?

Proficient in Azure Data Factory, Databricks, SQL Server Integration Services (SSIS), and SQL Server Analysis Services (SSAS).

And from this, by mental force, I can tell you everything you need to know about the job. My third eye is fully open, and the recruiting department's pathetic attempts to ward off my psychic intrusions are but tattered veils before a hurricane. They are almost certainly recruiting data engineers for a company with a very weak IT department, probably a government client, that is in the middle of a failing cloud migration. SSIS is the phylactery of a millennia-old lich-king, a piece of software that runs out of SQL Server on old government data warehouses everywhere. The first time I had to fix an SSIS production outage, the senior engineer on my team told me to "untick all the boxes on that screen, then re-tick them all and click save", and that actually solved the problem. The entire point of a cloud migration is to stop using SSIS and use something better, but that would require you to be good at your job, so instead consultancies sell Azure Data Factory. Azure Data Factory is notable for having been forged in the hottest furnace in Hell.
The last time I used it, I clicked "save" on a slow internet connection and it started to open and close random panels for five minutes before saving my work, which I can only assume means that the product has to open every component on the front-end to fetch data from the DOM to populate a POST request... which is, you know, certainly one way that we could do things. Why use something so bad? It's because Azure Data Factory can be used to run SSIS packages! So now you're on the cloud, and have a new bad service running your old bad service, all without actually improving anything! And of course, they are both tools that do not require programming, so the consultancy can sell you a team of non-programmers for $2,000 per day. I've worked alongside one of these teams. They had one good developer who desperately tried to handle all of their work at once, and I shit you not, four "engineers" that spent eight hours a day copying-and-pasting secrets out of legacy SSIS packages into Azure Data Factory's secret manager for weeks on end.

With a bit of experience, most job listings are simply an honor roll of dead IT projects. And because many executives hop onto the same bandwagons at the same times (but call it innovation), there seem to be specific patterns for the type of cog that companies are pursuing at any given moment. The friend who gave me most of the tips in here has an "Azure Data Engineer" CV, where he removes all mention of AWS work he has done so that government recruiters don't hurt their pretty little heads, and vice versa. Companies on Azure want Databricks because you can spin it up from the Azure UI, and companies on AWS similarly use Snowflake because of groupthink. Just smash those words onto the page.

Every field can think of some variation of this. If you're a data scientist, it'll be a few common patterns to try to cram LLMs into things. If you're a front-end developer, it's probably going to be a soup of React and its supporting technologies.
Again, no one reading your CV until the final stage will know anything. Once, a recruiter had coffee with me, and they asked me why Git is such a popular programming language. I write my CV in Overleaf because I can make faster edits during the early phases of figuring out which patterns work, and fidgeting with layout is probably the most annoying part of any sort of CV-writing.

This is a tough situation. What I did was look up a few "easy" jobs, like data analyst, hop onto LinkedIn, navigate to that company's page, then navigate to someone that looks like they might be leading the relevant team. Do not message HR. They, as a rule, do not have human frailties like mercy and kindness while they are at work. Go straight to someone that actually cares if you are good at the job, and impress upon them that you are a real person, who either has a very cool life story about changing career pathways late in life, or who is an adorable graduate UwU. If you are super, super, super desperate, my company has an unpaid internship for graduates that really, really, really think that they just need a tiny bit of experience to get taken seriously.

Once you're done scrubbing all signs of personality or competence out of your CV, leaving only eight (8) years of experience, what then? If you're going to be doing this all the time, how do we make it relatively painless? The first thing is to hop onto a bunch of job platforms and upload the CV. That's simple enough. This means that recruiters will start reaching out to you every week or so, and some of them will have jobs for you. 90% of them will fail to secure you an interview. I start the conversations with something like "In the interest of making sure we're making good use of time, what's the expected compensation for the job?". They'll say a number. If the number isn't high enough, thank them for their time and hang up. Don't waste your time.
They will not present you to the client if you ask them to do any work beyond sending a CV and collecting a commission — from their perspective, you are cattle to be sold. If I was desperate, I would take the first job offered to me at any pay, then not slow my search down at all. Most contracts allow you to quit on very short notice, so use that against the employer instead of having it used against you for a change.

The second thing is that you can start testing out the CV in the lowest-effort manner possible. The recommendation from my friend that has experimented the most is to grab a job platform's app on your phone, and to apply to maybe three or four jobs every morning. Don't bother with ones that ask you to make accounts on new job platforms, or write cover letters, or anything like that. Save a filter that removes anything you aren't willing to do, whether it's pay that's too low or a long commute. Err on the side of being picky, and do this every workday, even if you already have a contract. If the list of jobs becomes empty, then you must either relax your constraints or move to a new area. Sorry!

If you get as far as a call with a human and are later rejected, ask them, especially recruiters, what employers want to see. They will tell you which buzzwords are good. If there is any conceivable way that you can claim to have experience in an important buzzword, write it down. This is incidentally how the strain of doing this is best managed — by not doing anything more arduous than reading a few jobs and clicking "apply", then not thinking about it until the next day. Don't apply to so many that it feels like even a bit of an ordeal. Do not let rejection affect you; most of the people involved in this process do not deserve your respect in this instance. I am sure they are lovely husbands and wives and sons, and we don't care right now.
It has taken up to two months before calls have started rolling in, and that is why I'd suggest doing this more-or-less constantly, even when settled into a contract role. You want to know if jobs have suddenly started to dry up, or if you need to make adjustments to your buzzword soup. A fair number of these jobs won't do any sort of diligence. The interview will be fine. Questions will sometimes be on the level of "Do you know Python?", a real question that a real director asked me before paying me hundreds of thousands of dollars. I've done a few more unpleasant interviews, detailed here, but at this point they don't bother me. If I found myself in another one of these situations, I would hang up mid-call. Eat your assertiveness vegetables, they'll put hair on your chest.

Quit. You got this job, you'll get another job. Don't quit, duh. Listen man, I didn't design the industry, but I rolled brown skin and an Indian name at character creation. I'm just doing what I've gotta do.

All those jobs will be mediocre, but you won't feel like any particular person has too much power over you. But still, the second job market is where you actually want to be. This is the promised land where people have functioning test suites, the executives know something about the work being undertaken, your colleagues are not Senior Void Gazers who have been so thoroughly beaten down by the industry that they dully repeat "it's a living and I have kids", and as a bonus you're probably paid about 50% more. It is so totally divorced from the first job market that people in it sometimes do not understand that the first job market exists. Famous Netflix-guy-turned-Twitch-streamer, The Primeagen, has never even heard of PowerBI, which is probably the most popular analytics tool on the planet. These people are blessed. It is not accessible via Seek. It is accessible entirely through having well-placed friends and a reputation for being a cool person with a modicum of self-respect.
You can't generate these by pulling the "apply for job" lever over and over. This way you don't have to pray that your friends' companies are hiring at any given moment, you'll just always know that you've got an interview every few weeks. Because getting in here isn't very predictable, this section is general advice in no particular order.

If the company asks you to do Leetcode stuff, my opinion right now is that they're probably at least a bit serious, but I don't think a place that asks you to grovel before entry is a great place to be along non-technical dimensions. Erik Dietrich calls this type of interview "carnival cash", rewarding compliant employees and middle managers with the opportunity to terrorize their fellow humans instead of with money. I'm not that sure about this point. I'd probably be bad at a Leetcode interview, so I'm biased against them. Maybe they're correlated with high quality programming performance in some way that I don't understand.

People often say "I don't have any connections" or "My network is terrible". This was a 0% judgement zone earlier. It is now a 0% sympathy zone. There is a phenomenon I refer to as "trying to try". It can be broadly summarized as any set of behaviors where someone has not seriously engaged their brain, does not really believe that they're doing anything with a serious chance of success, and is more-or-less just looking for reasons to say that they tried but failed. This happens in subtle places — for example, when training with beginner sabre fencers, you can stand perfectly immobile and they will very consistently hit your blade instead of you. They are so panicked and upset that their body is not trying to win, simply going through the motions of what fencing looks like. This manifests in all sorts of ways that I'll talk about one day, but it's so apparent in the job search. "I've applied for fifty jobs and no one responded".
A good indicator that you're trying to try looks like this: Most people tell me they've applied for jobs and didn't get responses. Slightly savvier people tell me they've sent some cold emails out. Some people beyond that say they've started attending Meetups but had no luck. None of them have done anything remotely interesting or otherwise indicative of novel thought.

I got my first programming job by emailing Josh Wiley from my psychology degree, a man who did not know me at all, but I had been in one lecture with him, and his wife was the only senior academic honest enough to tell me not to undertake a PhD. I still have the original email. We had a brief back-and-forth, and two weeks later one of his colleagues said "One of my PhD students is freaking out because they can't process some data in R", and that got me my first paid programming job, processing microsaccade data in sleep-deprived drivers.

A few weeks later, I saw that a data analyst job was up for grabs at a nearby university. The smooth-brained thing to do would have been to apply via Seek and get ignored. I instead went on LinkedIn, looked up the company, looked up the word "lead" and cold messaged someone who seemed like they might have something to do with the job. This led me to Dave Coulter, who I still catch up with every few months, and a job offer that let me skip straight to being a mid-level engineer. During the interview, when they asked "Have you programmed professionally?", I described the microsaccade project and they hired me. I didn't mention that it was about thirty hours of work in total, and they didn't ask.

I actually lost the original position to someone with six more years of experience, who was offered the original data analyst role or a much more highly-paid contract. They wanted the stability, so they took the permanent role, leaving me with a massive pay bump for the contract role, and we both quit at the same time anyway.
And they do not conceptualize it as losing tens of thousands of dollars, but they were functionally unemployed for months relative to what they could have earned. Score one for contracts. Those still ended up being mediocre jobs, but I just wanted to illustrate that there is a level of trying that looks more like "there is a gun to my head and I'm willing to do unorthodox things to survive", and the people that email me for jobs have never reached the unorthodox part of that. Presumably the people that do reach this point do not need to email me for jobs.

I woke up this morning to an email from Dan Tentler from Phobos about safe ways to run Incus with NixOS images pre-loaded with Airflow and an overlay VPN to client sites. Dan learned about Phobos from a group of hackers in Oslo. I learned about overlay VPNs in December from the CTO of Hyprefire, Stefan Prandl, when asking for advice on network security. I have a discussion about something like this every day, even if it isn't in the tech space. Before that, at a relatively "decent" engineering company, the most complex discussion I had was trying to explain to someone that their Snowflake workload was crashing because they were trying to read 2M+ records into a hash map, and that this takes a lot of memory.

I have learned more in the last three months than I have in the previous three years, basically along every dimension of my profession. I'm trying to catch up for years of working with mediocre performers, and it's hard. It's definitely doable, and remember that I'm doing this while spending half my time on sales so you can do it faster than me, but there is a real cost to not working with really great people. I've studied hard over the past few years, but nothing comes close to just having awesome people around you. This matters because really good teams don't hire total scrubs that haven't taken control of their education. The first job market does not reward skill or personal development.
The second one does actually require you to be good. The best offer I've received from a good company (A$185K) was obtained not through Seek, but by meeting my current co-founder Ash Lally during the preparation for a game of Twilight Imperium IV where I absolutely smoked everyone. The only other place that I've considered might be acceptable to settle down, much better than the offer I received through Ash, was the result of getting coffee with a local reader, then eventually being invited to drinks with their team a few times. We mostly talked about split keyboards and Star Citizen.

It has been a few months since I quit my last job, and I used to say all sorts of conciliatory things like "Sure, that engineer is terrible, but most of them are good!", but money talks. I only offered one of them a job with me. In retrospect, most of them had the potential to be good, but enough years in a typical corporate setting will ruin this. When I was 20, people were happy to hire me because I had potential. Now, potential is still important, but it's important that I've at least demonstrated that some of it is manifesting.

Many engineers have pathologies that I think make them unsuitable for work on a healthy team, in the same way that some people need to do some self-work to enter a healthy relationship. For example, I know many people who feel guilty taking time off, so they'll burn themselves out without someone constantly getting them to slow down. I'm sympathetic, but a team as small as mine doesn't have time to walk someone through that level of self-harm and still deliver for clients reliably. We help each other through lots of little quirks we need to deprogram out of corporate contexts, but we need to be starting from a place of some progress.
An example that Modal's Jonathon Belotti sent my way is that Modal's most high-performing team members will get a two week deadline, then confidently spend the entirety of the first week reading a book on the technology they're about to use. Most engineers I know, including myself a few years ago, would rather hack incompetently for two weeks. The essential reason for this is being too underconfident to act on our beliefs about how engineering should be done (or worse, not having those beliefs at all), and we'd rather fail in the approved, visibly-working fashion than risk looking unorthodox. "I programmed the whole two weeks and failed!" feels easier to justify than "I read a book for one of those weeks and failed!". But team members should be picked for their judgement, and they are good for the team in proportion to the quality of that judgement and their willingness to exercise it in the face of orthodoxy. People are awful at asking for work. Here is how I advise people do it. If you don't have a good time, just leave it be. You're here to rekindle old relationships and meet interesting people, and maybe they can help you out. The moment you start asking people for help that you don't even want as friends is the moment that the entire endeavour becomes sleazy. I think of each person I know (in the context of job searching) as some sort of machine that randomly spits out jobs in a uniform distribution over a year. Let's say each person has a 5% chance of turning up a job every month, maybe more or less depending on the market. If you want an 80% chance at a job every month, you have to have enough people with you in mind that you're rolling the dice enough times to get that number. Many people tell me that they attended a few Meetups and had no results, even though that's what you're supposed to do. It's good that they tried, but most large Meetups seem to be populated by people who are ineffectually looking for jobs. Don't be ineffectual.
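The dice-rolling arithmetic above is worth making concrete. Under the stated (and admittedly rough) assumption that each contact independently turns up a job with probability p per month, the chance that at least one does is 1 - (1 - p)^n:

```javascript
// Working out the "rolling the dice" arithmetic: if each of n contacts
// independently turns up a job with probability p per month, then the
// probability at least one does is 1 - (1 - p)^n.
const p = 0.05;
const chanceAtLeastOne = (n) => 1 - (1 - p) ** n;

// How many contacts do you need for roughly an 80% monthly chance?
const needed = Math.ceil(Math.log(1 - 0.8) / Math.log(1 - p));

needed;                // 32
chanceAtLeastOne(32);  // ≈ 0.81
```

So at a 5% rate you need on the order of thirty people who have you in mind, which is the point: a few Meetup visits is nowhere near enough rolls of the dice.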
Large Meetups were frustrating when I was a student because everyone interesting was swarmed by students trying desperately to look employable without being needy, and it is frustrating as a non-student because now I get swarmed. People go "Oh, I am a data scientist, I will go to the Data Science Meetup". That's better than not going out at all, but strictly inferior to going to a tiny Meetup with ten nerds that are deeply into Elixir or some other niche bit of technology. You will form real connections with the latter, and the fact that you know what Clojure is will be enough to make many people at such a place want to work with you. If you are in a city with a functioning tech industry and can't think of any interesting technology, then it's going to be really hard to justify why you deserve a spot on a good team, so maybe solve that problem first. If you're decent at writing and have opinions on something... write. It's amazing for meeting people. I have several readers that have sent me their writing, and without any intervention from me, about 30% of them hit the front page of Hacker News on the strength of their material. There are surprisingly few people putting out good material on almost all topics, especially in the age of LLM slop. Reader Mira Welner wrote about something as generic as Git checkout and hit the front page. Bernardo Stein, mentioned in various places on the blog as the guy that coaches me through my worst engineering impulses from my corporate career, has front-paged by writing about NixOS. Nat Bennett, who I've been getting advice from for months and am now hiring to coach my team, front-paged Hacker News writing about the notebook they keep at new jobs. Even Scott Smitelli, who I quoted earlier for having written this fantastic piece, emailed it to me, and before I could finish reading it people were already recommending it to me through other channels.
It's super easy to meet people through writing if you aren't afraid of pushing out your real opinions, and indeed, you will see extremely stupid comments on all of the above writing, so you will need to be unafraid. Fine. Tell people that you, personally, are ChatGPT. Someone else may lose their job and think "I wish I hadn't listened to Ludic, he is so stupid and foolish", but I refuse to acknowledge them. The main one is Gary Donovan, who I didn't even meet in the wild. I met a reader for coffee, and that reader worked with a really nice engineering company. That company said Gary is their favourite recruiter. The first time I called him, I said something about Lisp and it turned out he had a copy of The Little Schemer in front of him at that very second, and we later had a great talk about engineering culture in F1 over ramen. I am still reeling at the implications in neuroscience of a recruiter that can read — is it possible that some of them are sentient?

Jimmy Miller 12 months ago

Dec 14: Bidirectional Type Checking

Type checking is one of those topics in programming that I feel is often taught in a way that makes it rather confusing. Hindley-Milner is a rather elegant, but confusing algorithm. Seeing how easy it is to make in miniKanren certainly didn't give me any insight into how it works. When you dive into papers about type checking, it's full of formalism and very little code. For academics, I'm sure that's fine. However, for an industry programmer like myself, I often find it rather difficult. This paper (Bidirectional Typing Rules: A Tutorial by David Raymond Christiansen) is a great contrast to this usual setup. It isn't lacking in formalism, but provides a nice discussion of the formalism and some simple translation to code! There's even a talk, which is how I first discovered this paper (though there are some audio issues). The talk goes into a few more details. But this paper is still a great introduction. Bidirectional type checking is a method somewhere between full type inference and no type inference. An imprecise characterization of this is that you only have to write type annotations on top-level entities, but not on every variable. So, for function definitions you need to write your types, but not much else. The bidirectionality bit gets its name because this algorithm works by checking types and inferring (or synthesizing) types in an interleaving fashion. If we know a type and we have a value, we can check that value against the type. If we don't know a type but have a value, we can infer the type and then check it. This may seem a bit simplistic. When I first saw it, I thought there was no way this kind of setup could work, but it does! Consider the very simple example: We can very easily infer the type of the literal and then we can check that it matches the annotation. This very simple idea scales up incredibly well to larger structures. Just from the features laid out in this paper, I was able to implement a simple bidirectional type checker!
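To make the check/infer interleaving concrete, here is a minimal sketch in JavaScript (not the author's linked implementation; the term representation and type encoding here are my own assumptions):

```javascript
// A minimal bidirectional type checker. Types are either the string
// "Nat" or { arg, ret } for function types.

const typeEq = (a, b) =>
  a === b ||
  (typeof a === "object" && typeof b === "object" &&
   typeEq(a.arg, b.arg) && typeEq(a.ret, b.ret));

// infer: we don't know the type, so we synthesize one from the term.
function infer(ctx, term) {
  switch (term.kind) {
    case "num": return "Nat";            // literals synthesize their type
    case "var": return ctx[term.name];   // variables are looked up
    case "ann":                          // annotations switch to checking mode
      check(ctx, term.term, term.type);
      return term.type;
    case "app": {                        // applications infer the function...
      const fnType = infer(ctx, term.fn);
      check(ctx, term.arg, fnType.arg);  // ...then check the argument
      return fnType.ret;
    }
    default: throw new Error(`cannot infer ${term.kind}`);
  }
}

// check: we already know the type, so we push it into the term.
function check(ctx, term, type) {
  if (term.kind === "lam") {             // lambdas can only be checked
    check({ ...ctx, [term.param]: type.arg }, term.body, type.ret);
  } else {
    const inferred = infer(ctx, term);   // otherwise fall back to inference...
    if (!typeEq(inferred, type))         // ...and compare
      throw new Error("type mismatch");
  }
}

// (λx. x) : Nat -> Nat, applied to the literal 1, infers Nat.
const id = { kind: "ann",
             term: { kind: "lam", param: "x", body: { kind: "var", name: "x" } },
             type: { arg: "Nat", ret: "Nat" } };
infer({}, { kind: "app", fn: id, arg: { kind: "num", value: 1 } }); // "Nat"
```

Notice how small the checker is: the only real decision is which judgement each syntactic form belongs to.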
I actually did it twice. Once in a terse way in Clojure using a library called Meander that a friend of mine wrote. (Maybe I need to find a good paper on term rewriting; not sure I ever found one of those.) Then I decided to translate that into more verbose, but maybe a bit more familiar, Javascript. This post was rather short as I didn't have a ton of time today. But I promise, this paper is super readable. It is full of code translations and hints on how to read things. For example: Translating the formalism from above into the code below makes this whole thing way more understandable! I know there are reasons academics prefer the notation they use for these things, but every time I see it translated into code, I just imagine an alternative universe where that didn't happen and how happy I'd be.

Jimmy Miller 12 months ago

Dec 11: On Understanding Data Abstraction Revisited

One of my meta-criteria for choosing a paper in this series is the likelihood we will cover the paper on Future of Coding. I won't completely rule out covering papers I think we will do at some point. But for the most part, I've decided to steer clear of them. This paper is one I don't think we'd cover on a podcast because I'm not sure it makes for great conversation. This is one of the difficult things about the podcasting medium. Or at least the medium as I conceive it. Not all papers lend themselves to a free-wheeling discussion about the paper. A lecture might work for a paper, and a blog post almost certainly works, but some things just don't come out well in a discussion. One class of papers I don't think translates well is those that just tell you the facts of the matter. I don't mean, of course, that there isn't bias in these papers, but simply that the goal of the paper isn't to convince you of anything, just to tell you about things. In some ways, this paper doesn't quite qualify. It is in fact an argument, but not about something I'd consider a real, live debate for me. Are objects and abstract data types different things? I'm not sure I really care about that distinction, and yet this paper has stuck with me ever since I read it. So, I want to cast this paper in a different light for you readers who may, like me, not really care about objects vs. ADTs. Rather than make a division between objects and ADTs, I want you to view this paper as bridging a divide between the values held by object-oriented programmers and those held by functional programmers. Object-oriented programmers value the ability to have multiple implementations of objects interoperate together. They want systems to be extended, not modified. Functional programmers value immutability. They want to be certain that the values they have cannot change out from under them. Until I read this paper, I had believed the view that objects were about encapsulating state.
In this paper, I see a very object-oriented way of programming that constructs completely immutable objects. In keeping with my now-adopted goal of not replacing the paper but instead enticing you to read it, I will leave off what William R. Cook has to say about the difference between ADTs (Abstract, not Algebraic) and Objects. It is actually quite interesting. He shows how the standard Java way of programming has us build ADTs rather than Objects. But I won't explain further. Instead, I want to show his example of what an interface for an object-oriented way of defining sets might look like. Here, in our fictional language, is an object-oriented interface for defining a set. We have not defined in any way a concrete implementation of a set. We have just talked about the operations a set must have. This is one of the keys, Cook tells us, to the object-oriented style. Most notably here, there isn't a means of constructing a set. These will come from objects that conform to this interface. Consider this implementation. Here we have the empty set (in pseudo-Rust). It contains nothing. It mutates nothing. To insert things into the empty set, we just return a new set called Insert. To union, we just pass whatever set we want to union with. What do Insert and Union look like? Here we see no inheritance, no encapsulating of state. Just completely immutable objects. But we also see object-oriented principles upheld. Most notably, these objects only know about themselves; they do not inspect the representation of the arguments they are passed except through their public interfaces. Cook calls this Autognosis. Autognosis means 'self-knowledge'. An autognostic object can only have detailed knowledge of itself. All other objects are abstract. The converse is quite useful: any programming model that allows inspection of the representation of more than one abstraction at a time is not object-oriented.
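The Empty/Insert/Union style described above translates almost directly into JavaScript; here is a sketch (my own illustration, not the paper's code), including a hypothetical "infinite" set to show where the style gets interesting:

```javascript
// A sketch of Cook's object-oriented, immutable sets. The only shared
// "interface" is contains/insert/union; each object knows only itself
// (autognosis) and never inspects another set's representation.

const Empty = () => ({
  contains: (_n) => false,
  insert: (n) => Insert(Empty(), n),
  union: (other) => other,
});

const Insert = (set, elem) => ({
  contains: (n) => n === elem || set.contains(n),
  insert(n) { return Insert(this, n); },
  union(other) { return Union(this, other); },
});

const Union = (a, b) => ({
  contains: (n) => a.contains(n) || b.contains(n),
  insert(n) { return Insert(this, n); },
  union(other) { return Union(this, other); },
});

// An "infinite" set that stores almost nothing: all the even numbers.
const Evens = () => ({
  contains: (n) => n % 2 === 0,
  insert(n) { return Insert(this, n); },
  union(other) { return Union(this, other); },
});

const s = Empty().insert(1).insert(3).union(Evens());
s.contains(3); // true
s.contains(4); // true, via the even set
s.contains(5); // false
```

Note there is no constructor shared across implementations and no mutation anywhere: inserting just wraps the old set in a new object.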
But I must admit, these objects are not the ones that got me so excited about this idea. They are fairly straightforward, but when you embrace this style of behavioral abstraction you can make some much more interesting implementations. Here are sets that not only are completely immutable, but also contain a large quantity of things (infinitely many) while storing almost nothing at all. I actually made my own little implementation of a language like this in Clojure and wrote quite a number of things in it. The paper explains all sorts of limitations of this particular interface for sets and the tradeoffs that come with object-oriented interfaces. How do you add the intersect operator to this interface? You can't. Why? Because there is no way to tell if the intersection of two sets is empty without iterating over the sets. But how do you iterate over these infinitely large sets? I find all of these tradeoffs fascinating and wish this was the kind of stuff talked about with Object-Oriented software (in before someone tells me that of course this stuff is talked about everywhere and I'm just not reading it). I'd highly recommend trying to write some software in this style. It's a fascinating set of constraints that make for some really interesting programs.

Jimmy Miller 1 year ago

Named Function Composition

Some time ago I released a little library on NPM. I've had some mixed feelings about my creation. I know that no one has used it or will, and anyone who looked at it would probably dismiss it. In fact, if I hadn't written it, I would do the same. And yet, I think I've stumbled onto a fairly decent idea. By decent idea, I mean a hack. But before we dive into this hack, let's look at the situation that gave rise to it. There is this fantastic, little known library called Zaphod. The idea behind Zaphod is to mirror Clojure's immutable data API. This makes it incredibly simple to do immutable updates on plain javascript objects. The way I've written the code above is actually not the default way Zaphod works. I imported only part of Zaphod. By default, the functions are exposed to take advantage of the function bind operator. This is actually some really neat functionality. It allows you to chain your operators together. We can build pipelines by continuing to bind. Unfortunately, we don't get function bind syntax for free. Function bind is still a stage 0 proposal. This means there is a very good possibility it will never make it into javascript. In fact, after a few years of sitting at stage 0, it is basically considered dead. There is quite a lot of risk involved in using it, and more conservative configurations wouldn't use it. But function bind syntax also has flaws even if it were accepted into the language. Function bind syntax abuses the most misunderstood keyword in all of javascript: this. The functions you write with function binding in mind must use this; they can't be normal functions. Of course, you can wrap up those functions, but if we need to wrap functions up, why not wrap them in a way that doesn't require function bind? This is where the library comes in. Let's look at an example. Here we see the function in use. This allows us to take a collection of functions and wrap them up into a fluent interface. But what does this fluent interface do?
It is just function composition. After calling it, we get a function back. We can now use this function to pass our data through the pipeline. Since what we get back is just a function, we can also pass this function around. We can see its use on line 14 as just a normal function that lets us perform a series of transformations on data. This is a fairly simple use, so let's take it one step further. Here we can see a combination of two totally separate libraries. In fact, I even used it because, rather than taking its primary argument first, it takes it last. Yet, we were still able to compose these libraries in a simple, yet flexible way. Yet, it holds still more power. This time, we will be using some of the lower-level features; explaining them here would be beyond the scope of this post. Using it we've made a fluent reducer for redux! No longer would we need to write switch statements in order to make a reducer. In fact, since it just makes functions, you can use this reducer with combineReducers. But another really cool thing you can do with it is add on to the reducer after the fact. One feature to note with this implementation is that it actually short-circuits; as soon as it finds the action that matches the type, it returns, so there is no wasted computation. I really do think this library is really useful, but at the same time, I can't help but feel a little weird about it. In order to make this library work, I have to take advantage of the fact that functions are objects. I am making a function and then assigning methods to it. This is definitely a strange thing to do. Now, I do avoid mutating the functions passed in; I "copy" them before I assign properties to them, but it still feels like the wrong means for accomplishing the task of creating a pipeline. In fact, that is the thing that makes this library a hack; it is the wrong means. This library was created out of the limitation javascript imposes on us.
How would we accomplish similar things in other languages? Here are just a couple of examples. Above we see how we could accomplish similar things in Haskell and Clojure. Almost all functional programming languages have a way to do this. In fact, there are some much more powerful techniques for function composition in both Haskell and Clojure. At the same time, this method has some interesting features all on its own. What we have done is allow our functions to have special ways in which they compose. Each function can determine for itself special composition points. At each point along the way, we keep these composition properties, allowing us to compose further. Each of these composition methods has a name, hence "named function composition". While born out of necessity and implemented as a hack, there is something here, something interesting that might be worth exploring further. (Addendum: It has been two years and I've yet to explore it further.)
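The hack described above, building a fluent interface by attaching methods to a function, can be sketched in a few lines. The `fluent` helper and the `ops` functions below are hypothetical names of my own, not the library's API:

```javascript
// A sketch of "named function composition": wrap a collection of named
// functions into a fluent pipeline. The pipeline is itself a function
// (plain composition of the accumulated steps), and because functions
// are objects, we attach one method per name onto it.

function fluent(fns, steps = []) {
  // Calling the pipeline threads data through every accumulated step.
  const pipeline = (data) => steps.reduce((acc, step) => step(acc), data);
  // Each named method returns a NEW pipeline; nothing is mutated.
  for (const [name, fn] of Object.entries(fns)) {
    pipeline[name] = (...args) =>
      fluent(fns, [...steps, (data) => fn(data, ...args)]);
  }
  return pipeline;
}

// Zaphod-style immutable update functions (data-first here, for brevity).
const ops = {
  set: (obj, key, val) => ({ ...obj, [key]: val }),
  unset: (obj, key) => { const { [key]: _, ...rest } = obj; return rest; },
};

const transform = fluent(ops).set("a", 1).set("b", 2).unset("c");
transform({ c: 3 }); // { a: 1, b: 2 }
```

Since `transform` is just a function, it can be passed around, composed further, or handed to anything expecting a reducer-shaped callable, which is the whole appeal of the hack.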

Jimmy Miller 1 year ago

Meander for Practical Data Transformation

As Clojure programmers we love data. We believe that, at its core, programming is just data manipulation. To that end, Clojure provides fantastic data literals for its immutable data structures. Moreover, clojure.core provides tons of functions for the manipulation of data. But as our data grows more complex, things become difficult. Our beautiful declarative data transformation pipeline becomes a nested mess. We wind up, yet again, playing computer in our heads. In this tutorial, we are going to build up slowly to understand how Meander can be used to solve practical data transformation problems. We will start with simple examples and move to more complicated ones, hopefully choosing problems that reflect the sorts of menial data transformation tasks we all encounter in our day jobs. Let's start with some vanilla Clojure code. Here we have a pretty decent Clojure function that converts between two different address formats. This sort of code is fairly common when we need to convert from the data requirements of one system to another. Honestly, with this simple example, the code is fairly straightforward. Our data requirements are simple and so our code isn't difficult. Let's look at how to accomplish this same task in Meander. Here is code that does the same thing written with Meander. One obvious thing to note is that the Meander version is much longer. Judging code based on the number of lines is not something we are going to do. Let's explain what is going on. First, we are using Meander's match feature. match takes the thing that we are matching on, a pattern to try to match, and the output. Our pattern here is in the exact shape of the person map we passed in. In order to extract out pieces of this map, we use logic variables. Logic variables are just symbols that start with ?. We can assign values in our data to any logic variables we'd like and then use those logic variables in our output.
One thing I love about this simple Meander example is that you can see the exact shape of the input immediately. This example, while somewhat realistic, is very limited. While I like the fact that Meander's match shows us the shape of our data, for simple examples like this, Clojure does pretty well. Let's make things harder. In the example above we left out some things. A person has a preferred address, but they also have other addresses. We have a few different things we want to do with this data. First, we want to find all the distinct zip codes that a person has. Here is some pretty straightforward Clojure code for doing exactly that. I'm sure some people could have minor quibbles about how this is written, but I doubt other solutions would be much different. One thing to note here is that we have lost a little bit of the structure of our input data. We could maybe change that up a bit. Maybe using destructuring is the right approach? Regardless, this is a simple and reasonable Clojure function. Now, let's look at the Meander version. Here is the exact same function, but we've introduced two new concepts. The first one is memory variables. Memory variables start with ! and remember all the values they match with. The next concept is the zero or more operator (...). The zero or more operator says to repeat the pattern to its left zero or more times. Using these two, we can declaratively gather up all the zip codes in this data structure. What happens if one of our zip codes is nil? Well, for both of our functions, nil gets returned in the output. That is probably not what we want. Let's fix that in both versions. These two functions aren't that different. In Meander we could have used a filter in the exact same way if we wanted. But it's nice that we can set these conditions on the input, which is really more closely stating our intent. Here we used a short-circuiting or operator, which says that we should match one of these patterns.
Our first pattern is just the literal nil. If it is nil, the pattern will match, but it won't be saved anywhere. If the value isn't nil, it will be saved in our memory variable. Before we move on to more complex examples, let's consider one more modification. This time we want a distinct list of non-nil zips and cities, output in a map. With both of these examples, I extended them in the most obvious way I could think of. I think the Meander version held up pretty well, but I wouldn't have written the plain Clojure function that way. Here's what I probably would have done instead. I think this is a pretty good function. But what I find interesting is that I needed to refactor to get here. It took me a little bit to think this way. Ideally, small changes to output should not require us to restructure our code. In this case, the change is minor. But if we have to change our structure in such small cases, won't we have to change it even more in larger cases? All our examples up until this point have had one answer. Yes, that answer might have been a collection, but there was only one way for our pattern to match. This isn't always the case. To see an example of that, let's write some functions using this data structure. I apologize for the amount of room this takes up on the screen, but real-world examples are much larger. I want to try and make something that approaches realistic, and to do that our input needs to be a bit bigger. Okay, so what we want to do now is, given a zip code, find all people that have an address with that zip code, and for each of the addresses that match that zip code, return a map. So in this case, if we asked for a particular zip, we should get the following response: Okay, let's start with the vanilla Clojure example. This code might not be very idiomatic. I almost never use it in actual code. But honestly, this was the most succinct way I could think to write it.
We could also have written something like this: It seems like there is a better way I'm overlooking. But regardless, I think any of these solutions will be a tiny bit complicated. We've lost the shape of the input data. We have some imperative stuff going on here. Let's contrast this with the Meander implementation. This is actually incredibly straightforward, even if unfamiliar. We are now using search to find multiple answers. Also note that we can splice in variables that are in scope. And finally, we can name our whole map using a pattern. This code reads like what we conceptually want to do: scan people's addresses looking for zips that match the one passed in. We do not have to think at all about how this code runs. For our final example of how Meander can be used to perform data manipulation, we will show one feature of logic variables that we have left off so far. To do so we need some more complex data. Here we have some much more realistic data than anything we've seen before. We have a map with three top-level keys. These represent data we have gathered from various sources. The first key is our collection of people with names and ids. The next is the addresses of these people, indexed by id for efficient lookup. And finally we have the visits; this represents the dates that the users visited our site, again indexed by user-id. Here's the mock scenario: we've seen suspicious activity on our site and we aren't quite sure how to narrow it down. We are going to start our investigation by finding any users who had visits that were not in the same zip as their preferred address. Because of the nature of our application, we happen to know that it is typically used at the preferred location. Once we know the users affected, we need to return their name, id, the date of access, and the zip code that didn't match. But I want to show that despite this somewhat complicated scenario, we can easily express this using Meander. Before we get there, the Clojure implementation.
I really wanted to come up with a better implementation. If any reader has a better implementation, I'm happy to replace this one. But honestly, I think no matter what version we went with, it is going to have the features that make this one less than desirable. Just look at how much of this code is about telling the computer what to do. Let's look at the Meander version now. This is where Meander shines. The logic variable is being used to join across data structures. We can now find an id in people and use that to index into other collections. This allows us to find out everything we need to know about a person easily. We can also search into any collection and match on data at any level. We don't need to rely on pulling things out into a higher scope by using let bindings, making helper functions to work on sub-collections, or creating a series of transformations to get at the data we care about. Instead, we declare our data needs and the relationships that need to hold between them. I hope that this has been a good introduction to how Meander can be used for practical data transformation problems. In many of these examples, the vanilla Clojure made for pretty good code. But as the data requirements become more complex, we need tools to handle them. While we may be able to accomplish any of these tasks, the understanding of the structure of our code becomes lost. Looking at the example above, we know so much about what the data coming in looks like. Our code mirrors precisely the shape of the data we get in. Now I do admit, my examples here are a bit contrived. But they are meant to be simple so we don't focus on the examples and instead focus on the code. In coming posts, I will explore more directly various ways we can apply Meander for data transformation. Some ideas I have in mind are using Meander with honeysql to turn our data into SQL, transforming a collection of data into hiccup for display as HTML, and using Meander to scrape the web.
I'd also love to do more computer science examples. Using Meander to make a little lisp interpreter, a CEK machine, or basic arithmetic. And yet, Meander goes way beyond all of these things. Meander is about more than practical data manipulation. It is about a new way of programming, a new way of thinking about problems. Hopefully, this introduction will help you to dive in and try it yourself.
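For readers coming to this from outside Clojure, the vanilla version of the earlier zip-code exercise (collect the distinct, non-nil zips across a person's addresses) can be sketched in JavaScript; the person data shape here is my own assumption, chosen to mirror the tutorial's description:

```javascript
// A sketch of the vanilla (non-Meander) transformation: gather the
// distinct, non-null zip codes from the preferred address and all
// other addresses. The data shape is assumed for illustration.

const person = {
  preferredAddress: { zip: "10001", city: "New York" },
  otherAddresses: [
    { zip: "94103", city: "San Francisco" },
    { zip: null, city: "Oakland" },
    { zip: "10001", city: "New York" },
  ],
};

function distinctZips(p) {
  const all = [p.preferredAddress, ...p.otherAddresses].map((a) => a.zip);
  return [...new Set(all.filter((zip) => zip != null))];
}

distinctZips(person); // ["10001", "94103"]
```

Notice how even this tiny function already buries the shape of the input inside map/filter plumbing, which is exactly the contrast the tutorial draws with the pattern-shaped Meander version.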

Jimmy Miller 1 year ago

Term Rewriting with Meander

Meander is heavily inspired by the capabilities of term rewriting languages. But sadly, there aren't many introductions to term rewriting aimed at everyday software engineers. Typically, introductions to term rewriting immediately dive into discussing mathematical properties or proving theorems. These can be interesting and useful in their own right. But personally, I like to get an intuitive feel for something before diving into a formalism. That is the aim of this post: to help you have a more intuitive understanding of how term rewriting works and what it is capable of. This post will not focus on practical uses of Meander; if you are interested in that, check out Meander for Practical Data Transformation. The goal of term rewriting is to take some bit of data and rewrite it into some other bit of data. We accomplish this by writing rules that tell us, for a given piece of data, what we should turn it into. Here is the most simple rewrite rule imaginable. If we are given the input, we turn it into the output. In term rewriting, the pattern we are using to match is often called the left-hand-side and the data we return is called the right-hand-side. The data we pass in to transform is called the reducible expression (or redex for short). Admittedly, this seems almost useless, and it really is with this overly simplistic example. But let's take it slow and build it up. Here we've extended our rewrite to have multiple rules. Now we can handle more than just one input. Of course, this is still really limiting. We definitely can't list every single possible input for all of our rules. We need a way to match any input. That is where variables come in. Here we added a variable to our left-hand-side. Variables start with a ? and match any value. Whatever they match is now accessible on the right-hand-side. So we can match anything with a variable and then use it in our output. Let's see a more interesting example. Here we can see some really simple rules that work on vectors of various sizes.
We can use this to extract the first element from each. In this case, since we only care about the first element, we can actually simplify this code. The _ is a wildcard match that matches anything but doesn't bind at all. What happens if we try to extend this to work not just for vectors, but also for a single number? The order of our rules matters; a variable matches anything, so we will always get the first match. We could change the order, or we can constrain the match. Okay, now it works. But many of you are probably thinking, "Isn't this just pattern matching?". And in many ways it is. Term rewriting is a kind of pattern matching. But it doesn't stop with simple pattern matching. Term rewriting is a way to do all computation through pattern matching. To see that, let's move beyond the basics. We've seen that with Meander we can do simple rewrites where we match on the left-hand-side and output a right-hand-side. But just being able to do a single rewrite in this way is really limiting. To see this problem, let's consider a classic example in term rewriting. Zero added to anything is just that thing. We can easily express this with term rewriting. But what if we have multiple 0's nested? As you can see, the first time we apply our rules we do simplify, but not all the way. If we call our rules again, we fully simplify the expression. But how could we express this with term rewriting? We can use what are called strategies. Strategies let us control how our terms are rewritten. Let's start with an easy strategy. Strategies wrap our rewriting rules and make them do additional things. In this case, the rewriting will be applied twice. But there are a few problems with the strategy as we've written it. Let's slowly discover those problems together and fix them. Our apply-twice strategy works for things that need to be simplified twice, but not for simple cases. We can fix that by using the attempt strategy. It will try to rewrite, and if it fails, just return our value. Now it works for both.
But having it only rewrite twice is a little arbitrary. What we really want to say is: keep applying our rewrite rules until nothing changes. We can do that with a fixed-point strategy. Now we can simplify expressions no matter how many passes they need, but as we can see, we still haven't fully eliminated 0's from all our expressions. Why is that? Well, our pattern only matches things in the outermost expression; we don't look at the sub-expressions at all. We can fix that by applying another strategy, in this case the bottom-up strategy, which applies our rules to every sub-expression. We have now eliminated all the zeros in our additions, no matter where they are in the tree.

For the sake of space in our examples, we kept our rules and our strategies together, but these are actually separable. What if we wanted to try the top-down strategy with our rules? Our rules are completely separate from how we want to apply them. When writing our transformations, we don't have to think at all about the context they live in. We just express our simple rules, and later we can apply strategies to them.

But what if we want to understand what these strategies are doing? After playing around with things, it seems that the top-down strategy and the bottom-up strategy always give us the same result. So what are they doing differently? We can inspect our strategies at any point by using the trace strategy. With our rewrites modified to trace every time the top-down or bottom-up rules are called, let's try a fairly complicated expression and see what happens. If we look at the top-down approach, we can see that the top-down strategy actually gets called three times. Once, it rewrites quite a bit but leaves in a 0 that needs to be rewritten. Then it gets called again, eliminating all zeros. Finally, it is called and nothing changes. Our bottom-up strategy, however, is only called twice. But we can actually get more fine-grained than this. We can put trace at any point in our strategies. Here we moved our trace down inside our strategy.
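A sketch of the bottom-up traversal, with trace wrapped around the inner rules (assuming meander.strategy.epsilon's bottom-up, attempt, and trace; trace's exact print format is the library's own):

```clojure
(require '[meander.strategy.epsilon :as r])

(def simplify-zero
  (r/rewrite
   (+ ?x 0) ?x
   (+ 0 ?x) ?x))

;; bottom-up applies the (attempted) rules to every sub-expression,
;; so nested zeros are eliminated in a single pass.
(def simplify (r/bottom-up (r/attempt simplify-zero)))

(simplify '(+ (+ 1 0) (+ 0 2))) ;; => (+ 1 2)

;; Wrapping trace around the inner rules prints every call the
;; traversal makes, showing the exact order sub-terms are visited.
(def simplify-traced
  (r/bottom-up (r/trace (r/attempt simplify-zero))))
```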
Now we can see the exact order of our bottom-up strategy. Having this sort of visibility into how the process works is really fantastic. What we have been doing so far is interesting, but it falls short of the true power of term rewriting. Term rewriting is a general programming technique: using it we can compute absolutely anything that is computable. Let's start with a classic example, fibonacci, but to further show general computability, we will make our own numbers instead of relying on Clojure's. If you aren't familiar with defining the natural numbers as Peano numbers, this may be a little bit confusing. But for our purposes, all you need to know is that z means 0 and s means successor: (s z) means 1, (s (s z)) means 2, and so on and so forth. Our fibonacci rules start by defining addition for our Peano numbers. Zero added to anything is just that thing. Otherwise, we can add two numbers by moving the s's from one side to the other until one side equals zero. With those definitions in place, we can define fibonacci, which reads as basically just the textbook definition of fibonacci. With term rewriting, our strategies enable us to have recursion without directly implementing it. Our rules read like they are recursive, but they don't call a function. They don't cause anything to occur. They just return more data. It is the process of interpretation that makes them recursive. In fact, with Meander we are limited to what the Clojure reader can interpret, but in general, with term rewriting, the syntax doesn't matter. I wrote things in this notation merely as a convention; I could have written them any number of other ways. There is nothing special about the syntax other than what rules we apply to it. Admittedly, the fibonacci example above isn't very useful, and of course, if we had a real language, we would never want a number system like that. So why should we care about term rewriting? Term rewriting offers a powerful yet simple way of viewing programming.
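The Peano-number fibonacci described above can be sketched as follows (the rule syntax follows the conventions above; the fixpoint helper is plain Clojure rather than a library strategy):

```clojure
(require '[meander.strategy.epsilon :as r])

;; z is 0, (s n) is n + 1.
(def peano-rules
  (r/rewrite
   ;; addition: zero added to anything is that thing; otherwise
   ;; move an s from the left number onto the right one.
   (+ z ?n)      ?n
   (+ (s ?m) ?n) (+ ?m (s ?n))
   ;; fibonacci, read straight off the textbook definition.
   (fib z)          z
   (fib (s z))      (s z)
   (fib (s (s ?n))) (+ (fib (s ?n)) (fib ?n))))

;; Keep applying the rules everywhere until nothing changes.
(defn fixpoint [strat]
  (fn [t]
    (let [t' (strat t)]
      (if (= t t') t (recur t')))))

(def run (fixpoint (r/bottom-up (r/attempt peano-rules))))

(run '(fib (s (s (s z))))) ;; fib 3 => (s (s z))
```

Notice that the rules never "call" anything; the strategy interpreting them over and over is what produces the recursion.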
Term rewriting gives us the potential to take the lisp mantra that code is data and data is code much more seriously. How so? First, in lisps, functions might be values, but they are opaque. Evaluating a function definition returns something you can't inspect directly, something you can't directly transform (in Clojure, an opaque function object). With term rewriting, things can just remain data, because we have separated execution from description. Not only can our "code" be data more than it can in lisp, but we can actually have our execution as data. Executing a term rewriting rule is just taking in data, matching on it, and producing more data. That means all our intermediate values are data; the entire execution of our program becomes data. Have you ever run your program and had no idea where a certain value came from? Imagine if you could just ask your language to pattern match on every intermediate value that contains that value. Or maybe: give me the last 5 steps that led to this value. With term rewriting this is entirely possible.

Term rewriting also gives us an easy basis for talking about partial programs. Our current programming languages have a problem: when they encounter something they don't understand, they just blow up, not telling us anything. Consider a program that calls a function named unimplemented in one branch. What does the program return? Well, as its name makes clear, unimplemented is, in fact, unimplemented, so most languages will just throw an error. That can be what we want at times. But as people, we can look at such code and say something more: we can often tell what the rest of the program would produce around the unimplemented part. Why can't our languages tell us that? Why can't we start writing partial programs and run them, continually refining things as we go? Term rewriting gives us this ability. Term rewriting represents a distinct way of programming. It offers us a uniform way of dealing with data. It gives us the ability to think about things as syntactic structures.
It offers us a way to truly have code as data, to go beyond the arbitrary distinctions imposed by our languages about what can and cannot be manipulated. It is a fundamental shift in how we view programs. It gives us new perspectives, new ways of thinking about how code executes and what our programs mean. Meander isn't at that point. But it is the beginning of an exploration into how to get there. In many ways, Meander is a testament to the flexibility of lisps and Clojure in particular. Using Clojure's rich data literals and macros we can embed our own language inside it. Yet at the same time, Meander pushes us beyond the way we've traditionally conceived of programming. Maybe functions aren't the best abstraction for working with data. Could programming be better if we truly had a way to work with data directly? That is Meander's conviction and its chief aim.

Jimmy Miller 1 year ago

Building Meander in Meander

Meander has been (in my personal opinion) a wonderful success. With meander.epsilon, we can express most of the data transformations we are interested in. There are, of course, a few rough edges, a few things we'd change. But as more people have begun to use meander and to present the problems they are tackling with it, it has become clear that this approach is working. And yet, there is something that isn't working quite the way we'd like: the implementation of the meander compiler itself. This isn't meant as a diss on the code. Ultimately, the organization of the code is actually really nice: there are clear, defined boundaries, there is a clean separation of functionality, and the code itself isn't a mess by any standard. Nor is this a diss on the quality of code generated by the meander compiler. There are certainly areas we could improve, but in general, meander produces code that is fast and small. In all the meander matches we've written, we have never once encountered the "method code too large" error that has plagued complex pattern matches when using libraries like core.match. But there is still something not right with the meander.epsilon compiler. As you dive into the code base and try to make modifications, it becomes hard to trace the way data is being transformed. The code is littered with if statements needed to inspect the structure of the data we are getting. Then we have to pull out all the bits and parts we care about. What data is and isn't available at a given point is far from clear. But perhaps even more importantly, the shape of our data is lost. The meander.epsilon compiler is just converting between data structures: we read matches in as data, parse them into an AST, build a matrix-based IR, build a more direct IR, perform optimizations and deduplication, and then generate Clojure code (which is also data). But looking at the compiler, all of this is completely opaque, despite our best efforts. In meander.zeta we are taking a different approach.
Not all the details have been worked out yet, but in this article I want to share the general approach so that others can understand what we are looking to accomplish. To do that, we are going to build our own mini-meander compiler using meander.epsilon. Our compiler will not be efficient or support many kinds of matches, but it will give you a taste of what meander can do, as well as a taste of how we are using meander to implement itself. Before we get started, though, let's talk about our plan. First, to keep our code clear and this article from stretching on forever, we are going to limit our feature set to matching on logic variables and vectors. For our purposes, that should be all we need. Further, we are only going to implement the "match" side of meander; substitution is left as an exercise for the reader. To accomplish all of this clearly, we will start by implementing a parser. Then, taking the AST (abstract syntax tree) produced by our parser, we will implement a simple meander interpreter. Finally, we will show how meander's symbolic nature allows us to easily transform this interpreter into a compiler, with minimal changes. Let's begin. Our parser is going to mirror the format used by the meander parser. I will deviate a bit, but the general approach will be the same, so you will begin to see a bit of what the internals of meander look like. But rather than build the parser first, let's describe the output we would like from it. Here are two very simple examples of the input and output we expect from our parser. Our goal is to take our pattern and turn it into these nice, unambiguous maps. These maps will always have a tag value and then any other keys they need to record the information our interpreter or compiler might want. So, let's start by writing a parser that can only handle logic variables; we will figure out how to deal with vectors afterwards. This parser is very straightforward.
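A sketch of the logic-variable-only parser, in the spirit described here (the :tag key name and map shape are assumptions mirroring the examples above):

```clojure
(require '[meander.epsilon :as m]
         '[clojure.string :as string])

(defn parse [pattern]
  (m/match pattern
    ;; a logic variable: a symbol whose name starts with ?
    (m/and ?sym
           (m/pred symbol?)
           (m/pred #(string/starts-with? (name %) "?")))
    {:tag :logic-variable :symbol ?sym}

    ;; anything else is unhandled for now
    _ nil))

(parse '?x) ;; => {:tag :logic-variable, :symbol ?x}
(parse 42)  ;; => nil
```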
We are taking advantage of meander's and and pred operators to make sure that we get a symbol whose name starts with a ?. Other than that, we do not match on anything else, so everything else will just return nil. Next, let's extend this to vectors. Here we pull out all the contents of our vector and use cata to recursively parse them. If you have never used cata, you can think of it just like recur, but for patterns. With that, we have written our parser as far as we need for our purposes.

Now that we have an AST, we can write a simple interpreter. Given a pattern, an input, and an environment, our interpreter will return an environment with all our logic variables set to some value, or a value indicating failure. Rather than try to assemble the interpreter piece by piece, I will begin by showing you the entire thing. If you've written an interpreter before, this shouldn't be too surprising. First we handle logic variables by looking them up in the environment, covering the cases of the logic variable existing in the environment and matching, existing and not matching, and not existing at all. Next we handle the vector cases: the empty case, the single-element case, and the case with more than one element. This interpreter does work for the input we've given it. But think about what would happen if we used the same pattern and passed in a single number: we'd throw an error, because we never actually check that our input is a vector. We could just add a vector check to each of our vector cases, but then we would be checking that something is a vector for every single element of the vector. So let's try a different approach. Taking advantage of the fact that meander matches are ordered, we add an earlier match that performs the check for us, and when we recurse we simply set checked to true. That way the checking pattern no longer matches, and we can continue with the interpreter as before.
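A plain-Clojure sketch of the interpreter's logic (the AST map shape is assumed from the parser section; like the version described here, it deliberately does not check the vector's size):

```clojure
;; AST for the pattern [?x ?y], in the assumed map shape.
(def ast
  {:tag :vector
   :children [{:tag :logic-variable :symbol '?x}
              {:tag :logic-variable :symbol '?y}]})

(defn interpret [node target env]
  (case (:tag node)
    :logic-variable
    (let [sym (:symbol node)]
      (cond
        (not (contains? env sym)) (assoc env sym target) ; bind it
        (= (get env sym) target)  env                    ; bound and matching
        :else                     :fail))                ; bound to something else

    :vector
    (if-not (vector? target)
      :fail
      (reduce (fn [env [child t]]
                (let [env' (interpret child t env)]
                  (if (= env' :fail) (reduced :fail) env')))
              env
              (map vector (:children node) target)))))

(interpret ast [1 2] {}) ;; => {?x 1, ?y 2}
(interpret ast 5 {})     ;; => :fail
```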
There is still a problem with this interpreter that we aren't going to fix in this post: it does not check the size of the vector. For our purposes, doing this would be fairly easy; we could check the number of children in our pattern and ensure the target has the same size. But what would we do if we added repeat patterns (e.g. the ... operator)? For now, we will leave this off, but it might be a good exercise to think about on your own. Now that we have a working interpreter, let's look at how we can make this a compiler. Doing so with meander is actually surprisingly easy; our transition from interpreter to compiler will be simpler than any I have seen before. The translation is basically mechanical. Here is the first version of our compiler. At first glance, it might be quite hard to spot the difference; there are only a few changes. First and most crucially, the right-hand sides are no longer code that is immediately run, but data that we output, and in this case that data is code. Secondly, we have quoted some of our symbols: because we will ultimately be outputting this code in a macro and looking up symbols in an environment, they need to be quoted. Finally, we have changed from Clojure's recursive function calls to meander's recursion operator cata. These are our only real changes, and given that, we can now write a macro for matching that compiles our pattern. No longer is there a runtime cost to parsing our pattern and then interpretive overhead for crawling through the AST deciding what code to run; all of this happens in our macroexpansion. But there is one small problem. If we look at the code generated by this compiler, we will see that it is rather repetitive and long for what it does. This occurs because of one clause in particular: the one that expands into code updating our environment. Ultimately, our compilation returns an expression that returns an environment.
We build up this environment as we go through our vector. The environment-updating clause lets us do exactly that: expand into some code that will update our environment. But it does this over and over again. As we continue our compilation, our environment expression becomes more and more deeply branching code. Luckily, there is a pretty simple fix. Here is our updated code, which no longer creates a huge compilation output: rather than directly updating our environment by embedding more and more code, we make a new symbol that stores our environment and pass that down through our recursion. Now that we have solved that problem, we end up with some fairly reasonable generated code. Admittedly, this is still quite a bit of code for what we are doing. If you look at it for even a moment, you can see some issues: we definitely run nth and subvec entirely too many times. But looking back at the code, it becomes pretty obvious that the gensym trick we used before could easily solve that problem too. There is also something still a bit unsatisfying about this generated code. Shouldn't it just be simpler? We know that a given logic variable is only assigned once, so why check whether it is in the environment or not? We also know that the first thing we match on will always succeed; why are we checking there as well? For the case we are looking at now, we know that our input is a vector, and in fact we know exactly what our output should be at compile time, because we were passed a literal! These sorts of optimizations are completely possible within this framework. We don't have the space to fully explore them, but I will give a general flavor. What if, during compile time, we also kept a compile-time environment of all the things we know? If we know our input is a vector, why check that at run time? If we know exactly which logic variables have been bound, they can just be directly assigned right away. Hopefully, you can see that there is nothing about this approach that stops us from making these sorts of optimizations in the future.
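To make the shape of the compiler concrete, here is a plain-Clojure sketch over the same assumed AST, using the gensym trick described above to thread the environment through the generated code (an illustration of the approach, not meander's actual output):

```clojure
(def ast
  {:tag :vector
   :children [{:tag :logic-variable :symbol '?x}
              {:tag :logic-variable :symbol '?y}]})

(defn compile-pattern [node target-sym env-sym]
  (case (:tag node)
    :logic-variable
    (let [sym (:symbol node)]
      `(if (contains? ~env-sym '~sym)
         (if (= (get ~env-sym '~sym) ~target-sym) ~env-sym :fail)
         (assoc ~env-sym '~sym ~target-sym)))

    :vector
    `(if (vector? ~target-sym)
       ~(reduce
         (fn [env-form [i child]]
           (let [t (gensym "t__")
                 e (gensym "env__")]
             ;; bind the growing env expression to a fresh symbol
             ;; instead of splicing it in again and again
             `(let [~e ~env-form]
                (if (= ~e :fail)
                  :fail
                  (let [~t (nth ~target-sym ~i)]
                    ~(compile-pattern child t e))))))
         env-sym
         (map-indexed vector (:children node)))
       :fail)))

(def match-fn
  (let [t (gensym "target__")
        e (gensym "env__")]
    (eval `(fn [~t] (let [~e {}] ~(compile-pattern ast t e))))))

(match-fn [1 2]) ;; => {?x 1, ?y 2}
(match-fn 5)     ;; => :fail
```

All the pattern analysis happens while generating the code; the function produced by eval (or, in a real implementation, by a macro) does only the checks and assocs.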
I hope that from this post you learned how meander can be incredibly useful when building out a compiler in Clojure. Its direct, symbolic pattern-matching approach simplifies a great deal of what goes into writing a compiler for your own custom DSL. It gives you clear and concise code that lets you reason about your cases. As we've built out zeta, we've found meander's structured approach helps us understand our code and gives us a clear sense of what to do next. Give this technique a try the next time you have a DSL in mind. Why settle for an interpreter when meander makes it this easy to write a compiler?

Jimmy Miller 1 year ago

My Experience Building an Editor in Rust

I've always wanted to build a text editor. I've played around before with modifying existing editors like CodeMirror, but that ultimately felt incredibly unsatisfying. While I was able to make some fun little experiments with them, I was just gluing code together. As I started to play with Rust, it seemed like the perfect opportunity to go beyond glue code: to write a GUI text editor from "scratch". Overall, I'd say it worked out pretty well. Here's a screenshot of where things are right now. (I'll explain what you see in a bit.) The question, of course, is what "scratch" means. In my case, I decided that "scratch" meant using sdl2 as my set of primitives. So I had a fairly high-level way of dealing with things like drawing to a screen and rendering fonts, but nothing text-editor specific. This choice, I think, was a pretty good way for me to get stuff going. I have no experience with graphics APIs, and had I started there, I might have just stayed there. From the other angle, sdl2 was also good in that it didn't do too many things for me. There were no text-edit widgets I was trying to control, no framework, just a library for windowing, drawing, font rendering, and eventing. Before we talk about the general path I took, let's talk about my goals. Initially, my goal was to just build an editor. I wanted color syntax, I wanted general editing capabilities, and that was about it. As time went on, my goals became less and less clear. What I discovered as I kept developing is that building a text editor was a bit more finicky than I had expected, but also way easier than I ever imagined. Rust is a really fast language. In my career, I've mostly worked in slow languages, on problems where speed was important, but not the most important thing. For this project I wanted performance. I use emacs to do my Clojure development and, to be honest, it's terrible. I like paredit, I like my cider setup, but emacs freezes constantly.
When I try to print data out in the repl, I can completely lock the editor. That wasn't something I'd allow my editor to do, so I thought I'd have to be super clever to make that happen. Turns out, I didn't. As I was looking to undertake this project, Casey Muratori had put out his video about building a fast terminal. He walks through how he made the rendering fast by default using a standard technique called a font atlas. So that's where I started. If you, like me, had never heard of a font atlas, it is a simple idea: draw a picture of all the characters you need, and as you render, just reference those characters. No need to render the font each time. In my case, I focused on ascii and just made a texture of all those characters. It was blazingly fast. Turning off vsync, I had thousands of frames per second. I played around with manually adding some color and found that the performance held. Good sign. Now I had two things to figure out: 1) what data structure to use for editing, and 2) how to do color syntax. Luckily for the first, there was some prior art that led me down an unexpected path. After I started my project, I found out that Jamie had implemented a very similar idea, but in Zig. Because it was so similar (it used SDL, a font atlas, etc.), I didn't want to look too closely at it and spoil the fun. But what I did do was read his blog posts on it, and the one on text handling was particularly interesting. In the article, he says he just kept the text as a string and it was fast enough. Given my simplification of only handling ascii, I decided to try a plain Vec, and it turns out that was incredibly fast. Modern computers are way faster than I think many of us realize. I was able to implement text editing operations just by using the Vec. When I insert text in the middle of a line, I insert it into the Vec, shifting everything after it down. And yet, every file feels smooth. I didn't need a rope or any fancy data structure like that.
Of course, those have other benefits, but in this case, I could keep focusing on my text editor. At this point, I had editing and display, but no color syntax. To figure out line breaks, I parsed the buffer and made a very inefficient line array of tuples of start and end offsets. I think this is one of those choices I wish I had done differently, mostly in how I wrote the code, but it worked. One thing it let me do was render only the visible lines, so my first instinct for color syntax was to take advantage of that. I knew that editors like Sublime Text use a regex-based solution that only looks at a line at a time. So I thought maybe I should just use that approach and take something off the shelf. I first looked at using Syntect for my highlighting. It works with Sublime's syntax files, so I'd be able to support many languages out of the box. It was incredibly easy to add to my project and well documented. I was able to integrate it very quickly, and very quickly learned it was much too slow for what I wanted to do. Now, this isn't to fault Syntect. The Sublime syntax format is based on TextMate's format, which relies heavily on regex. Given the complexity and constraints of those formats, there is only so much you can do. But it wasn't cutting it for me. You see, at this point in the project, I wanted to keep things as simple as possible. Syntect, though, would often completely miss the frame deadline. If I was in the middle of editing a file and needed to re-syntax-highlight it, it would kill the performance. So I had to look elsewhere. Tree-sitter is a very interesting project out of GitHub. It does incremental parsing, with really robust error handling, of many languages. So if I added tree-sitter to my editor, I should get syntax highlighting for cheap and also avoid performance issues while editing. Or so I thought. First, getting tree-sitter set up was far from straightforward. The packaging and build situation for tree-sitter was a bit weird.
But once I got that going, I was actually quite sad to find out that the highlighting portion of tree-sitter was not incremental at all. I looked for a little while at making my own custom integration, and I knew it was possible, but it also didn't sound like fun. So I took a different path and built my own custom setup. I started with the simplest possible thing I could do: make a custom tokenizer and tokenize on every single frame. So I did that, wrote a really terrible but representative tokenizer, and revamped my rendering to use it. Turns out, that was actually really fast and easy! Even doing the incredibly naive thing of parsing every single frame, I was able to have the largest source files I could find on my machine open and editable at 60fps. Honestly, I was pretty blown away by how well that setup worked. Admittedly, my tokenizer is not very good right now. But as they say, that's just a matter of programming. I know that I can keep, and even improve, the performance while making it more feature-rich. At this point, I had the basics and wanted to play. First question: since I'm a big fan of Muse, what if my text editor was a canvas? Implementing that was very straightforward, if a bit finicky, and moved me directly into the more interesting things I now wanted to do. As I was working on my tokenizer, I wanted to be able to see the output of the tokens right in the app. So I created what I called the token pane: if a pane is given that name, its contents are defined as the raw tokens of the active pane. So now I could see exactly what things were tokenizing into. Incredibly useful for debugging. Next was the action pane, which was quite a bit trickier. Here I would display every action that happened in the app. But what about scrolling the action pane? If I did that, then as I scrolled, the action pane would constantly get new actions. The whole thing was a bit of a mess.
The other hard part of this setup was that I originally didn't have a great way to refer to panes. My actions would be something like "MoveActivePane". But what was the active pane, or more precisely, when? If I was looking at the action pane, it was the active pane, so as I filtered out action-pane actions, I would filter out all active-pane actions! Not what I wanted. So I had to set up a system where your actions resolve to ids. Ultimately, what I want out of an editor more than anything was captured in an early blog post on LightTable, in which they imagine an editor configurable in itself. But I wanted a different flavor: what if you could extend the editor in any language? I had already seen ways of exposing the internals of the editor and using panes as output. What if I could do the opposite and use panes as input? I later discovered a nice name for this: afterburner rendering, à la Mary Rose Cook. Here's an example. Some javascript prints specially formatted output, and that output is then parsed by the editor and drawn as rectangles on the screen. Obviously, rectangles aren't the most useful thing to draw, so I also played with rendering in text space. For example, here is a quick proof of concept of underlining text mentioned in Rust compiler output. On the right, you see some unused imports; on the left, a quick bash script for drawing the underlines to the screen. The dream is that as things evolve, your editor could gain new powers simply from code you have running in panes. No need for an extension language: simply output things to stdout and you can control and extend the editor. What I found with this experiment is that even with the most naive, unoptimized code, doing things that way was entirely possible. One fun experiment I played with was a way for any language to get the contents of a pane. Obviously, if a pane is backed by a file, you can read that file. But that wasn't good enough for me.
I want you to be able to access the contents before changes have been saved. Further, you should be able to access them with just standard tools. So I exposed the editor as an HTTP service. Honestly, as weird as it may seem, it was pretty easy to do, not computationally expensive, and made it easy to access the data. Ultimately, I'd love to expose even more things, like being able to control the editor via HTTP requests. Then external programs could interact with the editor in a way I've never seen supported. Once we have that, we have the full Unix power accessible in our editor in a very first-class way. Rust was a wonderful language to write this in. I've gotten past the learning curve, to where I can basically write Rust without thinking too much about it. With all of these features, I was just thinking about the problems and not the language. While Rust's ownership model of course pushed me in one direction, it never felt like it limited me. One thing I have really grown to enjoy is Rust's explicit clones. Without a GC, clones can be expensive in a tight loop; making them explicit lets me easily spot my bottlenecks. Several times I could do a quick clone to get things working and then come back, move the code around to make the ownership clear, and avoid the clone. SDL2 was easy to get going on this project. From day one I had things drawing on the screen. The primitives SDL provides were just what I needed to focus on my task and not worry about the details. I often get to a point in projects where things just stall out because I want to clean up my code and I don't know exactly how I want things to be. I know that the path I'm on is not the right one, and instead of forging ahead with working code, I stop and consider, then lose interest and stop working on it. I didn't do that here. The codebase is a mess. There is a lot of duplicated code. And as it stands right now, things are broken from unfinished experiments.
But I got a lot done with this spare-time project. And even with all the duplicated and poorly designed code, I only have 3442 lines of code. While my willingness to let the code get messy was definitely good, the actual messiness of the code did cause some quality issues. I never went back and changed my line parsing to take advantage of my tokenizing code, so I loop through each file twice per frame. Because I was doing things hackily from the beginning, it took me a while to get to the point where I could track changes and know whether I needed to do things like reparse a file. The code now has that ability, but hooking it up was a decent amount of work, and I never did it. I spent a lot of time refactoring code to emit data when an action happened. This is something I should have done from the beginning: as I looked to implement undo/redo, this lack of reification and centralization of event handling led to all sorts of issues that still exist in the codebase. I really had a blast making this. It went in entirely different directions than I expected. I think this idea of an editor extensible via its own pane contents is really interesting. Of course, emacs is perhaps the ultimate example of an extensible editor, but it is still extensible only via elisp. Admittedly, I am sure there are parts where keeping things external may become a bottleneck. If that became the case, it might be interesting to explore WebAssembly extension points. But what I would find more interesting is to continue down this path. Perhaps there is a nice binary format that makes these things not too slow. Could you, for example, implement a language server protocol library in the editor without it needing to be built in and without it needing to use an extension language? Focusing on external extensibility also means that these extensions could be used by other editors. Basically, this work would expose an editor IR: an idea I find incredibly interesting and perhaps incredibly powerful.
I never took my canvas idea very far. I didn't even implement panning around the canvas. But I found myself loving these layouts. I coded live in the editor on things like Conway's Game of Life, and being able to move my panes around and resize them based on my focus made for very nice interactions. I have always been a bit of a messy worker, and I find that having positions for things helps me out tremendously. I had a blast doing this. Rust was a great language. I tried out lots of interesting ideas, and they worked pretty darn well. But I think my biggest takeaway is that there is so much opportunity here. Text editors have largely remained unchanged in terms of user experience. With language servers, they can now offer better static analysis and refactoring tools, but the ways of interacting with them have largely not changed. There is so much potential here. Why haven't we explored new avenues for user interfaces in text editors? Why aren't there editors that let me work with them more flexibly? Is a column-based layout with some intellisense really the best there is? Will we never discover something better? In my view, there haven't been many serious attempts at this. Maybe there need to be.

Abhinav Sarkar 1 year ago

Interesting Links for November 2024

A special Programming Languages: Theory, Design and Implementation edition of some interesting articles I recently read on the internet: There is something amazing about making your own programming language. In “You Should Make a New Programming Language” Nicole Tietz-Sokolsaya puts forward some great reasons to do the same, but I do it just for the sheer excitement of witnessing a program written in my own language run. Why aren’t there programming languages that are convenient to write but slow by default, and allow the programmer to drop to a harder-to-write but more performant form, if required? Alex Kladov ponders this question in “On Ousterhout’s Dichotomy” , and offers a possible solution. I am a big fan of Algebraic data types , and consider them an indispensable tool in the modern programmer’s toolbox. In “Where Does the Name ‘Algebraic Data Type’ Come From?” Li-yao Xia investigates the possible sources of the name, going back to the programming languages from half a century ago. Follow Casey Rodarmor down the rabbit hole to learn where an unexpected newline character comes from in this entertaining and enlightening article, “Whence ‘\n’?” . Turnstyle is an esoteric, graphical functional language by Jasper Van der Jeugt. I have never seen anything like it before. It’s truly mind-blowing and I’m still trying to understand how it works. As good programmers, we try to stay away from the dark corners of programming languages, but Justine Tunney takes a head-first dive into them and comes up with an enthralling tale in the article “Weird Lexical Syntax” . I am not going to lie, I love Lisps! I must have implemented at least a dozen of them by now. If you are like me, you may have wondered “Why Is It Easy to Implement a Lisp?” . Eli Bendersky puts forward a compelling argument. What better way to implement a fast (and small) Lisp than to compile it to LLVM IR? Using Clojure this time, John Jacobsen showcases it in “To The Metal… Compiling Your Own Language(s)” . 
Phil Eaton takes an ingenious approach for “Compiling Dynamic Programming Languages” , one that had never occurred to me before, but will now be a part of my toolbox forever. Here’s another technique that I was only vaguely familiar with: JIT compilation using macros. In “Runtime Optimization with Eval” Gary Verhaegen demonstrates this technique using Clojure. When compiling dynamically typed programming languages, we need to tag pointers to data with the runtime type information. In “What Is the Best Pointer Tagging Method?” Troy Hinckley describes some good ways of doing so. I relish Max Bernstein’s articles about programming language implementation techniques. In “What’s in an e-graph?” they describe an optimization technique using e-graphs, as used in compilers. I love atypical uses of Programming Language Theory. Adam Dueck explains their PLT adventure in “How I Learned Pashto Grammar Through Programming Syntax Trees” . Brainfuck, the most popular of esoteric programming languages, has been on my mind a lot recently. And who better to learn about compiling BF from than Wilfred Hughes? In “An Optimising BF Compiler” they go over the algorithms they used to write “An Industrial-Grade Brainfuck Compiler” . And lastly, from the wicked mind of Srijan Paul comes a twist: “Compiling to Brainf#ck” , about their programming language Meep that, you guessed right, compiles to BF. If you have any questions or comments, please leave a comment below. If you liked this post, please share it. Thanks for reading! This note was originally published on abhinavsarkar.net . If you liked this note, please leave a comment .

Abhinav Sarkar 1 year ago

Going REPLing with Haskeline

So you went ahead and created a new programming language, with an AST, a parser, and an interpreter. And now you hate how you have to write the programs in your new language in files to run them? You need a REPL ! In this post, we’ll create a shiny REPL with lots of nice features using the Haskeline library to go along with your new PL that you implemented in Haskell. This post was originally published on abhinavsarkar.net . First a short demo: That is a pretty good REPL, isn’t it? You can even try it online 1 , running entirely in your browser. Let’s assume that we have created a new small Lisp 2 , just large enough to be able to conveniently write and run the Fibonacci function that returns the nth Fibonacci number . That’s it, nothing more. This lets us focus on the features of the REPL 3 , not the language. We have a parser to parse the code from text to an AST, and an interpreter that evaluates an AST and returns a value. We are not going into the details of the parser and the interpreter; just listing the type signatures of the functions they provide is enough for this post. Let’s start with the AST: That’s right! We named our little language FiboLisp. FiboLisp is expression oriented; everything is an expression. So naturally, we have an AST. Writing the Fibonacci function doesn’t require many syntactic facilities. In FiboLisp we have: We also have function definitions, captured by , which records the function name, its parameter names, and its body as an expression. And finally we have s, which are a bunch of function definitions to define, and another bunch of expressions to evaluate. Short and simple. We don’t need anything more 4 . This is how the Fibonacci function looks in FiboLisp: We can see all the AST types in use here. Note that FiboLisp is lexically scoped. 
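As a rough illustration of the AST described above (expressions, function definitions, and whole programs), here is a minimal Haskell sketch. Every type and constructor name here is my guess, not the post's actual code:

```haskell
-- A guessed sketch of a FiboLisp-style AST: expressions, function
-- definitions (name, parameter names, body), and whole programs.
data Expr
  = Num Int             -- integer literal
  | Var String          -- variable reference
  | If Expr Expr Expr   -- conditional expression
  | Call String [Expr]  -- function application
  deriving (Show, Eq)

data Defn = Defn
  { defnName   :: String
  , defnParams :: [String]
  , defnBody   :: Expr
  } deriving (Show, Eq)

-- A program is a bunch of definitions plus expressions to evaluate.
data Program = Program [Defn] [Expr]
  deriving (Show, Eq)

-- The Fibonacci function expressed in this AST, roughly:
-- (defun fib (n) (if (< n 2) n (+ (fib (- n 1)) (fib (- n 2)))))
fibDefn :: Defn
fibDefn = Defn "fib" ["n"] $
  If (Call "<" [Var "n", Num 2])
     (Var "n")
     (Call "+" [ Call "fib" [Call "-" [Var "n", Num 1]]
               , Call "fib" [Call "-" [Var "n", Num 2]] ])
```

Note how every AST type shows up in the one definition, which is the point of keeping the language this small.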
The module also lists a bunch of keywords ( ) that can appear in the car 5 position of a Lisp expression, that we use later for auto-completion in the REPL, and some functions to convert the AST types to nice looking strings. For the parser, we have this pared-down code: The essential function is , which takes the code as a string, and returns either a on failure, or a on success. If the parser detects that an S-expression is not properly closed, it returns an error. We also have this pretty-printer module that converts function ASTs back to pretty Lisp code: Finally, the last thing before we hit the real topic of this post, the FiboLisp interpreter: We have elided the details again. All that matters to us is the function that takes a program, and returns either a runtime error or a value. is the runtime representation of the values of FiboLisp expressions, and all we care about is that it can be n and fully evaluated via 6 . also takes a function, that’ll be demystified when we get into implementing the REPL. Lastly, we have a map of built-in functions and a list of built-in values. We expose them so that they can be treated specially in the REPL. If you want, you can go ahead and fill in the missing code using your favourite parsing and pretty-printing libraries 7 , and the method of writing interpreters. For this post, those implementation details are not necessary. Let’s package all this functionality into a module for ease of importing: Now, with all the preparations done, we can go REPLing. The main functionality that a REPL provides is entering expressions and definitions, one at a time, that it R eads, E valuates, and P rints, and then L oops back, letting us do the same again. This can be accomplished with a simple program that prompts the user for an input and does all these with it. However, such a REPL will be quite lackluster. 
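To make the interpreter's shape concrete, here is a self-contained toy evaluator (it repeats a guessed AST so the sketch stands alone): it takes the defined functions and an environment, and returns either a runtime error or a value, as the post describes. All names and the error-handling details are my assumptions, not the post's code:

```haskell
import qualified Data.Map.Strict as M

-- Guessed AST, included so this sketch is self-contained.
data Expr = Num Int | Var String | If Expr Expr Expr | Call String [Expr]

data Defn = Defn { defnName :: String, defnParams :: [String], defnBody :: Expr }

-- Evaluate an expression to either a runtime error or an Int value.
eval :: M.Map String Defn -> M.Map String Int -> Expr -> Either String Int
eval defs env expr = case expr of
  Num n    -> Right n
  Var x    -> maybe (Left ("unbound variable: " ++ x)) Right (M.lookup x env)
  If c t e -> do
    b <- eval defs env c
    eval defs env (if b /= 0 then t else e)
  Call "+" [a, b] -> (+) <$> eval defs env a <*> eval defs env b
  Call "-" [a, b] -> (-) <$> eval defs env a <*> eval defs env b
  Call "<" [a, b] -> (\x y -> if x < y then 1 else 0) <$> eval defs env a <*> eval defs env b
  Call f args -> case M.lookup f defs of
    Nothing -> Left ("unknown function: " ++ f)
    Just (Defn _ params body) -> do
      vals <- traverse (eval defs env) args
      -- Lexical scoping: the body sees only its own parameters.
      eval defs (M.fromList (zip params vals)) body

fibDefn :: Defn
fibDefn = Defn "fib" ["n"] $
  If (Call "<" [Var "n", Num 2])
     (Var "n")
     (Call "+" [ Call "fib" [Call "-" [Var "n", Num 1]]
               , Call "fib" [Call "-" [Var "n", Num 2]] ])
```

Building each function call's environment from only the parameters is what makes functions the sole scoping mechanism, as the footnotes below point out.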
These days programming languages come with advanced REPLs like IPython and nREPL , which provide many functionalities beyond simple REPLing. We want FiboLisp to have a great REPL too. You may have already noticed some advanced features that our REPL provides in the demo. Let’s state them here: Haskeline — the Haskell library that we use to create the REPL — provides only basic functionalities, upon which we build to provide these features. Let’s begin. As usual, we start the module with many imports 8 : Notice that we import the previously shown module qualified as , and Haskeline as . Another important library that we use here is terminfo , which helps us do colored output. A REPL must preserve the context through a session. In the case of FiboLisp, this means we should be able to define a function 9 as one input, and then use it later in the session, one or many times 10 . The REPL should also respect the REPL settings through the session till they are unset. Additionally, the REPL has to remember whether it is in the middle of writing a multiline input. To support multiline input, the REPL also needs to remember the previous indentation, and the input done in previous lines of a multiline input. Together these form the : Let’s deal with settings first. We set and unset settings using the and commands. So, we write the code to parse setting the settings: Nothing fancy here, just splitting the input into words and going through them to make sure they are valid. The REPL is a monad that wraps over : also lets us do IO — is it really a REPL if you can’t do printing — and deal with exceptions. Additionally, we have a read-only state that is a function, which will be explained soon. The REPL starts in the single line mode, with no indentation, function definitions, settings, or previously seen input. Let’s go top-down. 
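The REPL state and the settings parsing described above might look like this sketch; the field names and the concrete setting names (ast, timing) are my guesses, not the post's:

```haskell
-- Guessed sketch of REPL settings and state. Setting names are
-- assumptions for illustration.
data Setting = ShowAst | Timing deriving (Show, Eq)

data LineMode = SingleLine | MultiLine deriving (Show, Eq)

data ReplState = ReplState
  { lineMode  :: LineMode  -- are we in the middle of a multiline input?
  , indent    :: Int       -- previous indentation, for multiline mode
  , settings  :: [Setting] -- settings in effect for the session
  , seenInput :: String    -- input from previous lines of a multiline input
  } deriving (Show, Eq)

-- Split the input into words and validate each one, as described above.
parseSettings :: String -> Either String [Setting]
parseSettings = traverse parseOne . words
  where
    parseOne "ast"    = Right ShowAst
    parseOne "timing" = Right Timing
    parseOne w        = Left ("unknown setting: " ++ w)
```

A failure on any single word fails the whole parse, which is what `traverse` over `Either` gives us for free.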
We write the function that is the entry point of this module: This sets up Haskeline to run our REPL using the functions we provide in the later sections: and . This also demystifies the read-only state of the REPL: a function that adds colors to our output strings, depending on the capabilities of the terminal in which our REPL is running. We also set up a history file to remember the previous REPL inputs. When the REPL starts, we output some messages in nice colors, which are defined as: Off we go REPLing now: We infuse our with the powers of Haskeline by wrapping it with Haskeline’s monad transformer, and call it the type. In the function, we , it, and again. We also deal with the user quitting the REPL (the case), and hitting Ctrl + C to interrupt typing or a running evaluation (the handling for ). Wait a minute! What is that imperative-looking doing in our Haskell code? That’s right, we are looking through some lenses! If you’ve never encountered lenses before, you can think of them as pairs of setters and getters. The lenses above are for setting and getting the corresponding fields from the data type 11 . The , , and functions are for getting, setting, and modifying the state in the monad, respectively, using lenses. We see them in action at the beginning of the function when we use to set the various fields of to their initial values in the monad. All that is left now is actually reading the input, evaluating it and printing the results. Haskeline gives us functions to read the user’s input as text. However, being Haskellers, we prefer some structure around it: We’ve got all previously mentioned cases covered with the data type. We also do some input validation and capture errors for the failure cases with the constructor. is used for when the user quits the REPL. Here is how we read the input: We use the function provided by Haskeline to show a prompt and read the user’s input as a string. The prompt shown depends on the of the REPL state. 
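The structured input type might look something like this sketch; the constructor names and the classification rule are my guesses based on the cases described above:

```haskell
-- Guessed sketch of the structured REPL input. Haskeline's input
-- reader returns Nothing when the user quits (Ctrl+D), so we map
-- Maybe String onto a richer sum type.
data Input
  = Code String              -- code to be parsed and interpreted
  | Command String [String]  -- a ":" command with its arguments
  | BadInput String          -- invalid input, with an error message
  | Quit                     -- user quit the REPL
  deriving (Show, Eq)

classify :: Maybe String -> Input
classify Nothing = Quit
classify (Just line) = case words line of
  (cmd@(':' : _) : args) -> Command cmd args  -- e.g. ":set ast"
  _                      -> Code line         -- everything else is code
```

A real implementation would also validate the command name here, returning `BadInput` for unknown commands.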
In the mode we show , whereas in the mode we show . If there is no input, that means the user has quit the REPL. In that case we return , which is handled in the function. If the input is empty, we read more input, preserving the previous indentation ( ) in the mode. If the input starts with , we parse it for various commands: The and cases are straightforward. In case of , we make sure to check that the file asked to be loaded is located somewhere inside the current directory of the REPL or its recursive subdirectories. Otherwise, we deny loading by returning a . We parse the settings using the function we wrote earlier. If the input is not a command, we parse it as code: We append the previously seen input (in case of multiline input) to the current input and parse it using the function provided by the module. If parsing fails with an , it means that the input is incomplete. In that case, we set the REPL line mode to , the REPL indentation to the current indentation, and the seen input to the previously seen input appended with the current input, and read more input. If it is some other error, we return a with it. If the result of parsing is a program, we return it as a input. That’s it for reading the user input. Next, we evaluate it. Recall that the function calls the function with the read input: The cases of , and are straightforward. For settings, we insert or remove the setting from the REPL settings, depending on it being set or unset. For the other cases, we call the respective helper functions. For a command, we check if the requested identifier maps to a user-defined or builtin function, and if so, print its source. Otherwise we print an error. For a command, we check if the requested file exists. If so, we read and parse it, and interpret the resultant program. In case of any errors in reading or parsing the file, we catch and print them. 
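The "input is incomplete, read more" decision can be sketched as a parenthesis-balance check. This is a simplification of what a real parser's unclosed-S-expression error conveys, and it ignores parens inside strings and comments:

```haskell
-- Guessed sketch of incomplete-input detection: count unmatched open
-- parentheses; a positive count means the S-expression is not closed
-- yet, so the REPL should switch to multiline mode and read more lines.
unclosedParens :: String -> Int
unclosedParens = foldl step 0
  where
    step n '(' = n + 1
    step n ')' = n - 1
    step n _   = n

isIncomplete :: String -> Bool
isIncomplete input = unclosedParens input > 0
```

For example, `isIncomplete "(defun fib (n)"` is True while `isIncomplete "(fib 5)"` is False, so only the former sends the REPL back for another line.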
We start by collecting the user-defined functions in the current input with the previously defined functions in the session such that current functions override the previous functions with the same names. At this point, if the setting is set, we print the program AST. Then we invoke the function provided by the module. Recall that the function takes the program to interpret and a function of type . This function is a color-adding wrapper over the function returned by the Haskeline function 12 . This function allows non-REPL code to safely print to the Haskeline-driven REPL without garbling the output. We pass it to the function so that the interpreter can invoke it when the user code invokes the builtin function or similar. We make sure to and the value returned by the interpreter so that any lazy values or errors are fully evaluated 13 , and the measured elapsed time is correct. If the interpreter returns an error, we print it. Else we convert the value to a string, and if it is not empty 14 , we print it. Finally, we print the execution time if the setting is set, and set the REPL defs to the current program defs. That’s all! We have completed our REPL. But wait, I think we forgot one thing … The REPL would work fine with this much code, but it would not be a good experience for the user, because they’d have to type everything without any help from the REPL. To make it convenient for the user, we provide contextual auto-completion functionality while typing. Haskeline lets us plug in our custom completion logic by setting a completion function, which we did way back at the start. Now we need to implement it. Haskeline provides us the function to easily create our own completion function. 
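The defs-collecting step above, where current definitions override earlier ones with the same name, can be sketched with Data.Map, assuming definitions are kept in a map keyed by function name (the function name here is mine):

```haskell
import qualified Data.Map.Strict as M

-- Guessed sketch of merging definitions: Data.Map.union is
-- left-biased, so passing the current input's definitions first makes
-- them override earlier session definitions with the same name.
mergeDefs :: M.Map String defn -> M.Map String defn -> M.Map String defn
mergeDefs current previous = M.union current previous
```

The left-bias of `union` is doing all the work here; flipping the arguments would silently keep the stale definitions instead.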
It takes a callback function that it calls with the current word being completed (the word immediately to the left of the cursor), and the content of the line before the word (to the left of the word), reversed. We use these to return different completion lists of strings. Going case by case: This covers all cases, and provides helpful completions, while avoiding bad ones. And this completes the implementation of our wonderful REPL. I wrote this REPL while implementing a Lisp that I wrote 15 while going through the Essentials of Compilation book, which I thoroughly recommend for getting started with compilers. It started as a basic REPL, and gathered a lot of nice functionalities over time. So I decided to extract and share it here. I hope that this Haskeline tutorial helps you in creating beautiful and useful REPLs. Here is the complete code for the REPL. If you have any questions or comments, please leave a comment below. If you liked this post, please share it. Thanks for reading! The online demo is rather slow to load and to run, and works only on Firefox and Chrome. Even though I managed to put it together somehow, I don’t actually know how exactly it works, and I’m unable to fix the issues with it. ↩︎ Lisps are awesome and I absolutely recommend creating one or more of them as an amateur PL implementer. Some resources I recommend are: the Build Your Own Lisp book, and the Make-A-Lisp tutorial. ↩︎ REPLs are wonderful for doing interactive and exploratory programming where you try out small snippets of code in the REPL, and put your program together piece by piece. They are also good for debugging because they let you inspect the state of running programs from within. I still fondly remember the experience of connecting (or jacking in ) to running production systems written in Clojure over a REPL, and figuring out issues by dumping variables. ↩︎ We don’t even need . 
We can, and have to, define variables by creating functions, with parameters serving the role of variables. In fact, we can’t even assign or reassign variables. Functions are the only scoping mechanism in FiboLisp, much like old-school JavaScript with its IIFEs . ↩︎ car is obviously C ontents of the A ddress part of the R egister , the first expression in a list form in a Lisp. ↩︎ You may be wondering about why we need the instances for the errors and values. This will become clear when we write the REPL. ↩︎ I recommend the sexp-grammar library, which provides both parsing and printing facilities for S-expression-based languages. Or you can write something by yourself using parsing and pretty-printing libraries like megaparsec and prettyprinter . ↩︎ We assume that our project’s Cabal file sets the default-language to GHC2021, and the default-extensions to , , , and . ↩︎ Recall that there is no way to define variables in FiboLisp. ↩︎ If the interpreter allows mutually recursive function definitions, functions can be called before defining them. ↩︎ We are using the basic-lens library here, which is the tiniest lens library, and provides only the five functions and types we see used here. ↩︎ Using the function returned from is not necessary in our case because the REPL blocks when it invokes the interpreter. That means nothing but the interpreter can print anything while it is running. So the interpreter can actually print directly to and nothing will go wrong. However, imagine a case in which our code starts a background thread that needs to print to the REPL. In such a case, we must use the Haskeline-provided print function instead of printing directly. When printing to the REPL using it, Haskeline coordinates the prints so that the output in the terminal is not garbled. ↩︎ Now we see why we derive instances for errors and . ↩︎ The returned value could be of type void with no textual representation, in which case we would not print it. 
↩︎ I wrote the original REPL code almost three years ago. I refactored, rewrote and improved a lot of it in the course of writing this post. As they say, writing is thinking. ↩︎ If you liked this post, please leave a comment .
