Latest Posts (15 found)
Bill Mill 3 weeks ago

Writing good technical articles is hard, we should do more of it anyway

An article is making the rounds about what it feels like to read a technical article when you don't understand the terminology.

Bill Mill 2 months ago

An AI tool I find useful

One of the tasks that I do most often is to review code. I've written a review command that asks an AI to review a code sample, and I've gotten a lot of value out of it. I ignore most of the suggestions that the tool outputs, but it has already saved me from painful errors often enough that I wanted to share it in the hope that others might find it useful.

The main job of the script is to generate context from a git diff and pass it to llm for code review; if you run it with no arguments, it reviews the diff of your current work and prints the result right in the terminal.

My main use of the command is to review a PR I'm preparing before I file it. The biggest value I've gotten out of it is that it frequently catches embarrassing errors before I file a PR - misspellings, debugging statements I forgot to remove, and occasionally logic errors. It also often suggests features that make sense to add before finishing the PR, or as next steps.

It is very important to use it intelligently! The LLM is just an LLM, and it may also be missing context. Its output regularly includes mistaken suggestions that I read and ignore; you have to apply your own understanding and taste to what it says. Keep in mind that it is tasked via its system prompt with finding problems and making suggestions; no matter how good your code is, it will try to find and suggest something.

I also use it for reviewing other people's PRs. In those cases, I really am just using it for clues about things that I may need to investigate with my actual human brain. Please do not just dump LLM suggestions into a PR! That's both rude and likely to be unhelpful. That last point brings me to why I prefer this tool to github's own copilot review tool.

Thanks to a suggestion on lobste.rs, I added the ability to provide context via stdin. Thanks for the suggestion!
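For the curious, here's a rough sketch of the idea - not the actual script, and the system prompt and defaults here are invented - assuming the llm command-line tool is installed and configured:

```python
#!/usr/bin/env python3
"""Sketch of a git-diff review helper: gather a diff, send it to `llm`."""
import subprocess
import sys

# An invented system prompt; a real one would be more detailed.
SYSTEM_PROMPT = (
    "You are a careful code reviewer. Point out bugs, leftover debugging "
    "code, misspellings, and sensible follow-up features. Be concise."
)

def main() -> None:
    # With no arguments, diff against main; otherwise pass the arguments
    # straight through to `git diff` (e.g. `origin/main...HEAD`).
    diff_args = sys.argv[1:] or ["main"]
    diff = subprocess.run(
        ["git", "diff", *diff_args], capture_output=True, text=True, check=True
    ).stdout
    if not diff.strip():
        sys.exit("nothing to review")

    # `llm` reads the prompt from stdin; -s sets the system prompt.
    subprocess.run(["llm", "-s", SYSTEM_PROMPT], input=diff, text=True, check=True)

if __name__ == "__main__":
    main()
```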

Bill Mill 2 months ago

A pretty decent retry, and not a library

I'm pretty happy with the retry function I've ended up with for my nba stats downloader. One thing I like about it is that it doesn't calculate a backoff value - it picks from an explicit list of backoffs. Before I wrote this one, I had written a bunch of retry functions and always tried to come up with some clever exponential algorithm and do a bit of math. One day I was working with backoffs for my day job in a database context and I decided to go look at what sqlite uses: a hard-coded table of delay values. It's pleasantly grug-brained; why use clever code when simple code do trick? So I stole the idea and I use it whenever I need a retry function now.

The retry function does not allow you to specify your timeouts, or configure the list of exceptions that don't get tracebacks, or configure the message that gets thrown. It doesn't support regular args because I only need kwargs in the context I'm using it in. If I needed to use it again, I'd probably have to modify one or both of those choices. That's OK! This leads to my second point: it's not a library.

One instinct for a free software developer, when they have a pleasant battle-tested function, is to make it into a library. The Javascript ecosystem is full of libraries this small or smaller - I could go look and I'm sure I'd find seventeen similar examples of retry libraries. It feels nice, like you're giving back to the community that you've drawn value from. But it's a mirage! The value of having a small function like this in your codebase, where you can see it, read it, and tailor it to suit your needs is greater than the value of having a highly configurable library that's difficult to read and debug. As the code gets more complex, the weight of that complexity can make it such that it starts to make sense to isolate the code into a library and share it. I'd like to advocate that people usually underweight how much value there is in having the code included in your application, tailored to your application exactly, rather than abstracting it into a library.

For small code tasks like this, it's better to have a toolbox of examples you can pull from and customize. If I needed retry in a golang or rust program tomorrow, I could easily remember that I've written this code and translate it for use. If I'd instead used a retry library, I'd have to find one in that ecosystem and figure out which library was best, how to use it, and add an external dependency. This is much of the value that StackOverflow has provided to the world: a giant set of comments containing small, customizable code examples such as this one that solve small problems, which people can copy and modify as they wish.

Given that StackOverflow seems to be dying, I wonder how we could do a better job supporting this "toolbox" type of code? Would it make sense for language ecosystems to host toolboxes? What if python had a library where people can paste code examples with a description and usage guide, and perhaps star them when they find them useful? Or perhaps we could try to have a sort of "codex" by problem type, where a single well-known problem had a thousand examples in different languages, with different techniques, usages, and lineages?

The evolution of software in such a universe is very different from what we see today. There is no official repository, no development timeline, no releases. There is only a network of many variants of some code, connected by forking relations. Centralized maintenance as we know it today does not exist. Instead, the community of scientists using the code improves it in small steps, with each team taking over improvements from other forks if they consider them advantageous. Improvement thus happens by small-step evolution rather than by large-scale design. While this may look strange to anyone used to today’s software development practices, it is very similar to how scientific models and theories have evolved in the pre-digital era.
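Here's a minimal sketch of a retry in this style - not the actual function from the downloader; the backoff schedule and exception list are invented:

```python
import time
import traceback

# Explicit backoff schedule, sqlite-style: no math, just a list of seconds.
BACKOFFS = [1, 2, 5, 10, 30, 60]

# Exceptions that are expected enough that they don't deserve a traceback.
QUIET_EXCEPTIONS = (ConnectionError, TimeoutError)

def retry(fn, **kwargs):
    """Call fn(**kwargs), retrying on failure; kwargs only, on purpose."""
    for attempt, backoff in enumerate(BACKOFFS, start=1):
        try:
            return fn(**kwargs)
        except QUIET_EXCEPTIONS as exc:
            print(f"attempt {attempt} failed ({exc!r}), sleeping {backoff}s")
        except Exception:
            traceback.print_exc()
            print(f"attempt {attempt} failed, sleeping {backoff}s")
        time.sleep(backoff)
    raise RuntimeError(f"giving up after {len(BACKOFFS)} attempts")

# usage: resp = retry(requests.get, url=some_url, timeout=10)
```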

Bill Mill 3 months ago

Federation is extremely expensive

There is a certain feeling among the developers of messaging software that if you want to create something worthwhile, you need to create a federated system. This is how you get ActivityPub, atproto, matrix, xmpp, nostr, and on and on and on. A series of protocols that aim to recreate the successes of email and IRC.

Email and IRC have huge problems, many of which are caused by their own decentralization. Both protocols suffer from the extreme difficulty of upgrades - it's nearly impossible to organize a whole herd of people running servers to upgrade at the same time. The effect is that once a decentralized protocol has stabilized, it is extremely resistant to upgrades. Both protocols suffer from the challenge of preventing spam. It has proven so difficult for email that it has de facto centralized around Google Mail and a few other major providers. Email's design is extremely substandard by modern standards, but despite the quasi-centralization of providers, we have no hope of changing it because there are still too many providers and consumers to make modernization possible. This leaves us with the worst of both worlds, a sub-standard protocol with a mostly-centralized structure.

A key lesson of the modern internet is that a well moderated service is essential to providing a decent experience for users who are not trolls. If your service is truly decentralized, consistent moderation is impossible by design. Taking on federation means taking on the Herculean task of coordinating decentralized moderation. I don't think it's impossible to do an OK job of it, but it's a huge task and an extreme social challenge.

I think a lot of the desire for federated systems comes down to a desire for ownership; the ability of a person or a group of people to own some piece of the system. It feels like in a decentralized system, even if you don't, you could own a piece of it. You could stand up an email server, or a mastodon server, and do things your own way if you had to. This is an excellent impulse, and we should nurture it! But federation is not the only way to achieve ownership. We can make open software that allows people to own their own computation in meaningful ways a lot faster by skipping federation and focusing instead on building software that delivers a stronger user interface and user experience while providing data ownership.

What I want to see is a world in which we can deliver more open software to develop communities more rapidly, and I think we might get there faster if we accept that the cost of federation is so high that we ought to consider avoiding it.

This post follows a conversation I had on a private Slack. Part of the reason why I think that conversation happened on Slack instead of on an open chat system is that we have spent so much effort building federated systems that we've failed to deliver working open software to replace it.

Bill Mill 10 months ago

Triple Storyline for Ethical Design

A good user experience is only as good as the action it enables. Designing a system that makes it easy to do bad things is bad. A system that automates discrimination is, by Spinoza’s light, evil.

Bill Mill 10 months ago

Advent of Code 2024

Last year I kept my advent of code log in a note on this site: 2023 problem log. This year I wanted to make it easier for me to generate and share visualizations, and break the code into more logical explanations, so I'm doing it at https://llimllib.github.io/advent2024/ in an observable framework notebook. The source code is all on github, but hopefully the notebook provides an easy reading experience. I've tried to do visualizations for anything that can be remotely visualized, and plan to play around with it some more and learn more about framework and plot.

Bill Mill 1 year ago

Comparing golang sqlite to C sqlite

Following a question on reddit, I got curious about how well golang sqlite bindings (and translations, and WASM embeddings) compare to using C directly. I'm competent with C, but no expert, so I used the help of an LLM to translate a golang sqlite benchmark to C and tried to make it as equivalent to the golang as possible in terms of the work it was doing. The test I copied is pretty simple: insert a million users into a database, then query the users table and create user structs for each.

The result is just one test, but it suggests that all sqlite bindings (or translations) have trouble keeping up with sqlite in C. All times here are in milliseconds, and I picked the best of two runs. The tests were run with go 1.23.1 and clang 16.0.0 on an apple m1 mac with 32gb of ram. This is a deeply non-serious benchmark - I was browsing the web while I was running it, for starters. Still, the magnitude of the results indicates that sqlite in C is still quite a bit faster than any of the alternatives in go.

Somebody on mastodon asked me how this compares to previous versions of go, so I did a brief test of mattn and crawshaw on go 1.19, and found that they were in the range of 10-15% slower, so real progress on making cgo faster has been made in a pretty short timeframe. The code is available here in a gist. For kicks, I added a python version to the table above; source is in the same gist.

2025 Jun 19: noticed that the python ratios were not correct and fixed them; I did not re-run any tests.
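The real code is in the gist; purely to illustrate the shape of the benchmark, a python version along these lines might look like this (the schema and names here are invented):

```python
"""Illustrates the shape of the benchmark only; the gist has the real code."""
import sqlite3
import time
from dataclasses import dataclass

N = 1_000_000

@dataclass
class User:
    id: int
    name: str
    email: str

def bench() -> None:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

    start = time.monotonic()
    with conn:  # one transaction around all of the inserts
        conn.executemany(
            "INSERT INTO users (name, email) VALUES (?, ?)",
            ((f"user{i}", f"user{i}@example.com") for i in range(N)),
        )
    print(f"insert: {(time.monotonic() - start) * 1000:.0f}ms")

    start = time.monotonic()
    users = [User(*row) for row in conn.execute("SELECT id, name, email FROM users")]
    print(f"query:  {(time.monotonic() - start) * 1000:.0f}ms ({len(users)} users)")

if __name__ == "__main__":
    bench()
```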

Bill Mill 1 year ago

Colophon

To add to this blog, I type out a note in Obsidian; I use Obsidian with very few plugins. When it's reasonably done, I back it up with the obsidian git plugin by clicking the "backup" button, which backs it up to a private github repository. That repository has a github action that runs run.py from my obsidian_notes repository. That script converts the big pile of Obsidian-flavored markdown into a big pile of HTML. The github action uploads that big pile of HTML to a Digital Ocean space using s3cmd. (I don't recommend Digital Ocean spaces, but that's what I use for now) To serve the site, I have caddy installed on my web server, proxying requests through to the digital ocean space. That's it! Feel free to ask me questions on mastodon.
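run.py does a lot more than this, but a toy version of the markdown-to-HTML step could look something like the following (it assumes the third-party markdown package; Obsidian-specific syntax like wiki-links needs extra handling on top):

```python
"""Toy markdown-to-HTML converter: walk a notes directory, write HTML files."""
from pathlib import Path

import markdown  # pip install markdown

SRC = Path("notes")
OUT = Path("build")

for md_file in SRC.rglob("*.md"):
    html = markdown.markdown(md_file.read_text(encoding="utf-8"))
    dest = OUT / md_file.relative_to(SRC).with_suffix(".html")
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_text(f"<html><body>{html}</body></html>", encoding="utf-8")
    print(f"wrote {dest}")
```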

Bill Mill 1 year ago

How to debug git with visual studio code on a mac

I needed to debug the git binary to figure out exactly what it was doing, and I thought I'd share the setup because it took me a little while to get going.

1. Make a directory and get a copy of the git source into it
2. Open the git directory with visual studio code
3. Create a launch configuration file (.vscode/launch.json) that tells vs code how to start git for debugging
4. Open the relevant source file in the editor
5. Set a breakpoint on the first line inside the function you're interested in; in my current version of git, the one I was after is located at line 1504
6. Switch to the Run and Debug tab of VS code
7. Click the green arrow located at the top left of the window, or press F5 to do the same thing

If all goes well, you should now be controlling the execution of git inside VS code! Go ahead and dig in to try and understand what's going on. I might write more about what it is that's going on in there, but I'm not sure - let me know if you would like to understand some part of it better.

Bill Mill 1 year ago

A command to print terminal color codes

I write shell scripts that use terminal colors often enough that it's handy to know a bit about them, but not often enough that I remember how they work without looking it up. So, this morning, I wrote a shell script that prints out tables of the basic ANSI colors and text effects, and demonstrates what they look like on your terminal. Here's the script, and here's what it looks like on a mostly-stock apple terminal.app with a dark color scheme. You can pass a flag to the script to get a table of the 8-bit colors, in case you want to get real fancy with your script.
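The script itself is shell, but the escape codes it demonstrates are easy to show in a few lines of python (this is an illustration of the standard ANSI codes, not the script itself):

```python
"""Print the basic ANSI foreground colors and a few text effects."""
ESC = "\033["
RESET = f"{ESC}0m"

# The basic 8 foreground colors are codes 30-37; the bright variants are 90-97.
COLORS = ["black", "red", "green", "yellow", "blue", "magenta", "cyan", "white"]
EFFECTS = {"bold": 1, "dim": 2, "italic": 3, "underline": 4, "reverse": 7}

for i, name in enumerate(COLORS):
    print(f"{ESC}{30 + i}m{name:<8}{RESET} {ESC}{90 + i}mbright {name}{RESET}")

for name, code in EFFECTS.items():
    print(f"{ESC}{code}m{name}{RESET}")
```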

Bill Mill 1 year ago

Serving a billion web requests with boring code

When I worked as a contractor to the US government at ad hoc, I was fortunate enough to get the opportunity to design large parts of a relaunch of medicare plan compare, the US government site through which hundreds of thousands of medicare recipients purchase their health care plans each year. We released after about a year of development, and were able to help millions of people find and purchase health care - if you're in the US, you are pretty likely to know somebody who used this system.

Though the US health care system is incredibly awful in many respects, I'm very proud of what the team I worked with was able to build in a short time frame and under a lot of constraints. The team of people that I worked with - managers, designers, engineers and business analysts - was excellent from top to bottom. I've never before experienced such a dedicated, collaborative, trusting team, and I learned a tremendous amount from them. I especially want to share this story because I want to show that quality software can get written under government constraints. It can! And if we all start believing that it can, we're more likely to produce it.

I worked on this system for about two and a half years, from the very first commit through two open enrollment periods. The API system served about 5 million requests on a normal weekday, with < 10 millisecond average request latency and a 95th percentile latency of less than 100 milliseconds. It served a very low baseline rate of errors, mostly spurious errors due to vulnerability scrapers. I'm proud that I can count the number of times an engineer was woken up by an emergency page on one hand. I was amazed at how far you can get by leaning on postgres and golang, keeping your system as organized and simple as possible, and putting in work every day for a long period of time.

My north star when building the system was to keep it as boring as possible, in the Dan McKinley sense. (Go read "Choose Boring Technology" if you haven't yet, it's required reading. It's better than this article.) There is a concept of "Innovation tokens" in that article, and I was explicit about how I spent them when choosing the pieces I used to build the site.

There are many valid criticisms of react; this piece is an example, and I was aware of the issues already in 2018 when I was building the site. The main thrust is that it tends towards large application bundles, which take a long time to download and execute, especially on the cheap mobile phones that are the main link to the internet for so many people. In building a piece of infrastructure for the government, it was especially important that the application be available to as many people as possible. We took accessibility seriously both in the sense that the site needed to have proper design for users with disabilities and also in the sense that people with many devices needed to connect to it.

Nevertheless, I chose an SPA architecture and react for the site. I would have loved to have done differently, but I worried that choosing to use a multi-page architecture or a different library would have slowed us down enough that we wouldn't have delivered on our tight timeline. I didn't have enough trust in any of the alternatives available to me at the time to make me believe we could choose them safely enough. The result fell prey after a few years to a common failure mode of react apps, and became quite heavy and loaded somewhat slowly.
I still think I made the right choice at the time, but it's unfortunate that I felt I had to make it and I wish I had known of a nice clean way to avoid it.

Golang was overall a joy to build this project in. It runs efficiently both at build time and at run time, and having binary executable artifacts that build quickly makes it easy to deploy rapidly. Developers new to the language (our team of engineers grew from 2 to 15) were able to get onboard quickly and understand the language with no trouble. Error handling being both immediate and verbose is, in my opinion, a great feature for building systems that are resilient. Every time you do something that might fail, you are faced with handling the error case, and once you develop patterns they are consistent and predictable. (I know this is a big topic, I should probably write more about it.)

The day that we were able to switch to go modules, a big pain point went away for us. We hit a few bumps in the road as very early adopters but it was worth it. My biggest gripe with the golang ecosystem was that the documentation generation for projects that are not public sucks. For a long time, the documentation generator didn't even support projects that used modules. That said, I was overwhelmingly happy with my choice here and never regretted it.

I made two architectural bets that I was less confident of than the others: the way I split up the backend, and gRPC. I split the backend up into three parts; they all lived in the same repository but were designed such that they could be pulled apart and given to a new team if necessary. Each component had its own postgres database (which were physically co-located, but never intertwined) and strictly used gRPC to communicate between themselves. The split was largely based around data access patterns.

One thing the site needed to be able to do was estimate the cost of any packaging variation of any drug at any pharmacy on any health insurance plan. Easy, right? This combinatorial explosion (I once calculated how many trillions of possibilities this was) necessitated a very carefully-indexed database, a large amount of preprocessing, and a commitment to engineering with performance in mind. It took a long time and a ton of government health system reverse engineering to figure out how to get even close to right with this part. I'm forever indebted to my colleagues who dove deep into the depths of CMS bureaucracy, and then turned it into documentation and code.

The main purpose of the site was for people to search for and purchase medicare part C and part D health care plans. Every day, we'd get a new dump of detailed health care plan information from CMS; this module would load the information into a new postgres database, and then we'd deploy a new version pointing at the new day's data. Both of these modules - the drug pricing one and the plan data one - had entirely immutable databases in this way; their only job was to serve an API based on the most recent data for each.

In the insurance argot, a person on a health care plan is a "beneficiary" of the plan. It sounds a little self-important to me, but that's just what it is I suppose. (One thing I tried to do throughout my work on this project was to use the proper industry jargon wherever possible rather than something more familiar to myself. I felt it was part of the commitment to boringness to keep the jargon friction down to a minimum.) The job of the beneficiary module was to store information about plan customers, and it was the only part of the application where the database was long-lived and mutable.
We strove to store as little data as possible here, to minimize the risk should there be any data leakage, but there was no way around storing a very scary amount of Personally Identifiable Information (PII). We were as serious as possible about this data, and the risk of losing control of it kept me nervous at all times.

Overall, gRPC was not as great for us as I'd hoped when beginning the project. The best benefit of using it, and the driver behind my choice to use it, was that every interface was specified in code. This was very useful; we could generate tools and interfaces that changed in lockstep. The biggest pain points were all related to the tooling. I maintained a set of very hairy makefiles with eldritch commands to build all the interfaces into go files that we could use, and debugging those was always painful. Not being able to curl the system, as we would if it were a JSON API, was a pain in the butt. A gRPC equivalent existed, and we used it, but it was not nearly as nice. grpc-gateway was the best part of the ecosystem I used; it served more than a billion requests for us and was never once the source of a problem. It enabled us to do what gRPC ought to have been able to do from the start, serve requests to web clients. I loved having interface schemas, but we used so few of gRPC's features and the code generation was so complicated that we probably would have been slightly better off without it.

We followed a strict backwards-compatibility requirement, and only added and never removed fields from our interfaces. Once a field was exposed in the public API, it would be exposed forever unless it became a security problem (which, thankfully, never happened to us in the years I worked on this project). We did the same with the databases as the API; columns were added and rarely removed. If a column actually merited removal, the process was to add a new column, remove all references to the old one, wait a few weeks to make sure we didn't need to roll back, then finally remove the column from the database. Our discipline with backwards compatibility gave us freedom to keep up a high rate of changes and maintain confidence that local changes would not have negative downstream consequences.

A core principle of the app was to rely on postgres whenever possible, and also to be stupid instead of clever whenever possible. Faceted search was an excellent example of both of those properties. We could have reached for elasticsearch, and we also could have tried to use an ORM, or built a little faceting language. Instead, we implemented faceted search by having a well-indexed table of plans and building a SQL query string by tacking on clauses based on a long series of conditionals. The core of this scheme is a single 250 line function, heavily commented, which lays out the logic in a nearly flat way. The focus is kept squarely on business requirements, instead of on fancy code.

We stored the database schemas in a series of files with leading numbers, so that they could be loaded in order at database creation time. For the two immutable modules, there were no migrations, because their databases were recreated every day. Instead, there was a number stored in both the database and the application. Given that we (tried very hard to) never make backwards incompatible changes to the database, the apps would check at startup that their database schema version number was greater than or equal to the database version number stored in the database, and refuse to start if they were not.
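A sketch of what that startup check looks like - a hypothetical illustration rather than the project's code, with made-up table and column names (and in python rather than go):

```python
"""Refuse to start if the app is older than the schema it's pointed at."""
import sys

import psycopg2  # hypothetical choice of driver

# The newest schema version this build of the app knows about.
APP_SCHEMA_VERSION = 42

def check_schema_version(dsn: str) -> None:
    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT version FROM schema_version LIMIT 1")
            row = cur.fetchone()
    finally:
        conn.close()

    if row is None:
        sys.exit("refusing to start: no schema version found in the database")

    db_version = row[0]
    if APP_SCHEMA_VERSION < db_version:
        # Fail loudly at startup rather than limping along against a schema
        # this build has never seen.
        sys.exit(
            f"refusing to start: app schema version {APP_SCHEMA_VERSION} is "
            f"older than database schema version {db_version}"
        )

check_schema_version("dbname=plans user=plans")
```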
This was a general pattern: if an app encountered any unexpected or missing configuration, it refused to start and threw noticeable, hopefully clear, errors. I tried hard to make it so that if the apps actually started up, they would have everything they needed to run properly. There were occasional instances where we accidentally rolled out backwards-incompatible changes to the database, and in those cases we generally rolled back the data update and rebuilt it.

The part of the system I'm most proud of, and on which I spent the most effort, is the ETL process. We had a series of shell scripts for each data source we ingested (there were many), which would pull the data and put it in an s3 bucket. Then, early in the morning, a cron job would spin up an EC2 instance, which would pull in the latest ETL code and all the data files. It would spin up a new database in our RDS instance, and begin the ETL process. If things went well, right about the time the east coasters got into work, a new database would be rotating into service. My recollections are not exact, but it took something like two to four hours to generate a new database with more than 250 million rows out of several gigabytes of text files in various formats.

The code to insert data into the database heavily utilized postgres' COPY statement, avoiding INSERTs as much as possible in favor of generating batches of rows that could be COPYed into the database. We used the xo library to connect to the database and generate models, along with heavily customized templates. The templates themselves, and the code to create the models from them, were hairy. Thankfully it mostly only had to be written once and occasionally edited.

Here was my biggest mistake: I invested a great deal of time and effort in creating sql-mock tests for data that changed regularly. These tests needed constant, tedious maintenance. I should instead have tested against a live database, especially given that we were working mostly with immutable databases and wouldn't have had to deal with recreating it for each test.

Each table in the database had an accompanying script that would generate a subset of the data for use in local development, since the final database was too large to run on a developer's machine. This let each developer work with a live, local copy of the database and enabled efficient development of changes. I highly recommend building in this tooling from the start; it saves you from either trying to add it in once your database grows large, or having your team connect to a remote database, making development slower.

We had a CLI tool, written mostly as a bunch of shell scripts, with a ton of available commands that performed all kinds of utility functions related to observability and operations. It was mostly written by an excellent coworker, and working with it is where I learned to write effective shell scripts. (Thanks to Nathan and shellcheck for this vital skill.) Having this available from the start served as a really useful place for utility features to coalesce; without it they tend to scatter to more places further afield, or just live in a developer's shell history. One fun bit of tooling I built was the ability to generate graphs from splunk (our log aggregation service) via slack commands, which was particularly helpful in incident handling, as you could easily share graphs with your coworkers.

Every request that entered the backend got a request id as soon as it hit the system, and that request id was carried around with it wherever it went.
Middleware gave the request its id, and then another middleware constructed a sub-logger with the request id embedded into it that was attached to the request context, so that all logs always had the request ID attached. The system logged on entry and exit, with as much detail as was safe. Any other logging that was above the debug level was supposed to be exceptional, although we weren't super strict about that. We used zerolog, and it worked great for us.

At some point, I converted the markdown docs I and others had written in github into a book about how the system worked, using sphinx-book-theme. Miraculously, this book gained traction, and I got great contributions from teammates and everybody knew where to look to find system documentation. I have started documentation websites for many other projects, and no other ones have ever worked as successfully, and I wish that I had any idea why that was but I don't. It proudly featured our mascot (the corgi) showing off its most notable feature.

Our client frequently wanted us to add queries that would operate from the browser, and I was fortunate to be able to push back and turn many of those into build-time requests instead. One place where our performance got killed at our clients' request was with render-blocking analytics scripts; it seemed every team wanted a different script run to get the analytics that they "needed". I advised them against it and tried to demonstrate the performance and download size problems they incurred, but the client was not interested in my arguments.

There are so many more parts of a system like this that I haven't covered here; I mostly wanted to write down a bit about a bunch of the pieces while I still remember what they are. I was very fortunate to be able to work with such a positive, creative, and engaged team that made the space for such a successful project to be possible. An article about the social factors and personalities that made the team go, and the site happen, would be a second article as long as this one is.

Bill Mill 1 year ago

New site feature - separate blogs

I added a new feature to notes.billmill.org: separate blogs. I'm also planning to improve the quality of the feeds; I put a few notes on this in Feeds are not fit for gardening. At the least, I plan to add summaries and avoid putting minor edits into the feeds.

Bill Mill 1 year ago

game of life v3

https://llimllib.github.io/ca/03/ My kids inexplicably wanted to play the quick game of life I coded a year ago. Unfortunately, when I opened it up and tried to use it, performance had regressed tremendously in firefox - it was unusably slow. So I profiled it, and found that drawing rectangles to a canvas in firefox seems to be a lot slower than in safari. With that in mind, I figured out a faster way to draw cells to the canvas, which sped the program up nicely.

Bill Mill 1 year ago

A Mandelbrot viewer

I realized the other day that I had never drawn a mandelbrot fractal before, and that seemed like something I ought to have done, so today I wrote a little mandelbrot viewer. Click and drag to zoom in on an area. Javascript's lack of support for complex numbers means you have to do a tiny bit of legwork before you can write the core mandelbrot loop, defining simple, inefficient complex number addition and multiplication. The mandelbrot function accepts a complex number $c$, sets $z = c$, and runs the recursion $z = z^2 + c$ in a loop, incrementing a counter $count$ as it goes; in my javascript implementation, complex numbers are represented as a two-element array. To draw the pretty picture that so many people recognize, a javascript function draws the mandelbrot on a canvas, making use of a few d3.js functions. Go give the toy a try and maybe implement your own Mandelbrot - it's really not too difficult and it's quite fun.
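As an aside, python has complex numbers built in, so a sketch of the same loop (an illustration, not the post's javascript) is very short:

```python
def mandelbrot(c: complex, max_iter: int = 100) -> int:
    """Count iterations of z = z^2 + c before |z| escapes 2, up to max_iter."""
    z = c
    count = 0
    while abs(z) <= 2 and count < max_iter:
        z = z * z + c
        count += 1
    return count

# Crude ASCII rendering: points that never escape are in the set.
for im in range(-10, 11):
    print("".join(
        "#" if mandelbrot(complex(re / 10, im / 10)) == 100 else " "
        for re in range(-20, 11)
    ))
```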

Bill Mill 1 year ago

mitzVah - the "worst" pangrams part 2

update apr 2: I accidentally deleted this article and now have re-posted it.

In part 1, I introduced the spelling bee and the concept of the "worst" pangram, which I defined as the one which produced the fewest possible words. Then I wrote and described a program that found them, and labelled the winner the Worst Pangram™. I thought I was done getting nerdsniped after that, but a few people helpfully provided bits of information I was missing the first time, and I also thought a bit more about the problem to change its definition a bit.

People shared two bits of information with me that I didn't have when I started. I took Brad's word list (thanks!) and added calculation of the word score and each pangram's total score. I didn't mention how words are scored in the first article: four-letter words are worth one point, longer words are worth one point per letter, and a pangram earns a seven-point bonus. This led to a funny result, with a new "worst" pangram.

The initial prompt for this exploration was a question about whether there were any spelling bee puzzles where just the pangram would get you to the "genius" level. I had changed the question to suit the knowledge I had at the start, but now I realized I had enough information to try and answer the original question. I looked at that list and realized that if you set the puzzle to one particular pangram, with the right required letter, you'd have only the pangram and three additional words (one of which is an alternative spelling of another, apparently!). That would result in a puzzle where the pangram was responsible for 15/22 points, or just a tiny hair below genius level.

Could we do better? I took the same program I used in part 1 and added a search through each letter of the pangram, scoring the result of making that letter required, and collected every pangram my program finds that would reach genius level all by itself. Following the same start of the program I used to find the word count for every pangram in part 1, I added a couple more loops; the implementation is in python. I didn't dive much into performance.

With that, I feel like I can finally wash my hands of this problem! (Maybe? we'll see!)
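A rough sketch of that search - not the actual implementation; it assumes a `words` list of lowercase dictionary words and the scoring rules above:

```python
def score(word: str, is_pangram: bool) -> int:
    # 4-letter words are worth 1 point, longer words 1 point per letter,
    # and a pangram gets a 7 point bonus.
    points = 1 if len(word) == 4 else len(word)
    return points + 7 if is_pangram else points

def genius_pangram_puzzles(words: list[str]):
    """Yield (pangram, required letter) pairs whose pangram alone hits genius."""
    words = [w for w in words if len(w) >= 4]
    pangrams = [w for w in words if len(set(w)) == 7]

    for pangram in pangrams:
        letters = set(pangram)
        # Words playable with this pangram's seven letters.
        candidates = [w for w in words if set(w) <= letters]
        for required in letters:
            valid = [w for w in candidates if required in w]
            total = sum(score(w, set(w) == letters) for w in valid)
            pangram_points = score(pangram, True)
            # "Genius" is 70% of the puzzle's total points.
            if pangram_points >= 0.7 * total:
                yield pangram, required

# usage: for pangram, required in genius_pangram_puzzles(words): print(pangram, required)
```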
