
Notes on Lagrange Interpolating Polynomials

Polynomial interpolation is a method of finding a polynomial function that fits a given set of data perfectly. More concretely, suppose we have a set of n+1 distinct points [1]:

(x_0,y_0), (x_1,y_1) \cdots (x_n,y_n)

And we want to find the polynomial coefficients \{a_0\cdots a_n\} such that:

p(x)=a_0+a_1 x+a_2 x^2+\cdots+a_n x^n

Fits all our points; that is p(x_0)=y_0 , p(x_1)=y_1 etc. This post discusses a common approach to solving this problem, and also shows why such a polynomial exists and is unique.

When we assign all points (x_i, y_i) into the generic polynomial p(x) , we get:

a_0+a_1 x_0+a_2 x_0^2+\cdots+a_n x_0^n=y_0
a_0+a_1 x_1+a_2 x_1^2+\cdots+a_n x_1^n=y_1
\vdots
a_0+a_1 x_n+a_2 x_n^2+\cdots+a_n x_n^n=y_n

We want to solve for the coefficients a_i . This is a linear system of equations that can be represented by the following matrix equation:

\begin{pmatrix} 1 & x_0 & x_0^2 & \cdots & x_0^n \\ 1 & x_1 & x_1^2 & \cdots & x_1^n \\ \vdots & & & & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^n \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{pmatrix} = \begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_n \end{pmatrix}

The matrix on the left is called the Vandermonde matrix . This matrix is known to be invertible (see Appendix for a proof); therefore, this system of equations has a single solution that can be calculated by inverting the matrix. In practice, however, the Vandermonde matrix is often numerically ill-conditioned, so inverting it isn't the best way to calculate exact polynomial coefficients. Several better methods exist.

Lagrange interpolation polynomials emerge from a simple, yet powerful idea. Given our points (x_i, y_i) , let's define the Lagrange basis functions l_i(x) ( i \in [0, n] ) as follows:

l_i(x_j) = \begin{cases} 1 & j=i \\ 0 & j\neq i \end{cases}

In words, l_i(x) is constrained to 1 at x_i and to 0 at all other x_j . We don't care about its value at any other point. The linear combination:

p(x)=\sum_{i=0}^{n}y_i l_i(x)

is then a valid interpolating polynomial for our set of n+1 points, because it's equal to y_i at each x_i (take a moment to convince yourself this is true). How do we find l_i(x) ? The key insight comes from studying the following function:

l'_i(x)=\prod_{j\neq i}(x-x_j)

This function has terms (x-x_j) for all j\neq i . It should be easy to see that l'_i(x) is 0 at all x_j when j\neq i . What about its value at x_i , though? We can just assign x_i into l'_i(x) to get:

l'_i(x_i)=\prod_{j\neq i}(x_i-x_j)

And then normalize l'_i(x) , dividing it by this (constant) value. We get the Lagrange basis function l_i(x) :

l_i(x)=\prod_{j\neq i}\frac{x-x_j}{x_i-x_j}

Let's use a concrete example to visualize this.
Suppose we have the following set of points we want to interpolate: (1,4), (2,2), (3,3) . We can calculate l'_0(x) , l'_1(x) and l'_2(x) , and get the following:

l'_0(x)=(x-2)(x-3), \quad l'_1(x)=(x-1)(x-3), \quad l'_2(x)=(x-1)(x-2)

Note where each l'_i(x) intersects the x axis. These functions have the right values (zeros) at all x_{j\neq i} . If we normalize them to obtain l_i(x) , we get these functions:

l_0(x)=\frac{(x-2)(x-3)}{2}, \quad l_1(x)=\frac{(x-1)(x-3)}{-1}, \quad l_2(x)=\frac{(x-1)(x-2)}{2}

Note that each polynomial is 1 at the appropriate x_i and 0 at all the other x_{j\neq i} , as required. With these l_i(x) , we can now plot the interpolating polynomial p(x)=\sum_{i=0}^{n}y_i l_i(x) , which fits our set of input points.

We've just seen that the linear combination of Lagrange basis functions:

p(x)=\sum_{i=0}^{n}y_i l_i(x)

is a valid interpolating polynomial for a set of n+1 distinct points (x_i, y_i) . What is its degree? Since the degree of each l_i(x) is n , the degree of p(x) is at most n . We've just derived the first part of the Polynomial interpolation theorem :

Polynomial interpolation theorem : for any n+1 data points (x_0,y_0), (x_1, y_1)\cdots(x_n, y_n) \in \mathbb{R}^2 where no two x_j are the same, there exists a unique polynomial p(x) of degree at most n that interpolates these points.

We've demonstrated existence and degree, but not yet uniqueness . So let's turn to that. We know that p(x) interpolates all n+1 points, and its degree is at most n . Suppose there's another such polynomial q(x) . Let's construct:

r(x)=p(x)-q(x)

What do we know about r(x) ? First of all, its value is 0 at all our x_i , so it has n+1 roots . Second, we also know that its degree is at most n (because it's the difference of two polynomials of such degree). These two facts are a contradiction: no non-zero polynomial of degree \leq n can have n+1 roots (a basic algebraic fact related to the Fundamental theorem of algebra ). So r(x) must be the zero polynomial; in other words, our p(x) is unique \blacksquare . Note the implication of uniqueness here: given our set of n+1 distinct points, there's only one polynomial of degree \leq n that interpolates it.
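As a sanity check of this construction, here's a short Python sketch (mine, not code from the post) that evaluates p(x)=\sum_i y_i l_i(x) directly from the product formula for l_i(x) , using exact fractions:

```python
from fractions import Fraction

def lagrange_interpolate(points):
    """Return p(x) = sum_i y_i * l_i(x) as a callable, where
    l_i(x) = prod_{j != i} (x - x_j) / (x_i - x_j)."""
    xs = [Fraction(x) for x, _ in points]
    ys = [Fraction(y) for _, y in points]

    def p(x):
        x = Fraction(x)
        total = Fraction(0)
        for i, xi in enumerate(xs):
            li = Fraction(1)  # build l_i(x) as a product of normalized terms
            for j, xj in enumerate(xs):
                if j != i:
                    li *= (x - xj) / (xi - xj)
            total += ys[i] * li
        return total

    return p

# The example points from the text:
p = lagrange_interpolate([(1, 4), (2, 2), (3, 3)])
# p hits every input point exactly: p(1) = 4, p(2) = 2, p(3) = 3
```

Running this on the example set reproduces each y_i at its node, as the interpolation theorem promises.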
We can find its coefficients by inverting the Vandermonde matrix, by using Lagrange basis functions, or any other method [2] .

The set P_n(\mathbb{R}) consists of all real polynomials of degree \leq n . This set - along with addition of polynomials and scalar multiplication - forms a vector space . We called l_i(x) the "Lagrange basis" previously, and they do - in fact - form an actual linear algebra basis for this vector space. To prove this claim, we need to show that Lagrange polynomials are linearly independent and that they span the space.

Linear independence : we have to show that

s(x)=\sum_{i=0}^{n}a_i l_i(x)=0

implies a_i=0 \quad \forall i . Recall that l_i(x) is 1 at x_i , while all other l_j(x) are 0 at that point. Therefore, evaluating s(x) at x_0 , we get:

s(x_0)=a_0 l_0(x_0)=a_0=0

Similarly, we can show that a_i is 0, for all i \blacksquare .

Span : we've already demonstrated that the linear combination of l_i(x) :

p(x)=\sum_{i=0}^{n}y_i l_i(x)

is a valid interpolating polynomial for any set of n+1 distinct points. Using the polynomial interpolation theorem , this is the unique polynomial interpolating this set of points. In other words, for every q(x)\in P_n(\mathbb{R}) , we can identify any set of n+1 distinct points it passes through, and then use the technique described in this post to find the coefficients of q(x) in the Lagrange basis. Therefore, the set l_i(x) spans the vector space \blacksquare .

Previously we've seen how to use the \{1, x, x^2, \dots x^n\} basis to write down a system of linear equations that helps us find the interpolating polynomial. This results in the Vandermonde matrix . Using the Lagrange basis, we can get a much nicer matrix representation of the interpolation equations. Recall that our general polynomial using the Lagrange basis is:

p(x)=\sum_{i=0}^{n}a_i l_i(x)

Let's build a system of equations for each of the n+1 points (x_i,y_i) . For x_0 :

p(x_0)=a_0 l_0(x_0)+a_1 l_1(x_0)+\cdots+a_n l_n(x_0)

By definition of the Lagrange basis functions, all l_i(x_0) where i\neq 0 are 0, while l_0(x_0) is 1. So this simplifies to:

p(x_0)=a_0

But the value at node x_0 is y_0 , so we've just found that a_0=y_0 .
We can produce similar equations for the other nodes as well, p(x_1)=a_1 , etc. In matrix form:

\begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{pmatrix} = \begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_n \end{pmatrix}

We get the identity matrix; this is another way to trivially show that a_0=y_0 , a_1=y_1 and so on.

Given some numbers \{x_0 \dots x_n\} , a matrix of this form:

V=\begin{pmatrix} 1 & x_0 & x_0^2 & \cdots & x_0^n \\ 1 & x_1 & x_1^2 & \cdots & x_1^n \\ \vdots & & & & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^n \end{pmatrix}

is called the Vandermonde matrix. What's special about a Vandermonde matrix is that we know it's invertible when all the x_i are distinct. This is because its determinant is known to be non-zero . Moreover, its determinant is [3] :

\det(V)=\prod_{0\leq i<j\leq n}(x_j-x_i)

Here's why. To get some intuition, let's consider some small-rank Vandermonde matrices. Starting with a 2-by-2:

\det\begin{pmatrix} 1 & x_0 \\ 1 & x_1 \end{pmatrix}=x_1-x_0

Let's try 3-by-3 now:

\det\begin{pmatrix} 1 & x_0 & x_0^2 \\ 1 & x_1 & x_1^2 \\ 1 & x_2 & x_2^2 \end{pmatrix}

We can use the standard way of calculating determinants to expand from the first row:

(x_1 x_2^2 - x_2 x_1^2) - x_0(x_2^2-x_1^2) + x_0^2(x_2-x_1)

Using some algebraic manipulation, it's easy to show this is equivalent to:

(x_1-x_0)(x_2-x_0)(x_2-x_1)

For the full proof, let's look at the generalized n+1 -by- n+1 matrix V again. Recall that subtracting a multiple of one column from another doesn't change a matrix's determinant. For each column k>1 , we'll subtract the value of column k-1 multiplied by x_0 from it (this is done on all columns simultaneously). The idea is to make the first row all zeros after the very first element. Each remaining row now exhibits a common factor - x_1-x_0 in the second row (after the first element), x_2-x_0 in the third row and so on:

V'=\begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 1 & x_1-x_0 & x_1(x_1-x_0) & \cdots & x_1^{n-1}(x_1-x_0) \\ \vdots & & & & \vdots \\ 1 & x_n-x_0 & x_n(x_n-x_0) & \cdots & x_n^{n-1}(x_n-x_0) \end{pmatrix}

Imagine we erase the first row and first column of V' . We'll call the resulting matrix W . Because the first row of V' is all zeros except the first element, we have:

\det(V)=\det(V')=\det(W)

Note that the first row of W has a common factor of x_1-x_0 , so when calculating \det(W) , we can move this common factor out. Same for the common factor x_2-x_0 of the second row, and so on. Overall, we can write:

\det(V)=(x_1-x_0)(x_2-x_0)\cdots(x_n-x_0)\det(V_{n-1})

But the smaller matrix V_{n-1} is just the Vandermonde matrix for \{x_1 \dots x_n\} . If we continue this process by induction, we'll get:

\det(V)=\prod_{0\leq i<j\leq n}(x_j-x_i)

If you're interested, the Wikipedia page for the Vandermonde matrix has a couple of additional proofs.
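To make the appendix concrete, here's a small self-contained check (my own sketch, not from the post) that the cofactor-expansion determinant of a Vandermonde matrix matches the product formula for a sample set of nodes:

```python
from fractions import Fraction
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = Fraction(0)
    for col in range(len(m)):
        # Minor: drop row 0 and the current column.
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        sign = -1 if col % 2 else 1
        total += sign * m[0][col] * det(minor)
    return total

def vandermonde(xs):
    n = len(xs)
    return [[Fraction(x) ** j for j in range(n)] for x in xs]

xs = [2, 3, 5, 7]
lhs = det(vandermonde(xs))
rhs = 1
for (i, xi), (j, xj) in combinations(list(enumerate(xs)), 2):
    rhs *= xj - xi  # i < j by construction of combinations
assert lhs == rhs  # det(V) == prod_{i<j} (x_j - x_i)
```

For these nodes both sides come out to (3-2)(5-2)(7-2)(5-3)(7-3)(7-5) = 240, and the determinant is nonzero exactly because the nodes are distinct.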


Notes on Linear Algebra for Polynomials

We'll be working with the set P_n(\mathbb{R}) , real polynomials of degree \leq n . Such polynomials can be expressed using n+1 scalar coefficients a_i as follows:

p(x)=a_0+a_1 x+a_2 x^2+\cdots+a_n x^n

The set P_n(\mathbb{R}) , along with addition of polynomials and scalar multiplication, forms a vector space . As a proof, let's review how the vector space axioms are satisfied. We'll use p(x) , q(x) and r(x) as arbitrary polynomials from the set P_n(\mathbb{R}) for the demonstration. Similarly, a and b are arbitrary scalars in \mathbb{R} .

Associativity of vector addition : This is trivial because addition of polynomials is associative [1] :

(p(x)+q(x))+r(x)=p(x)+(q(x)+r(x))

Commutativity is similarly trivial, for the same reason. Commutativity of vector addition :

p(x)+q(x)=q(x)+p(x)

Identity element of vector addition : The zero polynomial 0 serves as an identity element. \forall p(x)\in P_n(\mathbb{R}) , we have 0 + p(x) = p(x) .

Inverse element of vector addition : For each p(x) , we can use q(x)=-p(x) as the additive inverse, because p(x)+q(x)=0 .

Identity element of scalar multiplication : The scalar 1 serves as an identity element for scalar multiplication. For each p(x) , it's true that 1\cdot p(x)=p(x) .

Associativity of scalar multiplication : For any two scalars a and b :

a(b\cdot p(x))=(ab)\cdot p(x)

Distributivity of scalar multiplication over vector addition : For any p(x) , q(x) and scalar a :

a(p(x)+q(x))=a\cdot p(x)+a\cdot q(x)

Distributivity of scalar multiplication over scalar addition : For any scalars a and b and polynomial p(x) :

(a+b)\cdot p(x)=a\cdot p(x)+b\cdot p(x)

Since we've shown that polynomials in P_n(\mathbb{R}) form a vector space, we can now build additional linear algebraic definitions on top of that. A set of k polynomials p_i(x)\in P_n(\mathbb{R}) is said to be linearly independent if

a_1 p_1(x)+a_2 p_2(x)+\cdots+a_k p_k(x)=0

implies a_i=0 \quad \forall i . In words, the only linear combination resulting in the zero vector is when all coefficients are 0. As an example, let's discuss the fundamental building blocks of polynomials in P_n(\mathbb{R}) : the set \{1, x, x^2, \dots x^n\} .
These are linearly independent because:

a_0+a_1 x+a_2 x^2+\cdots+a_n x^n=0

is true only for the zero polynomial, in which all the coefficients a_i=0 . This comes from the very definition of polynomials. Moreover, this set spans the entire P_n(\mathbb{R}) because every polynomial can be (by definition) expressed as a linear combination of \{1, x, x^2, \dots x^n\} . Since we've shown these basic polynomials are linearly independent and span the entire vector space, they are a basis for the space. In fact, this set has a special name: the monomial basis (because a monomial is a polynomial with a single term).

Suppose we have some set of polynomials, and we want to know if these form a basis for P_n(\mathbb{R}) . How do we go about it? The idea is using linear algebra the same way we do for any other vector space. Let's use a concrete example to demonstrate: Is the set Q a basis for P_n(\mathbb{R}) ? We'll start by checking whether the members of Q are linearly independent. Write a linear combination of the members of Q with coefficients a_i , and equate it to the zero polynomial. By regrouping, we can collect the coefficient of each monomial; for the combination to be the zero polynomial, the coefficient of each monomial has to be zero. In matrix form, this is a linear system in the a_i . We know how to solve this, by reducing the matrix into row-echelon form . It's easy to see that the reduced row-echelon form of this specific matrix is I , the identity matrix. Therefore, this system of equations has a single solution: a_i=0 \quad \forall i [2] . We've shown that the set Q is linearly independent.

Now let's show that it spans the space P_n(\mathbb{R}) . We want to analyze a linear combination of the members of Q equated to an arbitrary polynomial, and find the coefficients a_i that satisfy this for any arbitrary \alpha , \beta and \gamma . We proceed just as before, by regrouping on the left side and equating the coefficient of each power of x separately. If we turn this into matrix form, the matrix of coefficients is exactly the same as before. So we know there's a single solution, and by rearranging the matrix into I , the solution will appear on the right-hand side. It doesn't matter for the moment what the actual solution is, as long as it exists and is unique.
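The specific set Q from the post isn't reproduced above, so as a stand-in here's the same row-reduction check in Python for a hypothetical set \{1+x, 1-x, x^2\} in P_2 (the example set is mine, not the post's):

```python
from fractions import Fraction

def rref(m):
    """Reduced row-echelon form via Gauss-Jordan elimination."""
    m = [[Fraction(v) for v in row] for row in m]
    rows, cols = len(m), len(m[0])
    r = 0
    for lead in range(cols):
        if r >= rows:
            break
        pivot = next((i for i in range(r, rows) if m[i][lead] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        m[r] = [v / m[r][lead] for v in m[r]]  # normalize pivot row
        for i in range(rows):
            if i != r and m[i][lead] != 0:
                m[i] = [a - m[i][lead] * b for a, b in zip(m[i], m[r])]
        r += 1
    return m

# Hypothetical example: q1 = 1 + x, q2 = 1 - x, q3 = x^2.
# Equating a1*q1 + a2*q2 + a3*q3 to the zero polynomial and collecting
# the coefficient of each monomial (rows: 1, x, x^2) gives this system:
system = [[1, 1, 0],
          [1, -1, 0],
          [0, 0, 1]]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
assert rref(system) == identity  # only solution is a_i = 0 => independent
```

The matrix reduces to I , so the only solution is the all-zero one and the example set is linearly independent, just as the argument in the text goes.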
We've shown that Q spans the space! Since the set Q is linearly independent and spans P_n(\mathbb{R}) , it is a basis for the space.

I've discussed inner products for functions in the post about Hilbert space . Well, polynomials are functions , so we can define an inner product using integrals as follows [3] :

\langle p,q\rangle=\int_{a}^{b}p(x)q(x)w(x)dx

Where the bounds a and b are arbitrary, and could be infinite. Whenever we deal with integrals we worry about convergence; in my post on Hilbert spaces, we only talked about L^2 - the square integrable functions. Most polynomials are not square integrable, however. Therefore, we can restrict this using either a special weight function w(x) to make sure the inner product integral converges, or finite bounds on the integral, in which case we can just set w(x)=1 . Let's use the latter, and restrict the bounds into the range [-1,1] , setting w(x)=1 . We have the following inner product:

\langle p,q\rangle=\int_{-1}^{1}p(x)q(x)dx

Let's check that this satisfies the inner product space conditions.

Conjugate symmetry : Since real multiplication is commutative, we can write:

\langle p,q\rangle=\int_{-1}^{1}p(x)q(x)dx=\int_{-1}^{1}q(x)p(x)dx=\langle q,p\rangle

We deal in the reals here, so we can safely ignore complex conjugation.

Linearity in the first argument : Let p_1,p_2,q\in P_n(\mathbb{R}) and a,b\in \mathbb{R} . We want to show that

\langle ap_1+bp_2,q\rangle=a\langle p_1,q\rangle+b\langle p_2,q\rangle

Expand the left-hand side using our definition of inner product:

\int_{-1}^{1}(ap_1(x)+bp_2(x))q(x)dx=a\int_{-1}^{1}p_1(x)q(x)dx+b\int_{-1}^{1}p_2(x)q(x)dx

The result is equivalent to a\langle p_1,q\rangle +b\langle p_2,q\rangle .

Positive-definiteness : We want to show that for nonzero p\in P_n(\mathbb{R}) , we have \langle p, p\rangle > 0 . First of all, since p(x)^2\geq0 for all x , it's true that:

\langle p,p\rangle=\int_{-1}^{1}p(x)^2 dx\geq 0

What about the result 0, though? Well, let's say that \langle p,p\rangle=0 . Since p(x)^2 is a non-negative function, this means that the integral of a non-negative function ends up being 0. But p(x) is a polynomial, so it's continuous , and so is p(x)^2 . If the integral of a continuous non-negative function is 0, it means the function itself is 0. Had it been non-zero in any place, the integral would necessarily have to be positive as well. We've proven that \langle p, p\rangle=0 only when p is the zero polynomial. The positive-definiteness condition is satisfied.

In conclusion, P_n(\mathbb{R}) along with the inner product we've defined forms an inner product space . Now that we have an inner product, we can define orthogonality on polynomials: two polynomials p,q are orthogonal (w.r.t. our inner product) iff \langle p,q\rangle=0 . Contrary to expectation [4] , the monomial basis polynomials are not orthogonal using our definition of inner product. For example, calculating the inner product for 1 and x^2 :

\langle 1,x^2\rangle=\int_{-1}^{1}x^2 dx=\frac{2}{3}\neq 0

There are other sets of polynomials that are orthogonal using our inner product. For example, the Legendre polynomials ; but this is a topic for another post.
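This inner product can be computed exactly for polynomials, since each power of x integrates in closed form over [-1,1] : odd powers integrate to zero and each even power x^k contributes 2/(k+1) . Here's a short sketch (mine, not the post's) verifying that 1 and x^2 are not orthogonal while 1 and x are:

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists (index = power)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out

def inner(p, q):
    """<p, q> = integral of p(x)q(x) over [-1, 1], computed exactly:
    odd-power terms vanish; each even power x^k contributes 2*c_k/(k+1)."""
    prod = poly_mul(p, q)
    return sum(Fraction(2) * c / (k + 1)
               for k, c in enumerate(prod) if k % 2 == 0)

one = [1]          # the constant polynomial 1
x = [0, 1]         # the polynomial x
x2 = [0, 0, 1]     # the polynomial x^2

assert inner(one, x2) == Fraction(2, 3)   # not orthogonal, as in the text
assert inner(one, x) == 0                 # 1 and x *are* orthogonal
```

The nonzero value 2/3 reproduces the example from the text; the 1-and-x case hints at why the orthogonal families (like the Legendre polynomials) alternate even and odd polynomials.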


Rewriting pycparser with the help of an LLM

pycparser is my most widely used open source project (with ~20M daily downloads from PyPI [1]). It's a pure-Python parser for the C programming language, producing ASTs inspired by Python's own. Until very recently, it's been using PLY: Python Lex-Yacc for the core parsing. In this post, I'll describe how I collaborated with an LLM coding agent (Codex) to help me rewrite pycparser to use a hand-written recursive-descent parser and remove the dependency on PLY. This has been an interesting experience; the post contains lots of information and is therefore quite long. If you're just interested in the final result, check out the latest code of pycparser - the main branch already has the new implementation. While pycparser has been working well overall, there were a number of nagging issues that persisted over years. I began working on pycparser in 2008, and back then using a YACC-based approach for parsing a whole language like C seemed like a no-brainer to me. Isn't this what everyone does when writing a serious parser? Besides, the K&R2 book famously carries the entire grammar of the C language in an appendix - so it seemed like a simple matter of translating that to PLY-yacc syntax. And indeed, it wasn't too hard, though there definitely were some complications in building the ASTs for declarations (C's gnarliest part). Shortly after completing pycparser, I got more and more interested in compilation and started learning about the different kinds of parsers more seriously. Over time, I grew convinced that recursive descent is the way to go - producing parsers that are easier to understand and maintain (and are often faster!). It all ties in to the benefits of dependencies in software projects as a function of effort. Using parser generators is a heavy conceptual dependency: it's really nice when you have to churn out many parsers for small languages.
But when you have to maintain a single, very complex parser as part of a large project - the benefits quickly dissipate and you're left with a substantial dependency that you constantly grapple with. And then there are the usual problems with dependencies: dependencies get abandoned, and they may also develop security issues. Sometimes, both of these become true. Many years ago, pycparser forked and started vendoring its own version of PLY. This was part of transitioning pycparser to a dual Python 2/3 code base when PLY was slower to adapt. I believe this was the right decision, since PLY "just worked" and I didn't have to deal with active (and very tedious in the Python ecosystem, where packaging tools are replaced faster than dirty socks) dependency management. A couple of weeks ago this issue was opened for pycparser. It turns out that some old PLY code triggers security checks used by some Linux distributions; while this code was fixed in a later commit of PLY, PLY itself was apparently abandoned and archived in late 2025. And guess what? That happened in the middle of a large rewrite of the package, so re-vendoring the pre-archiving commit seemed like a risky proposition. On the issue it was suggested that "hopefully the dependent packages move on to a non-abandoned parser or implement their own"; I originally laughed this idea off, but then it got me thinking... which is what this post is all about. The original K&R2 grammar for C had - famously - a single shift-reduce conflict, having to do with dangling else clauses belonging to the most recent if statement. And indeed, other than the famous lexer hack used to deal with C's type name / ID ambiguity, pycparser only had this single shift-reduce conflict. But things got more complicated. Over the years, features were added that weren't strictly in the standard but were supported by all the industrial compilers.
The more advanced C11 and C23 standards weren't beholden to the promises of conflict-free YACC parsing (since almost no industrial-strength compilers use YACC at this point), so all caution went out of the window. The latest (PLY-based) release of pycparser has many reduce-reduce conflicts [2] ; these are a severe maintenance hazard because it means the parsing rules essentially have to be tie-broken by order of appearance in the code. This is very brittle; pycparser has only managed to maintain its stability and quality through its comprehensive test suite. Over time, it became harder and harder to extend, because YACC parsing rules have all kinds of spooky-action-at-a-distance effects. The straw that broke the camel's back was this PR which again proposed to increase the number of reduce-reduce conflicts [3] . This - again - prompted me to think "what if I just dump YACC and switch to a hand-written recursive descent parser", and here we are. None of the challenges described above are new; I've been pondering them for many years now, and yet biting the bullet and rewriting the parser didn't feel like something I'd like to get into. By my private estimates it'd take at least a week of deep heads-down work to port the gritty 2000 lines of YACC grammar rules to a recursive descent parser [4] . Moreover, it wouldn't be a particularly fun project either - I didn't feel like I'd learn much new and my interests have shifted away from this project. In short, the Potential well was just too deep. I've definitely noticed the improvement in capabilities of LLM coding agents in the past few months, and many reputable people online rave about using them for increasingly larger projects. That said, would an LLM agent really be able to accomplish such a complex project on its own? This isn't just a toy, it's thousands of lines of dense parsing code. What gave me hope is the concept of conformance suites mentioned by Simon Willison . 
Agents seem to do well when there's a very clear and rigid goal function - such as a large, high-coverage conformance test suite. And pycparser has a very extensive one. Over 2500 lines of test code parsing various C snippets to ASTs with expected results, grown over a decade and a half of real issues and bugs reported by users. I figured the LLM can either succeed or fail and throw its hands up in despair, but it's quite unlikely to produce a wrong port that would still pass all the tests. So I set it to run. I fired up Codex in pycparser's repository, and wrote this prompt just to make sure it understands me and can run the tests: Codex figured it out (I gave it the exact command, after all!); my next prompt was the real thing [5]: Here Codex went to work and churned for over an hour. Having never observed an agent work for nearly this long, I kind of assumed it had gone off the rails and would fail sooner or later. So I was rather surprised and skeptical when it eventually came back with: It took me a while to poke around the code and run it until I was convinced - it had actually done it! It wrote a new recursive descent parser with only ancillary dependencies on PLY, and that parser passed the test suite. After a few more prompts, we'd removed the ancillary dependencies and made the structure clearer. I hadn't looked too deeply into code quality at this point, but at least on the functional level - it succeeded. This was very impressive! A change like the one described above is impossible to code-review as one PR in any meaningful way; so I used a different strategy. Before embarking on this path, I created a new branch, and once Codex finished the initial rewrite, I committed this change, knowing that I would review it in detail, piece-by-piece, later on. Even though coding agents have their own notion of history and can "revert" certain changes, I felt much safer relying on Git.
In the worst case, if all of this goes south, I can nuke the branch and it's as if nothing ever happened. I was determined to only merge this branch into main once I was fully satisfied with the code. In what follows, I had to git reset several times when I didn't like the direction in which Codex was going. In hindsight, doing this work in a branch was absolutely the right choice. Once I'd sufficiently convinced myself that the new parser was actually working, I used Codex to similarly rewrite the lexer and get rid of the PLY dependency entirely, deleting it from the repository. Then, I started looking more deeply into code quality - reading the code created by Codex and trying to wrap my head around it. And - oh my - this was quite the journey. Much has been written about the code produced by agents, and much of it seems to be true. Maybe it's a setting I'm missing (I'm not using my own custom AGENTS.md yet, for instance), but Codex seems to be that eager programmer that wants to get from A to B whatever the cost. Readability, minimalism and code clarity are very much secondary goals. Using raise...except for control flow? Yep. Abusing Python's weak typing (like having None, False and other values all mean different things for a given variable)? For sure. Spreading the logic of a complex function all over the place instead of putting all the key parts in a single switch statement? You bet. Moreover, the agent is hilariously lazy. More than once I had to convince it to do something it initially said was impossible, and it even insisted again in follow-up messages. The anthropomorphization here is mildly concerning, to be honest. I could never imagine I would be writing something like the following to a computer, and yet - here we are: "Remember how we moved X to Y before? You can do it again for Z, definitely. Just try". My process was to see how I could instruct Codex to fix things, and intervene myself (by rewriting code) as little as possible.
I've mostly succeeded in this, and did maybe 20% of the work myself. My branch grew dozens of commits, falling into roughly these categories: Interestingly, after doing (3), the agent was often more effective in giving the code a "fresh look" and succeeding in either (1) or (2). Eventually, after many hours spent in this process, I was reasonably pleased with the code. It's far from perfect, of course, but taking the essential complexities into account, it's something I could see myself maintaining (with or without the help of an agent). I'm sure I'll find more ways to improve it in the future, but I have a reasonable degree of confidence that this will be doable. It passes all the tests, so I've been able to release a new version (3.00) without major issues so far. The only issue I've discovered is that some of CFFI's tests are overly precise about the phrasing of errors reported by pycparser; this was an easy fix . The new parser is also faster, by about 30% based on my benchmarks! This is typical of recursive descent when compared with YACC-generated parsers, in my experience. After reviewing the initial rewrite of the lexer, I've spent a while instructing Codex on how to make it faster, and it worked reasonably well. While working on this, it became quite obvious that static typing would make the process easier. LLM coding agents really benefit from closed loops with strict guardrails (e.g. a test suite to pass), and type-annotations act as such. For example, had pycparser already been type annotated, Codex would probably not have overloaded values to multiple types (like None vs. False vs. others). In a followup, I asked Codex to type-annotate pycparser (running checks using ty ), and this was also a back-and-forth because the process exposed some issues that needed to be refactored. Time will tell, but hopefully it will make further changes in the project simpler for the agent. 
Based on this experience, I'd bet that coding agents will be somewhat more effective in strongly typed languages like Go, TypeScript and especially Rust. Overall, this project has been a really good experience, and I'm impressed with what modern LLM coding agents can do! While there's no reason to expect that progress in this domain will stop, even if it does - these are already very useful tools that can significantly improve programmer productivity. Could I have done this myself, without an agent's help? Sure. But it would have taken me much longer, assuming that I could even muster the will and concentration to engage in this project. I estimate it would take me at least a week of full-time work (so 30-40 hours) spread over who knows how long to accomplish. With Codex, I put in an order of magnitude less work into this (around 4-5 hours, I'd estimate) and I'm happy with the result. It was also fun . At least in one sense, my professional life can be described as the pursuit of focus, deep work and flow . It's not easy for me to get into this state, but when I do I'm highly productive and find it very enjoyable. Agents really help me here. When I know I need to write some code and it's hard to get started, asking an agent to write a prototype is a great catalyst for my motivation. Hence the meme at the beginning of the post. One can't avoid a nagging question - does the quality of the code produced by agents even matter? Clearly, the agents themselves can understand it (if not today's agent, then at least next year's). Why worry about future maintainability if the agent can maintain it? In other words, does it make sense to just go full vibe-coding? This is a fair question, and one I don't have an answer to. Right now, for projects I maintain and stand behind , it seems obvious to me that the code should be fully understandable and accepted by me, and the agent is just a tool helping me get to that state more efficiently. 
It's hard to say what the future holds here; it's going to interesting, for sure. There was also the lexer to consider, but this seemed like a much simpler job. My impression is that in the early days of computing, lex gained prominence because of strong regexp support which wasn't very common yet. These days, with excellent regexp libraries existing for pretty much every language, the added value of lex over a custom regexp-based lexer isn't very high. That said, it wouldn't make much sense to embark on a journey to rewrite just the lexer; the dependency on PLY would still remain, and besides, PLY's lexer and parser are designed to work well together. So it wouldn't help me much without tackling the parser beast. The code in X is too complex; why can't we do Y instead? The use of X is needlessly convoluted; change Y to Z, and T to V in all instances. The code in X is unclear; please add a detailed comment - with examples - to explain what it does.


Compiling Scheme to WebAssembly

One of my oldest open-source projects - Bob - celebrated 15 years a couple of months ago. Bob is a suite of implementations of the Scheme programming language in Python, including an interpreter, a compiler and a VM. Back then I was doing some hacking on CPython internals and was very curious about how CPython-like bytecode VMs work; Bob was an experiment to find out, by implementing one from scratch for R5RS Scheme. Several months later I added a C++ VM to Bob, as an exercise to learn how such VMs are implemented in a low-level language without all the runtime support Python provides; most importantly, without the built-in GC. The C++ VM in Bob implements its own mark-and-sweep GC. After many quiet years (with just a sprinkling of cosmetic changes, porting to GitHub, updates to Python 3, etc.), I felt the itch to work on Bob again just before the holidays. Specifically, I decided to add another compiler to the suite - this one from Scheme directly to WebAssembly. The goals of this effort were two-fold: to experiment with lowering a real, high-level language like Scheme to WebAssembly, and to get some hands-on experience with the WASM GC extension [1]. Well, it's done now; here's an updated schematic of the Bob project: the new part is the rightmost vertical path. A WasmCompiler class lowers parsed Scheme expressions all the way down to WebAssembly text, which can then be compiled to a binary and executed using standard WASM tools [2]. The most interesting aspect of this project was working with WASM GC to represent Scheme objects. As long as we properly box/wrap all values in refs, the underlying WASM execution environment will take care of the memory management. For Bob, here's how some key Scheme objects are represented: $PAIR is of particular interest, as it may contain arbitrary objects in its fields; (ref null eq) means "a nullable reference to something that has identity". ref.test can be used to check - for a given reference - the run-time type of the value it refers to. You may wonder - what about numeric values?
Here WASM has a trick - the i31 type can be used to represent a reference to an integer, but without actually boxing it (one bit is used to distinguish such an object from a real reference). So we don't need a separate type to hold references to numbers. Also, the $SYMBOL type looks unusual - how is it represented with two numbers? The key to the mystery is that WASM has no built-in support for strings; they have to be implemented manually using offsets into linear memory. The Bob WASM compiler emits the string values of all symbols encountered into linear memory, keeping track of the offset and length of each one; these are the two numbers placed in $SYMBOL. This also allows us to fairly easily implement the symbol interning feature of Scheme; multiple instances of the same symbol will only be allocated once. Consider this trivial Scheme snippet: The compiler emits the symbols "foo" and "bar" into linear memory as follows [3]: Looking for one of these addresses in the rest of the emitted code, we'll find it as part of the code for constructing the constant cons list representing the argument to write; address 2051 and length 3: this is the symbol bar. Speaking of write, implementing this builtin was quite interesting. For compatibility with the other Bob implementations in my repository, write needs to be able to print recursive representations of arbitrary Scheme values, including lists, symbols, etc. Initially I was reluctant to implement all of this functionality by hand in WASM text, but all the alternatives ran into challenges. So I bit the bullet and - with some AI help for the tedious parts - just wrote an implementation of write directly in WASM text; it wasn't really that bad. I import only two functions from the host: Though emitting integers directly from WASM isn't hard, I figured this project already has enough code and some host help here would be welcome. For all the rest, only the lowest-level write_char is used.
For example, here's how booleans are emitted in the canonical Scheme notation ( #t and #f ): This was a really fun project, and I learned quite a bit about realistic code emission to WASM. Feel free to check out the source code of WasmCompiler - it's very well documented. While it's a bit over 1000 LOC in total [4] , more than half of that is actually WASM text snippets that implement the builtin types and functions needed by a basic Scheme implementation. In Bob this is currently done with bytecodealliance/wasm-tools for the text-to-binary conversion and Node.js for the execution environment, but this can change in the future. I actually wanted to use Python bindings to wasmtime, but these don't appear to support WASM GC yet. Experiment with lowering a real, high-level language like Scheme to WebAssembly. Experiments like the recent Let's Build a Compiler compile toy languages that are at the C level (no runtime). Scheme has built-in data structures, lexical closures, garbage collection, etc. It's much more challenging. Get some hands-on experience with the WASM GC extension [1] . I have several samples of using WASM GC in the wasm-wat-samples repository , but I really wanted to try it for something "real". Deferring this to the host is difficult because the host environment has no access to WASM GC references - they are completely opaque. Implementing it in another language (maybe C?) and lowering to WASM is also challenging for a similar reason - the other language is unlikely to have a good representation of WASM GC objects.


Summary of reading: October - December 2025

"The Origins of Political Order: From Prehuman Times to the French Revolution" by Francis Fukuyama - while reading this book it occurred to me that domains of study like political sciense must be incredibly difficult and frustrating. Imagine trying to match a model onto a set of data; the model has thousands of parameters, but you only have dozens or a couple of hundred of data points. This is what political sciense is like; there's a huge number of parameters and variables, far more than actual historical examples. And moreover, the historical examples are vague and often based on very partial memory and sketchy records. So books like this most often just devolve to history. As a history book, this one isn't bad, but I found it hard to draw wide conclusions from the themes it presents. "Exploding the Phone: The Untold Story of the Teenagers and Outlaws Who Hacked Ma Bell" by Phil Lapsley - a detailed history of phone phreaking. While I wish it focused more on the technical details than on the legal escapades of well-known phreaks, it's still a good book that provides decent coverage of an important era in the history of computing. "The Zone" by Sergei Dovlatov - (read in Russian) a satirical novella about the life of a guard in a Soviet prison camp in the 1960s. I liked this book less than "The Compromise". "The Joy of SET" by McMahon and Gordon x3 - explores the various mathematical dimensions of the SET card game. It's surprising how much interesting math there is around the game! Combinatorics and probability sure, but also modular arithmetic, vectors, linear algebra and affine geometry. This is a fun book for fans of the game (and of math); it's well written and even contains exercises. Don't expect it to teach you to become better at playing SET though - that's really not its goal. "Doom Guy: Life in First Person" by John Romero - Romero's auto-biography, also read by himself in the Audible version. 
Very good book, gives another angle on id Software and the seminal games they developed. "Masters of Doom" is one of my favorite books, and this one complements it very nicely. "Buffett: The Making of an American Capitalist" by Roger Lowenstein - a detailed biography of Warren Buffett. Great book, very informative and interesting; the only issue is that it was written in 1995, and doesn't cover the last 30 years. It would be interesting to read an up-to-date biography at some point. "The Great Democracies: A History of the English Speaking Peoples, Volume IV" by Winston Churchill - the final volume, covering the years 1815 - 1901. There's still focus on England, but also coverage of the American civil war, Australia, and some of Britain's colonial interests in Africa. "Sunburst and Luminary, an Apollo Memoir" by Don Eyles - the author worked on coding the landing programs for the lunar module of several Apollo missions as a young engineer at MIT. The book must be based on fairly detailed journals, because it contains an astonishing amount of detail (given that it was written 50 years after the events described). Pretty interesting insight into that era, all in all, though I didn't care much for the author mixing his love life into it. It's his book, of course, and he can write whatever he wants in it, but IMO it just dilutes the other great material and makes it generally less suitable for younger audiences. "Stoner" by John Williams - I have mixed feelings about this book, and they will probably take (at least) another read to resolve. On one hand, the writing is clearly masterful and "mood-evoking" in a way that only few authors have managed for me. Character development is beautiful, and there are glimpses of the flow of learning described amazingly well w.r.t. Stoner's own work. On the other hand, the characters are also too extreme - almost caricatures - and not very well connected to each other.
There are huge amounts of page real-estate allocated to certain topics that are barely mentioned later on; this happens again and again. Edith, in particular, is a very troubling character, and since Stoner is clearly presented as someone who is not a pushover when he wants to be, his behavior is puzzling to me. "The Magic Mountain" by Thomas Mann - a young German college student arrives at a sanatorium in the Swiss Alps to visit his cousin who suffers from TB, and stays for years, chronicling the odd personas flowing through the establishment. There's always some risk with trying famous books from over 100 years ago, and in this case the risk materialized - I found this one to be tedious, rambling and outdated. It's not all bad; there are certainly good parts, funny parts and some timeless lessons about human nature. But on balance, I didn't enjoy this book, and the only reason I managed to actually finish it cover to cover is the audiobook format (which let me zone out at times while doing something else). "Breaking Through: My Life in Science" by Katalin Karikó - an autobiography by the molecular biologist who contributed significantly to therapeutic uses of mRNA, including its use for the COVID-19 vaccine. Highly recommended. "Thinking Fast and Slow" by Daniel Kahneman - still a great book, though I did not enjoy the re-read as much as I'd thought I would. "The Man Who Changed Everything" by Basil Mahon. "Of Mice and Men" by John Steinbeck.


Plugins case study: mdBook preprocessors

mdBook is a tool for easily creating books out of Markdown files. It's very popular in the Rust ecosystem, where it's used (among other things) to publish the official Rust book. mdBook has a simple yet effective plugin mechanism that can be used to modify the book output in arbitrary ways, using any programming language or tool. This post describes the mechanism and how it aligns with the fundamental concepts of plugin infrastructures. mdBook's architecture is pretty simple: your contents go into a directory tree of Markdown files. mdBook then renders these into a book, with one file per chapter. The book's output is HTML by default, but mdBook supports other outputs like PDF. The preprocessor mechanism lets us register an arbitrary program that runs on the book's source after it's loaded from Markdown files; this program can modify the book's contents in any way it wishes before it all gets sent to the renderer for generating output. The official documentation explains this process very well. I rewrote my classical "narcissist" plugin for mdBook; the code is available here. In fact, there are two renditions of the same plugin there. Let's see how this case study of mdBook preprocessors measures against the Fundamental plugin concepts that were covered several times on this blog. Discovery in mdBook is very explicit. Every plugin we want mdBook to use has to be listed in the project's book.toml configuration file. For example, in the code sample for this post, the Python narcissist plugin is noted in book.toml as follows: Each preprocessor is a command for mdBook to execute in a sub-process. Here it uses Python, but it can be anything else that can be validly executed. For the purpose of registration, mdBook actually invokes the plugin command twice. The first time, it passes the arguments supports <renderer> where <renderer> is the name of the renderer (e.g. html).
If the command returns 0, it means the preprocessor supports this renderer; otherwise, it doesn't. In the second invocation, mdBook passes some metadata plus the entire book in JSON format to the preprocessor through stdin, and expects the preprocessor to return the modified book as JSON on stdout (using the same schema). In terms of hooks, mdBook takes a very coarse-grained approach. The preprocessor gets the entire book in a single JSON object (along with a context object that contains metadata), and is expected to emit the entire modified book in a single JSON object. It's up to the preprocessor to figure out which parts of the book to read and which parts to modify. Given that books and other documentation typically have limited sizes, this is a reasonable design choice. Even tens of MiB of JSON-encoded data are very quick to pass between sub-processes via stdout and marshal/unmarshal. But we wouldn't be able to implement Wikipedia using this design. Exposing an application API is tricky, given that the preprocessor mechanism is language-agnostic. However, mdBook offers some additional utilities to preprocessors implemented in Rust. These get access to mdBook's API to unmarshal the JSON representing the context metadata and the book's contents. mdBook offers the Preprocessor trait that Rust preprocessors can implement, which makes it easier to wrangle the book's contents. See my Rust version of the narcissist preprocessor for a basic example of this. Actually, mdBook has another plugin mechanism, but it's very similar conceptually to preprocessors. A renderer (also called a backend in some of mdBook's own doc pages) takes the same input as a preprocessor, but is free to do whatever it wants with it. The default renderer emits the HTML for the book; other renderers can do other things. The idea is that the book can go through multiple preprocessors, but at the end through a single renderer. The data a renderer receives is exactly the same as a preprocessor's - the JSON-encoded book contents.
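Putting the discovery and hook mechanics together, here's a minimal preprocessor sketch in Python. The `supports` handshake and the `[context, book]` stdin payload follow the protocol described above; the uppercasing behavior is a made-up example (not the narcissist plugin), and for brevity it only touches top-level chapters:

```python
# Minimal mdBook preprocessor sketch: uppercases every top-level
# chapter title. mdBook sends a JSON array [context, book] on stdin
# and expects the modified book back as JSON on stdout.
import json
import sys

def process_book(book):
    """Modify the book dict in place and return it."""
    for item in book.get("sections", []):
        # Section items are either {"Chapter": {...}} dicts or the
        # string "Separator"; only chapters are interesting here.
        chapter = item.get("Chapter") if isinstance(item, dict) else None
        if chapter is not None:
            chapter["name"] = chapter["name"].upper()
    return book

def main():
    # First invocation: `plugin supports <renderer>`; exit code 0
    # signals that this renderer is supported.
    if len(sys.argv) > 2 and sys.argv[1] == "supports":
        sys.exit(0)
    context, book = json.load(sys.stdin)
    json.dump(process_book(book), sys.stdout)

# Quick check without mdBook in the loop:
sample = {"sections": [{"Chapter": {"name": "Intro", "content": ""}}]}
print(process_book(sample)["sections"][0]["Chapter"]["name"])  # INTRO
```

A real plugin would end with `if __name__ == "__main__": main()` and be registered in book.toml with something like `command = "python3 shout.py"` (the script name here is hypothetical).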
Due to this similarity, there's no real point getting deeper into renderers in this post. One in Python, to demonstrate how mdBook can invoke preprocessors written in any programming language. One in Rust, to demonstrate how mdBook exposes an application API to plugins written in Rust (since mdBook is itself written in Rust).


Revisiting "Let's Build a Compiler"

There's an old compiler-building tutorial that has become part of the field's lore: the Let's Build a Compiler series by Jack Crenshaw (published between 1988 and 1995). I ran into it in 2003 and was very impressed, but it's now 2025 and this tutorial is still being mentioned quite often in Hacker News threads. Why is that? Why does a tutorial from 35 years ago, built in Pascal and emitting Motorola 68000 assembly - technologies that are virtually unknown to the new generation of programmers - hold sway over compiler enthusiasts? I decided to find out. The tutorial is easily available and readable online, but just re-reading it seemed insufficient. So I decided to meticulously translate the compilers built in it to Python, emitting a more modern target - WebAssembly. It was an enjoyable process and I want to share the outcome and some insights gained along the way. The result is this code repository. Of particular interest is the TUTORIAL.md file, which describes how each part of the original tutorial is mapped to my code. So if you want to read the original tutorial but play with code you can actually easily try on your own, feel free to follow my path. To get a taste of the input language being compiled and the output my compiler generates, here's a sample program in the KISS language designed by Jack Crenshaw: It's from part 13 of the tutorial, so it showcases procedures along with control constructs like the while loop, and passing parameters both by value and by reference. Here's the WASM text generated by my compiler for part 13: You'll notice that there is some trickiness in the emitted code w.r.t. handling the by-reference parameter (my previous post deals with this issue in more detail). In general, though, the emitted code is inefficient - there is close to zero optimization applied. Also, if you're very diligent you'll notice something odd about the global variable X - it seems to be implicitly returned by the generated main function.
This is just a testing facility that makes my compiler easy to test. All the compilers are extensively tested - usually by running the generated WASM code [1] and verifying expected results. While reading the original tutorial again, I had an opportunity to reminisce about what makes it so effective. Other than the very fluent and conversational writing style of Jack Crenshaw, I think it's a combination of two key factors. To be honest, I don't think either of these is a big problem with modern resources, but back in the day the tutorial clearly hit the right nerve with many people. Jack Crenshaw's tutorial takes the syntax-directed translation approach, where code is emitted while parsing, without having to divide the compiler into explicit phases with IRs. As I said above, this is a fantastic approach for getting started, but in the latter parts of the tutorial it starts showing its limitations. Especially once we get to types, it becomes painfully obvious that it would be very nice if we knew the types of expressions before we generate code for them. I don't know if this is implicated in Jack Crenshaw's abandoning the tutorial at some point after part 14, but it may very well be. He keeps writing that the emitted code is clearly sub-optimal [3] and can be improved, but IMHO it's just not that easy to improve using the syntax-directed translation strategy. With perfect hindsight, I would probably use part 14 (types) as a turning point - emitting some kind of AST from the parser and then doing simple type checking and analysis on that AST prior to generating code from it. All in all, the original tutorial remains a wonderfully readable introduction to building compilers. This post and the GitHub repository it describes are a modest contribution that aims to improve the experience of folks reading the original tutorial today who aren't willing to use obsolete technologies. As always, let me know if you run into any issues or have questions!
Concretely: when we compile subexpr1 + subexpr2 and the two sides have different types, it would be mighty nice to know that before we actually generate the code for both sub-expressions. But the syntax-directed translation approach just doesn't work that way. To be clear: it's easy to generate working code; it's just not easy to generate optimal code without some sort of type analysis that's done before code is actually generated. The tutorial builds a recursive-descent parser step by step, rather than giving a long preface on automata and table-based parser generators. When I first encountered it (in 2003), it was taken for granted that if you want to write a parser then lex + yacc are the way to go [2] . Following the development of a simple and clean hand-written parser was a revelation that wholly changed my approach to the subject; subsequently, hand-written recursive-descent parsers have been my go-to approach for almost 20 years now . Rather than getting stuck in front-end minutiae, the tutorial goes straight to generating working assembly code, from very early on. This was also a breath of fresh air for engineers who grew up with more traditional courses where you spend 90% of the time on parsing, type checking and other semantic analysis and often run entirely out of steam by the time code generation is taught.
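To make the syntax-directed translation idea concrete, here's a toy sketch of my own (not code from the tutorial or from the repository): a recursive-descent parser for single-digit +/- expressions that emits WASM-style stack instructions the moment each construct is recognized, with no AST or IR in between:

```python
# Syntax-directed translation in miniature: parse and emit in one pass.
def compile_expr(text):
    pos = 0
    out = []          # emitted instructions, in order

    def term():
        nonlocal pos
        assert text[pos].isdigit(), "expected a digit"
        out.append(f"i32.const {text[pos]}")   # emit while parsing
        pos += 1

    term()
    while pos < len(text) and text[pos] in "+-":
        op = text[pos]
        pos += 1
        term()
        # The operator is emitted right after its second operand,
        # yielding correct stack-machine (postfix) order.
        out.append("i32.add" if op == "+" else "i32.sub")
    return out

print(compile_expr("1+2-3"))
# ['i32.const 1', 'i32.const 2', 'i32.add', 'i32.const 3', 'i32.sub']
```

Note how the limitation discussed above shows up even in this tiny sketch: by the time the operator is seen, the code for the left operand has already been emitted, so there's no chance to adapt it based on the types of both sides.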


Notes on the WASM Basic C ABI

The WebAssembly/tool-conventions repository contains "Conventions supporting interoperability between tools working with WebAssembly". Of special interest, it contains the Basic C ABI - an ABI for representing C programs in WASM. This ABI is followed by compilers like Clang with the wasm32 target. Rust is also switching to this ABI for extern "C" code. This post contains some notes on this ABI, with annotated code samples and diagrams to help visualize what the emitted WASM code is doing. Hereafter, "the ABI" refers to this Basic C ABI. In these notes, annotated WASM snippets often contain descriptions of the state of the WASM value stack at a given point in time. Unless otherwise specified, "TOS" refers to "Top Of value Stack", and the notation [ x  y ] means the stack has y on top, with x right under it (and possibly some other stuff that's not relevant to the discussion under x); in this notation, the stack grows "to the right". The WASM value stack has no linear memory representation and cannot be addressed, so it's meaningless to discuss whether the stack grows towards lower or higher addresses. The value stack is simply an abstract stack, where values can be pushed onto or popped off its "top". Whenever addressing is required, the ABI specifies explicitly managing a separate stack in linear memory. This stack is very similar to how stacks are managed in hardware assembly languages (except that in the ABI the stack pointer is held in a global variable, not a special register), and it's called the "linear stack". By "scalar" I mean basic C types like int, double or char. For these, using the WASM value stack is sufficient, since WASM functions can accept an arbitrary number of scalar parameters. This C function: Will be compiled into something like: And can be called by pushing three values onto the stack and invoking call $add_three.
The ABI specifies that all integral types 32-bit and smaller will be passed as i32 , with the smaller types appropriately sign or zero extended. For example, consider this C function: It's compiled to the almost same code as add_three : Except the last i32.extend8_s , which takes the lowest 8 bits of the value on TOS and sign-extends them to the full i32 (effectively ignoring all the higher bits). Similarly, when $add_three_chars is called, each of its parameters goes through i32.extend8_s . There are additional oddities that we won't get deep into, like passing __int128 values via two i64 parameters. C pointers are just scalars, but it's still educational to review how they are handled in the ABI. Pointers to any type are passed in i32 values; the compiler knows they are pointers, though, and emits the appropriate instructions. For example: Is compiled to: Recall that in WASM, there's no difference between an i32 representing an address in linear memory and an i32 representing just a number. i32.store expects [ addr  value ] on TOS, and does *addr = value . Note that the x parameter isn't needed any longer after the sum is computed, so it's reused later on to hold the return value. WASM parameters are treated just like other locals (as in C). According to the ABI, while scalars and single-element structs or unions are passed to a callee via WASM function parameters (as shown above), for larger aggregates the compiler utilizes linear memory. Specifically, each function gets a "frame" in a region of linear memory allocated for the linear stack. This region grows downwards from high to low addresses [1] , and the global $__stack_pointer points at the bottom of the frame: Consider this code: When do_work is compiled to WASM, prior to calling pair_calculate it copies pp into a location in linear memory, and passes the address of this location to pair_calculate . This location is on the linear stack, which is maintained using the $__stack_pointer global. 
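The semantics of i32.extend8_s are easy to model in a few lines; this Python sketch (mine, not from the ABI document) uses a standard bit trick for two's-complement sign extension:

```python
# Model of WASM's i32.extend8_s: take the low 8 bits of a value and
# sign-extend them, ignoring all higher bits.
def i32_extend8_s(value):
    # XOR-ing with 0x80 and subtracting 0x80 maps 0..127 to itself
    # and 128..255 to -128..-1: two's-complement sign extension.
    return ((value & 0xFF) ^ 0x80) - 0x80

print(i32_extend8_s(0x7F))    # 127: positive char, unchanged
print(i32_extend8_s(0x80))    # -128: high bit set, so negative
print(i32_extend8_s(0x1FF))   # -1: bits above the low 8 are ignored
```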
Here's the compiled WASM for do_work (I also gave its local variable a meaningful name, for readability): Some notes about this code: Before pair_calculate is called, the linear stack looks like this: Following the ABI, the code emitted for pair_calculate takes Pair* (by reference, instead of by value as the original C code): Each function that needs linear stack space is responsible for adjusting the stack pointer and restoring it to its original place at the end. This naturally enables nested function calls; suppose we have some function a calling function b which, in turn, calls function c , and let's assume all of these need to allocate space on the linear stack. This is how the linear stack looks after c 's prologue: Since each function knows how much stack space it has allocated, it's able to properly restore $__stack_pointer to the bottom of its caller's frame before returning. What about returning values of aggregate types? According to the ABI, these are also handled indirectly; a pointer parameter is prepended to the parameter list of the function. The function writes its return value into this address. The following function: Is compiled to: Here's a function that calls it: And the corresponding WASM: Note that this function only uses 8 bytes of its stack frame, but allocates 16; this is because the ABI dictates 16-byte alignment for the stack pointer. There are some advanced topics mentioned in the ABI that these notes don't cover (at least for now), but I'll mention them here for completeness: This is similar to x86 . For the WASM C ABI, a good reason is provided for the direction: WASM load and store instructions have an unsigned constant called offset that can be used to add a positive offset to the address parameter without extra instructions. Since $__stack_pointer points to the lowest address in the frame, these offsets can be used to efficiently access any value on the stack. 
There are two instances of the pair pp in linear memory prior to the call to pair_calculate: the original one from the initialization statement (at offset 8), and a copy created for passing into pair_calculate (at offset 0). Theoretically, as pp is unused after the call, the compiler could do better here and keep only a single copy. The stack pointer is decremented by 16, and restored at the end of the function. The first few instructions - where the stack pointer is adjusted - are usually called the prologue of the function. In the same vein, the last few instructions, where the stack pointer is reset back to where it was at entry, are called the epilogue. "Red zone" - leaf functions have access to 128 bytes of red zone below the stack pointer. I found this difficult to observe in practice [2]. Since we don't issue system calls directly in WASM, it's tricky to conjure a realistic leaf function that requires the linear stack (instead of just using WASM locals). A separate frame pointer (global value) to be used for functions that require dynamic stack allocation (such as using C's VLAs). A separate base pointer to be used for functions that require alignment > 16 bytes on the stack.
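The prologue/epilogue discipline for nested calls (the a → b → c scenario above) can be modeled with a toy simulation. This is an illustration of the bookkeeping only, not emitted ABI code; the initial stack-pointer value of 65536 is an arbitrary assumption:

```python
# Toy model of linear-stack frame management: each function's prologue
# decrements the global stack pointer by its 16-byte-aligned frame
# size, and its epilogue restores it before returning.
STACK_TOP = 65536
stack_pointer = STACK_TOP

def with_frame(size, body):
    """Run `body` with `size` bytes of linear-stack frame allocated."""
    global stack_pointer
    aligned = (size + 15) & ~15     # the ABI mandates 16-byte alignment
    stack_pointer -= aligned        # prologue
    body(stack_pointer)             # the frame base is the new SP
    stack_pointer += aligned        # epilogue: restore caller's SP

def c(base):
    print("c's frame base:", base)  # the deepest (lowest) frame

def b(base):
    with_frame(16, c)

def a(base):
    with_frame(32, b)

with_frame(8, a)   # 8 bytes requested, rounded up to 16
print(stack_pointer == STACK_TOP)   # True: fully restored after the calls
```

Since each level adds its own aligned frame below its caller's, c runs with the lowest frame base; the symmetric epilogues unwind the pointer back to where it started.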


LaTeX, LLMs and Boring Technology

Depending on your particular use case, choosing boring technology is often a good idea. Recently, I've been thinking more and more about how the rise and increase in power of LLMs affects this choice. By definition, boring technology has been around for a long time. Piles of content have been written and produced about it: tutorials, books, videos, reference manuals, examples, blog posts and so on. All of this is consumed during the LLM training process, making LLMs better and better at reasoning about such technology. Conversely, "shiny technology" is new, and has much less material available. As a result, LLMs won't be as familiar with it. This applies to many domains, but one specific example for me personally is in the context of LaTeX. LaTeX certainly fits the "boring technology" bill. It's decades old, and has been the mainstay of academic writing since the 1980s. When I used it for the first time in 2002 (for a project report in my university AI class), it was already very old. But people keep working on it and fixing issues; it's easy to install and its wealth of capabilities and community size are staggering. Moreover, people keep working with it, producing more and more content and examples the LLMs can ingest and learn from. I keep hearing about the advantages of new and shiny systems like Typst. However, with the help of LLMs, almost none of the advantages seem meaningful to me. LLMs are great at LaTeX and help a lot with learning or remembering the syntax, finding the right packages, deciphering errors and even generating tedious parts like tables and charts, significantly reducing the need for scripting [1] . You can use LLMs either as standalone or fully integrated into your LaTeX environment; Overleaf has a built-in AI helper, and for local editing you can use VSCode plugins or other tools. I'm personally content with TeXstudio and use LLMs as standalone help, but YMMV. There are many examples where boring technology and LLMs go well together. 
The main criticism of boring technology is typically that it's "too big, full of cruft, difficult to understand". LLMs really help cut through the learning curve, though, and all that "cruft" is very likely to become useful some time in the future when you graduate from the basic use cases. To be clear: Typst looks really cool, and kudos to the team behind it! All I'm saying in this post is that for me - personally - the choice for now is to stick with LaTeX as a "boring technology". For finding the right math symbols, I rarely need to scan reference materials any longer. LLMs will easily answer questions like "what's that squiggly Greek letter used in math, and its latex symbol?" or "write the latex for Green's theorem, integral form". For the trickiest / largest equations, LLMs are very good at "here's a picture I took of my equation, give me its latex code" these days [2]. "Here's a piece of code and the LaTeX error I'm getting on it; what's wrong?" This is made more ergonomic by editor integrations, but I personally find that LaTeX's error message problem is hugely overblown. 95% of the errors are reasonably clear, and serious sleuthing is only rarely required in practice. In that minority of cases, pasting some code and the error into a standalone LLM isn't a serious time drain. Generating TikZ diagrams and plots. For this, the hardest part is getting started and finding the right element names, and so on. It's very useful to just ask an LLM to emit something initial and then tweak it manually later, as needed. You can also ask the LLM to explain each thing it emits in detail - this is a great learning tool for deeper understanding. Recently I had luck going "meta" with this: when the diagram has repetitive elements, I may ask the LLM to "write a Python program that generates a TikZ diagram ...", and it works well. Generating and populating tables, and converting them from other data formats or screenshots.
Help with formatting and typesetting (how do I change margins to XXX and spacing to YYY). When it comes to scripting, I generally prefer sticking to real programming languages anyway. If there's anything non-trivial to auto-generate I wouldn't use a LaTeX macro, but would write a Python program to generate whatever I need and embed it into the document with something like \input{}. Typst's scripting system may be marketed as "clean and powerful", but why learn yet another scripting language? Ignoring LaTeX's equation notation and doing their own thing is one of the biggest mistakes Typst makes, in my opinion. LaTeX's notation may not be perfect, but it's near universal at this point, with support in almost all math-aware tools. Typst's math mode is a clear sign of the second system effect, and isn't even stable.
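As a concrete illustration of the "go meta" trick mentioned above (asking for a Python program that generates TikZ), here's a hypothetical sketch that emits a grid of labeled nodes; the exact styling is made up:

```python
# Generate repetitive TikZ elements from Python instead of hand-writing
# them: a rows x cols grid of labeled circle nodes.
def tikz_grid(rows, cols, spacing=1.5):
    lines = [r"\begin{tikzpicture}"]
    for r in range(rows):
        for c in range(cols):
            x, y = c * spacing, -r * spacing   # lay rows out downwards
            lines.append(rf"  \node[draw, circle] at ({x},{y}) {{{r},{c}}};")
    lines.append(r"\end{tikzpicture}")
    return "\n".join(lines)

# The output can be pasted into a document or pulled in with \input{}:
print(tikz_grid(2, 3))
```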


Notes on using LaTeX to generate formulae

This post collects some notes on using LaTeX to render mathematical documents and formulae, mostly focused on a Linux machine. For background, I typically use LaTeX for one of two (related) purposes: I don't currently use LaTeX for either precise typesetting or for authoring very large, book-sized documents. For day-to-day authoring, I find TeXstudio to be excellent. It has everything I need for local editing with a convenient preview window. I really like that TeXstudio doesn't hide the fact that it's just a graphical veneer on top of command-line LaTeX tooling, and lets you examine what it's doing through logs. Note that web-based solutions like Overleaf exist; I can see myself using that, especially if collaborating with others or having to author LaTeX from a diverse set of computers and OSes, but for local editing of git-backed text files, TeXstudio is great. pandoc is very capable for converting documents from LaTeX to other formats. Recently I find that it's easier to write math-heavy blog posts in LaTeX, and then convert them to reStructuredText with pandoc . For example, the recent post on Hilbert spaces was written like this and then converted using this command: The resulting reStructuredText is very readable and requires very little tweaking before final publishing. pandoc supports many formats, so if you use Markdown or something else, it should work similarly well. A useful feature of LaTeX tooling is the ability to render a specific formula in standalone mode to an image. We can write the formula into its own file (call it standaloneformula.tex ): In case you were wondering, this is the Gaussian integral: Once we have that standalone .tex file, there's a number of things we can do. First, the texlive package should be installed [1] . Using apt : Now, we can run the tools from texlive , for example pdflatex : This creates a PDF file that's useful for previews. 
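The contents of a minimal standaloneformula.tex might look like the following sketch (assuming the standalone document class, which crops the output tightly around the content; the exact preamble used in the original may differ):

```latex
% standaloneformula.tex: render a single formula, cropped to size.
\documentclass{standalone}
\usepackage{amsmath}
\begin{document}
$\displaystyle \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}$
\end{document}
```

Running pdflatex on this file produces just the formula, without a full page around it, which is what makes the SVG/PNG conversion flows below work cleanly.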
To convert the .tex file to an image in SVG format, we'll use a two-step process: If we want a PNG file instead of SVG: The latexmk tool can build a .tex file into a PDF whenever the input file changes, so running: And opening the PDF in a separate window, we can observe live refreshes of edits without having to recompile explicitly. While useful in some scenarios, I find that TeXstudio already does this well. The same tooling flow works for TikZ diagrams! A standalone LaTeX document containing a single tikzpicture element can also be rendered to a SVG or PNG using the same exact commands. If you'd rather not install all these tools directly but use Docker instead, the texlive image can be used to do the same things: And now we can use the same invocations, just through docker: When a formula like \frac{n+1}{n^2-1} is embedded in text, it should be aligned properly to look good with the surrounding text. The information required to do this is emitted by tools like dvisvgm and dvipng ; for example: Note the height=..., depth=... line in the output. The height is the total height of the formula, and depth is its height below the "baseline" (how much down it should stick out from the line). In my blog, these two are translated to attributes on the image element embedding the SVG. Height is translated to style="height: ... and depth to vertical-align: ... . Render math for my blog posts, which are usually written using reStructuredText . This sometimes includes diagrams generated using TikZ. Write personal (unpublished) notes on math-y subjects entirely in LaTeX. These are typically short (up to 10-20 pages), single-subject booklets.


Summary of reading: July - September 2025

"The Compromise" by Sergei Dovlatov - (read in Russian) the author was a journalist in the Soviet Union in the 60s and 70s. This book is a humorous, semi-autobiographical account of some of the issues faced by Soviet journalists in their attempt to report news aligned with party lines. Very good writing, though the Russian in this book was a bit difficult for me at times. "Twilight of the Gods: War in the Western Pacific, 1944-1945" by Ian Toll - the third part of the trilogy. As an overall conclusion to the series, I will reiterate the earlier feedback: the writing is great, the book is very readable for its immense size, but I wish the author's focus was elsewhere. If you're looking for very detailed tactical accounts of key battles, this is the book for you. It doesn't have much about the more strategic aspects, and especially the U.S. industrial capacity that played such a key role in the war. How was production scaled up so much, especially with millions of people drafted? I'd definitely be interested in looking for additional sources of information on this subject. "Threaded Interpretive Languages" by R.G. Loeliger - describes some traditional approaches to implementing FORTH (which is the prime example of a threaded interpretive language, or TIL) in assembly. This book is from the late 1970s, so the target machine used is a Z80. Overall it's pretty good, with useful diagrams and quirky humor, but it certainly shows its age. "System Design Interview – An insider's guide" by Alex Xu - a book form of the author's guidelines for system design interviews. It's okay, far from great. The sections are all very repetitive and the sum total of unique insights and ideas in the book is low. Moreover, it's some sort of samizdat instant-Amazon printing of rather low quality, with no index, unfocused diagrams and barely any copyediting. "Why Nations Fail: The Origins of Power, Prosperity, and Poverty" by Daron Acemoglu and James A.
Robinson - describes the authors' theory of why some countries are rich and others are poor. The crux of the theory is extractive vs. inclusive political and economic institutions; in other words, a dictatorship vs. a pluralist government. Overall, the theory is interesting and insightful; the book is a bit scattered, though, with the authors jumping between examples haphazardly, making it difficult to focus. I like that the book doesn't shy away from making predictions for the future rather than just analyzing history. "A Biography of the Pixel" by Alvy Ray Smith - the history of computer graphics, told by one of the founders of Pixar. Some parts of this book are good, but I can't say I really enjoyed most of it. Lots of very detailed history and names, and project names, etc. "The Age of Revolution: A History of the English Speaking Peoples, Volume III" by Winston Churchill - covers the period from 1688 to 1815. Though this series is ostensibly about all the "English speaking peoples", the focus is clearly on England. There's some coverage of the USA, but it mostly focuses on the interactions with the British (the Revolution and the War of 1812), and there's also quite a bit on Napoleon and France. The series becomes somewhat more interesting as it approaches the more modern era. "The Nvidia Way: Jensen Huang and the making of a tech giant" by Tae Kim - a very interesting and well-written biography of Nvidia, from the early founding days to ~2024. "Babylon: Mesopotamia and the Birth of Civilization" by Paul Kriwaczek - an interesting historical account of Mesopotamia, from Eridu until the fall of Babylon. "Demon Copperhead" by Barbara Kingsolver - a novel about a boy coming of age as an orphan in foster care, houses of friends, etc. The backdrop is the opioid epidemic of the early 2000s in Appalachia, with broken families and lots of drugs.
The book is pretty good, but the Pulitzer Prize here is clearly for the unsettling coverage of an ongoing hot topic, not for any sort of literary flourish. "The Color of Our Sky" by Amita Trasi - the fictional story of two girls from different castes in India who find their lives intertwined in complex ways. Some thought-provoking and troubling accounts of traditions still prevalent in India in relation to discrimination, human trafficking, child abuse and modern slavery. "El murmullo de las abejas" by Sofía Segovia - (read in Spanish) a slightly mystical novel about the life of an aristocratic family in the north of Mexico in the early 20th century. Maybe it's just the Spanish, but I definitely got "100 años de soledad" vibes from this book: the mysticism, the multi-generational story going in circles, the ambience. "The Mysterious Island" by Jules Verne


Consistent hashing

As a concrete experiment, I ran a similar simulation to the one above, but with 10 virtual nodes per node. We'll consider the total portion of the circle mapped to a node through any of its virtual nodes. While the average remains 18 degrees, the variance is reduced drastically - the smallest one is 11 degrees and the largest 26. You can find the code for these experiments in the demo.go file of the source code repository. This is close to the original motivation for the development of consistent hashing by researchers at MIT. Their work, described in the paper "Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web", became the foundation of Akamai Technologies. There are other use cases of consistent hashing in distributed systems. AWS popularized one common application in their paper "Dynamo: Amazon's Highly Available Key-value Store", where the algorithm is used to distribute storage keys across servers.
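For readers who want to see the mechanics, here's a small Go sketch of a consistent-hash ring with virtual nodes. This is not the demo.go from the post's repository; the hash function, node-naming scheme, and replica count are my own assumptions:

```go
package main

import (
	"crypto/sha1"
	"encoding/binary"
	"fmt"
	"sort"
)

// Ring is a consistent-hash ring where each physical node is
// represented by several virtual nodes (points on the circle).
type Ring struct {
	replicas int
	keys     []uint32          // sorted hashes of all virtual nodes
	owner    map[uint32]string // virtual-node hash -> physical node
}

// hashOf maps a string to a point on the [0, 2^32) circle.
func hashOf(s string) uint32 {
	h := sha1.Sum([]byte(s))
	return binary.BigEndian.Uint32(h[:4])
}

func NewRing(replicas int) *Ring {
	return &Ring{replicas: replicas, owner: make(map[uint32]string)}
}

// AddNode places `replicas` virtual nodes for this node on the circle.
func (r *Ring) AddNode(name string) {
	for i := 0; i < r.replicas; i++ {
		h := hashOf(fmt.Sprintf("%s#%d", name, i))
		r.keys = append(r.keys, h)
		r.owner[h] = name
	}
	sort.Slice(r.keys, func(i, j int) bool { return r.keys[i] < r.keys[j] })
}

// Get returns the node owning key: the first virtual node at or
// after the key's hash, wrapping around the circle.
func (r *Ring) Get(key string) string {
	h := hashOf(key)
	i := sort.Search(len(r.keys), func(i int) bool { return r.keys[i] >= h })
	if i == len(r.keys) {
		i = 0 // wrap around
	}
	return r.owner[r.keys[i]]
}

func main() {
	ring := NewRing(10) // 10 virtual nodes per node, as in the experiment
	for _, n := range []string{"node-a", "node-b", "node-c"} {
		ring.AddNode(n)
	}
	fmt.Println(ring.Get("some-storage-key"))
}
```

Adding or removing a node only remaps the keys adjacent to that node's virtual points, which is the property that makes the scheme attractive for caches and key-value stores.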


Implementing Forth in Go and C

I first ran into Forth about 20 years ago when reading a book about designing embedded hardware. The reason I got the book back then was to actually learn more about the HW aspects, so having skimmed the Forth chapter I just registered an "oh, this is neat" mental note and moved on with my life. Over the last two decades I've heard about Forth a few more times here and there, such as that time when Factor was talked about for a brief period, maybe 10-12 years ago or so. It always occupied a slot in the "weird language" category inside my brain, and I never paid it much attention. Until June this year, when a couple of factors combined fortuitously: And something clicked. I'm going to implement a Forth, because... why not? So I spent much of my free hacking time over the past two months learning about Forth and implementing two of them. It's useful to think of Forth (at least standard Forth, not offshoots like Factor) as having two different "levels": Another way to look at it (useful if you belong to a certain crowd) is that user-level Forth is like Lisp without macros, and hacker-level Forth has macros enabled. Lisp can still be great and useful without macros, but macros take it to an entirely new level and also unlock the deeper soul of the language. This distinction will be important when discussing my Forth implementations below. There's a certain way Forth is supposed to be implemented; this is how it was originally designed, and if you get closer to the hacker level, it becomes apparent that you're pretty much required to implement it this way - otherwise supporting all of the language's standard words will be very difficult. I'm talking about the classical approach of a linked dictionary, where a word is represented as a "threaded" list [1], and this dictionary is available for user code to augment and modify. Thus, much of the Forth implementation can be written in Forth itself. The first implementation I tried is stubbornly different.
Can we just make a pure interpreter? This is what goforth is trying to explore (the Go implementation located in the root directory of that repository). Many built-in words are supported - definitely enough to write useful programs - and compilation (the definition of new Forth words using : word ... ; ) is implemented by storing the actual string following the word name in the dictionary, so it can be interpreted when the word is invoked. This was an interesting approach and in some sense, it "works". For the user level of Forth, this is perfectly usable (albeit slow). However, it's insufficient for the hacker level, because the host language interpreter (the one in Go) has all the control, so it's impossible to implement IF...THEN in Forth, for example (it has to be implemented in the host language). That was a fun way to get a deeper sense of what Forth is about, but I did want to implement the hacker level as well, so the second implementation - ctil - does just that. It's inspired by the jonesforth assembly implementation, but done in C instead [2]. ctil actually lets us implement major parts of Forth in Forth itself. For example, variable: Conditionals: These are actual examples of ctil's "prelude" - a Forth file loaded before any user code. If you understand Forth, this code is actually rather mind-blowing. We compile IF and the other words by directly laying out their low-level representation in memory, and different words communicate with each other using the data stack during compilation. Forth made perfect sense in the historical context in which it was created in the early 1970s. Imagine having some HW connected to your computer (a telescope in the case of Forth's creator), and you have to interact with it. In terms of languages at your disposal - you don't have much, even BASIC wasn't invented yet. Perhaps your machine still didn't have a C compiler ported to it; C compilers aren't simple, and C isn't great for exploratory scripting anyway.
So you mostly just have your assembly language and whatever you build on top. Forth is easy to implement in assembly and it gives you a much higher-level language; you can use it as a calculator, as a REPL, and as a DSL for pretty much anything due to its composable nature. Forth certainly has interesting aspects; it's a concatenative language, and thus inherently point-free. A classical example is that instead of writing the following in a more traditional syntax: You just write this: There is no need to explicitly pass parameters, or to explicitly return results. Everything happens implicitly on the stack. This is useful for REPL-style programming where you use your language not necessarily for writing large programs, but more for interactive instructions to various HW devices. This dearth of syntax is also what makes Forth simple to implement. All that said, in my mind Forth is firmly in the "weird language" category; it's instructive to learn and to implement, but I wouldn't actually use it for anything real these days. The stack-based programming model is cool for very terse point-free programs, but it's not particularly readable and is hard to reason about without extensive comments, in my experience. Consider the implementation of a pretty standard Forth word: +!. It expects an address at the top of the stack, and an addend below it. It adds the addend to the value stored at that address. Here's a Forth implementation from ctil's prelude: Look at that stack wrangling! It's really hard to follow what goes where without the detailed comments showing the stack layout on the right of each instruction (a common practice for Forth programs). Sure, we can create additional words that would make this simpler, but that just increases the lexicon of words to know. My point is, there's fundamental difficulty here.
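To make the +! discussion concrete, here's a small Go model of a Forth data stack and memory, tracing one plausible definition of +! (dup @ rot + swap !). This is an illustrative sketch, not code from ctil's prelude:

```go
package main

import "fmt"

// A tiny model of a Forth data stack plus memory, to trace the
// stack wrangling in a word like +! ( n addr -- ).
type machine struct {
	stack []int
	mem   map[int]int
}

func (m *machine) push(v int) { m.stack = append(m.stack, v) }

func (m *machine) pop() int {
	v := m.stack[len(m.stack)-1]
	m.stack = m.stack[:len(m.stack)-1]
	return v
}

// dup ( a -- a a )
func (m *machine) dup() { m.push(m.stack[len(m.stack)-1]) }

// swap ( a b -- b a )
func (m *machine) swap() {
	a := m.pop()
	b := m.pop()
	m.push(a)
	m.push(b)
}

// rot ( a b c -- b c a )
func (m *machine) rot() {
	c := m.pop()
	b := m.pop()
	a := m.pop()
	m.push(b)
	m.push(c)
	m.push(a)
}

// @ ( addr -- value )  and  ! ( value addr -- )
func (m *machine) fetch() { m.push(m.mem[m.pop()]) }
func (m *machine) store() { addr := m.pop(); m.mem[addr] = m.pop() }

// plusStore implements +! ( n addr -- ). The comments on the right
// show the stack after each step, bottom to top.
func (m *machine) plusStore() {
	m.dup()                   // n addr addr
	m.fetch()                 // n addr val
	m.rot()                   // addr val n
	m.push(m.pop() + m.pop()) // addr val+n   (Forth's +)
	m.swap()                  // val+n addr
	m.store()                 // mem[addr] now holds val+n
}

func main() {
	m := &machine{mem: map[int]int{100: 40}}
	m.push(2)   // addend
	m.push(100) // address
	m.plusStore()
	fmt.Println(m.mem[100]) // prints 42
}
```

Even in this host-language transcription, the per-step stack comments are what carry the meaning, which is exactly the readability point above.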
When you see this C code: Even without any documentation, you can immediately know several important things: Written in Forth [3]: How can you know the arity of the functions without adding explicit comments? Sure, if you have a handful of words like bar and foo you know like the back of your hand, this is easy. But imagine reading a large, unfamiliar code base full of code like this and trying to comprehend it. The source code of my goforth project is on GitHub; both implementations are there, with a comprehensive test harness that tests both. To learn Forth itself, I found these resources very useful: To learn how to implement Forth: Implementing Forth is a great self-improvement project for a coder; there's a pleasantly challenging hump of understanding to overcome, and you gain valuable insights into stack machines, interpretation vs. compilation, and mixing these levels of abstraction in cool ways. Also, implementing programming languages from scratch is fun! It's hard to beat the feeling of getting to interact with your implementation for the first time, and then iterating on improving it and making it more featureful. Just one more word! Which is another deviation from the norm. Forth is really supposed to be implemented in assembly - this is what it was designed for, and it's very clear from its structure that it must be so in order to achieve peak performance. But where's the fun in doing things the way they were supposed to be done? Besides, jonesforth is already a perfectly fine Forth implementation in assembly, so I wouldn't have learned much by just copying it. I had a lot of fun coding in C for this one; it's been a while since I last wrote non-trivial amounts of C, and I found it very enjoyable.
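As a postscript, the "pure interpreter" approach described earlier for goforth - storing a defined word's body as a raw string and re-interpreting it on every invocation - can be sketched in a few dozen lines of Go. This is not the actual goforth code; the word set and error handling here are deliberately minimal assumptions:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Forth is a naive string-interpreting Forth-like machine: a user
// definition ": name body ;" stores the body verbatim, and invoking
// the name re-interprets that stored string.
type Forth struct {
	stack []int
	dict  map[string]string // word name -> body as a raw string
}

func New() *Forth { return &Forth{dict: make(map[string]string)} }

func (f *Forth) push(v int) { f.stack = append(f.stack, v) }

func (f *Forth) pop() int {
	v := f.stack[len(f.stack)-1]
	f.stack = f.stack[:len(f.stack)-1]
	return v
}

// Eval interprets a line of space-separated words. Input is assumed
// well-formed: every ":" has a matching ";".
func (f *Forth) Eval(input string) {
	words := strings.Fields(input)
	for i := 0; i < len(words); i++ {
		w := words[i]
		switch {
		case w == ":": // "compile": store the body text until ";"
			name := words[i+1]
			j := i + 2
			for words[j] != ";" {
				j++
			}
			f.dict[name] = strings.Join(words[i+2:j], " ")
			i = j
		case w == "+":
			f.push(f.pop() + f.pop())
		case w == "*":
			f.push(f.pop() * f.pop())
		case w == "dup":
			f.push(f.stack[len(f.stack)-1])
		case w == ".":
			fmt.Println(f.pop())
		default:
			if body, ok := f.dict[w]; ok {
				f.Eval(body) // re-interpret the stored string
			} else {
				n, _ := strconv.Atoi(w) // otherwise assume a number
				f.push(n)
			}
		}
	}
}

func main() {
	f := New()
	f.Eval(": square dup * ;")
	f.Eval("7 square 1 + .") // prints 50
}
```

This captures why the approach works for the user level but not the hacker level: the Go switch statement, not user code, decides what ":" and every other structural word means.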
