Learning OCaml: PPX for Mere Mortals
When I started learning OCaml I kept running into code like this: My first reaction was “what the hell is ?” Coming from languages like Ruby and Clojure, where metaprogramming is either built into the runtime (reflection) or baked into the language itself (macros), OCaml’s approach felt alien. There’s no runtime reflection, no macro system in the Lisp sense – just this mysterious syntax that somehow generates code at compile time. That mystery is PPX (PreProcessor eXtensions), and once you understand it, a huge chunk of the OCaml ecosystem suddenly makes a lot more sense. This article is my attempt to demystify PPX for people like me – developers who want to use PPX effectively without necessarily becoming PPX authors themselves. OCaml is a statically typed language with no runtime reflection. That means you can’t do things like “iterate over all fields of a record at runtime” or “automatically serialize any type to JSON.” The type information simply isn’t available at runtime – it’s erased during compilation. One of my biggest frustrations as a newcomer was not being able to just print arbitrary data for debugging – there’s no generic or that works on any type. That frustration was probably my first real interaction with PPX. PPX solves this by generating code at compile time . When the OCaml compiler parses your source code, it builds an Abstract Syntax Tree (AST) – a tree data structure that represents the syntactic structure of your program. PPX rewriters are programs that receive this AST, transform it, and return a modified AST back to the compiler. The compiler then continues as if you had written the generated code by hand. In practical terms, this means that when you write: The PPX rewriter generates something like this behind the scenes: You get a pretty-printer for free, derived from the type definition. No boilerplate, no manual work, and it stays in sync with your type automatically. If you’ve used Rust’s or Haskell’s , the idea is very similar. The syntax is different, but the motivation is identical – generating repetitive code from type definitions. If you’re coming from Rust, you might wonder why OCaml doesn’t just have a built-in macro system like . It’s a fair question, and the answer says a lot about OCaml’s design philosophy. OCaml has always favored a small, stable language core . The compiler is famously lean and fast, and the language team is conservative about adding complexity to the specification. A full macro system baked into the compiler would be a significant undertaking – it would need to be designed, specified, maintained, and kept compatible across versions, forever. Instead, OCaml took a more minimal approach: the compiler provides just two things – extension points and attributes – as syntactic hooks in the AST. Everything else lives in the ecosystem. The actual PPX rewriters are ordinary OCaml programs that happen to transform ASTs. The ppxlib framework that ties it all together is a regular library, not part of the compiler. This has some real advantages: The trade-offs are real, though. Rust’s proc macros are more tightly integrated – you get better error messages pointing at macro-generated code, better IDE support for macro expansions, and the macro system is a documented, stable part of the language. With PPX, you’re sometimes left staring at cryptic type errors in generated code and reaching for to figure out what went wrong. That said, OCaml’s approach feels very OCaml – pragmatic, minimal, and trusting the ecosystem to build what’s needed on top of a simple foundation. And in practice, it works remarkably well. PPX wasn’t OCaml’s first metaprogramming system. Before PPX, there was Camlp4 (and its fork Camlp5 ) – a powerful but complex preprocessor that maintained its own parser, separate from the compiler’s parser. Camlp4 could extend OCaml’s syntax in arbitrary ways, which sounds great in theory but was a maintenance nightmare in practice. Every OCaml release risked breaking Camlp4, and code using Camlp4 extensions often couldn’t be processed by standard tools like editors and documentation generators. OCaml 4.02 (2014) introduced extension points and attributes directly into the language grammar – syntactic hooks specifically designed for preprocessor extensions. This was a much simpler and more maintainable approach: PPX rewriters use the compiler’s own AST, the syntax is valid OCaml (so tools can still parse your code), and the whole thing is conceptually just “AST in, AST out.” Camlp4 was officially retired in 2019. Today, the PPX ecosystem is built on ppxlib , a unified framework that provides a stable API across OCaml versions and handles all the plumbing for PPX authors. Before diving into specific libraries, let’s decode the bracket soup. PPX uses two syntactic mechanisms built into OCaml: Extension nodes are placeholders that a PPX rewriter must replace with generated code (compilation fails if no PPX handles them): Attributes attach metadata to existing code. Unlike extension nodes, the compiler silently ignores attributes that no PPX handles: The one you’ll see most often is on type declarations. The distinction between , , and is about scope – one for the innermost node, two for the enclosing declaration, three for the whole module-level. Tip: Don’t worry about memorizing all of this upfront. In practice, you’ll mostly use and occasionally or – and the specific PPX library’s documentation will tell you exactly which syntax to use. To use a PPX library in your project, you add it to the stanza in your file: That’s it. List all the PPX rewriters you need after , and Dune takes care of the rest (it even combines them into a single binary for performance). For plugins specifically, you use dotted names like . Let’s look at the PPX libraries that cover probably 90% of real-world use cases. ppx_deriving is the community’s general-purpose deriving framework. It comes with several built-in plugins: is the one you’ll reach for first – it’s essentially the answer to “how do I just print this thing?” that every OCaml newcomer asks sooner or later. The most commonly used plugins: A neat convention: if your type is named (as is idiomatic in OCaml), the generated functions drop the type name suffix – you get , , , instead of , , etc. You can also customize behavior per field with attributes: And you can derive for anonymous types inline: ppx_deriving_yojson generates JSON serialization and deserialization functions using the Yojson library: You can use or if you only need one direction. This is incredibly useful in practice – writing JSON serializers by hand for complex types is tedious and error-prone. If you’re using Jane Street’s Core library, you’ll encounter S-expression serialization everywhere. ( Tip: Jane Street bundles most of their PPXs into a single ppx_jane package, so you can add just to your instead of listing each one individually.) ppx_sexp_conv generates converters between OCaml types and S-expressions: The attributes here are quite handy – provides a default value during deserialization, and means the field is represented as a present/absent atom rather than . Two more Jane Street PPXs that you’ll see a lot in Core-based codebases. ppx_fields_conv generates first-class accessors and iterators for record fields: ppx_variants_conv does something similar for variant types – generating constructors as functions, fold/iter over all variants, and more. These Jane Street PPXs let you write tests directly in your source files: ppx_expect is particularly nice – it captures printed output and compares it against expected output: If the output doesn’t match, the test fails and you can run to automatically update the expected output in your source file. It’s a very productive workflow for testing functions that produce output. ppx_let provides syntactic sugar for working with monads and other “container” types: How does know which to call? It looks for a module in scope that provides the underlying and functions. In practice, you’ll typically open a module that defines before using : Note: Since OCaml 4.08, the language has built-in binding operators ( , , , ) that cover the basic use cases of without needing a preprocessor. If you’re not using Jane Street’s ecosystem, binding operators are probably the simpler choice. still offers extra features like , , and optimized though. ppx_blob is beautifully simple – it embeds a file’s contents as a string at compile time: No more worrying about file paths at runtime or packaging data files with your binary. The file contents become part of your compiled program. One thing that’s always bugged me about OCaml is the lack of string interpolation. ppx_string fills that gap: The suffix tells the PPX to convert the value using . You can use any module that provides a function. Most OCaml developers will never need to write a PPX, but understanding the basics helps demystify the whole system. Let’s build a very simple one. Say we want an extension that converts a string literal to uppercase at compile time. Here’s the complete implementation using ppxlib : The dune file: The key pieces are: For more complex PPXs (especially derivers), you’ll also want to use Metaquot ( ), which lets you write AST-constructing code using actual OCaml syntax instead of manual AST builder calls: The ppxlib documentation has excellent tutorials if you want to go deeper. One practical tip: when something goes wrong with PPX-generated code and you’re staring at a confusing type error, you can inspect what the PPX actually generated: Seeing the expanded code often makes the error immediately obvious. Most of the introductory PPX content out there was written around 2018-2019, so it’s worth noting how things have evolved since then. The big story has been ppxlib’s consolidation of the ecosystem . Back in 2019, some PPX rewriters still used the older (OMP) library, creating fragmentation. By 2021, nearly all PPXs had migrated to ppxlib , effectively ending the split. Today ppxlib is the way to write PPX rewriters – there’s no real alternative to consider. The transition hasn’t always been smooth, though. In 2025, ppxlib 0.36.0 bumped its internal AST to match OCaml 5.2, which changed how functions are represented in the parse tree. This broke many downstream PPXs and temporarily split the opam universe between packages that worked with the new version and those that didn’t. The community worked through it with proactive patching, but it highlighted an ongoing tension in the PPX world: ppxlib shields you from most compiler changes, but major AST overhauls still ripple through the ecosystem. On the API side, ppxlib is gradually deprecating its copy of in favor of , with plans to remove entirely in a future 1.0.0 release. If you’re writing a new PPX today, use exclusively. Meanwhile, OCaml 4.08’s built-in binding operators ( , , etc.) have reduced the need for in projects that don’t use Jane Street’s ecosystem. It’s a nice example of the language absorbing a pattern that PPX pioneered. Perhaps one day we’ll see more of this (e.g. native string interpolation). This article covers a lot of ground, but the PPX topic is pretty deep and complex, so depending on how far you want to go you might want to read more on it. Here are some of the best resources I’ve found on PPX: I was amused to see whitequark’s name pop up while I was doing research for this article – we collaborated quite a bit back in the day on her Ruby parser project, which was instrumental to RuboCop . Seems you can find (former) Rubyists in pretty much every language community. This article turned out to be a beast! I’ve wanted to write something on the subject for quite a while now, but I’ve kept postponing it because I was too lazy to do all the necessary research. I’ll feel quite relieved to put it behind me! PPX might look intimidating at first – all those brackets and symbols can feel like line noise. But the core idea is simple: PPX generates boilerplate code from your type definitions at compile time. You annotate your types with what you want ( , , , , etc.), and the PPX rewriter produces the code you’d otherwise have to write by hand. For day-to-day OCaml programming, you really only need to know: The “writing your own PPX” part is there for when you need it, but honestly most OCaml developers get by just fine using the existing ecosystem. That’s all I have for you today. Keep hacking! The ecosystem can evolve independently. ppxlib can ship new features, fix bugs, and improve APIs without waiting for a compiler release. Compare this to Rust, where changes to the proc macro system require the full RFC process and a compiler update. Tooling stays simple. Because and are valid OCaml syntax, every tool – editors, formatters, documentation generators – can parse PPX-annotated code without knowing anything about the specific PPX. The code is always syntactically valid OCaml, even before preprocessing. The compiler stays lean. No macro expander, no hygiene system, no special compilation phases – just a hook that says “here, transform this AST before I type-check it.” – registers an extension with a name, the context where it can appear (expressions, patterns, types, etc.), the expected payload pattern, and an expansion function. – a pattern-matching DSL for destructuring AST nodes. Here matches a string literal and captures its value. – helpers for constructing AST nodes. builds a string literal expression. – registers the rule with ppxlib’s driver. Preprocessors and PPXs – the official OCaml documentation on metaprogramming. A solid reference, though it assumes some comfort with the compiler internals. An Introduction to OCaml PPX Ecosystem – Nathan Rebours’ 2019 deep dive for Tarides. This is the most thorough tutorial on writing PPX rewriters I’ve seen. Some API details have changed since 2019 (notably the → shift), but the concepts and approach are still excellent. ppxlib Quick Introduction – ppxlib’s own getting-started guide. The best place to begin if you want to write your own PPX. A Guide to PreProcessor eXtensions – OCamlverse’s reference page with a comprehensive list of available PPX libraries. A Guide to Extension Points in OCaml – Whitequark’s original 2014 guide that introduced many developers to PPX. Historically interesting as a snapshot of the early PPX days. on type declarations to generate useful functions How to add PPX libraries to your dune file with Which PPX libraries exist for common tasks (serialization, testing, pretty-printing)