Posts in OCaml (11 found)
(think) 1 month ago

Why I Chose Ruby over Python

This year I spent a bit of time playing with Python, after having mostly ignored it since 2005 when was learning it originally. I did like Python back then, but a few years afterwards I discovered Ruby and quickly focused my entire attention on it. There were many (mostly small) reasons why I leaned towards Ruby back then and playing with Python now made me remember a few of them. I thought it might be interesting to write a bit about those, so here we go. Disclaimer: This is not a rant and I know that: So, treat this as an amusing personal account and nothing more than that. Probably the thing about Python that bothered me the most what stuff that would normally be methods are global functions (that often call some object methods internally). I’m referring to the likes of: You can find the full list here . I’m guessing the reason for this (as usual) is historical, but I much prefer the way Ruby does things. E.g. instead of or vs . is a pretty weird beast as it can: Why would someone want a keyword for removing items from lists and dictionaries instead of some method is beyond me. Ruby’s arrays have methods like: In Ruby almost everything is an expression (meaning that evaluating it would result in a value). In Python a lot of things are consider “statements” - something executed for their side effects only. If you haven’t used languages like Ruby or Lisp this might sound a bit strange, but if we go back to the previous section about , we can observe that: That’s something that I really value and I consider it one of the bigger practical advantages of Ruby over Python. At first I thought the semantic indentation used by Python is super cool, as it reduces a bit the typing one needs to do: In hindsight, however, I quickly realized that it also: One more thing - 4 spaces by default seems a tad too much to me, although that’s obviously debatable. P.S. I feel obliged to admit I’m not a big fan of either and would have preferred instead, but it is how it is. I’m not a fan of Python’s type as for me it’s a pretty weird duck: I get how things ended up the way they are, but for me it’s not OK to be able to write code like . I’m also not a fan of treating empty collection literals as , although I definitely have less issues with this than with 0 and 1. To compare this with Ruby: This definitely resonates better with me. Side note: Lately I’ve been playing a lot with languages like OCaml, F# and Rust, that’s why to me it now feels extra strange to have a boolean type that works like an integer type. I really like the range literals in Ruby: Python has the function that kind of gets the job done, but for whatever reason it doesn’t have the option to mark something as inclusive range. Definitely not a big deal, but one of the many small touches of Ruby’s syntax that I came to appreciate over time. In Python one has to pass to instance methods explicitly, which always seemed to me like an excessive level of verbosity. Also, that’s quite uncommon in other object-oriented programming languages. Many special methods have names surrounded with , which I find both odd and not very easy to type. I get why this was chosen (to avoid naming conflicts), but I don’t like it regardless. I really like that in Ruby the return value of a method is the value of the last expression that got evaluated in the method. There’s a expression in Ruby, but it’s rarely needed in practice. In Python, by comparison, you always have to use , otherwise your method will return . 
Not a big deal in general, but as I spend a lot of time with Ruby and various functional programming languages, it’s definitely something that bothers me. Admittedly that’s a very small one, but I would have preferred if anonymous functions were created with a keyword like or instead of . In Ruby historically they were created with as well, but afterwards the shorthand was introduced as well. There’s nothing wrong with the Python syntax per se, but I think that in general for lambdas it’s better to have a more compact syntax. Ruby predicates typically have names ending in - e.g. , , . This makes them really easy to spot while reading some code. Python sticks to the more common convention of prefixing such methods with , , etc and that’s fine. One thing that bothers me a bit is that often there’s not spaces between the prefix and the rest of the name (e.g. ), which doesn’t read great in my opinion. More importantly, in Ruby and Python it’s common to have destructive and non-destructive versions of some methods. E.g. - vs in Ruby, and vs in Python. I don’t know about you, but to me it seems that: I’m guessing the reasons here are also historical. Multi-line text literals are common in many languages, but I’m not super fond of: Who thought that typing those would be easy? It’s not that HEREDOCS in Ruby are great either, but I guess they are at least more common in various programming languages. Ruby has and . That’s it. Everyone uses them. Life is simple. Things are a lot more complicated in the realm of Python where several different tools have been in fashion over the years: Now it seems that might replace them all. Until something replaces I guess… And that’s a wrap. I’m guessing at this point most Rubyists reading this would probably agree with my perspective (shocking, right?) and most Pythonistas won’t. And that’s fine. I’m not trying to convince anyone that Ruby’s a better language than Python, I’m just sharing the story of how I ended up in team Ruby almost 20 years ago. Back in the day I felt that Ruby’s syntax was more elegant and more consistent than Python’s, and today my sentiment is more or less the same. Don’t get me wrong, though - I like Python overall and enjoy using it occasionally. It just doesn’t make me as happy as Ruby does. I’ve long written about my frustrations with Ruby, so it feels pretty good to write for once about the aspects of Ruby that I really enjoy. Keep hacking! P.S. After writing this I realized I had already written a similar article 14 years ago, but I had totally forgotten about it! Oh, well… the things I prefer in Ruby over Python are super subjective for every thing that Ruby does “better” there’s something else that Python does better delete variables dictionary items (to remove some values) (to remove some index) There’s no obvious way to determine if actually removed something In Ruby, however, most deletions result in informative results makes things harder for tooling, as it can’t really rearrange indented code sections in editors you can’t have much in terms of auto-indenting as you type (as the editor can’t know when a block finishes without you outdenting explicitly) was added only in Python 2.3 it inherits from and are essentially 1 and 0 has nothing to do with numbers there the only things are considered logically false are and (inclusive range) (exclusive range) Ruby encourages you to favor the non-destructive versions of the methods, unlike Python Ruby’s more consistent than Python
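Since the side note above mentions OCaml and F#, here is a small OCaml sketch (mine, not from the original post) of the two properties being contrasted with Python: conditionals are expressions that yield values, and bool is a type of its own rather than a thinly disguised integer.

```ocaml
(* In OCaml, as in Ruby, 'if' is an expression: it evaluates to a value
   that can be bound or returned directly. *)
let sign_label n =
  if n >= 0 then "non-negative" else "negative"

let () =
  print_endline (sign_label 42);    (* prints "non-negative" *)
  print_endline (sign_label (-1));  (* prints "negative" *)
  (* bool is a distinct type: an expression like 'true + 1' is rejected by
     the type checker, whereas Python happily evaluates True + 1 to 2. *)
  assert (sign_label 0 = "non-negative")
```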

(think) 2 months ago

Learning OCaml: Having Fun with the Fun Module

When I started to play with OCaml I was kind of surprised that there was no (identity) function available out of the box (in module, that’s auto-opened). A quick search led me to the Fun module, which is part of the standard library and is nested under . It was introduced in OCaml 4.08, alongside other modules such as , and . 1

The module provides a few basic combinators for working with functions. Let’s go over them briefly:

- The identity function: returns its single argument unchanged.
- Returns a function that always returns the first argument, ignoring its second argument.
- Composes two functions, applying the second function to the result of the first. Haskell and F# have special syntax for function composition, but that’s not the case in OCaml (although you can easily map this to some operator if you wish to do so). Also, it was introduced a bit later than the other functions in the module - namely in OCaml 5.2.
- Reverses the order of arguments to a two-argument function.
- Negates a boolean-returning function, returning the opposite boolean value. Useful when you want to provide a pair of inverse predicates (e.g. and ).

I believe that those functions are pretty self-explanatory, but we’ll still go over a few examples of using them below. Admittedly the examples are not great, but I hope they managed to convey how to use the various combinators. Those are definitely not the type of functions that you would use every day, but they can be useful in certain situations. Obviously I needed at some point to discover the module in the first place, and all of the functions there can be considered “classic” combinators in functional programming.

In practice most often I need and , and infrequently and . Right now I’m struggling to come up with good use cases for , but I’m sure those exist. Perhaps you’ll share some examples in the comments? How often do you use the various combinators? Which ones do you find most useful?

I find myself wondering if such fundamental functions shouldn’t have been part of module directly, but overall I really like the modular standard library approach that OCaml’s team has been working towards in the past several years. 2 The important thing at the end of the day is to know that these functions exist and you can make use of them. Writing this short article will definitely help me to remember this.

That’s all I have for you today. Keep hacking!

1. It was part of some broader efforts to slim down and move in the direction of a more modular standard library. ↩
2. And obviously you can open the module if you wish to at whatever level you desire. ↩
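The post’s original snippets did not survive extraction, so here is a minimal hand-written sketch of a few of the combinators in action; it sticks to Fun.id, Fun.const, Fun.flip and Fun.negate, which have been in the standard library since the module appeared in 4.08 (the later composition function is left out).

```ocaml
(* Quick experiments with the Fun module's combinators. *)
let () =
  (* Fun.id returns its argument unchanged. *)
  assert (Fun.id 42 = 42);

  (* Fun.const builds a function that ignores its second argument. *)
  assert (List.map (Fun.const 0) [1; 2; 3] = [0; 0; 0]);

  (* Fun.flip reverses the argument order of a two-argument function. *)
  let append_to = Fun.flip (^) in
  assert (append_to "world" "hello, " = "hello, world");

  (* Fun.negate turns a predicate into its inverse. *)
  let is_even n = n mod 2 = 0 in
  let is_odd = Fun.negate is_even in
  assert (List.filter is_odd [1; 2; 3; 4] = [1; 3])
```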

(think) 2 months ago

Learning OCaml: Numerical Type Conversions

Today I’m going to cover a very basic topic - conversions between OCaml’s primary numeric types and . I guess most of you are wondering if such a basic topic deserves special treatment, but if you read on I promise that it will be worth it.

So, let’s start with the basics that probably everyone knows:

- you can convert integers to floats with
- you can convert floats to integers with

Both functions live in module, which is opened by default in OCaml. Here it gets a bit more interesting. For whatever reasons there’s also a function that’s a synonym to . There’s no function, however. Go figure why…

Here’s a bit of trivia for you - does truncation to produce an integer. And there’s also a function that’s another alias for . Again, for whatever reasons it seems there are no functions that allow you to produce an integer by rounding up or down (although such functions exist for floats - e.g. , and ).

More interestingly, OCaml 4.08 introduced the modules and that bring together common functions for operating on integers and floats. 1 And there are plenty of type conversion functions in those modules as well. The introduction of the and modules was part of an (ongoing) effort to make OCaml’s library more modular and more useful. I think that’s great and I hope you’ll agree that most of the time it’s a better idea to use the new modules instead of reaching for the “historical” functions in the module. Sadly, most OCaml tutorials out there make no mention of the new modules, so I’m hoping that my article (and tools like ChatGPT) will steer more people in the right direction. If you’re familiar with Jane Street’s , you’ll probably notice that it also employs a similar structure when it comes to integer and float functionality.

Which type conversion functions do you prefer? Why?

That’s all I have for you today. Keep hacking!

1. Technically speaking, existed before 4.08, but it was extended then. 4.08 also introduced the modules , , and . Good stuff! ↩
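To make the above concrete, here is a small hand-written example (not the post’s original snippet) contrasting the historical conversion functions with their module-based counterparts; note that both int_of_float and Float.to_int truncate.

```ocaml
let () =
  (* The "historical" functions from the default environment. *)
  assert (float_of_int 3 = 3.0);
  assert (int_of_float 3.99 = 3);   (* truncation, not rounding *)
  assert (truncate 3.99 = 3);       (* an alias for the same behaviour *)

  (* The module-based equivalents. *)
  assert (Float.of_int 3 = 3.0);
  assert (Float.to_int 3.99 = 3);   (* also truncates *)

  (* Rounding up or down has to go through float operations first. *)
  assert (int_of_float (Float.round 3.99) = 4);
  assert (int_of_float (Float.floor 3.99) = 3);
  assert (int_of_float (Float.ceil 3.01) = 4)
```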

Max Bernstein 4 months ago

What I talk about when I talk about IRs

I have a lot of thoughts about the design of compiler intermediate representations (IRs). In this post I’m going to try and communicate some of those ideas and why I think they are important. The overarching idea is being able to make decisions with only local information. That comes in a couple of different flavors. We’ll assume that we’re compiling a method at a time, instead of a something more trace-like (tracing, tracelets, basic block versioning, etc). A function will normally have some control-flow: , , , any amount of jumping around within a function. Let’s look at an example function in a language with advanced control-flow constructs: Most compilers will deconstruct this , with its many nested expressions, into simple comparisons and jumps. In order to resolve jump targets in your compiler, you may have some notion of labels (in this case, words ending with a colon): This looks kind of like a pseudo-assembly language. It has its high-level language features decomposed into many smaller instructions. It also has implicit fallthrough between labeled sections (for example, into ). I mention these things because they, like the rest of the ideas in this post, are points in an IR design space. Representing code this way is an explicit choice, not a given. For example, one could make the jumps explicit by adding a at the end of . As soon as we add that instruction, the code becomes position-independent: as long as we start with , the chunks of code between labels could be ordered any which way: they are addressed by name and have no implicit ordering requirements. This may seem arbitrary, but it gives the optimizer more flexibility. If some optimization rule decides, for example, that a branch to may rarely be taken, it can freely re-order it toward the end of the function (or even on a different page!) so that more hot code can be on a single cache line. Explicit jumps and labels turn the code from a strictly linear assembly into a control-flow graph (CFG). Each sequence of code without internal control-flow is a called basic block and is a vertex in this graph. The directed edges represent jumps between blocks. See for example this crude GraphViz representation: We’re actually kind of looking at extended basic blocks (EBBs), which allow for multiple control exits per block but only one control entry. A strict basic block representation of the above code would look, in text form, something like this: Notice how each block has exactly one terminator (control-flow instruction), with (in this case) 0 or 2 targets. Opinions differ about the merits and issues of extended vs normal basic blocks. Most compilers I see use normal basic blocks. In either case, bringing the IR into a graph form gives us an advantage: thanks to Cousot and Cousot, our favorite power couple, we know how to do abstract interpretation on graphs and we can use this to build an advanced optimizer. See, for example, my intro to abstract interpretation post . Some IRs are stack based. For concatenative languages or some newer JIT compilers, IRs are formatted in such a way that each opcode reads its operands from a stack and writes its outputs to a stack. This is reminiscent of a point-free coding style in languages such as Haskell or OCaml. In this style, there is an implicit shared state: the stack. Dataflow is explicit (pushes and pops) and instructions can only be rearranged if the stack structure is preserved. This requires some non-local reasoning: to move an instruction, one must also rearrange the stack. 
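To make the stack-based flavour concrete, here is a toy sketch in OCaml; the opcodes and the evaluator are invented for illustration rather than taken from any particular IR.

```ocaml
(* A toy stack-based IR: every opcode pops its operands from an implicit
   stack and pushes its result back onto it. *)
type op =
  | Push of int  (* push a constant *)
  | Add          (* pop two values, push their sum *)
  | Mul          (* pop two values, push their product *)

let eval (ops : op list) : int list =
  let step stack op =
    match op, stack with
    | Push n, _ -> n :: stack
    | Add, a :: b :: rest -> (a + b) :: rest
    | Mul, a :: b :: rest -> (a * b) :: rest
    | _ -> failwith "stack underflow"
  in
  List.fold_left step [] ops

(* a * b + c with a = 2, b = 3, c = 4: reordering any of these opcodes
   means reshuffling the implicit stack as well. *)
let () = assert (eval [Push 2; Push 3; Mul; Push 4; Add] = [10])
```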
By contrast, in a register-based IR, things are more explicit. Instructions take named inputs ( , , etc) and produce named outputs. Instructions can be slightly more easily moved around (modulo effects) as long as inputs remain defined. Local variables do not exist. The stack does not exist. Everything is IR “variables”. The constraints (names being defined) are part of the IR . This gets a little bit tricky if it’s possible to define a name multiple times. What does mean in the instruction for ? Which definition does it refer to? In order to reason about the instruction , we have to keep around some context. This is non-trivial: requiring compiler writers to constantly truck around side-tables and update them as they do analysis is slow and error-prone. Fortunately, if we enforce some interesting rules, we can push that analysis work into one pass up-front… Static single assignment (SSA) was introduced by a bunch of folks at IBM (see my blog post about the different implementations). In SSA-based IRs, each variable can only be defined once. Put another way, a variable is its defining instruction; alternately, a variable and its defining instruction are addressed by the same name. The previous example is not valid SSA; has two definitions. If we turn the previous example into SSA, we can now use a different name for each instruction. This is related to the unique name assumption or the global names property: names do not depend on context. Now we can identify each different instruction by the variable it defines. This is useful in analysis… I’d be remiss if I did not mention continuation-passing style (CPS) based IRs (and in fact, I had forgotten in the original draft of the post). As an IR, CPS is normally used in the analysis and optimization of functional programs, for example in the OCaml and Racket compilers. It is not required, however; MLton, for example, uses SSA in its compiler for Standard ML. SSA and CPS can more or less represent the same programs, but they can each feel a natural fit for different languages (and different compiler authors). I don’t feel qualified to say much more here. For a more informed opinion, check out Andy Wingo’s approaching cps soup , especially the benefits and drawbacks near the end. Speaking of CPS, I took a class with Olin Shivers and he described abstract interpretation as “automated theorem finding”. Unlike theorem provers such as Lean and Rocq, where you have to manually prove the properties you want, static analysis finds interesting facts that already exist in your program (and optimizers use them to make your program faster). Your static analysis pass(es) can annotate your IR nodes with little bits of information such as: If your static analysis is over SSA, then generally the static analysis is easier and (potentially) storing facts is easier. This is due to this property called sparseness . Where a static analysis over non-SSA programs has to store facts about all variables at all program points , an analysis over SSA need only store facts about all variables, independent of context. I sometimes describe this as “pushing time through the IR” but I don’t know that that makes a ton of sense. Potentially more subtle here is that we could represent the above IR snippet as a list of tuples, where instructions are related via some other table (say, a “variable environment”): Instead, though, we could allocate an object for each instruction and let them refer to one another by pointer (or index, if using Rust or something). 
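A minimal OCaml sketch of that direct-reference representation (the instruction set and the constant-folding helper are invented for illustration, not taken from any compiler discussed here):

```ocaml
(* In SSA, a value *is* its defining instruction, so an operand can simply
   be a reference to another instruction - no variable environment needed. *)
type instr =
  | Const of int
  | Add of instr * instr        (* operands point at their defining instrs *)
  | LessThan of instr * instr

(* An analysis can then ask questions of an operand directly, e.g. a tiny
   constant-folding query: *)
let rec const_value (i : instr) : int option =
  match i with
  | Const n -> Some n
  | Add (a, b) ->
    (match const_value a, const_value b with
     | Some x, Some y -> Some (x + y)
     | _ -> None)
  | LessThan _ -> None

let () =
  let v0 = Const 2 in
  let v1 = Const 3 in
  let v2 = Add (v0, v1) in      (* v2 refers to v0 and v1 themselves *)
  assert (const_value v2 = Some 5)
```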
Then they directly refer to one another (no need for a side-table), which might be faster and more compact. We can re-create nice names as needed for printing. Then, when optimizing, we look up the type information of an operand by directly reading a field ( or similar). Another thing to note: when you start adding type information to your IR, you’re going to start asking type information questions in your analysis. Questions such as “what type is this instruction?”, where “type” could span a semilattice, and even refer to a specific run-time object by its pointer. In that case, it’s important to ask the right questions . For example: instructions are likely not the only opcodes that could produce specific objects; if you have an instruction like , for example, that burns a specific expected pointer into the generated code, the type (and therefore the pointer) will come from the instruction. The big idea is that types represent a different slice of your IR than the opcodes and should be treated as such. Anyway, SSA only stores type information about instructions and does not encode information that we might later learn in the IR. With basic SSA, there’s not a good way to encode refinements… Static single information (SSI) form gives us new ways to encode metadata about instructions (variables). It was introduced by C. Scott Ananian in 1999 in his MS thesis (PDF). (I also discussed it briefly in the Scrapscript IR post .) Consider the following SSA program (represented as pseudo-Python): is undefined at . is defined and an integer at . But then we do something interesting: we split control flow based on the run-time value of . We can take this split to add new and interesting information to . For non-sparse analysis, we can record some fact on the side. That’s fine. When doing a dataflow analysis, we can keep track of the fact that at , is nonnegative, and at , is negative. This is neat: we can then determine that all paths to this function return a positive integer. Importantly, does not override the existing known type of . Instead, it is a refinement: a set intersection. A lattice meet. The middle bit of a Venn diagram containing two overlapping circles, and . If we want to keep our information sparse, though, we have to add a new definition to the IR. This is complicated (choose which variables to split, replace all uses, to maintain SSA, etc) but gives us new places to store information inside the IR . It means that every time we refer to , we know that it is nonnegative and every time we refer to , we know that it is negative. This information is independent of context! I should note that you can get a lot of the benefits of SSI without going “full SSI”. There is no need to split every variable at every branch, nor add a special new merge instruction. Okay, so we can encode a lot of information very sparsely in the IR. That’s neat. It’s powerful. But we should also be mindful that even in this very sparse representation, we are encoding information implicitly that we may not want to: execution order. In a traditional CFG representation, the instructions are already scheduled , or ordered. Normally this comes from the programmer in the original source form and is faithfully maintained. We get data use edges in an IR like SSA, but the control information is left implicit. Some forms of IR, however, seek to reify both data and control dependencies into the IR itself. One such IR design is sea of nodes (SoN), which was originally designed by Cliff Click during his PhD. 
In sea of nodes, every instruction gets its own vertex in the graph. Instructions have use edges to their operands, which can be either data or some other ordering property (control, effects, etc). The main idea is that IR nodes are by default unordered and are only ordered later, after effect analysis has removed a bunch of use edges. Per Vognsen also notes that there is another motivating example of sea of nodes: in the previous SSI example, the cannot be validly hoisted above the check. In a “normal” IR, this is implicit in the ordering. In a sea of nodes world, this is explicitly marked with an edge from the to the . I think Graal, for example, calls these nodes “Pi nodes”. I think I need to re-read the original paper, read a modern implementation (I get the feeling it’s not done exactly the same way anymore), and then go write more about it later. For now, see Simple , by Cliff Click and friends. It is an implementation in Java and a little book to go with it. Design neighbors include value dependence graphs (VDG), value state dependence graphs (VSDG), region value state dependence graphs (RVSDG), and program dependence graphs (PDG). Speaking of Cliff Click, I once heard/read something he said that sounded really interesting. Roughly, it was “elaborate the full semantics of the operation into the IR and let the optimizer sort it out”. That is, “open code” or “inline” your semantics. For example, don’t emit code for a generic add operation that you later specialize: Instead, emit code that replicates the written semantics of the operation, whatever that is for your local language. This can include optimistic fast paths: This has the advantage that you may end up with fewer specialized rewrite rules because constant propagation and branch folding take care of these specializations “for free”. You can even attach probabilities to more or less likely branches to offer outlining hints in case all of this is never specialized. Sure, the downside of this is that the generated IR might be bigger, so your optimizer might be slower—or worse, that your resulting generated code at the end might be bigger. But outlining, deduplication (functionalization?), and probably some other clever methods can help here. Similarly, Maxime Chevalier-Boisvert and Marc Feeley write about this (PDF) in the context of basic block versioning. If the runtime’s generic add functions is written in IR, then callers to that function can specialize “through it” by calling it in different basic block contexts. That more or less gets you call-site specialization “for free”. See Figure 4 from their paper (lightly edited by me), where I think dollar-prefixed variable names indicate special functions known to the compiler: This is nice if you are starting a runtime from scratch or have resources to devote to re-writing chunks of the runtime in your IR. Then, even in a method JIT, you can get your inlined language semantics by function (partial) inlining. There’s probably more in this vein to be explored right now and probably more to invent in the future, too. Some other potentially interesting concepts to think about include: Thank you to Chris Fallin , Hunter Goldstein, and Per Vognsen for valuable feedback on drafts of this post.

baby steps 5 months ago

Rust turns 10

Today is the 10th anniversary of Rust’s 1.0 release . Pretty wild. As part of RustWeek there was a fantastic celebration and I had the honor of giving some remarks, both as a long-time project member but also as representing Amazon as a sponsor. I decided to post those remarks here on the blog. “It’s really quite amazing to see how far Rust has come. If I can take a moment to put on my sponsor hat, I’ve been at Amazon since 2021 now and I have to say, it’s been really cool to see the impact that Rust is having there up close and personal. “At this point, if you use an AWS service, you are almost certainly using something built in Rust. And how many of you watch videos on PrimeVideo? You’re watching videos on a Rust client, compiled to WebAssembly, and shipped to your device. “And of course it’s not just Amazon, it seems like all the time I’m finding out about this or that surprising place that Rust is being used. Just yesterday I really enjoyed hearing about how Rust was being used to build out the software for tabulating votes in the Netherlands elections . Love it. “On Tuesday, Matthias Endler and I did this live podcast recording. He asked me a question that has been rattling in my brain ever since, which was, ‘What was it like to work with Graydon?’ “For those who don’t know, Graydon Hoare is of course Rust’s legendary founder. He was also the creator of Monotone , which, along with systems like Git and Mercurial, was one of the crop of distributed source control systems that flowered in the early 2000s. So defintely someone who has had an impact over the years. “Anyway, I was thinking that, of all the things Graydon did, by far the most impactful one is that he articulated the right visions. And really, that’s the most important thing you can ask of a leader, that they set the right north star. For Rust, of course, I mean first and foremost the goal of creating ‘a systems programming language that won’t eat your laundry’. “The specifics of Rust have changed a LOT over the years, but the GOAL has stayed exactly the same. We wanted to replicate that productive, awesome feeling you get when using a language like Ocaml – but be able to build things like web browsers and kernels. ‘Yes, we can have nice things’, is how I often think of it. I like that saying also because I think it captures something else about Rust, which is trying to defy the ‘common wisdom’ about what the tradeoffs have to be. “But there’s another North Star that I’m grateful to Graydon for. From the beginning, he recognized the importance of building the right culture around the language, one committed to ‘providing a friendly, safe and welcoming environment for all, regardless of level of experience, gender identity and expression, disability, nationality, or other similar characteristic’, one where being ‘kind and courteous’ was prioritized, and one that recognized ’there is seldom a right answer’ – that ‘people have differences of opinion’ and that ’every design or implementation choice carries a trade-off’. “Some of you will probably have recognized that all of these phrases are taken straight from Rust’s Code of Conduct which, to my knowledge, was written by Graydon. I’ve always liked it because it covers not only treating people in a respectful way – something which really ought to be table stakes for any group, in my opinion – but also things more specific to a software project, like the recognition of design trade-offs. “Anyway, so thanks Graydon, for giving Rust a solid set of north stars to live up to. 
Not to mention for the keyword. Raise your glass! “For myself, a big part of what drew me to Rust was the chance to work in a truly open-source fashion. I had done a bit of open source contribution – I wrote an extension to the ASM bytecode library, I worked some on PyPy, a really cool Python compiler – and I loved that feeling of collaboration. “I think at this point I’ve come to see both the pros and cons of open source – and I can say for certain that Rust would never be the language it is if it had been built in a closed source fashion. Our North Star may not have changed but oh my gosh the path we took to get there has changed a LOT. So many of the great ideas in Rust came not from the core team but from users hitting limits, or from one-off suggestions on IRC or Discord or Zulip or whatever chat forum we were using at that particular time. “I wanted to sit down and try to cite a bunch of examples of influential people but I quickly found the list was getting ridiculously long – do we go all the way back, like the way Brian Anderson built out the infrastructure as a kind of quick hack, but one that lasts to this day? Do we cite folks like Sophia Turner and Esteban Kuber’s work on error messages? Or do we look at the many people stretching the definition of what Rust is today … the reality is, once you start, you just can’t stop. “So instead I want to share what I consider to be an amusing story, one that is very Rust somehow. Some of you may have heard that in 2024 the ACM, the major academic organization for computer science, awarded their SIGPLAN Software Award to Rust. A big honor, to be sure. But it caused us a bit of a problem – what names should be on there? One of the organizers emailed me, Graydon, and a few other long-time contributors to ask us our opinion. And what do you think happened? Of course, we couldn’t decide. We kept coming up with different sets of people, some of them absurdly large – like thousands of names – others absurdly short, like none at all. Eventually we kicked it over to the Rust Leadership Council to decide. Thankfully they came up with a decent list somehow. “In any case, I just felt that was the most Rust of all problems: having great success but not being able to decide who should take credit. The reality is there is no perfect list – every single person who got named on that award richly deserves it, but so do a bunch of people who aren’t on the list. That’s why the list ends with All Rust Contributors, Past and Present – and so a big shout out to everyone involved, covering the compiler, the tooling, cargo, rustfmt, clippy, core libraries, and of course organizational work. On that note, hats off to Mara, Erik Jonkers, and the RustNL team that put on this great event. You all are what makes Rust what it is. “Speaking for myself, I think Rust’s penchant to re-imagine itself, while staying true to that original north star, is the thing I love the most. ‘Stability without stagnation’ is our most important value. The way I see it, as soon as a language stops evolving, it starts to die. Myself, I look forward to Rust getting to a ripe old age, interoperating with its newer siblings and its older aunts and uncles, part of the ‘cool kids club’ of widely used programming languages for years to come. And hey, maybe we’ll be the cool older relative some day, the one who works in a bank but, when you talk to them, you find out they were a rock-and-roll star back in the day. “But I get ahead of myself. 
Before Rust can get there, I still think we’ve some work to do. And on that note I want to say one other thing – for those of us who work on Rust itself, we spend a lot of time looking at the things that are wrong – the bugs that haven’t been fixed, the parts of Rust that feel unergonomic and awkward, the RFC threads that seem to just keep going and going, whatever it is. Sometimes it feels like that’s ALL Rust is – a stream of problems and things not working right. “I’ve found there’s really only one antidote, which is getting out and talking to Rust users – and conferences are one of the best ways to do that. That’s when you realize that Rust really is something special. So I do want to take a moment to thank all of you Rust users who are here today. It’s really awesome to see the things you all are building with Rust and to remember that, in the end, this is what it’s all about: empowering people to build, and rebuild, the foundational software we use every day. Or just to ‘hack without fear’, as Felix Klock legendarily put it. “So yeah, to hacking!”

sunshowers 1 year ago

Debugging a rustc segfault on illumos

At Oxide , we use Helios as the base OS for the cloud computers we sell. Helios is a distribution of illumos , a Unix-based operating system descended from Solaris. As someone who learned illumos on the job, I’ve been really impressed by the powerful debugging tools it provides. I had a chance to use some of them recently to track down a segmentation fault in the Rust compiler, with the help of several of my colleagues. I learned a lot from the process, and I thought I’d write about it! I’m writing this post for an audience of curious technologists who aren’t necessarily familiar with systems work. If you’re an experienced systems developer, parts of it are likely familiar to you—feel free to skip over them. A couple of weeks ago, I wanted to make a change to the Rust standard library on illumos. I logged into my illumos box and cloned the Rust repository (revision ). Following the setup instructions , I configured the build system with the build profile. When I went to run , I saw an error with the following output: Quite concerning! Like any good technologist I tried running the command again. But the segfault seemed to be completely deterministic: the program would crash while compiling every time. Coincidentally, we had our fortnightly “Rust @ Oxide” virtual meetup at around that time. There wasn’t much to discuss there, so we turned that meeting into a debugging session. (I love how my coworkers get excited about debugging strange issues.) Like the compilers for many other languages, the Rust compiler is written in the language it is intending to compile (in this case, Rust). In other words, the Rust compiler is self-hosting . Any self-hosting compiler needs to answer the question: how in the world do you compile the compiler if you don’t already have a working compiler? This is known as the bootstrapping problem . There are several ways to address the problem, but the two most common are: Use the previous version of the compiler. In other words, use version N-1 of the compiler to compile version N. For example, use Rust 1.75 to compile Rust 1.76. The earliest versions of Rust were written in Ocaml. So if you’re spinning up Rust on a brand new platform and have an Ocaml compiler available, you can actually start from there and effectively create your own lineage of compilers. There are also implementations of Rust in other languages, like in C++, which can be used to build some (typically pretty old) version of the compiler. Interestingly, these other implementations don’t need to be perfect—for example, since they’re only used to compile code that’s known to be valid, they don’t need to handle errors well. That’s a large chunk of the complexity of a real compiler. Cross-compile from another platform. As a shortcut, if you have a way to cross-compile code from another platform, you can use that to set up the initial compiler. This is the most common method for setting up Rust on a new platform. (But note that method 1 must be used on at least one platform.) While bootstrapping from the previous version of Rust, the toolchain follows a series of stages , ranging from stage 0 to stage 2 . In our case, since we’re working with the standard library we’re only concerned with stage 0 : the standard library compiled with the previous version of . That is the build process that crashed. The first thing to find is the version of that’s crashing. There are a few ways to find the compiler, but a simple command works well: This command finds at . 
Let’s ask it for its version: Can the bug be reproduced independently of the Rust toolchain? The toolchain does all sorts of non-standard things, so it’s worth checking. The output says , so let’s try building that separately. Again, there are a few ways to do this, but the easiest is to make a simple Cargo project that depends on the crate. And then run . I didn’t have rustc 1.80.0 beta 1 on the machine, so I tried with the 1.80.0 release: Yep, it crashes in the same spot. This is a minimal-enough example, so let’s work with this. When a program crashes, systems are typically configured to generate a core dump , also known as a core file. The first step while debugging any crash is to ensure that core dumps are generated, and then to find one to examine it. On illumos, many of the system-level administration tools are called . The tool for managing core files is called . Let’s run that: This suggests that core “per-process core dumps” are enabled. The lack of a pattern indicates that the defaults are used. Generally, on Unix systems the default is to generate a file named in the current directory of the crashing process. A simple in our little test project doesn’t show a file, which means that it might be elsewhere. Let’s just do a global for it. This showed a few files on my system, including: . Bingo! That looks like a hit. (Why is it in the registry? Because when compiling a crate, Cargo sets the current working directory of the child process to the crate’s directory.) The next step is to move the file into another directory 1 . After doing that, let’s start examining it. The best way to examine a core file on illumos is with the Modular Debugger, . is a powerful tool that can be used to inspect the state of both live and dead processes, as well as the kernel itself. Using with the core file is simple: just run . The first step is to enable symbol demangling 2 . The command to do that in is , so let’s run that: (The output says “C++”, but illumos’s demangler can handle Rust symbols, too.) Let’s look at the CPU registers now. A register stores a small amount of data that the CPU can access very quickly. Core files typically have the contents of registers at the time of the crash, which can be very useful for debugging. In , the command to print out registers is or . Here’s the output: All right, there’s a lot going on here. A full accounting of the registers on x86-64 is beyond the scope of this post, but if you’re interested here’s a quick summary . The most important registers here are , , and . All three of these are 64-bit addresses. is the instruction pointer , also known as the program counter . is a special register that points to the next instruction to be executed. The CPU uses to keep track of where it is in the program. is the stack pointer . The call stack is a region of memory that is used to store function call information and local variables. The stack pointer points to the head of the stack. Note that on most architectures including x86-64, the stack grows down in memory: when a function is called, a new stack frame is set up and the stack pointer is decremented by however much space the function needs. is the base pointer , more commonly known as the frame pointer . It points to the base of the current stack frame 3 . We can also look at the call stack via the command. The stack turns out to be enormous ( full output ): (The is used to send the output to a shell command, in this case one that counts the number of lines.) It looks like the crash is in the parser. 
(Notably, the crash is while compiling a crate called , which suggests automatic code generation. Generated code often tends to stress the parser in ways that manually written code does not.) Based on the call stack, it looks like the parser is recursive in nature. A quick Google search confirms that the parser is a “simple hand-written recursive descent parser”. This isn’t surprising, since most production parsers are written this way. (For example, is also a recursive descent parser.) Turning our attention to the instruction pointer , we can use the command to disassemble the function at that address. ( Full output ; the flag ensures that addresses are not converted to very long function names.) So it looks like the crash is happening in a instruction to another function, . (Keep in mind that this information could be completely unreliable! The stack might be corrupted, the registers might be wrong, and so on. But it’s what we have for now.) On virtual memory systems , which includes all modern desktop and server systems, each process gets the illusion that it has a very large amount of memory all to itself. This is called the address space of a process. The instructions, the call stack, and the heap all get their own regions of addresses in that space, called memory mappings . The 64-bit addresses that we saw earlier are all part of the address space. has a command called to look up which part of memory an address is at. Let’s look at the stack pointer first: This tells us that the address is in the range to . This is a small 4 KiB range. What about the frame pointer? This appears to be in a different range. In this case, the ending address is (note the , not the !). This address is bytes away from the starting address. That is equal to 1028 KiB , or 1 MiB + 4 KiB page 4 . Something else that’s relevant here is what permissions each range of addresses has. Like files on Unix, a block of virtual memory can have read , write , or execute permissions. (In this case, execute means that it is valid for the instruction pointer to point here 5 .) On illumos, a tool called can show these spaces. works on both live processes and core files. Running shows the permissions for the addresses we’re interested in ( full output ): The 1028 KiB range is read-write, and the 4 KiB range above that doesn’t have any permissions whatsoever. This would explain the segfault . A segfault is an attempt to operate on a part of memory that the program doesn’t have permissions for. Attempting to read from or write to memory which has no permissions is an example of that. At this point, we have enough information to come up with a theory: But there are also other bits of evidence that this theory doesn’t explain, or even cuts against. (This is what makes post-mortem debugging exciting! There are often contradictory-seeming pieces of information that need to be explained.) The memory is marked or . That’s not how call stacks are supposed to be marked! In the output, there’s a line which says: So you’d expect call stacks to be marked with , not . Why is the size of the allocation 1028 KiB? You’d generally expect stack sizes to be a round power of two. Isn’t 1028 KiB kind of small? The thread is a non-main thread, and the default stack size for Rust threads is 2 MiB . Why is our thread ~1 MiB and not 2 MiB? On Unix platforms, for the main thread, the call stack size is determined by (in KiB). On my illumos machine, this printed , indicating a 10 MiB call stack. 
For child threads, the call stack size is determined by whatever created them. For Rust, the default is 2 MiB. Why doesn’t this crash happen on other platforms? If this is a crash in the parser, one would ordinarily expect it to arise everywhere. Yet it doesn’t seem to occur on Linux, macOS, or Windows. What’s special about illumos? Setting doesn’t help. Rust-created thread stack sizes can be configured via the environment variable . If we try to use that: It turns out that crashes at exactly the same spot. That’s really strange! It is possible that the stack size was overridden at thread creation time. The documentation for says: “Note that setting will override this.” But that seems unlikely. Looking towards the bottom of the call stack, there’s something really strange : Notice the jump in addresses from to ? Normally, stack addresses are decremented as new functions are called: the number goes down. In this case the stack address is incremented . The number went up. Strange. Also notice that this coincides with the use of a function called . Now that’s a real lead! What part of memory is in? says: So this address is part of the stack for thread 3. agrees : What is ? Time for some googling! Per the documentation , is: A library to help grow the stack when it runs out of space. This is an implementation of manually instrumented segmented stacks where points in a program’s control flow are annotated with “maybe grow the stack here”. Each point of annotation indicates how far away from the end of the stack it’s allowed to be, plus the amount of stack to allocate if it does reach the end. Because the parser is recursive, it is susceptible to call stack exhaustion. The use of is supposed to prevent, or at least mitigate, that. How does work? The library has a pretty simple API : The developer is expected to intersperse calls to within their recursive function. If less than bytes of stack space remain, will allocate a new segment of bytes, and run with the stack pointer pointing to the new segment. How does rustc use ? The code is in this file . The code requests an additional 1 MiB stack with a red zone of 100 KiB. Why did create a new stack segment? In our case, the call is at the very bottom of the stack, when plenty of space should be available, so ordinarily should not need to allocate a new segment. Why did it do so here? The answer is in ’s source code . There is code to guess the stack size on many platforms. But it isn’t enabled on illumos: always returns . With this information in hand, we can flesh out our call stack exhaustion theory: Some file in was triggering the crash by requiring more than 1 MiB of stack space. Had this bug occurred on other platforms like Linux, this issue would have been a showstopper. However, it wasn’t visible on those platforms because: didn’t call enough! In order for it to work, needs to be interspersed throughout the recursive code. But some recursive parts did not appear to have called it. (It is somewhat ironic that , a library meant to prevent call stack exhaustion, was actively making life worse here.) Where does the 1028 KiB come from? Looking at the source code : It looks like first computes the number of requested pages by dividing the requested stack size by the page size, rounding up. Then it adds 2 to that. In our case: This explains both the 1028 KiB allocation (one guard page after the stack), and the 4 KiB guard page we’re crashing at (one guard page before the stack). 
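As a back-of-the-envelope check of that arithmetic, here is a tiny OCaml sketch (the names are mine, not stacker’s) that reproduces the computation described above: round the requested size up to whole pages, add two pages, and you land on the observed sizes.

```ocaml
(* Assumes 4 KiB pages; the +2 accounts for the two extra/guard pages. *)
let page_size = 4 * 1024

let alloc_size requested_bytes =
  let pages = (requested_bytes + page_size - 1) / page_size in
  (pages + 2) * page_size

let () =
  (* 1 MiB requested -> 256 pages -> 258 pages -> 1032 KiB in total,
     seen as a 1028 KiB read-write mapping plus a 4 KiB guard page. *)
  assert (alloc_size (1024 * 1024) = 1032 * 1024)
```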
If the issue is that a 1 MiB stack isn’t enough, it should be possible to reproduce this on other platforms by setting their stack size to something smaller than the 2 MiB default. With a stack size <= 1 MiB, we would expect that: Let’s try to compile on Linux with a reduced stack size. This does crash as expected. The full output is here . Some of the symbols are missing, but the crash does seem to be in parser code. (At this point, we could have gone further and tried to make a debug-assertions build of – but it was already pretty clear why the crash was happening.) Call stack exhaustion in the parser suggests that the crash is happening in some kind of large, automatically generated file. But what file is it? It’s hard to tell by looking at the core file itself, but we have another dimension of debugging at hand: syscall tracers! These tools print out all the syscalls made by a process. Most OSes have some means to trace syscalls: on Linux, on macOS, Process Monitor on Windows, and on illumos 7 . Since we’re interested in file reads, we can try filtering it down to the and syscalls . You need to open a file to read it, after all. (Alternatively, we can also simply not filter out any syscalls, dump the entire trace to a file, and then look at it afterwards.) On illumos, we tell to run , filtering syscalls to and ( ), and following child processes ( ): This prints out every file that the child tries to open ( full output ): It looks like the crash is in a file called in the directory. With Cargo, a file being in an directory is a pretty strong indication that it is generated by a build script. On Linux, a similar command is: This command also blames the same file, . What does this file look like, anyway? Here’s my copy. It’s pretty big and deeply nested! It does look large and complex enough to trigger call stack exhaustion. Syscall traces would definitely be somewhat harder to get if the crash weren’t so easily reproducible. Someone smarter than me should write about how to figure this out using just the core file. The file’s fully loaded into memory so it seems like it should be possible. Going back to the beginning: the reason I went down this adventure was because I wanted to make an unrelated change to the Rust standard library. But the stage 0 compiler being broken meant that it was impossible to get to the point where I could build the standard library as-is, let alone test that change. How can we work around this? Well, going back to basics, where did the stage 0 compiler come from? It came from Rust’s CI, and it wasn’t actually built on illumos! (Partly because there’s no publicly-available CI system running illumos.) Instead, it was cross-compiled from Linux to illumos. Based on this, my coworker Joshua suggested that I try and do whatever Rust’s CI does to build a stage 0 compiler for illumos. Rust’s CI uses a set of Docker images to build distribution artifacts. In theory, building a patched rustc should be as simple as running these commands on my Linux machine: In reality, there were some Docker permissions issues due to which I had to make a couple of changes to the script. Overall, though, it was quite simple. Here’s the patch I built the compiler with, including the changes to the CI scripts. The result of building the compiler was a set of files, just like the ones published by Rust’s CI . After copying the files over to my illumos machine, I wasn’t sure which tarballs to extract. So I made a small change to the bootstrap script to use my patched tarballs. 
With this patch, I was able to successfully build Rust’s standard library on illumos and test my changes. Hooray! ( Here’s what I was trying to test.) Update 2024-08-05: After this post was published, jyn pointed out on Mastodon that is actually optional, and that I could have also worked around the issue by disabling it in the build system’s . Thanks! The bug occurred due to a combination of several factors. It also revealed a few other issues, such as the lack of an environment variable workaround and some missing error reporting. Here are some ways we can make the situation better, and help us have an easier time debugging similar issues in the future. isn’t using enough. The basic problem underneath it all is that the part of the parser that triggered the bug wasn’t calling often enough to make new stack segments. should be calling more than it is today. cannot detect the stack size on illumos. This is something that we should fix in , but this is actually a secondary issue here. On other platforms, ’s ability to detect the stack size was masking the bug. Fixing this requires two changes: -created segments don’t print a nice message on stack exhaustion. This is a bit ironic because is supposed to prevent stack exhaustion. But when it does happen, it would be nice if printed out a message like standard Rust does. On illumos, the Rust runtime doesn’t print a message on stack exhaustion. Separate from the previous point, on illumos the Rust runtime doesn’t print a message on stack exhaustion even when using native stacks. Rust’s CI doesn’t run on illumos. At Oxide, we have an existential dependency on Rust targeting illumos. Even a shadow CI that ran on nightly releases would have caught this issue right away. We’re discussing the possibilities for this internally; stay tuned! segment sizes can’t be controlled via the environment. Being able to control stack sizes with is a great way to work around issues. It doesn’t appear that segment sizes can be controlled in this manner. Maybe that functionality should be added to , or to itself? Maybe a crater run with a smaller stack size? It would be interesting to see if there are other parts of the Rust codebase that need to call more as well. suggests disabling optional components. Since was an optional component that can be disabled, the tooling could notice if a build failed in such a component, and recommend disabling that component. Added 2024-08-05, suggested by jyn . To me, this is the most exciting part of debugging: what kinds of changes can we make, both specific and systemic ones, to make life easier for our future selves? This was a really fun debugging experience because I got to learn about several illumos debugging tools, and also because we could synthesize information from several sources to figure out a complex issue. (Thankfully, the root cause was straightforward, with no memory corruption or other “spooky action at a distance” involved.) Debugging this was a real team effort. I couldn’t have done it without the assistance of several of my exceptional colleagues. In no particular order: Thanks to all of you! I neglected to do this during my own debugging session, which led to some confusion when I re-ran the process and found that the core file had been overwritten.  ↩︎ Name mangling is a big topic of its own, but the short version is that the Rust compiler uses an algorithm to encode function names into the binary. The encoding is designed to be reversible, and the process of doing so is called demangling. 
(Other languages like C++ do name mangling, too.)  ↩︎ You might have heard about “frame pointer omission”, which is a technique to infer the base of stack frames rather than storing it in explicitly. In this case, the frame pointer is not omitted.  ↩︎ A page is the smallest amount of physical memory that can be atomically mapped to virtual memory. On x86-64, the page size is virtually always 4 KiB.  ↩︎ Memory being both writable and executable is dangerous, and modern systems do not permit this by default for security reasons. Some platforms like iOS even make it impossible for memory to be writable and executable, unless the platform holder gives you the corresponding permissions.  ↩︎ This is generally known as a “stack overflow”, but that term can also mean a stack-based buffer overflow . Throughout this document, we use “call stack exhaustion” to avoid confusion.  ↩︎ There is likely some way to get itself to print out which files it opened, but the beauty of system call tracers is that you don’t need to know anything about the program you’re tracing.  ↩︎ Use the previous version of the compiler. In other words, use version N-1 of the compiler to compile version N. For example, use Rust 1.75 to compile Rust 1.76. From where do you begin, though? The earliest versions of Rust were written in Ocaml. So if you’re spinning up Rust on a brand new platform and have an Ocaml compiler available, you can actually start from there and effectively create your own lineage of compilers. There are also implementations of Rust in other languages, like in C++, which can be used to build some (typically pretty old) version of the compiler. Interestingly, these other implementations don’t need to be perfect—for example, since they’re only used to compile code that’s known to be valid, they don’t need to handle errors well. That’s a large chunk of the complexity of a real compiler. Cross-compile from another platform. As a shortcut, if you have a way to cross-compile code from another platform, you can use that to set up the initial compiler. This is the most common method for setting up Rust on a new platform. (But note that method 1 must be used on at least one platform.) is the instruction pointer , also known as the program counter . is a special register that points to the next instruction to be executed. The CPU uses to keep track of where it is in the program. is the stack pointer . The call stack is a region of memory that is used to store function call information and local variables. The stack pointer points to the head of the stack. Note that on most architectures including x86-64, the stack grows down in memory: when a function is called, a new stack frame is set up and the stack pointer is decremented by however much space the function needs. is the base pointer , more commonly known as the frame pointer . It points to the base of the current stack frame 3 . The thread had a call stack of 1028 KiB available to it, starting at . The call stack pointer was at (only = 320 bytes away), and it tried to create a frame of size (1312) bytes, at . This caused the call stack to be exhausted : the thread ran out of space 6 . When the thread ran out of space, it indexed into a 4 KiB section known as a guard page . The thread did not have any permissions to operate on the page, and was in fact designed to cause a segfault if accessed in any way. The program then (correctly) segfaulted. The memory is marked or . That’s not how call stacks are supposed to be marked! 
In the output, there’s a line which says: So you’d expect call stacks to be marked with , not . Why is the size of the allocation 1028 KiB? You’d generally expect stack sizes to be a round power of two. Isn’t 1028 KiB kind of small? The thread is a non-main thread, and the default stack size for Rust threads is 2 MiB . Why is our thread ~1 MiB and not 2 MiB? How are call stack sizes determined? On Unix platforms, for the main thread, the call stack size is determined by (in KiB). On my illumos machine, this printed , indicating a 10 MiB call stack. For child threads, the call stack size is determined by whatever created them. For Rust, the default is 2 MiB. Why doesn’t this crash happen on other platforms? If this is a crash in the parser, one would ordinarily expect it to arise everywhere. Yet it doesn’t seem to occur on Linux, macOS, or Windows. What’s special about illumos? Setting doesn’t help. Rust-created thread stack sizes can be configured via the environment variable . If we try to use that: It turns out that crashes at exactly the same spot. That’s really strange! It is possible that the stack size was overridden at thread creation time. The documentation for says: “Note that setting will override this.” But that seems unlikely. Some file in was triggering the crash by requiring more than 1 MiB of stack space. The parser running against needed more than 1 MiB of stack space, but less than 2 MiB. Had this bug occurred on other platforms like Linux, this issue would have been a showstopper. However, it wasn’t visible on those platforms because: Threads created by Rust use a 2 MiB stack by default. requested that create a 1 MiB stack segment, but only if less than 100 KiB of stack space was left. On the other platforms, could see that well over 100 KiB of stack space was left, and so it did not allocate a new segment. On illumos, could not see how much stack was left, and so it allocated a new 1 MiB segment. This 1 MiB stack was simply not enough to parse . didn’t call enough! In order for it to work, needs to be interspersed throughout the recursive code. But some recursive parts did not appear to have called it. The requested stack size is 1 MiB. With 4 KiB pages, this works out to 256 pages. then requests 256 + 2 = 258 pages, which is 1032 KiB. calls as before. There are two possibilities: either decides there is enough stack space and doesn’t create a new segment, or it decides there isn’t enough and does create a new 1 MiB segment. In either case, 1 MiB is simply not enough to parse , and the program crashes. isn’t using enough. The basic problem underneath it all is that the part of the parser that triggered the bug wasn’t calling often enough to make new stack segments. should be calling more than it is today. Filed as rust-lang/rust#128422 . cannot detect the stack size on illumos. This is something that we should fix in , but this is actually a secondary issue here. On other platforms, ’s ability to detect the stack size was masking the bug. Fixing this requires two changes: A PR to to add the function to it. A PR to to use this function to detect the stack size on illumos. -created segments don’t print a nice message on stack exhaustion. This is a bit ironic because is supposed to prevent stack exhaustion. But when it does happen, it would be nice if printed out a message like standard Rust does. This is rust-lang/stacker#59 . On illumos, the Rust runtime doesn’t print a message on stack exhaustion. 
Separate from the previous point, on illumos the Rust runtime doesn’t print a message on stack exhaustion even when using native stacks. Filed as rust-lang/rust#128568 . Rust’s CI doesn’t run on illumos. At Oxide, we have an existential dependency on Rust targeting illumos. Even a shadow CI that ran on nightly releases would have caught this issue right away. We’re discussing the possibilities for this internally; stay tuned! segment sizes can’t be controlled via the environment. Being able to control stack sizes with is a great way to work around issues. It doesn’t appear that segment sizes can be controlled in this manner. Maybe that functionality should be added to , or to itself? Opened a discussion on internals.rust-lang.org . Maybe a crater run with a smaller stack size? It would be interesting to see if there are other parts of the Rust codebase that need to call more as well. suggests disabling optional components. Since was an optional component that can be disabled, the tooling could notice if a build failed in such a component, and recommend disabling that component. Added 2024-08-05, suggested by jyn . Joshua M. Clulow, Matt Keeter, Cliff Biffle, Steve Klabnik, artemis everfree

0 views
Emil Privér 1 year ago

Why I Like OCaml

According to my LinkedIn profile, I have been writing code for a company for almost 6 years. During this time, I have worked on PHP and WordPress projects, built e-commerce websites using Next.js and JavaScript, written small backends in Python with Django/Flask/FastAPI, and developed fintech systems in Go, among other things. I have come to realize that I value a good type system and prefer writing code in a more functional way rather than using object-oriented programming. For example, in Go, I prefer passing in arguments rather than creating a method. This is why I will be discussing OCaml in this article. If you are not familiar with the language OCaml or need a brief overview of it, I recommend reading my post OCaml introduction before continuing with this post. It will help you better understand the topic I am discussing.

Almost every time I ask someone what they like about OCaml, they often say “oh, the type system is really nice” or “I really like the Hindley-Milner type system.” When I ask new OCaml developers what they like about the language, they often say “This type system is really nice, TypeScript’s type system is actually quite garbage.” I am not surprised that these people say this, as I agree 100%. I really enjoy the Hindley-Milner type system and I think it is also the biggest reason why I write in this language. A good type system can make a huge difference for your developer experience. For those who may not be familiar with the Hindley-Milner type system, it can be described as a system where you write a program with strict types, but you are not required to explicitly state the types. Instead, each type is inferred from how the value is used. Let’s look at some code to demonstrate what I mean. In Go, you would be required to define the type of the arguments: However, in OCaml, you don’t need to specify the type: Since expects to receive a string, the signature for will be: But it’s not just for arguments; it’s also used when returning a value. This function will not compile because we are trying to return a string as the first value and later an integer. I also want to provide a larger example of the Hindley-Milner type system: The signature for this piece of code will be: In this example, we create a new module where we expose 3 functions: make, print_car_age, and print_car_name. We also define a type called . One thing to note in the code is that the type is only defined once, as OCaml infers the type within the functions since is a type within this scope. OCaml playground for this code Something important to note before concluding this section is that you can also define both the argument types and return types for your function explicitly.

The next topic is pattern matching. I really enjoy pattern matching in programming languages. I have written a lot of Rust, and pattern matching is something I use whenever I write Rust. Rich pattern matching is beneficial as it eliminates the need for many if statements. Additionally, in OCaml, you are required to handle every case of the match statement. For example, in the code below: In the code above, I am required to include the last match case because we have not handled every case. For example, what should the compiler do if the is Adam? The example above is very simple. We can also match on an integer and perform different actions based on the number value. For instance, we can determine if someone is allowed to enter the party using pattern matching.
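The original snippets are not reproduced in this listing, so here is a minimal OCaml sketch of the two matches described above; the function names, the strings, and the age cutoff of 18 are my own assumptions rather than the post's actual code:

```ocaml
(* Matching on a string: the compiler flags this match as non-exhaustive
   unless a final wildcard case handles names like "Adam". *)
let greet name =
  match name with
  | "Emil" -> print_endline "Hey Emil!"
  | _ -> print_endline "Hello, stranger."

(* Matching on an integer to decide who may enter the party.
   The cutoff of 18 is purely illustrative. *)
let can_enter_party age =
  match age with
  | n when n >= 18 -> print_endline "Welcome in!"
  | _ -> print_endline "Sorry, you can't come in."
```

Dropping the wildcard case from either function makes the compiler report the match as non-exhaustive, which is the behaviour described above.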
OCaml playground But the reason I mention variants in this section is that variants and pattern matching go quite nicely hand in hand. A variant is like an enumeration with more features, and I will show you what I mean. We can use them as a basic enumeration, which could look like this: This now means that we can do different things depending on this type: But I did mention that variants are similar to enumeration with additional features, allowing for the assignment of a type to the variant. Now that we have added types to our variants and included , we are able to adjust our pattern matching as follows: OCaml Playground We can now assign a value to the variant and use it in pattern matching to print different values. As you can see, I am not forced to add a value to every variant. For instance, I do not need a type on so I simply don’t add it. I often use variants, such as in DBCaml where I use variants to retrieve responses from a database. For example, I return if I did not receive any rows back, but no error. OCaml also comes with Exhaustiveness Checking, meaning that if we don’t check each case in a pattern matching, we will get an error. For instance, if we forget to add to the pattern matching, OCaml will throw an error at compile time. The next topic is operators and specific binding operators. OCaml has more types of operators, but binding operators are something I use in every project. A binding could be described as something that extends how works in OCaml by adding extra logic before storing the value in memory with . I’ll show you: This code simply takes the value “Emil” and stores it in memory, then assigns the memory reference to the variable hello. However, we can extend this functionality with a binding operator. For instance, if we don’t want to use a lot of match statements on the return value of a function, we can bind so it checks the value and if the value is an error, it bubbles up the error. This allows me to reduce the amount of code I write while maintaining the same functionality. In the code above, one of the variables is an , which means that the binding will return the error instead of returning the first name and last name. I really like the concept of functional programming, such as immutability and avoiding side-effects as much as possible. However, I believe that a purely functional programming language could force us to write code in a way that becomes too complex. This is where I think OCaml does a good job. OCaml is clearly designed to be a functional language, but it allows for updating existing values rather than always returning new values. Immutability means that you cannot change an already existing value and must create a new value instead. I have written about the Concepts of Functional Programming and recommend reading it if you want to learn more. One example where functional programming might make the code more complex is when creating a reader to read some bytes. If we strictly follow the rule of immutability, we would need to return new bytes instead of updating existing ones. This could lead to inefficiencies in terms of memory usage. Just to give an example of how to mutate an existing value in OCaml, I have created an example. In the code below, I am updating the age by 1 as it is the user’s birthday: What I mean by “it’s functional on easy mode” is simply that the language is designed to be a functional language, but you are not forced to strictly adhere to functional programming rules. 
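The birthday example mentioned above is not included in this listing, so here is a minimal sketch of how that kind of in-place update can look in OCaml; the record shape and the starting age are my own assumptions, not the post's original code:

```ocaml
(* A record with one mutable field: age can be updated in place. *)
type user = {
  name : string;
  mutable age : int;
}

let () =
  let user = { name = "Emil"; age = 25 } in
  (* It's the user's birthday, so bump the age by 1 in place
     instead of building a brand-new record. *)
  user.age <- user.age + 1;
  Printf.printf "%s is now %d years old\n" user.name user.age
```

The immutable alternative would be to build a new record with { user with age = user.age + 1 }, which is the style a strictly pure language would force on you.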
It is clear to me that a good type system can greatly improve the developer experience. I particularly appreciate OCaml’s type system, as well as its and types, which I use frequently. In languages like Haskell, you can extend the type system significantly, to the point where you can write an entire application using only types. However, I believe that this can lead to overly complex code. This is another aspect of OCaml that I appreciate - it has a strong type system, but there are limitations on how far you can extend it. I hope you enjoyed this article. If you are interested in joining a community of people who also enjoy functional programming, I recommend joining this Discord server.

0 views
Emil Privér 1 year ago

From Computer to Production With Nix

A while ago, I wrote “ Bye Opam, Hello Nix ” where the topic of that post was that I replaced Opam with Nix as it works much better. This post is about taking this a bit further, discussing how I use Nix for local development, testing, and building Docker images. The core concept of Nix is “reproducible builds,” which means that “it works on my machine” is actually true. The idea of Nix is that you should be able to make an exact copy of something and send it to someone else’s computer, and they should get the same environment. The good thing about this is that we can extend it further to the cloud by building Docker images. Even if Docker’s goal was to also solve the “it works on my machine” problem, it only does so to a certain level as it is theoretically possible to change the content of a tag (I guess that you also tag a new image ;) ? ) by building a new image and pushing it to the same tag. Another thing I like about Nix is that it allows me to create a copy of my machine and send it to production. I can create layers, import them using Docker, and then tag and push them to my registry. This specific post was written after working with and using Nix at work. However, the code in this post won’t be work-related, but I will show code that accomplishes the same task in OCaml instead of Python. The problems I wanted to solve at work were: In this article, we will create a new basic setup for an OCaml project using Nix for building and development. The initial code will be as follows, and it will also be available at https://github.com/emilpriver/ocaml-nix-template The code in this Nix config is for building OCaml projects, so there will be OCaml related code in the config. However, you can customize your config to suit the language and the tools you work with. The content of this config informs us that we can’t use the unstable channel of Nix packages, as it often provides us with newer versions of packages. We also define the systems for which we will build, due to Riot not supporting Windows. Additionally, we create an empty devShells and packages config, which we will populate later. We also specify the formatter we want to use with Nix. It’s important to note that this article is based on , which you can read more about here: https://shopify.engineering/what-is-nix The first thing I wanted to fix is the development environment for everyone working on the project. The goal is to simplify the setup for those who want to contribute to the project and to achieve the magical “one command to get up and running” situation. This is something we can easily accomplish with Nix. The first thing we need to do is define our package by adding the content below to our flake.nix. Here, I tell Nix that I want to build a dune package and that I need Riot, which is added to inputs: This also makes it possible for me to add our dev shell by adding this: So, what we have done now is that we have created a package called “nix_template” which we use as input within our devShell. So, when we run , we now get everything the needs and we get the necessary tools we need to develop, such as LSP, dune, and ocamlformat. This means that we are now working with a configuration like this: When working with Nix, I prefer to use it for running the necessary tools both locally and in the CI. This helps prevent discrepancies between local and CI environments. It also simplifies the process for others to run tests locally; they only need to execute a single command to replicate my setup. 
For example, I like to run my tests using Nix. It allows me to run the tests, including setting up everything I need such as Docker, with just one command. Let’s add some code into the object in our flake.nix. In the code provided, we create a new package named . This executes to verify our code. To run our tests in the CI or locally, we use . This method could potentially eliminate the need for installing tools directly in the CI and running tests there, replacing all of it with a single Nix command. Including Docker as a package and running a Docker container in the buildPhase is also possible. This is just one effective method I’ve discovered during my workflows, but there are other ways to achieve this as well. Additionally, you can execute tasks like linting or security checks. To do this, replace with the necessary command. Then, add the output, such as coverage, to the folder so you can read it later. I have tried to use Nix apps for this type of task, but I have always fallen back to just adding a new package and building it, as that has always been simpler for me.

Now it is time to build for release. This is the part where we make an optimized build that we can ship to production. How this works depends on what you want to achieve, but I will cover two common ways of building for release: building a plain binary or building a Docker image. To enable binary building, we only need to add a and an to our default package used for building. This makes our definition appear as follows: This implies that when we build the project using , we are building it in an isolated sandbox environment and returning only the required binary. For example, the folder now includes: Here, main.exe is the binary we built. Another way to achieve a release is to build Docker image layers with Nix that we later import into Docker so the image can be run. The benefit of this is that we get a reproducible Docker image, as we don’t use to build our image, and we can reuse a lot of the existing code to build the image. The way we achieve this is by creating a new , which in this case I call . To build our Docker image, we now simply need to run . We can then load the layers into Docker. Afterwards, we can tag the image and distribute it. Quite convenient. There are some tools specifically designed for this purpose, which are very useful. For example, can be used to tag and push an image to a container registry, such as in a GitHub action. What Nix does when building a Docker image is that it replaces the Docker build system, often referred to as . Instead, we build layers that we then import into Docker.

Not every package exists on https://search.nixos.org/packages , but a library that isn’t there can still be used. Under the hood, all the packages on the Nix packages page are just Nix configs that build projects, which means that it’s possible to build projects directly from source as well. This is how I do it with the package below: This now allows me to refer to this package from other packages to let Nix know that I need it and that it needs to build it for me. Something to keep in mind when you fetch from sources is that if you use something such as , you use the host machine’s ssh-agent, while uses the sandbox environment’s ssh-agent if it has any. This means that some requests won’t work unless you either use something like or add your SSH config during the build step.
After all these configurations, we should now have a flake.nix file that matches the code below. This code also exists at github.com/emilpriver/ocaml-nix-template . I hope this article has helped you with working with Nix. In this post, I built a flake.nix for OCaml projects, but it shouldn’t be too hard to replace the OCaml components with whatever language you want. For instance, packages exist for JavaScript to replace npm and for Rust to replace Cargo. These days, I use Nix for the development environment, testing, and building, and for me, it has been quite a good experience, especially when working with prebuilt flakes. My goal with this post was just to show “a way” of doing it. I’ve noticed that the Nix community tends to give a lot of opinions about how you should do things in Nix. The hard truth is that there are a lot of different ways to solve the same problem in Nix, and you should pick the way that suits you. If you like this type of content and want to follow me to get more information on when I post stuff, I recommend following me on Twitter: https://x.com/emil_priver

0 views
Emil Privér 1 year ago

Announcing DBCaml, Silo, Serde Postgres and a new driver for postgres

I’ve spent the last four months working on DBCaml. Some of you may be familiar with this project, while others may not. When I started DBCaml, I had one primary goal: to build a toolkit for OCaml that handles pooling, mapping to types, and more. This toolkit would also run on Riot, which is an actor-model multi-core scheduler for OCaml 5. An issue I’ve found with databases in the OCaml space is that most of the existing database libraries either don’t support Postgres version 14 and higher, or they run on the PostgreSQL library, which is a C-binding library. The initial release of DBCaml also used the Postgresql library just to get something published. However, this wasn’t something I wanted, as I felt really limited in what I was able to do, and the C-bindings library would also limit the number of processes I could run with Riot. So, I decided to work hard on the Postgres driver and write a native OCaml driver which uses Riot’s socket connection for the database. This post describes the new changes by going through each library. The GitHub repo for this project exists here: https://github.com/dbcaml/dbcaml Before I continue, I want to say a big thank you to Leandro Ostera , Antonio Monteiro and many more in the OCaml community. When I’ve been in need of help, you have provided me with information and code to fix the issues I encountered. Thank you tons! <3

Now that DBCaml has expanded into multiple libraries, I will refer to these as “The DBCaml project”. I felt it was important to write about this project again because the direction has changed since v0.0.1. DBCaml, the central library in this project, was initially designed to handle queries, type mapping, and pooling. As the project expanded, I decided to make DBCaml more developer-friendly. It now aids in pooling and sending queries to the database, returning raw bytes in response. DBCaml’s pool takes inspiration from Elixir’s Ecto. Currently, I recommend developers use DBCaml for querying the database and receiving raw bytes, which they can then use to build any desired features. However, my vision for DBCaml is not yet complete. I plan to extract the pooling function from DBCaml and create a separate pool manager, inspired by Elixir’s Ecto. This manager can be used by developers to construct features, such as a Redis pool. If you’re interested in learning more about how DBCaml works, I recommend reading these articles: ”Building a Connection Pool for DBCaml on top of Riot ” and ”Introducing DBCaml, Database toolkit for OCaml” .

A driver essentially serves as the bridge between your code and the database. It’s responsible for making queries to the database, setting up the connection, handling security, and managing TLS. In other words, it performs “the real job.” The first version of the driver was built for Postgresql, using a C-binding library. However, I wasn’t fond of this library because it didn’t provide raw bytes, which are crucial when mapping data to types. This library has since been rewritten into native OCaml code, using Riot’s sockets to connect to the database. The next library to discuss is Serde Postgres, a Postgres wire deserializer. The Postgres wire protocol is used by Postgres to define the structure of the bytes, enabling us to create clients for Postgres. You can read about the Postgres wire protocol at: https://www.postgresql.org/docs/current/protocol.html With the introduction of Serde Postgres, it’s now possible to deserialize the Postgres wire format and map the data to types.
Here’s an example: By creating a separate library, developers can use Serde Postgres together with DBCaml to make queries and later parse the data into types. The final library to discuss is Silo. This is the high-level library I envisioned for DBCaml, one that handles everything for you and allows you to simply write your queries and work with the necessary types. Silo uses DBCaml to make raw queries to the database and then maps the bytes from Postgres to types using Serde Postgres. Here’s an example: Silo is the library I anticipate most developers will use, unless they create their own database library because they need further control over functionality. There is some more stuff I’ve planned for this project, such as building more drivers and deserializers for different databases: I also want to build more tools for you as a developer when you write your OCaml projects, and some of these are: I hope you appreciate these changes. If you’re interested in contributing to the libraries or discussing them, I recommend joining the Discord: https://discord.gg/wqbprMmgaD For more minor updates, follow my Twitter page: https://twitter.com/emil_priver If you find a bug, I would love it if you created an issue here: https://github.com/dbcaml/dbcaml/issues

0 views
Emil Privér 1 year ago

My Love/Hate Letter to Copilot

Just as Post Malone expressed a love-hate relationship with alcohol, I’m here to share my mixed feelings about Copilot. This bittersweet tool has been in my toolkit for a while. Despite its frequent frustrations, I find myself relying on it more than I’d like. Only after disabling the AI did I notice the changes in my programming habits due to its influence. For those unfamiliar with Copilot, here’s a quick introduction: Copilot is a tool that attempts to understand your problem, then searches GitHub for matching solutions and suggests them to you. For many, this tool is something they love: they can arrive at a solution faster by simply waiting for a suggestion, applying it as quickly as they can, and moving on with their life. When I began using Copilot and ChatGPT, I was amazed at how much faster I could create things. However, I hadn’t anticipated how it would change my approach to system creation, how it might affect my perspective during development, and how it made the engineer within me smaller.

As software engineers, developers, or programmers, our role is not just to understand a problem and find a solution, but also to write good software. This means that while certain solutions, such as using in Rust, may technically work, they may not be the best practice. For example, it’s generally advisable to minimize the use of in order to conserve memory. If you’re unfamiliar with , it essentially copies existing memory into another part of memory and provides a new reference to this new memory. While this might not sound problematic, it can be quite detrimental. For instance, if you clone something that is 1 GB in size, you allocate another 1 GB of memory. This could likely be avoided by properly applying reference and ownership rules. Another issue is that you could exhaust the available memory, causing your service to crash. Copilot might suggest using . We could apply this suggestion and proceed. However, instead of doing our job, we may end up relying on a solution provided by an AI that might not understand the real context. This can be problematic. I noticed this happening when I first started learning OCaml. There were instances when I waited for a suggestion, even though the solution was just two lines of straightforward code. One instance that I recall is when I was parsing a URL. Copilot suggested using Regex, which is not ideal due to its potential for bugs. Instead, I solved it using the following code:

One of my major issues with Copilot is that it can diminish our problem-solving skills. Instead of analyzing a situation and finding a good solution ourselves, we increasingly delegate this task to AI. Over time, this could diminish our engineering mindset. For instance, when we encounter a bug, our first instinct might be to ask an AI about the bug, rather than trying to figure it out ourselves. Believe it or not, at this stage, the AI is less capable than you. You may find yourself in a loop where you keep telling the AI, “No, this doesn’t work,” and the AI keeps suggesting new code. Ironically, the solution could be as simple as changing a single character in the code. Another issue is that you might create a solution that, although functional, is subpar due to not leveraging your own skills. I believe it’s common for us to become complacent when AI becomes a significant part of our development process. I experienced this myself recently when I was building SASL authentication for Postgres in OCaml and encountered a tricky bug.
Instead of manually inserting print statements into the code to debug, I copied the code and the error and handed it over to ChatGPT. The solution came from a combination of reading sqlx code and realizing that I had overlooked a small detail. As software engineers, we need to learn continuously. We often learn by problem-solving, addressing issues, and devising solutions. However, over-reliance on AI in our development process can hinder this learning. We may apply code without fully understanding it, which can be detrimental in the long run. Just because you used an AI to solve a bug doesn’t mean you should rely on it every time a similar issue arises. This can be a significant issue, especially for new developers. It’s crucial that we’re able to “feel” the code we’re working on. Programming is not just about understanding code; it’s about connecting the pieces of a larger puzzle to build a solution. It takes time to understand this, it matters especially at the beginning of a career, and it’s part of why learning programming takes time.

Imagine having a bytestring from which you need to parse values. Ideally, you would do this piece by piece: extract the first value, print it, then move on to the next value. Repeat this process until you’ve gathered all necessary values. It’s common to print the initial data for transparency. However, as AI becomes more integral to development and begins to handle such tasks for us, there may be situations where starting from scratch becomes challenging due to our reliance on these tools, and we end up stuck. One day, we may face a problem that AI can’t solve, and we might not know where to start debugging. This is because we’re so used to having someone else handle it for us. This is why I always advise new developers to try to solve problems themselves first before asking for help. I want them to understand what they’re doing and not rely on me. I also don’t believe that we necessarily become more effective by using AI. Often, we might find ourselves stuck in a loop, waiting for new suggestions repeatedly. In such situations, we could likely solve the problem faster, and perhaps even better, by using our own brains instead. We often associate efficient programming with the ability to produce large amounts of code quickly. However, this isn’t necessarily true. A single line of code can sometimes be more efficient and easier to work with than ten lines of code. AI is effective at generating boilerplate but often falls short in providing quality solutions.

I’ve critiqued Copilot for a while, but it’s worth mentioning that it’s not necessarily bad to use it, provided you choose the appropriate time. I still use Copilot, but only when I’m working on simpler tasks that it can easily handle, like generating boilerplate code. However, I only enable it occasionally. I’ve noticed that it’s crucial not to rely heavily on such tools, as doing so can lead to negative habits, like waiting for a suggestion and hitting enter repeatedly to find a solution. Another area I’ve noticed where it works quite well is when you’re programming in Go. Go is designed to be simple, and Copilot works well with simple tasks, so the code it recommends is mostly okay. AI can pose a significant challenge for new developers. It’s tempting to let AI dictate the path to a solution, rather than using it as one of many potential paths. This often leads to developers accepting the code returned by AI without truly understanding it.
However, understanding the code is essential for new developers. The easiest way to contact me is through Twitter: https://twitter.com/emil_priver

0 views
Emil Privér 1 year ago

Introducing DBCaml, Database toolkit for OCaml

It’s time for me to discuss what I’ve been working on recently: DBCaml. DBCaml is a database toolkit built for OCaml, based on Riot. Riot is an actor-model multi-core scheduler for OCaml 5. Riot works by creating lightweight processes that execute code and communicate with the rest of the system using message-passing. You can find Riot on GitHub . The core idea of DBCaml is to provide a toolkit that assists with the “boring” tasks you don’t want to deal with, allowing you to focus on your queries. Some examples of these tasks include: This is an example of how you can use DBCaml: During the initial v0.0.1 release, DBCaml can be installed using the following command:

I wanted to learn a new language and decided to explore functional programming. I came across OCaml online and found it interesting. When Advent of Code 2023 started, I chose OCaml as the language for my solutions. However, I didn’t build them using a functional approach. Instead, I wrote them in a non-functional way, using a lot of references. My solutions turned out to be so bad that a colleague had to rewrite my code. However, this experience further sparked my interest. One day, I came across Leostera , a developer working on Riot, an actor-model multi-core scheduler for OCaml 5. Riot is similar to Erlang’s BEAM, which intrigued me. It dawned on me that if I wanted to explore OCaml further, I needed a project to work on. That’s when I made the decision to build a database library for OCaml. I believed that it would be a useful addition to the Riot ecosystem.

DBCaml can be categorized into three layers: the driver, the connection pool, and the interface that the developer works with. I have already explained how the connection pool works in a previous post, which you can find here: Building a Connection Pool . However, I would like to provide further explanation on drivers and the interface. The driver is responsible for communicating with the database. It acts as a bridge between DBCaml and the database. The main idea behind having separate drivers as a library is to avoid the need for installing unnecessary libraries. For example, if you are working with a Postgres database, there is no need to install any MySQL drivers. By keeping the drivers separate, unnecessary dependencies can be avoided. Additionally, the driver takes care of all the security measures within the library. DBCaml simply provides the necessary data to the drivers, which handle the security aspects. I will describe the current functionality of everything and explain my vision for how I believe this library will evolve in future releases. Currently, DBCaml provides four methods: start_link, fetch_one, fetch_many, and exec. These methods serve as the highest level of functionality in the package and are the primary interface used by developers for testing purposes in v0.0.1. These methods handle most of the tasks that developers don’t need to worry about, such as requesting a connection from the pool. I have a broad vision for DBCaml, which encompasses three categories: testing, development, and runtime. The specifics of what will be included in the testing and development areas will become clearer as we start working on it. However, currently, the most important aspect is to have a v0.0.1 release for the connection pool. This is the critical component of the system, and we need feedback on its functionality and to identify any potential bugs or issues. Writing effective tests can be challenging, particularly when it is not possible to mock queries.
However, one solution to this problem is to utilize DBCaml. DBCaml can help you in writing tests by providing reusable code snippets. This includes the ability to define rows, close a database, and more, giving you control over how you test your application. I believe SQLx for Rust ( https://github.com/launchbadge/sqlx ) has done an excellent job of providing a great developer experience (DX). It allows users to receive feedback on the queries they write without the need to test them during runtime. In other words, SQLx enables the use of macros to execute code against the database during the compilation process. This way, any issues with the queries can be identified early on. It is, of course, optional for users to opt in to this feature. The advantage of this feedback during development is that users can work quickly without having to manually send additional HTTP requests in tools like Postman to trigger the queries they want to test. This saves users valuable time. By allowing users to test queries during compilation, they can skip writing tests for queries. This provides feedback on whether the query works or not during development. During runtime, it is important to have a system that can handle pooling for your application. This ensures that if a connection dies, it is recreated and booted again. Currently, we are in version v0.0.1, which is a small release with limited functionality. However, I have big plans for the future of this package. The purpose of creating v0.0.1, despite knowing that there will be upcoming changes, is to test the connection pool and ensure its functionality. The v0.0.1 release includes the ability to fetch data from the database and use it, along with a connection pool and a PostgreSQL driver. However, I will soon be branching out DBCaml into three new packages: This significant change will be implemented in the v0.0.2 milestone. I want to give a special thank you to Leostera, who has helped me a lot during the development. I wouldn’t say this is something I’ve built on my own; it is a joint effort between me, Leostera, and other members of the Riot Discord to make this happen. If you are interested and would like to follow along with the development, I can recommend some links for you:

0 views