Latest Posts (11 found)

IPC channel multiplexing: what next?

About three months ago, I posted IPC channel multiplexing: next steps. Since then I’ve taken a complete break from the project to make the most of the summer and to go on holiday. As I consider how to proceed, I think those next steps still make sense, but that’s not the whole story.

The basic problem the multiplexing prototype is trying to solve is as follows. If an IPC channel endpoint is sent over another IPC channel, when it is received, it consumes a file descriptor (at least on Unix variants). A new file descriptor is consumed even if the same IPC channel endpoint is received multiple times. This can crash the receiving process if it runs out of file descriptors. (The sketch at the end of this post illustrates the problem.)

The thing that has changed in the intervening gap is my motivation. I really enjoyed implementing multiplexing of IPC channels as it was relatively self-contained. Extending the API to support more Servo usecases does not feel so fun. Also, I would like more assurance that if I invest the effort to make IPC channel multiplexing suitable for adoption by Servo, there’s a reasonable chance it will actually be adopted. There seem to be relatively few Servo developers who understand IPC channel well enough to engage with adopting multiplexing. Plus they are likely to be very busy with other things. So there may simply be a lack of traction. Also, multiplexing isn’t a simple piece of code, so merging it into IPC channel will increase the project’s size and complexity and therefore its maintenance cost. There may be performance or usability issues in adopting multiplexing. I’m not aware of any such issues and I don’t anticipate these being significant if they crop up, but there’s still a risk. Currently, I’m working in isolation from the Servo team and I’d like some reassurance that the direction I’m heading in is likely to be adopted.

The advantages of continuing are:

- Capitalising on the effort already expended. [1]
- Potentially fixing the bug.
- Overcoming the difficulties involved would give a greater sense of achievement.
- I enjoy solving difficult problems and it would keep my brain active.

The disadvantages of continuing are:

- Potentially wasting more effort.
- Now may be an opportunity to retire properly from my career in software development. [2]

On balance, I think I’ll continue.

It would be possible to move the multiplexing prototype to a separate repository and crate which depends on the IPC channel crate. The advantages of this are:

- It could increase the profile of the prototype.
- I could host this on codeberg.org rather than github.com. [3]
- Ease of code navigation, since the code would be pure Rust rather than multiplatform.
- Ease of CI: Linux only.
- Ease of promotion of changes, since it wouldn’t require the involvement of IPC channel committers.
- Publication to crates.io for ease of consumption by Servo. [4]
- Documentation could be centred on multiplexing.

One possible disadvantage is that it would not be possible to reuse IPC channel internals. For example, if one of the missing features for multiplexing was essentially the same as that for vanilla IPC channel, I couldn’t just generalise the code and share it.

I think the most effective way forward is to test the Servo team’s willingness to adopt multiplexing by focussing on a usecase that is known to exhibit the bug, reproducing the bug in isolation, showing that multiplexing fixes the bug, and proposing a fix for Servo. So I’ll start by looking at the bug reports, picking one, and looking at the IPC channel usecase in Servo which hits the bug. I’ll defer the decision of whether to package the prototype as a separate repository until I start to touch the prototype code again.
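To make the problem concrete, here’s a minimal sketch using the ipc-channel crate. Treat it as illustrative rather than definitive: the fd counting via /proc/self/fd is Linux-specific, and the exact descriptor accounting could differ between platforms and crate versions.

```rust
use ipc_channel::ipc::{self, IpcSender};

// Count this process's open file descriptors (Linux-specific).
fn open_fds() -> usize {
    std::fs::read_dir("/proc/self/fd").map(|d| d.count()).unwrap_or(0)
}

fn main() {
    // A channel whose messages are themselves channel endpoints.
    let (meta_tx, meta_rx) = ipc::channel::<IpcSender<String>>().unwrap();

    // A single "payload" endpoint which we will send over and over.
    let (payload_tx, _payload_rx) = ipc::channel::<String>().unwrap();

    let before = open_fds();

    let mut received = Vec::new();
    for _ in 0..10 {
        // Send the *same* endpoint repeatedly...
        meta_tx.send(payload_tx.clone()).unwrap();
        // ...and each receipt materialises a fresh file descriptor on the
        // receiving side, even though logically it is the same endpoint.
        received.push(meta_rx.recv().unwrap());
    }

    let after = open_fds();
    println!("open file descriptors before: {before}, after: {after}");
}
```

In a long-running process that keeps the received endpoints around, that count only goes up, which is the exhaustion the multiplexing prototype is meant to avoid.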
[1] This is contrary to the sunk cost fallacy. ↩︎
[2] I’m not sure what else I would prefer to do with my spare mental capacity. ↩︎
[3] I really dislike Microsoft’s policy of trawling github.com to build AI models. I’m also shocked about Microsoft’s willingness to create e-waste by dead-ending Windows 10 and not supporting older hardware with Windows 11, although they have delayed the deadline with the Windows 10 Extended Security Updates (ESU) programme. (On the other hand, maybe this move will push more people to adopt Linux. See END OF 10.) ↩︎
[4] Unfortunately, and it’s a big unfortunately, this still requires the repository to be mirrored to github.com. See Non-Github account creation. ↩︎

0 views
underlap 1 month ago

Metrics

With these weighted means, I would no longer double the total. So I had to be really pessimistic in estimating the large value.

You may wonder how test code figured in KLOC sizings. For many years, the approach of development groups to test code was not to write any and, if they did write any, to delete it to avoid the overhead of maintaining it. You may also wonder how reused code figured in KLOC sizings. But in that period of time, software reuse was a future hope. Ultimately, a group from IBM Böblingen developed some reusable “building blocks” in the form of macros for various data structures. Prior to that, any data structure that was needed was coded for the situation at hand. Collections were represented using linked lists. Anything more sophisticated, such as hash tables or binary trees, involved consulting Knuth’s “The Art of Computer Programming” and developing the data structure from scratch, often without unit tests. So data structures often added to the KLOC count, but were often under-estimated in the design phase.

I was once briefly seconded to a department which was responsible for, among other things, estimating the number of defects for software releases. As a fresh mathematics graduate, I developed probabilistic equations for the distribution over time of defects of a release, based on its total size (in KLOCs). It turned out this distribution was usually drawn freehand by one of the long-standing members of that department, without the need for any equations.

At various points in my early career, there was discussion in development teams of using “function points” as a measure of complexity of a feature and an alternative to KLOC (or person-month) sizings. It was said to be possible to calculate the function point sizing of a feature from a sufficiently detailed natural language description, although I never understood how this was remotely possible. Another measure was cyclomatic complexity, which I used at various points to identify overly-complex modules. However, it was usually very difficult to reduce the complexity of such modules substantially, so the benefit of checking cyclomatic complexity was not clear.

In one particularly ambitious IBM process, Cleanroom Software Engineering, the goal was to estimate the Mean Time To Failure (MTTF) of software components. This was based on similar efforts to measure the MTTF of hardware. [2] The process was known as 6σ - the goal being to improve the MTTF so that it was six standard deviations higher than some supposed industry average. Various techniques were applied including informal mathematical refinement proofs. I personally introduced a process, branded “Clean/Z”, which attempted to combine cleanroom software engineering, the Z notation for specification and refinement, and literate programming. Refinement used a subset of the implementation language with associated proof rules. Some shrewd developers used to decide what the code structure would be up front and then present that decomposition as a refinement step. Very few others attempted anything like an informal proof – it was simply too laborious.

With the advent of agile methods, we used to estimate features using story points. These were meant to represent customer value rather than implementation effort, although there was always an unspoken understanding among developers that story points really did represent implementation effort. [3] It was explicit in agile methods that story points were only a very rough estimate of the size of a feature.
Later we adopted Fibonacci sizings, using the values 1, 2, 3, 5, 8, 13, etc. There was a consensus that larger story point values were complete guesses and required more breaking down by first developing a rough prototype or “spike”. (Sizing of spikes was treated with even more disbelief.)

Why did I use metrics? In the early days, because I was required to. Later on, as a way of negotiating rough schedules with management. I’m not saying software metrics are particularly worthwhile, but it is sometimes necessary to get a rough feel for the size of a piece of work before committing to a significant amount of effort. [4]

[1] LID stood for “Line Item Description” - the first stage of a waterfall process in which a feature, or “line item”, would be estimated in person-months or, equivalently, KLOCs. This was also a pun because there were signs pointing to large and small lids for take-away cups in the IBM Hursley coffee lounge. ↩︎
[2] “Cleanroom” was a reference to a dust-free environment used to manufacture silicon chips. ↩︎
[3] See, for example, the remarkably frank account Why Estimate In Points, Not Time? in “FAQ: Pivotal/Tanzu Labs Engineering” by Joe Moore, Matt Parker, and others. ↩︎
[4] I added this conclusion after accidentally publishing the post, because I didn’t flag it as a draft, and in response to the question “Why oh why?” from Bob Marshall. ↩︎

0 views
underlap 1 month ago

RSS thank-you

This is just a short post to thank those of you subscribing to the RSS feed of this site. RSS is excellent and I really value the sites I subscribe to. Keep up the good work!

1 view
underlap 1 month ago

Visitor stats and the point of blogging

One of the best things about migrating to eleventy (and removing commenting) is that my site’s /privacy page now looks great: [1] This site does not track, or collect any information, about visitors. It does not use cookies. However, I’m still adjusting to the absence of visitor stats and this is causing me to review the point of blogging and get some draft posts published, including this one.

WriteFreely used to tell me how many times each blog post had been visited. This was usually in single or double digits, although certain posts racked up around a thousand visits each. But this figure wasn’t a reliable indication of the number of readers. For example, if I (or anyone else) posted a link to one of my posts on Mastodon, then multiple servers would access the post to generate preview cards. [2] So, at best, the statistic was an upper bound on the number of times a post had been read. [3] With eleventy, I no longer have the statistic. The number of visitors shouldn’t concern me, but it was always tempting to look and maybe get a little dopamine hit. I like to think that at least a few people enjoy my posts. I know some people subscribe to the RSS feed because I changed the feed URL during migration and a couple of people mentioned it.

This brings me to the point of blogging. Partly I’m writing for my own pleasure, and partly to share potentially useful, interesting, or enjoyable thoughts with others. It doesn’t matter that my readership is tiny. Another point of blogging is to clarify my own thinking and to help me reflect on certain topics. I would probably continue for that reason alone. On that basis, I should probably feel freer to write about subjects other than software development. [4] I hesitate because I don’t want to force my readers to filter out material that doesn’t interest them. But, dear reader, I guess I’m already posting on topics that are of no interest to you whatsoever and you are still here (for which I thank you). I guess the best solution is to create a separate blog for other subjects, a bit like I did with my notes instance (now a subset of my eleventy site), but instead to make it quite separate, with its own RSS feed. Anyway, I’ll mull that over and think about whether I really want to do it.

Over four days drafting this post on and off, I received a couple of emails from people appreciating my blog. The main reason was that, yesterday, one of my posts reached the second slot on Hacker News with 90-odd comments. [5] So it’s encouraging to have some tangible feedback. I suspect drafting the present post helped motivate me to get some other draft posts over the line, including the one which got the attention of HN. It’s hard when a post has some rough edges, but I’ve got to learn not to let that hold me back.

[1] Not counting the FreshRSS instance I run on the site (which has cookies), because I’m the only user of that. ↩︎
[2] My blog no longer produces preview cards on Mastodon, but who cares? I could use eleventy-plugin-metagen, but the default configuration seems to accommodate X/Twitter, which puts me off. ↩︎
[3] Reader stats are necessarily approximate. Just because a browser loads a page doesn’t mean to say a person has read the content. ↩︎
[4] Yeah, I know, this very post isn’t about software development! But blogging often ends up being a bit introspective, doesn’t it? ↩︎
[5] The last time I got any real traction on Hacker News was with The Little Book of Rust Books, although that comment thread is no longer around. A close second was AI-Shunning robots.txt. ↩︎

0 views
underlap 1 month ago

Developer's block

Writer’s block is the paralysis induced by a blank page, but software developers experience a similar block and it can even get worse over time. Sometimes a good analogy is that your wheels are spinning and you need to gain traction. Let’s look at the different kinds of developer’s block, what causes them, and how to get unblocked.

You want to write great code. In fact, most developers want each of their coding projects to be their best ever. That means different things to different people, but if you apply all of the following practices from the start, you’ll soon get blocked. Once you buy into the benefits of testing, you’ll want to include decent unit and integration test suites in your code. Of course, at least in the longer term, a decent test suite helps maintain velocity. Right? You might also want to include some fuzz testing, to exercise edge cases you haven’t thought of. When you’ve realised how useful good documentation is, you’ll want a good README or user guide and probably some other documentation on how to contribute to or maintain the code. You might want to document community standards too, just in case. Then there are specific coding practices that you have learned such as good naming, modularity, and the creation and use of reusable libraries. You’ll want to stick to those, even if they need a bit more effort up front. You may have favourite programming languages that will influence your choice of language and tooling, regardless of what would actually make the job in hand easier to complete. For example, if you’re working on open source, you may prefer an open source programming language, build tools, and editor or IDE. Then you will probably want to use version control and write good commit logs. How could you not? You’ll then want to set up CI to run the test suite automatically. You may want to set up cross-compilation so you can support multiple operating systems. You may want to stick to a standard coding style and enforce that with automation in your preferred editor or IDE and maybe a check in CI. You’ll want a consistent error-handling approach and decent diagnostics so it’s easy to debug the code. If the code involves concurrency, you’ll want to put in extra effort to make sure your code is free from data races, deadlocks, and livelocks. All these practices are valuable, but sometimes they just mount up until you’re blocked.

Another kind of developer’s block occurs later on in a project. Either you are new to the project and you just feel overwhelmed or you’ve been working on the project for a while, but you run out of steam and get stuck. The causes in these two cases are different. Feeling overwhelmed is often due to trying to rush the process of gaining understanding. Nobody comes to a new codebase and instantly understands it. Another issue with a new codebase is unfamiliarity with the implementation language or the conventions in the way the language is used. Running out of steam may be due to overwork or a lack of motivation.

You have to find a way in. Sometimes trying the code out as a user gives you a better idea of what it’s all about. Sometimes you need to read the docs or tests to get an idea of the externals. Eventually, you can start looking at the source code and building up a mental model of how it all fits together to achieve its purpose. If there are other people working on the project, don’t be afraid to ask questions. [1] Sometimes a newcomer’s naive questions help others to understand something they took for granted.
If you’re new to the implementation language of a project, take some time to learn the basics. Maybe you’re fluent in another language, but that doesn’t mean you can instantly pick up a new language. When you come across a confusing language feature, take the opportunity to go and learn about the feature. Remember the dictum “If you think education is expensive, try ignorance”.

It’s important to take regular breaks and holidays, but sometimes you’re mentally exhausted after finishing one or more major features. This is the time to take stock and ease off a little. Perhaps do some small tasks, sometimes known as “chores”, which are less mentally taxing, but nevertheless worthwhile. Maybe take time to pay off some technical debt.

Pick a small feature or bug and implement it with the minimum effort. Circle back round to improve the tests, docs, etc. Rather than implementing all your best practices at the start of a project, see if there are some which can wait a while until you’ve gained some traction. Sometimes you need to do a quick prototype, sometimes called a “spike”, in which case just hack together something that just about solves the problem. Concern yourself only with the happy path. Write just enough tests to help you gain traction. Then keep the prototype on a branch and circle back round and implement the thing properly with decent tests and docs. It’s ok to refer to the prototype to remind yourself how you did some things, [2] but don’t copy the code wholesale, otherwise you’ll be repaying the technical debt for ages. If you’re trying to learn about a dependency, it’s sometimes easier to write a quick prototype that uses the dependency, possibly in an empty repository, or even not under version control at all if it’s really quick.

Don’t polish your docs prematurely. Keep the format simple and check it in alongside the code. Capture why you did things a particular way. Provide basic usage instructions, but don’t do too much polishing until you start to gain users.

I think Michael A. Jackson summed this up best: Rules of Optimization: Rule 1: Don’t do it. Rule 2 (for experts only): Don’t do it yet. So don’t optimise unless there is a genuine problem - most code performs perfectly well if you write it so a human being can understand it. If you write it that way, you have some chance of being able to optimise it if you need to. In that case, do some profiling to find out where the bottlenecks are and then attack the worst bottleneck first. After any significant changes and if the problem still remains, re-do the profiling.

The code might be a little half-baked, with known issues (hopefully in an issue tracker), but don’t let this hold you back from releasing. This will give you a better feeling of progress. You could even get valuable early feedback from users or other developers.

You may be held up by a problem in a dependency such as poor documentation. It is tempting to start filling in the missing docs, but try to resist that temptation. Better to make minimal personal notes for now and, after you’ve made good progress, consider scheduling time to contribute some docs to the dependency. Similarly, if your tooling doesn’t work quite right, just try to get something that works even if it involves workarounds or missing out on some function. Fixing tooling can be another time sink you can do without.

Are you prone to developer’s block? If so, what are your tips for getting unblocked? I’d love to hear about them.
Some interesting comments came up on Hacker News, including a link to an interesting post on test harnesses.

[1] But try to ask questions the smart way. ↩︎
[2] I’ve found git worktree useful for referring to a branch containing a prototype. This lets you check the branch out into a separate directory and open this up alongside your development branch in your editor or IDE. ↩︎

0 views
underlap 1 month ago

Software convergence

The fact that such limits turn out to be members of the semantic domain is one of the pleasing results of denotational semantics. That kind of convergence is all very well, but it’s not what I had in mind. I was more interested in code which converges, to some kind of limit, as it is developed over time. The limit could be a specification of some kind, probably formal. But how would we measure the distance of code from the specification? How about the number of tests passing? (I sketch one way of formalising this at the end of this post.)

This seems to make two assumptions:

1. Each test really does reflect part of the specification.
2. The more distinct tests there are, the more closely the whole set of tests would reflect the specification.

The second assumption, as stated, is clearly false unless the notion of “distinct tests” is firmed up. Perhaps we could define two tests to be distinct if it is possible to write a piece of code which passes one of the tests, but not the other. There’s still a gap. It’s possible to write many tests, but still not test some part of the specification. Let’s assume we can always discover untested gaps and fill them in with more tests.

With this notion of a potentially growing series of tests, how would we actually go about developing convergent software? The key is deciding which tests should pass. This can be done en masse, a classic example being when there is a Compliance Test Suite (CTS) that needs to pass. In that case, the number/percentage of tests of the CTS passing is a good measure of the convergence of the code to the CTS requirements. But often, especially with an agile development process, the full set of tests is not known ahead of time. So the approach there is to spot an untested gap, write some (failing) tests to cover the gap, make those (and any previously existing) tests pass, and then look for another gap, and so on. The number of passing tests should increase monotonically, but unfortunately, there is no concept of “done”, like there is when a CTS is available. Essentially, with an agile process, there could be many possible specifications and the process of making more tests pass simply reduces the number of possible specifications remaining.

I’m still mulling over the notion of software convergence. I’m interested in any ideas you may have. One nice property of convergent software should be that releases are backward compatible. Or, I suppose, if tests are changed so that backward incompatible behaviour is introduced, that’s the time to bump the major version of the next release, and warn the users.

I’m grateful to some good friends for giving me tips on LaTeX markup. [2] In particular, \varepsilon produces a “curly” epsilon: ε.

[1] Tom M. Apostol, “Mathematical Analysis”, 2nd ed., 1977, Addison-Wesley. ↩︎
[2] I’m actually using KaTeX, but it’s very similar to LaTeX. ↩︎
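As a postscript, here is a rough sketch of the formalisation I keep reaching for, in notation of my own (c_n for the code at step n, T_n for the set of distinct tests at that point, and d for the distance measured by failing tests). It is only a sketch of the idea, not a rigorous definition.

```latex
% Distance of the code c_n from the specification, as approximated by the
% growing test set T_n: the fraction of distinct tests that c_n fails.
d(c_n, T_n) = \frac{\lvert \{\, t \in T_n : c_n \text{ fails } t \,\} \rvert}{\lvert T_n \rvert}

% Convergence then reads like the familiar limit definition from analysis:
\forall \varepsilon > 0 \;\; \exists N \;\; \forall n \ge N : \quad d(c_n, T_n) < \varepsilon
```

The catch, as noted above, is that T_n keeps growing, so nothing forces d to decrease monotonically, which is perhaps another way of saying there is no concept of “done”.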

0 views
underlap 1 month ago

Week note 2025-08-21

I’m not in the habit of writing week notes, but given that I’ve been fiddling around with various things, I thought I’d try out the format. Last week the weather in the UK was glorious, so my wife and I went on various walks and day trips and even went for a swim/float in the sea. I also replaced the broken SSD on my Linux desktop and installed arch again. But what have I been up to this week?

I was getting back into running a few weeks ago and then mildly strained my Achilles. So I’ve paused running and have been going to the gym and using a cross-trainer to regain strength and protect against injury. Going to the gym is more time-consuming than I’d like, so as soon as I can, I’ll want to get back to running.

I’m enjoying Gordon Corera’s “The Spy in the Archive”. I also bumped into the film “Tenet” on BBC iPlayer. Although it seems a bit violent for my taste (Kenneth Branagh plays an extremely nasty piece of work), the 12 rating encouraged me to watch more of it. The mind-bending nature of the plot is very enjoyable, although I don’t find the dialogue particularly easy to hear, so I’m wondering how many details I’m missing. I’ll need to finish it off some time when my wife is busy – it wasn’t her cup of tea.

I’ve also done a bit of gardening by re-planting a bed in the front garden and using bark chippings to suppress any weeds. Gardening isn’t my favourite activity, but this task gave me a nice sense of satisfaction.

I’ve been enjoying the stability of my Linux desktop after re-installing arch. The i3 window manager continues to work out well. I’m particularly pleased that, so far, suspend/resume has worked perfectly. Previously, every 10-20 suspend/resumes would result in a crash and require a hard reboot. I did one or two rolling upgrades, which were as painless as usual.

I’ve been tidying up some things on my website. I replaced WriteFreely with eleventy a couple of weeks ago. But this week I’ve been tweaking the content, especially the “cornerstone” pages (/about etc.), and chipped away at some draft posts (which are taking a bit longer than I’d like). I ripped out a couple of servers from the VPS: a gist server and a link shortener. I don’t use gists much – I migrated the useful ones to the notes section of my site – and I have never used the link shortener properly, even though I enjoyed implementing the underlying algorithm in Rust. I updated the VPS to the latest release of Debian 12, which was very straightforward. I won’t need to migrate to Debian 13 until 2028, which is nice to know.

I look after live streaming at my church and we are in the middle of a major refurbishment which involves rewiring and re-siting the cameras and tech desks (sound, AV, and live streaming). Since I’ll be away when the system is commissioned, I’ve written up some acceptance tests and shared them with others who will be around. I’ll probably have to update my “tech notes” after the system is bedded in and help the live streaming team get to grips with any changes.

My next big chunk of development work will be on Servo: writing some ipc-channel tests to mirror the way Servo uses ipc-channel. If I can reproduce some of the file descriptor exhaustion scenarios, so much the better. But I want to leave this to the last quarter of the year and enjoy some holidays before then. In September we have planned a trip to Ireland. We’ve been to Dublin before, but this time we are spending a few weeks in the south-west and south. I’m expecting it to be beautiful, but wet.
Then we’ve got some sailing booked in the Norfolk Broads. After that I should be rested and up for a challenge. I expect it to be fairly taxing to unearth, and understand, the ipc-channel usecases in Servo. After that, I’ll want to extend my ipc-channel multiplexing prototype to accommodate more Servo usecases. The goal is to multiplex the ipc-channel usecases in Servo to solve the long-standing file descriptor exhaustion bug.
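Sketching ahead a little, one shape such a test could take is below. It’s a rough sketch of my own rather than Servo’s actual usage: the limit_fds helper, the 64-descriptor limit, and the assumption that ipc-channel surfaces descriptor exhaustion as an ordinary error are all mine. The idea is to lower the process’s file-descriptor limit and then mirror the pattern of repeatedly receiving a sender until something gives.

```rust
use ipc_channel::ipc::{self, IpcSender};

// Drop this process's file-descriptor limit so exhaustion happens quickly.
// (Unix-only; uses the libc crate. The limit of 64 is an arbitrary choice,
// and lowering it affects every test running in the same process.)
fn limit_fds(limit: libc::rlim_t) {
    let rl = libc::rlimit { rlim_cur: limit, rlim_max: limit };
    unsafe { libc::setrlimit(libc::RLIMIT_NOFILE, &rl) };
}

#[test]
fn repeated_receipt_of_a_sender_exhausts_fds() {
    limit_fds(64);

    let (meta_tx, meta_rx) = ipc::channel::<IpcSender<Vec<u8>>>().unwrap();
    let (payload_tx, _payload_rx) = ipc::channel::<Vec<u8>>().unwrap();

    // Keep every received copy alive, as long-lived components would,
    // and watch the descriptor budget run out.
    let mut held = Vec::new();
    let exhausted = (0..1000).any(|_| {
        if meta_tx.send(payload_tx.clone()).is_err() {
            return true;
        }
        match meta_rx.recv() {
            Ok(sender) => {
                held.push(sender);
                false
            }
            Err(_) => true,
        }
    });

    assert!(exhausted, "expected to hit the file-descriptor limit");
}
```

A multiplexed implementation should make the opposite assertion possible: the loop completes without ever hitting the limit.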

0 views
underlap 1 month ago

Is it worth blocking AI bots?

I don’t want AI bots scraping my website. Apart from the licensing issue if the scraped content is used inappropriately, [1] I’m very concerned about the environmental impact of AI in terms of the power and water consumption of data centres as well as electronic waste. The proliferation of AI bots is probably a temporary phenomenon since the AI bubble – hype and over-investment coupled with limited results – is likely to pop within a few years. But meanwhile, website owners like myself need to decide how to respond. In this post I outline my current approach of blocking AI bots from my website and ask whether it’s really worth it.

My website has a robots.txt file listing the AI-related bots I don’t want to access content. The list is maintained by the ai.robots.txt community project. This works fine for bots, run by responsible companies, [2] which respect robots.txt. But what about bots which ignore it? I also run an nginx module which blocks access to the site [3] by the same list of bots. I recently tried to rebuild the module and the build failed due to a new warning in gcc v15. [4] Also, there are some vulnerabilities in the dependencies of this project, so – assuming the vulnerabilities have been fixed – I’d need to bump some versions to pick up the fixes. In short, the module is starting to need more effort to maintain.

Some AI bots respect robots.txt. So it seems worth the small amount of effort to keep it up to date. Other bots ignore robots.txt, either due to an oversight (which seems unlikely) or maliciously. I recently checked my server logs and there were very few accesses by bots. Also the feasibility of blocking by user agent depends on AI bots using predictable user agents, and malicious bots are starting to vary their user agents to avoid detection. So I think it’s simply not worth the effort required to maintain the nginx module.

An alternative would be to configure nginx filters, again using an include from ai.robots.txt. But this is yet another thing to keep up to date and doesn’t seem worth the effort, given the low bot traffic on my site and the fragility of blocking by user agent. I really don’t want to get into blocking by IP address, or the complexity of fronting my site with a system which does this for me. I’ll stick with robots.txt and hope that, until the AI bubble bursts, relatively responsible AI companies will drive their malicious counterparts out of business.

[1] This site is licensed under CC BY-NC-SA 4.0. ↩︎
[2] Perhaps I should say “relatively responsible”, given the claims some of these companies are making. ↩︎
[3] Except for robots.txt, to which access is always granted. ↩︎
[4] Setting the environment variable worked around this. ↩︎

0 views
underlap 2 months ago

Formatting maths in Eleventy with KaTeX

[1] The dependency was pretty old: v0.6.0. The new version was needed to recognise the “output” option. ↩︎
[2] This is from a draft post. The maths is taken from wikipedia. ↩︎

0 views
underlap 2 months ago

Blogging in markdown

I recently switched my blog to eleventy and so posting now consists of editing a markdown file and regenerating the site. There are several benefits. This post discusses markdown footnotes, the YAML preamble used in eleventy markdown files, how best to configure markdown in eleventy, and how I use Code OSS to edit markdown. But first I’d like to reflect on the advantages of markup languages, of which markdown is one.

WYSIWYG (What You See Is What You Get) has sometimes been described as WYSIAYG (What You See Is All You’ve Got). In other words, the content doesn’t necessarily imply the logical structure of a document. Being able to see the structure of a document makes it more readable. Also, I’ve seen Microsoft Word documents that would make your toes curl: [1] the author used arbitrary formatting to achieve what, in their opinion, looked good. But in doing so, they failed to provide a logical structure. I have used WYSIWYG editors, such as Microsoft Word and OpenOffice/LibreOffice, but I tend to spend too long fiddling with the formatting, which is a distraction from writing the content. Also, I have experienced situations where a document gets corrupted and cannot be opened. This is more likely with WYSIWYG editors which store documents in a binary format. Therefore I much prefer markup languages over WYSIWYG. The structure of a document is clearer and there’s less need to pay attention to formatting while writing.

I’ve used various markup languages over the years: GML, SGML [2] (briefly), HTML, LaTeX, and markdown. I really like LaTeX, especially when mathematics is involved, but markdown has the advantage that the source is more readable. The authors of RFC 9535, of which I was one, used markdown, [3] so it’s even suitable for writing technical documents. That said, let’s look at one of the main benefits of moving my blog to eleventy.

The beauty of using markdown footnotes is that they are numbered and sorted automatically. Using a meaningful name for a footnote rather than a number makes it easier to keep track of which footnote goes with which reference. The syntax pairs a named reference in the text, such as [^detail], with a matching definition elsewhere in the file, such as [^detail]: The detail itself. With manual numbering, adding a footnote in the middle of the sequence was awkward and error prone. Also, the footnotes can be kept near to where they are referenced, rather than having to be put at the bottom of the file. I installed a footnotes plugin for markdown-it, [4] to use markdown footnotes in eleventy. So much for one of the main benefits of using markdown for blog posts.

On the other hand, an unfamiliar feature was forced on me by the eleventy base blog: each markdown post has to start with a preamble written in YAML. A preamble seems like a reasonable place to store metadata for a post, such as its title, date, and tags. I’m still getting used to listing tags in the preamble. WriteFreely used to render hashtags automatically, which was more in line with the philosophy of markdown. Also, it would be more natural to use a top-level heading at the start of a post to indicate the title.

The default configuration of markdown in eleventy isn’t ideal, so I adjust a couple of settings. One ensures semantic line breaks: breaking a paragraph across multiple lines is not reflected in the rendered version, so it’s possible to put each sentence on its own line. This makes for better readability of the markdown and better diffs. [5] If you need persuading of the advantages of this, see Semantic Linefeeds, which includes the quote below, Semantic line breaks are a feature of Markdown, not a bug, and Semantic Line Breaks.

Hints for Preparing Documents
Most documents go through several versions (always more than you expected) before they are finally finished. Accordingly, you should do whatever possible to make the job of changing them easy. First, when you do the purely mechanical operations of typing, type so subsequent editing will be easy. Start each sentence on a new line. Make lines short, and break lines at natural places, such as after commas and semicolons, rather than randomly. Since most people change documents by rewriting phrases and adding, deleting and rearranging sentences, these precautions simplify any editing you have to do later.
— Brian W. Kernighan, 1974

The other setting ensures proper quote marks are used and various constructs are replaced by symbols (for example, three dots become an ellipsis).

I’m using Code OSS (the open source variant of VSCode) for editing. Yeah, I know: it’s not Emacs or vi. But it does have a ton of useful features and plugins which work out of the box. In addition to the built-in markdown editing and preview support in Code OSS, I installed the following plugins: [6]

- Markdown Footnote - renders footnotes correctly in the preview.
- Markdown yaml Preamble - displays the preamble at the start of the preview. [7] For example, the preamble of this post renders in the preview as a small table of its fields.
- Markdown lint - helps enforce a standard style for writing markdown.

I’m pretty happy writing blog posts as plain markdown files. There are many more advantages than disadvantages. Let’s see if my opinion is the same in six months’ time.

[1] The most egregious examples have been students’ assignments, but others have come close. ↩︎
[2] This was with Framemaker. I can’t remember whether the markup was actually SGML or XML. ↩︎
[3] We actually used kramdown, a dialect of markdown geared towards writing IETF specifications. ↩︎
[4] The markdown support used by eleventy base blog. ↩︎
[5] particularly if you use ↩︎
[6] I used arch’s code marketplace to install plugins from the VSCode marketplace. This seems legit if I restrict myself to plugins with an OSS license. After all, I could have downloaded the source of each plugin and installed it in Code OSS. ↩︎
[7] Having the table in the preview at least means the title features somewhere. But I’d prefer the plugin to render the title as a heading, so I suggested this in an issue. ↩︎

1 view
underlap 2 months ago

Arch linux take two

After an SSD failure, [1] I have the pleasure of installing arch linux for the second time. [2] Last time was over two years ago (in other words I remember almost nothing of what was involved) and since then I’ve been enjoying frequent rolling upgrades (only a couple of which wouldn’t boot and needed repairing).

While waiting for the new SSD to be delivered, I burned a USB stick with the latest arch iso in readiness. I followed the instructions to check the ISO signature using gpg. The signature checked out, so this looks plausible, but to be on the safe side, I also checked that the sha256 sum of the ISO matched that on the arch website.

My previous arch installation ran out of space in the boot partition, so I ended up fiddling with the configuration to avoid keeping a backup copy of the kernel. This time, I have double the size of SSD, so I could (at least) double the size of the boot partition. But what is a reasonable default size for the boot partition? According to the installation guide, a boot partition isn’t necessary. In fact, I only really need a root (/) partition since my machine has a BIOS (rather than UEFI). Since there seem to be no particular downsides to using a single partition, I’ll probably go with that. Then I don’t need to choose the size of a boot partition.

The partitioning guide states:

- If you are installing on older hardware, especially on old laptops, consider choosing MBR because its BIOS might not support GPT.
- If you are partitioning a disk that is larger than 2 TiB (≈2.2 TB), you need to use GPT.

My system BIOS was dated 2011 [3] and the new SSD has 2 TB capacity, so I decided to use BIOS/MBR layout, especially since this worked fine last time.

Here are the steps I took after installing the new SSD. Boot from the USB stick containing the arch ISO. Check ethernet is connected using ping. It was already up to date. Launch the installer and set the various options. I then chose the Install option. It complained that there was no boot partition, so I went back and added a 2 GB fat32 boot partition. Chose the install option again. The installation began by formatting and partitioning the SSD. Twelve minutes later, I took the option to reboot the system after installation completed. After Linux booted (with the slow-painting grub menu, which I’ll need to switch to text), I was presented with a graphical login for i3. After I logged in, it offered to create an i3 config for me, which I accepted. Reconfigured i3 based on the contents of my dotfiles git repository. Installed my cloud provider CLI in order to access restic/rclone backups from the previous arch installation.

At this point I feel I have a usable arch installation and it’s simply a matter of setting up the tools I need and restoring data from backups. I wanted to start dropbox automatically on startup and Dropbox as a systemd service was just the ticket.

[1] The failed SSD had an endurance of 180 TBW and lasted 5 years. The new SSD has an endurance of 720 TBW, so I hope it will last longer, although 20 years (5*720/180) seems unlikely. ↩︎
[2] I was being ironic: it was quite painful the first time around. But this time I know how great arch is, so I’ll be more patient installing it. Also, I have a backup and a git repo containing my dot files, so I won’t be starting from scratch. ↩︎
[3] There was a BIOS update available to fix an Intel advisory about a side-channel attack. However, I couldn’t confirm that my specific hardware was compatible with the update, so it seemed too risky to apply the update. Also, browsers now mitigate the side-channel attack. In addition, creating a bootable DOS USB drive seems to involve either downloading an untrusted DOS ISO or attempting to create a bootable Windows drive (for Windows 10 or 11, which may require a license key), neither of which I relish. ↩︎

0 views