Latest Posts (20 found)

📝 2026-06-17 13:56: So you know how we have the incubator setup? Well, unbeknownst to me, my wife...

So you know how we have the incubator setup? Well, unbeknownst to me, my wife gave one of our broody hens a clutch of Guinea fowl eggs to sit on a month ago. She just sent me this... Thanks for reading this post via RSS. RSS is ace, and so are you. ❤️ You can reply to this post by email, or leave a comment.


The State of Fable, The Jailbreak Problem, SpaceX Acquires Cursor

The administration is very likely wrong about Fable, but that is ultimately Anthropic's responsibility.


Adding a Town Square

I recently learned about this fantastic project where visitors are able to "chat" with one another in a fun and private way. I had to try it! So now, at the bottom of every page on this site, you will see my little town square. Please take a look and have some fun with it. If you want to learn more about Town Square, you can take a look at this post from its creator, Cauê Napier.


📝 2026-06-17 07:04: We have the incubator setup incubating a dozen eggs, including the last 2 our hen...

We have the incubator setup incubating a dozen eggs, including the last 2 laid by our hen that was caught by the fox. Expect regular updates. 🐣


Flax debugging: making a hash of things

I was debugging an issue with a JAX/Flax NNX training loop the other day, and found a neat little trick to help debug it. Specifically, I wanted to see if the issue was with my model, my loss function, my optimiser settings, or the "plumbing" of the training loop itself -- were gradients actually coming through and being applied to the parameters? I could print out the loss and the gradients, but printing out the parameters to see if they were changing was unhelpful -- any given update might only change a small number of parameters, or might change them such a small amount that I'd not notice -- especially given that the model had 77 million of them! Let's take a look. I am building an LLM from scratch in JAX and Flax NNX, and at this stage I'm trying to get the training loop right. As a simple test, I've just implemented the "shell" of the LLM -- the token embeddings on the input side, and the final linear layer for an output head, wired directly together. My plan was to train that so that given a sequence, instead of predicting next tokens for each position, it would "predict" the sequence itself -- that is, I might train it with the input ...and the target ...rather than the normal setup for an LLM, where you feed it ...and give it targets of So, in LLM terms, I'd be training a model to project from vocab space to a learned embedding space where each token had a distinct-enough embedding for the output head to be able to reliably project back to logits in vocab space. There's a bit of background here if that was all Greek to you . Here's the core part of the code I was working with, the function, which seems to be the traditional JAX name for the JITted part of your code that does the forward pass through the model, works out the gradients, and then applies them to update the model: I'd based it on the "Basic Usage" example that's currently right there on the front page of the Flax site. 
Seasoned Flax veterans will probably spot the issue right away, but it wasn't obvious to me -- so it was time to dig in. The problem was that loss was not dropping -- indeed, taken to two decimal places, it was stuck at 10.82. The digits to the right of that changed for each batch, but the first four did not. Now, this model was using the GPT-2 tokeniser, and 10.82 is exactly the loss that you'd expect if the model was essentially guessing randomly -- if you convert it to perplexity by calculating e^10.82, you get about 50,011 -- which is very close to the GPT-2 vocab size of 50,257. Perplexity is, loosely, the number of tokens that the model was trying to choose between for a typical input -- so a perplexity equal to the vocab size is what you'd expect of a random model that is getting it right about one in 50,257 times. That said, getting that loss consistently was a solid validation of my loss function! It's vanishingly unlikely that it would have been getting that specific number so consistently if I'd made a mess of that. The tiny variations I was seeing in the third and subsequent decimal places would make sense, as they could easily be due to the variations in the contents of the different batches. So was it that the gradients were somehow zero, or NaNs, or something else that couldn't be usefully applied to the model by the optimiser? I printed them out in the function (removing the decorator, as otherwise the s would only get executed in the initial JIT pass through the function to compile it -- not when it had actual data[1]). The result was values like this: Those looked plausible enough -- pretty small, but not so tiny that I'd expect them to have no effect at all with my learning rate of 0.0014. It was time to dig into the training loop's plumbing. The obvious suspect was the update step -- was that call to actually changing the parameters at all? Flax's NNX API is a bit odd compared to the normal JAX functional way of doing things.
In vanilla JAX code you would expect to do something like this to apply gradients: That is, you get the new parameters by applying a transformation to the old ones. NNX, by contrast, is more PyTorch-flavoured. It updates the parameters in-place, using a function with a side effect of mutating one of its parameters: ...rather than something more functional like this imaginary API: I could easily imagine that I'd got something wrong that would break that in-place update, as it has the feel of something that would have to be quite delicately implemented on top of a functional system like JAX. But how could I see whether the parameters were changing, when there were 77 million of them and they would be being updated (based on gradients like -2.6879393e-06 and a learning rate of 1.4e-3) in the ninth decimal place or beyond? Printing the arrays out was a non-starter! After a little thought, I realised that the solution was to use hashes. Even tiny changes in the parameters' values would change their hashes drastically. So if the parameters were not being updated, as I suspected, I'd see constant hashes. If they were being updated, even by a minuscule amount, then the hashes would change. This GitHub discussion pointed me in the right direction: if I could get the parameters as pure JAX arrays, I could do this: ...where is just . That would produce a hash that was stable for the life of this run -- the same parameters would always have the same hash, and different ones would differ, just as we want. It could vary from run to run (Python uses different hash seeds in each new interpreter), but that wouldn't matter for this kind of debugging. I wasn't sure what the structure of my Flax model's parameters was, but printing them out in the training loop told me: So, guided by that, I added these lines to the training loop: Obviously copying the arrays around and converting them like that would slow things down, but for debugging purposes, it looked solid. 
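Here is a minimal sketch of the parameter-hashing idea, under some loudly stated assumptions. It is not the code from this post: the parameters are held in a flat dict of NumPy arrays (a JAX array can be converted with `np.asarray`), the names `params_digest`, `sgd_step`, `embed` and `head` are made up for the illustration, and SHA-256 from `hashlib` is swapped in for a per-run hash so that the digest is also stable between runs. The update step is the plain functional form, with new parameters computed from the old ones.

```python
import hashlib

import numpy as np


def params_digest(params):
    """Return one hex digest covering every array in a flat dict of parameters."""
    h = hashlib.sha256()
    for name in sorted(params):  # fixed order so the digest is reproducible
        arr = np.ascontiguousarray(params[name])
        h.update(name.encode("utf-8"))
        h.update(str(arr.dtype).encode("utf-8"))
        h.update(str(arr.shape).encode("utf-8"))
        h.update(arr.tobytes())  # raw bytes: any changed bit changes the digest
    return h.hexdigest()


def sgd_step(params, grads, lr):
    """Functional update: build new parameters instead of mutating the old ones."""
    return {name: p - lr * grads[name] for name, p in params.items()}


# Hypothetical stand-ins for an embedding matrix and an output head.
params = {
    "embed": np.linspace(-0.1, 0.1, 8, dtype=np.float32).reshape(2, 4),
    "head": np.linspace(0.05, 0.1, 8, dtype=np.float32).reshape(4, 2),
}
grads = {name: np.full_like(p, 1e-3) for name, p in params.items()}

before = params_digest(params)
unchanged = params_digest({k: v.copy() for k, v in params.items()})
after = params_digest(sgd_step(params, grads, lr=1.4e-3))

print(before == unchanged)  # identical values give an identical digest
print(before == after)      # a tiny update gives a different digest
```

One caveat with this kind of check: in float32, an update much smaller than the spacing between adjacent representable values near a parameter (roughly 1.2e-7 for values around 1.0) rounds away entirely, so an unchanged digest means "no representable change" rather than "no update was attempted".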
I kicked off the training loop, and the problem was clear: ...and so on. The hashes were not changing, so the model's parameters were not being updated, even by a tiny amount. Gotcha! The problem turned out, as I had suspected, to be related to the in-place updates that NNX does. Like I said earlier, I'd based my training loop on the "Basic Usage" example on the Flax site -- but I'd messed up one important thing. I had this: ...and they had this: You can see a number of differences -- for example, they're baking the inputs and targets into the lambda they're using for the loss function through a lexical closure, and that means that they're only passing in the model to the version of it wrapped in . But none of that matters! The real difference is actually nicely highlighted with a comment, but I'd completely managed to miss it. Right at the start, where I had , they had this: It 100% makes sense that in order to support this kind of non-functional, in-place updating of the model's parameters, you have to have a modified version of the JIT decorator. And I was just using the standard, functional pure-JAX one. Fixing that fixed the problem: The hashes were changing! And even better, if you scroll to the right you'll see that loss was slowly dropping. After 10k or so iterations, I was seeing 0.000: I had my do-nothing "LLM" working. A satisfying debugging journey -- and while I don't think I'll make this specific mistake in the future, I think that the parameter-hashing trick is actually a really useful trick for the toolbox. If you're uncertain as to whether your parameters are being updated, just looking at them probably won't help. But looking at their hashes can help you find out whether anything is changing. And I think that the pattern that I used to zoom in on it is a useful one, too. I always track loss, so it's a good starting point (indeed, seeing that it wasn't falling was what told me that something was going wrong). 
But checking that it has a sane -- or ideally, as in this case, a meaningful -- value is a nice sanity check that we have a working loss function and a model that isn't doing something completely pathological. Moving on from there to checking that some kind of gradients are flowing through is a solid next move (and might become increasingly interesting with deeper models where they can vanish or explode). Then finally we can check the parameters -- in particular, are they changing?[2] Let's see how many new tricks I pick up as I work through this LLM project.

[1] I always forget that exists -- I could have used that instead, and kept the JIT. ↩
[2] Something's slightly broken in my brain and I keep reading that as "is our parameters changing" in George W. Bush's voice. Maybe I can stop that from happening by inflicting it on my readers instead. You're welcome. ↩
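The loss sanity check from earlier in this post is easy to reproduce. This small sketch uses only the two numbers quoted in the post (the GPT-2 vocab size of 50,257 and the stuck loss of 10.82); the variable names are my own. A model that spreads probability evenly over the vocabulary has a cross-entropy of ln(vocab size), and perplexity is just e raised to the loss.

```python
import math

GPT2_VOCAB_SIZE = 50257  # vocab size quoted in the post

# Cross-entropy of a model that assigns equal probability to every token.
uniform_loss = math.log(GPT2_VOCAB_SIZE)

# Perplexity is exp(loss); here applied to the stuck loss value from the post.
stuck_perplexity = math.exp(10.82)

print(f"uniform-guess loss: {uniform_loss:.4f}")
print(f"perplexity at loss 10.82: {stuck_perplexity:.0f}")
```

The first number rounds to 10.82 and the second comes out at about 50,011, which is why a loss pinned at that value points to a model that has not learned anything yet rather than to a broken loss function.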

Unsung Today

Clicking, fast and slow

In iPhone’s accessibility settings you can choose the allowed speed of double- and triple-taps on its side button (why is it important? we talked about it once), and the interface does something nice – after you make a choice, it shows the expected speed in a sort of a preview: To be honest with you, I was surprised that I liked it. This feels like it’d be a perfect example of cheapness, especially given the iPhone has this delightful animation that could be reused here: But, I don’t know. Somehow, this one feels like it’d be too complicated. Maybe cheap is okay if one cannot think of a better “bespoke” interface? Cheap here also has an added benefit of reusing existing patterns, which might feel nicer in the more utilitarian surroundings of settings. But my favorite thing that elevated this was that with each visual blink there is also an accompanying haptic buzz. I think this is really clever. A haptic buzz is much “closer” to your fingers than onscreen blinking, and can help you feel the speed rather than just see it. Unfortunately, the same clever preview is not present here in the otherwise very similar AirPods menu… …and I also found myself wondering what it would take for it to make its way here as well: #apple #ios

Unsung Today

I can’t stop watching Bret Victor’s talks

You might have seen Bret Victor’s 55-minute Inventing On Principle talk soon after he gave it in 2012. If not, you should check it out. If you did, you should check it out again and see how it makes you feel today: It is about interactions but in the service of something grander, which (if I’m doing my job well) you might recognize as Unsung’s core theme. Victor – a designer, researcher, and computing historian – gave a few other talks in the few years since, and I thought a little guide might be helpful:
- Media For Thinking The Unthinkable (40 mins) is the continuation, with slightly more academic examples.
- Drawing Dynamic Visualizations (35 mins) is specifically about information visualization, chiefly a demo of the “Illustrator, but programmatic” tool shown briefly in the above talk. There’s also a bit more theory.
- Similarly, Stop Drawing Dead Fish (53 mins) is a demonstration of a different programmatic tool to make animations.
- The Humane Representation of Thought (56 mins) presents more theoretical underpinnings to the Inventing On Principle talk.
- The Future of Programming (32 mins) is a – mostly static/traditional – history lesson about what programming could be.
There are some wonderful repeating themes in there:
- We can do and expect better from computing and interactions.
- You have to know your history to march confidently toward the future.
- Ideas need an environment that nurtures them. Playful environments lead to more discoveries.
- Feedback doesn’t just have to happen. It has to happen immediately and comprehensively.
- There are no left-brained and right-brained people, but our brains have two different modalities: language (algebra) vs. spatial (geometry).
- A big emphasis on two-handed operation (kind of like Fontificator just yesterday).
I love this blend of theory and practice, inspiration and pragmatism, high- and low-level. The tools look surprisingly professional for research projects, but underlying their microinteractions is a deep philosophical stance. It all reminds me a bit of Jef Raskin and Doug Engelbart. Victor’s last talk of this era is Seeing Spaces (15 mins) from 2015, serving as a sort of introduction of him moving toward computing in physical spaces. As far as I understand, Victor has been spending time on Dynamicland since, which is definitely more physical computing, but also a lot more academic and scrappy, and as such out of range for this blog. (His website is worth checking out, especially if you’re not in the mood for talks and would like to get to know his work in a different way.) #conference talk #flow #interface design #youtube


10Gb/s Ethernet: switching to a Broadcom SFP+ module

Back in April , I upgraded my home LAN to 10Gb/s. The in-wall cabling is CAT-6 or similar, so I had to use 10GBASE-T. Now, the router I'm using, and the switch in my study, provide 10Gb/s through SFP+ cages; that meant that they needed 10GBASE-T SFP+ modules in order to connect. That kind of module is known to run hot -- sometimes too hot to actually work. The modules in , the router, appeared to be running OK (see the linked post above for charts), but the one in , the study switch, was a worrying 93C. I tried sticking some mini-heatsinks on it , which seemed to help a bit. But the weather got warmer, and eventually the module overheated. I lost access to the Internet from the study, and checking the metrics showed me this: You can see that it's "flapping": the temperature gets up to a level where the module shuts itself down for its own protection -- about 95C, I think -- and then when it has recovered, it switches on again, the temperature rises, and the process repeats. I was able to work around the problem by switching on the air conditioning in the study. But normally I only have it on when I'm in there, and keeping aircon on 24/7 just to keep the network working felt like the wrong solution. It was time to switch to a more power-efficient SFP+ module. My original 10Gb/s post had quite a lot of discussion on Hacker News , and mentioned that there are two generations of 10GBASE-T SFP+ modules: old ones using a Marvell chip, and newer ones using one from Broadcom. on the ServeTheHome forums made the same point. The Marvell-based ones were known to run hot, and they both recommended finding Broadcom-based ones. I'd confirmed that the MikroTik S+RJ10 that I had in was indeed a Marvell one, so the solution was pretty simple: get a better one. So I went on Amazon and picked up a 10Gtek ASF-10G-T80-INT . 
Checking 10Gtek's own page on that module confirmed that it used the right kind of chip (although it was a little bit garbled): 10Gtek's ASF-10G-T80 is a newest version copper transceiver, its biggest feature are ultra lowpower consumption and longer transmission distance (1.6W C10Gbps 30m,2.0W 110Gbps 80m). ASF-10G-T80 is a 10GBase mult-rate Copper RJ45 SFP+ transceiver, designed in with BROADCOM BCM84891 PHY chip following IEEE 802.3an/az and SFP+ MSA, supporting up to 80-meter transmission over CAT.6a or CAT.7. A day or two later, it arrived. It came in a rather pretty little metal case: Installing it took a little while, because I found removing the existing MikroTik module tricky; Willie Howe's video on YouTube helped quite a lot in showing how to disengage the latch, but I still needed to fiddle around with it quite a bit to get it out. However, that was eventually done, and the new module went in. I plugged all of the network cables back in, switched on the switch, and (after a slightly nerve-wracking wait for it to boot up) the network was back up and running! So, were the temperatures any better? I checked my monitoring, and: Huh, nothing was being reported. That made sense, though. The way I was charting those numbers was that the switch exposed them over SNMP, and then the Telegraf daemon on my router, , read the numbers and sent them to InfluxDB ; finally, Grafana did the charting. I'd been reading the module temperatures in using the SNMP OID that I'd identified that the switch was providing them on ( if you're interested), but perhaps the new module was published on a different OID. It was time to log in to the switch and take a look. 
It's saying that it's an Intel module; that in itself is not all that odd -- there are frequently compatibility issues between switches and SFP+ modules, so sometimes modules are configured to "lie" about which manufacturer made them -- and I'd specifically bought the "Intel-compatible" one on Amazon, the , because I couldn't find one that pretended to be MikroTik. Research had suggested that it would work OK, and it did. But the really odd bits were these: Not only was it impersonating an Intel module -- it was saying that it was a fibre-optic one ! Perhaps if I had found the "MikroTik-compatible" option it would have been better -- though, equally, it might have just impersonated a MikroTik fibre module anyway. Anyway, it was working -- so that was OK. But there was some bad news. If the switch was able to read a temperature from the new module, then you'd expect it to appear in that output, as . So, sadly, I don't think I'll be able to monitor the temperature of the new module. How could I tell whether it had helped, then? Well, one thing would be to simply see if there are any further instances of network flapping. I actually did the replacement just over two weeks ago, and everything has been fine as far as I can tell from using it and from the other monitoring (despite another hot week last week). But another interesting metric is the CPU temperature for over the two weeks before and after the module change: You can see that there was a real drop-off late on 1 June, when I switched the modules, and it has been running about 5C cooler since. Of course, there's a lot that's different about the new module -- as well as having a different chipset and a mendacious EEPROM, it's likely to have different thermal coupling characteristics -- it might be shedding more or less of its heat to the SFP+ cage and thence to the switch's CPU. So it's not proof of anything, but in combination with the improved link stability, I'll take it as a win. 
So, an interesting little excursion into the world of SFP+ modules -- in particular, slightly dodgy ones :-) Let's see if this one holds up better as we go through the toasty Lisbon summer.


Armenian Shorthand System from 1888!

It turns out, Armenian has at least one stenography system. The one designed in 1888 by a monastic order from Venice! Although it’s imperfect, it’s a nice historic rarity.


LightDSA: Enabling Efficient DSA Through Hardware-Aware Transparent Optimization

LightDSA: Enabling Efficient DSA Through Hardware-Aware Transparent Optimization Yuansen Wang, Teng Ma, Yuanhui Luo, Dongbiao He, Zheng Liu, and Yunpeng Chai EUROSYS'26 This paper describes performance characteristics of the Intel DSA hardware accelerator, and software techniques to maximize performance when using DSA. My takeaway is: the DSA supports a variety of convenience features, but each one is so expensive that you are better off adding software complexity to avoid these paths. The Data Streaming Accelerator ( DSA ) is a hardware accelerator in recent Intel chips. It can implement simple memory operations like , , , and CRC generation. Fig. 2 contains a high-level diagram of the DSA architecture. Source: https://dl.acm.org/doi/10.1145/3767295.3769356 Operations are written into work queues as 64-byte descriptors. A work descriptor (WD) describes a single operation, whereas a batch descriptor (BD) references many work descriptors. A batch is the fundamental unit of control (the DSA signals the CPU when a batch has completed). The DSA contains multiple engines and arbiters to spread work across the engines. Source and destination buffers accessed by the DSA do not need to be pinned into memory, the DSA can handle page faults. The DSA supports demand faulting via the page request service (PRS). This enables the DSA to send an interrupt to the OS (via the IOMMU) requesting the OS to resolve the fault. This paper reports a similar finding to a previous paper looking at the PCIe page request interface: demand faulting is convenient, but slow. The LightDSA authors recommend that software forcibly fault in pages before submitting descriptors. The DSA supports operations that require unaligned reads and writes of data from/to DRAM but the authors find that 64-byte aligned accesses are much faster. Fig. 8 has some numbers, even 32-byte alignment is expensive (compare the light green and dark green bars). 
Source: https://dl.acm.org/doi/10.1145/3767295.3769356 The authors recommend having software ensure that all writes performed by the DSA are 64-byte aligned. Software can do this by executing the operation for the first few bytes of each task, up until the destination buffer is 64-byte aligned. Like many HW/SW interfaces, the DSA writes both result data and metadata to memory. Result data is associated with each work descriptor, while completion metadata is associated with each batch descriptor. Metadata is read by software to learn when an operation completes. In such a scheme, it is important that software observes the batch metadata write after the result writes have completed. If the metadata write can land first, then software may try to read the result buffer before it has actually been updated. The DSA supports multiple traffic classes (TC). As with discrete PCIe accelerators, writes from the DSA associated with the same traffic class will land in host memory in order (these are posted writes ). However, writes associated with different TCs may be reordered. Here is a previous paper that describes performance problems with reordering. Section 3.8 of the DSA architecture specification describes two choices that software developers have. Either they should configure work descriptors and batch descriptors to use the same traffic class, or they should configure the DSA to enforce ordering via the (readback) flag. When that flag is set, the DSA will ensure that all result writes have landed in host memory by issuing a read request to read the most recently written result data back to the DSA, waiting for the response to come back, and then issuing metadata writes associated with the batch descriptor. Discrete PCIe devices can use the same trick to enforce ordering across traffic classes. Fig. 
7 shows the performance cost of using this feature: Source: https://dl.acm.org/doi/10.1145/3767295.3769356 My takeaway is that DSA users should ensure that work and batch descriptors use the same traffic class, to avoid having to invoke this slow read-back path. Because the DSA contains multiple engines, tasks can complete in a different order than the order in which they are submitted. This is fine in itself, but the authors note that software must take care to efficiently support allocating work and batch descriptors in light of this. Time spent bookkeeping to handle out-of-order completion is overhead that adds up for small tasks. The solution proposed by this paper is for software to maintain two batch descriptor lists (free, and busy). When software needs to recycle descriptors from the busy list to the free list, it checks most (but not all) batch descriptors in the busy list to see if the hardware has completed the batch. This is in contrast to an approach which simply checks to see if the oldest-submitted batch has completed. The paper finds that it is optimal for the recycling process to ignore the 25 most recently submitted batch descriptors but check the completion status of all other outstanding batches. Figs. 12 and 13 compare the performance you can expect to see from using DSA naively versus using the techniques described in this paper (LightDSA). My takeaway is that the DSA is powerful, but only if you use it carefully. Source: https://dl.acm.org/doi/10.1145/3767295.3769356 Dangling Pointers I suspect the elevator pitch for DSA is something like: “just re-compile your existing C/C++ code and all of the memcpy/memcmp time will be optimized out”. It seems like DSA falls short of that. I wonder if the elevator pitch would be better realized if application code was written in other languages (like an explicitly pipeline parallel language).
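Two of the software-side techniques summarised above lend themselves to a short sketch: handling the unaligned head of a buffer in software, and recycling batch descriptors while skipping the most recently submitted ones. The Python below is illustrative only and assumes a lot. Addresses are plain integers, descriptors are plain indices, the names `split_for_alignment` and `BatchRecycler` are made up, and `is_complete` stands in for whatever check real software would make on a batch's completion record. It is not the paper's implementation and not an Intel API.

```python
from collections import deque

CACHE_LINE = 64  # alignment the summary above recommends for DSA writes


def split_for_alignment(dest_addr, length, align=CACHE_LINE):
    """Return (head_bytes, body_bytes).

    The head is the part the CPU would handle itself so that the body,
    which would be offloaded, starts on an aligned destination address.
    """
    head = min(length, (-dest_addr) % align)
    return head, length - head


class BatchRecycler:
    """Free/busy bookkeeping for batch descriptors (hypothetical structure)."""

    def __init__(self, num_descriptors, skip_recent=25):
        self.free = deque(range(num_descriptors))
        self.busy = deque()  # oldest submission on the left
        self.skip_recent = skip_recent

    def submit(self):
        """Take a free descriptor and mark it busy (IndexError if none are free)."""
        desc = self.free.popleft()
        self.busy.append(desc)
        return desc

    def recycle(self, is_complete):
        """Free every completed batch except the newest `skip_recent` ones,
        which are not checked at all. Returns how many were freed."""
        n_check = max(0, len(self.busy) - self.skip_recent)
        checked = [self.busy.popleft() for _ in range(n_check)]
        still_busy = []
        for desc in checked:
            if is_complete(desc):
                self.free.append(desc)
            else:
                still_busy.append(desc)
        # Put unfinished older batches back at the front, keeping their order.
        self.busy.extendleft(reversed(still_busy))
        return len(checked) - len(still_busy)


# A destination 3 bytes past a 64-byte boundary needs a 61-byte software head.
print(split_for_alignment(4096 + 3, 200))  # (61, 139)

recycler = BatchRecycler(num_descriptors=40)
for _ in range(30):
    recycler.submit()
# Pretend the even-numbered batches have finished; only the 5 oldest are checked.
freed = recycler.recycle(lambda desc: desc % 2 == 0)
print(freed, len(recycler.busy))
```

The point of skipping the newest entries is that they are the least likely to have finished, so polling them is mostly wasted bookkeeping; the value 25 is the one the summary above reports, and the best setting presumably depends on the workload.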

Martin Fowler Yesterday

Building Reliable Agentic AI Systems

One of the most interesting projects my colleagues have done with LLMs has been building a system with Bayer to allow pharmaceutical researchers to query decades of information about studies buried in PDF reports. Sarang Sanjay Kulkarni describes its evolution from keyword-based search to an intelligent research assistant capable of answering complex questions and drafting regulatory documents.

fLaMEd fury Yesterday

Create A Static Site Using 11ty & Deploy to Neocities (2026 Refresh)

What’s going on, Internet? Way back in 2022 I wrote a guide on building a static site with 11ty and deploying it to Neocities . It’s been one of my most-read posts, but it’s also aged: Eleventy has moved to v3 with a brand new module system, the dev server changed, and my whole workflow has shifted away from GitHub toward Forgejo and Codeberg . So here’s the refresh. I haven’t hosted my own site on Neocities for years now, but it’s still home to a huge community of personal sites and homepages, especially folks in the 32-Bit Cafe , so this guide is still very much for them. This guide aims to help you create a homepage using the static site generator (SSG) 11ty , keep the code in version control, and deploy it to Neocities , first by hand, then automatically. The homepage that we are creating will take advantage of the Nunjucks templating language, allowing us to create a shared header, navigation and footer across all the pages on our homepage. We will be creating an about, links, and contact pages before diving in and creating the ability to add a blog and a list of all blog posts on the blog page! We will structure and style the page with a standard HTML5 boilerplate and some basic CSS that should allow you to add in your unique flavour that we all know you love to do. This guide assumes the following: First off, from a terminal, confirm that you have Node and NPM installed: Create a new directory and cd into it: Initiate a new project: Install 11ty: Once the 11ty installation is complete, open the project in your favourite code editor: You should now be in VSCodium with the following project structure: Open and update the scripts section to the following: We also need to tell Node that this is an ESM project. Add to . The file should look like this: The line lets us use modern / syntax in our config and JavaScript files. The script lets us run to serve our homepage with hot-reload, provided by Eleventy's built-in dev server. 
Every time you save a change in VSCodium, the browser reloads with your most recent changes, amazing! From the terminal (or VSCodium), create a new file at the project root: Open the file in VSCodium and add the following and save: This configuration file tells 11ty what to do. Setting the directory to tells 11ty where to look for changes, this is our working directory. When changes are detected, 11ty builds the site and outputs it to the directory which is where the static html/css/img files are served from, amazing! As we’re going to be keeping our homepage code in version control, create a file in the project root: Open the file in VSCodium and add the following and save: The .gitignore file is a text file that tells Git which files or folders to ignore in a project. In this case, our file tells git to ignore the directory and the directory where our static files are built locally. Now comes the fun part, building our homepage. 11ty supports a number of templating languages, but the two you’ll reach for most are Markdown and plain HTML. Markdown is the popular choice for content like blog posts: you just write, without tags getting in the way. HTML is handy when you need precise structure. The best part is you can drop HTML straight into a Markdown file and 11ty renders it correctly, so it’s never one or the other. For the pages that make up the site’s structure (home, about, links, contact) we’ll use HTML, because it maps neatly onto the layouts and partials we’re about to build. When we get to the blog, we’ll write the posts in Markdown, where it shines. Use whichever fits the job. Create a directory at the project root and cd into it: Create an file in the terminal or VSCodium: Open the file and add some content: Now from the terminal start 11ty: If everything has been configured right so far you should see the following: Now you can open up and check out your new 11ty homepage! It should look like this: A Basic Hello World HTML Page Amazing! 
But what we want to avoid is having to write out the and and tags on each and every page, and we want to be able to include a site header, navigation and footer so we don’t have to copy and paste the changes across every page each time we update. Let’s check out templating a layout! Create a new directory in the directory and cd into it: Create a file in the terminal or VSCodium: Open the file and add the following: We've created as a Nunjucks template file, hence the file extension. This means we can use Nunjucks' double curly braces for using front matter variables. In our layout template we're calling and . Now, head back to the file you created earlier, delete the contents and add some front matter and some content: If you’ve kept 11ty running and the browser running it should look like this: A Basic Hello World HTML Page Using a Template Amazing! Now let’s create the additional pages for our homepage. Create the following pages in the directory with the terminal or VSCodium: Open each of them up and add in some front matter and content: about.html: links.html: contact.html: You should now be able to browse each of these pages if you kept 11ty running on the following URLs: Great stuff, but that’s no use without a navigation! Let’s take a look at and create a shared , , and to bring our homepage together. In the terminal cd into and create three partial files: Open each of them up and add some content: header.njk: navigation.njk: footer.njk: Once our partials are created, open again and update it to include our new elements and partials: If you’ve kept 11ty running and the browser running it should look like this: A Basic Hello World HTML Page Using a Template and Partials Amazing! Now let’s add the blog. Blog posts are mostly prose, so this is where Markdown earns its keep. We’ll write the posts as files and let 11ty turn them into pages. 
Create a new directory in the directory and cd into it: Create the following files in the directory with the terminal or VSCodium: Awesome! Open each of them up in VSCodium and add the following: my-first-post.md : my-second-post.md : my-third-post.md : We’d better create a blog layout so it renders! Head back to the directory to create a new layout file: Open up in VSCodium and add the following: Check that your blog posts are loading: Amazing, right? But to make it a blog, we need a blog page that lists all of our blog posts. We can do this with a collection: Open again and add a key called with a value of : Now 11ty has created a collection called and all we have to do is list it. Head back to the directory and create a file: Open it and add the following: If you’ve kept 11ty and the browser running, it should look like this: A Basic Blog List Page Amazing, huh? Great, so far we have a fully functional homepage, but it doesn’t look quite right. We need a stylesheet. You can use the one below as an example (it’s basic styling with some modern techniques), or just throw in your own! Create a new directory in , cd into it and create : Open in VSCodium and add the following: styles.css: Now we need to include the stylesheet in our layout file. Open it up and add to the : _includes/base.njk: You’ll have noticed that the stylesheet hasn’t been applied. We have to do one more thing in , something called passthrough file copy. Open in VSCodium and add the following: Because this will come up, we may as well create the directories and add in the configuration for our images, fonts and JavaScript files. Create the following directories in : Update again: Just make sure you put all your static files in the appropriate directory and you’ll be good. So finally, if you’ve kept 11ty and the browser running, it should look like this: A Nicely Styled Homepage Yours will look a little different depending on the colours and fonts you chose above.
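To make the passthrough copy step above a bit more concrete, here is a hedged sketch of how the config could register static directories. `addPassthroughCopy` is Eleventy's method for copying files to the output untouched; the directory names (`src/css`, `src/img`, `src/fonts`, `src/js`) and the `src` / `_site` pair are assumptions for illustration, not necessarily the names the guide uses.

```javascript
// Hypothetical Eleventy config with passthrough file copy.
// All directory names below are assumed for illustration.
function configureEleventy(eleventyConfig) {
  // Copy static asset folders to the output without processing them.
  for (const dir of ["css", "img", "fonts", "js"]) {
    eleventyConfig.addPassthroughCopy(`src/${dir}`);
  }

  return {
    dir: {
      input: "src",
      output: "_site",
    },
  };
}

module.exports = configureEleventy;
```

With an input directory of `src`, Eleventy drops that prefix when copying, so `src/css` would end up as `css` inside the output directory.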
Now we have a homepage we’re happy with, let’s get it online. There are two ways to get your site onto Neocities. We’ll start with the simplest, pushing it from your terminal by hand, then automate it so a deploy happens every time you commit. Whichever method you choose, first build a fresh copy of your site: This writes the finished HTML, CSS and assets to the directory. That’s the folder we deploy. Neocities provides a command-line tool that lets you push your site straight from your terminal. It’s a Ruby gem, so you’ll need Ruby installed. The first time you run a command it’ll ask for your Neocities username and password, then store an API key locally so you don’t have to log in again. Push the contents of your directory: That’s it, your homepage is live. For a lot of people this is all you need. Build, push, done. Pushing by hand is fine, but it’s even nicer to have your site rebuild and deploy itself every time you commit a change. We can do that with Forgejo Actions , the built-in CI for Forgejo. If you self-host Forgejo this runs on your own runner; if you don’t self-host, Codeberg offers the same thing (more on that below). First, push your project to a repository on your Forgejo instance. Then grab your Neocities API key from your account settings (Manage Site Settings → API Key) and add it to your repository as a secret named (Repository → Settings → Actions → Secrets). Now create a workflow file at : A few things to note in this workflow: Commit and push the workflow file. From now on, every push to rebuilds your site and deploys it to Neocities automatically. If you don’t run your own Forgejo instance, Codeberg is a free, community-run home for your code and runs the very same Forgejo Actions. The workflow file above works as-is. Push your project to a Codeberg repo, add the secret in the repository settings, and you’re away. You may need to enable Actions for your repository first; see the Codeberg CI documentation for details. 
Already have a homepage you’ve been hand-coding on Neocities? You don’t have to start from scratch. Eleventy is happy to take what you’ve got and slot it into this structure. Copy each existing page into (your old becomes , and so on). Then move the parts every page repeats, the , header, nav and footer, into and the partials you built earlier. Delete that boilerplate from each page and add a little front matter at the top: Whatever’s left in the file is just that page’s own content, and the layout wraps it. Your CSS goes in , images in , and fonts in . The passthrough copy we set up earlier ships them straight to . If a page is mostly writing, paste the body into a file instead of . Any fiddly HTML, like an embed or some custom markup, can stay exactly as it is and 11ty will render the Markdown around it. Run , check looks the way you expect, then push it live with the Neocities CLI or your Forgejo Actions workflow. Same site you already had, now with layouts, partials and a build step doing the repetitive work for you. Reference: I created the original version of this guide based heavily on these existing guides, and they’re still well worth a read: Create Your First Basic 11ty Website and Itsiest, Bitsiest Eleventy Tutorial. Without these, I wouldn’t even know how to write down what I needed to. Hey, thanks for reading this post in your feed reader! Want to chat? Reply by email or add me on XMPP, or send a webmention. Check out the posts archive on the website.
You have a basic understanding of HTML and CSS
You have a basic understanding of the command line and terminal
You have Node.js installed (version 18 or newer)
You're using VSCodium as your editor
You have a Neocities account
You have somewhere to keep your code: a Forgejo instance or a Codeberg account
http://localhost:8080/blog/my-first-post/
http://localhost:8080/blog/my-second-post/
http://localhost:8080/blog/my-third-post/
picks the runner label. This is the default on Forgejo and Codeberg.
Actions are referenced by their full URL. The checkout and setup-node actions come from , so we stay off GitHub for those.
The deploy step uses , which is hosted on GitHub. We're only using it. Your code still lives on Forgejo or Codeberg.
The option removes remote files that aren't in your new build, the same as on the CLI.

Stratechery Yesterday

Fox Buys Roku, The Problem With Fox’s Smart Strategy, Streaming That Works

The market hates Fox's acquisition of Roku, but the company is trading extraction from rights holders for leverage as a renter.

neilzone Yesterday

Speeding up static site generation with BSSG

Three months ago, I moved from Hugo to BSSG for this blog (and my work blog). You can get BSSG here. I’ve been really happy with BSSG, and a couple of recent changes by Stefano have made it even better. I have a minimalist blog. A list of posts on the front page, and generally text-only posts. I like it to load fast even though it is running on a Raspberry Pi 4, along with a couple of other bits. This means that there are some features of BSSG that I do not use, including descriptions of blog posts. I use the title for that, on the basis that this should be informative in itself. It suits me, anyway. There are also some other UI elements that I do not need, such as reading time. I bodged my way around these, using CSS rules to hide the unwanted content from display. I could have changed the code to neither generate nor display them, but I didn’t really want to run, and need to maintain, my own branch. With the recent changes, Stefano added some new config options: These are set to “true” by default - to preserve the experience for people who already use BSSG and expect these things, which makes sense to me - but now I can set them to “false”, and have an even slicker, faster experience. The second brilliant change is about the way the scripts handle incremental updates. The idea is that, rather than building every post, every time, it will just build the new posts. I struggled to get this to work initially, as it was building all posts, every time. This turned out to be entirely down to me: my build script, which I use to control building and deploying both the cleartext and .onion versions of the blogs, cleared the output directory each time. I removed that, and bingo, incremental updates! This combination of things meant that building each site went from ~10 minutes (which was a bit painful) to ~1 minute (which is fine!). Happy days.

Kev Quirk Yesterday

📝 2026-06-16 08:11: Sun's out, so there's only one way to travel to the office... ☀️

Sun's out, so there's only one way to travel to the office...

Kev Quirk Yesterday

📝 2026-06-16 07:55: Like we don't have enough animals already. This little black blob will be joining us...

Like we don't have enough animals already. This little black blob will be joining us in August.


Exclusive: OpenAI Losses Increased Nearly 8X in 2025, With Spending Hitting $34 Billion

Soundtrack: In Flames - Colony To further support my independent journalism, please subscribe to my premium newsletter. It’s $7 a month or $70 a year. If you’re subscribed to the free newsletter and logged in, you should see at the bottom right hand corner of your screen a little circle you can click, and you’ll be able to sign up for premium.  Today, I can exclusively report, based on audited financial documents viewed by this publication that have been independently verified by the Financial Times , that OpenAI lost around $38.5 billion in 2025, as well as other crucial details about the financial condition of the company.  Due to the seriousness of this story, I am not going to do very much editorializing, as the numbers speak for themselves. OpenAI’s financial statements tell the story of a company with incredible losses. Additional factors – including interest income and interest expense – left it with a net loss of $8.84 billion. It then marked $3.74 billion of losses as “net loss attributable to noncontrolling members capital,” leaving the net loss attributable to the company as $5.09 billion.  It’s unclear what this means, nor how OpenAI reconciled the removal of $3.74 billion in costs. I will not speculate further. Please note that 2025 was the year that OpenAI converted from a non-profit to a for-profit entity, leading to a $41.55 billion loss due to changes in fair value of convertible interests and warrant liability.  Taking into account other minor factors like interest income and interest expense, OpenAI is left with a net loss of $60.35 billion, which it lowered to $38.53 billion by removing $17.87 billion in costs via that “net loss attributable to noncontrolling members capital” and another $3.95 billion via a “net loss attributable to redeemable noncontrolling interests.”  Ultimately, the net loss attributable to OpenAI in 2025 was $38.5 billion.  At the end of the year, OpenAI had just over $50 billion in assets, with almost half of that in cash. 
In 2025, SoftBank paid OpenAI $867 million. Microsoft paid it $303 million. The documents revealed how much OpenAI paid Microsoft for services. In the 2025 calendar year, OpenAI paid Microsoft $10.59 billion for “Research and development” expenses. We believe this most likely refers to the cost of training OpenAI’s models. The documents also mention a $6.047 billion charge related to “cost of revenue,” a $527 million charge for sales and marketing, and $42 million in “general and administrative expenses.” In total, OpenAI’s expenses to Microsoft amounted to $17.2 billion. According to the figures, OpenAI had liabilities to Microsoft of $3.64 billion at the close of the calendar year, and an additional $21 million in “accrued expenses and other current liabilities.” The documents also mention a further $58 million in non-current liabilities. I intend to follow up this story in the next month with more in-depth reporting related to the documents. The documents are detailed, and I need time to fully parse them. Once I have done so, you’ll know. The financial condition of OpenAI is deeply concerning. Losses of $38.53 billion are astronomical, and far higher than most believed they would be. Losses also appear to be mounting year-over-year at a dramatic rate, and I’m not sure how this company finds a way toward any kind of sustainability or profitability. As discussed, I have not editorialized much today. I believe the best thing I can do for the general public is to deliver this news as plainly as possible. As I mentioned at the beginning, if you liked this piece, you should subscribe to my premium newsletter. It’s $70 a year, or $7 a month, and in return you get a weekly newsletter that’s usually anywhere from 5,000 to 18,000 words, including vast, detailed analyses of NVIDIA, Anthropic and OpenAI’s finances, and the AI bubble writ large.
My Hater's Guides To the SaaSpocalypse, Private Credit and Private Equity are essential to understanding our current financial system, and my guide to how OpenAI Kills Oracle pairs nicely with my Hater's Guide To Oracle.
Revenue: $3.7 billion; Cost of Revenue: $2.65 billion; Research and Development: $7.81 billion; Sales and Marketing: $1.11 billion; General and Administrative: $907 million; Total Costs and Expenses: $12.48 billion; Loss from Operations: $8.78 billion
Revenue: $13.07 billion; Cost of Revenue: $7.5 billion; Research and Development: $19.18 billion; Sales and Marketing: $5.73 billion; General and Administrative: $1.57 billion; Total Costs and Expenses: $34 billion; Loss from Operations: $20.92 billion

Unsung Yesterday

Fontificator

I thought about this the other day, and I thought it’d be fun to share this internal tool I made over a decade ago to aid with exploring options for Medium’s typographical redesign. It’s called Fontificator. You can play with Fontificator here (desktop browsers only), or watch the likely confusing video below: The motivation for building Fontificator came from two observations: font previews on type foundry sites were generally too limited to get a real sense of how a certain typeface feels, and it was best to see a font in situ; and often an extremely tiny nuance – like adding some letter spacing, or messing with line height – was what separated something that was promising from something that seemed very far from working. With Fontificator, I was aiming at this Doug Engelbart-esque notion of one hand on the keyboard + one hand on the mouse, and the UI where it was only necessary to point to an element, and the keys under your other hand would start working immediately – no clicking needed: F and G to change the font, – and + for font size, ← and → for letter spacing, ↑ and ↓ for line height, < and > for opacity (for all the above you can hold Shift for bigger moves), and, there are a few more shortcuts you can see at the top. This way, we could move really, really fast. To accommodate that, Fontificator always tried to keep the current item under the cursor by counter-adjusting scroll position as needed. On top of it all, a few more shortcuts: ⇥ and ⇧⇥ move very quickly between different types of stories so you can preview that, Space compares to the original/current version, 1–9 allow you to switch to different “slots” so you can have various presets ready to compare, Esc hides the toolbar for maximum immersion. You can also edit any text if you are so inclined, and also drag in any font file from your computer onto a paragraph – then that font becomes part of the F/G stack. (Bernino Sans and Freight Text were the starting fonts before the redesign.) On the left, you can also see a naïve mobile preview – there was also more sophisticated on-smartphone preview, but I removed it from this restored version. Fontificator was literally made for an audience of 2–3 designers (and perhaps 1–2 stakeholders in read-only mode), and it was surprising to me how quickly one could master this strange tool, have fun with it, and feel the entire typography on the page becoming much more malleable. We also put up a more “traditional” list of contenders on the wall… …but it was in Fontificator where we learned the most. I love internal UIs because they allow you to go very wild and very tactical. If you have one you’d be willing to share (maybe it, too, is on the other side of the statute of limitations?), or one you already wrote about or spotted someone else doing so, please let me know! #internal ui #typography

Lalit Maganti Yesterday

TIL: Iroh: peer-to-peer networking for app developers

I came across Iroh ( via , via ) today as it hit 1.0 and found it a really interesting solution to a problem I knew existed but had not thought a lot about. Judging from the comment sections, it seems pretty clear that lots of people are confused as to exactly what Iroh is. I don’t think their launch post does their product justice at all, and their tagline is “IP addresses break, dial keys instead” which sounds cool, but if you think about it for just a second, you’ll end up with lots of questions. The biggest one is: “so how is this different from a mesh VPN like Tailscale, ZeroTier, Netbird, etc.?” It’s only after reading a lot of developers’ comments on the threads that I feel I understand: Iroh is aimed at  application  developers who want to communicate P2P between machines running their app, while mesh networks are aimed at  network admins  who want to connect devices they own/manage together.


Lean, not backpressure

Lucas Costa has written a good article on how to build systems that can handle code-generating robots. Unfortunately, when calling it backpressure, he used the wrong metaphor. Backpressure is about signaling to upstream processes that they are running too fast and need to slow down. The suggestions presented by Costa are mostly about signaling to the upstream process that it needs to do things differently, rather than just slow down. This has more to do with ensuring sufficient quality is sent downstream, rather than quantity. This irked me. As I was reading, I was searching for the right analogy. I kept coming back to lean manufacturing. The more famous half of the lean philosophy is waste reduction. The other half is about managing the unstable input of people. That’s what we’re interested in here. (Continue reading the full article on the web.)
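For readers less familiar with the term, backpressure in the strict sense described above can be sketched as a bounded queue that refuses new work when it is full. This is only an illustrative toy, not code from either article: the `BoundedQueue` class and its `offer` / `take` methods are hypothetical names. Note that the only signal it can send upstream is "slow down"; it says nothing about the quality of what was sent.

```javascript
// Toy illustration of backpressure: a queue with a fixed capacity.
// BoundedQueue, offer and take are hypothetical names for this sketch.
class BoundedQueue {
  constructor(capacity) {
    this.capacity = capacity;
    this.items = [];
  }

  // Returns false when the queue is full, telling the producer to back off.
  offer(item) {
    if (this.items.length >= this.capacity) {
      return false;
    }
    this.items.push(item);
    return true;
  }

  // The consumer removes the oldest item, freeing room for the producer.
  take() {
    return this.items.shift();
  }
}
```

A producer that gets `false` back from `offer` would wait and retry later, which is a signal about quantity only.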
