Latest Posts (20 found)

Notes on "Harness Engineering"

I find it useful and revealing to perform very close readings of engineering blog posts from frontier labs. They seem like meaningful artifacts that, despite their novelty, are barely discussed at any level except the surface one. I try to keep an open but keen mind when reading these posts, both trying to find things that don't make much sense when you think about them for more than a couple of seconds and to spot things that are clearly in the internal zeitgeist at these companies but haven't quite filtered out into the mainstream. And so I read Harness Engineering with this spirit in mind. Some notes:

It's disingenuous to make this kind of judgment without knowing more about the use case and purpose of the application itself, but the quantitative metrics divulged are astounding. The product discussed in this post has been around for five months. It contains over one million lines of code and is not yet ready for public consumption but has a hundred or so users. If you had told me those statistics in any other context, I would be terrified of what was happening within that poor Git repository — which is to say nothing of a very complicated stack relative to an application of that size. Why do you need all this observability for literally one hundred internal users? Again, there might be a very reasonable answer that we are not privy to.

Most of the techniques discussed in this essay — like using an index file rather than a monolith — have been fully integrated into the meta at this point. But there's one interesting bit about using the repository as the main source of truth, and in particular building a lot of tooling around things like downloading Slack discussions or other bits of exogenous data so they can be stored at a repository level. My initial reaction was one of revulsion. Again, that poor, poor Git repository. But in a world where you're optimizing for throughput, eliminating network calls and MCP makes a lot of sense — though I can't help but feel that storing things in a SQLite database or something a little more ergonomic would make more sense than flat files. 1 See insourcing-your-data-warehouse .

The essay hints at, but does not outright discuss, failure modes. They talk about the rough harness for continuous improvement and paying down of technical debt, as well as how to reproduce and fix bugs, but comparatively little about the circumstances by which those bugs and poor patterns are introduced in the first place. Again, I get it. It's an intellectual exercise, and I'm certainly not one to suggest that human-written code is immune from bugs and poor abstractions. But this does feel a little bit like Synecdoche, New York — an intellectual meta-exercise that demands just as much attention and care and fostering as the real thing. At which point one must ask themselves: why bother?

This was a fairly negative list of notes, and I want to end with something positive: I do generally agree with the thrust of the thesis. Ryan writes: "This is the kind of architecture you usually postpone until you have hundreds of engineers. With coding agents, it's an early prerequisite: the constraints are what allows speed without decay or architectural drift." I think this is absolutely the right mindset. Build for developer productivity as if you have one more order of magnitude of engineers than you actually do.

1 views

Carl Cox On Waiheke

What’s going on, Internet? Ferry ride over, no kids. Bus to Onetangi. Alibi for a late lunch. Picanha steak with seasonal vegetables. Unfortunately, they had some type of beer shortage, so their usual selection was limited to four tap beers. I enjoyed the Ruru Hazy, even though it was a hazy. Hopefully, they have the full range available next time I’m there.

The gig was a minute walk up the road at the Wild Estate. I was wondering how they would do the setup, and once I saw the fences and tents set up on the front lawns, it made sense. We got through check-in sweet as. The drinks were supplied by Pals. We grabbed a drink: a Purple Pals for my wife and a Frankie’s Cola for myself. We took a short walk around the venue to get the lay of the land and found a table to sit down at. There was one person there enjoying a pizza. We said hello and sat down. Shortly after, a couple approached and asked if the seats were free. Of course, come sit down. Let’s chat. Want another drink? Sure, let’s go. Friends were made. Another couple, two friends, sat down in the remaining seats. Hi, how are you? More friends.

Time to dance. We met up on the dance floor. A group of new friends dancing amongst the crowd to Nicole Moudaber before Carl Cox came on. Both sets were amazing and just what I wanted to hear on a Saturday afternoon. It’s pretty cool that Carl can play something like Awakenings Festival to hundreds of thousands of people, and then a month later play a small venue on Waiheke to a crowd of a thousand.

The gig started at 3pm and went until 9pm. Perfect timing for us. We decided to skip staying to the end of Carl’s set and grabbed the 8:11pm bus back to the ferry terminal. I think we made the right call, as the next boat back to the city was at 9:30pm. Sure, we had to wait at the ferry terminal, but we were at the front of the line and got a seat right away as the boat turned up. There were hundreds of people left waiting at the terminal for the next boat. We managed to get home and into bed by 11pm. Perfect timing for a good enough sleep before kids’ activities in the morning.

I miss nights out like these.

Hey, thanks for reading this post in your feed reader! Want to chat? Reply by email or add me on XMPP, or send a webmention. Check out the posts archive on the website.

0 views

some thoughts on online verification

I've been thinking about writing a post on the Discord age verification thing, but the entire situation is milked to death by content creators right now. Everyone feels the need to throw their conspiracy theories and misinformation into every comment section as well, so it just feels like a lot of noise and panic right now. I'll leave it at a retrospective write-up when the dust has settled and not add to the confusion.  What I feel like touching on instead is the history of age or name verification online. I've seen many people behave as if this is a new issue or an escalation, and while I understand the concerns, I feel like we shouldn't lose sight of the bigger picture. That's not meant to sugarcoat what's happening or make it seem more harmless, but point out that this has been going on for longer and is part of a bigger pattern. Thinking back on my time online, of course I also had to verify age to purchase games on PlayStation and Steam. But even nowadays, as I have no YouTube account, I get a pop-up that YouTube classifies me as a minor after a few videos. This didn't just start in 2025 when they started using AI to judge users' age; I remember the outrage when YouTube enabled age verification in the first place and asked adult accounts to submit an ID to prove their age. But did anything change? No. People did not leave the platform en masse. I also remember the start of Facebook's real name policy . This de-anonymized people or locked them out of their account unless they provided ID, and targeted ethnic groups a lot, as well as any people whose name on their documents doesn't match the name they go by. It's especially funny to read the justification of " authentic identity is important to the Facebook experience, and our goal is that every account on Facebook should represent a real person " when they are at the forefront of AI user profiles and chatbots right now. Even before and during all that, we have watched as sex workers, NSFW artists and queer people in general have had their accounts demonetized, removed, and payment providers discriminating against them and their platforms due to the general stigma and ideas of "protecting kids". But not many were willing to stand up against that because it surely wouldn't extend to the "respectable people", and only got rid of the people they didn't want to see. My point is: These things are older than the recent UK, Australia and select few US states legal mandates of age verification. Of course, just 'consuming content' in an age-restricted way is different than having direct communication hampered by age restriction and surveilled. Being aware that you are watched can lead to self-censorship. I am reminded of the German " Volkszählungsurteil ", which said (translated by me): “ Anyone who is uncertain whether deviant behavior is being recorded at any time and permanently stored, used, or passed on as information will try not to attract attention through such behavior. […] This would not only impair the individual’s opportunities for personal development, but also the common good, because self-determination is an elementary functional condition of a free democratic community that is based on the capacity of its citizens to act and to participate. From this it follows: Under modern conditions of data processing, the free development of personality presupposes the protection of the individual against unlimited collection, storage, use, and disclosure of personal data. 
This protection is therefore encompassed by the fundamental right in Article 2(1) in conjunction with Article 1(1) of the Basic Law. To that extent, the fundamental right guarantees the individual the authority, in principle, to determine for themselves the disclosure and use of their personal data. ”

Fear of constant monitoring leads to self-censorship and conformity, which harms both individual freedom and democratic participation. But how have we dealt with the knowledge that this is happening? Denial, ignorance, forgetting, defeatism, making memes about our FBI agent, pretending security by obscurity works, focusing on how it makes apps nicer to use, and pretending we have nothing to hide. I saw a YouTuber I like say that Discord surveilling every message for sensitive content or to guess your age is like sending all your messages to the FBI. That left me a little speechless. Unfortunately, it's like many haven't learned anything from the Snowden era. US intelligence is already allowed to almost freely collect data on you 1 , even as a non-US citizen - see FISA 702 bulk surveillance. Stuff like that is exactly why Safe Harbor and Privacy Shield failed, and why the current upholding of the EU-US Privacy Framework is a farce. This is the issue with no encryption. This is exactly why your privacy-conscious friends were leading you towards options that could be encrypted (and why governments everywhere wage a war on encryption). If you send something via unencrypted means, technically speaking, you must treat it as if you had consented to it being collected, compiled and evaluated, which sucks. It shouldn't be that way, but it is. Even I struggle with that! This is extremely uncomfortable, especially when most of us were only educated on this years into treating our data on services as private and safe, or when we were children who didn't know how to properly judge the consequences of our actions online and were surrounded by others who did the same thing.

This is also a boiling frog situation. You point out for years that the amount of data these giants collect on you is not okay. You advise people to go look into Google or Twitter settings and see what they are grouped as for targeted advertising, to show them exactly what data is collected as an eye-opener, and to turn stuff off. You advise people what services they could switch to. Instead, many people doubled down on it because the recommendations of the algorithm and ads are so good, having a home assistant like Alexa is so sci-fi and convenient, and a Ring camera and a pet camera are the pinnacle of home safety. The more private service is ugly, or doesn't auto-detect your music, or whatever other weird reason people can think of. Only now, with a US government becoming increasingly dangerous, do people seem to rethink it all - deleting some social media accounts, switching away from Google, getting rid of their Ring cameras and the like. The problem is: If you make decisions like that based on your current government, you aren't ready for the next one. If you allow intense data harvesting under a benevolent government, that dataset already exists for when fascists take power. You can point all you want towards countries where being gay or trans is illegal or where women cannot leave the house on their own and act as if this won't affect you; you and they are not so different, and very little actually protects you from that.
The safest option isn't to hope that the next institution to have access to intense amounts of data every couple years will not misuse it, but that they don't hold this level and amount of data to begin with. The same goes for companies: Even if you trust them now, differences in laws, leadership and profitability can change the circumstances. As a user, you're unlikely to be able to control them, you can only control yourself and your means to an extent.

Have you also noticed that 2025 seems to have been the year with the most "Wrapped"s so far? It felt like every app and service had a Wrapped ready for you - even period tracking software! Of course they are very fun to share and get to know your friends better and measure up against them, but they absolutely normalize being comfortable with this sort of surveillance. The mechanisms and data on which services like YouTube and Discord attempt to guess your age for verification are the same ones they use for advertising, the feed algorithm, the Wrapped and the auto-generated playlists you enjoy. So dare to look behind the fun facade and know what these things truly are. "delulu yearning girl dinner friday evening" is another way to present "20-25 years old", location and interests.

Reply via email Published 15 Feb, 2026

Every argument denying this is "they can't do that, that's illegal!" levels of convincing. There are so many intelligence laws, so much careful wording, and also so many internals we do not (yet) know about. It took whistleblowers to show some of it, and recent ICE news shows the tip of the iceberg with what law enforcement and intelligence is willing to do to ensure more surveillance - Palantir, Flock etc. ↩

0 views

Running `deezer/spleeter`

Here are up-to-date installation instructions for running Deezer's Spleeter on `Ubuntu 24.04`. Minimum requirements are around 16 GB of RAM. (During the processing, it uses around 11 GB at the peak.) I ran this on a temporary Hetzner server because my Apple Silicon system, after lots of fiddling with versions, ran into AVX issues.

Install Conda, then:

```
conda create -n spleeter_env python=3.8 -y
conda activate spleeter_env
conda install -c conda-forge ffmpeg libsndfile numpy=1.19 -y
pip install spleeter
spleeter separate -o audio_output input.mp3
```

If your a...
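If you'd rather drive the separation from a script than the CLI, Spleeter also ships a Python API. A minimal sketch under the same environment as above (the file names and the 2-stems model choice here are illustrative, not from the post):

```python
# Minimal sketch using Spleeter's Python API (same conda env as above).
# Input/output paths and the 2-stems model are example choices.
from spleeter.separator import Separator

separator = Separator('spleeter:2stems')            # vocals + accompaniment
separator.separate_to_file('input.mp3', 'audio_output')
# Expect audio_output/input/vocals.wav and audio_output/input/accompaniment.wav
```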

0 views
(think) Today

How to Vim: Many Ways to Paste

Most Vim users know `p` and `P` – paste after and before the cursor. Simple enough. But did you know that Vim actually has around a dozen paste commands, each with subtly different behavior? I certainly didn’t when I started using Vim, and I was surprised when I discovered the full picture. Let’s take a tour of all the ways to paste in Vim, starting with Normal mode and then moving to Insert mode.

One important thing to understand first – it’s all about the register type. Vim registers don’t just store text, they also track how that text was yanked or deleted. There are three register types (see `getregtype()`):

- Characterwise (e.g., a `yw` yank): `p` inserts the text to the right of the cursor, `P` to the left.
- Linewise (e.g., `yy`, `dd`): `p` inserts the text on a new line below, `P` on a new line above. The cursor position within the line doesn’t matter.
- Blockwise (e.g., a `Ctrl-V` visual block selection): the text is inserted as a rectangular block starting at the cursor column.

This is something that trips up many Vim newcomers – the same command can behave quite differently depending on the register type! With that in mind, here’s the complete family of paste commands in Normal mode:

| Commands | Direction | Notes |
|----------|-----------|-------|
| `p` / `P` | after / before | plain paste |
| `gp` / `gP` | after / before | leaves the cursor just past the pasted text |
| `]p` / `[p` | after / before | adjusts indentation to the current line |
| `zp` / `zP` | after / before | blockwise paste without trailing whitespace padding |

The “Direction” column above reflects both cases – for characterwise text it’s “after/before the cursor”, for linewise text it’s “below/above the current line”.

How to pick the right paste command? Here are a few things to keep in mind:

- The difference between `p`/`P` and `gp`/`gP` is all about where your cursor ends up. With `p` the cursor lands on the last character of the pasted text, while with `gp` it moves just past the pasted text. This makes `gp` handy when you want to paste something and continue editing right after it.
- `]p` and `[p` are incredibly useful when pasting code – they automatically adjust the indentation of the pasted text to match the current line. No more pasting followed by manual re-indenting! 1
- `zp` and `zP` are the most niche – they only matter when pasting blockwise selections, where they avoid adding trailing whitespace to pad shorter lines.

All Normal mode paste commands accept a count (e.g., `3p` pastes three times) and a register prefix (e.g., `"ap` pastes from register `a`).

In Insert mode things get interesting. All paste commands start with `Ctrl-R`, but the follow-up keystrokes determine how the text gets inserted. Let me unpack this a bit:

- `Ctrl-R` is the most common one – you press `Ctrl-R` and then a register name (e.g., `0`, `a`, or `+` for the system clipboard). The text is inserted as if you typed it, which means formatting options like 'textwidth' and auto-indentation apply. This can be surprising if your pasted code gets reformatted unexpectedly.
- `Ctrl-R Ctrl-R` inserts the text literally – special characters like backspace won’t be interpreted. However, 'textwidth' and auto-indent still apply.
- `Ctrl-R Ctrl-O` is the “raw paste” – no interpretation, no formatting, no auto-indent. What you see in the register is what you get. This is the one I’d recommend for pasting code in Insert mode.
- `Ctrl-R Ctrl-P` is like `Ctrl-R Ctrl-O`, but it adjusts the indentation to match the current context. Think of it as the Insert mode equivalent of `]p`.

Note: Plain `Ctrl-R` can be a minor security concern when pasting from the system clipboard (`+` or `*` registers), since control characters in the clipboard will be interpreted. When in doubt, use one of the literal variants (`Ctrl-R Ctrl-O` or `Ctrl-R Ctrl-R`) instead.

And that’s a wrap! Admittedly, even I didn’t know some of those ways to paste before doing the research for this article. I’ve been using Vim quite a bit in the past year and I’m still amazed how many ways to paste there are! If you want to learn more, check out `:h put`, `:h registers`, and `:h i_CTRL-R` in Vim’s built-in help. There’s always more to discover!

That’s all I have for you today. Keep hacking!

You can also use `]p` and `[p` with a register, e.g. `"a]p` to paste from register `a` with adjusted indentation. ↩
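If you want to see the register-type distinction for yourself, here is a small scratch demo of my own (not from the article) that uses only the built-in `setreg()` and `getregtype()` functions; the register names are arbitrary:

```vim
" Scratch demo: create registers with explicit types, then inspect them.
call setreg('a', 'word', 'c')                 " characterwise
call setreg('b', ['line 1', 'line 2'], 'l')   " linewise
call setreg('c', ['ab', 'cd'], 'b')           " blockwise

echo getregtype('a')   " => v        (characterwise)
echo getregtype('b')   " => V        (linewise)
echo getregtype('c')   " => ^V2      (blockwise, width 2)

" Now try "ap, "bp and "cp in Normal mode and watch the same key
" behave three different ways.
```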

0 views

Two different tricks for fast LLM inference

Anthropic and OpenAI both recently announced “fast mode”: a way to interact with their best coding model at significantly higher speeds. These two versions of fast mode are very different. Anthropic’s offers up to 2.5x tokens per second (so around 170, up from Opus 4.6’s 65). OpenAI’s offers more than 1000 tokens per second (up from GPT-5.3-Codex’s 65 tokens per second, so 15x). So OpenAI’s fast mode is six times faster than Anthropic’s 1 . However, Anthropic’s big advantage is that they’re serving their actual model. When you use their fast mode, you get real Opus 4.6, while when you use OpenAI’s fast mode you get GPT-5.3-Codex-Spark, not the real GPT-5.3-Codex. Spark is indeed much faster, but is a notably less capable model: good enough for many tasks, but it gets confused and messes up tool calls in ways that vanilla GPT-5.3-Codex would never do. Why the differences? The AI labs aren’t advertising the details of how their fast modes work, but I’m pretty confident it’s something like this: Anthropic’s fast mode is backed by low-batch-size inference, while OpenAI’s fast mode is backed by special monster Cerebras chips . Let me unpack that a bit. The tradeoff at the heart of AI inference economics is batching , because the main bottleneck is memory . GPUs are very fast, but moving data onto a GPU is not. Every inference operation requires copying all the tokens of the user’s prompt 2 onto the GPU before inference can start. Batching multiple users up thus increases overall throughput at the cost of making users wait for the batch to be full. A good analogy is a bus system. If you had zero batching for passengers - if, whenever someone got on a bus, the bus departed immediately - commutes would be much faster for the people who managed to get on a bus . But obviously overall throughput would be much lower, because people would be waiting at the bus stop for hours until they managed to actually get on one. Anthropic’s fast mode offering is basically a bus pass that guarantees that the bus immediately leaves as soon as you get on. It’s six times the cost, because you’re effectively paying for all the other people who could have got on the bus with you, but it’s way faster 3 because you spend zero time waiting for the bus to leave. Obviously I can’t be fully certain this is right. Maybe they have access to some new ultra-fast compute that they’re running this on, or they’re doing some algorithmic trick nobody else has thought of. But I’m pretty sure this is it. Brand new compute or algorithmic tricks would likely require changes to the model (see below for OpenAI’s system), and “six times more expensive for 2.5x faster” is right in the ballpark for the kind of improvement you’d expect when switching to a low-batch-size regime. OpenAI’s fast mode does not work anything like this. You can tell that simply because they’re introducing a new, worse model for it. There would be absolutely no reason to do that if they were simply tweaking batch sizes. Also, they told us in the announcement blog post exactly what’s backing their fast mode: Cerebras. OpenAI announced their Cerebras partnership a month ago in January. What’s Cerebras? They build “ultra low-latency compute”. What this means in practice is that they build giant chips . A H100 chip (fairly close to the frontier of inference chips) is just over a square inch in size. A Cerebras chip is 70 square inches. You can see from pictures that the Cerebras chip has a grid-and-holes pattern all over it. 
That’s because silicon wafers this big are supposed to be broken into dozens of chips. Instead, Cerebras etches a giant chip over the entire thing. The larger the chip, the more internal memory it can have. The idea is to have a chip with SRAM large enough to fit the entire model, so inference can happen entirely in-memory. Typically GPU SRAM is measured in the tens of megabytes. That means that a lot of inference time is spent streaming portions of the model weights from outside of SRAM into the GPU compute 4 . If you could stream all of that from the (much faster) SRAM, inference would get a big speedup: fifteen times faster, as it turns out!

So how much internal memory does the latest Cerebras chip have? 44GB. This puts OpenAI in kind of an awkward position. 44GB is enough to fit a small model (~20B params at fp16, ~40B params at int8 quantization), but clearly not enough to fit GPT-5.3-Codex. That’s why they’re offering a brand new model, and why the Spark model has a bit of “small model smell” to it: it’s a smaller distil of the much larger GPT-5.3-Codex model 5 .

It’s interesting that the two major labs have two very different approaches to building fast AI inference. If I had to guess at a conspiracy theory, it would go something like this:

- OpenAI partner with Cerebras in mid-January, obviously to work on putting an OpenAI model on a fast Cerebras chip
- Anthropic have no similar play available, but they know OpenAI will announce some kind of blazing-fast inference in February, and they want to have something in the news cycle to compete with that
- Anthropic thus hustles to put together the kind of fast inference they can provide: simply lowering the batch size on their existing inference stack
- Anthropic (probably) waits until a few days before OpenAI are done with their much more complex Cerebras implementation to announce it, so it looks like OpenAI copied them

Obviously OpenAI’s achievement here is more technically impressive. Getting a model running on Cerebras chips is not trivial, because they’re so weird. Training a 20B or 40B param distil of GPT-5.3-Codex that is still kind-of-good-enough is not trivial. But I commend Anthropic for finding a sneaky way to get ahead of the announcement that will be largely opaque to non-technical people. It reminds me of OpenAI’s mid-2025 sneaky introduction of the Responses API to help them conceal their reasoning tokens .

Seeing the two major labs put out this feature might make you think that fast AI inference is the new major goal they’re chasing. I don’t think it is. If my theory above is right, Anthropic don’t care that much about fast inference, they just didn’t want to appear behind OpenAI. And OpenAI are mainly just exploring the capabilities of their new Cerebras partnership. It’s still largely an open question what kind of models can fit on these giant chips, how useful those models will be, and if the economics will make any sense.

I personally don’t find “fast, less-capable inference” particularly useful. I’ve been playing around with it in Codex and I don’t like it. The usefulness of AI agents is dominated by how few mistakes they make, not by their raw speed. Buying 6x the speed at the cost of 20% more mistakes is a bad bargain, because most of the user’s time is spent handling mistakes instead of waiting for the model 6 . However, it’s certainly possible that fast, less-capable inference becomes a core lower-level primitive in AI systems. Claude Code already uses Haiku for some operations. Maybe OpenAI will end up using Spark in a similar way.

This isn’t even factoring in latency. Anthropic explicitly warns that time to first token might still be slow (or even slower), while OpenAI thinks the Spark latency is fast enough to warrant switching to a persistent websocket (i.e. they think the 50-200ms round trip time for the handshake is a significant chunk of time to first token). ↩

Either in the form of the KV-cache for previous tokens, or as some big tensor of intermediate activations if inference is being pipelined through multiple GPUs. I write a lot more about this in Why DeepSeek is cheap at scale but expensive to run locally , since it explains why DeepSeek can be offered at such cheap prices (massive batches allow an economy of scale on giant expensive GPUs, but individual consumers can’t access that at all). ↩

Is it a contradiction that low-batch-size means low throughput, but this fast pass system gives users much greater throughput? No. The overall throughput of the GPU is much lower when some users are using “fast mode”, but those users’ throughput is much higher. ↩

Remember, GPUs are fast, but copying data onto them is not. Each “copy these weights to GPU” step is a meaningful part of the overall inference time. ↩

Or a smaller distil of whatever more powerful base model GPT-5.3-Codex was itself distilled from. I don’t know how AI labs do it exactly, and they keep it very secret. More on that here . ↩

On this note, it’s interesting to point out that Cursor’s hype dropped away basically at the same time they released their own “much faster, a little less-capable” agent model. Of course, much of this is due to Claude Code sucking up all the oxygen in the room, but having a very fast model certainly didn’t help . ↩
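A quick back-of-the-envelope check on that 44GB figure (my arithmetic, not the post's; weights only, ignoring KV-cache and activations):

```python
# Rough sanity check: how many parameters fit in a given amount of
# on-chip SRAM at different precisions (weights only).
def params_that_fit(sram_gb: float, bytes_per_param: float) -> float:
    """Parameter count in billions that fits in sram_gb of memory."""
    return sram_gb / bytes_per_param  # 1 GB holds 1B params at 1 byte each

CEREBRAS_SRAM_GB = 44  # figure quoted in the post

for label, bytes_per_param in [("fp16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    print(f"{label}: ~{params_that_fit(CEREBRAS_SRAM_GB, bytes_per_param):.0f}B params")

# fp16: ~22B params
# int8: ~44B params
# int4: ~88B params
# Consistent with the post's ballpark of ~20B at fp16 / ~40B at int8.
```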

1 views

pay or okay - is it really?

While browsing news websites, you may have seen a pop-up like this: That's one form of what "Pay or Okay" can look like. The model was first introduced by newspapers in Austria and Germany in 2018, but in 2023, Meta adopted it for Instagram and Facebook. What this means is: Either you agree to the tracking and get to use the website for free, or you refuse tracking and have to pay. Maybe this doesn't sound so bad to you; after all, if they lose out on tracking that generates money, they should be compensated, right? Unfortunately, it's not that easy! You see: Not every Pay or Okay system is set up the same way, or even the way it sounds at first.

I wouldn't fault you for thinking that paying should mean there's no tracking at all, or only the most essential tracking and no ads, but that's not true. With many websites like The Guardian above, you pay just to opt out of the ads being personalized. You'll still see ads, you'll still have cookies and "similar technologies" (other tracking) being employed against you. Despite paying monthly, your data is still harvested and your reading behavior tracked. To me, this is a sort of "double-dipping", as it still results in some data selling on top of my monthly payment. Some research shows that publishers on average earn €0.24 per user per month from personalized tracking and €3.24 per user per month from the paid option 1 . If I'm going to pay for this and there's increased revenue, I want there to be the minimum amount of tracking, not just less. I don't want you to take my money and still somehow monetize my data! There are regional differences in pricing too, with the most extreme in France: If you read French online news sites, you'd pay ~800% of the average total digital advertising revenue per user if you wanted to refuse tracking. That means that what you pay and what your data is worth are not equal, they are just milking you on top of it.

Pay or Okay models can, depending on the implementation, lead to a double payment too. You might be paying to be tracked less, and then also need to pay to access paywalled content separately. This tends to happen in setups where it's combined with a freemium model, in which some content is freely accessible while some is paywalled. Even in setups where the paid mode to reduce tracking is just their normal subscription (usually called "hard paywall", or "metered paywall" if you have limited free samples), it means the popup is simply advertisement for their subscription and has little to do with choice.

The sad reality is that instead of empowering users to make a choice, this is once again engaging in dark patterns. Not only is one of the options often automatically pre-selected, higher, or emphasized with colors, but it's obviously easier to just click to agree and be done with it instead of setting up payment first. Research papers about this show that this model leads to consent rates of 99% to 99.9% 2 , even though only 0.16% - 7% of people actually want to be tracked or see personalized advertising online 3 . This is hardly reconcilable with Article 7(3) GDPR, under which withdrawal or rejection should be as easy as giving consent. That means: Not only does this put a price on the human right of informational self-determination, but it also makes it a hassle to enforce and stick to as a user. Another issue is that it's pricing people out of actually getting to make a decision freely.
If you struggle financially (or are just a teen with no or little income), it's not worth it to spend money each month just for less tracking - you have bigger problems! If you cannot afford it, you're either forced to agree to the tracking or exit the site. Even if you pay the fee for one news site, you'd surely not pay it for the handful of others you visit. In Germany, paying the reject fee on 29 of the top 100 websites that used Pay or Okay (including news, weather, ‘social’ media networks and others) amounts to an overall cost of over € 1.528,87 per year according to noyb.eu . That's more than the German yearly spending for clothes. There's also no geographical pricing adjustment, so if you are in an economically weaker country wanting to read German or French news, you'd still have to pay those high prices. So far, I haven't seen a single site that allows you to pay a rejection fee per article with their Pay or Okay pop-up; it was all or nothing, in a recurring subscription. That's unfortunate, because a user shouldn't have to enter a subscription model to avoid tracking while viewing one article of a site they might not visit again. This, together with paywalls, is adding to the issue of people increasingly getting their news from third parties that are freely available, but may skew it to their advantage. Of course independent, investigative journalism needs to be compensated and kept alive . But digital advertising, according to estimates by the European Media Industry Outlook , only accounts for about 10% of the revenue of the press, with targeted advertising being only about 5%. For comparison: Their Figure 50 graphic shows print circulation still makes up roughly 50% of revenue! Given that on average only 5% of press revenue comes from advertising, implementing Pay or Okay likely only increases the income by very little. This is not enough to save the press , so we should not be misled by economic interests to deny that this has a significant negative impact on our decision to be tracked or not. This doesn't sound like a legitimate (economic) interest that overrides the users' interests according to Article 6(1)(f) GDPR. Tracking isn't even that useful for news sites: The World Association of News Publishers says that >50 % of global programmatic ('personalized') advertisement spending instead goes to Alibaba, Alphabet, Amazon, ByteDance and Meta. In comparison, news publishers are still taking more directly sold advertisements . That makes sense: The big platforms already work with algorithms and hyper-personalizing the user experience, while news publishers come from a long past of offering people a fixed, non-personalized ad space in the newspaper. Even if they wanted to use more fitting advertising, there is still the option of contextualized advertising , which are only linked to a specific medium or content without needing to use the users' personal data. Of course you could say " Who the hell cares? Just install an ad-blocker and other privacy-focused browser extensions! " and you'd not be wrong. Allegedly, due to increased blocking or rejection of tracking and cookies, only about 30% of internet users are even exposed to targeting 4 . I have doubts about this number, because many people do not engage via browsers, but within apps that don't allow interference. But if we believe it, that means even when we have an artificially inflated 99% consent rate due to Pay or Okay pop-ups, most of those don't actually transfer into ad revenue. 
Still, there's always an arms race between tracking/advertising and blocking, and we should enable a free choice even for people who aren't knowledgeable enough about this stuff and are still getting tracked without their consent, or forced to. Caring about privacy in this aspect requires people to know:

- how tracking and advertising works
- the negative aspects of advertising (why would you possibly not want it? Not just annoying placement, but possible psychological effects)
- the fact that many of these sites have 100+ (sometimes even 1000+) partners they share the data with
- what data is tracked
- how it can be misused, leaked, etc.
- that ad-blockers and other software exist
- that you can use a browser version instead of the app

And that is a lot! Just imagine telling all of that to your grandparents. Ask the average person what cookie banners are about; many will not be able to tell you. They are like Terms of Service, Privacy Policies, or EULAs to people. They just know if they click yes, they'll get to where they wanna go faster. There's no informed choice there because many people are not sat down and educated about it, and Pay or Okay pop-ups work the same. I prefer to go after shitty implementations and legal loopholes rather than put the responsibility on the user to know about the latest issues or technical solutions.

Unfortunately, it seems like we are moving on with this. Despite the European Data Protection Board stating in its 08/2024 opinion that large online platforms relying on a binary choice between consenting or paying a fee is generally not legal, no consequences have followed. Data Protection Authorities, like the ones in Germany, have stayed silent on the matter. In the Digital Omnibus to overhaul parts of the GDPR and other laws around digital rights, they write: " Considering the importance of advertising revenue for independent journalism as an indispensable pillar of a democratic society, media service providers as defined in Regulation (EU) 2024/1083 (European Media Freedom Act) should not be obliged to respect such signals . " "Such signals" meaning automated signals of refusing tracking/cookies. This unfortunately shows that in the future, if this goes through, "Pay or Okay" will be seen as acceptable because choice does not matter for news media, even if it was previously (aside from a CJEU judgment in 2023) contentious or denied for large platforms . If it is allowed for one, it should technically be allowed for others, because the GDPR doesn't differentiate between different groups of controllers for these things. That means a future in which we still continue to fight back against ad-tech, and not just paywalls for content, but paywalls to our right to choose as well.

Reply via email Published 14 Feb, 2026

Study by Müller-Tribbensee et al. ↩

This is also mentioned in an interview with Dirk Freitag, CEO of Contentpass, a service that offers Pay or Okay services to publications. ↩

See for example the attitude towards tracking on Facebook . ↩

Read here and here about targeting issues, for example. ↩

0 views
Kev Quirk Yesterday

Updates to My Commenting System

I've been making some updates to my personal commenting system . Before, they looked like this: It was just a simple table that contained every comment, both originals and replies. As more people commented, it got a little unwieldy and confusing, so I've changed the layout so that it's all nested comment threads now: I've been using it for around 6 months and have received well over a hundred comments, plus moving it from Jekyll to Pure Blog was extremely simple - just some CSS changes, actually. At this point it's battle tested and working great. However, there are still some rough edges in the code, and security could definitely be improved. So over the next few weeks I'll be doing that, at which point I'll probably release it to the public so you too can have comments on your blog, if you want them.

0 views
Chris Coyier Yesterday

The Good & Not Good

I’ve spent more time with religious people in the last year than perhaps I have in my whole life. It’s got me thinking about religion with more curiosity than I ever have. So I’m having what are probably middle-school level thoughts. I’ve forever identified as agnostic, likely because that’s how most of my family rolled when growing up. Aside from what anyone truly believes, most people end up doing the religion their family does. I’m no exception.

I want to be a good person. I like good people. I’m interested in what drives people to be good and vice versa. Here’s an oversimplification of all humans that rolls through my brain: There are people in all quadrants. There are cases that make obvious sense: Who’s evaluating these people as being good or bad, and their individual actions, as good or bad? Me, I do. I’m the judge. I wonder — are there cases that are nearly the opposite? I’m interested in what helps any individual person be good and provides some kind of framework for evaluating their actions. Maybe I can learn from them. Religious or otherwise, equally. I’d like to think I can. I’m not above reading some scripture to help understand the world and myself if it can help me be better.

But I struggle. I’ve talked to three men in the past year who have had an encounter with a powerful religious figure. They came to them, as it were, in a time of need, and spoke to them clearly and directly and told them what to do. Did they, though? My agnostic brain is full of doubt. Like… you talked to a ghost? OK. Or did their brain just invent that (brains are wild!) because they needed it and the culture they grew up in supports and rewards stories like this? But I can’t help but worry that my own lack of faith prevents me from these powerful guiding moments. After all, I look up to all three of these men in certain ways and find them to be good men. Maybe I can change my brain to get in on this.

I’m just as interested, or more, in the fuel and motivations behind not-good people. I don’t need help understanding doing bad, I don’t think. If I take candy from a baby, then I have candy! Plus, that baby was different to me, and I don’t understand and thus fear it. I can think of two recent personal instances with very religious people hiding behind a religious shield. They did bad. Not horrifically bad, but you know, they had a choice and made the bad one. I can’t perfectly know their mind, but based on their words and actions, it feels like religion pre-excused the choice. Of course I’m doing something bad, I’m born bad, and I actively feel bad about being bad. Religion isn’t a battery of good for them; it’s trapping them into a counterproductive way of thinking. Perhaps being directly and truly accountable for your own actions can be a way out of that trap?

I think I’ll just continue to be interested in people and try to pick the best path I can. I’m not sure I’m ready to let religion be a guide to me. But I’m very comfortable with the thought that there is an incredible amount of unknown in ourselves and the universe, and that our actions matter. The contradictions in religion and action will continue to sit uncomfortably for me. I’ve been thinking about this for a year, but high five to Derek Sivers’ recent post Religion is action, not belief for the motivation to get my own words out.

One man believed God was on his side. He often lost his temper, hurt people, and did more harm than good.
But he believed that what matters is what’s in his heart, since God will forgive his actions and see his good intentions. Another man was full of doubt but followed the rules of his religion. He stopped to pray five times a day, and donated to charity. He was calm and kind to everyone, no matter how he felt. He was never sure about his beliefs, but kept that to himself, since what mattered were his actions. What is the point of beliefs if they don’t shape your actions?

0 views
(think) Yesterday

Neocaml 0.1: Ready for Action

neocaml 0.1 is finally out! Almost a year after I announced the project , I’m happy to report that it has matured to the point where I feel comfortable calling it ready for action. Even better - recently landed in MELPA , which means installing it is now as easy as: That’s quite the journey from “a fun experimental project” to a proper Emacs package! You might be wondering what’s wrong with the existing options. The short answer - nothing is wrong per se, but offers a different set of trade-offs: Of course, is the youngest of the bunch and it doesn’t yet match Tuareg’s feature completeness. But for many OCaml workflows it’s already more than sufficient, especially when combined with LSP support. I’ve started the project mostly because I thought that the existing Emacs tooling for OCaml was somewhat behind the times - e.g. both and have features that are no longer needed in the era of . Let me now walk you through the highlights of version 0.1. The current feature-set is relatively modest, but all the essential functionality one would expect from an Emacs major mode is there. leverages TreeSitter for syntax highlighting, which is both more accurate and more performant than the traditional regex-based approaches used by and . The font-locking supports 4 customizable intensity levels (controlled via , default 3), so you can pick the amount of color that suits your taste. Both (source) and (interface) files get their own major modes with dedicated highlighting rules. Indentation has always been tricky for OCaml modes, and I won’t pretend it’s perfect yet, but ’s TreeSitter-based indentation engine is already quite usable. It also supports cycle-indent functionality, so hitting repeatedly will cycle through plausible indentation levels - a nice quality-of-life feature when the indentation rules can’t fully determine the “right” indent. If you prefer, you can still delegate indentation to external tools like or even Tuareg’s indentation functions. Still, I think most people will be quite satisfied with the built-in indentation logic. provides proper structural navigation commands ( , , ) powered by TreeSitter, plus integration definitions in a buffer has never been easier. The older modes provide very similar functionality as well, of course, but the use of TreeSitter in makes such commands more reliable and robust. No OCaml mode would be complete without REPL (toplevel) integration. provides all the essentials: The default REPL is , but you can easily switch to via . I’m still on the fence on whether I want to invest time into making the REPL-integration more powerful or keep it as simple as possible. Right now it’s definitely not a big priority for me, but I want to match what the other older OCaml modes offered in that regard. works great with Eglot and , automatically setting the appropriate language IDs for both and files. Pair with ocaml-eglot and you get a pretty solid OCaml development experience. The creation of LSP really simplified the lives of a major mode authors like me, as now many of the features that were historically major mode specific are provided by LSP clients out-of-the-box. That’s also another reason why you probably want to leaner major mode like . But, wait, there’s more! There’s still plenty of work to do: If you’re following me, you probably know that I’m passionate about both Emacs and OCaml. I hope that will be my way to contribute to the awesome OCaml community. 
I’m not sure how quickly things will move, but I’m committed to making the best OCaml editing experience on Emacs. Time will tell how far I’ll get! If you’re an OCaml programmer using Emacs, I’d love for you to take for a spin. Install it from MELPA, kick the tires, and let me know what you think. Bug reports, feature requests, and pull requests are all most welcome on GitHub ! That’s all from me, folks! Keep hacking! is ancient and barely maintained. It lacks many features that modern Emacs users expect and it probably should have been deprecated a long time ago. is very powerful, but also very complex. It carries a lot of legacy code and its regex-based font-locking and custom indentation engine show their age. It’s a beast - in both the good and the bad sense of the word. aims to be a modern, lean alternative that fully embraces TreeSitter. The codebase is small, well-documented, and easy to hack on. If you’re running Emacs 29+ (and especially Emacs 30), TreeSitter is the future and is built entirely around it. - Start or switch to the OCaml REPL - Send the current definition - Send the selected region - Send the entire buffer - Send a phrase (code until ) to quickly switch between and files Prettify-symbols support for common OCaml operators Automatic installation of the required TreeSitter grammars via Compatibility with Merlin for those who prefer it over LSP Support for additional OCaml file types (e.g. ) Improvements to structured navigation using newer Emacs TreeSitter APIs Improvements to the test suite Addressing feedback from real-world OCaml users Actually writing some fun OCaml code with
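The post's exact install snippet didn't survive the feed excerpt, but the standard MELPA route looks roughly like this; only the package name `neocaml` is taken from the post, the rest is stock `package.el` (adapt to `use-package` or your own setup as you prefer):

```elisp
;; Minimal sketch: install neocaml from MELPA with stock package.el.
;; Only the package name comes from the post; adjust to your own config.
(require 'package)
(add-to-list 'package-archives
             '("melpa" . "https://melpa.org/packages/") t)
(package-initialize)

(unless (package-installed-p 'neocaml)
  (package-refresh-contents)
  (package-install 'neocaml))
```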

0 views
Stone Tools Yesterday

dBASE on the Kaypro II

The world that might have been has been discussed at length. In one possible world, Gary Kildall's CP/M operating system was chosen over MS-DOS to drive IBM's then-new "Personal Computer." As such, Bill Gates's hegemony over the trajectory of computing history never happened. Kildall wasn't constantly debunking the myth of an airplane joyride which denied him Microsoft-levels of industry dominance. Summarily, he'd likely be alive and innovating the industry to this day. Kildall's story is pitched as a "butterfly flaps its wings" inflection point that changed computing history. The truth is, of course, there were many points along our timeline which led to Kildall's fade and untimely death. Rather, I'd like to champion what Kildall did . Kildall did co-host Computer Chronicles with Stewart Chiefet for seven years. Kildall did create the first CD-ROM encyclopedia. Kildall did develop (and coin the term for) what we know today as the BIOS. Kildall did create CP/M, the first wide-spread, mass-market, portable operating system for microcomputers, possible because of said BIOS. CP/M did dominate the business landscape until the DOS era, with 20,000+ software titles in its library. Kildall did sell his company, Digital Research Inc., to Novell for US $120M. Kildall did good . Systems built to run Kildall's CP/M were prevalent, all built around the same 8-bit limits: an 8080 or Z80 processor and up to 64KB RAM. The Osborne 1, a 25lb (11kg) "portable" which sold for $1795 ($6300 in 2026), was the talk of the West Coast Computer Faire in 1981. The price was sweet, considering it came bundled with MSRP $1500 in software, including Wordstar and Supercalc . Andy Kay's company, Non-Linear Systems, debuted the Kaypro II (the "I" only existed in prototype form) the following year at $1595, $200 less (and four pounds heavier) than the Osborne. Though slower than an Osborne, it arguably made it easier to do actual work, with a significantly larger screen and beefier floppy disk capacity. Within the major operating system of its day, on popular hardware of its day, ran the utterly dominant relational database software of its day. PC Magazine , February 1984, said, "Independent industry watchers estimate that dBASE II enjoys 70 percent of the market for microcomputer database managers." Similar to past subjects HyperCard and Scala Multimedia , Wayne Ratcliff's dBASE II was an industry unto itself, not just for data-management, but for programmability, a legacy which lives on today as xBase. Strangely enough, dBASE also decided to attach "II" to its first release; a marketing maneuver to make the product appear more advanced and stable at launch. I'm sure the popularity of the Apple II had nothing to do with anyone's coincidentally similar roman numeral naming scheme whatsoever. Written in assembly, dBASE II squeezed maximum performance out of minimal hardware specs. This is my first time using both CP/M and dBASE. Let's see what made this such a power couple. I'm putting on my tan suit and wide brown tie for this one. As the owner of COMPUTRON/X, a software retail shop, I'm in Serious Businessman Mode™. I need to get inventory under control, snake the employee toilet, do profit projections, and polish a mind-boggling amount of glass and chrome. For now, I'll start with inventory and pop in this laserdisc to begin my dBASE journey. 
While the video is technically for 16-bit dBASE III , our host, Gentry Lee of Jet Propulsion Laboratory, assures us that 8-bit dBASE II users can do everything we see demonstrated, with a few interface differences. This is Gail Fisher, a smarty pants who thinks she's better than me. Tony Lima, in his book dBASE II for Beginners , concurs with the assessment of dBASE II and III 's differences being mostly superficial. Lima's book is pretty good, but I'm also going through Mastering dBASE II The Easy Way , by Paul W. Heiser, the official Kaypro dBASE II Manual, and dBase II for the First Time User by Alan Freedman. That last one is nicely organized by common tasks a dBASE user would want to do, like "Changing Your Data" and "Modifying Your Record Structure." I find I return to Freedman's book often. As I understand my time with CP/M, making custom bootable diskettes was the common practice. dBASE II is no different, and outright encourages this, lest we risk losing US$2000 (in 2026 dollars) in software. Being of its time and place in computing history, dBASE uses the expected UI. You know it, you love it, it's "a blinking cursor," here called "the dot prompt." While in-program is available, going through the video, books, and manual is a must. dBASE pitches the dot prompt as a simple, English language interface to the program. for example sets the default save drive to the B: drive. You could never intuit that by what it says, nor guess that it even needs to be done, but when you know how it works, it's simple to remember. It's English only in the sense that English-like words are strung together in English-like order. That said, I kind of like it? creates a new database, prompting first for a database name, then dropping me into a text entry prompt to start defining fields. This is a nice opportunity for me to feign anger at The Fishers, the family from the training video. Fancy-pants dBASE III has a more user-friendly entry mode, which requires no memorization of field input parameters. Prompts and on-screen help walk Gail through the process. In dBASE II , a field is defined by a raw, comma-delimited string. Field definitions must be entered in the order indicated on-screen. is the data type for the field, as string, number, or boolean. This is set by a one-letter code which will never be revealed at any time, even when it complains that I've used an invalid code. Remind me to dog-ear that page of the manual. For my store, I'm scouring for games released for CP/M. Poking through Moby Games digs up roughly 30 or so commercial releases, including two within the past five years . Thanks, PunyInform ! My fields are defined thusly, called up for review by the simple command. The most frustrating part about examining database software is that it doesn't do anything useful until I've entered a bunch of data. At this stage in my learning, this is strictly a manual process. Speaking frankly, this part blows, but it also blows for Gail Fisher, so my schadenfreude itch is scratched. dBASE does its best to minimize the amount of keyboard shenanigans during this process, and in truth data entry isn't stressful. I can pop through records fairly quickly, if the raw data is before me. The prompt starts at the first field and (not !) moves to the next. If entry to a field uses the entire field length (as defined by me when setting up the fields earlier), the cursor automatically jumps to the next field with a PC-speaker beep. 
I guess dBASE is trying to "help," but when touch typing I'm looking at my data source, not the screen. I don't know when I'm about to hit the end of a field, so I'm never prepared when it switches input fields and makes that ugly beep. More jarring is that if the final field of a record is completely filled, the cursor "helpfully" jumps to the beginning of a new record instantly, with no opportunity to read or correct the data I just input. It's never not annoying. Gail doesn't have these issues with dBASE III and her daughter just made dinner for her. Well, I can microwave a burrito as well as anyone, so I'm not jealous. I'm not.

In defining the fields, I have already made two mistakes. First, I wanted to enter the critic score as a decimal value so I could get the average. Number fields, like all fields, have a "width" (the maximum number of characters/bytes to allocate to the field), but also a "decimal places" value, and as I type these very words I see now my mistake. Rubber ducking works. I tricked myself into thinking "width" was for the integer part, and "decimal places" was appended to that. I see now that, like character fields, I need to think of the entire maximum possible number as being the "width." Suppose we expect to record a value like 0.85: there are 2 decimal places, a decimal point, a leading 0, and potentially a sign, as in -0.85 or +0.85. So that means the "width" should be 5, with 2 "decimal places" (of those 5). Second, though I'm cosplaying as a store owner, I'm apparently cosplaying as a store owner that sucks! I didn't once consider pricing! Gah, Gail is so much better at business than I am! Time to get "sorta good." Toward that end, I have my to-do list after a first pass through data entry.

Modifying dBASE "structures" (the field/type definitions) can be risky business. If there is no data yet, feel free to change whatever you want. If there is pre-existing data, watch out. MODIFY STRUCTURE will at least do the common decency of warning you about the pile you're about to step into. Modifying a database structure is essentially verboten; rather, we must juggle files to effect a structure change. dBASE lets us have two active files, called "work areas," open simultaneously: a PRIMARY and a SECONDARY. Modifications to these are read from or written to disk in the moment; 64K can't handle much live data. It's not quite "virtual memory" but it makes the best of a tight situation. When wanting to change data in existing records, the EDIT command sounds like a good choice, but CHANGE actually winds up being more useful. CHANGE FIELD will focus in on specified fields for immediate editing across all records. It's simple to step through fields making changes. I could BROWSE to edit everything at once, but I'm finding it safer while learning to make small incremental changes or risk losing a large body of work. Make a targeted change, save, make another change, save, etc.

I laughed every time Gentry Lee showed up, like he's living with The Fishers as an invisible house gremlin. They never acknowledge his presence, but later he eats their salad! Being a novice at dBASE is a little dangerous, and MAME has its own pitfalls. I have been conditioned over time to hit Esc when I want to "back out" of a process. This shuts down MAME instantly. When it happens, I swear The Fishers are mocking me, just on the edge of my peripheral vision, while Gentry Lee helps himself to my tuna casserole.
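Back to the data: here's a sketch of the incremental editing and structure-change workflow just described. The command names (CHANGE, BROWSE, COPY, MODIFY STRUCTURE, APPEND FROM) are genuine dBASE II, but the session itself is my reconstruction with stand-in file names:

```
* Step through every record, touching only the RATING field:
. USE GAMES
. CHANGE FIELD RATING
* Or edit everything at once in a full-screen table:
. BROWSE
* The classic dBASE II workaround for a structure change:
* copy the data out, change the structure, pull the data back in.
. COPY TO TEMP
. MODIFY STRUCTURE
. APPEND FROM TEMP
```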
dBASE is a relational database. Well, let's be less generous and call it "relational-ish." The relational model of data was defined by Edgar F. Codd in 1969, where "relation is used here in its accepted mathematical sense." It's all set theory stuff; way over my head. Skimming past the nerd junk, in that paper he defines our go-to relationship of interest: the join. As a relational database, dBASE keeps its data arranged VisiCalc style, in rows and columns. So long as two databases have a field in common, which is defined, named, and used identically in both, the two can be "joined" into a third, new database. I've created a mini database of developer phone numbers so I can call and yell at them for bugs and subsequent lost sales. I haven't yet built up the grin-and-bear-it temperament Gail possesses toward Amanda Covington. Heads will roll! You hear me, Lebling? Blank?!

64K (less CP/M and dBASE resources) isn't enough to do an in-memory join. Rather, joining creates and writes a completely new database to disk which is the union of two databases. The implication being you must have space on disk to hold both original databases as well as the newly joined database, and also the new database cannot exceed dBASE's 65,535 record limit after joining. In a JOIN command, the P. prefix means PRIMARY and S. means SECONDARY, so we can precisely specify fields and their work area of origin. This is more useful for doing calculations at JOIN time, like joining only the records where two fields actually match.

DELETE marks specific records for removal, if we know the record number, like DELETE RECORD 5. Commands in dBASE stack, so a query can define the target for a command, as one would hope and expect in 2026. Comparisons and sub-strings can be used as well. So, rather than deleting on the exact string "Infocom, Inc." we could match on a sub-string: the comparison looks for the left-hand string as a case-sensitive sub-string in the right-hand string. We can be a little flexible in how data may have been input, getting around case sensitivity through booleans. Yes, we have booleans! Wait, why am I deleting any Infocom games? I love those! What was I thinking?! Once everything is marked for deletion, that's all it is: marked for deletion. It's still in the database, and on disk, until we do real-deal, non-reversible, don't-forget-undo-doesn't-exist-in-1982, destruction with PACK.

Until now, I've been using listing queries as a kind of ad-hoc search mechanism: they go through every record, in sequence, finding matches. Records have positions in the database file, and dBASE is silently keeping track of a "record pointer" at all times. This represents "the current record," and commands without a query will be applied to the currently pointed record. Typing in a number at the dot prompt moves the pointer to that record: 3 moves me to record #3, and DISPLAY then shows its contents. When I don't know which record has what I want, LOCATE FOR a condition will move the pointer to the first match it finds. At this point I could display or edit that record, or list everything from the located record onward. Depending on the order of the records, that may or may not be useful. Right now, the order is just "the order I typed them into the system." We need to teach dBASE different orders of interest to a stripmall retail store. While the modern reaction would be to reach for SORT, dBASE's SORT can only create entirely new database files on disk, sorted by the desired criteria. Sort a couple of times on a large data set and soon you'll find yourself hoarding the last of new-old 5 1/4" floppy disk stock from OfficeMax, or being very careful about deleting intermediary sort results. SQL brainiacs have a solution to our problem, which dBASE can also do. An "index" is appropriate for fast lookups on our columnar data.
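A sketch of how the indexed lookup works in practice; INDEX ON, SET INDEX TO, FIND, and DISPLAY are documented dBASE II commands, while the index and field names here are stand-ins:

```
* Build an index over the publisher field and make it active:
. USE GAMES
. INDEX ON PUBLISHER TO PUBS
. SET INDEX TO PUBS
* FIND uses the active index to jump the record pointer
* to the first matching record:
. FIND Infocom
. DISPLAY NEXT 7 TITLE, PUBLISHER
* After edits, the index has to be rebuilt by running INDEX ON again.
```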
We can index on one or more fields, remapping records to the sort order of our heart's desire. Only one index can be used at a time, but a single index can be defined against multiple fields. It's easier to show you (the sketch above gives the flavor). When I set the index to "devs" and FIND Infocom, that sets the record pointer to the first record which matches my find. I happen to know I have seven Infocom games, so I can display the next seven records for just the fields of interest. Both indexes group Infocom games together as a logical block, but within that block Publisher order is different. Don't get confused: the actual order of records in the database file is betrayed by the record number. Notice they are neither contiguous nor necessarily sequential. Dropping the index would rearrange them into strict numerical record order. An index only relates to the current state of our data, so if any edits occur we need to rebuild those indexes. Please, contain your excitement.

Munging data is great, but I want to understand my data. Let's suppose I need the average rating of the games I sell. I'll first need a count of all games whose rating is not zero (i.e. games that actually have a rating), then I'll need a summation of those ratings. Divide those and I'll have the average. COUNT does what it says. SUM only works on numeric fields, and also does what it says. With those, I basically have what I need. Like deletion, we can use queries as parameters for these commands. dBASE has basic math functions, and calculated values can be stored in its 64 "memory variables." Like a programming language, named variables can be referenced by name in further calculations. Many commands let us append a TO clause which shoves a query result into a memory variable, though array results cannot be memorized this way. STORE shoves arbitrary values into memory directly. When I ran the numbers, the average rating of my CP/M games (out of 100) came out higher than I expected, to be perfectly honest. As proprietor of a hot (power of positive thinking!) software retail store, I'd also like to know how much profit I'll make if I sold everything I have in stock. I need to calculate a profit figure per record, but this requires stepping through records and keeping a running tally. I sure hope the next section explains how to do that!

Flipping through the 1,000 pages of Kaypro Software Directory 1984, we can see the system, and CP/M by extension, was not lacking for software. Interestingly, quite a lot was written in and for dBASE II, bespoke database solutions which sold for substantially more than dBASE itself. Shakespeare wrote, "The first thing we do, let's kill all the lawyers." Judging from these prices, the first thing we should do is shake them down for their lunch money. In the HyperCard article I noted how an entire sub-industry sprung up in its wake, empowering users who would never consider themselves programmers to pick up the development reins. dBASE paved the way for HyperCard in that regard. As Jean-Pierre Martel noted, "Because its programming language was so easy to learn, millions of people were dBASE programmers without knowing it... dBASE brought programming power to the masses." dBASE programs are written as procedural routines called Commands, or .CMD files. dBASE helpfully includes a built-in (stripped down) text editor for writing these, though any text editor will work. Once written, a .CMD file can be invoked with the DO command. As Martel said, I seem to have become a dBASE programmer without really trying. Everything I've learned so far hasn't just been dot prompt commands, it has all been valid dBASE code.
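For instance, the average-rating exercise can be reconstructed as a handful of dot-prompt lines. COUNT, SUM, STORE, and ? are documented dBASE II commands; the memory variable names are mine:

```
. USE GAMES
* Count only the games that actually have a rating:
. COUNT FOR RATING > 0 TO NRATED
* Total the ratings, then store the average in a memory variable:
. SUM RATING TO TOTRATE
. STORE TOTRATE / NRATED TO AVGRATE
. ? AVGRATE
```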
A command at the dot prompt is really just a one-line program. Cool beans! Some extra syntax exists for the purpose of development: branching, loops, user input, positioned screen output, even pokes into system memory (the specifics are listed under "Sharpening the Stone" below). With these tools, menus which add a veneer of approachability to a dBASE database are trivial to create. Commands are interpreted, not compiled (that would come later), so how were these solutions sold to lawyers without bundling a full copy of dBASE with every Command file? For a while, dBASE II was simply a requirement to use after-market dBASE solutions. The 1983 release of dBASE Runtime changed that, letting a user run a file, but not edit it. A Command file bundled with Runtime was essentially transformed into a standalone application. Knowing this, we're now ready to charge US$10,000 (in 2026 dollars) per seat for case management and tracking systems for attorneys. Hey, look at that, this section did help me with my profit calculation troubles. I can write a Command file and bask in the glow of COMPUTRON/X's shining, profitable future.

During the 8-bit to 16-bit era bridge, new hardware often went underutilized as developers came to grips with what the new tools could do. Famously, VisiCalc's first foray onto 16-bit systems didn't leverage any of the expanded RAM on the IBM-PC and intentionally kept all known bugs from the 8-bit Apple II version. The word "stopgap" comes to mind. Corporate America couldn't just wait around for good software to arrive. CP/M compatibility add-ons were a relatively inexpensive way to gain instant access to thousands of battle-tested business software titles. Even a lowly Coleco ADAM could, theoretically, run WordStar and Infocom games, the thought of which kept me warm at night as I suffered through an inferior Dragon's Lair adaptation. They promised a laserdisc attachment!

For US$600 in 1982 ($2,000 in 2026) your new-fangled 16-bit IBM-PC could relive the good old days of 8-bit CP/M-80. Plug in XEDEX's "Baby Blue" ISA card with its Z80B CPU and 64K of RAM and the world is your slowly decaying oyster. That RAM is also accessible in 16-bit DOS, serving dual purpose as a memory expansion for only $40 more than IBM's own bare-bones 64K board. PC Magazine's February 1982 review seemed open to the idea of the card, but was skeptical it had long-term value. XEDEX suggested the card could someday be used as a secondary processor, offloading tasks from the primary CPU to the Z80, but never followed through on that threat, as far as I could find.

Own an Apple II with an 8-bit 6502 CPU but still have 8-bit Z80 envy? Microsoft offered a Z80 daughter-card with 64K RAM for US$399 in 1981 ($1,413 in 2026). It doesn't provide the 80-column display you need to really make use of CP/M software, but is compatible with such add-ons. It was Bill Gates's relationship with Gary Kildall as a major buyer of CP/M for this very card that started the whole ball rolling with IBM, Gates's purchase of QDOS, and the rise of Microsoft. A 16K expansion option could combine with the Apple II's built-in 48K memory to get about 64K for CP/M usage. BYTE Magazine's November 1981 review raved, "Because of the flexibility it offers Apple users, I consider the Softcard an excellent buy." Good to know!

How does one add a Z80 processor to a system with no expansion slots? Shove a Z80 computer into a cartridge and call it a day, apparently. This interesting, but limited, footnote in CP/M history does what it says, even if it doesn't do it well.
Compute!'s Gazette wrote, "The 64 does not make a great CP/M computer. To get around memory limitations, CP/M resorts to intensive disk access. At the speed of the 1541, this makes programs run quite slowly." Even worse for CP/M users is that the slow 1541 can't read CP/M disks. Even if it could, you're stuck in 40-column mode. How were users expected to get CP/M software loaded? We'll circle back to that a little later. At any rate, Commodore offered customers an alternative solution.

Where its older brother had to make do with a cartridge add-on, the C128 takes a different approach. To maintain backward compatibility with the C64, it includes a 6510-compatible processor, the 8502. It also wants to be CP/M compatible, so it needs a Z80 processor. What to do, what to do? Maybe they could put both processors into the unit? Is that allowed? Could they do that? They could, so they did. CP/M came bundled with the system, which has a native 80-column display in CP/M mode. It is ready to go with the newer, re-programmable 1571 floppy drive. Unfortunately, its slow bus speed forces the Z80 to run at only 2MHz, slower even than a Kaypro II. Compute!'s Gazette said in their April 1985 issue, "CP/M may make the Commodore 128 a bargain buy for small businesses. The price of the Commodore 128 with the 1571 disk drive is competitive with the IBM PCjr." I predict rough times ahead for the PCjr if that's true!

Atari peripherals have adorable industrial design, but they were quite expensive thanks to a strange system design decision. The 8-bit system's nonstandard serial bus necessitated specialized data encoding/decoding hardware inside each peripheral, driving up unit costs. For example, the Atari 810 5 1/4" floppy drive cost $500 in 1983 (almost $2,000 in 2026) thanks to that special hardware, yet only stored a paltry 90K per disk. SWP straightened out the Atari peripheral scene with the ATR8000. Shenanigans with special controller hardware are eliminated, opening up a world of cheaper, standardized floppy drives of all sizes and capacities. It also accepts Centronics parallel and RS-232C serial devices, making tons of printers, modems, and more compatible with the Atari. The device also includes a 16K print buffer and the ability to attach up to four floppy drives without additional controller board purchases. A base ATR8000 can replace a whole stack of expensive Atari-branded add-ons, while being more flexible and performant. The saying goes, "Cheaper, better, faster. Pick any two." The ATR8000 is that rare device which delivered all three. Now, upgrade that box with its CP/M compatibility option, adding a Z80 and 64K, and you've basically bought a second computer. When plugged into the Atari, the Atari functions as a remote terminal into the unit, using whatever 40/80-column display adapter you have connected. It could also apparently function standalone, accessible through any terminal, no Atari needed. That isn't even its final form. The Co-Power-88 is a 128K or 256K PC-compatible add-on to the Z80 CP/M board. When booted into the Z80, that extra RAM can be used as a RAM disk to make CP/M fly. When booted into the 8088, it's a full-on PC running DOS or CP/M-86. Tricked out, this eight-pound box would set you back US$1000 in 1984 ($3,000 in 2026), but it should be obvious why this is a coveted piece of kit for the Atari faithful to this day.

For UK£399 in 1985 (£1288 in 2026; US$1750) Acorn offered a Z80 with a dedicated 64K of RAM.
According to the manual, the Z80 handles the CP/M software, while the 6502 in the base unit handles floppies and printers, freeing up CP/M RAM in the process. For the unit plugged into the side of the BBC Micro, the manual suggests desk space clearance of 5 feet wide and 2 1/2 feet deep. My god. Acorn User June 1984 declared, "To sum up, Acorn has put together an excellent and versatile system that has something for everyone." I'd like to note that glowing review was almost exclusively thanks to the bundled CP/M productivity software suite. Their evaluation didn't seem to try loading off-the-shelf software, which caused me to narrow my eyes and stroke my chin in cynical suspicion. Flip through the manual to find out about obtaining additional software, and it gets decidedly vague. "You'll find a large and growing selection available for your Z80 personal computer, including a special series of products that will work in parallel with the software in your Z80 pack."

Like the C128, the Coleco ADAM was a Z80-native machine, so CP/M works without much fuss, though the box does proclaim "Made especially for ADAM!" Since we don't have to add hardware (well, we need a floppy; the ADAM only shipped with a high-speed cassette drive), we can jump into the ecosystem for about US$65 in 1985 ($200 in 2026). Like other CP/M solutions, the ADAM really needed an 80-column adapter, something Coleco promised but never delivered. Like Dragon's Lair on laserdisc! As it stands, CP/M scrolls horizontally to display all 80 columns. This version adds ADAM-style UI for its quaint(?) roman numeral function keys.

OK, CP/M is running! Now what? To be honest, I've been toying with you this whole time, dangling the catnip of CP/M compatibility. It's time to come clean and admit the dark side of these add-on solutions. There ain't no software! Even when the CPU and CP/M version were technically compatible, floppy disc format was the sticking point for getting software to run on any given machine. For example, the catalog for Kaypro software in 1984 is 896 pages long. That is all CP/M software and all theoretically compatible with a BBC Micro running CP/M. However, within that catalog, everything shipped expressly on Kaypro-compatible floppy discs. Do you think a Coleco ADAM floppy drive can read Kaypro discs? Would you be even the tiniest bit shocked to learn it cannot? Kaypro enthusiast magazine PRO illustrates the issue facing consumers back then. Let's check in on the Morrow Designs (founded by Computer Chronicles sometimes co-host George Morrow!) CP/M system owners. How do they fare? OK then, what about that Baby Blue from earlier? The Microsoft Softcard must surely have figured something out. The Apple II was, according to Practical Computing, "the most widespread CP/M system" of its day. Almost every product faced the same challenge. On any given CP/M-80 software disk, the byte code is compatible with your Z80, if your floppy drive can read the diskette. You couldn't just buy a random CP/M disk, throw it into a random CP/M system, and expect it to work, which would have been a crushing blow to young me hoping to play Planetfall on the ADAM. So what could be done? There were a few options, none of them particularly simple or straightforward, especially to those who weren't technically-minded. Some places offered transfer services. XEDEX, the makers of Baby Blue, would do it for $100 per disk. I saw another listing for a similar service (different machine) at $10 per disk.
Others sold the software pre-transferred, as noted on a Coleco ADAM service flyer. A few software solutions existed, including Baby Blue's own Convert program, which shipped with their card and "supports bidirectional file transfer between PC-DOS and popular CP/M disk formats." They also had the Baby Blue Conversion Software which used emulation to "turn CP/M-80 programs into PC-DOS programs for fast, efficient execution on Baby Blue II." Xeno-Copy, by Vertex Systems, could copy from over 40 disk formats onto PC-DOS for US$99.50 ($313 in 2026); their Plus version promised cross-format read/write capabilities. Notably, Apple, Commodore, Apricot, and other big names are missing from their compatibility list. The Kermit protocol, once installed onto a CP/M system disk, could handle cross-platform serial transfers, assuming you had the additional hardware necessary. "CP/M machines use many different floppy disk formats, which means that one machine often cannot read disks from another CP/M machine, and Kermit is used as part of a process to transfer applications and data between CP/M machines and other machines with different operating systems." The Catch-22 of it all is that you have to get Kermit onto your CP/M disk in the first place. Hand-coding a bare-bones Kermit protocol (CP/M ships with an assembler) for the purposes of getting "real" Kermit onto your system, so you could then transfer the actual software you wanted in the first place, was a trick published in the Kermit-80 documentation. Of course, this all assumes you know someone with the proper CP/M setup to help; basically, you're going to need to make friends. Talk to your computer dealer, or better yet, get involved in a local CP/M User's Group. It takes a village to move WordStar onto a C64.

I really enjoyed my time learning dBASE II and am heartened by the consistency of its commands and the clean interaction between them. When I realized that I had accidentally learned how to program dBASE, that was a great feeling. What I expected to be a steep learning curve wasn't "steep" per se, but rather just intimidating. That simple blinking cursor can feel quite daunting at the first step, but each new command I learned followed a consistent pattern. Soon enough, simple tools became force multipliers for later tools. The more I used it, the more I liked it. dBASE II is uninviting, but good. On top of that, getting data out into the real world is simple, as you'll see below in "Sharpening the Stone." I'm not locked in. So what keeps me from being super enthusiastic about the experience? It is CP/M-80 which gives me pause. The 64K memory restriction, disk format shenanigans, and floppy disk juggling honestly push me away from that world except strictly for historical investigations. Speaking frankly, I don't care for it. CP/M-86 running dBASE III+ could probably win me over, though I would likely try DR-DOS instead. Memory constraints would be essentially erased, DOSBox-X is drag-and-drop trivial for moving files in and out of the system, and dBASE III+ is more powerful while also being more user-friendly. Combine that with Clipper, which can compile dBASE applications into standalone .exe files, and there's powerful utility to be had.

By the way, did you know dBASE is still alive? Maybe. Kinda? Hard to say. The latest version is dBASE 2019 (not a typo!), but the site is unmaintained and my appeal to their LinkedIn for a demo has gone unanswered. Its owner, dBase LTD, sells dBASE Classic, which is dBASE V for DOS running in DOSBox: a confession they know they lost the plot, I'd humbly suggest. An ignominious end to a venerable classic.
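Before the housekeeping notes below, one last bit of dBASE II itself: a minimal sketch of the profit-tally Command file teased earlier, using only documented dBASE II constructs. The PRICE, COST, and QTY fields are hypothetical, still sitting on my to-do list:

```
* PROFIT.CMD : walk every record and keep a running profit tally.
* Assumes PRICE, COST, and QTY fields that I have not actually added yet.
USE GAMES
STORE 0 TO TOTPROF
DO WHILE .NOT. EOF
   STORE TOTPROF + (PRICE - COST) * QTY TO TOTPROF
   SKIP
ENDDO
? 'Projected profit on current stock:'
? TOTPROF
RETURN
```

Run it from the dot prompt with DO PROFIT.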
Ways to improve the experience, notable deficiencies, workarounds, and notes about incorporating the software into modern workflows (if possible).

When working with CP/M disk images, get to know cpmtools. This is a set of command line utilities for creating, viewing, and modifying CP/M disk images. The tools mostly align with Unix commands, prefixed with "cpm" (cpmls, cpmcp, cpmrm, and so on). Those are the commands I wound up using with regularity. If your system of choice is a "weirdo system" you may be restricted in your disk image/formatting choices; these instructions may be of limited or no help. cpmtools knows about Kaypro II disk layouts via diskdefs. This Github fork makes it easy to browse supported types. Here's what I did. Now that you can pull data out of CP/M, here's how to make use of it.

Kaypro II emulation running in MAME. Default setup includes: dual floppies, a Z80 CPU at 2.4MHz, and dBASE II v2.4. See "Sharpening the Stone" at the end of this post for how to get this going. Personally, I found this to be a tricky process to learn.

My to-do list after the first pass of data entry: change the width and decimal places of the rating field and add in that data; add pricing fields and related data; add more games.

The extra syntax for development: IF and ELSE allow decision branching. DO WHILE/ENDDO does iterations. WAIT and ACCEPT will grab a character or string from the user. @ ... SAY prints text to screen at a specific character position. PEEK and POKE give control over system memory. CALL will run an assembly routine at a known memory location.

For this article I specifically picked a period-authentic combo of Kaypro II + CP/M 2.2 + dBASE II 2.4. You don't have to suffer my pain! CP/M-86 and dBASE III+ running in a more feature-rich emulator would be a better choice for digging into non-trivial projects. I'm cold on MAME for computer emulation, except in the sense that in this case it was the fastest option for spinning up my chosen tools. It works, and that's all I can say that I enjoyed. That's not nothing! I find I prefer the robust settings offered in products like WinUAE, Virtual ADAM, VICE, and others. Emulators with in-built disk tools are a luxury I have become addicted to. MAME's interface is an inelegant way to manage hardware configurations and disk swapping. MAME has no printer emulation, which I like to use for a more holistic retro computing experience. Getting a working, trouble-free copy of dBASE II onto a Kaypro II compatible disk image was a non-trivial task. It's easier now that I know the situation, but it took some cajoling. I had to create new, blank disks, and copy CP/M and dBASE over from other disk images. Look below under "Getting Your Data into the Real World" to learn about cpmtools and how it fits into the process. Be careful of modern keyboard conventions, especially wanting to hit Esc to cancel commands. In MAME this will hard quit the emulator with no warning!

Exported data exhibited strange artifacts. The big one: it didn't export any "logical" (boolean) field values from my database; it just left that field blank on all records. Field names are not exported. Garbage data was found after the last record; records imported fine.

On Linux and Windows (via WSL), install thusly (the package is typically just called cpmtools):

- cpmls: view the contents of a CP/M disk image. Use the -f flag to tell it the format of the disk; there is a Kaypro II entry in the diskdefs file.
- mkfs.cpm: format a disk image with a CP/M file system.
- cpmcp: copy files to/from another disk image or to the host operating system.
- cpmrm: remove files from a CP/M disk image.
- dd (or any tool that writes out a zero-filled file): for making new, blank disk image files (still needs to be formatted).

The sequence I used, roughly: make a blank disk image to the single-sided, double-density specification; format that blank image for the Kaypro II with mkfs.cpm; copy "DBASE.COM" from the current directory of the host operating system into the Kaypro II disk image with cpmcp; display the contents of the disk with cpmls; and, going the other way, copy "FILE.TXT" from the disk image into the current directory of the host operating system.

dBASE has built-in exporting functionality, so long as you use the right extension when saving (a COPY, in dBASE lingo). That creates a bog-standard ASCII text file, each record on its own line, comma-delimited (and ONLY comma-delimited). It is not Y2K compatible, if you're hoping to record today's date in a field. I tackled this a bit in the Superbase post. It is probably possible to hack up a Command file to work around this issue, since dates are just strings in dBASE.

dBASE II doesn't offer the relational robustness of SQL. Many missing, useful tools could be built in the xBase programming language. It would be significant work in some cases; it may not be worth it, so consider whether you can do without those tools. Your needs may exceed what CP/M-80 hardware can support; its 8-bit nature is a limiting factor in and of itself. If you have big plans, consider dBASE III+ on DOS to stretch your legs. (I read dBASE IV sucks.) The user interface helps at times, and is opaque at other times. This can be part of the fun in using these older systems, mastering esoterica for esoterica's sake, but may be a bridge too far for serious work of real value. Of course, when discussing older machines we are almost always excluding non-English speakers thanks to the limitations of ASCII. The world just wasn't as well-connected at the time.

0 views
Anton Sten Yesterday

Build something silly

Matt Shumer's [Something Big is Happening](https://shumer.dev/something-big-is-happening) has been making the rounds this week. If you haven't read it, it's a 5,000-word letter to non-tech friends and family about what AI is doing to the world right now. Some of it is hyperbolic. Some of it feels like the "learn to code" advice of 2020 — confident about a future that hasn't happened yet. But the core message is right: if you're not experimenting with these tools, you're falling behind. Where I think Shumer's post is most useful is in its advice to non-technical people. Not the doomsday stuff. The practical stuff. Stop treating AI like a search engine. Push it into your actual work. See what happens. I'd take it one step further. Don't just use AI. Build something with it. ## I can't code Let me be clear about something. I'm not a developer. I did some coding early in my career, but that was at the end of the 20th century. We're talking Geocities-era HTML. For the past 25 years, every time I needed something built, I hired someone. That changed last year. I [rebuilt my entire website](https://www.antonsten.com/articles/designers-prompt/) using Cursor and Claude. No developer. Just me, prompting my way through it. It's not rocket science, but it's not nothing either — it's a real site with a blog, newsletter integration, RSS feed, the whole thing. That experience opened a door I didn't expect. ## From $11/month to free I'd been a [Harvest](https://www.getharvest.com/) customer for about ten years. It handled time tracking and invoicing. It was fine. But when I returned to consulting recently, I asked myself a question I'd never considered before: do I actually need this? I tried [Midday.ai](https://midday.ai), which has some clever features. But paying $29/month for someone who sends one or two invoices a month didn't make financial sense. So I did what I think anyone in this position should at least consider doing — I started building my own tool. It wasn't that complicated, mainly because I knew exactly what I needed. Import my clients and invoices from Harvest. Create new invoices connected to a client. Generate PDFs I could send. That's it. No features I'd never use. No settings I had to ignore. Just exactly what I needed. And it was done in less than two days. ## Then something clicked For the first couple of weeks, my tool worked like any other invoicing app. Click to create a client. Click to create a project. Fill out details. Click Save. It followed the same patterns I'd been trained on by a decade of SaaS products. Then it hit me — I was building software that lived by old rules. Rules designed for generic tools that serve thousands of users. But this tool serves exactly one user. Me. So I changed it. Now, instead of manually entering client details, I upload a signed contract and let AI parse it — mapping it to an existing client or creating a new one, extracting the scope, payment terms, duration, everything. It creates my own vault of documents. I added an AI chat where I can ask things like "draft an invoice for unbilled time on Project X" or "what's the total amount invoiced to Client Y this year?" or "what does my availability look like in April?" None of this is rocket science. But it's mine. It does exactly what I need and nothing else. ## This isn't just about me Wall Street has noticed this shift too. A few weeks ago, SaaS stocks lost $285 billion in value after Anthropic released new AI tools. Traders are calling it the "SaaSpocalypse." 
The fear is simple: if people can build their own tools, why would they keep paying for generic ones? That's probably overblown for enterprise software. Nobody's replacing Salesforce with a weekend project. But for individuals and small businesses? The math is changing fast. My friend Elan Miller recently launched a [competitive brand audit tool](https://audit.off-menu.com/) — a way for anyone to analyze how their brand voice compares to competitors. He's a brand strategist, not a developer. A year ago, building something like that would have meant hiring an agency. Now it's something one person can ship. This is where Shumer is right and where it gets exciting. Not the "your job is going to disappear" part. The part where regular people — designers, consultants, strategists, teachers, whoever — can build tools that are perfectly shaped for their specific needs. ## The mindset shift matters more than the tool Here's what I think is actually important about all of this. It's not the invoicing tool. It's not my website. It's the shift in thinking. For decades, the default response to any problem was "what software should I subscribe to?" We browsed Product Hunt. We compared pricing pages. We squeezed our workflows into someone else's idea of how things should work. What if the default question became "could I build something myself?" Not always. Not for everything. But as a first instinct instead of a last resort. That mental shift — from consumer to builder — is what I think people should be practicing right now. And the only way to develop it is to start small. Build something silly. Build a tool that tracks your dog's meals. Build a dashboard for your book club. Build an invoicing tool because you're tired of paying $11/month for features you don't use. The point isn't the tool. The point is the muscle. Once you've built one thing, you start seeing opportunities everywhere. You stop asking "is there an app for that?" and start asking "what if I just made it?" That's the real takeaway from this moment. Not that AI is going to eat the world. But that for the first time, building software isn't reserved for people who know how to code. And the people who figure that out early — not by reading about it, but by doing it — will have a significant advantage. Yuval Noah Harari was asked a few years ago what the most important skill for the coming decades would be. His answer wasn't coding. It wasn't AI literacy. It was adaptability. > "The most important skills for surviving and flourishing in the 21st century are not specific skills. Instead, the really important skill is how to master new skills again and again throughout your life." Building something silly is how you practice that. Not because the tool matters, but because the act of building rewires how you think. You stop being a passive consumer of software and start being someone who shapes their own tools. That's adaptability in action. So build something. It doesn't have to be good. It doesn't have to be useful to anyone but you. Just build it.

0 views
Justin Duke Yesterday

The death of software, the A24 of software

Steven Sinofsky recently published Death of Software. Nah. , arguing via historical case studies that AI will not kill software any more than previous technological shifts killed their respective incumbents. I agree with the headline thesis. But I think his media analogy deserves a sharper look, because it actually complicates his optimism in ways worth taking seriously. He writes that there is "vastly more media today than there was 25 years ago," pointing to streaming as evidence that disruption creates abundance rather than destruction. This is telling, because I agree with both sides of the glass: The shift to streaming has not killed media. But it has, to put it mildly, made the aggregate quality of the product worse, and in doing so shifted the value generated away from creative labor and towards platforms and capital. Warner Bros. is, to hear some people say it, the last great conventional studio producing consistently risky and high-quality work that advances the medium forward; Netflix, Apple, et al do put out some extremely great stuff, but the vast majority of their budget goes to things like Red Notice — films designed with their audiences' revealed preferences (i.e., browsing their phone while the film is on) in mind. And yet! The greatest studio of the past decade was also a studio founded in, essentially, the past decade — A24, in 2012. I think it's uncontroversial to say that no other studio has had a higher batting average, and they've done it the right way: very pro-auteur, very fiscally disciplined, focusing more on an overall portfolio brand and strong relationships than the need for Yet Another Tentpole Franchise. A24 didn't succeed despite the streaming era — they succeeded because of it. The explosion of mediocre content created a vacuum for taste, for curation, for a brand that stood for something. When everything is abundant and most of it is forgettable, the scarce thing is discernment . The interesting question isn't "will there be more software?" — it's who captures the value, and what excellence looks like in a world of abundance. (Kicker: A24 just took a round of additional funding from Thrive Capital last year. The market, it seems, agrees.) There will be more software, not less, in the future. The quality of that software — as defined by the heuristics of yesteryear — will be lower.

1 views

The evolution of OpenAI's mission statement

As a USA 501(c)(3) the OpenAI non-profit has to file a tax return each year with the IRS. One of the required fields on that tax return is to "Briefly describe the organization’s mission or most significant activities" - this has actual legal weight to it as the IRS can use it to evaluate if the organization is sticking to its mission and deserves to maintain its non-profit tax-exempt status. You can browse OpenAI's tax filings by year on ProPublica's excellent Nonprofit Explorer . I went through and extracted that mission statement for 2016 through 2024, then had Claude Code help me fake the commit dates to turn it into a git repository and share that as a Gist - which means that Gist's revisions page shows every edit they've made since they started filing their taxes! It's really interesting seeing what they've changed over time. The original 2016 mission reads as follows (and yes, the apostrophe in "OpenAIs" is missing in the original ): OpenAIs goal is to advance digital intelligence in the way that is most likely to benefit humanity as a whole, unconstrained by a need to generate financial return. We think that artificial intelligence technology will help shape the 21st century, and we want to help the world build safe AI technology and ensure that AI's benefits are as widely and evenly distributed as possible. Were trying to build AI as part of a larger community, and we want to openly share our plans and capabilities along the way. In 2018 they dropped the part about "trying to build AI as part of a larger community, and we want to openly share our plans and capabilities along the way." In 2020 they dropped the words "as a whole" from "benefit humanity as a whole". They're still "unconstrained by a need to generate financial return" though. Some interesting changes in 2021. They're still unconstrained by a need to generate financial return, but here we have the first reference to "general-purpose artificial intelligence" (replacing "digital intelligence"). They're more confident too: it's not "most likely to benefit humanity", it's just "benefits humanity". They previously wanted to "help the world build safe AI technology", but now they're going to do that themselves: "the companys goal is to develop and responsibly deploy safe AI technology". 2022 only changed one significant word: they added "safely" to "build ... (AI) that safely benefits humanity". They're still unconstrained by those financial returns! No changes in 2023... but then in 2024 they deleted almost the entire thing, reducing it to simply: OpenAIs mission is to ensure that artificial general intelligence benefits all of humanity. They've expanded "humanity" to "all of humanity", but there's no mention of safety any more and I guess they can finally start focusing on that need to generate financial returns! Update : I found loosely equivalent but much less interesting documents from Anthropic . You are only seeing the long-form articles from my blog. Subscribe to /atom/everything/ to get all of my posts, or take a look at my other subscription options .

0 views
fLaMEd fury Yesterday

The Guestbook Is Back

What’s going on, Internet? Guestbooks are one of my favourite relics of the old web. My old guestbook stopped working after the database behind it shut down, and I’ve been meaning to bring it back ever since. Well, it’s finally here. The new guestbook is powered by webweav.ing , built by yequari and available to 32-Bit Cafe members. It provides web components that handle the form and comments, making it easy to drop into any site. If you’re a member, I’d recommend checking it out. Go ahead and sign the guestbook . Say what’s up. Hey, thanks for reading this post in your feed reader! Want to chat? Reply by email or add me on XMPP , or send a webmention . Check out the posts archive on the website.

0 views

Is the Mac having a BMW’s Neue Klasse moment?

In the last couple of months, we have seen plenty of rants, reports, analyses, and other exposés about the state of Apple software, whether it is about their bad icon design, bad icon implementation, neglect, more neglect, or plain worrisome trends. The most damning thing of all? All of these complaints are valid at the same time, and, coming from Mac enthusiasts and connoisseurs, they carry a lot of weight. This collective reaction is strong because Apple is not a brand usually associated with poor quality, odd design choices, or a lack of attention to detail. It is particularly notable on the Mac, arguably the most prominent Apple software product when it comes to enthusiasm about the brand and what it stands for. Today, some Apple observers and critics are almost in shock at how fast things went bad. There were warning signs before, but the core foundations of what makes the Mac a great computing platform didn't seem threatened. The problems seemed limited to a few bugs and side apps that were quickly filed under mishaps, and the growing popularity of non-native apps that ignore Mac conventions. Now, even MacOS itself is plagued with symptoms of the "unrefined" disease. Is MacOS becoming another Windows? A couple of years ago, circa 2021, I was using a Windows computer for work. It was fine. Not great, not bad, it was just OK. Most of the tools I have to use at work live in the browser, and I managed to find peace with the few apps I was using, most of them Electron-based, like Obsidian. When I eventually got an M1 MacBook Air as a replacement, it was a breath of fresh air. Not because I've been a Mac user since 2006, but because the Mac is not fine or just OK: it's great. Mac apps, the "real" Mac apps, are indeed very good. They feel like part of the system, whereas on Windows it's hard to distinguish between a web-wrapped app and a native app. They all feel the same. Ty Bolt said it best writing about Panic's Nova (emphasis mine): Nova is one of the best pieces of software I've ever used. It's refined and polished and there's no equivalent on Linux and Windows. It has its own personality, but also feels like an extension of the operating system. Which is a hallmark of a great Mac app. Folks in the community call them Mac-assed Mac apps. These apps are what make MacOS really great. The best apps I have used are all Mac apps. For me, this quote is what the Mac is all about. But with all the current issues documented on MacOS Tahoe, it is not as easy to look down on Windows as it once was. For users like me, who appreciate a certain level of precision and craftsmanship in software and love Apple because of that — especially the Mac — this trend is worrisome. We know that Apple is not going away, but the Apple we love seems distracted. We worry that the Mac won't ever feel like the Mac we love today again. We worry that our habits, our taste, and our commitments to a platform will become pointless and dépassés. We worry because there is not a proper alternative to the Mac environment. Users with a different set of tastes, values, and habits, users who may use a Mac for its best-in-class chips, but not for its software, won't understand. Some users who already use and love Linux or Windows (and easily switch between the two), for their set of tastes, values, and habits, won't understand. Users who use a Mac just to live inside a Chrome/Electron landscape of apps won't understand. This period of neglect may be over soon. It may go on for another few years.
It may also be all downhill from here. We just don't know. We have to wait, we have to hope, and we have to continue pointing out what feels off about the platform we love. The most cynical will point to the obvious, saying that Mac enthusiasts are not where the money is these days for Apple. This would explain a lot, and it's very tempting to think that way. But I thought of something that may sound like wishful thinking: What if Apple is having its own BMW-Neue-Klasse moment? For BMW, Neue Klasse is the name of their brand reset, their upcoming generation of cars, from the design language to the production platform to the actual vehicle models. It was announced a few years ago, in the midst of the transition to the electric-first era. For BMW, this meant reaffirming the brand, getting back to its roots , and embracing what makes BMW a well-loved and praised car manufacturer. This kind of transition takes a lot of time, effort, and money. Between the announcement and today, brand enthusiasts and critics have perceived a regression in quality and finish , and have felt that the brand has lost touch with its premium foundations and with what makes them love it in the first place. Optimists and apologists will explain this by saying that BMW has put all their best talents and resources towards the Neue Klasse. They will tell you that the current line of models and its related perceived-quality issues are temporary while they reallocated some of their best teams , a necessary low to set things anew, with the upcoming generation of vehicles. As far as I can understand, the reasoning is that BMW knew it had enough brand capital to absorb a few awkward design cycles and perceived drops in interior quality. They surfed on their existing reputation while spending a lot of resources on a platform reset, hoping for a smooth transition. It may hurt them a little , but they considered it a small price to pay to be able to embrace this new era confidently, and regain what was lost. I want to imagine that the same thing is happening at Apple. What if the last couple of years were a transition for Apple? Unlike BMW, Apple would not share their own Neue Klasse vision: they would just unveil it when it’s ready and keep it a secret until then. Meanwhile, their best engineers, designers, and product people are reassigned and working hard on a new generation of MacOS, something that is a big step forward. Maybe Apple thinks that, for the current lineup, helped by the greatest hardware the Mac ever had, the limited resources and ongoing problems are an acceptable compromise, for now. * 1 Mark Gurman would probably have shared the scoop if that were what was really happening, but I’ll keep hoping this “Mac reset” is actually happening and good (and not a failed renaissance). After all, the Neue Klasse era could end up being a disaster, and the worrying signs we’re seeing are actually just the beginning of the end. For Apple, if we are indeed witnessing the first signs of a company that has lost its touch, if we are already at a point of no return when it comes to MacOS quality, the potential downfall won’t be nearly as consequential as it could be for BMW. Apple could lose money for decades and still be one of the richest companies in the world. Without the Mac (just 6% of revenue ), Apple would post similar financial reports for years to come. * 2 For the Mac enthusiasts like myself, there are only three upcoming scenarios in my mind right now. 
One, the Mac we love returns, either in its current form or as a "new class" of Mac (MacOS XX?), and all of this will just be a bad memory. Two, the Mac keeps getting worse and worse to the point of driving long-time users away, and it ends up getting replaced with yet another version of iOS on MacBooks. Three, all operating systems end up being background tasks in the A.I. era anyway, and Apple knows this and doesn't bother anymore.
^ This is maybe what happened back in the butterfly keyboard era: Apple were working on the Apple-silicon Macs, and focused most of their resources towards that, hence the Mac computers of that era being underserved. I am clearly speculating, but you get my point.
^ I wonder what part of that 6% the Mac enthusiasts are responsible for. Maybe 5%? 10%? I'm pretty sure most of the Mac revenue comes from users who won't pay attention to all of this.

1 views

Premium: The AI Data Center Financial Crisis

Since the beginning of 2023, big tech has spent over $814 billion in capital expenditures, with a large portion of that going towards meeting the demands of AI companies like OpenAI and Anthropic. Big tech has spent big on GPUs, power infrastructure, and data center construction, using a variety of financing methods to do so, including (but not limited to) leasing. And the way they're going about structuring these finance deals is growing increasingly bizarre. I'm not merely talking about Meta's curious arrangement for its facility in Louisiana, though that certainly raised some eyebrows. Last year, Morgan Stanley published a report that claimed hyperscalers were increasingly relying on finance leases to obtain the "powered shell" of a data center, rather than the more common method of operating leases. The key difference here is that finance leases, unlike operating leases, are effectively long-term loans where the borrower is expected to retain ownership of the asset (whether that be a GPU or a building) at the end of the contract. Traditionally, these types of arrangements have been used to finance the bits of a data center that have a comparatively limited useful life — like computer hardware, which grows obsolete with time. The spending to date is, as I've written about again and again, an astronomical amount of spending considering the lack of meaningful revenue from generative AI. A year straight of manufacturing consent for Claude Code as the be-all and end-all of software development resulted in putrid results for Anthropic — $4.5 billion of revenue and $5.2 billion of losses before interest, taxes, depreciation and amortization according to The Information — with (per WIRED) Claude Code only accounting for around $1.1 billion in annualized revenue in December, or around $92 million in monthly revenue. This was in a year where Anthropic raised a total of $16.5 billion (with $13 billion of that coming in September 2025), and it's already working on raising another $25 billion. This might be because it promised to buy $21 billion of Google TPUs from Broadcom, or because Anthropic expects AI model training to cost over $100 billion in the next 3 years. And it just raised another $30 billion — albeit with the caveat that some of said $30 billion came from previously-announced funding agreements with Nvidia and Microsoft, though how much remains a mystery. According to Anthropic's new funding announcement, Claude Code's run rate has grown to "over $2.5 billion" as of February 12, 2026 — or around $208 million a month. Based on literally every bit of reporting about Anthropic, costs have likely spiked along with revenue, which hit $14 billion annualized ($1.16 billion in a month) as of that date. I have my doubts, but let's put them aside for now. Anthropic is also in the midst of one of the most aggressive and dishonest public relations campaigns in history. While its Chief Commercial Officer Paul Smith told CNBC that it was "focused on growing revenue" rather than "spending money," it's currently making massive promises — tens of billions on Google Cloud, "$50 billion in American AI infrastructure," and $30 billion on Azure.
And despite Smith saying that Anthropic was less interested in "flashy headlines," Chief Executive Dario Amodei has said, in the last three weeks, that "almost unimaginable power is potentially imminent," that AI could replace all software engineers in the next 6-12 months, that AI may (it's always fucking may) cause "unusually painful disruption to jobs," and has written a 19,000-word essay — I guess AI is coming for my job after all! — where he repeated his noxious line that "we will likely get a century of scientific and economic progress compressed in a decade." Yet arguably the most dishonest part is this word "training." When you read "training," you're meant to think "oh, it's training for something, this is an R&D cost," when "training LLMs" is as consistent a cost as inference (the creation of the output) or any other kind of maintenance. While most people know about pretraining — the shoving of large amounts of data into a model (this is a simplification, I realize) — in reality a lot of the current spate of models use post-training, which covers everything from small tweaks to model behavior to full-blown reinforcement learning where experts reward or punish particular responses to prompts. To be clear, all of this is well-known and documented, but the nomenclature of "training" suggests that it might stop one day, versus the truth: training costs are increasing dramatically, and "training" covers anything from training new models to bug fixes on existing ones. And, more fundamentally, it's an ongoing cost — something that's an essential and unavoidable cost of doing business. Training is, for an AI lab like OpenAI and Anthropic, as common (and necessary) a cost as those associated with creating outputs (inference), yet it's kept entirely out of gross margins. This is inherently deceptive. While one would argue that R&D is not considered in gross margins, training isn't R&D — gross margins generally include the raw materials necessary to build something, and training is absolutely part of the raw costs of running an AI model. Direct labor and parts are considered part of the calculation of gross margin, and spending on training — both the data and the process of training itself — is absolutely meaningful, and to leave it out is an act of deception. Anthropic's 2025 gross margins were 40% — or 38% if you include free users of Claude — on inference costs of $2.7 (or $2.79) billion, with training costs of around $4.1 billion. What happens if you add training costs into the equation? Let's work it out! If Anthropic's gross margin was 38% in 2025, that means its COGS (cost of goods sold) was $2.79 billion. If we add training, this brings COGS to $6.89 billion, leaving us with -$2.39 billion after $4.5 billion in revenue. This results in a negative 53% gross margin. Training is not an up-front cost, and considering it one only serves to help Anthropic cover for its wretched business model. Anthropic (like OpenAI) can never stop training, ever, and to pretend otherwise is misleading. This is not the cost just to "train new models" but to maintain current ones, build new products around them, and many other things that are direct, impossible-to-avoid components of COGS. They're manufacturing costs, plain and simple. Anthropic projects to spend $100 billion on training in the next three years, which suggests it will spend — proportional to its current costs — around $32 billion on inference in the same period, on top of $21 billion of TPU purchases, on top of $30 billion on Azure (I assume in that period?), on top of "tens of billions" on Google Cloud. When you actually add these numbers together (assuming "tens of billions" is $15 billion), that's $200 billion.
Anthropic (per The Information’s reporting) tells investors it will make $18 billion in revenue in 2026 and $55 billion in 2027 (roughly fourfold and threefold year-over-year jumps), and is already raising $25 billion after having just closed a $30 billion deal. How does Anthropic pay its bills? Why does outlet after outlet print these fantastical numbers without doing the maths of “how does Anthropic actually get all this money?” Because even with their ridiculous revenue projections, this company is still burning cash, and when you start to actually do the maths around anything in the AI industry, things become genuinely worrying. You see, every single generative AI company is unprofitable, and appears to be getting less profitable over time. Both The Information and the Wall Street Journal reported the same bizarre statement in November — that Anthropic would “turn a profit more quickly than OpenAI,” with The Information saying Anthropic would be cash flow positive in 2027 and the Journal putting the date at 2028, only for The Information to report in January that 2028 was the more realistic figure. If you’re wondering how, the answer is “Anthropic will magically become cash flow positive in 2028.” This is also the exact same logic as OpenAI, which will, per The Information in September, also, somehow, magically turn cash flow positive in 2030. Oracle, which has a 5-year-long, $300 billion compute deal with OpenAI that it lacks the capacity to serve and that OpenAI lacks the cash to pay for, also appears to have the same magical plan to become cash flow positive in 2029. Somehow, Oracle’s case is the most legit, in that theoretically at that time it would be done, I assume, paying the $38 billion it’s raising for Stargate Shackelford and Wisconsin, but said assumption also hinges on the idea that OpenAI finds $300 billion somehow. It also relies upon Oracle raising more debt than it currently has — which, even before the AI hype cycle swept over the company, was a lot. As I discussed a few weeks ago in the Hater’s Guide To Oracle, a megawatt of data center IT load generally costs (per Jerome Darling of TD Cowen) around $12-14 million in construction (likely more due to skilled labor shortages, supply constraints and rising equipment prices) and another $30 million in GPUs and associated hardware. In plain terms, Oracle (and its associated partners) need around $189 billion to build the 4.5GW of Stargate capacity necessary to make the revenue from the OpenAI deal, meaning that it needs around another $100 billion once it raises $50 billion in combined debt, bonds, and newly printed shares by the end of 2026. I will admit I feel a little crazy writing this all out, because it’s somehow a fringe belief to do the very basic maths and say “hey, Oracle doesn’t have the capacity and OpenAI doesn’t have the money.” In fact, nobody seems to want to really talk about the cost of AI, because it’s much easier to say “I’m not a numbers person” or “they’ll work it out.” This is why in today’s newsletter I am going to lay out the stark reality of the AI bubble, and debut a model I’ve created to measure the actual, real costs of an AI data center. While my methodology is complex, my conclusions are simple: running AI data centers is, even when you set aside the debt required to stand them up, a mediocre business that is vulnerable to basically any change in circumstances.
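For anyone who wants to check that build-out figure, here is a quick sketch of the maths using only the per-megawatt costs quoted above, with linear scaling as the one simplifying assumption:

```python
# Back-of-the-envelope: 4.5GW of Stargate capacity at the quoted per-MW costs.
# Assumes costs scale linearly; ignores financing, land, and delays.
CAPACITY_MW = 4_500                  # 4.5GW of IT load
CONSTRUCTION_PER_MW = (12e6, 14e6)   # $12-14M per MW (TD Cowen, likely higher in practice)
HARDWARE_PER_MW = 30e6               # $30M per MW in GPUs and associated hardware
PLANNED_RAISE = 50e9                 # debt, bonds, and new shares by end of 2026

for construction in CONSTRUCTION_PER_MW:
    total = CAPACITY_MW * (construction + HARDWARE_PER_MW)
    print(f"construction at ${construction / 1e6:.0f}M/MW -> "
          f"total ${total / 1e9:.0f}B, gap after the raise ${(total - PLANNED_RAISE) / 1e9:.0f}B")
# The low end reproduces the ~$189B figure above; the essay pegs what Oracle
# still needs after the $50B raise at around another $100 billion.
```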
Based on hours of discussions with data center professionals, analysts and economists, I have calculated that in most cases, the average AI data center has gross margins of somewhere between 30% and 40% — margins that decay rapidly for every day, week, or month it takes to put a data center into operation. This is why Oracle has negative 100% margins on NVIDIA’s GB200 chips — because the burdensome up-front cost of building AI data centers (the GPUs, servers, and other associated hardware) leaves you billions of dollars in the hole before you even start serving compute, after which you’re left to contend with taxes, depreciation, financing, and the cost of actually powering the hardware. Yet things sour further when you face the actual financial realities of these deals — and the debt associated with them. Based on my current model of the 1GW Stargate Abilene data center, Oracle likely plans to make around $11 billion in revenue a year from the 1.2GW campus (or around 880MW of critical IT). While that sounds good, when you add things like depreciation, electricity, colocation costs of $1 billion a year from Crusoe, opex, and the myriad of other costs, its margins sit at a stinkerific 27.2% — and that’s assuming OpenAI actually pays, on time, in a reliable way. Things only get worse when you factor in the cost of debt. While Oracle has funded Abilene using a mixture of bonds and existing cashflow, it very clearly has yet to receive the majority of the $25 billion+ in GPUs and associated hardware (with only 96,000 GPUs “delivered”), meaning that it likely bought them out of its $18 billion bond sale from last September. If that maths holds, Oracle is paying a little less than $963 million a year (per the terms of the bond sale) whether or not a single GPU is even turned on, leaving us with a net margin of 22.19%... and this is assuming OpenAI pays every single bill, every single time, and there are absolutely no delays. These delays are also very, very expensive. Based on my model, if we assume that 100MW of critical IT load is operational (roughly two buildings and 100,000 GB200s) but has yet to start generating revenue, Oracle is burning around $4.69 million a day in cash, excluding depreciation (EDITOR’S NOTE: sorry! This previously said depreciation was a cash expense and was included in this number, even though it wasn’t; it’s correct in the model!). I have also confirmed with sources in Abilene that there is no chance that Stargate Abilene is fully operational in 2026. I will admit I’m quite disappointed that the media at large has mostly ignored this story. Limp, cautious “are we in an AI bubble?” conversations are insufficient to deal with the potential for collapse we’re facing. Today, I’m going to dig into the reality of the costs of AI, and explain in gruesome detail exactly how easily these data centers can rapidly approach insolvency in the event that their tenants fail to pay. The chain of pain is real, and today I’m going to explain how easily it breaks. If Anthropic’s gross margin was 38% in 2025, that means its COGS (cost of goods sold) was $2.79 billion. If we add training, this brings COGS to $6.89 billion, leaving us with -$2.39 billion in gross profit on $4.5 billion in revenue. This results in a negative 53% gross margin. AI startups are all unprofitable, and do not appear to have a path to sustainability.
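That negative 53% figure is easy to verify. Here is a quick sketch using only the numbers already cited above — 2025 revenue, the inference costs implied by a 38% gross margin, and the estimated training spend:

```python
# Anthropic's 2025 gross margin, with and without training counted as COGS.
# All inputs are the figures cited above; nothing here is new data.
revenue = 4.5e9          # 2025 revenue per The Information
inference_cogs = 2.79e9  # inference costs implied by the 38% gross margin
training = 4.1e9         # estimated 2025 training costs

def gross_margin(revenue: float, cogs: float) -> float:
    """Standard gross margin: (revenue - COGS) / revenue."""
    return (revenue - cogs) / revenue

print(f"excluding training: {gross_margin(revenue, inference_cogs):.0%}")             # 38%
print(f"including training: {gross_margin(revenue, inference_cogs + training):.0%}")  # -53%
```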
AI data centers are being built in anticipation of demand that doesn’t exist, and will only exist if AI startups — which are all unprofitable — can afford to pay them. Oracle, which has committed to building 4.5GW of data centers, is burning cash every day that OpenAI takes to set up its GPUs, and when it starts making money, it does so from a starting position of billions and billions of dollars in debt. Margins are low throughout the entire stack of AI data center operators — from landlords like Applied Digital to compute providers like CoreWeave — thanks to the billions in debt necessary to fund both construction and the IT hardware that makes them run, putting both landlords and compute providers in a hole that can only be filled with revenues that come from either hyperscalers or AI startups. In a very real sense, the AI compute industry is dependent on AI “working out,” because if it doesn’t, every single one of these data centers will become a burning hole in the ground.

0 views
Stratechery 2 days ago

2026.07: Aggregators and AI

Welcome back to This Week in Stratechery! As a reminder, each week, every Friday, we’re sending out this overview of content in the Stratechery bundle; highlighted links are free for everyone. Additionally, you have complete control over what we send to you. If you don’t want to receive This Week in Stratechery emails (there is no podcast), please uncheck the box in your delivery settings. On that note, here were a few of our favorites this week. This week’s Stratechery video is on Microsoft and Software Survival. Individualization at Scale. Spotify had a fantastic result in its quarterly earnings, but I thought the earnings call commentary — it was former CEO and founder Daniel Ek’s last — was worth a deeper examination into the reality of modern network companies. Spotify is a music streaming service that everyone is familiar with, but the actual experience of every Spotify user is unique. That’s a feature of every monolithic network-effects company, and explains why such companies are poised to be big winners from AI. — Ben Thompson CapEx Explosions and Distinctions. The most amazing thing I read this week was a note from a Sharp Tech emailer who observed that the combined projected capital expenditures of Amazon, Google and Meta in 2026 — more than $700 billion — net out to nearly two-thirds of the annual budget for the U.S. Department of Defense. Where is that money going, and how scared should investors be? The answer there varies. On Monday, Ben explained why Google’s plans make perfect sense given the nature of the business and their results over the past few years. Tuesday’s Update told a different story about Amazon: their spending is defensible, but shareholders who are anxious have a few good reasons to be. — Andrew Sharp The Interviewer Becomes the Interviewee. As Ben’s friend and colleague, I loved the inversion we got from this week’s Stratechery Interview, cross-published from Stripe President John Collison’s Cheeky Pint podcast. This was Collison interviewing Ben, talking about everything from the difference between pre- and post-smartphone Japan, to Meta’s allergy to advertising evangelism, to why Ben doesn’t cover TikTok as much, to Stratechery’s business model (which was in part enabled by Stripe). The 90-minute conversation was a delight and is also available to watch on YouTube, if you’d like to see what Stripe’s on-premise pub looks like and get pretty jealous. — AS Google Earnings, Google Cloud Crushes, Search Advertising and LLMs — Google announced a massive increase in CapEx that blew away expectations; the company’s earnings results explain why the increase is justified. Amazon Earnings, CapEx Concerns, Commodity AI — Amazon’s massive CapEx increase makes me much more nervous than Google’s, but it is understandable. Spotify Earnings, Individualized Networks, AI and Aggregation — Spotify’s nature as a content network means that AI is a sustaining technology, particularly because they have the right business model in place. An Interview with Ben Thompson by John Collison on the Cheeky Pint Podcast — An interview with me by John Collison on the Cheeky Pint podcast about AI, ads, and the history of Stratechery. Takaichi, Tanking and Legalization Lessons — On landslide elections in Japan, fixing a mess in the NBA, and a defining political challenge for the next generation in the United States.
Ferrari Luce
Apple Losing Control
The Great Golden Age of Antibiotics
The Epochal Ultra-Supercritical Steam Turbine
Pending Taiwan Arms Sales; Jimmy Lai Sentenced; Takaichi Secures a Supermajority; AI Models as Propaganda Vectors
Radical Transparency Roundup, The Tanking Race Gets Disgusting, Giannis Takes a Share in Kalshi
Spotify Spreads Its Wings, CapEx Explosions and Distinctions, Q&A on Viral AI Tweets, Anthropic, Giannis

1 views
neilzone 2 days ago

Moving away from Nextcloud

I have used Nextcloud for a long time. In fact, I have used Nextcloud from before it was Nextcloud - before the fork of Owncloud. And while I have only used a few of its features - sync, calendar, and contacts - I’ve been a very happy user for a long time. Until a year or so ago, at least. I’ve had a worry, at the back of my mind, for a while, that Nextcloud is trying to do too much. A collaborative document editor. An email client. A voice/video conferencing tool, and so on. I’m sure that, in some contexts, this is amazing, and convenient. For me, as someone who typically prefers a piece of software to do one thing well, it left me a bit uneasy. But that was not, in itself, enough of a reason for me to switch. A year or so ago, I had problem after problem keeping files in sync. I routinely got error messages about the database (or files; I don’t quite remember) being locked. And, for me, that was the mainstay of Nextcloud, and indeed the reason why I started to use it in the first place. I tried all sorts of things, including setting up redis, and trying other memcache options, even though I am the only regular user. I could not get it to sync reliably. And I really did try, using the voluminous logs to try to determine what was going wrong. But I failed. And so I started considering other options. Did I actually need Nextcloud at all? I’ve moved to Syncthing for syncing, and so far, that has been working fine. It is fast, and appears to be reliable. I should probably write about it at some point. Using Nextcloud to sync photos from my phone was not too bad, but from Sandra’s iPhone, it did not work well. I have switched to Immich for photo sync / gallery, and I’ve been very happy with it. For contacts and calendar sync - DAV - I am using Radicale. The main annoyance is that Sandra cannot invite me (or anyone) to appointments using the iOS or macOS calendar. For me, I’ve just given Sandra write access to my calendar, so that she can add events directly, but it is far from ideal. I’ve tried using Radicale’s server-side email functionality, and that is not suitable for my needs, as it sends out far too many emails. But, for now, Radicale is tolerable, even if I might try to find another option at some point. And that just leaves the directories which I share via Nextcloud and mount in my file browser. Stuff that I don’t need on my computer, but still want to access. For that, I’m going back to samba. It works. And so, once I’ve finalised this and tested it and given it some time to bed in, I will turn off the Nextcloud server.

0 views
Ankur Sethi 2 days ago

I used a local LLM to analyze my journal entries

In 2025, I wrote 162 journal entries totaling 193,761 words. In December, as the year came to a close and I found myself in a reflective mood, I wondered if I could use an LLM to comb through these entries and extract useful insights. I’d had good luck extracting structured data from web pages using Claude, so I knew this was a task LLMs were good at. But there was a problem: I write about sensitive topics in my journal entries, and I don’t want to share them with the big LLM providers. Most of them have at least a thirty-day data retention policy, even if you call their models using their APIs, and that makes me uncomfortable. Worse, all of them have safety and abuse detection systems that get triggered if you talk about certain mental health issues. This can lead to account bans or human review of your conversations. I didn’t want my account to get banned, and the very idea of a stranger across the world reading my journal mortifies me. So I decided to use a local LLM running on my MacBook for this experiment. Writing the code was surprisingly easy. It took me a few evenings of work—and a lot of yelling at Claude Code—to build a pipeline of Python scripts that would extract structured JSON from my journal entries. I then turned that data into boring-but-serviceable visualizations. This was a fun side-project, but the data I extracted didn’t quite lead me to any new insights. That’s why I consider this a failed experiment. The output of my pipeline only confirmed what I already knew about my year. Besides, I didn’t have the hardware to run the larger models, so some of the more interesting analyses I wanted to run were plagued with hallucinations. Despite how it turned out, I’m writing about this experiment because I want to try it again in December 2026. I’m hoping I won’t repeat my mistakes. Selfishly, I’m also hoping that somebody who knows how to use LLMs for data extraction tasks will find this article and suggest improvements to my workflow. I’ve pushed my data extraction and visualization scripts to GitHub. It’s mostly LLM-generated slop, but it works. The most interesting and useful parts are probably the prompts. Now let’s look at some graphs. I ran 12 different analyses on my journal, but I’m only including the output from 6 of them here. Most of the others produced nonsensical results or were difficult to visualize. For privacy, I’m not using any real names in these graphs. The graphs cover: how I divided time between my hobbies through the year; my most mentioned hobbies; the media I engaged with (there isn’t a lot of data for that one); how many mental health issues I complained about each day across the year; how many physical health issues I complained about each day across the year; the big events of 2025; the communities I spent most of my time with; and the top mentioned people throughout the year. I ran all these analyses on my MacBook Pro with an M4 Pro and 48GB RAM. This hardware can just barely manage to run some of the more useful open-weights models, as long as I don’t run anything else. For running the models, I used Apple’s package. Picking a model took me longer than putting together the data extraction scripts. People on /r/LocalLlama had a lot of strong opinions, but there was no clear “best” model when I ran this experiment. I just had to try out a bunch of them and evaluate their outputs myself. If I had more time and faster hardware, I might have looked into building a small-scale LLM eval for this task.
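To make the setup concrete, here is a rough sketch of what a single-entry extraction call can look like, assuming the Apple package in question is MLX and its mlx-lm library; the model repo, prompt wording, and JSON cleanup below are illustrative stand-ins, not the actual scripts from the GitHub repository mentioned above.

```python
# Sketch: load a quantized model with mlx-lm and run one extraction prompt.
# The model repo and prompt wording are placeholders.
import json
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Qwen2.5-32B-Instruct-8bit")  # placeholder quantization

def extract(entry_text: str, task_prompt: str) -> dict:
    messages = [{
        "role": "user",
        "content": f"{task_prompt}\nRespond with raw JSON only.\n\nJournal entry:\n{entry_text}",
    }]
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
    raw = generate(model, tokenizer, prompt=prompt, max_tokens=512)
    # Some models still wrap the output in a Markdown fence; strip it before parsing.
    cleaned = raw.strip().strip("`").removeprefix("json").strip()
    return json.loads(cleaned)
```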
In the end, for this scenario, I just picked a few popular models, ran them on a subset of my journal entries, and picked one based on vibes. This project finally gave me an excuse to learn all the technical terms around LLMs. What’s quantization? What does the number of parameters do? What do the various suffixes in a model’s name mean? What is a reasoning model? What’s MoE? What are active parameters? This was fun, even if my knowledge will be obsolete in six months. In the beginning, I ran all my scripts with Qwen 2.5 Instruct 32b at 8-bit quantization as the model. This fit in my RAM with just enough room left over for a browser, text editor, and terminal. But Qwen 2.5 didn’t produce the best output and hallucinated quite a bit, so I ran my final analyses using Llama-3.3 70B Instruct at 3-bit quantization. This could just about fit in my RAM if I quit every other app and increased the amount of GPU RAM a process was allowed to use. While quickly iterating on my Python code, I used a tiny model: Qwen 3 4b Instruct quantized to 4 bits. A major reason this experiment didn’t yield useful insights was that I didn’t know what questions to ask the LLM. I couldn’t do a qualitative analysis of my writing—the kind of analysis a therapist might be able to do—because I’m not a trained psychologist. Even if I could figure out the right prompts, I wouldn’t want to do this kind of work with an LLM. The potential for harm is too great, and the cost of mistakes is too high. With a few exceptions, I limited myself to extracting quantitative data only. From each journal entry, I extracted: a list of things I was grateful for, if any; hobbies or side-projects mentioned; locations mentioned; media mentioned (including books, movies, games, or music); a boolean answer to whether it was a good or bad day for my mental health; mental health issues mentioned, if any; a boolean answer to whether it was a good or bad day for my physical health; physical health issues mentioned, if any; things I was proud of, if any; social activities mentioned; travel destinations mentioned, if any; friends, family members, or acquaintances mentioned; and new people I met that day, if any. None of the models was as accurate as I had hoped at extracting this data. In many cases, I noticed hallucinations and examples from my system prompt leaking into the output, which I had to clean up afterwards. Qwen 2.5 was particularly susceptible to this. Some of the analyses (e.g. the list of new people I met) produced nonsensical results, but that wasn’t really the fault of the models. They were all operating on a single journal entry at a time, so they had no sense of the larger context of my life. I couldn’t run all my journal entries through the LLM at once. I didn’t have that kind of RAM and the models didn’t have that kind of context window. I had to run the analysis one journal entry at a time. Even then, my computer choked on some of the larger entries, and I had to write my scripts in a way that I could run partial analyses or continue failed analyses. Trying to extract all the information listed above in one pass produced low-quality output. I had to split my analysis into multiple prompts and run them one at a time. Surprisingly, none of the models I tried had an issue with the instruction to respond with raw JSON only. Even the really tiny models had no problems following it. Some of them occasionally threw in a Markdown fenced code block, but it was easy enough to strip using a regex. My prompts were divided into two parts: a “core” prompt that was common across analyses, and task-specific prompts for each analysis. The task-specific prompts included detailed instructions and examples that made the structure of the JSON output clear. Every model followed the JSON schema mentioned in the prompt, and I rarely ever ran into JSON parsing issues. But the one issue I never managed to fix was the examples from the prompts leaking into the extracted output. Every model insisted that I had “dinner with Sarah” several times last year, even though I don’t know anybody by that name. This name came from an example that formed part of one of my prompts.
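A rough sketch of that cleanup step, with the example name, task key, and file layout as placeholders rather than anything taken from the real scripts:

```python
# Sketch: drop terms that only exist in the prompts' few-shot examples,
# then tally what's left across every entry's output file.
import json
from collections import Counter
from pathlib import Path

PROMPT_EXAMPLE_TERMS = {"Sarah"}  # names invented purely for the prompt examples

def tally_people(output_dir: Path) -> Counter:
    counts: Counter = Counter()
    for path in sorted(output_dir.glob("people-*.json")):
        people = json.loads(path.read_text())  # one list of names per journal entry
        counts.update(p for p in people if p not in PROMPT_EXAMPLE_TERMS)
    return counts

# e.g. tally_people(Path("output")).most_common(10) -> top mentioned people
```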
I just had to make sure the examples I used stood out—e.g., using names of people I didn’t know at all or movies I hadn’t watched—so I could filter them out using plain old Python code afterwards. My core prompt, and the task-specific prompts I appended to it (such as the one for extracting health issues mentioned in an entry), aren’t reproduced here; you can find all the prompts in the GitHub repository. The collected output from all the entries was a long list of per-entry JSON records. Since my model could only look at one journal entry at a time, it would sometimes refer to the same health issue, gratitude item, location, or travel destination using different synonyms. For example, “exhaustion” and “fatigue” should refer to the same health issue, but they would appear in the output as two different issues. My first attempt at de-duplicating these synonyms was to keep a running tally of unique terms discovered during each analysis and append them to the end of the prompt for each subsequent entry. But this quickly led to some really strange hallucinations. I still don’t understand why. This list of terms wasn’t even that long, maybe 15-20 unique terms for each analysis. My second attempt at solving this was a separate normalization pass for each analysis. After an analysis finished running, I extracted a unique list of terms from its output file and collected them into a prompt, then asked the LLM to produce a mapping to de-duplicate the terms; that prompt is in the repository too. There were better ways to do this than using an LLM. But you know what happens when all you have is a hammer? Yep, exactly. The normalization step was inefficient, but it did its job. This was the last piece of the puzzle. With all the extraction scripts and their normalization passes working correctly, I left my MacBook running the pipeline of scripts all day. I’ve never seen an M-series MacBook get this hot. I was worried that I’d damage my hardware somehow, but it all worked out fine. There was nothing special about the visualization step. I just decided on a list of visualizations for the data I’d extracted, then asked Claude to write some code to generate them for me. Tweak, rinse, repeat until done. I’m underwhelmed by the results of this experiment. I didn’t quite learn anything new or interesting from the output, at least nothing I didn’t already know. This was only partly because of LLM limitations. I believe I didn’t quite know what questions to ask in the first place. What was I hoping to discover? What kinds of patterns was I looking for? What was the goal of the experiment besides producing pretty graphs? I went into the project with a cool new piece of tech to try out, but skipped the important up-front human-powered thinking work required to extract good insights from data. I neglected to sit down and design a set of initial questions I wanted to answer and assumptions I wanted to test before writing the code. Just goes to show that no amount of generative AI magic will produce good results unless you can define what success looks like. Maybe this year I’ll learn more about data analysis and visualization and run this experiment again in December to see if I can go any further. I did learn one thing from all of this: if you have access to state-of-the-art language models and know the right set of questions to ask, you can process your unstructured data to find needles in some truly massive haystacks. This allows you to analyze datasets that would take human reviewers months to comb through.
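For completeness, here is a rough sketch of what that normalization pass can look like; the prompt wording and the run_model callable are placeholders for the actual local-model call and the prompts in the repository.

```python
# Sketch: second-pass normalization that asks the model for a synonym map
# ("fatigue" -> "exhaustion") and applies it to an analysis's output.
import json
from typing import Callable

def build_synonym_map(terms: list[str], run_model: Callable[[str], str]) -> dict[str, str]:
    prompt = (
        "Here is a list of terms extracted from journal entries. Some are synonyms "
        "or near-duplicates. Respond with raw JSON mapping every term to a single "
        "canonical term.\n" + json.dumps(sorted(set(terms)))
    )
    return json.loads(run_model(prompt))

def normalize(items: list[str], synonym_map: dict[str, str]) -> list[str]:
    # Terms the model didn't map are kept as-is.
    return [synonym_map.get(item, item) for item in items]
```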
A great example of this kind of needle-finding is how the NYT monitors hundreds of podcasts every day using LLMs. For now, I’m putting a pin in this experiment. Let’s try again in December.

0 views