Latest Posts (20 found)

Is Jonathan Shelley A False Teacher?

Thanks to the following interaction, it seems to me that Jonathan Shelley is a false teacher. ![bannedpastor.jpg](/files/d46b244676374587) For example, _1 Timothy 3_: "This is a true saying, if a man desire the office of a bishop, he desireth a good work. A bishop then must be blameless, the husband of one wife, vigilant, sober, of good behaviour, given to hospitality, apt to teach; Not given to wine, no striker, not greedy of filthy lucre; but patient, not a brawler, not covetous;" Even if I were wrong on my "false teacher" claim, calling another pastor "lame" seems not to be an example of "good b...

0 views

Fragments: April 14

I attended the first Pragmatic Summit early this year, and while there host Gergely Orosz interviewed Kent Beck and me on stage. The video runs for about half an hour. I always enjoy nattering with Kent like this, and Gergely pushed into some worthwhile topics. Given the timing, AI dominated the conversation - we compared it to earlier technology shifts, the experience of agile methods, the role of TDD, the danger of unhealthy performance metrics, and how to thrive in an AI-native industry.

❄                ❄                ❄                ❄                ❄

Perl is a language I used a little, but never loved. However the definitive book on it, by its designer Larry Wall, contains a wonderful gem. The three virtues of a programmer: hubris, impatience - and above all - laziness. Bryan Cantrill also loves this virtue:

Of these virtues, I have always found laziness to be the most profound: packed within its tongue-in-cheek self-deprecation is a commentary on not just the need for abstraction, but the aesthetics of it. Laziness drives us to make the system as simple as possible (but no simpler!) — to develop the powerful abstractions that then allow us to do much more, much more easily. Of course, the implicit wink here is that it takes a lot of work to be lazy.

Understanding how to think about a problem domain by building abstractions (models) is my favorite part of programming. I love it because I think it’s what gives me a deeper understanding of a problem domain, and because once I find a good set of abstractions, I get a buzz from the way they make difficulties melt away, allowing me to achieve much more functionality with fewer lines of code.

Cantrill worries that AI is so good at writing code that we risk losing that virtue, something that’s reinforced by brogrammers bragging about how they produce thirty-seven thousand lines of code a day.

The problem is that LLMs inherently lack the virtue of laziness. Work costs nothing to an LLM. LLMs do not feel a need to optimize for their own (or anyone’s) future time, and will happily dump more and more onto a layercake of garbage. Left unchecked, LLMs will make systems larger, not better — appealing to perverse vanity metrics, perhaps, but at the cost of everything that matters. As such, LLMs highlight how essential our human laziness is: our finite time forces us to develop crisp abstractions in part because we don’t want to waste our (human!) time on the consequences of clunky ones. The best engineering is always borne of constraints, and the constraint of our time places limits on the cognitive load of the system that we’re willing to accept. This is what drives us to make the system simpler, despite its essential complexity.

This reflection particularly struck me this Sunday evening. I’d spent a bit of time modifying how my music playlist generator worked. I needed a new capability, spent some time adding it, got frustrated at how long it was taking, and wondered about throwing a coding agent at it. More thought led to realizing that I was doing it in a more complicated way than it needed to be. I was including a facility that I didn’t need, and by applying yagni, I could make the whole thing much easier, doing the task in just a couple of dozen lines of code. If I had used an LLM for this, it may well have done the task much more quickly, but would it have made a similar over-complication? If so, would I just shrug and say LGTM? Would that complication cause me (or the LLM) problems in the future?

❄                ❄                ❄                ❄                ❄

Jessica Kerr (Jessitron) has a simple example of applying the principle of Test-Driven Development to prompting agents. She wants all updates to include updating the documentation.

Instructions – We can change AGENTS.md to instruct our coding agent to look for documentation files and update them.

Verification – We can add a reviewer agent to check each PR for missed documentation updates.

This is two changes, so I can break this work into two parts. Which of these should we do first? Of course my initial comment about TDD answers that question: the verification comes first.
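As a concrete, deliberately deterministic stand-in for that reviewer agent, here is a minimal sketch of the verification step - assuming a hypothetical repo layout where code lives under src/ and docs under docs/:

```typescript
// Minimal sketch of the "verification" half of Jessitron's example: a CI
// check that fails when code changes arrive without documentation changes.
// The src/ and docs/ layout is a hypothetical assumption.
import { execSync } from "node:child_process";

const changedFiles = execSync("git diff --name-only origin/main...HEAD", {
  encoding: "utf8",
}).trim().split("\n");

const touchesCode = changedFiles.some((f) => f.startsWith("src/"));
const touchesDocs = changedFiles.some(
  (f) => f.startsWith("docs/") || f.endsWith(".md"),
);

if (touchesCode && !touchesDocs) {
  console.error("Code changed but no documentation was updated.");
  process.exit(1); // a reviewer agent could judge this less bluntly
}
```

Writing this check first is exactly the TDD move: it fails on today’s behavior, and the AGENTS.md instruction is then the change that makes it pass.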
❄                ❄                ❄                ❄                ❄

Mark Little prodded an old memory of mine as he wondered about how to work with AIs that are over-confident in their knowledge and thus prone to make up answers to questions, or to act when they should be more hesitant. He draws inspiration from an old, low-budget, but classic SciFi movie: Dark Star. I saw that movie once in my 20s (i.e. a long time ago), but I still remember the crisis scene where a crew member has to use philosophical argument to prevent a sentient bomb from detonating.

Doolittle: You have no absolute proof that Sergeant Pinback ordered you to detonate.

Bomb #20: I recall distinctly the detonation order. My memory is good on matters like these.

Doolittle: Of course you remember it, but all you remember is merely a series of sensory impulses which you now realize have no real, definite connection with outside reality.

Bomb #20: True. But since this is so, I have no real proof that you’re telling me all this.

Doolittle: That’s all beside the point. I mean, the concept is valid no matter where it originates.

Bomb #20: Hmmmm….

Doolittle: So, if you detonate…

Bomb #20: In nine seconds….

Doolittle: …you could be doing so on the basis of false data.

Bomb #20: I have no proof it was false data.

Doolittle: You have no proof it was correct data!

Bomb #20: I must think on this further.

Doolittle has to expand the bomb’s consciousness, teaching it to doubt its sensors. As Little puts it:

That’s a useful metaphor for where we are with AI today. Most AI systems are optimised for decisiveness. Given an input, produce an output. Given ambiguity, resolve it probabilistically. Given uncertainty, infer. This works well in bounded domains, but it breaks down in open systems where the cost of a wrong decision is asymmetric or irreversible. In those cases, the correct behaviour is often deferral, or even deliberate inaction. But inaction is not a natural outcome of most AI architectures. It has to be designed in.

In my more human interactions, I’ve always valued doubt, and distrust people who operate under undue certainty. Doubt doesn’t necessarily lead to indecisiveness, but it does suggest that we factor the risk of inaccurate information or faulty reasoning into decisions with profound consequences.

If we want AI systems that can operate safely without constant human oversight, we need to teach them not just how to decide, but when not to. In a world of increasing autonomy, restraint isn’t a limitation, it’s a capability. And in many cases, it may be the most important one we build.

0 views
iDiallo Today

Back button hijacking is going away

When websites are blatantly hostile, users close them and never come back. Have you ever downloaded an app, realized it was deceptive, and deleted it immediately? It's a common occurrence for me. But there is truly hostile software that we still end up using daily. We don't just delete those apps, because the hostility is far more subtle. It's like the boiling frog: the heat turns up so slowly that the frog enjoys a nice warm bath before it's fully cooked. Clever hostile software introduces one frustrating feature at a time.

Every time I find myself on LinkedIn, it's not out of pleasure. Maybe it's an email about an enticing job. Maybe it's an article someone shared with me. Either way, before I click the link, I have no intention of scrolling through the feed. Yet I end up on it anyway, not because I want to, but because I've been tricked. You see, LinkedIn employs a trick called back button hijacking. You click a LinkedIn URL that a friend shared, read the article, and when you're done, you click the back button expecting to return to whatever app you were on before. But instead of going back, you're still on LinkedIn. Except now, you are on the homepage, where your feed loads with enticing posts that lure you into scrolling.

How did that happen? How did you end up on the homepage when you only clicked on a single link? That's back button hijacking. Here's how it works. When you click the original LinkedIn link, you land on a page and read the article. In the background, LinkedIn secretly gets to work. Using JavaScript's `history.replaceState()` method, it swaps the page's URL to the homepage. The method doesn't add an entry to the browser's history. Then LinkedIn pushes the original URL you landed on back onto the history stack with `history.pushState()`. This all happens so fast that the user never notices any change in the URL or the page. As far as the browser is concerned, you opened the LinkedIn homepage and then clicked on a post to read it. So when you click the back button, you're taken back to the homepage, the feed loads, and you're presented with the most engaging post to keep you on the platform. If you spent a few minutes reading the article, you probably won't even remember how you got to the site. So when you click back and see the feed, you won't question it. You'll assume nothing deceptive happened.

While LinkedIn only pushes you one level down in the history state, more aggressive websites can break the back button entirely. They push a new history state every time you try to go back, effectively trapping you on their site. In those cases, your only option is to close the tab.

I've also seen developers unintentionally break the back button, often when implementing a search feature. On a search box where each keystroke returns a result, an inexperienced developer might push a new history state on every keystroke, intending to let users navigate back to previous search terms. Unfortunately, this creates an excessive number of history entries. If you typed a long search query, you'd have to click the back button for every character (including spaces) just to get back to the previous page. The correct approach is to only push the history state when the user submits or leaves the search box (see the second sketch below).

As of yesterday, Google announced a new spam policy to address this issue. Their reasoning: People report feeling manipulated and eventually less willing to visit unfamiliar sites.
As we've stated before, inserting deceptive or manipulative pages into a user's browser history has always been against our Google Search Essentials. Any website using these tactics will be demoted in search results: Pages that are engaging in back button hijacking may be subject to manual spam actions or automated demotions, which can impact the site's performance in Google Search results. To give site owners time to make any needed changes, we're publishing this policy two months in advance of enforcement on June 15, 2026.

I'm not sure how much search rankings affect LinkedIn specifically, but in the grand scheme of things, this is a welcome change. I hope this practice is abolished entirely.
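For the curious, here is a minimal sketch of the hijack described above - illustrative TypeScript, not LinkedIn's actual code; only the History API calls (`replaceState` and `pushState`) are the real mechanism:

```typescript
// Sketch of back button hijacking. Runs on the article page the user
// landed on from an external link.
window.addEventListener("DOMContentLoaded", () => {
  const articleUrl = location.href;

  // Swap the current history entry for the homepage. replaceState does
  // not add a new entry, so nothing visibly changes for the user.
  history.replaceState(null, "", "/feed/");

  // Push the article URL back on top. The browser now believes the user
  // started on the homepage and clicked through to the article.
  history.pushState(null, "", articleUrl);

  // When the user presses Back, the site's router sees the homepage URL
  // and renders the feed - a page the user never actually visited.
});
```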
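And the second sketch: the non-hostile way to handle the search-box case, again with a hypothetical element id and a hypothetical `renderResults` helper. Update the URL on each keystroke with `replaceState` (no new history entry) and only `pushState` when the user actually submits:

```typescript
// Correct history handling for live search: one entry per submitted
// search, not one per keystroke.
const input = document.querySelector<HTMLInputElement>("#search")!;

// Hypothetical client-side renderer for the results.
declare function renderResults(query: string): void;

input.addEventListener("input", () => {
  // Reflect the query in the URL without polluting history.
  history.replaceState(null, "", `?q=${encodeURIComponent(input.value)}`);
  renderResults(input.value);
});

input.form?.addEventListener("submit", (event) => {
  event.preventDefault(); // results render client-side
  // Now commit a single history entry, so Back behaves sanely.
  history.pushState(null, "", `?q=${encodeURIComponent(input.value)}`);
  renderResults(input.value);
});
```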

0 views

Mythos and its impact on security

I’m sure by now you’ve all read the news about Anthropic’s new “Mythos” model and its apparently “dangerous” capabilities in finding security vulnerabilities. I’m sure everyone reading this also has opinions about that. Well, here are a few of mine.

Firstly, it’s tempting to dismiss the announcement as pure marketing hype. Anthropic are rumoured to be approaching IPO, so obviously a lot of hype is expected, and we’ve seen the “dangerous” card played before with GPT-2. Throughout the history of computer security, both new tools and security researchers themselves have often been branded as dangerous or irresponsible. If you have sympathy for this viewpoint, then I hate to break it to you: Mythos is not inserting vulnerabilities into software; they were there all along. (Vibe-coding notwithstanding).

That’s not to say that Mythos doesn’t represent a potentially interesting breakthrough. (Although apparently many existing small models are able to reproduce its findings, at least in part). And that’s not to say that releasing Mythos would not have some risk: potentially quite a large risk in some cases, and its ability to synthesise actual exploits is concerning. All security tools that find vulnerabilities come with a risk, but they also come with an upside: letting defenders find vulnerabilities too (ideally, first).

Anthropic quote costs of around $10,000–20,000 for each vulnerability they found. You can quibble over those costs, and I’m sure they’ll come down over time, but at the moment I think it’s fair to say that this won’t be run over every single software project out there. If it’s going to be used by bad actors, then it’ll probably still be somewhat targeted at high-impact systems. I’m sure we’ll see some new zero-day exploits of edge devices and probably an uptick in ransomware attacks, but it’s not like edge devices don’t regularly get exploited anyway. (Spoiler: many security products are shockingly poorly designed and implemented). But on the plus side, I can see Mythos and similar models being an excellent add-on to your annual pentest engagement. At those costs, you’re not going to run it on every build pipeline, and there’s probably going to be a certain amount of expertise required to get the most from it on a limited budget.

As with all new tools, eventually the findings will plateau. There are only so many times you can run the same tool over the same source code and come up with new findings. That’s not to say that there won’t still be vulnerabilities (there almost certainly will), but just that the tool will not be able to find them.

As a former AI researcher myself (before modern ML exploded), I find this aspect of the Mythos write-up quite interesting. Most security tools suffer from problems with false positives, and LLMs are of course famous for that: they are “bullshit machines”. Putting it in slightly less pejorative terms, I would call them abduction machines: they generate plausible hypotheses to explain some set of observations. (Training an LLM is induction, but what they do at runtime is closer to abduction). In the case of a chatbot, the “observations” are the token context window, and the hypotheses are the plausible next token completions. In the case of vulnerability hunting, the observations are the source code and a prompt asking to look for a vulnerability, and the hypotheses are the generated potential vulnerabilities.
Despite knowing how this works, it is still kind of magic to me that the latter emerges from the former (plausible vulnerabilities from merely predicting the most likely next tokens given the context). Broadly speaking, the better the model, the more likely those hypotheses are to be accurate. But they are still wrong an awful lot of the time, and false positives are the death of productivity. We’ve all seen reports of open source projects being overwhelmed by “slop” AI-generated vulnerability reports. But recently, that seems to have changed, and more high-quality reports are being submitted to many high-profile projects. What changed? I think the clue is front and centre in Anthropic’s write-up: use of Address Sanitizer (ASan) as an oracle to weed out false positives.

I think this is a crucial dividing line that separates successful from unsuccessful uses of AI. This is why “agentic” (grr) AI is relatively successful at software development. The models aren’t inherently much better at writing code than at any other task, but there already exists a large body of automated “bullshit correctors”: type checkers, linters, automated test suites, etc. (Many of which use techniques from earlier waves of “symbolic” AI research, just saying…) These oracles provide a clear signal about whether a hypothesis generated by the LLM is bullshit or not. (I would hypothesise that LLMs are likely to produce better code in languages with more sophisticated type systems). Hence we see quite a lot of progress and marketing for AI systems in such use-cases, despite those markets being relatively small compared to the AI companies’ massive valuations and funding. I’m guessing investors are not going to be pleased to have stumped up billions for a slice of a dev tools company? But software development does seem somewhat unique in this regard.

Getting back to vulnerability hunting and oracles. This is the same situation that fuzzers face: a fuzzer is generally only really good at finding vulnerabilities when there is a good oracle to decide whether you’ve found one or not (PDF link). Like Mythos, fuzzers are very good at finding crashes and (via oracles like ASan) memory safety issues, but they are not going to find subtle violations of user expectations. Mythos is clearly more than just a fuzzer, though: it’s also looking at the source code and doing somewhat sophisticated “analysis” of potential weaknesses. But I think the problem of needing an oracle will remain. Without an oracle, I’m sure that Mythos would still find genuine vulnerabilities, but they would be overwhelmed by slop false positives, which would drown out the signal in the noise.

For me then, I think this is the most interesting open question for LLM-based vulnerability finders: which classes of bugs can we write (or train) good oracles for? I think potentially quite a lot, but definitely not everything.
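To make the oracle idea concrete, here is a minimal sketch of the filtering loop. Everything here (the data shape, the file names, how a proof-of-concept is packaged) is my assumption, not Anthropic’s pipeline; only ASan itself is the real mechanism:

```typescript
// Sketch: ASan as an oracle that filters LLM-generated vulnerability
// hypotheses. The Hypothesis shape and PoC packaging are assumptions.
import { spawnSync } from "node:child_process";
import { writeFileSync } from "node:fs";

interface Hypothesis {
  description: string; // the LLM's claimed vulnerability
  pocSource: string;   // an LLM-generated C proof-of-concept harness
}

function confirmedByAsan(h: Hypothesis): boolean {
  writeFileSync("poc.c", h.pocSource);

  // Compile the PoC with AddressSanitizer instrumentation.
  const build = spawnSync("clang", ["-fsanitize=address", "-g", "poc.c", "-o", "poc"]);
  if (build.status !== 0) return false; // won't even compile: discard as slop

  // Run it: ASan aborts with a report on a real memory-safety violation.
  const run = spawnSync("./poc", [], { encoding: "utf8" });
  return run.status !== 0 && /AddressSanitizer/.test(run.stderr ?? "");
}

// Only oracle-confirmed findings get reported; the rest stay in the bin.
declare const hypotheses: Hypothesis[];
const reportable = hypotheses.filter(confirmedByAsan);
```

The point is the asymmetry: the LLM is free to be wrong cheaply, because the oracle decides which hypotheses ever reach a human.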
I think humans will still have an edge in finding complex bugs for a long time to come. Obviously Anthropic’s take is that you should use AI tools to find all the bugs first. You could dismiss that as an obvious attempt to cash in, and it even has shades of a protection racket. But I do think that Mythos and models like it are probably worth using, as an add-on to a human penetration test or similar.

But really, Mythos is just the latest tool revealing the continuing poor state of software security. Such tools continue to find a frightening array of vulnerabilities, because there is a terrifying quantity of them out there, and we keep adding more. The situation is not really going to be improved by throwing more AI at the problem. If anything, the rise of vibe-coding is likely to reinforce this trend. As I’ve covered here before, even apparent experts write total garbage security services when assisted by an LLM. If you want secure software, you need to slow down and think carefully and deeply, not rush ever faster to churn out more and more junk software.

In the short term, we can just keep doing the things we know how to do: thinking about security earlier in the design process, incorporating basic security tools and testing into build pipelines, ensuring you can patch CVEs quickly (but not too quickly), etc etc.

Longer-term, we know how to solve some classes of vulnerabilities altogether. For example, we know that memory-safe programming languages eliminate whole swathes of potential issues, including many of the sort that Mythos is good at discovering. We’ve known this for decades but still write lots of software in unsafe languages. Numerous reports and government proclamations are slowly shifting that, but we still have a very long way to go. Capability-based security would solve many other classes of vulnerabilities, whether in CPUs, operating systems, or supply chains. These are not easy fixes, and would require a massive investment over many years. Profit-driven companies are not going to pursue them without regulatory pressure, and are largely going in the opposite direction.

Such fundamental changes would not solve everything: there will still be vulnerabilities, but fewer and less severe if we do it right. Strong security mechanisms can provide a multiplicative reduction in risk. But if we really do want more secure software, and a foundation for our digital society that we can actually trust, then I don’t see an alternative. Finding and fixing individual vulnerabilities will never deliver that, however good the tools get.

0 views

Happy Easter 2026

What’s going on, Internet? I’m writing this from Martinborough in the pockets of time I have between activities with the family.

I’ve found myself reading more eBooks rather than audiobooks this year. Turns out my brain still works without someone narrating to it. I need to take note of where I hear about interesting podcasts, because by the time I get done listening to them, like MF DOOM: Long Island to Leeds, I totally forget! Maybe this would be a great use of the notebook I carry around but never write in?

Over at the Cafe, yequari got an instance of Linkding set up and made available to all community members via our SSO service. I know a bunch of people in this corner of the web use it, so I got myself some focus time and moved my bookmarks to Linkding. It’s improved my bookmarking workflow, and I’m hoping that makes me more inclined to actually save links as I read them and share them more regularly again, which led me to trying out a Link Dump as its own post rather than hiding links away at the bottom of these posts each month.

And finally, the musical highlight this month was seeing Bic Runga at Te Paepae Theatre: dinner and a concert. Going into new music blind, a fantastic evening out with my wife. We don’t get too many times together like this these days between everything else going on in life.

I haven’t picked up an audiobook again yet, but I’ve finished the Vega Jane stories with the last two books, Vega Jane and the Rebels’ Revolt and Vega Jane and the End of Time by David Baldacci. I really enjoyed this series, and like I mentioned previously, I don’t even know how the first one wound up on my Kobo!

Picked up English Teacher after a trailer came up on YouTube, and after the first episode I was right into it. A group of 30-something teachers at an American high school navigate their own friendship dramas while dealing with hilariously difficult students. Two seasons were available, so I blasted through all of them. Crackhead is a new local dramedy, and I got to attend a special screening of the first two episodes with the cast and crew. Created by Holly Shervey and based on her own early experiences, the show takes place within a rehab facility somewhere in New Zealand. Watching it episode by episode on Three Now as episodes come out. Continued watching Shrinking S3 as well; it feels like it’s wrapping up - does this have another season in it?

I’m a sucker for dumb teen movies, and Summer of 69 fit the description. A young Twitch streamer gets to the point in life where they want to get it on with their childhood crush. They recruit a stripper and then it all goes down. Regretting You was an interesting one; it turns out to be about a mother and daughter dealing with the fallout of a fatal accident that reveals a betrayal in the family. Such a wild plot - my wife only said wtf when I described it to her. I had Roofman playing in the background while playing Warcraft. It was okay; I didn’t really care for it. I don’t think I really enjoy Channing Tatum.

Four new records were added to the collection. Frou Frou and Lisa Loeb from the Interscope Vinyl Collective (IVC). I wasn’t familiar with either record; they’re fine, but they’ve made me aware I should exercise the cancel subscription button for releases that I’m not totally keen on. They’re gorgeous releases, I love them and have listened to them, but yeah, I could be spending that money on records I actually want. The other two were both Bic Runga records.
Her latest release (her first in 15 years), Red Sunset (signed), and then her original from 1997, Drive, which I picked up at her show and got signed in person 😃

What else have I been listening to? I’ve been on a hard house kick recently, so I dug up an old favourite from the early 2000s Wellington clubbing scene - Dynotuned by DJ Shakka - and went back through the Hard House Euphoria albums. After listening to the MF DOOM podcast I dived into the albums of his I have in my collection: Operation: Doomsday and Madvillainy. On the local music front I got really into The Panthers from Diggy Dupé, choicevaughan, and P. Smith (aka Troy Kingi). “The Villain” was on repeat, such a tune. Another new find was RNZŌ SZN from RNZŌ. Released January 2025, this one’s going to be in regular rotation for a while.

Hey, thanks for reading this post in your feed reader! Want to chat? Reply by email or add me on XMPP, or send a webmention. Check out the posts archive on the website.

0 views
Rik Huijzer Yesterday

Trump and Ivanka

Trump and Ivanka through the years ![trump-ivanka/white-dress.jpg](/files/c61731c028823bcd) ![trump-ivanka/gettyimages-74713659.webp](/files/43b87b67b4f0e010) ![trump-ivanka/yellow-dress.jpg](/files/390097223887ec59) ![trump-ivanka/awkward-hand.jpg](/files/0c5f6fdd6a10be76) ![trump-ivanka/weird-breast-hold.jpg](/files/c49f1886d4bc0a64) ![trump-ivanka/ivanka-trump-eric-donald-440nw-9912536a.jpg](/files/c3c46b04f956094f) ![trump-ivanka/vf_ivanka_trump_6234.webp](/files/1acd15c895c20d38) ![trump-ivanka/gettyimag...

0 views
Rik Huijzer Yesterday

The Jewish Messiah

Some interesting links by creekbendz on Reddit. In 2022, various prominent rabbis explained that "the Messiah is just about to reveal himself", according to _Israel Today_. According to _Israel News_, it would have taken Trump "a miracle to win" in 2016. Next, "King David also had to overcome great opposition in order to establish his kingship," Rabbi Berger told Breaking Israel News. "Trump connected to that. This cycle was completed when the president recognized Jerusalem as the true capital of the Jewish People, just as King David did." In Gematria (Hebrew numerology), _Donald Trump_...

0 views
iDiallo Yesterday

You paid for it, you should be comfortable in it

A friend of mine bought a Tesla Roadster back in the early 2010s. At the time, spotting a Tesla on the road was a rare event. Maybe even occasion enough to stop and take a picture. I never got the chance to photograph one, let alone drive one, until I met this new friend recently. This was my chance to experience the car firsthand.

We walked to the parking structure to see it. As soon as he opened the door, something looked... off. On the outside, it was a pristine, six-figure roadster. But the inside looked completely custom. Not "custom" in the sense of a professional shop install, but more like the driver himself grabbed a hammer and chisel and made it his own.

First, the driver's seat had been altered. It was much lower than usual and didn't match the passenger seat. My friend stands 6'7", and the Roadster is a tiny car. He physically couldn't fit, so he modified the seat rails to lower it. But that fix created a new problem: the door armrest now dug into his hip. So, he took a file to the interior panel, shaved it down, and 3D printed a smaller, ergonomic armrest. He even 3D printed a cup holder for the passenger side so his coffee was within reach.

To me, the idea of taking a Dremel or a file to a $100,000+ car was unimaginable. You must be crazy to do it. He caught the look on my face and shrugged. "Hey, it's my car. I paid for it. I intend to be comfortable in it." I never thought of it like this. That sentiment stuck with me.

Recently, when I read an article by Kent Walters about filing the corners of his MacBook, those same feelings resurfaced. My work MacBook has edges so sharp that I've often felt like I was slicing my wrist on the chassis. I treated this as a design flaw I had to endure. But not Kent. He treated it as an obstacle to be removed. He literally filed down the corners of his laptop to ensure the machine he uses every day was comfortable.

I may not have the guts to file my work-issued MacBook, but I'm no stranger to customization... in software. I modify my tools constantly. I spend days tweaking my IDE, remapping keyboard shortcuts, and writing custom scripts until the software is unrecognizable to anyone else on my team. I don't think twice about rewriting a config file to make the tool fit my brain. When I was a kid, I always had a screwdriver around, fixing a device that wasn't really broken. On the home computer, I modified everything. I once deleted all the files to improve performance. It didn't work, but it led to a fruitful career.

But somehow, when it comes to expensive hardware now, I freeze. I treat the physical object as a museum piece to be preserved. I bought a docking station to banish the laptop to a shelf, using an external mouse and keyboard to avoid touching the sharp chassis. I built a complex workaround to accommodate the tool, rather than performing the simple, brutal act of modifying the tool to accommodate me.

We treat our physical tools as if they are on loan from the manufacturer. You'll see a musician buy a vintage guitar but refuse to adjust the action, terrified of ruining the "collector's value." Meanwhile, the working guitarist has sanded down the neck and covered it in stickers because it feels better in their hand. The software engineer accepts the default keybindings to avoid "bad habits," while the power user creates a layout that doubles their speed.

If you own a tool, whether it's a car, a computer, or a line of code, you own the right to change it.
The manufacturer designed it for the "average" user, but you are a specific human with specific needs. Remember grandma's couch in the living room? It had that plastic cover on it. It was so uncomfortable, but no one dared to remove it. The plastic was there to preserve the sofa. No one got to enjoy it; instead, everyone accommodated the couch only to preserve its value. A value that no one ever benefits from. Don't let the perceived value of an object stop you from making it truly yours. A tool with battle scars is a tool that is loved.

0 views
Josh Comeau Yesterday

Squash and Stretch

Have you ever heard of Disney’s 12 Basic Principles of Animation? In this tutorial, we’ll explore how we can use the very first principle to create SVG micro-interactions that feel way more natural and believable. It’s one of those small things that has a big impact.
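As a taste of the idea, here is a minimal sketch of squash and stretch on an SVG element - my own illustration, not code from the tutorial; the element id and timings are made up:

```typescript
// Squash and stretch: exaggerate along one axis and compensate on the
// other, so the shape reads as soft while keeping its apparent volume.
const ball = document.querySelector<SVGCircleElement>("#ball")!;
ball.style.setProperty("transform-box", "fill-box");
ball.style.transformOrigin = "center bottom"; // deform against the "floor"

function squash(vertical: number) {
  // Compensating horizontally by 1 / vertical keeps the area constant.
  ball.style.transform = `scale(${1 / vertical}, ${vertical})`;
}

ball.addEventListener("pointerdown", () => {
  squash(0.7);                        // impact: wide and flat
  setTimeout(() => squash(1.15), 90); // rebound: tall and thin
  setTimeout(() => squash(1), 180);   // settle back to rest
});
```

In practice you would drive this with a spring or a CSS transition rather than bare timeouts, but the volume-preserving pair of scales is the whole principle.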

0 views
Stratechery Yesterday

Mythos, Muse, and the Opportunity Cost of Compute

In January 2025, Doug O’Laughlin at Fabricated Knowledge declared that o1 and reasoning models marked the end of Aggregation Theory:

I believe that there is no practical limit to the improvements of models other than economics, and I think that will be the real constraint in the future. It is reasonable that if we spent infinite dollars on a model, it would be improved. The problem is whether infinite dollars would make sense for a business. That is going to be the key question for 2025. How do the economics of AI make this work? One of the core assumptions about the internet has just been broken. Marginal costs now exist again, meaning that most hyperscalers will become increasingly capital-intensive. The era of Aggregation Theory is behind us, and AI is again making technology expensive. This relation of increased cost from increased consumption is anti-internet era thinking. And this will be the big problem that will be reckoned with this year. Hyperscalers’ business models are mainly underpinned by the marginal cost being zero. So, as long as you set up the infrastructure and fill an internet-scale product with users, you can make money. This era will soon be over, and the future will be much weirder and more compute-intensive. Looking back on the 2010s, we will probably consider them a naive time in the long arc of technology. One of our fundamental assumptions about this period is unraveling. This will be the single most significant change in the technology landscape going forward.

Aggregation Theory was, if I may say so myself, the single best way to understand the 2010s, particularly consumer tech. It explained the dynamics undergirding Google and Facebook’s dominance, as well as the App Store and Amazon’s e-commerce business; it was also a useful (albeit incomplete) framework to understand an entire host of consumer services like Uber, Airbnb, and Netflix. It’s worth pointing out, however, that some of the critical insights undergirding Aggregation Theory are much older, and are embedded in the fundamental nature of tech itself. They are, as O’Laughlin notes, rooted in the concept of zero marginal costs.

Marginal costs are how much it costs to make one more unit of a good. Consider a widget-making factory:

- You need land for the factory
- You need machines for the factory
- You need electricity to operate the machines
- You need humans to operate the machines
- You need the raw material for the widgets

Land and machines are clearly fixed costs; you have to have both to get started, and you are paying for both whether or not you make one more widget. Raw material, on the other hand, is clearly a marginal cost: if you make one more widget, you need one more widget’s worth of raw material. When it comes to physical goods, electricity and humans are also marginal costs: you need more or fewer of them depending on whether you make more or fewer widgets.

Where marginal costs matter is that they provide a price floor. Companies will operate unprofitably because profit and loss is an accounting concept that incorporates depreciation, i.e. your fixed costs. For example, imagine that a company spent $1,000 on a factory to make widgets that have a marginal cost of $10: as long as the price of widgets is >$10 the company will make them even if they don’t earn enough money to cover their depreciation costs (i.e. they operate at a loss) because at least they are still making a marginal profit on each widget (what the company may not do is invest in any more fixed costs, and, eventually, will probably go bankrupt from interest on the debt that likely financed those fixed costs).
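Spelling that arithmetic out (the $1,000 factory and $10 marginal cost are from the example; the price, volume, and depreciation schedule are illustrative assumptions of mine):

```latex
\begin{align*}
\text{marginal profit per widget} &= p - c = \$12 - \$10 = \$2\\
\text{contribution from 100 widgets} &= 100 \times \$2 = \$200\\
\text{depreciation per period} &= \$1{,}000 / 4 = \$250\\
\text{accounting profit} &= \$200 - \$250 = -\$50
\end{align*}
```

The firm books a $50 loss yet keeps producing, because each widget still contributes $2 above its marginal cost; only if the price fell below $10 would it stop.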
I explain all of this precisely because it’s almost completely immaterial to tech. First, there generally are no raw material costs, because the outputs are digital. Second, because there are no raw material costs, and because the fixed costs are so large, electricity and humans are generally treated as fixed costs, not marginal costs: of course you will run your servers all of the time and at full capacity, because every scrap of additional revenue you can generate is worth it. AI very much fits in this paradigm: the output is digital, and while AI chips use a lot of electricity, the cost is a fraction of the cost of the chips themselves, which is to say that no one with AI chips is making marginal cost calculations in terms of utilizing them. They’re going to be used!

Rather, the decision that matters is what they will be used for. Consider Microsoft: last quarter the company missed the Street’s Azure growth expectations not because there wasn’t demand, but because the company decided to use its capacity for its own products. CFO Amy Hood said on the company’s earnings call:

I think it’s probably better to think about the Azure guidance that we give as an allocated capacity guide about what we can deliver in Azure revenue. Because as we spend the capital and put GPUs specifically, it applies to CPUs, the GPUs more specifically, we’re really making long-term decisions. And the first thing we’re doing is solving for the increased usage in sales and the accelerating pace of M365 Copilot as well as GitHub Copilot, our first-party apps. Then we make sure we’re investing in the long-term nature of R&D and product innovation. And much of the acceleration that I think you’ve seen from us and products over the past a bit is coming because we are allocating GPUs and capacity to many of the talented AI people we’ve been hiring over the past years. Then, when you end up, is that, you end up with the remainder going towards serving the Azure capacity that continues to grow in terms of demand. And a way to think about it, because I think, I get asked this question sometimes, is if I had taken the GPUs that just came online in Q1 and Q2 in terms of GPUs and allocated them all to Azure, the KPI would have been over 40. And I think the most important thing to realize is that this is about investing in all the layers of the stack that benefit customers. And I think that’s hopefully helpful in terms of thinking about capital growth, it shows in every piece, it shows in revenue growth across the business and shows as OpEx growth as we invest in our people.

The cost that Microsoft is contending with here is not marginal cost, but rather opportunity cost: compute spent in one area cannot be used in another area; in the case of these earnings, Microsoft was admitting that they could have made their Azure number if they wanted to, but chose to prioritize their own workloads because, as CEO Satya Nadella noted later in the call, those have higher gross margin profiles and higher lifetime value.

It’s opportunity costs, not marginal costs, that are the challenge facing hyperscalers. How much compute should go to customers, and which ones? How much should be reserved for internal workloads? Microsoft needs to balance Azure — both for its enterprise customers and OpenAI — and its software business; Amazon needs to balance its e-commerce business, AWS, and its strategic investments in both Anthropic and OpenAI. Google has to balance GCP, its own strategic investment in Anthropic, and its consumer businesses.

Last week Anthropic announced Mythos, its most advanced model.
And, in somewhat typical Anthropic fashion, it did so by focusing on its dangers; from the introductory post for Project Glasswing, the company’s initiative for leveraging Mythos to address security:

We formed Project Glasswing because of capabilities we’ve observed in a new frontier model trained by Anthropic that we believe could reshape cybersecurity. Claude Mythos Preview is a general-purpose, unreleased frontier model that reveals a stark fact: AI models have reached a level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities. Mythos Preview has already found thousands of high-severity vulnerabilities, including some in every major operating system and web browser. Given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely. The fallout—for economies, public safety, and national security—could be severe. Project Glasswing is an urgent attempt to put these capabilities to work for defensive purposes.

In an Update last week I analogized Anthropic’s “disaster-porn-as-marketing-tool” approach to The Boy Who Cried Wolf; what’s important about that analogy is not just that the boy raised false alarms, but also that, in the end, the wolf did come. To that end, I wrote two weeks ago about the myriad of security issues that underpin all software, and my optimism that AI would solve these issues in the long run, even if it made things much worse in the short run. In other words, it’s actually not important whether or not Mythos represents a major security threat: if this model doesn’t, a future model will; to that end, I do support leveraging Mythos to proactively find and fix bugs before bad actors can find and exploit them.

At the same time, it’s also worth noting that there are other reasons for Anthropic to not make Mythos widely available, limiting access to a finite number of companies with a high capacity and willingness to pay. The first is those opportunity costs: Anthropic is already short on compute serving its current models; X was overrun with complaints and debates this weekend about Anthropic allegedly dumbing down Claude over the last month or so. Making Mythos more widely available — particularly to subscription plans that don’t pay per usage — would make the situation much worse. In other words, Anthropic isn’t facing a marginal cost problem, but an opportunity cost problem: where to allocate its compute.

Of course this could become a margin problem: I suspect that Anthropic is going to overcome its conservatism in terms of compute by acquiring more compute from hyperscalers and neoclouds, and paying dearly for the privilege. The key to handling those costs will be to charge more for Claude going forward; that, by extension, means maintaining pricing power, which leads to a second benefit of not releasing Mythos broadly.

Anthropic certainly faces competition from OpenAI; for both frontier labs, however, the real competition in the long run is open source models. Right now those primarily come from China, and a key ingredient in fast-following frontier models is distillation; from Anthropic’s blog:

We have identified industrial-scale campaigns by three AI laboratories—DeepSeek, Moonshot, and MiniMax—to illicitly extract Claude’s capabilities to improve their own models.
These labs generated over 16 million exchanges with Claude through approximately 24,000 fraudulent accounts, in violation of our terms of service and regional access restrictions. These labs used a technique called “distillation,” which involves training a less capable model on the outputs of a stronger one. Distillation is a widely used and legitimate training method. For example, frontier AI labs routinely distill their own models to create smaller, cheaper versions for their customers. But distillation can also be used for illicit purposes: competitors can use it to acquire powerful capabilities from other labs in a fraction of the time, and at a fraction of the cost, that it would take to develop them independently.

I absolutely believe this is a real problem, and wrote as much when DeepSeek R1 was released last year. I also think it’s in the interest of everyone other than the frontier labs to pretend that it isn’t; open source models are not subject to the frontier labs’ markup or compute constraints, which is exactly why it benefits most companies to have them available, whether or not they are distilled. Of course that doesn’t mean they are free to run: you still need to provide the compute. Notice, however, how that makes stopping distillation even more of a priority for the frontier labs: first, they want to protect their margins. Second, however, their biggest cost is opportunity cost: the customers they can’t serve because they don’t have enough compute. The extent to which they can make compute less useful for their potential customers — by stopping open source models from distilling their models — is the extent to which they can acquire that compute for themselves at more favorable rates.

Mythos wasn’t the only new model announced last week: Meta released the first fruit of their new frontier lab as well. From the company’s blog post:

Today, we’re excited to introduce Muse Spark, the first in the Muse family of models developed by Meta Superintelligence Labs. Muse Spark is a natively multimodal reasoning model with support for tool-use, visual chain of thought, and multi-agent orchestration. Muse Spark is the first step on our scaling ladder and the first product of a ground-up overhaul of our AI efforts. To support further scaling, we are making strategic investments across the entire stack — from research and model training to infrastructure, including the Hyperion data center… Muse Spark offers competitive performance in multimodal perception, reasoning, health, and agentic tasks. We continue to invest in areas with current performance gaps, such as long-horizon agentic systems and coding workflows.

Muse Spark isn’t state of the art, but it’s in the game, and overall a positive first impression from Meta Superintelligence Labs. What is most notable to me, however, is the extent to which the last nine months of AI have made clear that CEO Mark Zuckerberg made the right call to embark on that “ground-up overhaul of [Meta’s] AI efforts”. The trigger for O’Laughlin’s post that I opened this Article with was reasoning, where models using more tokens led to better answers; since then agents have exponentially increased token demand, as they can use LLMs continuously without a human in the loop. This is a huge driver in sky-rocketing demand for Claude, as well as OpenAI’s Codex. Moreover, this use case is so potentially profitable that not only is Anthropic’s revenue sky-rocketing, but OpenAI is pivoting its focus to enterprise.
Indeed, you can make the argument that one of OpenAI’s biggest challenges is the fact that it has such a popular consumer product in ChatGPT. I, with my Aggregation Theory lens, have long maintained that that userbase was a big advantage for OpenAI, but that assumed the company could effectively monetize it, which is why I have argued so vociferously for an advertising model. OpenAI has big projections for exactly that, but until that materializes, that big consumer base is a big opportunity cost in terms of OpenAI’s focus and compute. The company has, to its credit and in the face of widespread skepticism, made significant investments in more compute, but the temptation to allocate more and more compute to agentic use cases that enterprises will pay for, even at the expense of the consumer business, will be very large.

This puts Meta in a unique position relative to everyone else in the industry: unlike any of the hyperscalers or the frontier labs, Meta does not have an enterprise or cloud business to worry about. That means that serving the consumer market comes with no opportunity costs. Of course those opportunity costs would be much smaller anyways, given that Meta already has an at-scale advertising business to monetize usage. In other words, Meta may actually face less competition in winning the consumer space than it might have seemed a few months ago, simply because that is their primary focus — and because they have their own model, which means they don’t need to worry about not having access to the frontier labs (much of this analysis applies to Google, of course).

This, by the same token, is why Meta should open source Muse, just like they did Llama. The entities that will be most hurt by widespread availability of a frontier model are other frontier labs, who will see their pricing power reduced and face increased competition for compute. This will make it even harder for them to bear the opportunity cost of pursuing the consumer market, leaving it for Meta.

So is “the era of Aggregation Theory…behind us”? On one hand, the insight that the way to create and maintain value will come from owning the customer is almost certainly going to continue to be the case. On the consumer side owning customers leads to advertising, which provides the revenue to provide services to customers. On the enterprise side — which, I would note, has never been an arena where Aggregation Theory was meant to be applied — I think it’s likely that both Anthropic and OpenAI continue to move up the stack and deliver features that compete with software providers directly (an approach that is also in line with not making leading edge models publicly available).

On the other hand, O’Laughlin’s observation that we are and will continue to be compute constrained is an important one: companies will not be able to assume they can serve everyone, because serving one set of customers imposes the opportunity cost of not serving another. This won’t, at least in theory, last forever: at some point AI will be “good enough” for enough use cases that there will be enough compute capacity to take advantage of the fact that there really aren’t meaningful marginal costs entailed in serving AI; that theoretical future, however, feels further away than ever.

OpenAI is betting that this compute constraint — and the deals they have made to overcome it — will matter more than Anthropic’s current momentum with end users.
From Bloomberg:

OpenAI told investors this week that its early push to dramatically increase computing resources gives it a key advantage over Anthropic PBC at a moment when its longtime rival is gaining ground and mulling a potential public offering. The ChatGPT maker said it has outpaced Anthropic by “rapidly and consistently” adding computing capacity to support wider adoption of its software, according to a note the company sent to some of its investors after Anthropic announced a more powerful AI model called Mythos. The ambitious infrastructure build-out, criticized by some as too costly, has enabled OpenAI to better keep pace with rising demand for AI products, the memo states.

I’m less certain that this will be dispositive. When it comes to AI, distribution and transaction costs are still free — the two preconditions for Aggregators — which means that the winners should be those with the most compelling products. Those products will win the most users, providing the money necessary to source the compute to serve them; consider Anthropic’s deal to secure a meaningful portion of TPU supply, which, given the capacity constraints at TSMC, is ultimately an example of taking supply from Google. I suspect that Anthropic can take more, including already built hyperscaler and neocloud capacity. Yes, that compute will be more expensive, but if demand is high enough the necessary cash flow will be there.

In other words, my bet is that owning demand will ultimately trump owning supply, suggesting that the underlying principles of Aggregation Theory live on. To put it another way, I think that OpenAI will need to win with better products, not just more compute; then again, if more compute is the key to better products, then does supply matter most? Regardless, they’ll certainly be focused on delivering both to the enterprise customers who are driving Anthropic’s astonishing growth. The real cost may be the consumer market they currently dominate, given that Meta has nothing to lose and everything to gain.

0 views
André Arko Yesterday

Software developers have become their own joke

Creating software is complicated. It’s hard to figure out exactly what you need to build without a lot of trial and error. It almost always requires both exploring possible options and refining something until it works really well. But those things aren’t the same! Your research prototype is not a good product that people will happily pay for.

Back in the olden days, when software literally came from BigCo R&D departments, we managed to invent Unix, and the mouse, and GUIs, and Ethernet, and TCP/IP, and a ton of other stuff we all use constantly today. Those research divisions didn’t ship viable consumer products, though. Doug Engelbart demoed a mouse-driven GUI in 1968, but you couldn’t buy a home computer with a mouse-driven GUI until 1979, and they didn’t become commercially popular until the Macintosh in 1984. Even years or decades of research wasn’t enough, and years (or decades!) of development work also needed to be done before the results were ready for people to use.

Early literature about creating software, written by Fred Brooks and his peers, seems to contain the internalized view that both R and D are required. That’s not surprising, since R&D departments created most software back then, but we seem to have lost track of that connection. Even though our jobs are descended from those R&D labs of yore, we somehow lost the industry job of “software researcher”, and only “software developer” remains. Instead, research happens in academia, where an argument and some pseudocode is all you need to publish a paper. In that world, development is effectively non-existent.

(I admit the division isn’t perfectly clear-cut. Sometimes academics will start companies around their research that create a product, or more likely get acquired to add a feature to a product. And sometimes Linus Torvalds will just build a new operating system, without doing any academic research on it, and it will get so popular everyone uses it. The point is that industry and academia have each publicly claimed one half of R&D while disowning the other.)

The broader separation of research and development into academia and industry is really unfortunate, because good software needs both research and development as inputs. If you don’t do any research, you can’t identify which parts will be hard (or impossible) until after it’s too late. You also won’t have a good idea of which parts are important until after you’ve put in most or all of the work to create the parts that don’t matter. If you don’t do development, you won’t ever have something robust enough that other people can use it successfully.

Meanwhile, on the other side, it feels like developers work hard to convince themselves there are no research aspects involved in their jobs. We call anything research-ish by another name, like “design”, “user experience”, “prototyping”, “de-risking”, “a spike”, and a lot of other funny euphemisms that avoid referring to the work as research. It seems like we’re trying to convince ourselves that we don’t do Research any more, because we are just Developers.

This cultural lack of clarity around research in software development spaces really hit hard for me this week, as I read yet another treatise on working with LLM-driven agents for development. The two most popular takes that I have seen are “these tools are a fundamental shift in the nature of software development” and “these tools change nothing about building software at all”.
Then the two sides start screaming at each other about how the other side is delusional and time will prove them completely wrong, and I lose interest. If we instead start from the premise that all software work requires research (where the problem space must be explored) and development (where solutions must be implemented and refined), there’s something hiding in the sometimes messy overlap between those two ideas that I’m not seeing come up in any discussions.

No one can take the output of software research and treat it like it’s the output of software development. Not Bell Labs, not Xerox PARC, not Microsoft middle managers, and not “solo founders managing a team of AI agents” today. Unfortunately, seeing a prototype and becoming convinced it’s complete is not a new problem. It’s been the bane of software development possibly since the very beginning, when (apocryphally) a manager would review a mockup and conclude the project was now complete and could be shipped to customers immediately. Today, instead of telling that story as a joke, software developers have somehow turned themselves into the boss from the joke, shouting that it’s time to ship the research prototype because it “looks finished”.

How did we do this to ourselves? It seems like, back when we always had to do all the work ourselves, it was harder for software developers to be confused this way. If a developer knows they skipped every validation and edge case, it’s much easier to realize it’s not finished. If an LLM agent says “here’s a comprehensive implementation”, without mentioning all the validations and edge cases it skipped, many (and possibly most) developers will not notice the parts that are missing.

This phenomenon is bad for a lot of reasons, including one you have probably already thought of: we’re going to get a lot more software claiming to be “comprehensive” and “fully implemented” when it’s really a partially finished prototype that’s full of holes. In a world full of research prototypes being pitched as completed development work, life is about to get worse for everyone who uses software. The docs are even more wildly wrong than they were before, customer support is telling you that your problem is solved by a feature that doesn’t exist, and company leadership is so excited they are planning to fire as many humans as possible so they can have more of it.

I don’t want worse software! The software we already have is mostly terrible. Not only much worse software, but also much more of it, is pretty much my worst case scenario. What I actually want is better software, even if that means less of it. Unfortunately, instead of making better software, software developers have decided to become the butt of their own joke, shipping software that doesn’t work, with a footnote that says they know it doesn’t work but they are still shipping it. I don’t see any way to stop it, but I hate it anyway.

0 views

One Developer, Two Dozen Agents, Zero Alignment

Why we need collaborative AI engineering

0 views

A little tool to visualise MoE expert routing

I've been curious for a while about what's actually happening inside Mixture of Experts models when they generate tokens. Nearly every frontier model these days (Qwen 3.5, DeepSeek, Kimi, and almost certainly Opus and GPT-5.x) is a MoE - but it's hard to get an intuition for what "expert routing" actually looks like in practice. So I built a small tool to visualise it: moe-viz.martinalderson.com

You can pick between a few different prompts, watch the generation animate out, and see exactly which experts fire at each layer for each token. The top panel shows routing as the token is generated, and the bottom panel builds up a cumulative heatmap across the whole generation. I built this by modifying the llama.cpp codebase to output more profiling data, with Claude Code's help. So it may have serious mistakes, but it was a really fun weekend project.

The thing that really surprised me: for any given (albeit short) prompt, ~25% of experts never activate at all. But it's always a different 25% - run a different prompt and a different set of experts goes dormant. That's a much more interesting result than I expected.

Interestingly, Gemma 26BA4 runs really well with the "CPU MoE" feature - 4b params is not a lot to run on a fairly fast CPU, and having the KV cache on GPU really helps. I think there are a lot of performance improvements that could be made with local MoE inference as well - e.g. caching certain experts on GPU vs CPU.

If you're interested in learning more about LLM inference internals, I'd certainly recommend pointing your favourite coding agent at the llama.cpp codebase and getting it to explain the various parts - it really helped me learn a lot.
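For intuition, here's a minimal sketch of what top-k routing and the cumulative usage heatmap boil down to, with random numbers standing in for a real trained router (the shapes and names are illustrative, not llama.cpp's):

```python
# Toy top-k MoE routing: for each token and layer, pick the top_k experts
# by router logit and accumulate a usage heatmap, like the bottom panel
# of the visualiser. Random logits stand in for a trained router.
import numpy as np

n_layers, n_experts, top_k, n_tokens = 24, 64, 8, 32
rng = np.random.default_rng(0)
usage = np.zeros((n_layers, n_experts), dtype=int)

for _ in range(n_tokens):
    logits = rng.normal(size=(n_layers, n_experts))  # per-layer router logits
    for layer in range(n_layers):
        chosen = np.argsort(logits[layer])[-top_k:]  # top-k by logit
        usage[layer, chosen] += 1

print(f"experts never activated: {(usage == 0).mean():.1%}")
```

With uniform random logits, almost every expert fires within a few dozen tokens (roughly (56/64)^32 ≈ 1.4% stay dormant in this toy), so the ~25% dormancy observed above implies real routers are heavily skewed per prompt; that skew is what makes the result interesting.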

0 views
iDiallo 2 days ago

Your AWS Certificate Makes You an AWS Salesman

I must have been the last developer still confused by the AWS interface. I knew how to access DynamoDB; that was the only tool I needed for my daily work. But everything else was a mystery. How do I access web hosting? If I needed a small server to host a static website, what service would I use? Searching for "web hosting" inside the AWS console yielded nothing. After digging through the web, I found the answer: an Elastic Compute Cloud instance, better known as EC2.

I learned that I could use it under the "Free Tier." Amazon offers free tiers for many services, but figuring out the actual cost beyond that introductory period requires elaborate calculation tools. In fact, I've often seen independent developers build tools specifically to help people decipher AWS pricing.

If you want to use AWS effectively, it seems the only path is to get certified. Companies send employees to conferences and courses to learn the platform. I took some of those courses, and they taught me how to navigate the interface and build very specific things. But that skill isn't transferable. In the course, I wasn't exactly learning a new engineering skill. Instead, I was learning Amazon.

Amazon has created a complex suite of tools that has become the industry standard. Hidden within its moat of confusion, we are trained to believe it is the only option. Its complexity justifies the high cost, and the Free Tier lures in new users who settle into the idea that this is just "the way" to do web development. When you are presented with a simple interface like DigitalOcean or Linode and a much cheaper price tag, you tend to think that something is missing. Surely, a cheaper, simpler service must lack half the features, right? The reality is, you don't need half the stuff AWS offers.

Where other companies create tutorials to help you build, Amazon offers certificates. It is a powerful signal for enterprise legitimacy, but for most developers, it is overkill. This isn't to say AWS is "bad," but it obscures the reality of running a web service. It is much easier than it seems. There are hundreds of alternatives for hosting. You can run your services reliably on a VPS without ever breaking the bank. Most web programming is free, or at the very least, affordable.

0 views

From Painfully Explicit to Implicit in Lean

Note: AI was used to edit this post. As a proof-of-human thought and input, I am also publishing the original draft, which was written fully before asking AI to edit the post with me. This post is aimed at Lean language beginners who are interested in writing proofs in Lean, but still feel lost when reading Lean code. A very simplified mental model of Lean is that at the core there are two systems:
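To make the title's move concrete, here's a minimal sketch in Lean 4 (my own example, not taken from the post): an argument written explicitly can be made implicit with curly braces, and the elaborator then infers it at each use.

```lean
-- Painfully explicit: the type argument α must be supplied by hand.
def idExplicit (α : Type) (a : α) : α := a
#check idExplicit Nat 3   -- : Nat, but we had to write Nat ourselves

-- Implicit: braces tell Lean to infer α from the argument's type.
def idImplicit {α : Type} (a : α) : α := a
#check idImplicit 3       -- : Nat, with α := Nat inferred
```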

0 views
Kev Quirk 2 days ago

Adding a Book Editor to My Pure Blog Site

Regular readers will know that I've been on quite the CMS journey over the years. WordPress, Grav, Jekyll, Kirby, my own little Hyde thing, and now Pure Blog . I won't bore you with the full history again, but the short version is: I kept chasing just the right amount of power and simplicity, and I think Pure Blog might actually be it.

But there was one nagging thing. I have a books page that's powered by a YAML data file, which creates a running list of everything I've read with ratings, summaries, and the occasional opinion. It worked great, but editing it meant cracking open a YAML file in my editor and being very careful not to mess up the indentation. Not ideal.

So I decided to build a proper admin UI for it. And in doing so, I've confirmed that Pure Blog is exactly what I wanted it to be - flexible and hackable. I added a new Books tab to the admin content page, and a dedicated editor page. It's got all the fields I need - title, author, genre, dates, a star rating dropdown, and a Goodreads URL. I also added CodeMirror editors for the summary and opinion fields, so I have all the markdown goodness they offer in the post and page editors. The key thing is that none of this touched the Pure Blog core. Not a single line.

[Image: My new book list in Pure Blog]
[Image: A book being edited]

Pure Blog has a few mechanisms that make this kind of thing surprisingly clean. The custom functions file is auto-loaded after core, so any functions I define there are available everywhere, including in admin pages. I put my save function here; it takes the books data and writes it back to the data file, then clears the cache, exactly like saving a normal post does. Again, zero core changes.

The ignore list is the escape hatch for when I do need to override a core file. I added both the admin content page (where I added the Books tab) and the new editor to the ignore list, so future Pure Blog updates won't mess with them. It's a simple text file, one path per line. Patch what you need, ignore it, and move on.

The templating side is where it gets a bit SSG-ish. The books page is powered by a PHP file that loads the YAML, sorts it by read date, and renders the whole page. It's essentially a template, not unlike a Liquid or Nunjucks layout in Jekyll or Eleventy. Same idea for the books RSS feed . Using a YAML data file for books made more sense to me, rather than markdown files like a post or a page, as it's all metadata really. There's no real "content" for these entries.

Put those three things together and you've got something pretty nifty. A customisable admin UI, safe core patching, and template-driven data pages - all without a plugin system or any framework magic. Bloody. Brilliant.

I spent years chasing the perfect CMS, and a big part of what I was looking for was this . The ability to build exactly what I need without having to fight the platform, or fork it, or bolt on a load of plugins. With Kirby, I could do this kind of thing, but the learning curve was steep and the blueprint system took me ages to get my head around. With Jekyll/Hyde, I had the SSG flexibility, but no web-based CMS I could log in to and create content - I needed my laptop. Pure Blog sits in a really nice middle ground - it's got a proper admin interface out of the box, but it gets out of the way when you want to extend it.

I'm chuffed with how the book editor turned out. It's a small thing, but it's exactly what I wanted, and the fact that it all lives outside of core means I can update Pure Blog without worrying about losing any of it. Now, if you'll excuse me, I have some books to log.
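Just to show the shape of that template idea, here's a hypothetical sketch in Python (Pure Blog's real version is PHP, and the field names below are my guesses, not its actual schema):

```python
# Hypothetical sketch of what the books template does: load a YAML data
# file, sort by read date, render the list. Field names are illustrative.
import yaml  # pip install pyyaml

with open("books.yaml") as f:
    books = yaml.safe_load(f) or []

books.sort(key=lambda b: b["read_date"], reverse=True)
for book in books:
    stars = "★" * int(book.get("rating", 0))
    print(f"{book['read_date']}: {book['title']} by {book['author']} {stars}")
```

All the "content" lives in the data file, which is why a YAML file plus a small editor UI beats one markdown file per book here.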
📚 Thanks for reading this post via RSS. RSS is ace, and so are you. ❤️ You can reply to this post by email , or leave a comment .

0 views
Kev Quirk 2 days ago

How I Discover New Blogs

Finding a new blog to read is one of my favourite things to do online. It genuinely brings me joy. Right now I have 230 sites that I follow in my RSS reader, Miniflux . If I ever want to spend some time reading, I'll usually open Miniflux over my Mastodon client, Moshidon. There are no likes, boosts, hashtags, etc., just interesting people sharing interesting opinions. It's lovely.

So how do I discover these blogs? There are many ways to do it, but here are some that I've found most successful, ranked from most useful to least.

When someone I already enjoy reading links to a post from another blogger, either just to share their posts, or to add their own commentary to the conversation. This (to me at least) is the most useful way to discover new blogs to read. It's the entire premise of the Indieweb, so if you own a blog, please make sure you're linking to other blogs in your posts. 🙃

There are a number of great small/indie web aggregators out there, and there seem to be new ones popping up all the time. Here's a list of some of my favourites:

- Bear Blog Discover
- Blogosphere
- Kagi Small Web

I tend to use these as a kind of extended RSS reader. So if I'm up to date on my RSS feeds, I'll use these as a way to continue hunting for new people to follow. Truth is, I actually spend more time on these sites than I do on the fediverse. Speaking of which...

There are lots of cool people on the fediverse , and many of them have blogs. Even those who don't blog will regularly share links to posts they've enjoyed. I also nose at hashtags of the topics that interest me, rather than just the timeline of people I follow. So remember to add hashtags to your posts - they're a great way to aid discovery. 👍🏻

This last bucket is just everything else ; where I naturally find my way to a blog while surfing the net. I've discovered some great blogs this way, but it's becoming harder and harder to find indie blogs like this, as discoverability on the web has been overtaken by AI summaries and SEO. 😏 It's still possible though.

There's plenty of interesting people out there, creating great posts for us all to enjoy. The indie web is thriving, and if you're not taking advantage of it, you're missing out! Why not take a look at a couple of the sites I've listed above and see what you discover? It's a tonne of fun.

Thanks for reading this post via RSS. RSS is ace, and so are you. ❤️ You can reply to this post by email , or leave a comment .

0 views
Weakty 2 days ago

26/W15 - Project Hail Mary, Errands, Pens

The highlights of this week were a few social plans. I worked out with a friend and got lunch with another. I treated myself to a new Pilot Kakuno, a fine drawing implement, my opinion of which has greatly risen since I last had one three or so years ago (and subsequently dropped it nib-first on the floor and ruined it). I also finished the audiobook of Project Hail Mary , which was excellent. I don't usually watch movies, much less go to the theatre, but the idea of going to a movie on my own and seeing the adaptation is appealing.

0 views
Playtank 2 days ago

Analogue Prototyping

There is a lot to say about prototyping . Chris Hecker talked about advanced prototyping at GDC 2006, and provided a hierarchy of priorities that goes like this:

- Step 1: Don't: Steal it, fake it, or rehash stuff you have already made before you start a new prototype.
- Step 2: Just Do It: If it takes less than two days, just do it. As the saying goes, it's easier to ask for forgiveness than for permission.
- Step 3: Fail Early: When something feels like a dud even at an early stage, you can assume that it is in fact a dud. There's nothing wrong with abandoning a prototype. In fact, learning to kill things early is a skill.
- Step 4: Gather References: Prototypes can only really help with small problems. Big problems, you must break apart and figure out. Collect references. White papers, mockup screenshots, music, asset store packs, and so on. Anything that helps you understand the problem space.

Analogue prototyping comes in right away at Step 1: Don't. By not launching straight into your game engine, you can save giant heaps of time between hypothesis and implementation. You can also figure out what kinds of references will be relevant before you reach Step 4: Gather References.

There's another side to analogue prototyping as well. In the book Challenges for Game Designers , Brenda Romero says:

"A painter gets better by making lots of paintings; sculptors hone their craft by making sculptures; and game designers improve their skills by designing lots of games. [...] Unfortunately, designing a complete video game (and implementing it, and then seeing all the things you did right and wrong) can take years, and we'd all like to improve at a faster rate than that."

Using cards, dice, and paper leads to some of the fastest prototyping possible. It can be just ten minutes between idea and test, fitting really well into those two days of Step 2: Just Do It. Of course, it can also take weeks and require countless iterations, but that's part of the game designer's job after all.

This post focuses on what to gain from analogue prototypes of digital games, and the practical process involved. It's also unusually full of real examples, since this is something I've done quite a bit for my personal projects, and that work is therefore not under NDA. If you're curious about something or need to tell me I'm wrong, don't hesitate to comment or e-mail me at [email protected] .

Why you should care about analogue prototyping when all you want to do is the next amazing digital game may seem like a mystery. A detour that leads to having your fingers glued together and a bunch of leftover paper clippings you can't use for anything. But analogue prototypes earn their keep in several ways:

- The same psychology applies: Rewards, risk-taking, information overload. Many of our intrinsic and extrinsic motivators are triggered the same way by boardgames as by digital games. The distance is not nearly as far as we may tell ourselves.
- Players can represent complex systems: A player has all the complexity of a living, breathing human, making odd decisions and concocting strange plans. This lets you use players as representations of systems, from enemy behaviors to narrative.
- Analogue games are "pure" systems: If you can't make sense of your mechanic in its naked form, you can probably not expect your players to make sense of it either.
- Similar affordances: Generating random numbers with dice, shuffling cards, moving things around a limited space; analogue gaming is always extremely close to digital gaming, even to the point that we use similar verbs and parlance.
- Holism: Probably the best part of the analogue format is that you can actually represent everything in your game in one way or another. It doesn't have to be a big complex system, as long as you provide something to act as that system's output.

In Chris Hecker's talk, the first suggestion is that you should cheat before you put too much time into anything else. Since you will be cutting and gluing and sleeving, and some of that work takes time, this counts double with analogue prototypes.

The easiest way to cheat is to use proxies. If you have a collection of boardgames, this is easy. You can also go out and buy some used games cheap or ask friends if they have some lying around that they don't use. Perhaps that worn copy of Monopoly that almost caused a family breakup can finally get some table time again, in a different form.

Aesthetics matter. If you want to take shortcuts with how a game feels to play, getting something that looks the part can be a shortcut. Go to your local Dollar Store or second-hand shop and pick up some plastic toys or a game with miniatures that are similar to what you are after. They can merely be there to act as centerpieces for your prototype.

The easiest and most efficient reference board that exists is a standard chessboard. Square grid with a manageable size. You can also use a Go board, with the extra benefit that the Go beads also make for excellent proxy components. Beyond those two, you can really use any other board game board too. Just make sure to remember where you got it from if you want to play those games in the future. Or you can even pick up games with missing parts at yard sales, usually super cheap, and scavenge proxy parts from those.

For some types of games, finding a good real-world map, perhaps even a tourist map or subway map, can be an excellent shortcut.
Not just for wargames, but for anything with a spatial component. The guide map from a theme park or museum works, too.

Packs of 52 standard playing cards are fantastic proxies. You can use suits, ladders, make face cards have a different meaning, and much more. Countless prototypes have used these excellent decks to handle anything from combat resolution to hidden information. It's also possible to go even further, and make your own game use regular playing cards and the known poker combos as a feature. Balatro comes to mind.

Many families have a Yatzy set lying around, providing you with a small handful of six-sided standard dice. You can do a lot with just this simple, straightforward randomisation element. But don't limit yourself to just six-sided dice, if you don't have to. Get yourself a set of Dungeons & Dragons polyhedrals and you'll have four-, eight-, ten-, twelve- and twenty-sided dice rounding out your randomisation armory.

An honorable mention goes to the fantasy wargame HeroScape , because of its diversity. You can build all manner of strange scenery from just a core HeroScape set and use it effectively to represent almost anything. The same goes for Lego. The main issue with these kinds of proxies is that they can take a lot of space. Particularly HeroScape , since it has a predefined scale. With Lego, you just need to figure out a scale and stick to it.

If there's a game the people you will play with are especially familiar with, you can skip over having to design one of your systems by substituting a mechanic from a game you already know. Say, if you know that you will want to have statistics in your game, you can copy the traditional lineup of six abilities from Dungeons & Dragons , as well as their scale, to get started. Even if you know that you will want a different lineup later, this means you can test elements that are more unique to your game faster.

An effective way to minimise cut-and-paste time is to print your cards very small. Preferably so all of them fit on a single piece of paper. They will be a bit trickier to shuffle this way, but that's rarely an issue in testing. This way, you need less paper and you can cut everything faster. Going from eight cards to a sheet to 32 is a pretty big difference. Just avoid miniaturizing to the point that you need a magnifying glass.

There's no need to get fancy with real cardstock. Here are some things you can use. I usually just keep any interesting sheets from deliveries I receive. Say, the sturdy sheet of paper used in a plastic sleeve to make sure a comic book doesn't bend in the mail. Perfect for gluing counters.

There are three things you need to consider for paper: size, weight, and texture. For size, since I'm in Europe, I use the standardized A-sizes. A0 is a giant piece of paper, A1 is half as big, A2 half as big again, and so on. The standard office paper format is A4, roughly equivalent to U.S. Letter. This can easily be folded into A5 pamphlets. I also keep A3 papers around (twice the size of A4), but those I use to draw on. Not for printing. I don't have a big enough home to fit a floor printer.

The next thing is paper weight, measured in grams per square meter (GSM). Most home printers can't handle paper heavier than 120-200 GSM. I always keep standard paper (80 GSM) around, and some heavier papers too. If I print counters or cards I sometimes use the sturdier stock. For reference, Magic cards are printed on 300 GSM black core paper stock.
The black core is so you can't see through the card, and is taken directly from the gambling circuit.

Lastly, the paper's texture. If you want to work a little on the presentation, it can be nice to find paper canvas, or other sturdier variants. I've found that glossy photo paper is almost entirely useless in my own printer, however, always smearing or distorting the print. So when I buy any higher-GSM paper I try to find paper with a coarser texture.

There are many different kinds of cardboard, and you should try to keep as many around as possible. Some can be good for gluing boards or counters onto, while others can help make your prototype sturdier. This isn't as important as paper, but gets used frequently enough that it felt worth mentioning.

There will be a lot of rambling about cards later, and how to use them. For now, I only refer to loose cards you can use to prop up your thin paper printouts. These are not strictly necessary, but make shuffling easier. I don't play much Magic: The Gathering anymore, but I still have lots and lots of leftover Magic cards, so those are the ones that get used as backing in most of my prototypes.

You can cheaply buy colored wooden cubes as well as glass and plastic beads in bulk. It's not always obvious what you may need, so keeping some different types around can be helpful. More specific pieces, like coins or pawns, can also be useful, but unless these components provide unique affordances, the kinds of components you have access to is rarely important. It's usually enough to be able to move them around and separate them into groups.

Storage is another thing that needs solving. If you mostly print paper and iterate on rules, a binder can be quite helpful. Especially paired with plastic sleeves, so you can group iterations of your rules together and store them easily. If you also need to transport your prototypes, the kinds of storage boxes you find in office supply stores will have you sorted.

You can push your analogue prototyping really far and build a whole workshop. A 3D printer for making scenery and miniatures, a laser cutter for custom MDF components, and a big floor-sized professional printer that takes over a whole room. If you have the space and the resources for that, you do you, but let's focus on the smallest possible toolbox for making analogue prototypes.

If you want to buy a printer, you just need to be aware that all of them still have the same problems of losing connections and failing to print, to this day. The same problems that have plagued printers since forever. I use a laser color printer with duplex (double-sided) printing support and the ability to print slightly heavier paper, up to 220 GSM. This has been more than enough for my needs. Specifically, the duplex feature helps a lot if you want to print rulebooks.

Having a good store of pencils and pens, including alcohol- and water-based markers, is more than enough. You can go deeper into the pen rabbit hole by looking at Niklas Wistedt's spectacular tutorial on how to draw dungeon maps : it'll have you covered in the pen and pencil department.

Some tools you keep around to hold piles of paper or cards together. Paper clips are extra handy, because they can also be used as improvised sliders pointing at health numbers or other variables. Rubber bands are handy for keeping decks of cards together inside a box and for transportation.

Almost every paper-based activity without decent scissors on hand will be a futile effort.
Just beware that cutting things out by hand takes more time than you think. If you have a game with many cards, you may have to put on a couple of episodes of your favorite show as you cut them out.

If you need more precision than scissors can provide, the next rung on the cutting ladder is to get a proper cutting mat, a steel ruler, and a set of good sharp knives. These can be craft scalpels, metal handles with interchangeable blades (Americans insist on calling these "X-Acto knives"), or carpet knives.

Once you have rules and test documents printed, you'll quickly disappear under a veritable ocean of paper. Though smaller sheaves can be pinned together with a paper clip, staplers are even better. A standard small office stapler is enough. But if you want to staple booklets and not just sheaves, it can be worth it to get a long-reach stapler capable of punching 20 sheets or more.

Attaching paper to other paper can be done in more ways than with clips or staples. Sometimes you want to use glue or adhesive tape. Keeping a standard gluestick and a can of spray glue around is perfect. Regular tape and double-sided tape are also great for many things, even if the main use for tape can just be to make larger-scale maps out of individual pieces of paper.

As mentioned previously, it can take some time to cut out all the cards you want to print. You can cut this time down to a fraction, metaphorically and physically, by getting a paper guillotine. These can usually take a few sheets at a time and will give you clean cuts along identified lines. Yelling "vive la France" when you drop the blade is optional.

Lastly, a more decadent piece of machinery that isn't strictly needed is a paper laminator. These will heat up a plastic pocket and melt the edges together to provide the paper with a plastic surface. It makes the paper much sturdier, and has the added benefit of allowing you to use dry-erase markers to make notes and adjustments right on the sheet itself.

There is a lot of software out there that can be used to make cards, boards, illustrations, and whatever else you may need. The following is merely a list of what I personally use.

Since you will often want to test things at different sizes, vector graphics are generally more useful for board game prototyping than pixel graphics. This is by no means a hard rule, but the resolution of pixel images tends to limit how large you can scale them, while vector graphics have no such limitations. My go-to for vector graphics is Illustrator, but there are free alternatives like Affinity available as well.

My other go-to piece of software for analogue shenanigans is InDesign, another Adobe program that can also be replaced by Affinity . I'm just personally so stuck in the Adobe ecosystem, after decades of regular use, that it's too late for me to switch. You can't teach an old dog new tricks, as the saying goes. InDesign is great for multiple reasons. Not least of all its ability to use comma-separated value (CSV) files to populate unique pages or cards with data. A feature called Data Merge.

Speaking of spreadsheets, all system designers have a lovely relationship with their tool of choice. This can be Microsoft Excel , OpenOffice Calc , or Google Spreadsheets, but the many convenient features of spreadsheets are a huge part of our bread and butter. I don't even want to know how many sheets I create in an average year.
Very broadly speaking, when making an analogue prototype, I will make use of spreadsheets for these reasons:

- Listing all the actions, components, elements, etc., that are relevant. Just getting things into a list can show you if something is realistic or not.
- Cross-matrices for fleshing out a game's state-space. If I know the features I want, and the terrains that exist, a cross-matrix can explore what those mean: a feature-terrain matrix.
- Notes on playtests. How many players played, what happened, who won and why, etc.
- Calculators of various kinds, incorporating more spreadsheet scripting. Can be used to check probabilities, damage variation, feature dominance, etc.
- Session logging. If I want to be more detailed, I can log each action from a whole session and see if there are things that can be added or removed.

The fantastic Tabletop Simulator is not just a great place to play tabletop games, it's also a great place to test your own games. Renowned board game designer Cole Wehrle has recorded some workshops for people interested in this specific adventure, and let's just say that once you have this up and running it will make it a lot easier to test your game. Especially if the members of your team don't all live in the same city. Its biggest strength is how quickly you can update new versions for anyone with a module already installed. If you share your module through Steam Workshop, it's even easier. For most analogue prototypes, this isn't doable, simply because of NDAs and rights issues.

So much stuff ! Let's put it all together. The way I've talked about this, there are really six steps to the process of making an analogue prototype:

- Set a Goal
- Identify Facts
- Systemify the Facts
- Consider the roles of Players
- Tie it together with Components

Setting a goal is more important than you may think. An analogue prototype can easily become a design detour. Because of this, your goal needs to formulate why you are making this analogue prototype. "Test if it's fun with infinitely respawning enemies" could be a goal. "See what works best: party or individual character" could be another one. But it can also be a lot narrower, for example designed to test the gold economy in your game. Perhaps even to balance it. The point is that you need a goal, and you need to stick to it and cut everything out that doesn't serve that goal. If you need to test how travelling works on the map, you probably don't need a full-fledged combat system, for example.

Facts are the smallest units of decision in your game's design . Stuff that every decision maker on your team has agreed on and that can therefore safely inform your analogue prototype. This can be super broad, like "the player plays a hamster," or it can be more specific, like "the player character always has exactly one weapon." You need these facts to keep your prototype grounded, but you don't necessarily need to refer to them all at once. Pick the ones that are most important to your goal.

With a goal and some facts, you need to figure out what systems you will use. Try to narrow it down more than you may think. Don't make a "combat system," but rather one "attack system" and another "defense system." The reason for this is that what you are after is the resource exchanges that come from this, and the dynamics of the interactions. The attack system may take player choices as input and dish out damage as output, while the defense system may accept armor and damage input and send health loss as output. You can refer to the examples of building blocks in this post for inspiration.

This is where we come to the biggest strength of analogue prototyping: real humans provide a lot more nuance and depth than any prototype can do on its own. Analogue or digital. One player can take on the role of referee or game master, similar to how it would work in a tabletop role-playing game . In many wargames of the past, this was called an umpire. Someone who would know all the rules and act as a channel between the players and the systems. If you have built a particularly complicated analogue prototype, a good way to test it can be to act as a referee and then simply ask players what they want to do instead of teaching them the details of the rules.

Players can play each other's opponents, representing different factions, interest groups, or feature sets via their analogue mechanics.
If you built an analogue prototype of StarCraft , you'd probably do it this way, with three players taking on one faction each.

One player can play the enemies, while another plays the economy system, or the spawning system. The goal here is to put one player in charge of the decisions made within the related system. If someone wants to trade their stock for a new space ship, and this isn't covered by the rules, the economy system player can decide on the exchange rate, and the spawning system player can say that this spawns a patrol of rival ships. Just take ample notes, so you don't forget the nuances that come out of this process.

There are many different ways to use the components you collected previously. Some of them may not be intuitive at all.

The humble die: perhaps the most useful component in your toolbox. Just look at the following list and be amazed:

- Types of dice: you can use any number of sides, and make use of the corresponding probabilities. Dividing 1 by the number of sides gives you the probability of any single result. So, 1/6 ≈ 0.1666 means there's a ~17% chance to roll any single side on a six-sided die. Use the dice that best represent the percentage chances you have in mind (see the sketch after this list).
- Singles: rolling a single die and reading the result. Pretty straightforward.
- Sums: rolling two or more dice and adding the result together.
- Pools: rolling a handful of dice and checking for specific individual results or adding them together.
- Buckets: rolling a lot of dice and checking for specific results. The only reason buckets of dice are separated from dice pools here is because they have a different "feel" to them; they are functionally identical.
- Add/Subtract: add or subtract one die from the result of another, or use mathematical modifiers to add or subtract from another result.
- X- or X+: require specific results per die. In these cases X- would mean "X or lower," and X+ would mean "X or higher."
- Patterns: like Yatzy, or what the first The Witcher called "Dice Poker:" you want doubles, triples, full houses, etc.
- Reroll: allowing rerolls of some or all of the dice you just rolled. Makes the rolling take longer but also provides increased chances of reaching the right result. Some games allow rerolling in realtime and then use other time elements to restrict play. So you can frantically keep trying to get that 6, but if an hourglass runs out first you lose.
- Spin: spinning the die to the specific side you want.
- Trigger: if you roll a specific result, something special happens. It could be the natural 20 that causes a critical hit in Dungeons & Dragons , or it can be that a roll of 10 means you roll another ten-sided die and add it to your result.
- Hide: you roll or set your result under a cupped hand or physical cup, hiding the result until everyone reveals at the same time or the game rules require it.
- Statistics: common sense may say that you can't possibly roll a fifth one after the first four, but in reality you can. Dice are truly random.
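To put rough numbers on a few of those mechanics, here's a small empirical sketch (my own illustrative parameters, not from the post) comparing a single die, a sum, and a pool:

```python
# Rough odds for three mechanics from the list above: a single d6,
# the sum of 2d6, and a pool of 5d6 counting rolls of 5+ as successes.
import random

random.seed(1)
N = 100_000

single = sum(random.randint(1, 6) == 6 for _ in range(N)) / N
two_d6 = sum(random.randint(1, 6) + random.randint(1, 6) >= 10 for _ in range(N)) / N
pool = sum(
    sum(random.randint(1, 6) >= 5 for _ in range(5)) >= 2  # 2+ successes
    for _ in range(N)
) / N

print(f"d6 rolls a 6:        {single:.1%}")  # ~16.7% (1/6)
print(f"2d6 totals 10+:      {two_d6:.1%}")  # ~16.7% (6/36)
print(f"5d6, 2+ rolls of 5+: {pool:.1%}")    # ~53.9%
```

Same ~17% chance for the first two, very different feel: the sum clusters around 7, and the pool lets you tune the curve by changing the threshold or the number of dice.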
People have been using playing cards for leisure activities since at least medieval times. Just as for dice, you'll see why right here, and perhaps these things will fit your needs better than dice:

- Shuffle: shuffling cards is a great way to randomise outcomes, and there are many ways to mix up how you shuffle a deck. You can shuffle a "bomb" into half of the pile and then shuffle the other half to place on top, for example.
- Uniqueness: each card can only be drawn once, which means that you can make each card in a deck unique, and you can affect the mathematics of probability by adding multiple copies of the same card, just like the board game Maria uses standard playing cards but in different numbers.
- Front and back: the face and back of the cards can have different print on them, or the back can just inform you what kind of card it is so you can shuffle them together in setup. Of course, the fact that you can hide the faces from other players is also what makes bluffing in poker interesting.
- Turn, sideways: what Magic calls "tapping" and other games may call exhausting or something else. Some cards can be turned sideways (in landscape mode instead of portrait mode) by default.
- Turn, over: flipping a card to its other side can serve to show you new information or to hide its face from everyone around the table. It can represent a card being exhausted, or injured, or other state changes, like a person transforming into a werewolf.
- Over/under: cards can be placed physically over or under other cards, to show various kinds of relationships. An item equipped by a character, or a condition suffered by an army, for example.
- Card grids: cards can be placed in a grid to generate a board, or to act as a sheet selection for a character. One card could be your character class, another could be a choice of quest, etc. It's a neat way to test combinations.
- Hide cards: if you want to get really physical, you can hide cards on your person, under boards, and so on. This was one way you could play Killer , by hiding notes your opponents would find.
- Card text: if you print your own cards, you can have any text you want on them. Reminders, rules exceptions, etc.
- Deck composition: how you put decks together will affect how the game plays, and predesigning decks for different tests can be very effective. Perhaps you remove all the goblins in one playtest and have only goblins in another.
- Deck building: decks can also be constructed through play, similarly to how Slay the Spire works. A style of mechanic where you can start small and then grow in complexity throughout a session.
- States: cards can be in different states. On the table, in your hand, available from an open tableau, shuffled into a deck, discarded to a discard pile, and even removed from the game due to in-game effects.
- Semantics: something that Magic: The Gathering 's designer, Richard Garfield, was particularly good at was figuring out interesting names for the things you were doing. You don't just play a card, you're casting a spell. It's not a discard pile, it's your graveyard. These kinds of semantics can be strong nods back to the digital game you are making, or they can serve a more thematic purpose.
- Statistics: with every card you draw, the deck shrinks, increasing the chances of drawing the specific card you may want. You are guaranteed to draw every card if you go through a whole deck, which is one of the biggest strengths of decks of cards (a quick sketch of this follows the list).
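That last point is easy to quantify. A minimal sketch (illustrative numbers, not from the post):

```python
# Draws without replacement accumulate: the chance that one specific
# card has appeared within the first k draws of a 52-card deck is k/52.
# A d6, by contrast, stays at a flat 1/6 per roll forever.
deck_size = 52
for k in (1, 10, 26, 52):
    print(f"after {k:2d} draws: {k / deck_size:.1%} chance it has appeared")
```

That guaranteed climb to 100% is exactly the "deck shrinks" property dice can't give you.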
Humans are spatial beings that think in three dimensions. Even such a simple thing as a square grid where you put miniatures will create relationships of behind, in front of, far away from, close to, etc. Not all analogue prototypes need this, but if you do need it, here are some alternatives to explore:

- Node or point maps: picture a corkboard with pins and red thread, or just simple circular nodes with lines between them. You can draw this easily on a large sheet of paper and just write simple names next to each circle to provide context.
- Sector maps: one step above the node or point map is the sector map, where regions share proximity. Grand strategy games have maps like this, where provinces share borders. Another example is more abstract role-playing games, where a house's interior is maybe divided into two sectors and the whole exterior area around it is another sector. It's excellent for broad-stroke maps.
- Square grids: if you want a grid, the square grid is probably the most intuitive. But it also has a mathematical problem: a diagonal step covers about 1.4 times (√2) the distance of a cardinal step. This means you need to either not allow diagonals, or allow them and account for the problems that will emerge.
- Hexagon grids: these are more accurate and classic wargame fare, but they will also often force you to adapt your art to the grid in ways that are not as intuitive as with a square grid.
- Freeform: finally, you can just take any satellite image or nice drawn map, perhaps an overhead screenshot from a level you've made, and use it as a map in a freeform capacity. This may force you to use a tape measure or another way to measure distances, but if the distances are not important, that matters a lot less. For example, if your game shares sensibilities with Marvel's Midnight Suns .

With the fast iterations of analogue prototypes, you can usually just change a word or an image somewhere and print a new page. This means you may have many copies of the same page after a while. To prepare for this situation, make sure to have a system for versioning. It doesn't have to be too involved, especially if you're the only designer working on this prototype, but you need to do something. I usually just iterate a number in the corner of each page. The 3 becomes a 4. I may also write the date, if that seems necessary. I may also add a colored dot (usually red) to pages that have been deprecated, since just the number itself won't say much, and you may end up referring to the wrong pages if you don't have an indicator like this.

0 views
マリウス 2 days ago

KTT x 80Retros GAME 1989 Orange

I picked up the KTT x 80Retros GAME 1989 Orange switches a while ago at Funkeys , a physical brick-and-mortar mechanical keyboard store in Yongsan-gu, Seoul , and it's my first linear switch. Given its surprisingly cheap price, I really didn't expect much from it, to be honest. KTT is a name people normally associate with budget options, like Peaches , Sea Salts , and Strawberries . They're the kind of switches that show up in beginner build guides, and they are generally good stuff, but not really the kind of thing that made me stop and think about what I was typing on. However, the GAME 1989 Orange changed that perception for me, and it did it in a way I genuinely didn't see coming.

But before we get into the switch itself, we need to talk about the vibe , because the vibe is half the story here. 80Retros is a relatively young brand out of China that debuted on ZFrontier around December 2023 with an interest check for their GAME 1989 cherry-profile PBT keycap set inspired by the original Game Boy . They describe themselves as lovers of all things vintage and retro, and unlike a lot of brands that slap "retro" on things as a marketing afterthought, they actually seem to mean it.

What's remarkable is how fast they've moved since then. Within a few years, they went from a single keycap IC to pushing out nearly a dozen different switches across two separate manufacturers ( KTT and HMX ), along with matching keycap sets in multiple colorways. The G.O.A.T. of switch reviews himself, ThereminGoat , covered this in detail in his HMX Volume 0-T review , and the GAME timeline is pretty interesting: the original HMX -manufactured GAME 1989 switches came first, followed by what he calls the "Film Trio" (the KD200 , FJ400 , and GAME 1989 Classic ), all packaged in these absolutely gorgeous film canister-inspired containers that look like oversized Kodak rolls. The film canister thing started as a nod to the KD200 and FJ400 being camera-brand-inspired, but the community loved the packaging so much that 80Retros seemingly just kept using it for everything. Even for switches that have nothing to do with photography. The KTT -manufactured GAME 1989 Orange and Red are the newer entries in this expanding catalogue, released as part of an "Expanded Film Series" in early 2025, alongside a Silent White variant and an HMX XMAS switch. So we're looking at a brand that is absolutely not slowing down.

On paper, a PC top and PA66 bottom is a pretty classic material combo. KTT has used variations of this pairing for years. What makes this switch interesting is the KT2 stem, made out of their proprietary UPE blend. UPE ( ultra-high molecular weight polyethylene ) is a material that's been showing up more and more in the switch world, but it's one of those things where the specific manufacturer's blend matters enormously. Keygeek 's U4 , for example, sounds glassy and solid. KTT 's KT2 is more dry, a bit foamy, and (this is the part I didn't expect) it brings an audible character that I can only describe as "marble-y" . It's not soft, but it's not hard either. It sits in this interesting middle ground.

At 4mm travel with a pole bottom-out, the switch is technically a long-pole linear, but the full travel distance means it doesn't feel like one in the snappy, sharp way that most long-poles do. The pole bottom-out is there, but it's mellowed out by the travel length and the stem material. More on that later.

Stock smoothness is good, and I mean genuinely good.
Probably not HMX -tier buttery, and probably not the absolute smoothest thing I've tried in recent years, but there's a quality to the travel that feels deliberate and controlled. The factory lube is present but light. A thin coating on the bottom housing railings, some on the stem legs and leaf, and the springs seem lightly done too.

There is a texture to the keystroke, and some people might call it scratch. I'm not sure that would be fair, though it's not entirely wrong either. UPE blends can be unpredictable when paired with other housing materials. Sometimes you get something silky, sometimes you get audible friction. The KT2 blend with this PC/PA66 housing produces a slight tactile grain in the travel that I genuinely enjoy. It's subtle enough that you won't notice it at normal typing speed, but if you slow-press a single key at ear level, it's there.

Spring-wise, 40g actuation bottoming out at around 50g is on the lighter side, especially for me and my usual Frankenswitches . I wouldn't call it featherweight, but if you tend to bottom out hard, you'll definitely hit the end of the stroke with minimal effort. The springs are clean, without noticeable ping in my set. The factory lube on the springs seems to do its job. One thing to note is that there's reportedly about a 3g variance between individual switches. I couldn't verify that precisely, but I did notice the occasional key that felt marginally different. Not a dealbreaker for me, but if you're the kind of person who weighs every spring in a batch, keep it in mind.

As for wobble, it is present. There's some slight vertical (north-south) wobble and maybe a touch of east-west if you go looking for it. This seems to be a known trade-off with KTT 's newer molds. Their older switches like the Hyacinths seemingly had incredibly tight tolerances, but those molds are from a different era. KTT has been retooling to accommodate new materials like their KT2 and KT3 blends, and the fit isn't quite as snug as the old stuff. As for films, they probably do help tighten up the housings, and I've read that filming the switches apparently also compresses the sound profile slightly. Personally, the wobble doesn't bother me too much.

The sound profile is where the GAME 1989 Orange gets genuinely interesting, because the sound profile is busy , and I mean that in a good way. The bottom-out is lower-pitched than you'd typically expect from a PC -topped switch. The PA66 bottom housing and the KT2 stem material seemingly pull the tone down into a territory that's thocky without being mushy. There's a definite pop to the keystroke, and the bottom-out has weight to it. The top-out (the return stroke) is a touch brighter, creating a slight tonal contrast between the downstroke and upstroke that gives the switch a lot of auditory dimension. There's a lot happening acoustically in any given keystroke, and none of it sounds muddied or confused.

The "marble-y" quality I mentioned earlier really comes through in the sound. It's not a wet, lubed sound, but a relatively dry and more textured one, with a character that feels... natural, for lack of a better word. The slight scratch in the travel actually adds to the sound profile rather than detracting from it. The initial contact, the pole hitting bottom, the spring compression, and the return all remain distinct from each other, yet layered.

Volume-wise, it's moderate. Definitely not silent, but also not exactly loud.
Slightly quieter than your average long-pole, which makes sense given the full 4mm travel and the way the KT2 material absorbs some of the impact energy. I haven't yet tested it on any of my aluminium builds , but at least on the few keyboards Funkeys had these switches on, as well as on my Kunai , I find that the sound profile works beautifully. That said, these switches are definitely less ideal for quiet/public environments, like open-plan offices and cafes.

The switches come factory lubed, and they work just fine stock. I'd personally resist the urge to lube them further unless you specifically want to kill the audible scratch, which I think is part of the charm. If you do lube, know that you're trading character for smoothness, and these are already reasonably smooth to begin with. They accept films, and filming them does seem to tighten the sound slightly: less resonance in the housing, a more compressed signature. Depending on your build and plate material, that might be exactly what you want, or exactly what you don't. Try a few with and without before committing.

As for the packaging, if you buy the 35-switch sets, they come in those aforementioned film canister containers. It's genuinely lovely, and a nice touch that makes the whole experience feel considered. Not something I'd pay extra for, but it's a detail that matters for the overall product identity. One thing to note is that the canisters open very easily. I wouldn't walk around holding them upside down unless I wanted to play find the 35 switches hidden underneath the furniture .

The KTT x 80Retros GAME 1989 Orange surprised me. It's a switch that trades ultra-polished, frictionless perfection for something with a dry, textured, slightly scratchy keystroke that somehow comes together into a sound profile that's warm, full, and more complex than it has any right to be at this price point. It's not perfect. The wobble is there, and the housing tolerances aren't as tight as the best in the business. But it doesn't feel like every other linear on the market, at least not like the ones I've had the chance to try over the past years. It has character, which, in a hobby that's increasingly crowded with technically excellent but personality-free switches, has its charm.

If you want the smoothest linear available, look elsewhere. If you want something that sounds interesting, feels engaging, and comes wrapped in an homage to a long-gone era, give the 1989 Orange a shot. I'm genuinely glad I did.

Disclaimer: I'm not a switch scientist. I don't own a force curve rig, I can't tell you the exact durometer of the KT2 blend, and my ears are probably not calibrated to the standards of someone like ThereminGoat . This review is based on my personal experience typing on these switches across a few different boards, and ultimately actively using them on my primary keyboard . Your mileage may vary based on your plate material, case, keycaps, and other factors. Take everything here as one person's experience and use it as a starting point for your own.

0 views