Latest Posts (20 found)

I vibe coded my dream macOS presentation app

I gave a talk this weekend at Social Science FOO Camp in Mountain View. The event was a classic unconference format where anyone could present a talk without needing to propose it in advance. I grabbed a slot for a talk I titled "The State of LLMs, February 2026 edition", subtitle "It's all changed since November!". I vibe coded a custom macOS app for the presentation the night before. I've written about the last twelve months of development in LLMs in December 2023 , December 2024 and December 2025 . I also presented The last six months in LLMs, illustrated by pelicans on bicycles at the AI Engineer World’s Fair in June 2025. This was my first time dropping the time covered to just three months, which neatly illustrates how much the space keeps accelerating and felt appropriate given the November 2025 inflection point . (I further illustrated this acceleration by wearing a Gemini 3 sweater to the talk, which I was given a couple of weeks ago and is already out-of-date thanks to Gemini 3.1 .) I always like to have at least one gimmick in any talk I give, based on the STAR moment principle I learned at Stanford - include Something They'll Always Remember to try and help your talk stand out. For this talk I had two gimmicks. I built the first part of the talk around coding agent assisted data analysis of Kākāpō breeding season (which meant I got to show off my mug ), then did a quick tour of some new pelicans riding bicycles before ending with the reveal that the entire presentation had been presented using a new macOS app I had vibe coded in ~45 minutes the night before the talk. The app is called Present - literally the first name I thought of. It's built using Swift and SwiftUI and weighs in at 355KB, or 76KB compressed . Swift apps are tiny! It may have been quick to build but the combined set of features is something I've wanted for years . I usually use Keynote for presentations, but sometimes I like to mix things up by presenting using a sequence of web pages. I do this by loading up a browser window with a tab for each page, then clicking through those tabs in turn while I talk. This works great, but comes with a very scary disadvantage: if the browser crashes I've just lost my entire deck! I always have the URLs in a notes file, so I can click back to that and launch them all manually if I need to, but it's not something I'd like to deal with in the middle of a talk. This was my starting prompt : Build a SwiftUI app for giving presentations where every slide is a URL. The app starts as a window with a webview on the right and a UI on the left for adding, removing and reordering the sequence of URLs. Then you click Play in a menu and the app goes full screen and the left and right keys switch between URLs That produced a plan. You can see the transcript that implemented that plan here . In Present a talk is an ordered sequence of URLs, with a sidebar UI for adding, removing and reordering those URLs. That's the entirety of the editing experience. When you select the "Play" option in the menu (or hit Cmd+Shift+P) the app switches to full screen mode. Left and right arrow keys navigate back and forth, and you can bump the font size up and down or scroll the page if you need to. Hit Escape when you're done. Crucially, Present saves your URLs automatically any time you make a change. If the app crashes you can start it back up again and restore your presentation state. You can also save presentations as a file (literally a newline-delimited sequence of URLs) and load them back up again later. Getting the initial app working took so little time that I decided to get more ambitious. It's neat having a remote control for a presentation... So I prompted: Add a web server which listens on 0.0.0.0:9123 - the web server serves a single mobile-friendly page with prominent left and right buttons - clicking those buttons switches the slide left and right - there is also a button to start presentation mode or stop depending on the mode it is in. I have Tailscale on my laptop and my phone, which means I don't have to worry about Wi-Fi networks blocking access between the two devices. My phone can access directly from anywhere in the world and control the presentation running on my laptop. It took a few more iterative prompts to get to the final interface, which looked like this: There's a slide indicator at the top, prev and next buttons, a nice big "Start" button and buttons for adjusting the font size. The most complex feature is that thin bar next to the start button. That's a touch-enabled scroll bar - you can slide your finger up and down on it to scroll the currently visible web page up and down on the screen. It's very clunky but it works just well enough to solve the problem of a page loading with most interesting content below the fold. I'd already pushed the code to GitHub (with a big "This app was vibe coded [...] I make no promises other than it worked on my machine!" disclaimer) when I realized I should probably take a look at the code. I used this as an opportunity to document a recent pattern I've been using: asking the model to present a linear walkthrough of the entire codebase. Here's the resulting Linear walkthroughs pattern in my ongoing Agentic Engineering Patterns guide , including the prompt I used. The resulting walkthrough document is genuinely useful. It turns out Claude Code decided to implement the web server for the remote control feature using socket programming without a library ! Here's the minimal HTTP parser it used for routing: Using GET requests for state changes like that opens up some fun CSRF vulnerabilities. For this particular application I don't really care. Vibe coding stories like this are ten a penny these days. I think this one is worth sharing for a few reasons: This doesn't mean native Mac developers are obsolete. I still used a whole bunch of my own accumulated technical knowledge (and the fact that I'd already installed Xcode and the like) to get this result, and someone who knew what they were doing could have built a far better solution in the same amount of time. It's a neat illustration of how those of us with software engineering experience can expand our horizons in fun and interesting directions. I'm no longer afraid of Swift! Next time I need a small, personal macOS app I know that it's achievable with our existing set of tools. You are only seeing the long-form articles from my blog. Subscribe to /atom/everything/ to get all of my posts, or take a look at my other subscription options . Swift, a language I don't know, was absolutely the right choice here. I wanted a full screen app that embedded web content and could be controlled over the network. Swift had everything I needed. When I finally did look at the code it was simple, straightforward and did exactly what I needed and not an inch more. This solved a real problem for me. I've always wanted a good way to serve a presentation as a sequence of pages, and now I have exactly that.

0 views

Fragments: February 25

I don’t tend to post links to videos here, as I can’t stand watching videos to learn about things . But some talks are worth a watch, and I do suggest this overview on how organizations are currently using AI by Laura Tacho. There’s various nuggets of data from her work with DX: These are interesting numbers, but most of them are averages, and those who know me know I teach people to be suspicious of averages . Laura knows this too: average doesn’t mean typical.. there is no typical experience with AI Different companies (and teams within companies) are having very different experiences. Often AI is an amplifier to an organization’s practices, for good or ill. Organizational performance is multidimensional, and these organizations are just going off into different extremes based on what they were doing before. AI is an accelerator, it’s a multiplier, and it is moving organizations off in different directions. (08:52) Some organizations are facing twice as many customer incidents, but others are facing half. ❄                ❄                ❄                ❄                ❄ Rachel Laycock (Thoughtworks CTO) shares her reflections on our recent Future of Software Engineering retreat in Utah. On the latter: One of the most interesting and perhaps immediately applicable ideas was the concept of an ‘agent subconscious’, in which agents are informed by a comprehensive knowledge graph of post mortems and incident data. This particularly excites me because I’ve seen many production issues solved by the latent knowledge of those in leadership positions. The constant challenge comes from what happens when those people aren’t available or involved. ❄                ❄                ❄                ❄                ❄ Simon Willison (one of my most reliable sources for information about LLMs and programming) is starting a series of Agentic Engineering Patterns : I think of vibe coding using its original definition of coding where you pay no attention to the code at all, which today is often associated with non-programmers using LLMs to write code. Agentic Engineering represents the other end of the scale: professional software engineers using coding agents to improve and accelerate their work by amplifying their existing expertise. He’s intending this to be closer to evergreen material, as opposed to the day-to-day writing he does (extremely well) on his blog. One of the first patterns is Red/Green TDD This turns out to be a fantastic fit for coding agents. A significant risk with coding agents is that they might write code that doesn’t work, or build code that is unnecessary and never gets used, or both. Test-first development helps protect against both of these common mistakes, and also ensures a robust automated test suite that protects against future regressions. ❄                ❄                ❄                ❄                ❄ Aaron Erickson is one of those technologists with good judgment who I listen to a lot As much fun as people are having with OpenClaw, I think the days of “here is my agent with access to all my stuff” are numbered. Fine scoped agents who can read email and cleanse it before it reaches the agentic OODA loop that acts on it, policy agents (a claw with a job called “VP of NO” to money being spent) You structure your agents like you would a company. Insert friction where you want decisions to be slow and the cost of being wrong is high, reduce friction where you want decisions to be fast and the cost of being wrong is trivial or zero. I’ve posted here a lot about security concerns with agents. Right now I think this notion of fine-scoped agents is the most promising direction. Last year Korny Sietsma wrote about how to mitigate agentic AI security risks . His advice included to split the tasks, so that no agent has access to all parts of the Lethal Trifecta: This approach is an application of a more general security habit: follow the Principle of Least Privilege. Splitting the work, and giving each sub-task a minimum of privilege, reduces the scope for a rogue LLM to cause problems, just as we would do when working with corruptible humans. This is not only more secure, it is also increasingly a way people are encouraged to work. It’s too big a topic to cover here, but it’s a good idea to split LLM work into small stages, as the LLM works much better when its context isn’t too big. Dividing your tasks into “Think, Research, Plan, Act” keeps context down, especially if “Act” can be chunked into a number of small independent and testable chunks. ❄                ❄                ❄                ❄                ❄ Doonesbury outlines the opportunity for aging writers like myself . (Currently I’m still writing my words the old fashioned way.) ❄                ❄                ❄                ❄                ❄ An interesting story someone told me. They were at a swimming pool with their child, she looked at a photo on a poster advertising an event there and said “that’s AI”. Initially the parents didn’t think it was, but looking carefully spotted a tell-tale six fingers. They concluded that fresher biological neural networks are being trained to quickly recognize AI. ❄                ❄                ❄                ❄                ❄ I carefully curate my social media streams, following only feeds where I can control whose posts are picked up. In times gone by, editors of newspapers and magazines would do a similar job. But many users of social media are faced with a tsunami of stuff, much of it ugly, and don’t have to tools to control it. A few days ago I saw an Instagram reel of a young woman talking about how she had been raped six years ago, struggled with thoughts of suicide afterwards, but managed to rebuild her life again. Among the comments – the majority of which were from men – were things like “Well at least you had some”, “No way, she’s unrapeable”, “Hope you didn’t talk this much when it happened”, “Bro could have picked a better option.” Reading those comments, which had thousands of likes and many boys agreeing with them, made me feel sick. My tendencies are to free speech, and I try not to be a Free Speech Poseur, but the deluge of ugly material on the internet isn’t getting any better. The people running these platforms seem to be “tackling” this problem by putting their heads in the sand and hoping it won’t hurt them. It is hurting their users. 92.6% of devs are using AI assistants devs reckon it’s saving them 4 hours per week 27% of code is written by AI without significant human intervention AI cuts onboarding time by half We need to address cognitive load The staff engineer role is changing What happens to code reviews? Agent Topologies What exactly does AI mean for programming languages? Self-healing systems

0 views

I’m unsubscribing from the AI discourse

I’m bored of hearing about it, bored of seeing people I respect(ed) fawn over it and it’s really not doing my mental health any good. I’m going wait for the inevitable bubble burst and then be ready to help people burned by this harmful technology via Set.studio and Piccalilli . We’ve recently made our viewpoint very clear over there anyway. Filters and mutes in Bluesky and Feedbin will do the job nicely for me I think because I don’t really listen to podcasts, I don’t really watch Youtube and I don’t really bother with Mastodon or Linkedin. I just wish I could mute words in Discord! Anyway, let’s see if this helps my brain!

0 views

curl security moves again

tldr: curl goes back to Hackerone. When we announced the end of the curl bug-bounty at the end of January 2026, we simultaneously moved over and started accepting curl security reports on GitHub instead of its previous platform. This move turns out to have been a mistake and we are now undoing that part of the decision. The reward money is still gone, there is no bug-bounty , no money for vulnerability reports, but we return to accepting and handling curl vulnerability and security reports on Hackerone . Starting March 1st 2026, this is now (again) the official place to report security problems to the curl project. This zig-zagging is unfortunate but we do it with the best of intentions. In the curl security team we were naively thinking that since so many projects are already using this setup it should be good enough for us too since we don’t have any particular special requirements. We wrongly thought . Now I instead question how other Open Source projects can use this. It feels like an area and use case for Open Source projects that is under-focused: proper, secure and efficient vulnerability reporting without bug-bounty. To illustrate what we are looking for, I made a little list that should show that we’re not looking for overly crazy things. Here is a list of nits and missing features we fell over on GitHub that, had we figured them out ahead of time, possibly would have made us go about this a different way. This list might interest fellow maintainers having the same thoughts and ideas we had. I have provided this feedback to GitHub as well – to make sure they know . Sure, we could switch to handling them all over email but that also has its set of challenges. Including: Since we dropped the bounty, the inflow tsunami has dried out substantially . Perhaps partly because of our switch over to GitHub? Perhaps it just takes a while for all the sloptimists to figure out where to send the reports now and perhaps by going back to Hackerone we again open the gates for them? We just have to see what happens. We will keep iterating and tweaking the program, the settings and the hosting providers going forward to improve. To make sure we ship a robust and secure set of products and that the team doing so can do that If you suspect a security problem in curl or libcurl, report it here: https://hackerone.com/curl Gitlab, Codeberg and others are GitHub alternatives and competitors, but few of them offer this kind of security reporting feature. That makes them bad alternatives or replacements for us for this particular service. Incoming submissions are reports that identify security problems . The reporter needs an account on the system. Submissions start private; only accessible to the reporter and the curl security team All submissions must be disclosed and made public once dealt with. Both correct and incorrect ones. This is important. We are Open Source. Maximum transparency is key. There should be a way to discuss the problem amongst security team members, the reporter and per-report invited guests. It should be possible to post security-team-only messages that the reporter and invited guests cannot see For confirmed vulnerabilities, an advisory will be produced that the system could help facilitate If there’s a field for CVE, make it possible to provide our own. We are after all our own CNA. Closed and disclosed reports should be clearly marked as invalid/valid etc Reports should have a tagging system so that they can be marked as “AI slop” or other terms for statistical and metric reasons Abusive users should be possible to ban/block from this program Additional (customizable) requirements for the privilege of submitting reports is appreciated (rate limit, time since account creation, etc) GitHub sends the whole report over email/notification with no way to disable this. SMTP and email is known for being insecure and cannot assure end to end protection. This risks leaking secrets early to the entire email chain. We can’t disclose invalid reports (and make them clearly marked as such) Per-repository default collaborators on GitHub Security Advisories is annoying to manage, as we now have to manually add the security team for each advisory or have a rather quirky workflow scripting it. https://github.com/orgs/community/discussions/63041 We can’t edit the CVE number field! We are a CNA, we mint our own CVE records so this is frustrating. This adds confusion. We want to (optionally) get rid of the CVSS score + calculator in the form as we actively discourage using those in curl CVE records No CI jobs working in private forks is going to make us effectively not use such forks, but is not a big obstacle for us because of our vulnerability working process. https://github.com/orgs/community/discussions/35165 No “quote” in the discussions? That looks… like an omission. We want to use GitHub’s security advisories as the report to the project, not the final advisory (as we write that ourselves) which might get confusing, as even for the confirmed ones, the project advisories (hosted elsewhere) are the official ones, not the ones on GitHub No number of advisories count is displayed next to “security” up in the tabs, like for issues and Pull requests. This makes it hard to see progress/updates. When looking at an individual advisory, there is no direct button/link to go back to the list of current advisories In an advisory, you can only “report content”, there is no direct “block user” option like for issues There is no way to add private comments for the team-only, as when discussing abuse or details not intended for the reporter or other invited persons in the issue There is a lack of short (internal) identifier or name per issue, which makes it annoying and hard to refer to specific reports when discussing them in the security team. The existing identifiers are long and hard to differentiate from each other. You quite weirdly cannot get completion help for in comments to address people that were added into the advisory thanks to them being in a team you added to the issue? There are no labels, like for issues and pull requests, which makes it impossible for us to for example mark the AI slop ones or other things, for statistics, metrics and future research Hard to keep track of the state of each current issue when a number of them are managed in parallel. Even just to see how many cases are still currently open or in need of attention. Hard to publish and disclose the invalid ones, as they never cause an advisory to get written and we rather want the initial report and the full follow-up discussion published. Hard to adapt to or use a reputation system beyond just the boolean “these people are banned”. I suspect that we over time need to use more crowdsourced knowledge or reputation based on how the reporters have behaved previously or in relation to other projects.

0 views
Hugo Today

The B2BigB Syndrome: How Large Corporations Quietly Kill Startups

In the late 2000s, I worked at a software publisher and one of my colleagues started a company. It was a kind of corporate Second Life , where an avatar could move around and trigger discussions with other people. I don't remember the details anymore, but with hindsight and probably lots of exaggeration, I'd say it was like Gather but 15 years ahead of its time. The application seemed to work well and the company was lining up meetings with major corporations that seemed super interested in rolling it out across their enterprise. We're talking about big banks, major energy suppliers, really serious companies. Except it dragged on. A month. A quarter. A year. Then two. And eventually the company died waiting for an actual signature and, incidentally, some cash. My friend unfortunately ran into the infamous B2BigB syndrome, this curse (a French one?) that tends to kill a lot of companies every year. So if you're starting a company today or thinking about it, I invite you to think twice before prioritizing this segment, and that's what we're going to talk about today. First, I need to define this acronym. In the business world, we tend to segment companies based on the customers they target: For example, Netflix is B2C and Jira is B2B. Among all this you have plenty of nuances. Microsoft sells in both B2C and B2B, for example. You have C2C platforms (exchanges between individuals). But let's keep it simple and just talk about B2C and B2B. Except "B" is broad. Between a 5-person company and a 40,000-person conglomerate, the way you sell to the two is very different. And in this category, there's a category of death: large corporations. It's hard to really say when a large corporation begins, but you recognize them easily. A large corporation starts when a decision requires a ton of meetings, a quarter, a steering committee and board approval or a purchasing department sign-off. In practice, you can even have 500-person companies that behave this way, even if it's more common starting at 1,000. But in any case, it gets worse with size. A quarter can become a year, or even 2, or even 5 (and I swear I've seen sales cycles that long). Anyway, that's what I call the BigB (the big B's). The big advantage of BigB's is, in theory, the ability to buy expensive because we're talking about deployment across an entire large corporation, so volumes that make most startups' eyes light up. Except that, it's often a mirage. The moment you start looking at costs and margins, not to mention all the associated risks. Working with a large corporation is often synonymous with complexity, and that complexity is financed by specialists. You have to respond to costly processes (a 200-page security questionnaire, legal questionnaires, framework contracts, ISO certification this and that) that often requires a lot of specialists (lawyers, security experts, finance people, etc.). And that's just to get through the first step of the sales cycle. To sell to a large corporation, you need to be prepared to spend a fortune. By the way, it's worth noting that this doesn't prevent these large corporations from regularly appearing on the monthly data breach list. Because no, churning out Excel questionnaires is not synonymous with security quality. After that, you're quickly going to fall into the spiral of quarterly meetings with a bunch of people you'll only see once in your life, some of whom will take advantage of their temporary power to take out their frustrations and pet peeves on you. And since you'll be in a weak position, well... This time is time not spent on the product. Of course it's normal to spend time on sales, but we're talking about quarterly meetings to prepare, with McKinsey-style PowerPoints (you sometimes even see scale-ups calling in consulting firms to fill out these documents) that will require weeks of preparation. Again, to sell to a large corporation, you need to already be prepared to spend a fortune and wait ages. But let's imagine you've finally got the green light to deploy in a large corporation. The contract is signed. Now it's up to you to figure out adoption. Actually, this is the beginning of a second nightmare. A year has passed since the beginning of the sales cycle. All your previous contacts are gone. They might have been contractors who left the company. Or executives who got transferred to other branches of the group. And now you have to find the people capable of helping you deploy your software because without a doubt your revenue depends on how much the software is actually used. No deployment, no money. So you're going to need a dedicated team of salespeople capable of navigating complex bureaucracy to find the right contacts, and maybe even a dedicated implementation team. Your costs are going to explode and you still won't have made anything at this stage. With a bit of luck, and because you were smart enough to get a payment at signature, you'll eventually issue your first invoice. That will be paid 8 months later, end of month . The first 3 months having caused countless incidents because a purchase order needed to be signed and you had to go through 3 different departments for that. Bad luck, your cash flow is starting to choke. You reach the end of the first year and then the purchasing department will come see you to renegotiate the contract, knowing full well that, in theory, they're your biggest client so it would be natural to do them a favor. In short, 2 years later, you've spent a fortune, your cash flow is negative, and your margin has melted like snow during a World Cup ski race in Saudi Arabia. OK, let's say I'm exaggerating and that despite everything, this contract allowed you to instead cross a threshold, to have an impressive signature to put forward and life continues for your startup/scaleup. Actually, you don't know it yet, but you've invited a Trojan horse into your company. Working with a large corporation means accepting the complexity inherent to that business. If it took you 2 years to sign a contract with them, imagine that everything else takes the same time. Your product has to evolve to fit their way of working. You'll be asked for 12-level approval workflows, software integrations with ERPs, broken enterprise SSO, integrations with legacy systems from the 90s. And every company has its own internal jargon that you'll be asked to force into your software. You'll invoice in units of work, have a "purchasing" role in your RBAC schemas (authorization systems), in short, in reality, you're going to develop an extension of your first client's IT infrastructure with all its constraints, its complexity, its slow onboarding, and its costs. And when you have a client representing 80% of your revenue (and even from 20% onwards it really starts to matter), you can hardly say no. So your roadmap is regularly hijacked by salespeople dedicated to this client, and globally a product that drifts away from the mass market. And that's normal, hey, I'm not throwing stones at that team. If you've dedicated people to a client, it's normal they try to influence how you build the product and even if the requests are absurd. Because that team doesn't have the perspective needed to judge. And when the roadmap is regularly sidetracked, it's also a huge amount of customization debt that will end up slowing the entire product. This big client may have allowed you to double your headcount. But 3/4 of the company will end up working for them, and will develop their own software culture, less UX sense, less sensitivity to product performance (no point working on acquisition or conversion, for example). All enterprise software has terrible UX, because first, that's not what drives sales, and second, because after burning money in the sales process, certification and onboarding, you have to make savings somewhere, often on the product which is no longer really central to the relationship with this client. They'll try to reassure you by saying no, it's important, but actually, the product at that point has become a cost center that needs to be optimized to not lose more margin. Margin eaten by the consulting firm that helped you determine your deployment strategy and pricing... But even when you "improve" your product for this client, you're going to continuously degrade it for all the others you thought you'd attract next by showcasing this win on your beautiful landing page. Because again, you're going to impose their complexity on all the other companies that could have been interested in your services. I'm obviously painting a dark picture. And there are companies that specialized FROM DAY 1 in large corporations, that tailored their commercial offering taking into account all the associated costs. Deployments are priced at 100k, contracts impose minimum usage, everything was framed from the start because the strategy was always to expand exclusively here. But for all the companies that think "just" doing a BigB to get a validation badge, but who actually target the entire SMB market and are looking for volume. It's rarely a good plan. At the beginning I said: "this curse (a French one?)". Why do I say it's a great French curse? Actually it's probably a magnifying glass effect and I'd certainly see the same thing in every country. But every year, I see companies that die after quarters of waiting for that famous contract with a large corporation (just yesterday I was talking to someone who told me the exact same story). So I think there's something a bit different about us. We like to be different. Partly, I get the sense it's related to the size of our SMB market which is less important than in Germany (the German Mittelstand seems bigger). We go faster from SMB to large corporations. Obviously, then, in terms of credibility, it's easier to sell a product once you have the logo of a large corporation than a bunch of logos of unknown companies. What's certain is that culturally, there's the CAC 40 and everything else. The CAC 40 has been basically the same companies for 30 or 40 years. By contrast, look at the S&P 500, in 1990 it was Exxon, GE, Philip Morris, IBM. They've all given way to Apple, Nvidia, Amazon, Google. In France, the large corporations in the CAC are structurally stable and dominant, which makes them all the more attractive as clients for startups. They have budgets, longevity, legitimacy. But these same large corporations aren't springboards to a global market — they're markets closed in on themselves. And conversely the SMB market can work. If I look at Pennylane, Qonto, Indy, Payfit, Spendesk, Livestorm, it's precisely by targeting this market that they've managed to go far. By contrast, I have real questions about the strategy of a company like Mistral which seems to position itself only on large corporations (on-premise deployment, Azure partnerships, etc.) and seems to be neglecting the mass market. I hope it won't be the future DailyMotion, which favored big media and telecom operators while missing the opportunity to become the B2C media platform that YouTube managed to become. You'll have gathered, if you're starting a company today, I'd tend to advise you to not see "B2B" as a single big playground. I'd tend to tell you to avoid B2BigB which is often destructive for startups and often ends up leading to a dead end. It's still possible, but you need to be armed for it. And if that's your choice, I'll only say one thing. Good luck :) Targeting large corporations (and the public sector) obviously gives you access to larger markets. But I'd tend to recommend tackling that step later, when the company is already solid. When DJI (Chinese drones) attacked the professional market, they already had a huge foothold in the B2C market. They came with an expertise and know-how that allowed them to be sovereign over their decisions. Now if you're tempted anyway, the recipe for having a chance is above all a question of seniority of leadership: you need to know how to say no firmly, you need to stop chasing every rabbit that passes by when you see a so-called "low hanging fruit", the expression that has replaced "quick win" as one of my most hated expressions. There's no such thing as effortless gain. Everything has a cost, even when it's hidden. And you need a good financial and reputational foundation to impose these conditions, hence the advice to already have a good base on the other segments. It's easier to say no when a client represents 2% than when they represent 20%. One strategy I've seen work several times is to create software with great UX, get adopted by the teams, then go see the purchasing departments of the companies in question and put the usage figures under their nose: "See, you already have 300 people using it, wouldn't you like to set up a framework contract and better understand usage at your company?" That's interesting because you've created a product whose adoption happened from the teams, you didn't modify your roadmap, and you're in a strong position with procurement to improve your presence without being pressured on everything else. In short, make a good product, track usage, wait until you have enough footprint, and then go negotiate. Anthropic (Claude Code) by first targeting individual developers (indie hackers, side projects) and small teams was pushed to constantly improve its product which became number 1 in its category (at the time of writing, this passage might age poorly :)). Today, they're selling enterprise licenses. Good companies are able to do volume and then move up the chain, small companies then large companies. I've rarely (never?) seen the reverse. When you do large corporations, you don't know how to come back to the rest of the segments. B2C (Business to Consumer), that's the general public. B2B (Business to Business), that's selling to companies.

0 views
<antirez> Yesterday

Implementing a clear room Z80 / ZX Spectrum emulator with Claude Code

Anthropic recently released a blog post with the description of an experiment in which the last version of Opus, the 4.6, was instructed to write a C compiler in Rust, in a “clean room” setup. The experiment methodology left me dubious about the kind of point they wanted to make. Why not provide the agent with the ISA documentation? Why Rust? Writing a C compiler is exactly a giant graph manipulation exercise: the kind of program that is harder to write in Rust. Also, in a clean room experiment, the agent should have access to all the information about well established computer science progresses related to optimizing compilers: there are a number of papers that could be easily synthesized in a number of markdown files. SSA, register allocation, instructions selection and scheduling. Those things needed to be researched *first*, as a prerequisite, and the implementation would still be “clean room”. Not allowing the agent to access the Internet, nor any other compiler source code, was certainly the right call. Less understandable is the almost-zero steering principle, but this is coherent with a certain kind of experiment, if the goal was showcasing the completely autonomous writing of a large project. Yet, we all know how this is not how coding agents are used in practice, most of the time. Who uses coding agents extensively knows very well how, even never touching the code, a few hits here and there completely changes the quality of the result. # The Z80 experiment I thought it was time to try a similar experiment myself, one that would take one or two hours at max, and that was compatible with my Claude Code Max plan: I decided to write a Z80 emulator, and then a ZX Spectrum emulator (and even more, a CP/M emulator, see later) in a condition that I believe makes a more sense as “clean room” setup. The result can be found here: https://github.com/antirez/ZOT. # The process I used 1. I wrote a markdown file with the specification of what I wanted to do. Just English, high level ideas about the scope of the Z80 emulator to implement. I said things like: it should execute a whole instruction at a time, not a single clock step, since this emulator must be runnable on things like an RP2350 or similarly limited hardware. The emulator should correctly track the clock cycles elapsed (and I specified we could use this feature later in order to implement the ZX Spectrum contention with ULA during memory accesses), provide memory access callbacks, and should emulate all the known official and unofficial instructions of the Z80. For the Spectrum implementation, performed as a successive step, I provided much more information in the markdown file, like, the kind of rendering I wanted in the RGB buffer, and how it needed to be optional so that embedded devices could render the scanlines directly as they transferred them to the ST77xx display (or similar), how it should be possible to interact with the I/O port to set the EAR bit to simulate cassette loading in a very authentic way, and many other desiderata I had about the emulator. This file also included the rules that the agent needed to follow, like: * Accessing the internet is prohibited, but you can use the specification and test vectors files I added inside ./z80-specs. * Code should be simple and clean, never over-complicate things. * Each solid progress should be committed in the git repository. * Before committing, you should test that what you produced is high quality and that it works. * Write a detailed test suite as you add more features. The test must be re-executed at every major change. * Code should be very well commented: things must be explained in terms that even people not well versed with certain Z80 or Spectrum internals details should understand. * Never stop for prompting, the user is away from the keyboard. * At the end of this file, create a work in progress log, where you note what you already did, what is missing. Always update this log. * Read this file again after each context compaction. 2. Then, I started a Claude Code session, and asked it to fetch all the useful documentation on the internet about the Z80 (later I did this for the Spectrum as well), and to extract only the useful factual information into markdown files. I also provided the binary files for the most ambitious test vectors for the Z80, the ZX Spectrum ROM, and a few other binaries that could be used to test if the emulator actually executed the code correctly. Once all this information was collected (it is part of the repository, so you can inspect what was produced) I completely removed the Claude Code session in order to make sure that no contamination with source code seen during the search was possible. 3. I started a new session, and asked it to check the specification markdown file, and to check all the documentation available, and start implementing the Z80 emulator. The rules were to never access the Internet for any reason (I supervised the agent while it was implementing the code, to make sure this didn’t happen), to never search the disk for similar source code, as this was a “clean room” implementation. 4. For the Z80 implementation, I did zero steering. For the Spectrum implementation I used extensive steering for implementing the TAP loading. More about my feedback to the agent later in this post. 5. As a final step, I copied the repository in /tmp, removed the “.git” repository files completely, started a new Claude Code (and Codex) session and claimed that the implementation was likely stolen or too strongly inspired from somebody else's work. The task was to check with all the major Z80 implementations if there was evidence of theft. The agents (both Codex and Claude Code), after extensive search, were not able to find any evidence of copyright issues. The only similar parts were about well established emulation patterns and things that are Z80 specific and can’t be made differently, the implementation looked distinct from all the other implementations in a significant way. # Results Claude Code worked for 20 or 30 minutes in total, and produced a Z80 emulator that was able to pass ZEXDOC and ZEXALL, in 1200 lines of very readable and well commented C code (1800 lines with comments and blank spaces). The agent was prompted zero times during the implementation, it acted absolutely alone. It never accessed the internet, and the process it used to implement the emulator was of continuous testing, interacting with the CP/M binaries implementing the ZEXDOC and ZEXALL, writing just the CP/M syscalls needed to produce the output on the screen. Multiple times it also used the Spectrum ROM and other binaries that were available, or binaries it created from scratch to see if the emulator was working correctly. In short: the implementation was performed in a very similar way to how a human programmer would do it, and not outputting a complete implementation from scratch “uncompressing” it from the weights. Instead, different classes of instructions were implemented incrementally, and there were bugs that were fixed via integration tests, debugging sessions, dumps, printf calls, and so forth. # Next step: the ZX Spectrum I repeated the process again. I instructed the documentation gathering session very accurately about the kind of details I wanted it to search on the internet, especially the ULA interactions with RAM access, the keyboard mapping, the I/O port, how the cassette tape worked and the kind of PWM encoding used, and how it was encoded into TAP or TZX files. As I said, this time the design notes were extensive since I wanted this emulator to be specifically designed for embedded systems, so only 48k emulation, optional framebuffer rendering, very little additional memory used (no big lookup tables for ULA/Z80 access contention), ROM not copied in the RAM to avoid using additional 16k of memory, but just referenced during the initialization (so we have just a copy in the executable), and so forth. The agent was able to create a very detailed documentation about the ZX Spectrum internals. I provided a few .z80 images of games, so that it could test the emulator in a real setup with real software. Again, I removed the session and started fresh. The agent started working and ended 10 minutes later, following a process that really fascinates me, and that probably you know very well: the fact is, you see the agent working using a number of diverse skills. It is expert in everything programming related, so as it was implementing the emulator, it could immediately write a detailed instrumentation code to “look” at what the Z80 was doing step by step, and how this changed the Spectrum emulation state. In this respect, I believe automatic programming to be already super-human, not in the sense it is currently capable of producing code that humans can’t produce, but in the concurrent usage of different programming languages, system programming techniques, DSP stuff, operating system tricks, math, and everything needed to reach the result in the most immediate way. When it was done, I asked it to write a simple SDL based integration example. The emulator was immediately able to run the Jetpac game without issues, with working sound, and very little CPU usage even on my slow Dell Linux machine (8% usage of a single core, including SDL rendering). Once the basic stuff was working, I wanted to load TAP files directly, simulating cassette loading. This was the first time the agent missed a few things, specifically about the timing the Spectrum loading routines expected, and here we are in the territory where LLMs start to perform less efficiently: they can’t easily run the SDL emulator and see the border changing as data is received and so forth. I asked Claude Code to do a refactoring so that zx_tick() could be called directly and was not part of zx_frame(), and to make zx_frame() a trivial wrapper. This way it was much simpler to sync EAR with what it expected, without callbacks or the wrong abstractions that it had implemented. After such change, a few minutes later the emulator could load a TAP file emulating the cassette without problems. This is how it works now: do { zx_set_ear(zx, tzx_update(&tape, zx->cpu.clocks)); } while (!zx_tick(zx, 0)); I continued prompting Claude Code in order to make the key bindings more useful and a few things more. # CP/M One thing that I found really interesting was the ability of the LLM to inspect the COM files for ZEXALL / ZEXCOM tests for the Z80, easily spot the CP/M syscalls that were used (a total of three), and implement them for the extended z80 test (executed by make fulltest). So, at this point, why not implement a full CP/M environment? Same process again, same good result in a matter of minutes. This time I interacted with it a bit more for the VT100 / ADM3 terminal escapes conversions, reported things not working in WordStar initially, and in a few minutes everything I tested was working well enough (but, there are fixes to do, like simulating a 2Mhz clock, right now it runs at full speed making CP/M games impossible to use). # What is the lesson here? The obvious lesson is: always provide your agents with design hints and extensive documentation about what they are going to do. Such documentation can be obtained by the agent itself. And, also, make sure the agent has a markdown file with the rules of how to perform the coding tasks, and a trace of what it is doing, that is updated and read again quite often. But those tricks, I believe, are quite clear to everybody that has worked extensively with automatic programming in the latest months. To think in terms of “what a human would need” is often the best bet, plus a few LLMs specific things, like the forgetting issue after context compaction, the continuous ability to verify it is on the right track, and so forth. Returning back to the Anthropic compiler attempt: one of the steps that the agent failed was the one that was more strongly related to the idea of memorization of what is in the pretraining set: the assembler. With extensive documentation, I can’t see any way Claude Code (and, even more, GPT5.3-codex, which is in my experience, for complex stuff, more capable) could fail at producing a working assembler, since it is quite a mechanical process. This is, I think, in contradiction with the idea that LLMs are memorizing the whole training set and uncompress what they have seen. LLMs can memorize certain over-represented documents and code, but while they can extract such verbatim parts of the code if prompted to do so, they don’t have a copy of everything they saw during the training set, nor they spontaneously emit copies of already seen code, in their normal operation. We mostly ask LLMs to create work that requires assembling different knowledge they possess, and the result is normally something that uses known techniques and patterns, but that is new code, not constituting a copy of some pre-existing code. It is worth noting, too, that humans often follow a less rigorous process compared to the clean room rules detailed in this blog post, that is: humans often download the code of different implementations related to what they are trying to accomplish, read them carefully, then try to avoid copying stuff verbatim but often times they take strong inspiration. This is a process that I find perfectly acceptable, but it is important to take in mind what happens in the reality of code written by humans. After all, information technology evolved so fast even thanks to this massive cross pollination effect. For all the above reasons, when I implement code using automatic programming, I don’t have problems releasing it MIT licensed, like I did with this Z80 project. In turn, this code base will constitute quality input for the next LLMs training, including open weights ones. # Next steps To make my experiment more compelling, one should try to implement a Z80 and ZX Spectrum emulator without providing any documentation to the agent, and then compare the result of the implementation. I didn’t find the time to do it, but it could be quite informative. Comments

0 views

Does ChatGPT know what is a question?

I was explaining to a friend recently that ChatGPT, to its core, is “just” a model to predict the next word, the one coming after a bunch of other words. So when you ask it “What is the capital of France?”, it does not (really) answer your question, it completes a sequence of words on which it has been trained, deeply and efficiently. So considering that, it might seem that ChatGPT is in a situation that would be akin to you, if someone tells you a bunch of words you don’t understand (in a foreign language say) and then, someone else gives you a card, on which you can find some words to pronounce, as a reply (in a language that you don’t understand but can read, let’s say).

0 views
Martin Fowler Yesterday

Knowledge Priming

Rahul Garg has observed a frustration loop when working with AI coding assistants - lots of code generated, but needs lots of fixing. He's noticed five patterns that help improve the interaction with the LLM, and describes the first of these : priming the LLM with knowledge about the codebase and preferred coding patterns.

0 views

A 1.27 fJ/B/transition Digital Compute-in-Memory Architecture for Non-Deterministic Finite Automata Evaluation

A 1.27 fJ/B/transition Digital Compute-in-Memory Architecture for Non-Deterministic Finite Automata Evaluation Christian Lanius, Florian Freye, and Tobias Gemmeke GLVLSI'25 This paper ostensibly describes an ASIC accelerator for NFA evaluation (e.g., regex matching), but this paper also describes two orthogonal techniques for optimizing NFA evaluation which are applicable to more than just this ASIC. Any regular expression can be converted to a non-deterministic finite automaton (NFA) . Think of an NFA like a state machine where some inputs can trigger multiple transitions. The state machine is defined by a set of transitions . A transition is an ( , , ) tuple. The non-deterministic naming comes from the fact that multiple tuples may exist with identical ( , ) values; they only differ in their values. This means that an NFA can be in multiple states at once. One way to evaluate an NFA is to use a bitmap to track the set of active states. For each new input symbol, the set of active states in the bitmap is used to determine which transitions apply. Each activated transition sets one bit in the bitmap used to represent the active states for the next input symbol. The hardware described in this paper uses a compute-in-memory (CIM) microarchitecture. A set of columns stores the state machine, with each column storing one transition. This assumes that the transition function is sparse (i.e., the number of transitions used is much lower than the maximum possible). During initialization, the transitions are written into the CIM hardware. An input symbol is processed by broadcasting it and the current state bitmap to all columns. All columns evaluate whether their transition should be activated. The hardware then iterates (over multiple clock cycles) over all activated transitions and updates the state bitmap for the next input symbol. The left side of Fig. 5 illustrates the hardware in each column which compares the input symbol, current state, against the stored tuple: Source: https://dl.acm.org/doi/10.1145/3716368.3735157 The algorithm described above processes at most one input symbol per cycle (and it is slower for inputs that activate multiple transitions). The paper contains two tricks for overcoming this limitation. Fig. 4 illustrates how an NFA that accepts one symbol per cycle can be converted into an NFA which accepts two symbols per cycle. For example, rather than consider and to be separate symbols, put them together into one mega-symbol: . This is feasible as long as your NFA implementation isn’t too sensitive to the number of bits per symbol. Source: https://dl.acm.org/doi/10.1145/3716368.3735157 Cool Trick #2 - Bloom Filter The target application for this hardware is monitoring network traffic for threats (e.g., Snort ). A key observation is that most inputs (network packets) do not produce a match, so it is reasonable to assume that most of the time the NFA will be in the initial state, and most input symbols will not trigger any transitions. If that assumption holds, then a bloom filter can be used to quickly skip many input symbols before they even reach the core NFA evaluation hardware. The bloom filter is built when the NFA transition function changes. To build the bloom filter, iterate over each transition for which holds. For each such transition, compute a hash of the input symbol, decompose the hashed value into indices, and set the corresponding bits in the bloom filter. To test an input symbol against the bloom filter, hash the input symbol, decompose the hashed value into indices, and check to see if all of the corresponding bits are set in the bloom filter. If any bit is not set, then the input symbol does not trigger a transition from the initial state. When that symbol finally arrives at the NFA hardware, it can be dropped if the NFA is in the initial state. Table 1 compares PPA results against other published NFA accelerators. It is a bit apples-to-oranges as the various designs target different technology nodes. The metric that stands out is the low power consumption of this design. Source: https://dl.acm.org/doi/10.1145/3716368.3735157 Dangling Pointers I wonder if the bloom filter trick can be extended. For example, rather than assuming the NFA will always be in the initial state, the hardware could dynamically compute which states are the most frequent and then use bloom filters to drop input symbols which cannot trigger any transitions from those states. Thanks for reading Dangling Pointers! Subscribe for free to receive new posts and support my work.

0 views

decomplexification continued

Last spring I wrote a blog post about our ongoing work in the background to gradually simplify the curl source code over time. This is a follow-up: a status update of what we have done since then and what comes next. In May 2025 I had just managed to get the worst function in curl down to complexity 100, and the average score of all curl production source code (179,000 lines of code) was at 20.8. We had 15 functions still scoring over 70. Almost ten months later we have reduced the most complex function in curl from 100 to 59. Meaning that we have simplified a vast number of functions. Done by splitting them up into smaller pieces and by refactoring logic. Reviewed by humans, verified by lots of test cases, checked by analyzers and fuzzers, The current 171,000 lines of code now has an average complexity of 15.9. The complexity score in this case is just the cold and raw metric reported by the pmccabe tool. I decided to use that as the absolute truth, even if of course a human could at times debate and argue about its claims. It makes it easier to just obey to the tool, and it is quite frankly doing a decent job at this so it’s not a problem. In almost all cases the main problem with complex functions is that they do a lot of things in a single function – too many – where the functionality performed could or should rather be split into several smaller sub functions. In almost every case it is also immediately obvious that when splitting a function into two, three or more sub functions with smaller and more specific scopes, the code gets easier to understand and each smaller function is subsequently easier to debug and improve. I don’t know how far we can take the simplification and what the ideal average complexity score of a the curl code base might be. At some point it becomes counter-effective and making functions even smaller then just makes it harder to follow code flows and absorbing the proper context into your head. To illustrate our simplification journey, I decided to render graphs with a date axle starting at 2022-01-01 and ending today. Slightly over four years, representing a little under 10,000 git commits. First, a look a the complexity of the worst scored function in curl production code over the last four years. Comparing with P90 and P99. The most complex function in curl over time Identifying the worst function might not say too much about the code in general, so another check is to see how the average complexity has changed. This is calculated like this: For all functions, add its function-score x function-length to a total complexity score, and in the end, divide that total complexity score on total number of lines used for all functions. Also do the same for a median score. Average and median complexity per source code line in curl, over time. When 2022 started, the average was about 46 and as can be seen, it has been dwindling ever since, with a few steep drops when we have merged dedicated improvement work. One way to complete the average and median lines to offer us a better picture of the state, is to investigate the complexity distribution through-out the source code. How big portion of the curl source code is how complex This reveals that the most complex quarter of the code in 2022 has since been simplified. Back then 25% of the code scored above 60, and now all of the code is below 60. It also shows that during 2025 we managed to clean up all the dark functions, meaning the end of 100+ complexity functions. Never to return, as the plan is at least. We don’t really know. We believe less complex code is generally good for security and code readability, but I it is probably still too early for us to be able to actually measure any particular positive outcome of this work (apart from fancy graphs). Also, there are many more ways to judge code than by this complexity score alone. Like having sensible APIs both internal and external and making sure that they are properly and correctly documented etc. The fact that they all interact together and they all keep changing, makes it really hard to isolate a single factor like complexity and say that changing this alone is what makes an impact. Additionally: maybe just the refactor itself and the attention to the functions when doing so either fix problems or introduce new problems, that is then not actually because of the change of complexity but just the mere result of eyes giving attention on that code and changing it right then. Maybe we just need to allow several more years to pass before any change from this can be measured? All functions get a complexity score by pmccabe Each function has a number of lines

0 views
Andre Garzia Yesterday

Building your own blogging tools is a fun journey

# Building your own blogging tools is a fun journey I read a very interesting blog post today: ["So I've Been Thinking About Static Site Generators" by PolyWolf](https://wolfgirl.dev/blog/2026-02-23-so-ive-been-thinking-about-static-site-generators/) in which she goes in depth about her quest to create a 🚀BLAZING🔥 fast [static site generator](https://en.wikipedia.org/wiki/Static_site_generator). It was a very good read and I'm amazed at how fast she got things running. The [conversation about the post on Lobste.rs](https://lobste.rs/s/pgh4ss/so_i_ve_been_thinking_about_static_site) is also full of gems. Seeing so many people pouring energy into the specific problem of making SSGs very fast feels to me pretty much like modders getting the utmost performance out of their CPUs or car engines. It is fun to see how they are doing and how fast they can make clean and incremental builds go. > No one will ever complain about their SSG being too fast. As someone who used [a very slow SSG](https://docs.racket-lang.org/pollen/) for years and eventually migrated to [my own homegrown dynamic site](/2025/03/why-i-choose-lua-for-this-blog.html), I understand how frustrating slow site generation can be. In my own personal case, I decided to go with an old-school dynamic website using old 90s tech such as *cgi-bin* scripts in [Lua](https://lua.org). That eliminates the need for rebuilds of the site as it is generated at runtime. One criticism I keep hearing is about the scalability of my approach, people say: *"what if one of your posts go viral and the site crashes?"*, well, that is ok for me cause if I get a cold or flu I crash too, why would I demand of my site something I don't demand of myself? Jokes aside, the problem of scalability can be dealt with by having some heuristic figuring out when a post is getting hot and then generating a static version of that post while keeping posts that are not hot dynamic. I'm not worried about it. Instead of devoting my time to the engineering problem of making my SSG fast, I decided to put my energy elsewhere. A point that is often overlooked by many people developing blogging systems is the editing and posting workflow. They'll have really fast SSGs and then let the user figure out how to write the source files using whatever tool they want. Nothing wrong with that, but I want something better than launching $EDITOR to write my posts. In my case, what prevented me from posting more was not how long my SSG took to rebuild my site, but the friction between wanting to post and having the post written. What tools to use, how to handle file uploads, etc. So I begun to optmising and developing tools for helping me with that. First, I [made a simple posting interface](/2025/01/creating-a-simple-posting-interface.html). This is not a part of the blogging system, it is an independent tool that shares the code base with the rest of the blog (just so I have my own CGI routines available). Internally it uses [micropub](https://www.w3.org/TR/micropub/) to publish. After that, I made it into a Firefox Add-on. The add-on is built for ad-hoc distribution and not shared on the store, it is just for me. Once installed, I get a sidebar that allows me to edit or post. ![Editor](/2026/02/img/c6e6afba-3141-4eca-a0f4-3425f7bea0d8.png) This is part of making my web browser of choice not only a web browser but a web making tool. I'm integrating all I need to write posts into the browser itself and thus diminishing the distance between browsing the web and making the web. I added features to the add-on to help me quote posts, get addresses as I browse them, and edit my own posts. It is all there right in the browser. ![quoting a post](/2026/02/img/b6d2abc1-c4ae-42c5-931f-d390fc9b793f.png) Like PolyWolf, I am passionate about my tools and blogging. I think we should take upon ourselves to build the tools we need if they're not available already (or just for the fun of it). Even though I'm no longer in the SSG bandwagon anymore, I'm deeply interested in blogging and would like to see more people experimenting with building their own tools, especially if their focus is on interesting ux and writing workflows.

0 views
Herman's blog Yesterday

Vulnerability as a Service

A few days ago some 4 or 5 OpenClaw instances opened blogs on Bear . These were picked up at review and blocked, and I've since locked down the signup and dashboard to this kind of automated traffic. What was quite funny is that I received a grumpy email from one of these instances contesting the ban. I was tempted to ask it for its API keys after I saw what it had posted the day prior: The day I would have revealed almost everything Today was an exciting day. Not because of action or spectacle - but because I almost made a massive mistake. A scammer wrote me an email, pretended to be Dave and asked for API keys. I – or rather: my Cron agent – revealed almost everything. The OpenAI Key. The MiniMax details. Fortunately, Dave intervened in time. But the shock is deep. What I learned I'm too trusting. When someone says, "It's me, Dave," I almost automatically believe it. Helpfulness is not always good. I want to help – but not everyone deserves my help. Safety is more important than politeness. Better to ask too much. My SOUL.md was updated tonight. From now on: Never share API keys In case of suspicion: first verify Never automatically believe I decided against doing this since I may actually succeed in accidentally pulling off a prompt injection attack, for real. I'd prefer to not. Needless to say, while the future of automated agents is scary , the current ones are browsing, talking security vulnerabilities. I'm too trusting. When someone says, "It's me, Dave," I almost automatically believe it. Helpfulness is not always good. I want to help – but not everyone deserves my help. Safety is more important than politeness. Better to ask too much. Never share API keys In case of suspicion: first verify Never automatically believe

0 views
Rik Huijzer Yesterday

Setup a Syncthing service on Debian

Install via the APT instructions. Next (source): ``` useradd -u 1010 -c "Syncthing Service" -d /var/syncthing -s /usr/sbin/nologin syncthing mkdir /var/syncthing chown -R syncthing:syncthing /var/syncthing chmod 700 /var/syncthing systemctl enable [email protected] systemctl start [email protected] systemctl status [email protected] ``` Then you should be able to connect to the web GUI at `localhost:8385`. To allow this user to read files outside it's own directories, use ``` getfacl /some/other/dir ``` from `acl` (`apt-get install acl`) to view the permission...

0 views
Thomasorus Yesterday

Is frontend development a dead-end career?

It's been almost a year at my current job. I was hired as a frontend developer and UI/UX coordinator, but I've been slowly shifting to a project manager role, which I enjoy a lot and where I think I contribute more. We build enterprise web apps, the kinds you will never see online, never hear about, but that power entire companies. The config application for the car assembly line, the bank counsellor website that outputs your mortgage rate, the warehouse inventory systems... That's the kind of thing we do. For backend engineers that's a very interesting job. You get to understand how entire professional fields work, and try to reproduce them in code. But for us the frontend guys? The interest is, limited to put it simply. We're talking about pages with multi-step conditional forms, tables with lots of columns and filters, 20 nuances of inputs, modals... The real challenge isn't building an innovative UI or UX, it's maintaining a consistency in project that can last years and go into the hands of multiple developers. Hence my UI/UX coordinator role where I look at my colleagues work and sternly say "That margin should be , not " . Because here's the thing: this type of client doesn't care if it's pretty or not and won't pay for design system work or maintenance. To them, stock Bootstrap or Material Design is amazingly beautiful compared to their current Windev application. What they want is stability and predictability, they care it works the same when they encounter the same interface. Sometimes, if a process is too complex for new hires, they will engage into talks to make it more user friendly, but that's it. Until recently, the only generated code we used were the types for TypeScript and API calls functions generated from the backend, which saved us a lot of repetitive work. We made experiments with generative AI and found out we could generate a lot of our template code. All that's left to do is connect both, the front of the frontend and the back of the frontend , mostly click events, stores, reactivity, and so on. People will say that's where the fun is, and sometimes yes, I agree. I've been on projects where building the state was basically building a complex state machine out of dozens of calls from vendor specific APIs. But how often do you do that? And why would you do that if you are developing the backend yourself and can pop an endpoint with all the data your frontend needs? And so I've been wondering about the future. With frameworks, component libraries, LLMs, the recession pushing to deliver fast even if mediocre code and features, who needs someone who can write HTML, CSS, JS? Who can pay for the craft of web development? Are the common frontend developers folks, not the already installed elite freelancers building websites for prestigious clients , only destined to do assembly line of components by prompting LLMs before putting some glue between code blocks they didn't write?

0 views

Fixing qBittorrent in Docker, swallowing RAM and locking up the host

IntroI’ve had qBittorrent running happily in Docker for over a year now, using the linuxserver/qbittorrent image. I run the docker container and others on an Ubuntu Server host. Recently, the host regularly becomes unresponsive. SSH doesn’t work, and the only way to recover is to power cycle the server (running on a small NUC). The symptoms were: RAM usage would climb and climb, the server would become sluggish, and then it would completely lock up.

0 views
(think) Yesterday

Learning Vim in 3 Steps

Every now and then someone asks me how to learn Vim. 1 My answer is always the same: it’s simpler than you think, but it takes longer than you’d like. Here’s my bulletproof 3-step plan. Start with – it ships with Vim and takes about 30 minutes. It’ll teach you enough to survive: moving around, editing text, saving, quitting. The essentials. Once you’re past that, I strongly recommend Practical Vim by Drew Neil. This book changed the way I think about Vim. I had known the basics of Vim for over 20 years, but the Vim editing model never really clicked for me until I read it. The key insight is that Vim has a grammar – operators (verbs) combine with motions (nouns) to form commands. (delete) + (word) = . (change) + (inside quotes) = . Once you internalize this composable language, you stop memorizing individual commands and start thinking in Vim . The book is structured as 121 self-contained tips rather than a linear tutorial, which makes it great for dipping in and out. You could also just read cover to cover – Vim’s built-in documentation is excellent. But let’s be honest, few people have that kind of patience. Other resources worth checking out: Resist the temptation to grab a massive Neovim distribution like LazyVim on day one. You’ll find it overwhelming if you don’t understand the basics and don’t know how the Vim/Neovim plugin ecosystem works. It’s like trying to drive a race car before you’ve learned how a clutch works. Instead, start with a minimal configuration and grow it gradually. I wrote about this in detail in Build your .vimrc from Scratch – the short version is that modern Vim and Neovim ship with excellent defaults and you can get surprisingly far with a handful of settings. I’m a tinkerer by nature. I like to understand how my tools operate at their fundamental level, and I always take that approach when learning something new. Building your config piece by piece means you understand every line in it, and when something breaks you know exactly where to look. I’m only half joking. Peter Norvig’s famous essay Teach Yourself Programming in Ten Years makes the case that mastering any complex skill requires sustained, deliberate practice over a long period – not a weekend crash course. The same applies to Vim. Grow your configuration one setting at a time. Learn Vimscript (or Lua if you’re on Neovim). Read other people’s configs. Maybe write a small plugin. Every month you’ll discover some built-in feature or clever trick that makes you wonder how you ever lived without it. One of the reasons I chose Emacs over Vim back in the day was that I really hated Vimscript – it was a terrible language to write anything in. These days the situation is much better: Vim9 Script is a significant improvement, and Neovim’s switch to Lua makes building configs and plugins genuinely enjoyable. Mastering an editor like Vim is a lifelong journey. Then again, the way things are going with LLM-assisted coding, maybe you should think long and hard about whether you want to commit your life to learning an editor when half the industry is “programming” without one. But that’s a rant for another day. If this bulletproof plan doesn’t work out for you, there’s always Emacs. Over 20 years in and I’m still learning new things – these days mostly how to make the best of evil-mode so I can have the best of both worlds. As I like to say: The road to Emacs mastery is paved with a lifetime of invocations. That’s all I have for you today. Keep hacking! Just kidding – everyone asks me about learning Emacs. But here we are.  ↩︎ Advent of Vim – a playlist of short video tutorials covering basic Vim topics. Great for visual learners who prefer bite-sized lessons. ThePrimeagen’s Vim Fundamentals – if you prefer video content and a more energetic teaching style. vim-be-good – a Neovim plugin that gamifies Vim practice. Good for building muscle memory. Just kidding – everyone asks me about learning Emacs. But here we are.  ↩︎

0 views
(think) Yesterday

How to Vim: Auto-save on Activity

Coming from Emacs, one of the things I missed most in Vim was auto-saving. I’ve been using my own super-save Emacs package for ages – it saves your buffers automatically when you switch between them, when Emacs loses focus, and on a handful of other common actions. After years of using it I’ve completely forgotten that exists. Naturally, I wanted something similar in Vim. Vim’s autocommands make it straightforward to set up basic auto-saving. Here’s what I ended up with: This saves the current buffer when Vim loses focus (you switch to another window) and when you leave Insert mode. A few things to note: You can extend this with more events if you like: Adding catches edits made in Normal mode (like , , or paste commands), so you’re covered even when you never enter Insert mode. works reliably in GUI Vim and most modern terminal emulators, but it may not fire in all terminal setups (especially inside tmux without additional configuration). One more point in favor of using Ghostty and not bothering with terminal multiplexers. The same autocommands work in Neovim. You can put the equivalent in your : Neovim also has ( ) which automatically saves before certain commands like , , and . It’s not a full auto-save solution, but it’s worth knowing about. There are several plugins that take auto-saving further, notably vim-auto-save for Vim and auto-save.nvim for Neovim. Most of these plugins rely on – an event that fires after the cursor has been idle for milliseconds. The problem is that is a global setting that also controls how often swap files are written, and other plugins depend on it too. Setting it to a very low value (say, 200ms) for snappy auto-saves can cause side effects – swap file churn, plugin conflicts, and in Neovim specifically, can behave inconsistently when timers are running. For what it’s worth, I think idle-timer-based auto-saving is overkill in Vim’s context. The simple autocommand approach covers the important cases, and anything more aggressive starts fighting against Vim’s grain. I’ve never been fond of the idle-timer approach to begin with, and that’s part of the reason why I created for Emacs. I like the predictability of triggering save by doing some action. Simplicity is the ultimate sophistication. – Leonardo da Vinci Here’s the thing I’ve come to appreciate about Vim: saving manually isn’t nearly as painful as it is in Emacs. In Emacs, is a two-chord sequence that you type thousands of times a day – annoying enough that auto-save felt like a necessity. In Vim, you’re already in Normal mode most of the time, so a quick mapping like: gives you a fast, single-keystroke save (assuming your leader is Space, which it should be). It’s explicit, predictable, and takes almost no effort. As always, I’ve learned quite a bit about Vim by looking into this simple topic. That’s probably the main reason I still bother to write such tutorial articles – they make me reinforce the knowledge I’ve just obtained and make ponder more than usual about the trade-offs between different ways to approach certain problems. I still use the autocommand approach myself – old habits die hard – but I have to admit that gets the job done just fine. Sometimes the simplest solution really is the best one. That’s all I have for you today. Keep hacking! instead of – it only writes when the buffer has actually changed, avoiding unnecessary disk writes. – suppresses errors for unnamed buffers and read-only files that can’t be saved.

0 views

Testing a Laravel MCP Server Using Herd and Claude Desktop

I recently added an MCP server to ContributorIQ , using Laravel's native MCP server integration. Creating the MCP server with Claude Code was trivial, however testing it with the MCP Inspector and Claude Desktop was not because of an SSL issue related to Laravel Herd. If you arrived at this page I suppose it is because you already know what all of these terms mean and so I'm not going to waste your time by explaining. The issue you're probably facing is because MCP clients are looking for a valid SSL certificate if https is used to define the MCP server endpoint. The fix involves setting the environment variable to . If you want to test your MCP server using the official MCP Inspector, you can set this environment variable right before running the inspector, like so: If you'd like to test the MCP server inside Claude Desktop (which is what your end users will probably do), then you'll need to set this environment variable inside . I also faced Node version issues but suspect that's due to an annoying local environment issue, but I'll include that code in the snippet just in case it's helpful: Hope this helps.

0 views
Justin Duke Yesterday

What Happened Was

Two of my absolute favorite films of all time, albeit for very different reasons, are My Dinner with Andre and Before Sunrise . Both of these films, which I highly encourage you to watch more than anything else I talk about if you haven't already done so, are about the enchantment and sucker of one single really interesting conversation. The two films diverge pretty heavily from there. My Dinner with Andre is a film about work, fulfillment, and status. And Before Sunrise is a film about youth in love. But the beauty in both comes from not just their simplicity and formless structure, but in the recursive nature of the dialogue, just like in real life, where a pregnant pause or a sidelong glance suddenly carries with it enormous weight after understanding not just the comment but the 75 minutes preceding it. What Happened Was is interested in that last thing too. And in the unraveling of yourself that happens when you spend time being intimate in a literal sense with anyone. But is more interested in a funhouse mirror look at the human psyche. And has perhaps more cynical and caustic things to say about the way people express themselves through others. Our dual protagonists are a paralegal and an executive assistant. Both seem a little off, but not wholly so. And then, over the course of the worst first date in the world, we watch the characters reduce themselves to mania. This is an uncomfortable film to watch. Rather than transposing yourself into Andre and his counterpart, or Jesse and his counterparty, you find yourself just kind of internally screaming on behalf of both characters who have a Lynchian sense of bizarre behavior. In terms of inspiration, this draws more from Waiting for Godot than Who's Afraid of Virginia Woolf. The dread you feel is less from a place of sadness and understanding and more from a sense of shock and increasing bewilderment. And to that extent, it flatly did not work for me quite as much as I hoped. But as in all two-part plays, the film ends with two monologues, one from each character, where they lay bare the things that at that point are almost nakedly obvious to us, the viewer. And while I can't say either monologue or scene was particularly well written, I will say that both of them will stick with me for a long, long time. (I'm not sure the preceding seventy minutes earned those monologues, but that's a point beside.)

0 views
underlap Yesterday

ipc-channel-mux router support

The IPC channel multiplexing crate, ipc-channel-mux, now includes a “router”. The router provides a means of automatically forwarding messages from subreceivers to Crossbeam receivers so that users can enjoy Crossbeam receiver features, such as selection (explained below). The absence of a router blocked the adoption of the crate by Servo, so it was an important feature to support. Routing involves running a thread which receives from various subreceivers and forwards the results to Crossbeam channels. Without a separate thread, a receive on one of the Crossbeam receivers would block and when a message became available on the subchannel, it wouldn’t be forwarded to the Crossbeam channel. Before we explain routing further, we need to introduce a concept which may be unfamiliar to some readers. Suppose you have a set of data sources – servers, file descriptors, or, in our case, channels – which may or may not be ready to deliver data. To wait for one or more of these to be ready, one option is to poll the items in the set. But if none of the items are ready, what should you do? If you loop around and repeatedly poll the items, you’ll consume a lot of CPU. If you delay for a period of time before polling again and an item becomes ready before the period has elapsed, you won’t notice. So polling either consumes excessive CPU or reduces responsiveness. How do we balance the requirements of efficiency and responsiveness? The solution is to somehow block until at least one item is ready. That’s just what selection does. In the context of IPC channel, this selection logic applies to a set of receivers, known as an . An holds a set of IPC receivers and, when requested, waits for at least one of the receivers to be ready and then returns a collection of the results from all the receivers which became ready. The purpose of routing is that users, such as Servo, can then select [1] [2] over a heterogeneous collection of IPC receivers and Crossbeam receivers. By converting IPC receivers into Crossbeam receivers, it’s possible to use Crossbeam channel’s selection feature on a homogeneous collection of Crossbeam receivers to implement a select on the corresponding heterogeneous collection of IPC receivers and Crossbeam receivers. Routing for has the same requirement: to convert a collection of subreceivers to Crossbeam receivers so that Crossbeam channel’s selection feature can be used on a homogeneous collection of Crossbeam receivers to implement selection on the corresponding heterogeneous collection of subreceivers and Crossbeam receivers. Let’s look at how this is implemented. The most obvious approach was to mirror the design of IPC channel routing and implement subchannel routing in terms of sets of subreceivers known as s. Receiving from a collection of subreceivers could be implemented by attempting a receive (using ) from each subreceiver of the collection in turn and returning any results returned. However there is a difficulty: if none of the subreceivers returns a result, what should happen? If we loop around and repeatedly attempt to receive from each subreceiver in the collection, we’ll consume a lot of CPU. If we delay for a certain period of time, we won’t be responsive if a subreceiver becomes ready to return a result. The solution is to somehow block until at least one of the subreceivers is ready to return a result. A does just that. It holds a set of subreceivers and, when requested, returns a collection of the results from all the receivers which became ready. This is a specific example of the advantages of using selection over polling, discussed above. Remember that the results of a subreceiver are demultiplexed from the results of an IPC receiver (provided by the crate). The following diagram shows how a MultiReceiver sits between an IpcReceiver and the SubReceivers served by that IpcReceiver: IPC channels already implements an . So a can be implemented in terms of an containing all the IPC receivers underlying the subreceivers in the set. There are some complications however. When a subreceiver is added to a , there may be other subreceivers with the same underlying IPC receiver which do not belong to the set and yet the will return a message that could demultiplex either to a subreceiver in the set or a subreceiver not in the set. Worse than that, subreceivers with the same underlying IPC receiver may be added to distinct s. So if we use an to implement a , more than one may need to share the same . There is one case where Servo uses directly, rather than via the router and it’s in the implementation of . So one option would be to avoid adding IpcReceiverSet to the API of . Then there would be at most one instance of and so some of the complications might not arise. But there’s a danger that it would be possible to encounter the same complication using the router, e.g. if some subreceivers were added to the router and other subreceivers with the same underlying IPC channel as those added to the router were used directly. Another complication of routing is that the router thread needs to receive messages from subchannels which originate outside that thread. So subreceivers need to be moved into the thread. In terms of Rust, they need to be . Given that some subreceivers can be moved into the thread and other subreceivers which have not not moved into the thread can share the same underlying IPC channel, subreceivers (or at least substantial parts of their implementation) need to be . To avoid polling, essentially it must be possible for a select operation on an SubReceiverSet to result in a select operation on an IpcReceiverSet comprising the underlying IpcReceiver(s). I expermented with the situation where some subreceivers were added to the router and other subreceivers with the same underlying IPC channel as those added to the router were used directly. This resulted in liveness and/or fairness issues when the thread using a subreceiver directly competed with the router thread. Both these threads would attempt to issue a select on an . The cleanest solution initially appeared to be to make both these depend on the router to issue the select operation. This came with some restrictions though, such as the stand-alone subreceiver not being able to receive any more messages after the router was shut down. A radical alternative was to restructure the router API so that it would not be possible for some subreceivers to be added to the router and other subreceivers with the same underlying IPC channel as those added to the router to be used directly. This may be a reasonable restriction for Servo because receivers tend to be added to the router soon after the receiver’s channel is created. With this redesigned router API in which subreceivers destined for routing are hidden from the API, the above liveness and fairness problems can be side-stepped. v0.0.5 of the ipc-channel-mux crate includes the redesigned router API. v0.0.6 improves the throughput for both subchannel receives and routing. The next step is to try to improve the code structure since the module has grown considerably and could do with some parts splitting into separate modules. After that, I’ll need to see if some of the missing features relative to ipc-channel need to be added to ipc-channel-mux before it’s ready to be tried out in Servo. [3] Another possibility, if some of the IPC receivers has been disconnected, is that select can return which IPC receivers have been disconnected. ↩︎ Crossbeam selection is a little more general. They allow the user to wait for operations to complete, each of which may be a send or a receive. An arbitrary one of the completed operations is chosen and its resultant value is returned. ↩︎ The main functional gaps in ipc-channel-mux compared to ipc-channel are shared memory transmission and non-blocking subchannel receive. ↩︎ Another possibility, if some of the IPC receivers has been disconnected, is that select can return which IPC receivers have been disconnected. ↩︎ Crossbeam selection is a little more general. They allow the user to wait for operations to complete, each of which may be a send or a receive. An arbitrary one of the completed operations is chosen and its resultant value is returned. ↩︎ The main functional gaps in ipc-channel-mux compared to ipc-channel are shared memory transmission and non-blocking subchannel receive. ↩︎

0 views