Posts in Sql (20 found)
Aran Wilkinson 2 days ago

How I lost a database and learned to actually use AI

I ran AI-generated SQL without reading it properly and lost a database. The experience changed how I work with AI tools, replacing freeform chat sessions with a structured process built around PRDs, small tasks, and frequent commits.

Simon Willison 1 week ago

Vibe coding and agentic engineering are getting closer than I'd like

I recently talked with Joseph Ruscio about AI coding tools for Heavybit's High Leverage podcast: Ep. #9, The AI Coding Paradigm Shift with Simon Willison . Here are some of my highlights, including my disturbing realization that vibe coding and agentic engineering have started to converge in my own work. One thing I really enjoy about podcasts is that they sometimes push me to think out loud in a way that exposes an idea I've not previously been able to put into words. A few weeks after vibe coding was first coined I published Not all AI-assisted programming is vibe coding (but vibe coding rocks) , where I firmly staked out my belief that "vibe coding" is a very different beast from responsible use of AI to write code, which I've since started to call agentic engineering . When Joseph brought up the distinction between the two I had a sudden realization that they're not nearly as distinct for me as they used to be: Weirdly though, those things have started to blur for me already, which is quite upsetting. I thought we had a very clear delineation where vibe coding is the thing where you're not looking at the code at all. You might not even know how to program. You might be a non-programmer who asks for a thing, and gets a thing, and if the thing works, then great! And if it doesn't, you tell it that it doesn't work and cross your fingers. But at no point are you really caring about the code quality or any of those additional constraints. And my take on vibe coding was that it's fantastic, provided you understand when it can be used and when it can't. A personal tool for you, where if there's a bug it hurts only you, go ahead! If you're building software for other people, vibe coding is grossly irresponsible because it's other people's information. Other people get hurt by your stupid bugs. You need to have a higher level than that. This contrasts with agentic engineering where you are a professional software engineer. 
You understand security and maintainability and operations and performance and so forth. You're using these tools to the highest of your own ability. I'm finding the scope of challenges I can take on has gone up by a significant amount because I've got the support of these tools. But I'm still leaning on my 25 years of experience as a software engineer. The goal is to build high quality production systems: if you're building lower quality stuff faster, I think that's bad. I want to build higher quality stuff faster. I want everything I'm building to be better in every way than it was before. The problem is that as the coding agents get more reliable, I'm not reviewing every line of code that they write anymore, even for my production level stuff. I know full well that if you ask Claude Code to build a JSON API endpoint that runs a SQL query and outputs the results as JSON, it's just going to do it right. It's not going to mess that up. You have it add automated tests, you have it add documentation, you know it's going to be good. But I'm not reviewing that code. And now I've got that feeling of guilt: if I haven't reviewed the code, is it really responsible for me to use this in production? The thing that really helps me is thinking back to when I've worked at larger organizations where I've been an engineering manager. Other teams are building software that my team depends on. If another team hands over something and says, "hey, this is the image resize service, here's how to use it to resize your images"... I'm not going to go and read every line of code that they wrote. I'm going to look at their documentation and I'm going to use it to resize some images. And then I'm going to start shipping my own features. And if I start running into problems where the image resizer thing appears to have bugs or the performance isn't good, that's when I might dig into their Git repositories and see what's going on. 
But for the most part I treat that as a semi-black box that I don't look at until I need to. I'm starting to treat the agents in the same way. And it still feels uncomfortable, because human beings are accountable for what they do. A team can build a reputation. I can say "I trust that team over there. They built good software in the past. They're not going to build something rubbish because that affects their professional reputations." Claude Code does not have a professional reputation! It can't take accountability for what it's done. But it's been proving itself anyway - time and time again it's churning out straightforward things and doing them right in the style that I like. There's an element of the normalization of deviance here - every time a model turns out to have written the right code without me monitoring it closely there's a risk that I'll trust it at the wrong moment in the future and get burned. It used to be if you found a GitHub repository with a hundred commits and a good readme and automated tests and stuff, you could be pretty sure that the person writing that had put a lot of care and attention into that project. And now I can knock out a git repository with a hundred commits and a beautiful readme and comprehensive tests of every line of code in half an hour! It looks identical to those projects that have had a great deal of care and attention. Maybe it is as good as them. I don't know. I can't tell from looking at it. Even for my own projects, I can't tell. So I realized what I value more than the quality of the tests and documentation is that I want somebody to have used the thing. If you've got a vibe coded thing which you have used every day for the past two weeks, that's much more valuable to me than something that you've just spat out and hardly even exercised. If you can go from producing 200 lines of code a day to 2,000 lines of code a day, what else breaks? 
The entire software development lifecycle was, it turns out, designed around the idea that it takes a day to produce a few hundred lines of code. And now it doesn't. It's not just the downstream stuff, it's the upstream stuff as well. I saw a great talk by Jenny Wen , who's the design leader at Anthropic, where she said we have all of these design processes that are based around the idea that you need to get the design right - because if you hand it off to the engineers and they spend three months building the wrong thing, that's catastrophic. There's this whole very extensive design process that you put in place because that design results in expensive work. But if it doesn't take three months to build, maybe the design process can be a whole lot riskier because cost, if you get something wrong, has been reduced so much. When I look at my conversations with the agents, it's very clear to me that this is moon language for the vast majority of human beings. There are a whole bunch of reasons I'm not scared that my career as a software engineer is over now that computers can write their own code, partly because these things are amplifiers of existing experience. If you know what you're doing, you can run so much faster with them. [...] I'm constantly reminded as I work with these tools how hard the thing that we do is. Producing software is a ferociously difficult thing to do. And you could give me all of the AI tools in the world and what we're trying to achieve here is still really difficult. [...] Matthew Yglesias, who's a political commentator, yesterday tweeted , "Five months in, I think I've decided that I don't want to vibecode — I want professionally managed software companies to use AI coding assistance to make more/better/cheaper software products that they sell to me for money." And that feels about right to me. I can plumb my house if I watch enough YouTube videos on plumbing. I would rather hire a plumber. 
On the threat to SaaS providers of companies rolling their own solutions instead: I just realized it's the thing I said earlier about how I only want to use your side project if you've used it for a few weeks. The enterprise version of that is I don't want a CRM unless at least two other giant enterprises have successfully used that CRM for six months. [...] You want solutions that are proven to work before you take a risk on them.

iDiallo 2 weeks ago

Have You Seen the New Excel?

Stop coding. Stop hiring. Stop building. While the tech world obsesses over large language models and neural networks, I discovered the real disruptor that has been hiding in plain sight. Mine was originally installed on my desktop in 1992. And now, it's about to change everything in the world. We are talking about Microsoft Excel, of course. If you haven't looked at a spreadsheet lately, you are missing the most significant leap in enterprise capability since the invention of the corporation itself. We are entering an era of No-Code where the code was never needed in the first place. My own job as a software engineer is not safe, and I'm looking forward to the future. Developers from every walk of life are afraid, and for good reasons. You hear the complaints constantly: "How can I ensure the code works? I can't possibly review a PR with a thousand files. It's unmaintainable." This is a crisis of confidence in the software engineering sector. This specific anxiety has never existed in the Excel ecosystem. Code is called code for a reason: it is meant for the machine to read, not people. In Excel, we don't worry about "reviewing pull requests." We worry about results. The spreadsheet handles the logic and you handle the business outcome. It abstracts away the complexity so you don't have to pretend to understand it. And let's talk about the intimidation factor. Have you ever opened a modern codebase? It's a labyrinth of directories, dependencies, and config files. Where do you even start? It's paralyzing. How do you get started with Excel? You double-click an icon. It opens. It is a file. It is a grid. You type. It works. The barrier to entry is non-existent, yet the ceiling is infinite. If you are getting paid a high salary and are watching how efficient Excel is, you will be terrified. Companies are realizing they don't need distinct software solutions for distinct problems. They just need a grid.
We are seeing enterprises replace entire departments with a single file. That is not an exaggeration. The HR department? Replaced by an org chart linked to a payroll calculator. The supply chain team? Replaced by a real-time inventory tracker. The marketing department? Replaced by a pie chart and a mailing list. Why pay for Salesforce? A well-formatted sheet with conditional formatting is a Customer Relationship Manager (CRM). Who even knows how to write SQL? SQL is legacy. A workbook with 1 million rows is a database. Jira is redundant when you have Gantt charts generated from cell dependencies. On top of it all, it has AI. It comes equipped with Microsoft Copilot for 365 apps, not to be confused with Windows Copilot, Microsoft Copilot, Copilot for Teams, Copilot+, Copilot Chat, or Copilot Web. This is the Copilot. It sits inside your grid, ready to extrapolate trends from column D and write your VLOOKUPs for you. While other AI startups are fighting for funding rounds, this integration is already live, embedded directly into the tool that runs the global economy. You aren't hearing much about Venture Capital funding or Series A rounds when it comes to Excel. Why? Because it is already profitable. It doesn't need a roadmap to profitability because it is the roadmap. While other platforms burn cash to acquire users, Excel is the default operating system of business. It requires no adoption curve. It requires no evangelists. It requires only that you open it and have a Microsoft 365 apps subscription. Total Vertical Integration Excel is versatile. It is a text editor; you can write your novel in cell B2. It is a design tool; pixel-perfect layouts can be achieved by merging cells and removing gridlines. It is an IDE; you can write and execute VBA code directly within the environment. It handles the visual and the logical simultaneously. You can present a quarterly report to the board while the underlying formulas are calculating the ROI of the lunch break. 
It creates a seamless workflow where the input and the output exist in the same plane. Privacy, Scalability, and The Cloud For the enterprise client, Excel offers the ultimate flexibility. Are you concerned about data sovereignty? Run your entire global operation locally on a ThinkPad from 2012. The file sits on your hard drive, unbreachable by the cloud. Do you need to scale? Push it to the cloud. Collaborate in real-time. Ten thousand employees can edit the same cell, creating a hive mind of productivity that traditional management structures cannot compete with. Oh, if you want to add support for crypto, just add a new worksheet. Batteries are included. The Future is a Cell The economy is shifting. We are moving away from specialized labor and toward generalized grid management. If your job involves inputting data, processing data, or presenting data, Excel has already automated you. It doesn't sleep, it doesn't ask for a raise, and it doesn't make calculation errors unless you tell it to. Best of all, it doesn't hallucinate. The grid is absolute, it is infinite and the grid is the future. Learn Excel now, or get left behind. That’s what AI Hype sounds like to my ears. Yes, it’s a great tool. But I don’t think we are all gonna die and lose our jobs. The same way we didn’t die or lose our jobs to Excel. None of these things are jokes about Excel by the way, you can run entire companies from it. I'm tempted to just start hyping it every day until everyone gets annoyed.

Robin Moffatt 2 weeks ago

Materialized Tables in Apache Flink

Flink added support for what it calls Materialized Tables in 1.20 , released in 2024. You can read about the design and motivations in FLIP-435 . In a nutshell, Materialized Tables provide a way to include the SQL to populate and refresh a table as part of its definition.

alikhil 3 weeks ago

How to Quickly Prepare for Software Engineering Interviews

A few months ago, I found myself needing to prepare for a series of job interviews within a very limited timeframe. It was a stressful experience, but it ultimately worked out well. I decided to share my notes and reflections in case they’re helpful to others in a similar situation. This is especially relevant if you’re not actively job hunting and suddenly receive an interview invitation, leaving you with limited time to prepare but a strong desire to maximize your chances of success. Disclaimer : The tips described in this post may be more useful for senior engineers with hands-on experience and engineering intuition. The internet is full of articles listing all possible HR interview questions. I recommend spending a bit of time on them just to understand what to expect and not be surprised. However, in my humble opinion, there are two main points to focus on during HR interview preparation. First, you need a short story that tells your experience briefly. Avoid listing every bullet point from your CV. Instead, focus on highlighting your key achievements. Also, your story must be aligned with the position you are applying to. Yes, you might need to adjust your story for different jobs at different companies. Second, it’s important to have a clear motivation. Why do you want to change your job, and why this company/role? What kind of job are you looking for? If you have some experience doing System Design interviews or have never done it, start by learning the Delivery framework . Understand each section. Watch at least one video on how it’s done. The more, the better. These videos from Hello Interview channel are really good, though. If you are applying to a FAANG company, you may search for leaked system design questions from that company and spend some time preparing for them. But there is no guarantee that you will get the same topic, thus I would not recommend spending all your time here. If you can, do a mock interview. 
Ask a friend or find someone to practice with. If you can’t, then try to walk through alone, but talk through everything out loud. During the interview, treat the interviewer as a colleague, ask questions, ensure you understand the problem, and that you have not missed any important requirements before building the design of the system. Don’t rush. This part is really tricky. If the company tends to use LeetCode-style interviews, there is no shortcut here. You need to solve hundreds of them to really feel confident. You may need to refresh your memory on algorithms you feel less confident about (for example, I always forget about corner cases for binary search). Again, if it’s a big / well-known company, you can try to search for leaked coding interview questions. S.T.A.R (situation task action result) & C.A.R.L (context action result learning) There are dozens of questions you could be asked in behavioral interviews. And you’re expected to structure your answers using the STAR framework. This means you need to tell a story by defining a context, your actions, and results. You could go and just prepare a STAR format answer to all such questions, but it will take a lot of time, and it’s suboptimal. This, combined with the fact that the same stories can be used for different questions, makes the situation easier for you. You can prepare 7–10 stories that will cover most of the questions. During preparation, you can write them as text, but don’t read them during the interview. It tends to sound unnatural. When telling your story using the STAR method, make sure your final sentence clearly highlights a positive outcome. Adjust your tone to emphasize this closing part so it stands out. The STAR framework is a standard. But also check CARL in some questions, it would be good to tell what you have learned from that story. Here are some materials that helped me to prepare for a behavioral interview: Some companies have such an interview stage. 
It’s quite unpopular but still exists. You’re asked to present a project or problem you worked on. You explain the context, problem, solution, results, and your role in this story. It’s like showing the result of your work to colleagues from different departments/teams. This stage is very open-ended. You are not given specific instructions, and there is not much information on the internet with recommendations on how to prepare and conduct such interviews. When I found out I would have this interview, I was initially shocked and unsure how to prepare, as I didn’t know what to expect. It wasn’t until I realized that in reality, it’s you, the interviewee, who rules this interview . You choose the project, decide what to include and omit, control the level of detail, and you are coming up with the story you know, with all the answers for all possible questions, because it’s your story. So, make the most of this stage. Prepare your story, make a few slides / notes / architecture sketches. Don’t dig into details too much. Leave a space for the questions. And even if there is no dedicated interview, you may be asked to tell in detail about a certain problem/project you were working on. So, be prepared. Have your story! When answering open-ended questions, aim to tell stories where the scale of the problem matches the level of the role you’re applying for . For example, if you are asked, “Tell me about a challenging/interesting problem/task you were working on recently.” Optimizing an SQL query by adding an index may be fine for junior roles, but it won’t carry enough weight for senior positions. Interviewers would expect to hear something bigger, challenging, higher stakes, and often involving cross-team collaboration, such as migrating a large system to Kubernetes. Question back . You should ask questions to learn more about the company, their culture, the hiring manager’s management style, and what they like or dislike about their work. 
Prepare a list of questions before the interview. Start preparing in advance . Even if you’re not planning to change jobs anytime soon, you can begin investing in your future by: solving one LeetCode problem a day; keeping track of tasks/projects you’ve completed, along with your achievements (many companies require this anyway for performance reviews), which would be a foundation for your stories in behavioral and project walkthrough interviews; and keeping your CV and LinkedIn up to date.
Materials that helped me prepare for behavioral interviews:
Hello Interview - Behavioral Interview Discussion with Ex-Meta Hiring Committee Member - a must-watch for behavioral interviews, although I would recommend watching it even before the HR interview, because it gives a bunch of helpful tips about self-presentation.
https://thebehavioral.substack.com/ - Strategies, tips, and resources to prepare for your next behavioral interview from a FAANG+ insider.

Armin Ronacher 1 month ago

Absurd In Production

About five months ago I wrote about Absurd , a durable execution system we built for our own use at Earendil, sitting entirely on top of Postgres and Postgres alone. The pitch was simple: you don’t need a separate service , a compiler plugin , or an entire runtime to get durable workflows. You need a SQL file and a thin SDK. Since then we’ve been running it in production, and I figured it’s worth sharing what the experience has been like. The short version: the design held up, the system has been a pleasure to work with, and other people seem to agree. Absurd is a durable execution system that lives entirely inside Postgres. The core is a single SQL file ( absurd.sql ) that defines stored procedures for task management, checkpoint storage, event handling, and claim-based scheduling. On top of that sit thin SDKs (currently TypeScript , Python and an experimental Go one) that make the system ergonomic in your language of choice. The model is straightforward: you register tasks, decompose them into steps, and each step acts as a checkpoint. If anything fails, the task retries from the last completed step. Tasks can sleep, wait for external events, and suspend for days or weeks. All state lives in Postgres. If you want the full introduction, the original blog post covers the fundamentals. What follows here is what we’ve learned since. The project got multiple releases over the last five months. Most of the changes are things you’d expect from a system that people actually started depending on: hardened claim handling, watchdogs that terminate broken workers, deadlock prevention, proper lease management, event race conditions, and all the edge cases that only show up when you’re running real workloads. A few things worth calling out specifically. Decomposed steps. The original design only had , where you pass in a function and get back its checkpointed result. That works well for many cases but not all. 
Sometimes you need to know whether a step already ran before deciding what to do next. So we added / , which give you a handle you can inspect before committing the result. This turned out to be very useful for modeling intentional failures and conditional logic. This in particular is necessary when working with “before call” and “after call” type hook APIs. Task results. You can now spawn a task, go do other things, and later come back to fetch or await its result. This sounds obvious in hindsight, but the original system was purely fire-and-forget. Having proper result inspection made it possible to use Absurd for things like spawning child tasks from within a parent workflow and waiting for them to finish. This is particularly useful for debugging with agents too. absurdctl . We built this out as a proper CLI tool. You can initialize schemas, run migrations, create queues, spawn tasks, emit events, retry failures from the command line. It’s installable via or as a standalone binary. This has been invaluable for debugging production issues. When something is stuck, being able to just and see exactly where it stopped is a very different experience from digging through logs. Habitat . A small Go application that serves up a web dashboard for monitoring tasks, runs, checkpoints, and events. It connects directly to Postgres and gives you a live view of what’s happening. It’s simple, but it’s the kind of thing that makes the system more enjoyable for humans. Agent integration. Since Absurd was originally built for agent workloads, we added a bundled skill that coding agents can discover and use to debug workflow state via . There’s also a documented pattern for making pi agent turns durable by logging each message as a checkpoint. The thing I’m most pleased about is that the core design didn’t need to change all that much. The fundamental model of tasks, steps, checkpoints, events, and suspending is still exactly what it was initially. 
We added features around it, but nothing forced us to rethink the basic abstractions. Putting the complexity in SQL and keeping the SDKs thin turned out to be a genuinely good call. The TypeScript SDK is about 1,400 lines. The Python SDK is about 1,900 but most of this comes from the complexity of supporting colored functions. Compare that to Temporal’s Python SDK at around 170,000 lines. It means the SDKs are easy to understand, easy to debug, and easy to port. When something goes wrong, you can read the entire SDK in an afternoon and understand what it does. The checkpoint-based replay model also aged well. Unlike systems that require deterministic replay of your entire workflow function, Absurd just loads the cached step results and skips over completed work. That means your code doesn’t need to be deterministic outside of steps. You can call or in between steps and things still work, because only the step boundaries matter. In practice, this makes it much easier to reason about what’s safe and what isn’t. Pull-based scheduling was the right choice too. Workers pull tasks from Postgres as they have capacity. There’s no coordinator, no push mechanism, no HTTP callbacks. That makes it trivially self-hostable and means you don’t have to think about load management at the infrastructure level. I had some discussions with folks about whether the right abstraction should have been a durable promise . It’s a very appealing idea, but it turns out to be much more complex to implement in practice. It’s however in theory also more powerful. I did make some attempts to see what absurd would look like if it was based on durable promises but so far did not get anywhere with it. It’s however an experiment that I think would be fun to try! The primary use case is still agent workflows. An agent is essentially a loop that calls an LLM, processes tool results, and repeats until it decides it’s done. Each iteration becomes a step, and each step’s result is checkpointed. 
If the process dies on iteration 7, it restarts and replays iterations 1 through 6 from the store, then continues from 7. But we’ve found it useful for a lot of other things too. All our crons just dispatch distributed workflows with a pre-generated deduplication key from the invocation. We can have two cron processes running and they will only trigger one absurd task invocation. We also use it for background processing that needs to survive deploys. Basically anything where you’d otherwise build your own retry-and-resume logic on top of a queue. Absurd is deliberately minimal, but there are things I’d like to see. There’s no built-in scheduler. If you want cron-like behavior, you run your own scheduler loop and use idempotency keys to deduplicate. That works, and we have a documented pattern for it , but it would be nice to have something more integrated. There’s no push model. Everything is pull. If you need an HTTP endpoint to receive webhooks and wake up tasks, you build that yourself. I think that’s the right default as push systems are harder to operate and easier to overwhelm but there are cases where it would be convenient. In particular there are quite a few agentic systems where it would be super nice to have webhooks natively integrated (wake on incoming POST request). I definitely don’t want to have this in the core, but that sounds like the kind of problem that could be a nice adjacent library that builds on top of absurd. The biggest omission is that it does not support partitioning yet. That’s unfortunate because it makes cleaning up data more expensive than it has to be. In theory supporting partitions would be pretty simple. You could have weekly partitions and then detach and delete them when they expire. The only thing that really stands in the way of that is that Postgres does not have a convenient way of actually doing that. The hard part is not partitioning itself, it’s partition lifecycle management under real workloads. 
If a worker inserts a row whose lands in a month without a partition, the insert fails and the workflow crashes. So you need a separate maintenance loop that always creates future partitions far enough ahead for sleeps/retries, and does that for every queue. On the delete side, the safe approach is , but getting that to run from doesn’t work because it cannot be run within a transaction, but runs everything in one. I don’t think it’s an unsolvable problem, but it’s one I have not found a good solution for and I would love to get input on . This brings me a bit to a meta point on the whole thing which is what the point of Open Source libraries in the age of agentic engineering is. Durable Execution is now something that plenty of startups sell you. On the other hand it’s also something that an agent would build you and people might not even look for solutions any more. It’s kind of … weird? I don’t think a durable execution library can support a company, I really don’t. On the other hand I think it’s just complex enough of a problem that it could be a good Open Source project void of commercial interests. You do need a bit of an ecosystem around it, particularly for UI and good DX for debugging, and that’s hard to get from a throwaway implementation. I don’t think we have squared this yet, but it’s already much better to use than a few months ago. If you’re using Absurd, thinking about it, or building adjacent ideas, I’d love your feedback. Bug reports, rough edges, design critiques, and contributions are all very welcome—this project has gotten better every time someone poked at it from a different angle.
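The checkpoint-based replay model described in this post (each step is a checkpoint, and a retried task skips steps that already completed) can be illustrated with a minimal Python sketch. This is not Absurd's SDK or API: `CheckpointStore`, `TaskContext`, `agent_loop`, and `run_with_retry` are hypothetical names, and an in-memory dict stands in for the Postgres tables, purely to show the idea.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple


class CheckpointStore:
    """Hypothetical in-memory stand-in for a Postgres checkpoint table."""

    def __init__(self) -> None:
        self._results: Dict[Tuple[str, str], Any] = {}

    def has(self, task_id: str, step: str) -> bool:
        return (task_id, step) in self._results

    def load(self, task_id: str, step: str) -> Any:
        return self._results[(task_id, step)]

    def save(self, task_id: str, step: str, result: Any) -> None:
        self._results[(task_id, step)] = result


class TaskContext:
    """One attempt at running a task; assumed shape, not the real SDK."""

    def __init__(self, task_id: str, store: CheckpointStore) -> None:
        self.task_id = task_id
        self.store = store
        self.executed: List[str] = []  # steps that actually ran in this attempt

    def step(self, name: str, fn: Callable[[], Any]) -> Any:
        # A completed step is never re-run: its cached result is returned.
        if self.store.has(self.task_id, name):
            return self.store.load(self.task_id, name)
        result = fn()
        self.store.save(self.task_id, name, result)
        self.executed.append(name)
        return result


def agent_loop(ctx: TaskContext, fail_at: Optional[int] = None) -> int:
    """Toy 'agent' with five iterations; each iteration is one step."""
    total = 0
    for i in range(1, 6):
        if fail_at == i:
            # Simulate the worker process dying between steps.
            raise RuntimeError(f"worker died before iteration {i}")
        total += ctx.step(f"iteration-{i}", lambda i=i: i * 10)
    return total


def run_with_retry(store: CheckpointStore) -> Tuple[List[str], List[str], int]:
    first = TaskContext("task-1", store)
    try:
        agent_loop(first, fail_at=4)
    except RuntimeError:
        pass  # a real system would release the claim and schedule a retry
    second = TaskContext("task-1", store)
    result = agent_loop(second)
    return first.executed, second.executed, result
```

In this sketch the second attempt re-enters the loop from the top, but only iterations 4 and 5 actually execute; iterations 1 through 3 are served from the store. Code between steps runs again on every attempt, which is why only the step boundaries need to be treated as durable.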

Evan Schwartz 1 month ago

Scour - March Update

Hi friends, In March, Scour scoured 813,588 posts from 24,029 feeds (7,131 were newly added) and 488 new users signed up. Welcome! Here's what's new in the product: Scour now does a better job of ensuring that your feed draws from a mix of sources and that no single interest or group of interests dominates. I had made a number of changes along these lines in the past, but they were fiddly and the diversification mechanism wasn't working that well. Under the hood, Scour now does a first pass to score how similar articles are to your interests and then has a separate step for selecting posts for your feed while keeping it diverse on a number of different dimensions. Content from websites and groups of interests you tend to like and/or click on more are now given slightly more room in your feed. Conversely, websites and groups of interests you tend to dislike or not click on will be given a bit less space. For Scour, I'm always trying to think of how to show you more content you'll find interesting -- without trapping you in a small filter bubble (you can read about my ranking philosophy in the docs). After a number of iterations, I landed on a design that I'm happy with. I hope this strikes a good balance between making sure you see articles from your favorite sources, while still leaving room for the serendipity of finding a great new source that you didn't know existed. After you click an article, Scour now explicitly asks you for your reaction. These reactions help tune your feed slightly , and they help me improve the ranking algorithm over time. Before, the reaction buttons were below every post but that made them a bit hard to hit intentionally and easy to touch accidentally. If you want to react to an article without reading it first, you can also find them in the More Options ( ) menu. Thanks to Shane Sveller for pointing out that the reaction buttons were too small on mobile! 
Scour now supports exact keyword matching, in addition to using vector embeddings for semantic similarity. Articles that are similar to one of your interests but don't use the exact words or phrases from your interest definition will be ranked lower. Right now this applies to interests marked as "Specific" or "Normal" (this is also automatically determined when interests are created). This should cut down on the number of articles you see that are mis-categorized or clearly off-topic. Thanks to Alex Miller and an anonymous user for prompting this, and thanks to Alex, JackJackson, mhsid, snuggles, and anders_no for all the Off-Topic reports! Sometimes, I see an article on Hacker News or elsewhere and wonder why it didn't show up in my Scour feed. You can now paste links into the Why didn't I see this? page, and it will give you a bit of an explanation. You can also report that so I can look into it more and continue to improve the ranking algorithm over time.
Here were some of my favorite posts that I found on Scour in March: For anyone building products, this is a good reminder to make sure you're trying out and experiencing the bad parts of your product: Bored of eating your own dogfood? Try smelling your own farts! This was a brief, interesting history and technical overview of document formats, from to and and why Markdown "won": Markdown Ate The World. A reminder that any user-generated input, including repo branch names, can be malicious: OpenAI Codex: How a Branch Name Stole GitHub Tokens. This is a very detailed and informative visual essay explaining how quantization (compression) for large language models works: Quantization from the ground up. I'm not currently using Turso (the Rust rewrite of SQLite), but I think what they're doing is interesting, including this experimental version that speaks the Postgres SQL dialect: pgmicro. And because I like making -- and eating -- sour sourdough: How To Make Sourdough Bread More (Or Less) Sour. Happy Scouring! P.S. If you use a coding agent like Claude Code, I also wrote up A Rave Review of Superpowers, a plugin that makes me much more productive.

Robin Moffatt 1 months ago

Look Ma, I made a JAR! (Building a connector for Kafka Connect without knowing Java)

As a non-Java coder, for the last ten years I’ve stumbled my way through the JVM-centric world of "big data" (as it was called then), relying on my wits with SQL and config files to just about muddle through. One of the things that drew me to Kafka Connect was that I could build integrations between Kafka and other systems without needing to write Java, and the same again for ksqlDB and Flink SQL—now stream processing was available to mere RDBMS mortals and not just the Java adonises. One thing defeated me though; if a connector didn’t exist for Kafka Connect, then I was stuck. I’d resort to cobbled-together pipelines leaning heavily on kcat (formerly kafkacat), such as I did in this blog post. I built some cool analytics on top of maritime AIS data about ships' locations, but the foundations were shaky at best: No failure logic, no schema handling, no bueno. What I really needed was a connector for Kafka Connect. However for that, you need Java. I don’t write Java. But Claude can write Java.

Stratechery 1 months ago

An Interview with Nvidia CEO Jensen Huang About Accelerated Computing

Good morning, This week’s Stratechery Interview is running early this week, as I had the chance to speak in person with Nvidia CEO Jensen Huang at the conclusion of his GTC 2026 keynote, which took place yesterday in San Jose. I have spoken to Huang four times previously, in May 2025, March 2023, September 2022, and March 2022. In this interview we talk about a keynote that came across like a bit of a history lesson, and what that says about a company that still feels small even as it’s the most valuable in the world, as well as what has changed in AI over the last year. Then we discuss a number of announcements that might feel like a change in approach (although Huang disagrees), including Nvidia’s burgeoning CPU business and the Groq acquisition. Finally we discuss scarcity in the AI stack and how that affects Nvidia, the China question, and Huang’s frustration with doomers and their influence in Washington. As a reminder, all Stratechery content, including interviews, is available as a podcast; click the link at the top of this email to add Stratechery to your podcast player. On to the Interview: This interview is lightly edited for clarity. Jensen Huang, welcome back to Stratechery. JH: It’s great to be with you. You literally just walked off the stage, went a little long, I think, but you spent a lot of this keynote, which I quite enjoyed, explaining what Nvidia is, starting with the history of the programmable shader, the launch of CUDA 20 years ago. 
We don’t need to spend too much time recounting this, you did a good job, and Stratechery readers are certainly familiar — sorry, this is a bit of a lead up here — Stratechery readers are familiar, and I remember this exactly, someone asked me to explain how is it that Nvidia can announce so many things at a single GTC, this is like six, seven years ago, maybe even longer than that, and I explained the whole thing with CUDA and all the libraries is it’s just sort of doing the same thing again and again, but for specific industries. That’s the story you told today, and it’s kind of a back-to-the-future moment after the last few GTC keynotes have kind of just been pretty AI-centered, CES was pretty AI-centered. Why did you feel the need to tell that story now? To recast CUDA and why is it important? JH: Well, because we’re going into a whole lot of new industries and because AI is going to use tools, and when AI uses tools, those are tools that we created for humans. AI is going to use Excel, AI is going to use Photoshop, AI is going to use logic synthesis tools, Synopsys tools, and Cadence tools. Those tools have to be super-accelerated, they’re going to use databases, they have to be super-accelerated because AIs are fast. And so I think in this era, we need to get all of the world’s software now as fast as possible accelerated, and then put them in front of AI so that AI could agentically use them. So is that a bit where we’ve already done this for a bunch of sectors and now we’re going to do it for a bunch more? JH: Yeah, a whole bunch more. For example, data processing. Well, that was sort of a surprise. I didn’t expect you to be opening with an IBM partnership. JH: Yeah, right, that kind of puts it in perspective. I mean, they really started it all. You wrote last week that AI is a five-layer cake: power, chips, infrastructure, models, and applications. 
Is there a concern that in the last four or five years, that you are worried about being squeezed into the chips box, so it’s important to both remind people and also yourselves about you being this vertically integrated company — not just in terms of building systems, but into the entire software stack, you’re not just a chip company. JH: I guess my mind doesn’t start with, “What I’m not”, it starts with, “What do we need to be?”. And back then, we realized that accelerated computing was a full stack problem, you have to understand the application to accelerate it. We realized that we had to understand the application, we had to have the developer ecosystem, we needed to have excellent expertise in algorithm development, because the old algorithms that were developed for CPUs don’t work well for GPUs, so we had to rewrite, refactor algorithms so that they could be accelerated by our GPUs. If we do that, though, you get 50 times speed up, 100 times speed up, 10 times speed up, and so it’s totally worth it. I think since the very beginning, we realized, “Ok, what do we want to do, and what does it take to achieve that?”. Now, today we’re building AI factories, we’re building AI infrastructure all over the world. That’s a lot more than building chips, and building chips is obviously important, it’s the foundation of it. Right, that’s like one full stack of doing the networking and doing the storage, and now you’re into CPUs. JH: Now you’ve got to put it all together into these giant systems — a gigawatt factory is probably $50, $60 billion. Out of that $50, $60 billion, probably about, call it $15, $17 or so, is infrastructure: land, power, and shell. The rest of it is compute and networking and storage and things like that, and so that level of investment, unless you’re helping customers achieve the level of confidence that they’re going to succeed in building it, you just have no hope, nobody’s going to risk $50 billion. 
So I think that that’s the big idea, that we need to help customers not just build chips, but build systems and then after we build systems, not just build systems, but build AI factories. AI Factories has a lot of software inside, it’s not just our software, it’s a ton of software for cooling management and electricals and things like that, and redundancies and a lot of it is over-designed, it’s over-designed because nobody talked to each other. When you have a lot of people who don’t talk to each other, integrate systems, you have to, by definition, over-design your part of it. But if we’re working together as one team, we’ll make sure that we can push the limits and get more throughput out of the power that we have or save money for whatever throughput you want to have. Just to go back to that software bit, you mentioned Excel wasn’t designed to be used by AI. You have things like Claude has this new functionality to use Excel, so when you talk about that, you want to invest in these libraries, is that to enable models like that to do better? Or is that something for Microsoft or for enterprises — you want to use this, you don’t want to be beholden to this sort of other player in the world? JH: Well, SQL’s a good example. SQL’s used by people, and we bang on the SQL systems like anybody else, and it is the ground truth of businesses. Well, it’s not just gonna be people banging on our SQL database now, it’s gonna be a whole bunch of agents banging on it. Right, they’re gonna do it way faster. JH: They’re gonna need to do it way faster. And so the first thing we have to do is accelerate SQL, that’s kind of the simple logic of it. That makes sense. In terms of models, you noted that language models are only one category. “Some of the most transformative work is happening in protein AI, chemical AI, physical simulation, robots, and autonomous systems”, and this is from the piece you wrote last week. 
You’ve previously made this point while noting in other keynotes, “Everything is a token”, I think, is a phrase that you’ve used before. Do you see transformers as being the key to everything, or do we need new fundamental breakthroughs to enable these applications? JH: We need all kinds of new models. For example, transformers, its ability to do attention scales quadratically, and so how do you have quite long memory? How can you have a conversation that lasts a very long time and not have the KV cache essentially become, over time, garbage? Or have entire racks of solid-state drives that are holding KV cache. JH: And of course, let’s say that you were able to record all of our conversation, when you go back and reference some conversation, which part of the reference is most important? There needs to be some new architecture that thinks about attention properly and be able to process that very quickly. We came up with a hybrid architecture of a transformer with an SSM, and that is what enables Nemotron 3 to be super intelligent and super efficient at the same time, that’s an example. Another example is coming up with models that are geometry aware, meaning a lot of things in life, in nature, are symmetrical. And so when you’re generating these models, you don’t want it to generate what is just statistically plausible, it has to also be physically based, and so it has to come out symmetrical. And so cuEquivariance, for example, allows you to do things like that. So we have all these different technologies that are designed — or, for example, when we’re generating tokens in words, it comes out in chunks at a time, little bits, tokens at a time, when you’re generating motion, you need it to be continuous. And so there’s discrete information that you generate and understand, and there’s continuous information that you want to generate and understand. Transformers is not ideal for both. Right, that makes sense. 
One more quote from the piece, you write, “In the past year, AI crossed an important threshold. Models became good enough to be useful at scale. Reasoning improved. Hallucinations dropped. Grounding improved dramatically. For the first time, applications built on AI began generating real economic value”. What specifically was that change? Because I think about the timing, I feel like this upcoming year is definitely about agents, I just wrote about it today — but for last year, was that the reasoning? Was that the big breakthrough? JH: Generative, of course, was a big breakthrough, but it hallucinated a lot and so we had to ground it, and the way to ground it is reasoning, reflection, retrieval, search, so we helped it ground. Without reasoning, you couldn’t do any of that, and so reasoning allowed us to ground the generative AI. And once you ground it, then you could use that system to reason through problems and decompose it, and decompose it into things that you could actually do something about, and so the next generation was tool use. Turns out it probably tells you something that search was a service that nobody paid for, and the reason for that is getting information is very important and very useful but it’s not something you pay for. The bar to reach to get somebody to pay you for something has to be higher than just information. “Where’s a good restaurant?” — information is just, I don’t think is worthy enough to get paid for. Some people pay for it, I pay for it. We now know that we’ve now crossed that threshold. Not only is it able to converse with us and generate information for us, it can now, of course, do things for us. Coding is just a perfect example for that. 
If you think about it for a second, you realize this, coding is not really the same modality as language, you have to teach it empty spaces and indentations and symbols, it’s almost like a new modality and you can’t generate code just one token at a time, you have to reflect on the chunk of code. That chunk of code has to be factored properly and has to be optimal and has to obviously compile, it has to be grounded not on probable truth, it has to be grounded on execution. Right, does it run or not? JH: It has to run or not. And so I think the code, learning that modality was a big deal. Once you’re able to now do — we pay engineers several hundred thousand dollars a year to code, and so now they have a coding assistant. They could think about architecture. Instead of describe programs in code, which is very laborious, they can now describe software in specification, which is much more abstract and allows them to be much more productive. And so they describe specification, architecture, they’re able to use their time to solve and innovate, and so our software engineers 100% use coding agents now. Many of them haven’t generated a line of code in a while, but they’re super productive and super busy. Do you think there is a temptation to over-extrapolate from coding, though, precisely because it’s verifiable? You have this agent idea where they can go — it’s not just that they will generate code, then they can actually verify it, see if it works, if it doesn’t, they can go back and do it again, and this can happen all without humans because there’s a clear, “Does it work or not?”. JH: Well, because you can reflect, you could have, let’s say, design a house. Designing a house or designing a kitchen used to be the work of architects, designers, but now you could have carpenters do that. So now you elevated the capability of a carpenter, now you use an agent for that carpenter to go design a house, design a kitchen, come up with some interesting styles. 
The agent doesn’t have some tool to execute. However, you could give an example. You say, “these are the styles I’m looking for, I want it to be aesthetic like that”. Because the agent is able to reflect, is able to compare its quality of code, its quality of result against some reference, it could say, “You know what, it didn’t turn out as well as I hope, I’m going to go back at it again”, and so it iterates. It doesn’t have to be fully executable, in fact, the more probabilistic, the more aesthetic, the more subjective, if you will, AI actually does better. Right, well that’s why you almost have two extremes. You have generating images where there’s no right answer and then you have coding where there is a right answer and AI seems to do good on those sides and the question is how much will it collapse into the middle there. JH: We’re fairly certain it could do architecture now, we’re fairly certain it could design kitchens and living rooms. Well, to this point, one of the big things with agents coming online is, you’ve talked a lot about accelerated computing, I think you’ve trash talked as it were, maybe the CPUs to the day, they’re all gonna be removed, like everything’s gonna be accelerated. Suddenly CPUs are hot again. It turns out they’re pretty useful and important to the extent you are selling CPUs now, how’s it feel to be a CPU salesman? JH: There’s no question that Moore’s law is over. Accelerated computing is not parallel computing. Go back in time — 30 years ago, there were probably 10, 20, 30 parallel computing companies, only one survived, Nvidia and the reason why is because we had the good wisdom of recognizing the goal wasn’t to get rid of the CPU, the goal was to accelerate the application. So what I just falsely accused you of was actually true for everybody else. JH: We were never against CPUs, we don’t want to violate Amdahl’s Law. 
Accelerated computing, in fact, inside our systems, we choose the best CPUs, we buy the most expensive CPUs, and the reason for that is because that CPU, if not the best and not the most performant, holds back millions of dollars of chips. When it comes to branch prediction, you worried about wasting CPU time, now you’re worried about wasting GPU time. JH: That’s right, you just never can have GPUs be squandered, GPU time be idle. And so we always use the best CPUs to the point where we went and built Grace so that we could have the highest performance single-threaded CPU and move data around a lot faster. And so accelerated computing was never against CPUs, my basis is still true that Moore’s Law is over, the idea that you would use general purpose computing and just keep adding transistors, that is so dead, and so I think fundamentally we’re not against CPUs. However, these agents are now able to do tool use, and the tools that they want to use are tools created for humans and they’re basically two types. There’s the stuff that we run in data centers and most of it is SQL, most of it is database related, and the other type is personal computers. We’re now going to have AIs that are able to learn unstructured tool use, the first type of tool use is structured. CLIs are tool use, APIs, they’re all structured tool use, the commands are very explicit, the arguments are explicit, the way you talk to that application is very specific. However, there’s a whole bunch of applications that were never designed to have CLIs and APIs and those tools need AIs to learn multi-modality, unstructured, and it has to go and be able to go surf a website and it has to be able to recognize buttons and pull down menus and just kind of work its way through it like we do. That tool use are going to want to use PCs and we have both sides, we have incredibly great data processing systems, and as you know, Nvidia’s PCs are the most performant in the world. 
So what makes an agent-focused CPU different from other CPUs? So you’re going to have a rack of just Vera CPUs. JH: Oh, really good, excellent. So the way that CPUs were designed in the last decade, they were all designed for hyperscale cloud and the way that hyperscale cloud monetizes CPUs is by the CPU core. So you want to design CPUs that have as many cores as possible that are rentable, the performance of it is kind of secondary. You’re dealing with web latency by and large. JH: That’s exactly right, exactly. And so the number of CPU instances is what you’re optimizing for. That’s why you see these CPUs with a couple of hundred, 300, 400 cores coming. Well, they’re not performant and for tool use, where you have this GPU waiting for the tool use— And you’re going over NVLink. JH: That’s right, you want the fastest single-threaded computer you can possibly get. So is it just the speed? Or does the CPU itself need to be increasingly parallel so it doesn’t have misses and things like that? Or so it’s like just all the way down the pipeline is very different? JH: Yeah, the most important thing is single-threaded performance and the I/O has to be really great. Because it’s now in the data center, the number of single-threaded instances running is going to be quite high and therefore, it’s going to bang on the I/O system, it’s going to bang on the memory controller really hard. Vera’s bandwidth-per-CPU core, bandwidth-per-CPU, is three times higher than any CPU that’s ever been designed, and so it’s designed so that it has lots and lots of I/O bandwidth and lots and lots of memory bandwidth, so that it never throttles the CPU. If the CPU gets throttled, then we’re holding back a whole bunch of GPUs. Is this Vera rack, is it still, you talked about it being very tightly linked to the GPU rack, but is it still disaggregated so that the GPUs can be serving multiple different Vera cores? Whereas you have a Vera core on a board with- Okay, got it, that makes sense. 
How does your Intel partnership and the NVLink thing fit into this, if at all? JH: Excellent. Some of the world is happy with Arm, some of the world still needs, particularly, you know, enterprise computing, a whole bunch of stacks that people don’t want to move and so x86 is really important to that. Has the resiliency of x86 code been surprising to you? JH: No. Nvidia’s PC is still x86, all of our workstations are x86. I did want to congratulate you, as you talked about in the keynote today, you are the token king. So in your article, you also talked about that energy is the first principle of AI infrastructure and the constraint on how much intelligence the system can produce. If that’s the case, if it’s the amount of tokens you can produce and you’re constrained by how much energy is in the data center, why do companies even try to compete with the token king? JH: It’s going to be hard because it’s not reasonable to build a chip and somehow achieve results that are fairly dramatic. Even in the case of Groq, Groq couldn’t deliver the results unless we paired it with Vera Rubin. Well tell me about this, my next question was about Groq. JH: So if you look at the entire envelope of inference, on the one hand, you want to deliver as much throughput as possible, on the other hand, you want to deliver as many smart tokens as possible — the smarter the token, the higher price you could charge. These two balance, this tension of maximizing throughput on the one hand, maximizing intelligence on the other hand, is really, really tough to work out. I do have to say, last year you had a slide talking about this Pareto Curve, and you talked about, I think it was when you introduced Dynamo, how your GPUs could cover the whole thing, and so you didn’t have to think about it, just buy an Nvidia GPU, and Dynamo will do both. But now you’re here saying, “Well, it doesn’t quite cover the whole thing”. JH: We cover the whole thing still better than any system that can do it. 
Where we could extend that Pareto is particularly on the extremely high token rates and extremely low latency, but it also reduces the throughput. However, because of coding agents, because they’re now AI agents that are producing really, really great economics, and because the agents are being attached to humans that are actually making extremely, I mean, they’re extremely valuable. Right, they’re even more expensive than GPUs. JH: And so I want to give my software engineers the highest token rate service, and so if Anthropic has a tier of Anthropic Claude Code that increases coding rate by a factor of 10, I would pay for it, I would absolutely pay for it. So you’re building this product for yourself? JH: I think most great products are kind of because you see a pain point and you feel the pain point and you know that that’s where the market’s going to go. We would love for our coding agents to run 10 times faster, but in order to do that, it’s just very, very difficult to do that in a high throughput system and so we decided to add the Groq low latency system to it and then we basically co-run, co-process. Right. And is this just separating decode and prefill? JH: We’re going to do even the high processing, high FLOPS part of decode, attention part of decode. So you’re disaggregating even down to the decode level. JH: That’s right, and that requires really tight coupling and really, really close integration of software. So how are you able to do that? You say you’re shipping later this year, this deal was just announced a couple of months ago. JH: Well, we started working on disaggregated inferencing, Dynamo really put Nvidia’s ideas on the table. The day that I announced Dynamo, everybody should have internalized that, I was already thinking about, “How do we disaggregate inference across a heterogeneous infrastructure more finely?”, and Groq’s architecture is such an extreme version of ours, they had a very hard time. 
Dynamo was a year ago, and Groq just happened sort of over Christmas. Was there an event that sort of made you think this needed to happen? JH: Well remember, I announced Dynamo a year ago, we’ve been working on Dynamo for two years, so we’ve been thinking about disaggregated inference thing for two, three years, and we started working with Groq maybe before we announced the deal, maybe six months earlier. So we’ve been thinking about working with them about unifying Grace Blackwell and Groq fairly early on. So the interaction with them, I really like the team and we don’t want their cloud service. They had another business that they really believe in and they still believe in, they’re doing really well with it and that wasn’t a part of the business that we wanted, so we decided to acquire the team and license the technology. Then we’ll take the fundamental architecture and we’ll evolve it from here. So it was just a happy coincidence or not a happy coincidence, maybe not a happy coincidence. JH: Strategic serendipity. Because OpenAI, you know, has an instance now with Cerebras that they announced in January. JH: That was done completely independent of us and frankly, I didn’t even know about it, but it wouldn’t have changed anything. I think the Groq architecture is the one I would have chosen anyways, it’s much more sensible to us. Was this the first time where there was sort of an ASIC approach that sort of made you raise your eyebrows like, “Oh, that’s actually fundamentally different”? JH: No, Mellanox. That’s a good example. JH: Yeah, Mellanox. We took a bunch of our computing stack and we put it into the Mellanox stack. NVLink wouldn’t be possible, you know, at the scale we’re talking about without the in-network fabric computing that we did with Mellanox. Taking the software stack, disaggregating it, and putting it where it needs to be, is a specialty of Nvidia. We’re not obsessed about where computing is done, we just want to accelerate the application. 
Remember, Nvidia is an accelerated computing company, not a GPU company. Right. So you talk about power being the constraint. When your customers are thinking about what to buy, we could buy all sort of traditional GPUs, or we could buy these LPU racks. Is that just, they should be thinking about it in terms of you’re just confident they can drive way more revenue? JH: It really depends on the kind of products they have. Suppose you really don’t have enterprise use cases at the moment, I don’t really think that adding Groq makes much sense, and the reason for that is because most of your customers are free tier customers, and they’re moving towards paying. So it might be two-thirds free tier, one-third paid, in that case, adding Groq to it, you’re adding a lot of expense. You’re taking some power, it’s not worth it. Complexity. And you’re taking away servers, the opportunity cost. JH: What you could actually be serving the free tier, yeah. However, if you have Anthropic-like business and you have OpenAI-like business where Codex is capturing really great economics, but you just wish you could generate more tokens, this is where adding that accelerator can really boost your revenues. Are we actually constrained by power right now in 2026 or by fab capacity or what? Everyone’s saying we don’t have enough supply. What’s the actual limiting factor? JH: I think it’s probably close on everything. You couldn’t double anything, really. Because you’ll hit some other constraints. It does feel like, though, the U.S. has I think done a pretty good job of scrounging up power, maybe more than people expected a couple years ago, it feels like chips are really much more of a limiter right now. JH: Our supply chain is fairly well planned. You know, we were planning for a very, very big year, and we’re planning for a very big year next year. We saw all the soju drinking and fried chickens. JH: (laughing) Yeah, right. 
We’re planning, we plan for, in our supply chain, we have got, you know, a couple of hundred partners in our supply chain and we’ve got long-term partnerships with them. So I feel pretty good about that part of it. I don’t think we have twice as much power as we need, I don’t think we have twice as much chip supply as we need, I don’t think we have twice of anything as we need. But I think everything is, everything that I see in the horizon, we will be able to support from a supply chain perspective and the thing that I wish probably more than anything is that all the land, power, and shell would just get stood up faster. Is it fair to say, is there a bit where Nvidia is actually the biggest beneficiary of scarcity, though, to the extent it exists? Like, if there’s a power scarcity, you’re the most efficient chip, so you’re going to be utilizing that power better. Or if there’s fab capacity, like you just said, you’ve been out there securing the supply chain, you got it sort of sorted, are you the big winners in that regard? JH: Well, we’re the largest company in this space, and we did a good job planning. And we plan upstream of the supply chain, we plan downstream of the supply chain and so I think we’ve done a really good job preparing everyone for growth. Right, but is this a bit where, at its core, why not having access to the Chinese market maybe is a threat? Like if China ends up with plenty of power and plenty of chips, even though those chips are only 7nm, they have the capacity to build up an ecosystem to potentially rival CUDA in the long run, is that the concern that you have? JH: There’s no question we need to have American tech stack in China, and I’ve been very consistent about that since the very beginning recognizing that open source software will come. No country contributes more to open source software than China does and we also know that 50% of the world’s AI researchers come from China, and we also know that they’re really inventive. 
DeepSeek is not a nominal piece of technology, it’s really, really good. And Kimi is really good, and Qwen is really good and they make unique contributions to architecture, and they make unique contributions to the AI stack so I think we have to take these companies seriously. To the extent that American tech stack is what the world builds on top of, then when that technology diffuses out of China, which it will, because it’s open source, and when it comes out of China, it goes into American industries, it goes into Southeast Asia, it goes into Europe, the American tech stack will be prepared to receive them. I’ve been really consistent that this is probably the single most geopolitical strategic issue for the American tech industry. Yeah, when we talked last time , the Trump administration had banned the H20. Were you surprised you were able to get the Trump administration to see your point of view? And then were you even more surprised that now you’re stymied by the Chinese government ? JH: I’m not surprised by us being stymied by them and the reason for that is because, of course, China would like to have their tech stack develop. In the time that we’ve left that market, you know how fast the Chinese industry moves, and Huawei achieved a record year for their company’s history. This is a very long-running company, and they had a record year. They had, what, five, six IPOs of chip companies that are addressing the AI industry. I think we need to be more strategic in how we think about American leadership and American geopolitical and technology leadership. AI is not just a model, and that’s a deep misunderstanding — AI, as I said and as you mentioned in the beginning, AI is a five-layer cake and we have to win the infrastructure layer, we have to win the chips layer, we have to win the platform layer, we have to win the model layer and we have to win the application layer. 
Some of the things that we do are jeopardizing our ability as a country to lead in each one of those five layers. I think it’s a terrible mistake to think that the way to win is to bundle all of it top-to-bottom and tie every company together into one holistic stack so that we can only win or win at the limits of what any one of the layers can win. We’ve got to let all the layers go out and try to win the market. Have those other layers maybe benefited from their longer experience in Washington and you sort of showed up a little late to the scene? JH: Yeah, maybe. What have you learned? What’s been the biggest thing you’ve learned about Washington? JH: Well, the thing that I was surprised by is how deep the doomers were integrated into Washington D.C. and how the messages of doomers affected the psychology of the policy makers. Everyone was scared instead of optimistic. JH: That’s right, and I think it has two fundamental problems. In this Industrial Revolution, if we don’t allow the technology to diffuse across the United States and we don’t take advantage of it ourselves, what will happen to us is what happened to Europe in the last Industrial Revolution — we left them behind. And they, in a lot of ways, they invented all the technologies of the last Industrial Revolution and we just took advantage of it. I hope that we have the historic wisdom, that we have the technological understanding and not get trapped in science fiction, doomerism, these incredible stories that are being invented to scare the living daylights out of policy makers who don’t understand technology very well and they give them these science fiction embodiments that are just not helpful. One of the situations that is most concerning to me is when you poll the United States, the population, the popularity of AI is decreasing, that’s a real problem. 
It’s no different than the popularity of electricity, the popularity of electric motors, the popularity of gasoline engines, in the last Industrial Revolution became less popular. The popularity of the Internet, could you just imagine? Other countries took advantage of it much more quickly than we did and then technology diffused into its industries and society much more quickly and so we just have to be much, much more concerned that we don’t give this technology some kind of a mystical science fiction embodiment that’s just not helpful and scaring people. And so I don’t like it when doomers are out scaring people, I think there’s a difference between genuinely being concerned and warning people versus creating rhetoric that scares people. I think a characteristic you see all the time is people put on their big thinking hats and try to tease out all these nuances and forget the fact that actual popular communication is done in broad strokes. You don’t get to say, “Oh, you’re a little scared of this, but not this XYZ”; you’re just communicating fear as opposed to communicating optimism. JH: Yeah, and somehow it makes them sound smarter. People love to sound smart. JH: Sometimes it’s maybe, and we now know, it helps them with their fundraising and sometimes it helps them secure regulatory capture. So there’s a lot of different reasons why they do it, and these are incredibly smart people but I would just warn them that most of these things will likely backlash and will likely come back and they’ll be probably disappointed that they did it someday. I’m gonna tie a few questions together because I know we’re a little short on time. In the self-driving car space, you’re working with multiple automakers, you have your Alpamayo model, while still supplying chips to Tesla. 
You had a big bit about OpenClaw today in your presentation — meanwhile, a huge thing driving the Vera chips, for example, we talk about agents, is what’s happening with say, Claude Code and happening with Codex from OpenAI. Am I right to tease out a consistent element here, and your investment in your open source models goes with that, where you’re happy to supply the leading provider, or the inventor in a space with chips, but then you’re going to fast follow what they do for everyone else that is threatened by them? So you simultaneously broaden your customer base, you’re not just dependent on the leaders, but then also the leaders are helping you sell to everyone else because they’re worried about being left behind. JH: No, nothing like that. We’re at the frontier on so many different domains. In a lot of ways, we are the leader in many of these domains, but we never turn them into products. We’re a technology stack and so we have to be at the frontier, we have to be the world leader of the technology stack, but we’re not a solutions manufacturer, we’re not a service provider. And so that’s number one. Will that always be the case? JH: Yeah, always be the case. There’s no reason to, and we’re delighted not to. And so we create all this technology, we make it available to everybody. Well, it’s funny though, if you go back to like your boards, for example, like the products you ship, more and more of that, there’s what, 30,000 specific SKUs in a rack today or something like that. More and more of those are defined by you, “This is what it’s going to be”, in part to make it easier to assemble, all those sorts of pieces. Is there a bit where that’s gonna happen on the software side too, as you talk about those vertical bits and your open source model? JH: We create a thing vertically and then we open it horizontally and so everybody could use whatever piece they would like. As long as they’re running on Nvidia chips? 
JH: Whatever piece they would like, they don’t have to use all Nvidia chips, they don’t have to use all Nvidia software. We have to build it vertically, we have to integrate it vertically and optimize it vertically. But afterwards, we give them source, we give them — they just figure out how they want to do it. Do you think Nvidia can actually produce and keep up in terms of having a frontier model that can win that space or be a necessary provider of that space given that folks like Meta seem to have fallen off or the alternative is, seems to be by and large Chinese models. JH: Winning that space is not important to us. Right, well important not in terms of winning, but important in terms of there needs to be an open source frontier model, so if not you, then who? JH: That’s right, that’s right, somebody has to create open source models and Nvidia has a real capability in doing so. Whenever we create these open source models, we also learn a lot about the computation. Was that a bit of a problem with Blackwell? I’ve heard mutters that the training runs were maybe a little more difficult than they were sort of previously. JH: The challenge with Blackwell was 100% NVLink 72, NVLink 72 was backbreaking work. And it was the only time that I thanked the audience for working with us. Yeah, I noticed when you said that today, it came across as very sincere. JH: Yeah, because we tortured everybody, but everybody loves it now. This is the second time we’ve had a chance to talk in person, and my takeaway when I met you previously in Taipei was the extent that Nvidia still feels like a small company. Are you worried about getting stretched too thin, or do you still think you have sort of that CUDA-esque flywheel where, “It looks like we’re doing a lot, we’re just kind of doing the same thing over and over again?”. 
JH: The reason why Nvidia can move so fast is because we always have a unifying theory for the company, and that’s my job, I need to come up with a unifying theory for what’s important and why things connect together and how they connect together and then create an organization, an organism that’s really, really good at delivering on that unifying theory. And so the unifying theory for Nvidia is actually fairly simple. On the one hand, we have the computing platform, the software platform that’s related to CUDA-X. On the other hand, we’re a computing systems company, we optimize things vertically, we apply extreme co-design across the stack and all the different components of a computer and now that computer is a platform of ours and we integrate that platform into all the clouds and to all the OEMs and then we have another platform that’s now the data center platform, or the AI factory platform. So once you have a unifying theory about what Nvidia builds and how it goes about doing it, and I used the keynote to kind of tell that story even partly to our own employees. That’s what it felt like. That whole first hour of the keynote felt like you talking to your employees, reminding them of what you do. JH: It’s important that we’re always constantly reminded of what’s important to us and AI is important to us, but of course CUDA-X and all of the solvers and all of the applications that we can accelerate is really important to us. Thank you very much. JH: Thank you. It’s great to see you, Ben. Keep up the good work.

0 views
Lalit Maganti 1 months ago

syntaqlite: high-fidelity devtools that SQLite deserves

Most SQL tools treat SQLite as a “flavor” of a generic SQL parser. They approximate the language, which means they break on SQLite-exclusive features like virtual tables , miss syntax like UPSERT , and ignore the 22 compile-time flags that change the syntax SQLite accepts. So I built syntaqlite : an open-source parser, formatter, validator, and LSP built directly on SQLite’s own Lemon-generated grammar. It sees SQL exactly how SQLite sees it, no matter which version of SQLite you’re using or which feature flags you compiled with. It ships as a CLI , VS Code extension , Claude Code LSP plugin , and C / Rust libraries. There’s also a web playground which you can try now: paste any SQLite SQL and see parsing, formatting, and validation live in the browser, no install needed. Full documentation is available here . Here’s syntaqlite in action: Formatting with the CLI Validation with the CLI

1 views
Krebs on Security 2 months ago

Microsoft Patch Tuesday, March 2026 Edition

Microsoft Corp. today pushed security updates to fix at least 77 vulnerabilities in its Windows operating systems and other software. There are no pressing “zero-day” flaws this month (compared to February’s five zero-day treat), but as usual some patches may deserve more rapid attention from organizations using Windows. Here are a few highlights from this month’s Patch Tuesday. Image: Shutterstock, @nwz. Two of the bugs Microsoft patched today were publicly disclosed previously. CVE-2026-21262 is a weakness that allows an attacker to elevate their privileges on SQL Server 2016 and later editions. “This isn’t just any elevation of privilege vulnerability, either; the advisory notes that an authorized attacker can elevate privileges to sysadmin over a network,” Rapid7’s Adam Barnett said. “The CVSS v3 base score of 8.8 is just below the threshold for critical severity, since low-level privileges are required. It would be a courageous defender who shrugged and deferred the patches for this one.” The other publicly disclosed flaw is CVE-2026-26127 , a vulnerability in applications running on .NET . Barnett said the immediate impact of exploitation is likely limited to denial of service by triggering a crash, with the potential for other types of attacks during a service reboot. It would hardly be a proper Patch Tuesday without at least one critical Microsoft Office exploit, and this month doesn’t disappoint. CVE-2026-26113 and CVE-2026-26110 are both remote code execution flaws that can be triggered just by viewing a booby-trapped message in the Preview Pane. Satnam Narang at Tenable notes that just over half (55%) of all Patch Tuesday CVEs this month are privilege escalation bugs, and of those, a half dozen were rated “exploitation more likely” — across Windows Graphics Component, Windows Accessibility Infrastructure, Windows Kernel, Windows SMB Server and Winlogon. 
These include: – CVE-2026-24291 : Incorrect permission assignments within the Windows Accessibility Infrastructure to reach SYSTEM (CVSS 7.8) – CVE-2026-24294 : Improper authentication in the core SMB component (CVSS 7.8) – CVE-2026-24289 : High-severity memory corruption and race condition flaw (CVSS 7.8) – CVE-2026-25187 : Winlogon process weakness discovered by Google Project Zero (CVSS 7.8). Ben McCarthy , lead cyber security engineer at Immersive , called attention to CVE-2026-21536 , a critical remote code execution bug in a component called the Microsoft Devices Pricing Program. Microsoft has already resolved the issue on their end, and fixing it requires no action on the part of Windows users. But McCarthy says it’s notable as one of the first vulnerabilities identified by an AI agent and officially recognized with a CVE attributed to the Windows operating system. It was discovered by XBOW , a fully autonomous AI penetration testing agent. XBOW has consistently ranked at or near the top of the Hacker One bug bounty leaderboard for the past year. McCarthy said CVE-2026-21536 demonstrates how AI agents can identify critical 9.8-rated vulnerabilities without access to source code. “Although Microsoft has already patched and mitigated the vulnerability, it highlights a shift toward AI-driven discovery of complex vulnerabilities at increasing speed,” McCarthy said. “This development suggests AI-assisted vulnerability research will play a growing role in the security landscape.” Microsoft earlier provided patches to address nine browser vulnerabilities, which are not included in the Patch Tuesday count above. In addition, Microsoft issued a crucial out-of-band (emergency) update on March 2 for Windows Server 2022 to address a certificate renewal issue with passwordless authentication technology Windows Hello for Business. 
Separately, Adobe shipped updates to fix 80 vulnerabilities, some of them critical in severity, in a variety of products, including Acrobat and Adobe Commerce. Mozilla Firefox v. 148.0.2 resolves three high severity CVEs. For a complete breakdown of all the patches Microsoft released today, check out the SANS Internet Storm Center’s Patch Tuesday post. For Windows enterprise admins who wish to stay abreast of any news about problematic updates, AskWoody.com is always worth a visit. Please feel free to drop a comment below if you experience any issues applying this month’s patches.

0 views
Filippo Valsorda 2 months ago

Turn Dependabot Off

Dependabot is a noise machine. It makes you feel like you’re doing work, but you’re actually discouraging more useful work. This is especially true for security alerts in the Go ecosystem. I recommend turning it off and replacing it with a pair of scheduled GitHub Actions, one running govulncheck, and the other running your test suite against the latest version of your dependencies. On Tuesday, I published a security fix for filippo.io/edwards25519 . The method would produce invalid results if the receiver was not the identity point. A lot of the Go ecosystem depends on filippo.io/edwards25519, mostly through github.com/go-sql-driver/mysql (228k dependents only on GitHub). Essentially no one uses . Yesterday, Dependabot opened thousands of PRs against unaffected repositories to update filippo.io/edwards25519. These PRs were accompanied by a security alert with a nonsensical, made up CVSS v4 score and by a worrying 73% compatibility score , allegedly based on the breakage the update is causing in the ecosystem. Note that the diff between v1.1.0 and v1.1.1 is one line in the method no one uses . We even got one of these alerts for the Wycheproof repository, which does not import the affected filippo.io/edwards25519 package at all . Instead, it only imports the unaffected filippo.io/edwards25519/field package. We have turned Dependabot off. But isn’t this toil unavoidable, to prevent attackers from exploiting old vulnerabilities in your dependencies? Absolutely not! Computers are perfectly capable of doing the work of filtering out these irrelevant alerts for you. The Go Vulnerability Database has rich version, package, and symbol metadata for all Go vulnerabilities. Here’s the entry for the filippo.io/edwards25519 vulnerability , also available in standard OSV format . Any decent vulnerability scanner will at the very least filter based on the package, which requires a simple . 
This already silences a lot of noise, because it’s common and good practice for modules to separate functionality relevant to different dependents into different sub-packages. 1 For example, it would have avoided the false alert against the Wycheproof repository. If you use a third-party vulnerability scanner, you should demand at least package-level filtering. Good vulnerability scanners will go further, though, and filter based on the reachability of the vulnerable symbol using static analysis. That’s what govulncheck does! govulncheck noticed that my project indirectly depends on filippo.io/edwards25519 through github.com/go-sql-driver/mysql, which does not make the vulnerable symbol reachable, so it chose not to notify me. If you want, you can tell it to show the package- and module-level matches. It’s easy to integrate govulncheck into your processes or scanners, either using the CLI or the golang.org/x/vuln/scan Go API. You can replace Dependabot security alerts with this GitHub Action. It will run every day and only notify you if there is an actual vulnerability you should pay attention to. False positive alerts are not only a waste of time, they also reduce security by causing alert fatigue and making proper triage impractical. A security vulnerability should be assessed for its impact: production might need to be updated, secrets rotated, users notified! A business-as-usual dependency bump is a woefully insufficient remediation for an actual vulnerability, but it’s the only practical response to the constant stream of low-value Dependabot alerts. This is why as Go Security Team lead back in 2020–2021 I insisted the team invest in staffing the Go Vulnerability Database and implement a vulnerability scanner with static analysis filtering. The govulncheck Action will not automatically open a PR for you, and that’s a good thing! 
Now that security alerts are not mostly noise, you can afford to actually look at them and take them seriously, including any required remediation. Noisy vulnerability scanners also impact the open source ecosystem. I often get issues and PRs demanding I update the dependencies of my projects due to vulnerabilities that don’t affect them, because someone’s scanner is failing to filter them. That’s extra toil dropped at the feet of open source maintainers, which is unsustainable. The maintainer’s responsibility is making sure projects are not affected by security vulnerabilities. The responsibility of scanning tools is making sure they don’t disturb their users with false positives. The other purpose of Dependabot is to keep dependencies up to date, regardless of security vulnerabilities. Your practices and requirements will vary, but I find this misguided, too. Dependencies should be updated according to your development cycle, not the cycle of each of your dependencies. For example you might want to update dependencies all at once when you begin a release development cycle, as opposed to when each dependency completes theirs. There are two benefits to quick updates, though: first, you can notice and report (or fix) breakage more rapidly, instead of being stalled by an incompatibility that could have been addressed a year prior; second, you reduce your patch delta in case you need to update due to a security vulnerability, reducing the risk of having to rush through a refactor or unrelated fixes. You can capture both of those benefits without actually updating the dependencies by simply running CI against the latest versions of your dependencies every day. You just need to run before your test suite. In the npm ecosystem, you just run instead of . This way, you will still be alerted quickly of any potential issues, without having to pay attention to unproblematic updates, which you can defer to whenever fits your project best. 
This is a lot safer, too, because malicious code recently added to a dependency will not rapidly reach users or production, but only CI. Supply chain attacks have a short half-life! You can further mitigate the risk by using a CI sandboxing mechanism like geomys/sandboxed-step , which uses gVisor to remove the ambient authority that GitHub Actions grants every workflow, including supposedly read-only ones . For more spicy open source opinions, follow me on Bluesky at @filippo.abyssdomain.expert or on Mastodon at @[email protected] . The Tevere has overflowed its lower banks, so a lot of previously familiar landscapes have changed slightly, almost eerily. This is the first picture I took after being able to somewhat safely descend onto (part of) the river’s banks. My work is made possible by Geomys , an organization of professional Go maintainers, which is funded by Ava Labs , Teleport , Tailscale , and Sentry . Through our retainer contracts they ensure the sustainability and reliability of our open source maintenance work and get a direct line to my expertise and that of the other Geomys maintainers. (Learn more in the Geomys announcement .) Here are a few words from some of them! Teleport — For the past five years, attacks and compromises have been shifting from traditional malware and security breaches to identifying and compromising valid user accounts and credentials with social engineering, credential theft, or phishing. Teleport Identity is designed to eliminate weak access patterns through access monitoring, minimize attack surface with access requests, and purge unused permissions via mandatory access reviews. Ava Labs — We at Ava Labs , maintainer of AvalancheGo (the most widely used client for interacting with the Avalanche Network ), believe the sustainable maintenance and development of open source cryptographic protocols is critical to the broad adoption of blockchain technology. 
We are proud to support this necessary and impactful work through our ongoing sponsorship of Filippo and his team. This also makes it possible to prune the tree of dependencies only imported by packages that are not relevant to a specific dependent, which has a large security benefit.  ↩
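The package-level filtering described in the post can be illustrated with a minimal Python sketch. It is not how govulncheck or any particular scanner is implemented: the `advisory` dictionary is a simplified, hypothetical shape (not the OSV schema), and the import lists are illustrative apart from the `filippo.io/edwards25519/field` case mentioned above.

```python
def affected_packages(vulnerable_packages, imported_packages):
    """Return the vulnerable packages that the project actually imports.

    Matching is on the exact package path, so importing a sub-package of a
    module does not count as importing the module's vulnerable root package.
    """
    return sorted(set(vulnerable_packages) & set(imported_packages))


# Hypothetical, simplified advisory record: only the root package is affected.
advisory = {
    "module": "filippo.io/edwards25519",
    "packages": ["filippo.io/edwards25519"],
}

# A project that only imports the unaffected sub-package (the Wycheproof case).
field_only_imports = ["filippo.io/edwards25519/field", "crypto/sha256"]

# A project that imports the affected package directly (illustrative).
direct_imports = ["filippo.io/edwards25519", "fmt"]

if __name__ == "__main__":
    for name, imports in [("field-only", field_only_imports), ("direct", direct_imports)]:
        hits = affected_packages(advisory["packages"], imports)
        print(name, "-> alert" if hits else "-> no alert", hits)
```

A module-level scanner would alert on both projects because both depend on the same module; the exact-path comparison is what silences the first one. Symbol-level reachability, which the post credits to govulncheck, needs static analysis and is out of scope for this sketch.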

0 views
Evan Schwartz 2 months ago

PSA: Your SQLite Connection Pool Might Be Ruining Your Write Performance

Update (Feb 18, 2026): After a productive discussion on Reddit and additional benchmarking, I found that the solutions I originally proposed (batched writes or using a synchronous connection) don't actually help. The real issue is simpler and more fundamental than I described: SQLite is single-writer, so any amount of contention at the SQLite level will severely hurt write performance. The fix is to use a single writer connection with writes queued at the application level, and a separate connection pool for concurrent reads. The original blog post text is preserved below, with retractions and updates marked accordingly. My apologies to the SQLx maintainers for suggesting that this behavior was unique to SQLx. Write transactions can lead to lock starvation and serious performance degradation when using SQLite with SQLx, the popular async Rust SQL library. In retrospect, I feel like this should have been obvious, but it took a little more staring at suspiciously consistent "slow statement" logs than I'd like to admit, so I'm writing it up in case it helps others avoid this footgun. SQLite is single-writer. In WAL mode, it can support concurrent reads and writes (or, technically "write" singular), but no matter the mode there is only ever one writer at a time. Before writing, a process needs to obtain an EXCLUSIVE lock on the database. If you start a read transaction with a SELECT and then perform a write in the same transaction, the transaction will need to be upgraded to a write transaction with an exclusive lock: A read transaction is used for reading only. A write transaction allows both reading and writing. A read transaction is started by a SELECT statement, and a write transaction is started by statements like CREATE, DELETE, DROP, INSERT, or UPDATE (collectively "write statements"). If a write statement occurs while a read transaction is active, then the read transaction is upgraded to a write transaction if possible. 
( source ) Transactions started with or also take the exclusive write lock as soon as they are started. Transactions in SQLx look like this: This type of transaction where you read and then write is completely fine. The transaction starts as a read transaction and then is upgraded to a write transaction for the . Update: This section incorrectly attributes the performance degradation to the interaction between async Rust and SQLite. The problem is actually that any contention for the EXCLUSIVE lock at the SQLite level, whether from single statements or batches, will hurt write performance. The problem arises when you call within a write transaction. For example, this could happen if you call multiple write statements within a transaction: This code will cause serious performance degradation if you have multiple concurrent tasks that might be trying this operation, or any other write, at the same time. When the program reaches the first statement, the transaction is upgraded to a write transaction with an exclusive lock. However, when you call , the task yields control back to the async runtime. The runtime may schedule another task before returning to this one. The problem is that this task is now holding an exclusive lock on the database. All other writers must wait for this one to finish. If the newly scheduled task tries to write, it will simply wait until it hits the and returns a busy timeout error. The original task might be able to make progress if no other concurrent writers are scheduled before it, but under higher load you might continuously have new tasks that block the original writer from progressing. Starting a transaction with will also cause this problem, because you will immediately take the exclusive lock and then yield control with . In practice, you can spot this issue in your production logs if you see a lot of SQLx warnings that say where the time is very close to your (which is 5 seconds by default). 
This is the result of other tasks being scheduled by the runtime and then trying and failing to obtain the exclusive lock they need to write to the database while being blocked by a parked task. SQLite's concurrency model (in WAL mode) is many concurrent readers with exactly one writer. Mirroring this architecture at the application level provides the best performance. Instead of a single connection pool, where connections may be upgraded to write at any time, use two separate pools: With this setup, write transactions serialize within the application. Tasks will queue waiting for the single writer connection, rather than all trying to obtain SQLite's EXCLUSIVE lock. In my benchmarks, this approach was ~20x faster than using a single pool with multiple connections: An alternative to separate pools is wrapping writes in a Mutex, which achieves similar performance (95ms in the benchmarks). However, separate pools make the intent clearer and, if the reader pool is configured as read-only, prevent accidentally issuing a write on a reader connection. Having separate pools works when reads and writes are independent, but sometimes you need to atomically read and then write based on it: Sending this transaction to the single write connection is fine if the read is extremely fast, such as a single lookup by primary key. However, if your application requires expensive reads that must precede writes in a single atomic transaction, the shared connection pool with moderate concurrency might outperform a single writer. Retraction: Benchmarking showed that batched writes perform no better than the naive loop under concurrency, because 50 connections still contend for the write lock regardless of whether each connection issues 100 small s or one large . QueryBuilder is still useful for reducing per-statement overhead, but it does not fix the contention problem. 
We could safely replace the example code above with this snippet that uses a bulk insert to avoid the lock starvation problem: Note that if you do this with different numbers of values, you should call . By default, SQLx caches prepared statements. However, each version of the query with a different number of arguments will be cached separately, which may thrash the cache. Retraction: Benchmarking showed that this did not actually improve performance. Unfortunately, the fix for atomic writes to multiple tables is uglier and potentially very dangerous. To avoid holding an exclusive lock across an , you need to use the interface to execute a transaction in one shot: However, this can lead to catastrophic SQL injection attacks if you use this for user input, because does not support binding and sanitizing query parameters. Note that you can technically run a transaction with multiple statements in a call but the docs say: The query string may only contain a single DML statement: SELECT, INSERT, UPDATE, DELETE and variants. The SQLite driver does not currently follow this restriction, but that behavior is deprecated. If you find yourself needing atomic writes to multiple tables with SQLite and Rust, you might be better off rethinking your schema to combine those tables or switching to a synchronous library like with a single writer started with . Update: the most useful change would actually be making a distinction between a and a . Libraries like SQLx could enforce the distinction at compile time or runtime by inspecting the queries for the presence of write statements, or the could be configured as read-only. Maybe, but it probably won't. If SQLx offered both a sync and async API (definitely out of scope) and differentiated between read and write statements, a write could be like , which would prevent it from being held across an point. 
However, SQLx is not an ORM and it probably isn't worth it for the library to have different methods for read versus write statements. Without that, there isn't a way to prevent write transaction locks from being held across s while allowing safe read transactions to be used across s. So, in lieu of type safety to prevent this footgun, I wrote up this blog post and this pull request to include a warning about this in the docs. Discuss on r/rust and Hacker News .
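The single-writer arrangement described earlier in this post can also be tried outside of Rust. What follows is a minimal sketch in Python, using the standard-library `sqlite3` module rather than SQLx, so it illustrates the idea and not the code that was benchmarked. The `events` table, the helper names, and the thread count are all made up for illustration. One connection handles every write behind a lock, and the reader connection is opened read-only so that a stray write fails outright instead of competing for the write lock.

```python
import os
import pathlib
import sqlite3
import tempfile
import threading


def open_writer(path):
    """Open the single read-write connection and switch the database to WAL."""
    conn = sqlite3.connect(path, check_same_thread=False)
    conn.execute("PRAGMA journal_mode=WAL")
    # Hypothetical table, used only for this illustration.
    conn.execute("CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, body TEXT)")
    conn.commit()
    return conn


def open_reader(path):
    """Open a read-only connection; SQLite itself rejects writes on it."""
    uri = pathlib.Path(path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True, check_same_thread=False)


class SingleWriter:
    """Serializes write transactions in the application instead of in SQLite."""

    def __init__(self, conn):
        self._conn = conn
        self._lock = threading.Lock()

    def insert(self, body):
        with self._lock:      # callers queue here, one writer at a time
            with self._conn:  # commits on success, rolls back on error
                self._conn.execute("INSERT INTO events (body) VALUES (?)", (body,))


def demo():
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    writer = SingleWriter(open_writer(path))

    # Twenty threads all write through the one writer connection.
    threads = [
        threading.Thread(target=writer.insert, args=(f"event {i}",))
        for i in range(20)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    reader = open_reader(path)
    count = reader.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    try:
        reader.execute("INSERT INTO events (body) VALUES ('oops')")
        write_rejected = False
    except sqlite3.OperationalError:
        write_rejected = True
    reader.close()
    return count, write_rejected
```

Calling `demo()` returns the number of rows the reader can see and whether the attempted write on the read-only connection was refused. A real application would keep several reader connections in a pool; a single reader is enough to show the split.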

0 views
Stone Tools 2 months ago

dBASE on the Kaypro II

The world that might have been has been discussed at length. In one possible world, Gary Kildall's CP/M operating system was chosen over MS-DOS to drive IBM's then-new "Personal Computer." As such, Bill Gates's hegemony over the trajectory of computing history never happened. Kildall wasn't constantly debunking the myth of an airplane joyride which denied him Microsoft-levels of industry dominance. Summarily, he'd likely be alive and innovating the industry to this day. Kildall's story is pitched as a "butterfly flaps its wings" inflection point that changed computing history. The truth is, of course, there were many points along our timeline which led to Kildall's fade and untimely death. Rather, I'd like to champion what Kildall did. Kildall did co-host Computer Chronicles with Stewart Cheifet for seven years. Kildall did create the first CD-ROM encyclopedia. Kildall did develop (and coin the term for) what we know today as the BIOS. Kildall did create CP/M, the first widespread, mass-market, portable operating system for microcomputers, possible because of said BIOS. CP/M did dominate the business landscape until the DOS era, with 20,000+ software titles in its library. Kildall did sell his company, Digital Research Inc., to Novell for US $120M. Kildall did good. Systems built to run Kildall's CP/M were prevalent, all built around the same 8-bit limits: an 8080 or Z80 processor and up to 64KB RAM. The Osborne 1, a 25lb (11kg) "portable" which sold for $1795 ($6300 in 2026), was the talk of the West Coast Computer Faire in 1981. The price was sweet, considering it came bundled with MSRP $1500 in software, including WordStar and SuperCalc. Andy Kay's company, Non-Linear Systems, debuted the Kaypro II (the "I" only existed in prototype form) the following year at $1595, $200 less (and four pounds heavier) than the Osborne. 
Though slower than an Osborne, it arguably made it easier to do actual work, with a significantly larger screen and beefier floppy disk capacity. Within the major operating system of its day, on popular hardware of its day, ran the utterly dominant relational database software of its day. PC Magazine, February 1984, said, "Independent industry watchers estimate that dBASE II enjoys 70 percent of the market for microcomputer database managers." Similar to past subjects HyperCard and Scala Multimedia, Wayne Ratliff's dBASE II was an industry unto itself, not just for data-management, but for programmability, a legacy which lives on today as xBase. Strangely enough, dBASE also decided to attach "II" to its first release; a marketing maneuver to make the product appear more advanced and stable at launch. I'm sure the popularity of the Apple II had nothing to do with anyone's coincidentally similar roman numeral naming scheme whatsoever. Written in assembly, dBASE II squeezed maximum performance out of minimal hardware specs. This is my first time using both CP/M and dBASE. Let's see what made this such a power couple. I'm putting on my tan suit and wide brown tie for this one. As the owner of COMPUTRON/X, a software retail shop, I'm in Serious Businessman Mode™. I need to get inventory under control, snake the employee toilet, do profit projections, and polish a mind-boggling amount of glass and chrome. For now, I'll start with inventory and pop in this laserdisc to begin my dBASE journey. While the video is technically for 16-bit dBASE III, our host, Gentry Lee of Jet Propulsion Laboratory, assures us that 8-bit dBASE II users can do everything we see demonstrated, with a few interface differences. This is Gail Fisher, a smarty pants who thinks she's better than me. Tony Lima, in his book dBASE II for Beginners, concurs with the assessment of dBASE II and III's differences being mostly superficial. 
Lima's book is pretty good, but I'm also going through Mastering dBASE II The Easy Way , by Paul W. Heiser, the official Kaypro dBASE II Manual, and dBase II for the First Time User by Alan Freedman. That last one is nicely organized by common tasks a dBASE user would want to do, like "Changing Your Data" and "Modifying Your Record Structure." I find I return to Freedman's book often. As I understand my time with CP/M, making custom bootable diskettes was the common practice. dBASE II is no different, and outright encourages this, lest we risk losing US$2000 (in 2026 dollars) in software. Being of its time and place in computing history, dBASE uses the expected UI. You know it, you love it, it's "a blinking cursor," here called "the dot prompt." While in-program is available, going through the video, books, and manual is a must. dBASE pitches the dot prompt as a simple, English language interface to the program. for example sets the default save drive to the B: drive. You could never intuit that by what it says, nor guess that it even needs to be done, but when you know how it works, it's simple to remember. It's English only in the sense that English-like words are strung together in English-like order. That said, I kind of like it? creates a new database, prompting first for a database name, then dropping me into a text entry prompt to start defining fields. This is a nice opportunity for me to feign anger at The Fishers, the family from the training video. Fancy-pants dBASE III has a more user-friendly entry mode, which requires no memorization of field input parameters. Prompts and on-screen help walk Gail through the process. In dBASE II , a field is defined by a raw, comma-delimited string. Field definitions must be entered in the order indicated on-screen. is the data type for the field, as string, number, or boolean. This is set by a one-letter code which will never be revealed at any time, even when it complains that I've used an invalid code. 
Remind me to dog-ear that page of the manual. For my store, I'm scouring for games released for CP/M. Poking through Moby Games digs up roughly 30 or so commercial releases, including two within the past five years . Thanks, PunyInform ! My fields are defined thusly, called up for review by the simple command. The most frustrating part about examining database software is that it doesn't do anything useful until I've entered a bunch of data. At this stage in my learning, this is strictly a manual process. Speaking frankly, this part blows, but it also blows for Gail Fisher, so my schadenfreude itch is scratched. dBASE does its best to minimize the amount of keyboard shenanigans during this process, and in truth data entry isn't stressful. I can pop through records fairly quickly, if the raw data is before me. The prompt starts at the first field and (not !) moves to the next. If entry to a field uses the entire field length (as defined by me when setting up the fields earlier), the cursor automatically jumps to the next field with a PC-speaker beep. I guess dBASE is trying to "help," but when touch typing I'm looking at my data source, not the screen. I don't know when I'm about to hit the end of a field, so I'm never prepared when it switches input fields and makes that ugly beep. More jarring is that if the final field of a record is completely filled, the cursor "helpfully" jumps to the beginning of a new record instantly, with no opportunity to read or correct the data I just input. It's never not annoying. Gail doesn't have these issues with dBASE III and her daughter just made dinner for her. Well, I can microwave a burrito as well as anyone so I'm not jealous . I'm not. In defining the fields, I have already made two mistakes. First, I wanted to enter the critic score as a decimal value so I could get the average. 
Number fields, like all fields, have a "width" (the maximum number of characters/bytes to allocate to the field), but also a "decimal places" value and as I type these very words I see now my mistake. Rubber ducking works. I tricked myself into thinking "width" was for the integer part, and "decimal places" was appended to that. I see now that, like character fields, I need to think of the entire maximum possible number as being the "width." Suppose in a value we expect to record . There are 2 decimal places, and a decimal point, and a leading 0, and potentially a sign, as or . So that means the "width" should be 5, with 2 "decimal places" (of those 5). Though I'm cosplaying as a store owner, I'm apparently cosplaying as a store owner that sucks! I didn't once consider pricing! Gah, Gail is so much better at business than I am! Time to get "sorta good." Toward that end, I have my to-do list after a first pass through data entry. Modifying dBASE "structures" (the field/type definitions) can be risky business. If there is no data yet, feel free to change whatever you want. If there is pre-existing data, watch out. will at least do the common decency of warning you about the pile you're about to step into. Modifying a database structure is essentially verboten, rather we must juggle files to effect a structure change. dBASE lets us have two active files, called "work areas," open simultaneously: a and a . Modifications to these are read from or written to disk in the moment; 64K can't handle much live data. It's not quite "virtual memory" but it makes the best of a tight situation. When wanting to change data in existing records, the command sounds like a good choice, but actually winds up being more useful. will focus in on specified fields for immediate editing across all records. It's simple to through fields making changes. 
I could to edit everything at once, but I'm finding it safer while learning to make small incremental changes or risk losing a large body of work. Make a targeted change, save, make another change, save, etc. I laughed every time Gentry Lee showed up, like he's living with The Fishers as an invisible house gremlin. They never acknowledge his presence, but later he eats their salad! Being a novice at dBASE is a little dangerous, and MAME has its own pitfalls. I have been conditioned over time to when I want to "back out" of a process. This shuts down MAME instantly. When it happens, I swear The Fishers are mocking me, just on the edge of my peripheral vision, while Gentry Lee helps himself to my tuna casserole. dBASE is a relational database. Well, let's be less generous and call it "relational-ish." The relational model of data was defined by Edgar F. Codd in 1969 where "relation is used here in its accepted mathematical sense." It's all set theory stuff; way over my head. Skimming past the nerd junk, in that paper he defines our go-to relationship of interest: the join. As a relational database, dBASE keeps its data arranged VisiCalc style, in rows and columns. So long as two databases have a field in common, which is defined, named, and used identically in both , the two can be "joined" into a third, new database. I've created a mini database of developer phone numbers so I can call and yell at them for bugs and subsequent lost sales. I haven't yet built up the grin-and-bear-it temperament Gail possesses toward Amanda Covington. Heads will roll! You hear me, Lebling? Blank?! 64K (less CP/M and dBASE resources) isn't enough to do an in-memory join. Rather, joining creates and writes a completely new database to disk which is the union of two databases. 
The implication being you must have space on disk to hold both original databases as well as the newly joined database, and also the new database cannot exceed dBASE 's 65,535 record limit after joining. In the above , means and means , so we can precisely specify fields and their work area of origin. This is more useful for doing calculations at time, like to join only records where deletes specific records, if we know the record number, like . Commands in dBASE stack, so a query can define the target for a command, as one would hope and expect in 2026. Comparisons and sub-strings can be used as well. So, rather than deleting "Infocom, Inc." we could: The command looks for the left-hand string as a case-sensitive sub-string in the right-hand string. We can be a little flexible in how data may have been input, getting around case sensitivity through booleans. Yes, we have booleans! Wait, why am I deleting any Infocom games? I love those! What was I thinking?! Once everything is marked for deletion, that's all it is: marked for deletion. It's still in the database, and on disk, until we do real-deal, non-reversible, don't-forget-undo-doesn't-exist-in-1982, destruction with . Until now, I've been using the command as a kind of ad-hoc search mechanism. It goes through every record, in sequence, finding record matches. Records have positions in the database file, and dBASE is silently keeping track of a "record pointer" at all times. This represents "the current record" and commands without a query will be applied to the currently pointed record. Typing in a number at the dot prompt moves the pointer to that record. That moves me to record #3 and display its contents. When I don't know which record has what I want, will move the pointer to the first match it finds. At this point I could that record, or to see a list of records from the located record onward. Depending on the order of the records, that may or may not be useful. 
Right now, the order is just "the order I typed them into the system." We need to teach dBASE different orders of interest to a stripmall retail store. While the modern reaction would be to use the command, dBASE's Sort can only create entirely new database files on disk, sorted by the desired criteria. Sort a couple of times on a large data set and soon you'll find yourself hoarding the last of new-old 5 1/4" floppy disk stock from OfficeMax, or being very careful about deleting intermediary sort results. SQL brainiacs have a solution to our problem, which dBASE can also do. An "index" is appropriate for fast lookups on our columnar data. We can index on one or more fields, remapping records to the sort order of our heart's desire. Only one index can be used at a time, but a single index can be defined against multiple fields. It's easier to show you. When I set the index to "devs" and , that sets the record pointer to the first record which matches my find. I happen to know I have seven Infocom games, so I can for fields of interest. Both indexes group Infocom games together as a logical block, but within that block Publisher order is different. Don't get confused, the actual order of files in the database is betrayed by the record number. Notice they are neither contiguous nor necessarily sequential. would rearrange them into strict numerical record order. An Index only relates to the current state of our data, so if any edits occur we need to rebuild those indexes. Please, contain your excitement. Munging data is great, but I want to understand my data. Let's suppose I need the average rating of the games I sell. I'll first need a count of all games whose rating is not zero (i.e. games that actually have a rating), then I'll need a summation of those ratings. Divide those and I'll have the average. does what it says. only works on numeric fields, and also does what it says. With those, I basically have what I need. 
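As a cross-check outside of dBASE, the count-then-sum-then-divide recipe above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the records and field names below are invented placeholders and not my actual inventory, and a rating of zero is treated as "no rating," as described above.

```python
def average_rating(records):
    """Average the ratings of records that actually have one (non-zero)."""
    rated = [record["rating"] for record in records if record["rating"] != 0]
    count = len(rated)   # how many games have a rating
    total = sum(rated)   # the summation of those ratings
    if count == 0:
        return None      # nothing rated, so there is no average to report
    return total / count


# Placeholder data, purely for illustration.
games = [
    {"title": "Game A", "rating": 80},
    {"title": "Game B", "rating": 0},   # unrated, so it is skipped
    {"title": "Game C", "rating": 70},
]

print(average_rating(games))
```

The zero check matters: counting every record instead of only the rated ones would drag the average down with games that simply never got a score.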
Like deletion, we can use queries as parameters for these commands. dBASE has basic math functions, and calculated values can be stored in its 64 "memory variables." Like a programming language, named variables can be referenced by name in further calculations. Many functions let us append a clause which shoves a query result into a memory variable, though array results cannot be memorized this way. shoves arbitrary values into memory, like or . As you can see in the screenshot above, the rating of CP/M games is (of 100). Higher than I expected, to be perfectly honest. As proprietor of a hot (power of positive thinking!) software retail store, I'd like to know how much profit I'll make if I sold everything I have in stock. I need to calculate, per-record, the following but this requires stepping through records and keeping a running tally. I sure hope the next section explains how to do that! Flipping through the 1,000 pages of Kaypro Software Directory 1984, we can see the system, and CP/M by extension, was not lacking for software. Interestingly, quite a lot was written in and for dBASE II, bespoke database solutions which sold for substantially more than dBASE itself. Shakespeare wrote, "The first thing we do, let's kill all the lawyers." Judging from these prices, the first thing we should do is shake them down for their lunch money. In the HyperCard article I noted how an entire sub-industry sprang up in its wake, empowering users who would never consider themselves programmers to pick up the development reins. dBASE paved the way for HyperCard in that regard. As Jean-Pierre Martel noted, "Because its programming language was so easy to learn, millions of people were dBASE programmers without knowing it... dBASE brought programming power to the masses." dBASE programs are written as procedural routines called Commands, or .CMD files. dBASE helpfully includes a built-in (stripped down) text editor for writing these, though any text editor will work. 
Once written, a .CMD file like can be invoked by . As Martel said, I seem to have become a dBASE programmer without really trying. Everything I've learned so far hasn't just been dot prompt commands, it has all been valid dBASE code. A command at the dot prompt is really just a one-line program. Cool beans! Some extra syntax for the purpose of development include: With these tools, menus which add a veneer of approachability to a dBASE database are trivial to create. Commands are interpreted, not compiled (that would come later), so how were these solutions sold to lawyers without bundling a full copy of dBASE with every Command file? For a while dBASE II was simply a requirement to use after-market dBASE solutions. The 1983 release of dBASE Runtime changed that, letting a user run a file, but not edit it. A Command file bundled with Runtime was essentially transformed into a standalone application. Knowing this, we're now ready to charge 2026 US$10,000 per seat for case management and tracking systems for attorneys. Hey, look at that, this section did help me with my profit calculation troubles. I can write a Command file and bask in the glow of COMPUTRON/X's shining, profitable future. During the 8 -> 16-bit era bridge, new hardware often went underutilized as developers came to grips with what the new tools could do. Famously, VisiCalc's first foray onto 16-bit systems didn't leverage any of the expanded RAM on the IBM-PC and intentionally kept all known bugs from the 8-bit Apple II version. The word "stopgap" comes to mind. Corporate America couldn't just wait around for good software to arrive. CP/M compatibility add-ons were a relatively inexpensive way to gain instant access to thousands of battle-tested business software titles. Even a lowly Coleco ADAM could, theoretically, run WordStar and Infocom games, the thought of which kept me warm at night as I suffered through an inferior Dragon's Lair adaptation. They promised a laserdisc attachment! 
For US$600 in 1982 ($2,000 in 2026) your new-fangled 16-bit IBM-PC could relive the good old days of 8-bit CP/M-80. Plug in XEDEX's "Baby Blue" ISA card with its Z80B CPU and 64K of RAM and the world is your slowly decaying oyster. That RAM is also accessible in 16-bit DOS, serving dual-purpose as a memory expansion for only $40 more than IBM's own bare bones 64K board. PC Magazine's February 1982 review seemed open to the idea of the card, but was skeptical it had long-term value. XEDEX suggested the card could someday be used as a secondary processor, offloading tasks from the primary CPU to the Z80, but never followed through on that threat, as far as I could find. Own an Apple II with an 8-bit 6502 CPU but still have 8-bit Z80 envy? Microsoft offered a Z80 daughter-card with 64K RAM for US$399 in 1981 ($1,413 in 2026). It doesn't provide the 80-column display you need to really make use of CP/M software, but is compatible with such add-ons. It was Bill Gates's relationship with Gary Kildall as a major buyer of CP/M for this very card that started the whole ball rolling with IBM, Gates's purchase of QDOS, and the rise of Microsoft. A 16K expansion option could combine with the Apple II's built-in 48K memory, to get about 64K for CP/M usage. BYTE Magazine's November 1981 review raved, "Because of the flexibility it offers Apple users, I consider the Softcard an excellent buy." Good to know! How does one add a Z80 processor to a system with no expansion slots? Shove a Z80 computer into a cartridge and call it a day, apparently. This interesting, but limited, footnote in CP/M history does what it says, even if it doesn't do it well. Compute!'s Gazette wrote, "The 64 does not make a great CP/M computer. To get around memory limitations, CP/M resorts to intensive disk access. At the speed of the 1541, this makes programs run quite slowly." Even worse for CP/M users is that the slow 1541 can't read CP/M disks. Even if it could, you're stuck in 40-column mode. 
How were users expected to get CP/M software loaded? We'll circle back to that a little later. At any rate, Commodore offered customers an alternative solution. Where its older brother had to make do with a cartridge add-on, the C128 takes a different approach. To maintain backward compatibility with the C64, it includes a 6510-compatible processor, the 8502. It also wants to be CP/M compatible, so it needs a Z80 processor. What to do, what to do? Maybe they could put both processors into the unit? Is that allowed? Could they do that? They could, so they did. CP/M came bundled with the system, which has a native 80-column display in CP/M mode. It is ready to go with the newer, re-programmable 1571 floppy drive. Unfortunately, its slow bus speed forces the Z80 to run at only 2MHz, slower even than a Kaypro II. Compute!'s Gazette said in their April 1985 issue, "CP/M may make the Commodore 128 a bargain buy for small businesses. The price of the Commodore 128 with the 1571 disk drive is competitive with the IBM PCjr." I predict rough times ahead for the PCjr if that's true! Atari peripherals have adorable industrial design, but were quite expensive thanks to a strange system design decision. The 8-bit system's nonstandard serial bus necessitated specialized data encoding/decoding hardware inside each peripheral, driving up unit costs. For example, the Atari 910 5 1/4" floppy drive cost $500 in 1983 (almost $2,000 in 2026) thanks to that special hardware, yet only stored a paltry 90K per disk. SWP straightened out the Atari peripheral scene with the ATR8000. Shenanigans with special controller hardware are eliminated, opening up a world of cheaper, standardized floppy drives of all sizes and capacities. It also accepts Centronics parallel and RS-232C serial devices, making tons of printers, modems, and more compatible with the Atari. 
The device also includes a 16K print buffer and the ability to attach up to four floppy drives without additional controller board purchases. A base ATR8000 can replace a whole stack of expensive Atari-branded add-ons, while being more flexible and performant. The saying goes, "Cheaper, better, faster. Pick any two." The ATR8000 is that rare device which delivered all three. Now, upgrade that box with its CP/M compatibility option, adding a Z80 and 64K, and you've basically bought a second computer. When plugged into the Atari, the Atari functions as a remote terminal into the unit, using whatever 40/80-column display adapter you have connected. It could also apparently function standalone, accessible through any terminal, no Atari needed. That isn't even its final form. The Co-Power-88 is a 128K or 256K PC-compatible add-on to the Z80 CP/M board. When booted into the Z80, that extra RAM can be used as a RAM disk to make CP/M fly. When booted into the 8088, it's a full-on PC running DOS or CP/M-86. Tricked out, this eight pound box would set you back US$1000 in 1984 ($3,000 in 2026), but it should be obvious why this is a coveted piece of kit for the Atari faithful to this day. For UK£399 in 1985 (£1288 in 2026; US$1750) Acorn offered a Z80 with dedicated 64K of RAM. According to the manual, the Z80 handles the CP/M software, while the 6502 in the base unit handles floppies and printers, freeing up CP/M RAM in the process. Plugged into the side of the BBC Micro, the manual suggests desk space clearance of 5 ft wide and 2 1/2 feet deep. My god. Acorn User June 1984 declared, "To sum up, Acorn has put together an excellent and versatile system that has something for everyone." I'd like to note that glowing review was almost exclusively thanks to the bundled CP/M productivity software suite. Their evaluation didn't seem to try loading off-the-shelf software, which caused me to narrow my eyes, and stroke my chin in cynical suspicion. 
Flip through the manual to find out about obtaining additional software, and it gets decidedly vague. "You’ll find a large and growing selection available for your Z80 personal computer, including a special series of products that will work in parallel with the software in your Z80 pack." Like the C128, the Coleco ADAM was a Z80 native machine so CP/M can work without much fuss, though the box does proclaim "Made especially for ADAM!" Since we don't have to add hardware (well, we need a floppy; the ADAM only shipped with a high-speed cassette drive), we can jump into the ecosystem for about US$65 in 1985 ($200 in 2026). Like other CP/M solutions, the ADAM really needed an 80-column adapter, something Coleco promised but never delivered. Like Dragon's Lair on laserdisc! As it stands, CP/M scrolls horizontally to display all 80 columns. This version adds ADAM-style UI for its quaint(?) roman numeral function keys. OK, CP/M is running! Now what? To be honest, I've been toying with you this whole time, dangling the catnip of CP/M compatibility. It's time to come clean and admit the dark side of these add-on solutions. There ain't no software! Even when the CPU and CP/M version were technically compatible, floppy disc format was the sticking point for getting software to run on any given machine. For example, the catalog for Kaypro software in 1984 is 896 pages long. That is all CP/M software and all theoretically compatible with a BBC Micro running CP/M. However, within that catalog, everything shipped expressly on Kaypro compatible floppy discs. Do you think a Coleco ADAM floppy drive can read Kaypro discs? Would you be even the tiniest bit shocked to learn it cannot? Kaypro enthusiast magazine PRO illustrates the issue facing consumers back then. Let's check in on the Morrow Designs (founded by Computer Chronicles sometimes co-host George Morrow!) CP/M system owners. How do they fare? OK then, what about that Baby Blue from earlier? 
The Microsoft Softcard must surely have figured something out. The Apple II was, according to Practical Computing, "the most widespread CP/M system" of its day. Almost every product faced the same challenge. On any given CP/M-80 software disk, the byte code is compatible with your Z80, if your floppy drive can read the diskette. You couldn't just buy a random CP/M disk, throw it into a random CP/M system, and expect it to work, which would have been a crushing blow to young me hoping to play Planetfall on the ADAM. So what could be done? There were a few options, none of them particularly simple or straightforward, especially to those who weren't technically-minded. Some places offered transfer services. XEDEX, the makers of Baby Blue, would do it for $100 per disk. I saw another listing for a similar service (different machine) at $10 per disk. Others sold the software pre-transferred, as noted on a Coleco ADAM service flyer. A few software solutions existed, including Baby Blue's own Convert program, which shipped with their card and "supports bidirectional file transfer between PC-DOS and popular CP/M disk formats." They also had the Baby Blue Conversion Software which used emulation to "turn CP/M-80 programs into PC-DOS programs for fast, efficient execution on Baby Blue II." Xeno-Copy, by Vertex Systems, could copy from over 40 disk formats onto PC-DOS for US$99.50 ($313 in 2026); their Plus version promised cross-format read/write capabilities. Notably, Apple, Commodore, Apricot, and other big names are missing from their compatibility list. The Kermit protocol, once installed onto a CP/M system disk, could handle cross-platform serial transfers, assuming you had the additional hardware necessary. 
"CP/M machines use many different floppy disk formats, which means that one machine often cannot read disks from another CP/M machine, and Kermit is used as part of a process to transfer applications and data between CP/M machines and other machines with different operating systems." The Catch-22 of it all is that you have to get Kermit onto your CP/M disk in the first place. Hand-coding a bare-bones Kermit protocol (CP/M ships with an assembler) for the purposes of getting "real" Kermit onto your system so you could then transfer the actual software you wanted in the first place, was a trick published in the Kermit-80 documentation . Of course, this all assumes you know someone with the proper CP/M setup to help; basically, you're going to need to make friends. Talk to your computer dealer, or better yet, get involved in a local CP/M User's Group. It takes a village to move Wordstar onto a C64. I really enjoyed my time learning dBASE II and am heartened by the consistency of its commands and the clean interaction between them. When I realized that I had accidentally learned how to program dBASE , that was a great feeling. What I expected to be a steep learning curve wasn't "steep" per se, but rather just intimidating. That simple, blinking cursor, can feel quite daunting at the first step, but each new command I learned followed a consistent pattern. Soon enough, simple tools became force multipliers for later tools. The more I used it, the more I liked it. dBASE II is uninviting, but good. On top of that, getting data out into the real world is simple, as you'll see below in "Sharpening the Stone." I'm not locked in. So what keeps me from being super enthusiastic about the experience? It is CP/M-80 which gives me pause. The 64K memory restriction, disk format shenanigans, and floppy disk juggling honestly push me away from that world except strictly for historical investigations. Speaking frankly, I don't care for it. 
CP/M-86 running dBASE III+ could probably win me over, though I would probably try DR-DOS instead. Memory constraints would be essentially erased, DOSBox-X is drag-and-drop trivial to move files in and out of the system, and dBASE III+ is more powerful while also being more user-friendly. Combine that with Clipper , which can compile dBASE applications into standalone .exe files, and there's powerful utility to be had . By the way, did you know dBASE is still alive ? Maybe. Kinda? Hard to say. The latest version is dBASE 2019 (not a typo!), but the site is unmaintained and my appeal to their LinkedIn for a demo has gone unanswered. Its owner, dBase LTD, sells dBASE Classic which is dBASE V for DOS running in DOSBox, a confession they know they lost the plot, I'd humbly suggest. An ignominious end to a venerable classic. Ways to improve the experience, notable deficiencies, workarounds, and notes about incorporating the software into modern workflows (if possible). When working with CP/M disk images, get to know cpmtools . This is a set of command line utilities for creating, viewing, and modifying CP/M disk images. The tools mostly align with Unix commands, prefixed with Those are the commands I wound up using with regularity. If your system of choice is a "weirdo system" you may be restricted in your disk image/formatting choices; these instructions may be of limited or no help. knows about Kaypro II disk layouts via diskdefs. This Github fork makes it easy to browse supported types. Here's what I did. Now that you can pull data out of CP/M, here's how to make use of it. Kaypro II emulation running in MAME. Default setup includes Dual floppies Z80 CPU at 2.4MHz dBase II v2.4 See "Sharpening the Stone" at the end of this post for how to get this going. Personally, I found this to be a tricky process to learn. Change the of the rating field and add in that data. Add pricing fields and related data. Add more games. 
and allow decision branching does iterations and will grab a character or string from the user prints text to screen at a specific character position and give control over system memory will run an assembly routine at a known memory location For this article I specifically picked a period-authentic combo of Kaypro II + CP/M 2.2 + dBASE II 2.4. You don't have to suffer my pain! CP/M-86 and dBASE III+ running in a more feature-rich emulator would be a better choice for digging into non-trivial projects. I'm cold on MAME for computer emulation, except in the sense that in this case it was the fastest option for spinning up my chosen tools. It works, and that's all I can say that I enjoyed. That's not nothing! I find I prefer the robust settings offered in products like WinUAE, Virtual ADAM, VICE , and others. Emulators with in-built disk tools are a luxury I have become addicted to. MAME's interface is an inelegant way to manage hardware configurations and disk swapping. MAME has no printer emulation, which I like to use for a more holistic retro computing experience. Getting a working, trouble-free copy of dBASE II onto a Kaypro II compatible disk image was a non-trivial task. It's easier now that I know the situation, but it took some cajoling. I had to create new, blank disks, and copy CP/M and dBASE over from other disk images. Look below under "Getting Your Data into the Real World" to learn about and how it fits into the process. Be careful of modern keyboard conventions, especially wanting to hit to cancel commands. In MAME this will hard quit the emulator with no warning! Exported data exhibited strange artifacts: The big one: it didn't export any "logical" (boolean) field values from my database. It just left that field blank on all records. Field names are not exported. Garbage data found after the last record; records imported fine. On Linux and Windows (via WSL) install thusly : view the contents of a CP/M disk image. 
Use the flag to tell it the format of the disk, like for the Kaypro II. : format a disk image with a CP/M file system : copy files to/from other disk or to the host operating system : remove files from a CP/M disk image : for making new, blank disk image files (still needs to be formatted) : makes a blank disk image to single-sided, double-density specification : formats that blank image for the Kaypro II : copies "DBASE.COM" from the current directory of the host operating system into the Kaypro II disk image. : displays the contents of the disk : copies "FILE.TXT" from the disk image into the current directory of the host operating system (i.e. ) dBASE has built-in exporting functionality, so long as you use the extension when saving ( in dBASE lingo). That creates a bog-standard ASCII text file, each record on its own line, comma-delimited (and ONLY comma-delimited). It is not Y2K compatible, if you're hoping to record today's date in a field. I tackled this a bit in the Superbase post . It is probably possible to hack up a Command file to work around this issue, since dates are just strings in dBASE . dBASE II doesn't offer the relational robustness of SQL. Many missing, useful tools could be built in the xBase programming language. It would be significant work in some cases; maybe not worth it or consider if you can do without those. Your needs may exceed what CP/M-80 hardware can support; its 8-bit nature is a limiting factor in and of itself. If you have big plans , consider dBASE III+ on DOS to stretch your legs. (I read dBASE IV sucks) The user interface helps at times, and is opaque at other times. This can be part of the fun in using these older systems, mastering esoterica for esoterica's sake, but may be a bridge too far for serious work of real value. Of course, when discussing older machines we are almost always excluding non-English speakers thanks to the limitations of ASCII. The world just wasn't as well-connected at the time.
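A minimal sketch of pulling that comma-delimited export into a modern workflow, in Python. It assumes the garbage after the last record is CP/M's usual Ctrl-Z (0x1A) end-of-file padding, which is a guess about this particular export. The function name, field names, quote character, and sample bytes are all hypothetical; the field names have to be supplied by hand because the export carries no header row.

```python
import csv
import io

CPM_EOF = b"\x1a"  # CP/M text files are commonly padded with Ctrl-Z (assumption here)


def read_dbase_export(raw, field_names, quotechar='"'):
    """Parse a comma-delimited dBASE II export into a list of dicts.

    field_names is supplied by the caller because the export has no header row.
    """
    # Keep only the bytes before the first Ctrl-Z; everything after is treated as padding.
    text = raw.split(CPM_EOF, 1)[0].decode("ascii", errors="replace")
    reader = csv.reader(io.StringIO(text), quotechar=quotechar)
    records = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        values = [cell.strip() for cell in row]
        records.append(dict(zip(field_names, values)))
    return records


# Hypothetical two-record export; the trailing empty column stands in for
# the logical field that came out blank.
sample = b"ZORK,INFOCOM,5,\r\nROGUE,EPYX,4,\r\n" + CPM_EOF * 8
games = read_dbase_export(sample, ["title", "publisher", "rating", "owned"])
for game in games:
    print(game)
```

Because logical values arrive blank, the `owned` column in this sketch would need to be re-entered or derived some other way after import.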


Premium: The Hater's Guide To Microsoft

Have you ever looked at something too long and felt like you were sort of seeing through it? Has anybody actually looked at a company this much in a way that wasn’t some sort of obsequious profile of a person who worked there? I don’t mean this as a way to fish for compliments — this experience is just so peculiar, because when you look at them hard enough, you begin to wonder why everybody isn’t just screaming all the time.  Yet I really do enjoy it. When you push aside all the marketing and the interviews and all that and stare at what a company actually does and what its users and employees say, you really get a feel of the guts of a company. I’m enjoying it. The Hater’s Guides are a lot of fun, and I’m learning all sorts of things about the ways in which companies try to hide their nasty little accidents and proclivities.  Today, I focus on one of the largest.  In the last year I’ve spoken to over a hundred different tech workers, and the ones I hear most consistently from are the current and former victims of Microsoft, a company with a culture in decline, in large part thanks to its obsession with AI. Every single person I talk to about this company has venom on their tongue, whether they’re a regular user of Microsoft Teams or somebody who was unfortunate to work at the company any time in the last decade. Microsoft exists as a kind of dark presence over business software and digital infrastructure. You inevitably have to interact with one of its products — maybe it’s because somebody you work with uses Teams, maybe it’s because you’re forced to use SharePoint, or perhaps you’re suffering at the hands of PowerBI — because Microsoft is the king of software sales. It exists entirely to seep into the veins of an organization and force every computer to use Microsoft 365, or sit on effectively every PC you use, forcing you to interact with some sort of branded content every time you open your start menu . 
This is a direct result of the aggressive monopolies that Microsoft built over effectively every aspect of using the computer, starting by throwing its weight around in the 80s to crowd out potential competitors to MS-DOS and eventually moving into everything including cloud compute, cloud storage, business analytics, video editing, and console gaming, and I’m barely a third through the list of products. Microsoft uses its money to move into new markets, uses aggressive sales to build long-term contracts with organizations, and then lets its products fester until it’s forced to make them better before everybody leaves, with the best example being the recent performance-focused move to “rebuild trust in Windows” in response to the upcoming launch of Valve’s competitor to the Xbox (and Windows gaming in general), the Steam Machine. Microsoft is a company known for two things: scale and mediocrity. It’s everywhere, its products range from “okay” to “annoying,” and virtually every one of its products is a clone of something else. And nowhere is that mediocrity more obvious than in its CEO. Since taking over in 2014, CEO Satya Nadella has steered this company out of the darkness caused by aggressive possible chair-thrower Steve Ballmer, transforming from the evils of stack ranking to encouraging a “growth mindset” where you “believe your most basic abilities can be developed through dedication and hard work.” Workers are encouraged to be “learn-it-alls” rather than “know-it-alls,” all part of a weird cult-like pseudo-psychology that doesn’t really ring true if you actually work at the company. Nadella sells himself as a calm, thoughtful and peaceful man, yet in reality he’s one of the most merciless layoff hogs in known history.
He laid off 18,000 people in 2014 months after becoming CEO, 7,800 people in 2015 , 4,700 people in 2016 , 3,000 people in 2017 , “hundreds” of people in 2018 , took a break in 2019, every single one of the workers in its physical stores in 2020 along with everybody who worked at MSN , took a break in 2021, 1,000 people in 2022 , 16,000 people in 2023 , 15,000 people in 2024 and 15,000 people in 2025 .  Despite calling for a “ referendum on capitalism ” in 2020 and suggesting companies “grade themselves” on the wider economic benefits they bring to society, Nadella has overseen an historic surge in Microsoft’s revenues — from around $83 billion a year when he joined in 2014 to around $300 billion on a trailing 12-month basis — while acting in a way that’s callously indifferent to both employees and customers alike.  At the same time, Nadella has overseen Microsoft’s transformation from an asset-light software monopolist that most customers barely tolerate to an asset-heavy behemoth that feeds its own margins into GPUs that only lose it money. And it’s that transformation that is starting to concern investors , and raises the question of whether Microsoft is heading towards a painful crash.  You see, Microsoft is currently trying to pull a fast one on everybody, claiming that its investments in AI are somehow paying off despite the fact that it stopped reporting AI revenue in the first quarter of 2025 . In reality, the one segment where it would matter — Microsoft Azure, Microsoft’s cloud platform where the actual AI services are sold — is stagnant, all while Redmond funnels virtually every dollar of revenue directly into more GPUs.  Intelligent Cloud also represents around 40% of Microsoft’s total revenue, and has done so consistently since FY2022. Azure sits within Microsoft's Intelligent Cloud segment, along with server products and enterprise support. 
For the sake of clarity, here’s how Microsoft describes Intelligent Cloud in its latest end-of-year 10-K filing: Our Intelligent Cloud segment consists of our public, private, and hybrid server products and cloud services that power modern business and developers. This segment primarily comprises: It’s a big, diverse thing, and Microsoft doesn’t really break things down further from here, but Microsoft makes it clear in several places that Azure is the main revenue driver in this fairly diverse business segment. Some bright spark is going to tell me that Microsoft said it has 15 million paid 365 Copilot subscribers (which, I add, sits under its Productivity and Business Processes segment), with reporters specifically saying these were corporate seats, a fact I dispute, because this is the quote from Microsoft’s latest conference call around earnings: At no point does Microsoft say “corporate seat” or “business seat.” “Enterprise Copilot Chat” is a free addition to multiple different Microsoft 365 products, and Microsoft 365 Copilot could also refer to Microsoft’s $18 to $21-a-month addition to Copilot Business, as well as Microsoft’s enterprise $30-a-month plans. And remember: Microsoft regularly does discounts through its resellers to bulk up these numbers. When Nadella took over, Microsoft had around $11.7 billion in PP&E (property, plant, and equipment). A little over a decade later, that number has ballooned to $261 billion, with the vast majority added since 2020 (when Microsoft’s PP&E sat around $41 billion). Also, as a reminder: Jensen Huang has made it clear that GPUs are going to be upgraded on a yearly cycle, guaranteeing that Microsoft’s armies of GPUs regularly hurtle toward obsolescence. Microsoft, like every big tech company, has played silly games with how it depreciates assets, extending the “useful life” of all GPUs so that they depreciate over six years, rather than four.
And while someone less acquainted with corporate accounting might assume that this move is a prudent, fiscally-conscious tactic to reduce spending by using assets for longer, and stretching the intervals between their replacements, in reality it’s a handy tactic to disguise the cost of Microsoft’s profligate spending on the balance sheet. You might be forgiven for thinking that all of this investment was necessary to grow Azure, which is clearly the most important part of Microsoft’s Intelligent Cloud segment. In Q2 FY2020, Intelligent Cloud revenue sat at $11.9 billion on PP&E of around $40 billion, and as of Microsoft’s last quarter, Intelligent Cloud revenue sat at around $32.9 billion on PP&E that has increased by over 650%. Good, right? Well, not really. Let’s compare Microsoft’s Intelligent Cloud revenue from the last five years: In the last five years, Microsoft has gone from spending 38% of its Intelligent Cloud revenue on capex to nearly every penny (over 94%) of it in the last six quarters, during the same two and a half years in which Intelligent Cloud has failed to show any growth. Things, I’m afraid, get worse. Microsoft announced in July 2025 (the end of its 2025 fiscal year) that Azure made $75 billion in revenue in FY2025. This was, as the previous link notes, the first time that Microsoft actually broke down how much Azure actually made, having previously simply lumped it in with the rest of the Intelligent Cloud segment. I’m not sure what to read from that, but it’s still not good. That means Microsoft spent every single penny of its Azure revenue from that fiscal year on capital expenditures of $88 billion and then some, a little under 117% of all Azure revenue to be precise. If we assume Azure regularly represents 71% of Intelligent Cloud revenue, Microsoft has been spending anywhere from half to three-quarters of Azure’s revenue on capex.
To simplify: Microsoft is spending lots of money to build out capacity on Microsoft Azure (as part of Intelligent Cloud), and growth of capex is massively outpacing the meager growth that it’s meant to be creating.  You know what’s also been growing? Microsoft’s depreciation charges, which grew from $2.7 billion in the beginning of 2023 to $9.1 billion in Q2 FY2026 , though I will add that they dropped from $13 billion in Q1 FY2026, and if I’m honest, I have no idea why! Nevertheless, depreciation continues to erode Microsoft’s on-paper profits, growing (much like capex, as the two are connected!) at a much-faster rate than any investment in Azure or Intelligent Cloud. But worry not, traveler! Microsoft “beat” on earnings last quarter, making a whopping $38.46 billion in net income …with $9.97 billion of that coming from recapitalizing its stake in OpenAI. Similarly, Microsoft has started bulking up its Remaining Performance Obligations. See if you can spot the difference between Q1 and Q2 FY26, emphasis mine: So, let’s just lay it out: …Microsoft’s upcoming revenue dropped between quarters as every single expenditure increased, despite adding over $200 billion in revenue from OpenAI. A “weighted average duration” of 2.5 years somehow reduced Microsoft’s RPOs. But let’s be fair and jump back to Q4 FY2025… 40% of $375 billion is $150 billion. Q3 FY25 ? 40% on $321 billion, or $128.4 billion. Q2 FY25 ? $304 billion, 40%, or $121.6 billion.  It appears that Microsoft’s revenue is stagnating, even with the supposed additions of $250 billion in spend from OpenAI and $30 billion from Anthropic , the latter of which was announced in November but doesn’t appear to have manifested in these RPOs at all. In simpler terms, OpenAI and Anthropic do not appear to be spending more as a result of any recent deals, and if they are, that money isn’t arriving for over a year. 
Much like the rest of AI, every deal with these companies appears to be entirely on paper, likely because OpenAI will burn at least $115 billion by 2029 , and Anthropic upwards of $30 billion by 2028, when it mysteriously becomes profitable two years before OpenAI “does so” in 2030 .  These numbers are, of course, total bullshit. Neither company can afford even $20 billion of annual cloud spend, let alone multiple tens of billions a year, and that’s before you get to OpenAI’s $300 billion deal with Oracle that everybody has realized ( as I did in September ) requires Oracle to serve non-existent compute to OpenAI and be paid hundreds of billions of dollars that, helpfully, also don’t exist. Yet for Microsoft, the problems are a little more existential.  Last year, I calculated that big tech needed $2 trillion in new revenue by 2030 or investments in AI were a loss , and if anything, I think I slightly underestimated the scale of the problem. As of the end of its most recent fiscal quarter, Microsoft has spent $277 billion or so in capital expenditures since the beginning of FY2022, with the majority of them ($216 billion) happening since the beginning of FY2024. Capex has ballooned to the size of 45.5% of Microsoft’s FY26 revenue so far — and over 109% of its net income.  This is a fucking disaster. While net income is continuing to grow, it (much like every other financial metric) is being vastly outpaced by capital expenditures, none of which can be remotely tied to profits , as every sign suggests that generative AI only loses money. While AI boosters will try and come up with complex explanations as to why this is somehow alright, Microsoft’s problem is fairly simple: it’s now spending 45% of its revenues to build out data centers filled with painfully expensive GPUs that do not appear to be significantly contributing to overall revenue, and appear to have negative margins. 
Those same AI boosters will point at the growth of Intelligent Cloud as proof, so let’s do a thought experiment (even though they are wrong): if Intelligent Cloud’s segment growth is a result of AI compute, then the cost of revenue has vastly increased, and the only reason we’re not seeing it is that the increased costs are hitting depreciation first. You see, Intelligent Cloud is stalling, and while it might be up by 8.8% on an annualized basis (if we assume each quarter of the year will be around $30 billion, that makes $120 billion, so about an 8.8% year-over-year increase from $106 billion), that’s come at the cost of a massive increase in capex (from $88 billion for FY2025 to $72 billion for the first two quarters of FY2026 ), and gross margins that have deteriorated from 69.89% in Q3 FY2024 to 68.59% in FY2026 Q2 , and while operating margins are up, that’s likely due to Microsoft’s increasing use of contract workers and increased recruitment in cheaper labor markets. And as I’ll reveal later, Microsoft has used OpenAI’s billions in inference spend to cover up the collapse of the growth of the Intelligent Cloud segment. OpenAI’s inference spend now represents around 10% of Azure’s revenue. Microsoft, as I discussed a few weeks ago , is in a bind. It keeps buying GPUs, all while waiting for the GPUs it already has to start generating revenue, and every time a new GPU comes online, its depreciation balloons. Capex for GPUs began in seriousness in Q1 FY2023 following October’s shipments of NVIDIA’s H100 GPUs , with reports saying that Microsoft bought 150,000 H100s in 2023 (around $4 billion at $27,000 each) and 485,000 H100s in 2024 ($13 billion). 
These GPUs are yet to provide much meaningful revenue, let alone any kind of profit , with reports suggesting ( based on Oracle leaks ) that the gross margins of H100s are around 26% and A100s (an older generation launched in 2020) are 9%, for which the technical term is “dogshit.”  Somewhere within that pile of capex also lies orders for H200 GPUs, and as of 2024, likely NVIDIA’s B100 (and maybe B200) Blackwell GPUs too. You may also notice that those GPU expenses are only some portion of Microsoft’s capex, and the reason is because Microsoft spends billions on finance leases and construction costs. What this means in practical terms is that some of this money is going to GPUs that are obsolete in 6 years, some of it’s going to paying somebody else to lease physical space, and some of it is going into building a bunch of data centers that are only useful for putting GPUs in. And none of this bullshit is really helping the bottom line! Microsoft’s More Personal Computing segment — including Windows, Xbox, Microsoft 365 Consumer, and Bing — has become an increasingly-smaller part of revenue, representing in the latest quarter a mere 17.64% of Microsoft’s revenue in FY26 so far, down from 30.25% a mere four years ago. We are witnessing the consequences of hubris — those of a monopolist that chased out any real value creators from the organization, replacing them with an increasingly-annoying cadre of Business Idiots like career loser Jay Parikh and scummy, abusive timewaster Mustafa Suleyman .  Satya Nadella took over Microsoft with the intention of fixing its culture, only to replace the aggressive, loudmouthed Ballmer brand with a poisonous, passive aggressive business mantra of “you’ve always got to do more with less.” Today, I’m going to walk you through the rotting halls of Redmond’s largest son, a bumbling conga line of different businesses that all work exactly as well as Microsoft can get away with.  
Welcome to The Hater’s Guide To Microsoft, or Instilling The Oaf Mindset.

Server products and cloud services, including Azure and other cloud services, comprising cloud and AI consumption-based services, GitHub cloud services, Nuance Healthcare cloud services, virtual desktop offerings, and other cloud services; and Server products, comprising SQL Server, Windows Server, Visual Studio, System Center, related Client Access Licenses (“CALs”), and other on-premises offerings.
Enterprise and partner services, including Enterprise Support Services, Industry Solutions, Nuance professional services, Microsoft Partner Network, and Learning Experience.

Q1: $398 billion of RPOs, 40% within 12 months, $159.2 billion in upcoming revenue.
Q2: $625 billion of RPOs, 25% within 12 months, $156.25 billion in upcoming revenue.
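The near-term RPO figures quoted in this piece are simple products of the headline RPO and the share due within 12 months. Here is a minimal Python sketch that reproduces them from the numbers given in the text. The dictionary layout and the function name are only illustrative, and `Decimal` is used so the dollar amounts are not blurred by floating-point rounding.

```python
from decimal import Decimal

# (total RPO in $ billions, share due within 12 months), as quoted in the text
RPO_BY_QUARTER = {
    "Q2 FY25": (Decimal("304"), Decimal("0.40")),
    "Q3 FY25": (Decimal("321"), Decimal("0.40")),
    "Q4 FY25": (Decimal("375"), Decimal("0.40")),
    "Q1 FY26": (Decimal("398"), Decimal("0.40")),
    "Q2 FY26": (Decimal("625"), Decimal("0.25")),
}


def upcoming_revenue(total, share):
    """Portion of the RPO balance expected to be recognized within 12 months."""
    return total * share


for quarter, (total, share) in RPO_BY_QUARTER.items():
    amount = upcoming_revenue(total, share)
    print(f"{quarter}: ${amount} billion due within 12 months")
```

The last two rows are the comparison that matters: the headline RPO jumps from $398 billion to $625 billion, while the 12-month portion slips from $159.2 billion to $156.25 billion.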

Evan Schwartz 3 months ago

Scour - January Update

Hi friends, In January, Scour scoured 805,241 posts from 16,555 feeds (939 were newly added). I also rolled out a lot of new features that I'm excited to tell you about. Maybe because of some of these, I found more posts than usual that I thought were especially worth sharing. You can find them at the bottom of this post. Let's dive in! The Scour homepage has been completely revamped. It includes a new tagline, a more succinct description, and a live demo where you can try out my feed right from that page. Let me know what you think! Scour also finally has its own logo! (And it looks great on my phone's home screen, if I do say so myself! See below ) Have you ever wondered how Scour works? There is now a full documentation section, complete with detailed write-ups about Interests , Feeds , Reactions , How Ranking Works , and more. There are also guides specifically for RSS users and readers of Hacker News , arXiv , Reddit , and Substack . All of the docs have lots of interactive elements, which I wrote about in Building Docs Like a Product . My favorite one is on the Hacker News guide where you can search for hidden gems that have been submitted to HN but that have not reached the front page. Thanks to Tiago Ferreira , Andrew Doran , and everyone else who gave me the feedback that they wanted to understand more about how Scour works! Scour is now a Progressive Web App (PWA). That means you can install it as an icon on your home screen and access it easily. Just open Scour on your phone and follow the instructions there. Thanks to Adam Benenson for the encouragement to finally do this! This is one of the features I have most wanted as a user of Scour myself. When you're browsing the feed, Scour now keeps track of which items you've seen and scrolled past so it shows you new content each time you check it. If you don't want this behavior, you can disable it in the feed filter menu or change your default view to show seen posts. 
If you subscribe to specific feeds, as opposed to scouring all of them, it's now easier to find the feed for an article you liked . Click the "..." menu under the post, then "Show Feeds" to show feeds where the item was found. When populating that list, Scour will now automatically search the website where the article was found to see if it has a feed that Scour wasn't already checking. This makes it easy to discover new feeds and follow websites or authors whose content you like. This was another feature I've wanted for a long time myself. Previously, when I liked an article, I'd copy the domain and try to add it to my feeds on the Feeds page. Now, Scour does that with the click of a button. Some of the most disliked and flagged articles on Scour had titles such as "The Top 10..." or "5 tricks...". Scour now automatically penalizes articles with titles like those. Because I'm explicitly trying to avoid using popularity in ranking , I need to find other ways to boost high-quality content and down-rank low-quality content. You can expect more of these types of changes in the future to increase the overall quality of what you see in your feed. Previously, posts found through Google News links would show Google News as the domain under the post. Now, Scour extracts the original link. You can now navigate your feed using just your keyboard. Type to get the list of available keyboard shortcuts. Finally, here are some of my favorite posts that I found on Scour in January. There were a lot! Happy Scouring! Have feedback for Scour? Post it on the feedback board and upvote others' suggestions to help me prioritize new features! I appreciate this minimalist approach to coding agents: Pi: The Minimal Agent Within OpenClaw , even though it didn't yet convince me to switch away from Claude Code. A long and interesting take on which software tools will survive the AI era: Software Survival 3.0 . Scour uses Litestream for backup. 
While this new feature isn't directly relevant, I'm excited that it's now powering Fly.io's new Sprites offering (so I expect it to be a little more actively developed): Litestream Writable VFS . This is a very cool development in embedding models: a family of different size (and, as a result, cost) models whose embeddings are interoperable with one another: The Voyage 4 model family: shared embedding space with MoE architecture . A thought-provoking piece from Every about How AI Made Pricing Hard Again . TL;DR: over are the days where SaaS businesses have practically zero marginal cost for additional users or additional usage. A nice bit of UX design history about the gas tank arrow indicator on a car, with a lesson applied to AI: The Moylan Arrow: IA Lessons for AI-Powered Experiences . Helpful context for Understanding U.S. Intervention in Venezuela . Stoolap: an interesting new embedded database. Stoolap 0.2 Released For Modern Embedded SQL Database In Rust . I keep browsing fonts and, while I decided not to use this one for Scour, I think this is a neat semi-sans-serif from an independent designer: Heliotrope .


Prolog Basics Explained with Pokémon

The project that inspired this post is a little silly—I am about to describe the mechanics of a children’s video game in great detail—but this particular problem is what finally made Prolog click for me, an epiphany I’ve been hunting for ever since reading Bruce Tate’s “Seven Languages in Seven Weeks.” This exercise has taught me a lot about the kinds of interfaces I’m trying to build in somewhat more practical domains . For certain kinds of relationships, logic programming is by far the most concise and expressive programming system I’ve ever used. To understand why, let’s talk about Pokémon. Pokémon is a video game series/multimedia franchise/lifestyle brand set in a world where humans live alongside a menagerie of colorful animal characters. “Pokémon” is both the name of the franchise and the generic term for the animal characters themselves, which all have their own individual species names. There are over a thousand distinct species of Pokémon, from Bulbasaur ( #1 ) to Pecharunt ( #1025 ). There are all sorts of Pokémon games now, but the main series has always been about catching and battling them. During a battle, your team of six Pokémon faces off against another team. Each Pokémon is equipped with four moves that it can choose to (usually) do damage to their opponent. You need to reduce the HP (Hit Points) of all your opponent’s Pokémon to zero before they are able to do so to you. Each Pokémon has unique traits that affects how it battles. They have a set of base stats, a large pool of possible moves, a handful of abilities, and a typing. As you will see in a moment, the immense number of combinations here is the motivation for trying to track this with software. Typing is especially important. Moves have a type, like Fire or Rock, and Pokémon can have up to two types. A move with a type that is Super Effective against the opposing Pokémon will do double damage; a move that is Not Very Effective will do half damage. 
It’s a little more intuitive with examples. The Fire-type move Flamethrower will do 2x to Grass-type Pokémon, because Grass is weak to Fire, but the Water-type move Surf will only do ½ damage to them, because Grass resists Water. Type modifiers can stack. Scizor is a Bug/Steel type, and both Bug and Steel are weak to Fire, so Fire moves will do 4x damage to Scizor. Electric is weak to Water, but Ground is immune, so if you use an Electric type move against Water/Ground Swampert , you’ll do zero damage, since 0×2 is still 0. Naturally, there is a chart to help you keep track. Those are effectively the mechanics of the Pokémon video games as I understood them when I was 8. Click moves to do damage, try to click moves with good type matchups. These games are for children and, at the surface level, they’re not very hard. Before I explain how wonky the Pokémon mechanics can get under the hood, I first need to explain how logic programming works. Pokémon is a great fit for logic programming because Pokémon battles are essentially an extremely intricate rules engine. Let’s start by creating a file with a bunch of facts. In Prolog, we declare “predicates.” Predicates define relationships: is a , is a , and so on. We refer to this predicate as , because the name of the predicate is and it has one argument. These facts are loaded into an interactive prompt called the “top-level.” You query the top-level by typing a statement into the prompt; Prolog tries to find all the ways to make that statement true. When there’s more than one possible solution, the top-level displays the first solution and then awaits user input. You can then have it display one more solution, all the solutions, or stop entirely. In this first example, we type and hit Enter. The top-level replies Squirtle is, in fact, a Pokémon. Not all things are Pokémon. Let’s add Pokémon types in there, as the predicate . Recall that some Pokémon have just one type while others have two. 
In the latter case, that’s modeled with two facts. Bulbasaur is a Grass type, and Bulbasaur is a Poison type; both are true. The paradigm is similar to a One-To-Many relation in a SQL database. Interactively, we can confirm whether Squirtle is a Water type. Can we state that Squirtle is a Grass type? No, because Squirtle is a Water type. Suppose we didn’t know what type Squirtle was. We can ask! In Prolog, names that start with an upper-case letter are variables. Prolog tries to “unify” the predicate with all possible matches for the variable. There’s only one way to make this particular predicate true though: has to be , because Squirtle’s only type is Water. For Pokémon with two types, the predicate unifies twice. Semantically, that leading semicolon on the third line means “or.” is true when or when . Any of the terms can be a variable, which means we can ask questions in any direction. What are all the Grass types? Just make the first argument the variable, and set the second argument to . I cut it off, but the prompt would happily list all 164 of them. Commas can be used to list multiple predicates; Prolog will unify the variables such that all of them are true. Listing all the Water/Ice types is just a matter of asking what Pokémon exist that unify with both the Water and Ice types. Even though is a variable, in the context of the query, both instances of it have to be the same (just like in algebra). The query only unifies for values of where both those predicates hold. For instance, the Water/Ice type Dewgong is a solution because our program contains the following two facts: Therefore, subbing in for the variable satisfies the query. Squirtle, by contrast, is just a Water type: exists, but not . The query requires both to unify, so is not a possible value for . Pokémon have lots of data that you can play around with. Iron Bundle is a strong Water/Ice-type Pokémon with high Special Attack. How high exactly?
With Special Attack that high, we want to make use of strong Special moves. What Special moves does Iron Bundle know? Freeze-Dry is a particularly good Special move. Here’s a query for all Ice-type Pokémon with Special Attack greater than 120 that learn Freeze-Dry. One last concept before we move on: Rules. Rules have a head and a body, and they unify if the body is true. A move is considered a damaging move if it’s either a Physical Move or a Special Move. The predicate defines all the moves that do direct damage. This will unify with any moves that do direct damage. Nothing I’ve shown so far is, logically speaking, very ambitious: just “and” and “or” statements about various facts. It’s essentially a glorified lookup table. Still, take a moment to appreciate how much nicer it is to query this database than a plausible alternative, like SQL. For the facts we’ve seen so far, I would probably set up SQL tables like this: Then query it like so: For comparison, here’s the equivalent Prolog query again: I’m not ripping on SQL (I love SQL), but that’s the best declarative query language most people interact with. It’s amazing to me how much simpler and more flexible the Prolog version is. The SQL query would become unmanageably complex if we continued to add clauses, while the Prolog query remains easy to read and edit (once you get the hang of how variables work). With the basics established, here’s some context on the project I’m working on. Pokémon battles have an outrageous number of mechanics that all interact in complex and probabilistic ways. Part of the appeal of these games is the futile attempt to keep them all in your head better than your opponent, using that information to out-predict and out-maneuver their plans. It’s sort of like a very silly Poker. The challenge, if you want to build software for this game, is to model all that complexity without losing your mind. 
Prolog is stunningly good at this, for two main reasons: To illustrate that, here’s how I implemented priority moves for my Pokémon draft league. Pokémon draft is pretty much what it sounds like. Pokémon are given a point value based on how good they are, each player is given a certain amount of points to spend, and you draft until every player has spent their points. Your team ends up with about 8-11 Pokémon and each week you go head to head against another person in the league. My friend and WMI collaborator Morry invited me to his a couple years ago and I’ve been hooked on the format ever since. The games are 6v6, so a big part of the battle is preparing for all the possible combinations of six your opponent could bring, and putting together six of your own that can handle all of them. Naturally, you can only build teams with the Pokémon you drafted. I just made that predicate my name: . What Pokémon do I have that learn Freeze-Dry ? None. Rats. One very important type of move is priority moves. Earlier I mentioned that the Speed stat controls which Pokémon moves first. Some nuance: the Pokémon that used the move with the highest priority goes first, and if they both selected a move of the same priority, then the one with the higher Speed goes first. Most moves have a priority of zero. Ah, but not all! Accelerock has a priority of 1. A Pokémon that uses Accelerock will move before any Pokémon that uses a move with priority 0 (or less), even if the latter Pokémon has a higher Speed stat. I define a predicate that unifies with a Pokémon, the priority move it learns, and what priority that move is. A simple query that asks “what priority moves does my team learn” returns a lot of answers. Although this is technically correct (the best kind), most of these answers are not actually useful. Helping Hand and Ally Switch have very high priority, but they only have a purpose in Double Battles, which isn’t the format I’m playing. 
To fix this, I define all the Double Battle moves and exclude them. I’m going to exclude the move Bide too, which is functionally useless. The predicate means “true if this goal fails”, and means “these two terms are different.” We get the following results: Much better, but there’s a handful of moves in there that go first because they protect the user from damage or status, like Detect . That’s not really what I mean by priority move—I’m interested in moves that will surprise my opponent with damage or an adverse side effect, like Quick Attack and Sucker Punch . With those rules in place, we arrive at a very useful answer! It’s even more useful to look up what priority moves my opponent for the week has. At this point, I showed the program to Morry and he hit me with a challenge. Pokémon with the Prankster ability get an additional +1 priority on their status moves. Could the rule be extended to note that? I happen to have one such Pokémon on my team. This took me 3 minutes, using Prolog’s if/then construct, . Now the same query includes all of Tornadus’ status moves, with their increased priority. At the top, I said that this experience had taught me about the kinds of interfaces I want to build. One of those lessons is fairly obvious: Prolog can be a little clunky, but it’s an elegant language for expressing and querying relations like the ones described here. That has implications if you, like me, are interested in the judicious use of declarative DSLs for programming. The other lesson is what kinds of tools work for non -programmers. I’m not the first person to think “it would be nice to know what priority moves my opponent’s team has.” The Pokémon community has resources like this, built in the best programming interface of all time: the humble spreadsheet. I use a copy of “Techno’s Prep Doc” , which is one of those spectacularly-advanced Google Sheets you come across in the wild sometimes. 
You put in the teams and it generates tons of useful information about the matchup. It has a great interface, support for a variety of formats, scannable visuals, and even auto-complete. I was curious about the formula for finding priority moves. It’s gnarly. With a little bit of clicking around, I was basically able to figure out what this does. There’s a “Backend” sheet that lists all the moves. It’s effectively a hard-coded version of my Prolog query. The lookup formula does some filtering, VLOOKUP-ing, and kinda-metaprogramming (INDIRECT returns a cell reference ) to find all the Pokémon on your team that are in that Backend list, and display them. There are a number of reasons that I, personally, would prefer to work on a version of this database implemented in Prolog instead of one implemented with spreadsheet VLOOKUPs. I plan to build webapps with this that do things the existing suite of Pokémon tooling can’t. (If I can ever get scryer-prolog to compile to WASM , that is.) Furthermore, the Prolog paradigm is clearly more extensible. The spreadsheet backend is a hard-coded list of notable moves; my database can look up any move. I still can’t really believe this query, which finds all the Special moves that Tornadus learns which are super-effective against any member of Justin’s team. Nothing like that exists in any tool that I know of—it’s the kind of thing I normally try to figure out by endlessly switching tabs. With the grammar established by my program, I put this together in like 30 seconds. I’m not interested in how structured programming is more extensible than spreadsheets, though. I already know why I don’t do all my programming in spreadsheets. A question I find very important is: What is it about this particular problem, and the kinds of people who were motivated to solve it, where the most well-maintained solution available is a spreadsheet? 
I believe there are a great many problems like that in the world, and a lot of improvements on that programming paradigm yet to be properly realized. Thanks to Morry Kolman for reading a draft of this blog.
Some moves miss a certain percentage of the time, doing no damage. Some moves raise or lower a Pokémon's stats. Pokémon can hold items that have various effects. Damage calculations aren't constant; moves do normally-distributed damage within the calculated range. Pokémon can get frozen, burned, paralyzed, poisoned, or fall asleep; these all have various adverse effects. There are a variety of field effects (like weather, terrain, Trick Room) which alter move damage, turn order, and other things. Pokémon each have an ability that has various effects, e.g. Levitate makes you immune to Ground moves, Drizzle turns the weather to Rain when the Pokémon switches in, Sheer Force disables a move's side effects but multiplies its damage by 1.3x. Players have points they (invisibly) allocate to each Pokémon before the game, to boost chosen stats. Depending on how they built the team, each Pokémon might do more damage or take hits better than you were expecting. Take a look at the damage calculator to get an idea of what I mean.
The two reasons Prolog is so good at this: the query model excels at describing ad-hoc combinations, and the data model is perfectly suited to layering rules in a consistent way.
I joined the draft league in Season 3, lost in finals, then won Seasons 4 and 5. We just started Season 6. If you want it, you can have the crown. There are a number of coders in this draft league and I have gotten precisely zero of them to try out my Prolog program. That’s kind of the point though! 
It needs to be a website… The Prolog implementation I’m using is Scryer Prolog, a modern Prolog implementation that emphasizes standards and formal correctness. The creator, Markus Triska, has a terrific online book, “The Power of Prolog,” and accompanying YouTube channel that has soundtracked my breakfast for weeks. Scryer Prolog is also designed to encourage more constructs that preserve logical completeness and monotonicity, which means I’m not really supposed to use the or predicates. I couldn’t really figure out how to express what I wanted with the replacements offered, though. Happy to edit if anyone wants to help. Also, on Markus’ website: “My goal is to provide programs that work as intended, reliably and conveniently, with zero surprises. Programs that you can run for multiple decades without any issues such as crashes, resource leaks or other unexpected behaviour.” This guy and I have some similar interests! I did some fun metaprogramming to get all the data into Prolog predicates using the Pokémon Showdown NodeJS API. Yes, putting the accent on the “e” everywhere but the code blocks was very annoying.
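As a rough illustration of the type-stacking arithmetic described at the top of this post (4x on Scizor, 0x on Swampert), here is a minimal Python sketch. It is not part of the Prolog program; `TYPE_CHART` and `type_multiplier` are hypothetical names, and the chart holds only the handful of matchups mentioned in the text, with every unlisted pairing assumed to be neutral.

```python
from math import prod

# Hypothetical slice of the type chart, limited to matchups named in the post.
# Keys are (attacking move type, defending Pokémon type); values are multipliers.
# ASSUMPTION: any pairing not listed here is treated as neutral (1x).
TYPE_CHART = {
    ("fire", "grass"): 2.0,       # Grass is weak to Fire
    ("water", "grass"): 0.5,      # Grass resists Water
    ("fire", "bug"): 2.0,         # Bug is weak to Fire
    ("fire", "steel"): 2.0,       # Steel is weak to Fire
    ("electric", "water"): 2.0,   # Water is weak to Electric
    ("electric", "ground"): 0.0,  # Ground is immune to Electric
}


def type_multiplier(move_type, defender_types):
    """Multiply the per-type modifiers for each of the defender's types."""
    return prod(TYPE_CHART.get((move_type, t), 1.0) for t in defender_types)


if __name__ == "__main__":
    print(type_multiplier("fire", ["bug", "steel"]))         # Scizor
    print(type_multiplier("electric", ["water", "ground"]))  # Swampert
```

Because the modifiers are multiplied rather than added, a single immunity zeroes out the whole product, which is why the Electric move does nothing to Swampert despite the Water weakness.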

0 views
Evan Schwartz 4 months ago

Scour Year End Update 2025

I thought about sending out a personalized "Scour Wrapped"... until I got the 7th Wrapped from some random service. So instead, I'll just say Happy New Year and thanks for your support in 2025! 🥂 These were the new features added since the last update in October. Scour now identifies articles that are paywalled and indicates them with a yellow dollar sign next to the domain. In your settings , you can opt to hide paywalled content. If you do, you can also exempt specific domains where you have a subscription so you will see their content even if it is behind the paywall. Thank you to Johnny and Allen for requesting this feature! For anyone interested in the technical details, I wrote a blog post about a neat SQL trick I came across while building this: Short-Circuiting Correlated Subqueries in SQLite . You can also now block content from specific websites. The option to block a domain can be found by clicking the "..." button below each post. You can see and manage your excluded domains in your settings . Thanks to Vahe for this suggestion! If you subscribe to specific feeds (as opposed to scouring all of them), Scour will now recommend other sources for you to follow right in your personalized feed. These recommendations are based on Scour looking for content that matches your interests that you aren't currently getting. You can find more recommendations on your Feeds page . Each feed also now displays its three most recent posts below its description to make it easier to know what you'll get if you subscribe. You can click on the feed's title to see all of the posts from that feed. Thanks to Tiago for this suggestion! By default, clicking on a link to a post will bring you to the original website where it was published. However, if you prefer to read it on Scour, you can read the Preview, which can be found in the "..." menu under each post. Thanks to Linh for this suggestion! 
The filter menu for your feed (accessible via the button next to where it says Your Top Finds) should be clearer and more mobile-friendly. You can filter by time range and toggle between seeing posts from feeds you’ve subscribed to or see posts from everyone’s feeds. Thanks Stefan for the feedback on this! A number of people have told me that they are confused about how the love/like/dislike reactions are used on Scour. I'll work on making this clearer in the future but in the meantime, there's now a section in the FAQs about this. The answer is: Loves and likes are saved to your Likes page, so you can use them to bookmark interesting content. Unlike most content aggregators, Scour does not use reactions to change what shows up in your feed. Instead, reactions are used to generate Interest Recommendations for you. Scour only shows content related to topics you've explicitly chosen. You can also subscribe to other users' Likes as feeds. Everyone's reactions contribute to the Popular Posts page. Here were some of my favorite posts I found on Scour in November and December: Thanks to everyone who wrote about Scour on their blog or website in 2025! This included: If you write about Scour in the future, or if you already did and I didn't include you, please let me know! Thank you to everyone who provided feedback on Scour this year! Specifically, thank you to Aaron, Alberto, Alex K, Alex W, Allen, Andrew D, Andrew M, Andy M, Andy P, Cairin, Cole, Daniel, Elyem, Hary, Imperfect, Jadi, Jeppe, Jesse, Johnny, Jon, Karit, Kilpatrj, Linh, Proudmuslim-dev, Ryan, Sarah, Stefan, Tiago, Tomáš, Tyler, and Vahe. And thank you to all of the anonymous feedback givers as well! Because you made it to the end of the post, here's a little preview of an upcoming feature for you. Let's say you want to only see posts from small websites, like individuals' blogs. You can now try filtering your feed by how many posts each website or feed publishes per month. 
For example, you can use these links to see only posts from quieter domains or quieter feeds. Or, you can try this one to only see articles from larger websites. Let me know what you think! UI for controlling these filters is coming soon! Happy New Year and happy Scouring! - Evan
Scour scoured 9,940,460 posts from 15,608 feeds
1,013 new users signed up (welcome!!)
12,620 interests were added, with 6,688 of those from recommendations
26,702 posts were read, 3,023 were liked, and 383 were loved
55 suggestions on the feedback board were completed
Paper AI Tigers
Build / Buy / Bot
More databases should be single-threaded
Disks Lie: Building a WAL that actually survives
Minsuk Kang: Scour and minifeed are 100X better than Instagram and X (January)
Winther: Blog Discovery (June)
Daniel Prindii: My Read it later and discoverability systems in 2025 (July)
PPC Land: Developer revives RSS with AI while Google targets syndication infrastructure (August)
Tomáš Burkert: RSS feeds discovery strategies (October)
Alex White: Discovering the Indie Web (November)
Matt Maldre: Search engine for blogs (November)
Andrew Doran: Tools for discovering the IndieWeb (December)

0 views
Grumpy Gamer 4 months ago

Sqlite Comments

When I started using Hugo for static site generation I lost the ability to have comments and we all know how supportive the Internet can be, so why wouldn’t you have comments? I wrote a few PHP scripts that I added on to Hugo and I had comments again. I decided to store the comments as flat files so I didn’t complicate things by needing the bloated MySQL. I wanted to keep it as simple and fast as possible. When a comment was added, my PHP script created a directory (if needed) for the post and saved the comment out as a .json file named with the current time to make sorting easy. When the blog page was displayed, these files (already sorted thanks to the filename) were loaded and displayed. And it all worked well until it didn’t. Flat files are simple, but they can be hard to search or maintain if they need cleaning up or need to be dealt with after a spam attack. I figured I’d use command-line tools to do all of that, but it’s a lot more cumbersome than I first thought. I missed having them in a SQL database. I didn’t want to install MySQL again, but my site doesn’t get a lot of commenting traffic so I could use Sqlite instead. The downside is Sqlite write-locks the database while a write is happening. In my case it’s a fraction of a second and wouldn’t be an issue. The second problem I had was the version of Ubuntu my server was using was 5 years old and some of the packages I wanted weren’t available for it. I tried to update Ubuntu and for reasons I don’t fully understand I couldn’t. So I spun up a new server. Since grumpygamer.com is a static site I only had to install Apache and I was off and running. Fun times. But the comment flat files still bugged me and I thought I’d use this as an opportunity to convert over to Sqlite. PHP/Apache comes with Sqlite already installed, so that’s easy. A long weekend and I rewrote the code to save comments and everything is back and working. Given that a webserver and PHP already needed to be installed, it isn’t a big deal to use Sqlite. 
If you’re not comfortable with SQL, it might be harder but I like SQL.
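The post doesn't show the PHP code or the schema, so the following is only a sketch, in Python rather than PHP, of what moving timestamp-named comment files into Sqlite could look like. The table, column, and function names are all assumptions made for illustration.

```python
import sqlite3
import time


def open_comments_db(path=":memory:"):
    """Open (or create) the comments database. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS comments (
            id         INTEGER PRIMARY KEY,
            post_slug  TEXT NOT NULL,  -- stands in for the per-post directory
            author     TEXT NOT NULL,
            body       TEXT NOT NULL,
            created_at REAL NOT NULL   -- stands in for the timestamp filename
        )
        """
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_comments_post "
        "ON comments (post_slug, created_at)"
    )
    return conn


def add_comment(conn, post_slug, author, body, created_at=None):
    """Insert one comment; the write lock is held only for this transaction."""
    if created_at is None:
        created_at = time.time()
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO comments (post_slug, author, body, created_at) "
            "VALUES (?, ?, ?, ?)",
            (post_slug, author, body, created_at),
        )


def comments_for_post(conn, post_slug):
    """Return (author, body) pairs for a post, oldest first."""
    rows = conn.execute(
        "SELECT author, body FROM comments "
        "WHERE post_slug = ? ORDER BY created_at, id",
        (post_slug,),
    )
    return rows.fetchall()


def delete_by_author(conn, author):
    """Spam cleanup: remove every comment by one author; return the count."""
    with conn:
        cur = conn.execute("DELETE FROM comments WHERE author = ?", (author,))
    return cur.rowcount
```

The cleanup that was cumbersome with flat files becomes a single `DELETE` here, and sorting moves from the filename into an `ORDER BY`.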

0 views
Ludicity 4 months ago

Merry Christmas, Ya Filthy Animals (2025)

It’s my last day of writing for the year, so I’m going to try to keep this one quick – it was knocked out over three hours, so I hope you can forgive me if it’s a bit clumsier than my usual writing. For some strange reason, one of the few clear memories I have from growing up in Malaysia is a particular moment when I was seven years old. It was the first day of school for the year, and I was studying at Sekolah Kebangsaan Batu Lanchang, which in English is “Batu Lanchang National School”. When you’re seven years old, being told that you’ve got to wait an hour to see your best friend is an insurmountable obstacle. It feels like forever. The year it would take to go up another grade is, accordingly, so long that it’s not even imaginable. I recall thinking, probably in simpler language, “I probably won’t make it to eight years old. A year is way too much time for something random to happen. I’ll get hit by a car or something.” To round out this brief moment of uncharacteristic sobriety, it is very likely that my next thought was “Blastoise is obviously a better Pokemon than Venusaur and Charizard because he has big cannons.” Now I’m 31 and the years are flying by so fast that I have to desperately seize their trailing collars so that I don’t suddenly find myself seventy without noticing. So, as is becoming tradition, what happened this year? But hell I'm just a blind man on the plains, I drink my water when it rains, And live by chance among the lightning strikes. – Burden of Tomorrow, Tallest Man on Earth For the first time, I have absolutely no idea what’s going to happen next year. Don’t get me wrong, I’ve never been right about what’s going to happen next year, but I’ve always thought I had an idea. I thought I was going to be a failing student in Southeast Asia because I hated mathematics, maybe become a journalist, then presumably a corpse with 2.5 children. 
If we run the tape back, I ended up in Australia, getting deeply into sabre fencing, and somehow became an extremely un-failed student in psychology and then statistics, before earning a bajillion dollars in software. Then I was locked in an apartment for a year by a global pandemic, somehow became at least a not-totally-unknown writer, threw away the bajillion dollars to the absolute horror of my very conservative all-doctor family, got into improv theatre, rejected a book deal, and started a software consultancy. At the start of every year until now, I’ve had some sort of plausible social script for how the year was going to go, and it has never, ever gone that way, but I nonetheless allowed myself a fresh misconception on January 1st. This year, I really have no idea, and I’m not going to bother wasting any time trying to figure it out. Not a clue what’s going to happen next year. Maybe I’ll start a recruiting agency and make a million dollars. Maybe I’ll get to February and run into a crippling illness. It’s very freeing in a lot of ways, particularly for someone that got into the works of Taleb at a young age. I surf chaos full time now. For example, despite all the marketing work we did, our biggest contract this year happened because I got a message on LinkedIn about a data engineering job, which would normally have been totally unsuitable for a whole company to work on. Except over a year ago, I was asked to get coffee with Mel Kendell and Martin Foster, who allowed me to give a talk at a Meetup where I met Dan Prager and Martin Chesbrough 1, and they both worked with the messenger. I scarcely bother to plan anymore. I’ll either earn a million dollars or become homeless next year. I don’t know, I don’t care, bring on the lightning strikes. ’…I have been "in denial" for some time, knowingly burning the candle at both ends and finding that it often gives a lovely light. 
But for precisely this reason, I can't see myself smiting my brow with shock or hear myself whining about how it's all so unfair: I have been taunting the Reaper into taking a free scythe in my direction and have now succumbed to something so predictable and banal that it bores even me.’ – Christopher Hitchens I’m at the age now where the older people in my life are starting to get sick. Me and my friends find ourselves in hospital lobbies more often – there are emergency flights, sterile hospital waiting rooms, and trying to figure out what it means when a doctor says that someone is “stable”. Relax, I’m not going to be a downer. I know I don’t have to explain how it goes. And if I do, you’re probably twenty, in which case hoo boy, you have got some experiences headed your way. What I’m coming to terms with is the fact that we’re all running out the clock in one way or another. We know, to some level of precision, how the story goes. The general term for this is mortality salience, i.e., the realization that there’s a hospital bed or worse at the end of the rainbow, and we’ve got to make do with the time we’ve got. Sometimes it can be a little bit confronting, but it clarifies things too. There’s a story from David Whyte that I absolutely adore, about a conversation he had with his best friend before said friend passed 2. We were towards the end of our meal on the Saturday evening, and I was in a kind of reverie. I was realizing that I needed to help my father out. Almost to the ceiling, I spoke out: “You know, my dad’s in a bit of trouble. I’m thinking of giving him some money.” John immediately leaned across the table and said, “How much are you thinking of giving him?” My father was in Yorkshire, in England. “I dunno, one thousand pounds,” thinking I was being very generous. John looked at me and said, “Go against yourself. Give two.” I said, “Thank you, John” (laughing). 
“A friend in need is a friend indeed.” Then John looked at me again and said, “Go against yourself again. Give four .” I took a big gulp. Part of the spirit of our meetings — these philosophical writing meetings, walking meetings — was to push each other. In the spirit of that, I said, “I will,” and we shook hands across the table. John made sure we were committed. Sure enough, I went away and gave the four thousand to my father, and you could have knocked him down with a feather. I was often giving him money, but in drips and drabs. He was always falling down financial holes and having to climb out again, and I’d have to help him climb out. But this four thousand actually transformed his financial life, because he was able to sort himself out — and he never fell down the hole again. I said to myself, “Wasn’t that a great thing for a friend to do?” One of the qualities that lies at the heart of friendship is encouraging your friend to be the best part of themselves — to be more generous — and to be a witness to that. John had done that for me, and I thought, “Wasn’t that a marvelous thing for a friend to do?” It wasn’t nine months later that we were at dinner again. John had obviously forgotten about this conversation, because towards the end of the meal he looked towards the ceiling and said, “You know, I have a good friend in a bit of financial trouble, and I’m thinking of giving him some money.” I said, “How much are you thinking of giving, John?” He shook his head. “I dunno — one thousand euros.” He was in Ireland. I looked at him and said, “Go against yourself, John. Give two.” John looked back across the table and said, “Jesus, Holy and Saint Mary, Joseph — tonight I’m in this for four.” I’ve done this a few times this year — that is, quadrupling the money I've sent someone that needed it — which may prove to be unwise in 2026 if our revenue dries up, but for now I don’t have any regrets. 
I was raised up believing I was somehow unique Like a snowflake distinct among snowflakes Unique in each way you'd conceive And now after some thinking I'd say I'd rather be A functioning cog in some great machinery Serving something beyond me – Helplessness Blues, Fleet Foxes Two years ago, I was adamant that I wanted to “make it alone” when I started a business, largely because I wanted to lay out a blueprint for anyone to succeed, not just people with well-known blogs. This has quickly turned out to be utterly ridiculous. No one makes it alone. I’ve been on the receiving end of a huge amount of generosity for the past two years, and it would be ridiculous to pretend that isn’t the case, or that it’s even possible to succeed without that being the case. I was briefly tempted to start listing all the people that have helped me out this year, from helping me keep sharp on my software skills, to editing help, to preventing me from making horrible contractual blunders, but I realized that it would take me literal hours even if all I was doing was writing down their names and pasting links to their websites in. I’m generally pretty good at accepting help, but a big lesson from this year is to lean into it entirely. It’s going to be insufferable. Every other post is going to be banging on your doors demanding help with sales, obscure programming questions, and book recommendations. I've been holding back a lot of my thoughts on things because I was experiencing the first-world problem of being self-conscious about having too much writing go viral. I'm not going to worry about this next year, and oh boy, have I got thoughts on things. I don't like being told that it's my duty to love my enemies. No, we have to hate our enemies and try to destroy them before they destroy us. – Christopher Hitchens Next year, I’m going to try and put my enemies in the dirt. Earlier this year, we had a very unpleasant run-in with a competing consultancy in Melbourne. 
They have far more staff than us, but were running years late on their deliverables, were putting small, greenfield clinics on SSIS in 2025 3, and had a contract that said they owned all the SQL in client systems so the clients could never migrate away. When it happened, I really wanted to stick the knife in. It was very much everything I’m opposed to in the industry – at best, incompetents, at worst, grifters, taking advantage of medical institutions. With a bit of distance, I’ve realized that the people there didn’t even know they were doing a bad job. As far as they were concerned, SSIS is state-of-the-art, and the fact that they didn’t have to learn how to code was pure upside, and every project they had ever been on was late so they weren’t being particularly ineffective. Sure, I’ve run into people with actual monstrous views about making money – an executive told me on Christmas Eve that there’s no room for ethics in a business 4 – but my enemy is generally not individual people. It’s the ideas and systems that create people with worldviews so comprehensively myopic. I am probably not going to be able to destroy them by taking all of their business in one swoop – it’s hard to compete with people that will lie for sales, advertently or not. Nor will I be able to have an impact if I do what I’ve been doing this year – paying my team a good wage, with no intention of ever growing. So what is there to do, if I’m not happy just giving myself a ton of money and watching the world slowly erode? Going into the next year, I want to grow our team until we have enough leverage to make hires, develop our own philosophy of engineering, and lay out a blueprint for how to run an ethical, human-oriented business for other people to follow. 
There is some size where every business turns evil – even mine would turn evil, if we got large enough to become acquired and our founding team quit – so all I can think of is to lay out the playbook for your peers to kill you during the full moon, when you’re selling Azure consulting and GenAI SEO platforms. What does that actually mean? I have no idea, we’ll figure it out somehow. All I know right now is that the goal is to make sure that everyone on my team is compensated around their corporate salaries by the end of 2026, that we’re in a position to comfortably support a few good people who need a good place to work by 2027, and that we have enough about the process documented that anyone with a bit of fearlessness can replicate our process. Then I’m going to stick in the knife and take all their business. I have had it up to here just having to watch Musk-and-Altman-types flounce around, lying and absolutely fucking everything up, and if I need to start obtaining a huge pile of money to engage them in mortal combat on the astral plane, then fine, someone needs to get on this. The theme for next year is generosity and preparation for economic damage. In any case, I hope you all had a great Christmas, are headed into a great new year, and that you also decide to choose violence in 2026. I’ve been talking to Martin for over a year, and I swear to God, he has told me that he’s the “Principal Intern” at Everest Engineering every single time. I have no idea what he does even though my team is literally subcontracted through Everest and I am what passes for our CEO. ↩ I had to spend ages tracking this down and transcribing it by hand – I think it’s the only full transcription of the story on the internet – so you’d better enjoy it, okay? This is, genuinely, a Christmas present from me to you. I paid a gross amount of money to even access the audio. ↩ For the non-data-engineers in the audience, all the specialists who read this just visibly winced. 
↩ This is true if you’re incompetent and have no leverage. Sucks to suck. ↩

0 views