Latest Posts (20 found)

Video + notes on upgrading a Datasette plugin for the latest 1.0 alpha, with help from uv and OpenAI Codex CLI

I'm upgrading various plugins for compatibility with the new Datasette 1.0a20 alpha release and I decided to record a video of the process. This post accompanies that video with detailed additional notes.

I picked a very simple plugin to illustrate the upgrade process (possibly too simple). datasette-checkbox adds just one feature to Datasette: if you are viewing a table with boolean columns (detected as integer columns with boolean-style names) and your current user has permission to update rows in that table, it adds an inline checkbox UI. I built the first version with the help of Claude back in August 2024 - details in this issue comment. Most of the implementation is JavaScript that makes calls to Datasette 1.0's JSON write API. The Python code just checks that the user has the necessary permissions before including the extra JavaScript.

The first step in upgrading any plugin is to run its tests against the latest Datasette version. Thankfully uv makes it easy to run code in scratch virtual environments that include the different code versions you want to test against. I have a test utility (short for "test against development Datasette") which I use for that purpose. I can run it in any plugin directory and it will run the existing plugin tests against whatever version of Datasette I have checked out on my machine. You can see the full implementation of that utility (and its companion described below) in this TIL.

I started by running it in the datasette-checkbox directory, and got my first failure... but it wasn't due to permissions, it was because the plugin's Datasette dependency was pinned to a specific mismatched version of Datasette. I fixed that pin and ran the tests again... and they passed! Which was a problem, because I was expecting permission-related failures. It turns out when I first wrote the plugin I was lazy with the tests - they weren't actually confirming that the table page loaded without errors. I needed to actually run the code myself to see the expected bug.

First I created myself a demo database using sqlite-utils create-table, then ran Datasette against the plugin's code. Sure enough, visiting the table page produced a 500 error about a missing permissions method. The next step was to update the test to also trigger this error - and now the test fails as expected.

At this point I could have manually fixed the plugin itself - which would likely have been faster given the small size of the fix - but instead I demonstrated a bash one-liner I've been using to apply these kinds of changes automatically: codex exec runs OpenAI Codex in non-interactive mode, looping until it has finished the prompt you give it. I told it to consult the subset of the Datasette upgrade documentation that talks about Datasette permissions and then get the test command to pass. This is an example of what I call designing agentic loops - I gave Codex the tools it needed and a clear goal and let it get to work on my behalf.

The remainder of the video covers finishing up the work - testing the fix manually, committing my work, then shipping a 0.1a4 release to PyPI using the pattern described in this TIL. Finally, I demonstrated that the shipped plugin worked in a fresh environment: the command I ran installs and runs a fresh Datasette instance with a fresh copy of the new alpha plugin. It's a neat way of confirming that freshly released software works as expected.
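For anyone who wants to replicate that last verification step, here's a sketch of the kind of uv one-liner that does it - the exact command from the video isn't reproduced here, so treat the version pins as illustrative:

```bash
# Run Datasette 1.0a20 in a throwaway environment with the freshly
# released alpha of the plugin installed alongside it.
uvx --from 'datasette==1.0a20' --with 'datasette-checkbox==0.1a4' \
  datasette data.db
```

uvx creates a scratch virtual environment, installs the named packages into it and runs the command, so nothing from your local development checkouts can leak into the test.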
This video was shot in a single take using Descript, with no rehearsal and perilously little preparation in advance. I recorded through my AirPods and applied the "Studio Sound" filter to clean up the audio. I pasted in a closing slide from my previous video and exported it locally at 1080p, then uploaded it to YouTube. Something I learned from the Software Carpentry instructor training course is that making mistakes in front of an audience is actively helpful - it helps them see a realistic version of how software development works and they can learn from watching you recover. I see this as a great excuse for not editing out all of my mistakes! I'm trying to build new habits around video content that let me produce useful videos while minimizing the amount of time I spend on production. I plan to iterate more on the format as I get more comfortable with the process. I'm hoping I can find the right balance between production time and value to viewers.

Simon Willison 2 days ago

Code research projects with async coding agents like Claude Code and Codex

I've been experimenting with a pattern for LLM usage recently that's working out really well: asynchronous code research tasks. Pick a research question, spin up an asynchronous coding agent and let it go and run some experiments and report back when it's done.

Software development benefits enormously from something I call code research. The great thing about questions about code is that they can often be definitively answered by writing and executing code. I often see questions on forums which hint at a lack of understanding of this skill. "Could Redis work for powering the notifications feed for my app?" is a great example. The answer is always "it depends", but a better answer is that a good programmer already has everything they need to answer that question for themselves. Build a proof-of-concept, simulate the patterns you expect to see in production, then run experiments to see if it's going to work.

I've been a keen practitioner of code research for a long time. Many of my most interesting projects started out as a few dozen lines of experimental code to prove to myself that something was possible. It turns out coding agents like Claude Code and Codex are a fantastic fit for this kind of work as well. Give them the right goal and a useful environment and they'll churn through a basic research project without any further supervision.

LLMs hallucinate and make mistakes. This is far less important for code research tasks because the code itself doesn't lie: if they write code and execute it and it does the right things then they've demonstrated to both themselves and to you that something really does work. They can't prove something is impossible - just because the coding agent couldn't find a way to do something doesn't mean it can't be done - but they can often demonstrate that something is possible in just a few minutes of crunching.

I've used interactive coding agents like Claude Code and Codex CLI for a bunch of these, but today I'm increasingly turning to their asynchronous coding agent family members instead. An asynchronous coding agent is a coding agent that operates on a fire-and-forget basis. You pose it a task, it churns away on a server somewhere and when it's done it files a pull request against your chosen GitHub repository. OpenAI's Codex Cloud, Anthropic's Claude Code for web, Google Gemini's Jules, and GitHub's Copilot coding agent are four prominent examples of this pattern.

These are fantastic tools for code research projects. Come up with a clear goal, turn it into a few paragraphs of prompt, set them loose and check back ten minutes later to see what they've come up with. I'm firing off 2-3 code research projects a day right now. My own time commitment is minimal and they frequently come back with useful or interesting results.

You can run a code research task against an existing GitHub repository, but I find it's much more liberating to have a separate, dedicated repository for your coding agents to run their projects in. This frees you from being limited to research against just code you've already written, and also means you can be much less cautious about what you let the agents do. I have two repositories that I use for this - one public, one private. I use the public one for research tasks that have no need to be private, and the private one for anything that I'm not yet ready to share with the world. The biggest benefit of a dedicated repository is that you don't need to be cautious about what the agents operating in that repository can do.
Both Codex Cloud and Claude Code for web default to running agents in a locked-down environment, with strict restrictions on how they can access the network. This makes total sense if they are running against sensitive repositories - a prompt injection attack of the lethal trifecta variety could easily be used to steal sensitive code or environment variables. If you're running in a fresh, non-sensitive repository you don't need to worry about this at all! I've configured my research repositories for full network access, which means my coding agents can install any dependencies they need, fetch data from the web and generally do anything I'd be able to do on my own computer.

Let's dive into some examples. My public research repository is at simonw/research on GitHub. It currently contains 13 folders, each of which is a separate research project. I only created it two weeks ago so I'm already averaging nearly one a day! It also includes a GitHub Workflow which uses GitHub Models to automatically update the README file with a summary of every new project, using Cog, LLM, llm-github-models and this snippet of Python.

Here are some example research projects from the repo.

node-pyodide shows an example of a Node.js script that runs the Pyodide WebAssembly distribution of Python inside it - yet another of my ongoing attempts to find a great way of running Python in a WebAssembly sandbox on a server.

python-markdown-comparison (transcript) provides a detailed performance benchmark of seven different Python Markdown libraries. I fired this one off because I stumbled across cmarkgfm, a Python binding around GitHub's Markdown implementation in C, and wanted to see how it compared to the other options. This one produced some charts! cmarkgfm came out on top by a significant margin. Here's the entire prompt I used for that project:

Create a performance benchmark and feature comparison report on PyPI cmarkgfm compared to other popular Python markdown libraries - check all of them out from github and read the source to get an idea for features, then design and run a benchmark including generating some charts, then create a report in a new python-markdown-comparison folder (do not create a _summary.md file or edit anywhere outside of that folder). Make sure the performance chart images are directly displayed in the README.md in the folder.

Note that I didn't specify any Markdown libraries other than cmarkgfm - Claude Code ran a search and found the other six by itself.

cmarkgfm-in-pyodide is a lot more fun. A neat thing about having all of my research projects in the same repository is that new projects can build on previous ones. Here I decided to see how hard it would be to get cmarkgfm - which has a C extension - working inside Pyodide inside Node.js. Claude successfully compiled an 88.4KB file with the necessary C extension and proved it could be loaded into Pyodide in WebAssembly inside of Node.js. I ran this one using Claude Code on my laptop after an initial attempt failed. The starting prompt was:

Figure out how to get the cmarkgfm markdown lover [typo in prompt, this should have been "library" but it figured it out anyway] for Python working in pyodide. This will be hard because it uses C so you will need to compile it to pyodide compatible webassembly somehow. Write a report on your results plus code to a new cmarkgfm-in-pyodide directory.
Test it using pytest to exercise a node.js test script that calls pyodide as seen in the existing node.js and pyodide directory. There is an existing branch that was an initial attempt at this research, but which failed because it did not have Internet access. You do have Internet access. Use that existing branch to accelerate your work, but do not commit any code unless you are certain that you have successfully executed tests that prove that the pyodide module you created works correctly.

This one gave up halfway through, complaining that emscripten would take too long. I told it:

Complete this project, actually run emscripten, I do not care how long it takes, update the report if it works

It churned away for a bit longer and complained that the existing Python library used CFFI which isn't available in Pyodide. I asked it:

Can you figure out how to rewrite cmarkgfm to not use FFI and to use a pyodide-friendly way of integrating that C code instead?

... and it did. You can see the full transcript here.

blog-tags-scikit-learn: taking a short break from WebAssembly, I thought it would be fun to put scikit-learn through its paces on a text classification task against my blog:

Work in a new folder called blog-tags-scikit-learn. Download my blog's content - a SQLite database. Take a look at the blog_entry table and the associated tags - a lot of the earlier entries do not have tags associated with them, where the later entries do. Design, implement and execute models to suggest tags for those earlier entries based on textual analysis against later ones. Use Python scikit-learn and try several different strategies. Produce JSON of the results for each one, plus scripts for running them and a detailed markdown description. Also include an HTML page with a nice visualization of the results that works by loading those JSON files.

This resulted in seven files, including four results files and a detailed report. (It ignored the bit about an HTML page with a nice visualization for some reason.) Not bad for a few moments of idle curiosity typed into my phone!

That's just three of the thirteen projects in the repository so far. The commit history for each one usually links to the prompt and sometimes the transcript if you want to see how they unfolded. More recently I added a short instructions file to the repo with a few extra tips for my research agents. You can read that here.

My preferred definition of AI slop is AI-generated content that is published without human review. I've not been reviewing these reports in great detail myself, and I wouldn't usually publish them online without some serious editing and verification. I want to share the pattern I'm using though, so I decided to keep them quarantined in this one public repository. A tiny feature request for GitHub: I'd love to be able to mark a repository as "exclude from search indexes" such that its pages get noindex tags. I still like to keep AI-generated content out of search, to avoid contributing more to the dead internet.

It's pretty easy to get started trying out this coding agent research pattern. Create a free GitHub repository (public or private) and let some agents loose on it and see what happens. You can run agents locally but I find the asynchronous agents to be more convenient - especially as I can run them (or trigger them from my phone) without any fear of them damaging my own machine or leaking any of my private data. Claude Code for web offers $250 of free credits for its $20/month users for a limited time (until November 18, 2025).
Gemini Jules has a free tier. There are plenty of other coding agents you can try out as well. Let me know if your research agents come back with anything interesting!
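If you're curious what the code inside one of these research folders tends to look like, here's a minimal sketch of a Markdown benchmark in the spirit of the python-markdown-comparison project (this is my illustration, not the repo's actual code - the library list and timings there were chosen by Claude):

```python
import time

import cmarkgfm   # GitHub's cmark fork, with a C extension
import markdown   # the pure-Python "markdown" package


def bench(name, convert, text, runs=200):
    convert(text)  # warm up once before timing
    start = time.perf_counter()
    for _ in range(runs):
        convert(text)
    elapsed = time.perf_counter() - start
    print(f"{name}: {runs / elapsed:,.0f} documents/second")


TEXT = "# Heading\n\nSome *sample* text with a [link](https://example.com).\n" * 50

bench("cmarkgfm", cmarkgfm.github_flavored_markdown_to_html, TEXT)
bench("markdown", markdown.markdown, TEXT)
```

The real project went further, generating charts and a feature comparison as described in the prompt above.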

Simon Willison 3 days ago

A new SQL-powered permissions system in Datasette 1.0a20

Datasette 1.0a20 is out with the biggest breaking API change on the road to 1.0, improving how Datasette's permissions system works by migrating permission logic to SQL running in SQLite. This release involved 163 commits, with 10,660 additions and 1,825 deletions, most of which was written with the help of Claude Code.

Datasette's permissions system exists to answer the following question: Is this actor allowed to perform this action, optionally against this particular resource? An actor is usually a user, but might also be an automation operating via the Datasette API. An action is a thing they need to do - things like view-table, execute-sql, insert-row. A resource is the subject of the action - the database you are executing SQL against, the table you want to insert a row into.

Datasette's default configuration is public but read-only: anyone can view databases and tables or execute read-only SQL queries but no-one can modify data. Datasette plugins can enable all sorts of additional ways to interact with databases, many of which need to be protected by a form of authentication. Datasette 1.0 also includes a write API, which needs a way to configure who can insert, update, and delete rows or create new tables. Actors can be authenticated in a number of different ways provided by plugins using the actor_from_request() plugin hook. datasette-auth-passwords and datasette-auth-github and datasette-auth-existing-cookies are examples of authentication plugins.

The previous implementation included a design flaw common to permissions systems of this nature: each permission check involved a function call which would delegate to one or more plugins and return a True/False result. This works well for single checks, but has a significant problem: what if you need to show the user a list of things they can access, for example the tables they can view? I want Datasette to be able to handle potentially thousands of tables - tables in SQLite are cheap! I don't want to have to run 1,000+ permission checks just to show the user a list of tables.

Since Datasette is built on top of SQLite we already have a powerful mechanism to help solve this problem. SQLite is really good at filtering large numbers of records. The biggest change in the new release is that I've replaced the previous plugin hook - which let a plugin determine if an actor could perform an action against a resource - with a new permission_resources_sql(actor, action) plugin hook. Instead of returning a True/False result, this new hook returns a SQL query that returns rules helping determine the resources the current actor can execute the specified action against. The example in the documentation grants the actor with ID "alice" permission to view the "sales" table in the "accounting" database. The returned SQL should always produce four columns: a parent, a child, an allow (1 or 0), and a reason string for debugging.

When you ask Datasette to list the resources an actor can access for a specific action, it will combine the SQL returned by all installed plugins into a single query that joins against the internal catalog tables and efficiently lists all the resources the actor can access. This query can then be limited or paginated to avoid loading too many results at once.
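The documentation example itself isn't reproduced in this post, but based on that description a hook looks roughly like this (a sketch - the exact signature and return value in the alpha may differ, so consult the plugin hook documentation):

```python
from datasette import hookimpl


@hookimpl
def permission_resources_sql(actor, action):
    # Sketch: grant the actor with ID "alice" the ability to view
    # the "sales" table in the "accounting" database.
    if action != "view-table":
        return None
    if not actor or actor.get("id") != "alice":
        return None
    # The SQL must return four columns: parent, child,
    # allow (1 or 0) and a reason string for debugging.
    return """
        select
            'accounting' as parent,
            'sales' as child,
            1 as allow,
            'alice can view accounting/sales' as reason
    """
```

Datasette can union queries like this one from every installed plugin and join the result against its internal catalog tables, which is what makes the "list everything this actor can see" operation a single efficient SQL query.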
Datasette has several additional requirements that make the permissions system more complicated. Datasette permissions can optionally act against a two-level hierarchy: you can grant a user the ability to insert-row against a specific table, or every table in a specific database, or every table in every database in that Datasette instance. Some actions can apply at the table level, others the database level and others only make sense globally - enabling a new feature that isn't tied to tables or databases, for example. Datasette currently has ten default actions, but plugins that add additional features can register new actions to better participate in the permissions system.

Datasette's permission system has a mechanism to veto permission checks - a plugin can return a deny for a specific permission check which will override any allows. This needs to be hierarchy-aware - a deny at the database level can be outvoted by an allow at the table level.

Finally, Datasette includes a mechanism for applying additional restrictions to a request. This was introduced for Datasette's API - it allows a user to create an API token that can act on their behalf but is only allowed to perform a subset of their capabilities - just reading from two specific tables, for example. Restrictions are described in more detail in the documentation.

That's a lot of different moving parts for the new implementation to cover. Since permissions are critical to the security of a Datasette deployment it's vital that they are as easy to understand and debug as possible. The new alpha adds several new debugging tools, including a page that shows the full list of resources matching a specific action for the current user, and another page listing the rules that apply to that question - since different plugins may return different rules which get combined together. That second page illustrates two of Datasette's built-in rules: there is a default allow for read-only operations such as view-table (which can be over-ridden by plugins) and another rule that says the root user can do anything (provided Datasette was started with the --root option). Those rules are defined in the datasette/default_permissions.py Python module.

There's one question that the new system cannot answer: provide a full list of actors who can perform this action against this resource. It's not possible to provide this globally for Datasette because Datasette doesn't have a way to track what "actors" exist in the system. SSO plugins such as datasette-auth-github mean a new authenticated GitHub user might show up at any time, with the ability to perform actions despite the Datasette system never having encountered that particular username before. API tokens and actor restrictions come into play here as well. A user might create a signed API token that can perform a subset of actions on their behalf - the existence of that token can't be predicted by the permissions system. This is a notable omission, but it's also quite common in other systems. AWS cannot provide a list of all actors who have permission to access a specific S3 bucket, for example - presumably for similar reasons.

Datasette's plugin ecosystem is the reason I'm paying so much attention to ensuring Datasette 1.0 has a stable API. I don't want plugin authors to need to chase breaking changes once that 1.0 release is out. The Datasette upgrade guide includes detailed notes on upgrades that are needed between the 0.x and 1.0 alpha releases. I've added an extensive section about the permissions changes to that document. I've also been experimenting with dumping those instructions directly into coding agent tools - Claude Code and Codex CLI - to have them upgrade existing plugins for me.
This has been working extremely well. I've even had Claude Code update those notes itself with things it learned during an upgrade process! This is greatly helped by the fact that every single Datasette plugin has an automated test suite that demonstrates the core functionality works as expected. Coding agents can use those tests to verify that their changes have had the desired effect.

I've also been leaning heavily on uv to help with the upgrade process. I wrote myself two new helper scripts to help test the new plugins - their implementations can be found in this TIL. Some of my plugin upgrades have become a one-liner to the codex exec command, which runs OpenAI Codex CLI with a prompt without entering interactive mode. There are still a bunch more to go - there's a list in this tracking issue - but I expect to have the plugins I maintain all upgraded pretty quickly now that I have a solid process in place.

This change to Datasette core is by far the most ambitious piece of work I've ever attempted using a coding agent. Last year I agreed with the prevailing opinion that LLM assistance was much more useful for greenfield coding tasks than working on existing codebases. The amount you could usefully get done was greatly limited by the need to fit the entire codebase into the model's context window. Coding agents have entirely changed that calculation. Claude Code and Codex CLI still have relatively limited context windows - albeit larger than last year - but their ability to search through the codebase, read extra files on demand and "reason" about the code they are working with has made them vastly more capable. I no longer see codebase size as a limiting factor for how useful they can be.

I've also spent enough time with Claude Sonnet 4.5 to build a weird level of trust in it. I can usually predict exactly what changes it will make for a prompt. If I tell it "extract this code into a separate function" or "update every instance of this pattern" I know it's likely to get it right. For something like permission code I still review everything it does, often by watching it as it works since it displays diffs in the UI. I also pay extremely close attention to the tests it's writing. Datasette 1.0a19 already had 1,439 tests, many of which exercised the existing permission system. 1.0a20 increases that to 1,583 tests. I feel very good about that, especially since most of the existing tests continued to pass without modification.

I built several different proof-of-concept implementations of SQL permissions before settling on the final design. My research/sqlite-permissions-poc project was the one that finally convinced me of a viable approach. That one started as a free-ranging conversation with Claude, at the end of which I told it to generate a specification which I then fed into GPT-5 to implement. You can see that specification at the end of the README. I later fed the POC itself into Claude Code and had it implement the first version of the new Datasette system based on that previous experiment. This is admittedly a very weird way of working, but it helped me finally break through on a problem that I'd been struggling with for months.

Now that the new alpha is out my focus is upgrading the existing plugin ecosystem to use it, and supporting other plugin authors who are doing the same. The new permissions system unlocks some key improvements to Datasette Cloud concerning finely-grained permissions for larger teams, so I'll be integrating the new alpha there this week.
This is the single biggest backwards-incompatible change required before Datasette 1.0. I plan to apply the lessons I learned from this project to the other, less intimidating changes. I'm hoping this can result in a final 1.0 release before the end of the year!

A note on those two helper scripts: the first ("test against datasette dev") runs a plugin's existing test suite against the current development version of Datasette checked out on my machine, passing extra options through to pytest. The second ("run against datasette dev") runs the latest development version of Datasette with the plugin installed. When working on anything relating to plugins it's vital to have at least a few real plugins that you upgrade in lock-step with the core changes, and those two shortcuts were invaluable for productively working on those plugins while I made changes to core.

Some miscellaneous tips I picked up along the way:

Coding agents make experiments much cheaper. I threw away so much code on the way to the final implementation, which was psychologically easier because the cost to create that code in the first place was so low.

Tests, tests, tests. This project would have been impossible without that existing test suite. The additional tests we built along the way give me confidence that the new system is as robust as I need it to be.

Claude writes good commit messages now! I finally gave in and let it write these - previously I've been determined to write them myself. It's a big time saver to be able to say "write a tasteful commit message for these changes". Claude is also great at breaking up changes into smaller commits. It can also productively rewrite history to make it easier to follow, especially useful if you're still working in a branch.

A really great way to review Claude's changes is with the GitHub PR interface. You can attach comments to individual lines of code and then prompt Claude to address all of the comments on the PR. This is a very quick way to apply little nitpick changes - rename this function, refactor this repeated code, add types here and so on.

The code I write with LLMs is higher quality code. I usually find myself making constant trade-offs while coding: this function would be neater if I extracted this helper, it would be nice to have inline documentation here, changing this would be good but would break a dozen tests... for each of those I have to determine if the additional time is worth the benefit. Claude can apply changes so much faster than me that these calculations have changed - almost any improvement is worth applying, no matter how trivial, because the time cost is so low.

Internal tools are cheap now. The new debugging interfaces were mostly written by Claude and are significantly nicer to use and look at than the hacky versions I would have knocked out myself, if I had even taken the extra time to build them.

That trick with a Markdown file full of upgrade instructions works astonishingly well - it's the same basic idea as Claude Skills.
I maintain over 100 Datasette plugins now and I expect I'll be automating all sorts of minor upgrades in the future using this technique.
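As a sketch of what those automation one-liners look like (the URL placeholder and prompt wording are mine - the real commands point at the actual upgrade guide):

```bash
# Non-interactive Codex run: give it the upgrade notes and a
# verifiable goal, then let it loop until the tests pass.
codex exec "Read the permissions section of the Datasette upgrade \
guide at <URL>, then update this plugin for Datasette 1.0a20. \
Keep iterating until the test suite passes."
```

The key design here is the verifiable goal: because every plugin has a test suite, the agent can tell on its own when it has finished.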

Simon Willison 5 days ago

New prompt injection papers: Agents Rule of Two and The Attacker Moves Second

Two interesting new papers regarding LLM security and prompt injection came to my attention this weekend.

The first is Agents Rule of Two: A Practical Approach to AI Agent Security, published on October 31st on the Meta AI blog. It doesn't list authors but it was shared on Twitter by Meta AI security researcher Mick Ayzenberg. It proposes a "Rule of Two" that's inspired by both my own lethal trifecta concept and the Google Chrome team's Rule Of 2 for writing code that works with untrustworthy inputs:

At a high level, the Agents Rule of Two states that until robustness research allows us to reliably detect and refuse prompt injection, agents must satisfy no more than two of the following three properties within a session to avoid the highest impact consequences of prompt injection. [A] An agent can process untrustworthy inputs [B] An agent can have access to sensitive systems or private data [C] An agent can change state or communicate externally

It's still possible that all three properties are necessary to carry out a request. If an agent requires all three without starting a new session (i.e., with a fresh context window), then the agent should not be permitted to operate autonomously and at a minimum requires supervision --- via human-in-the-loop approval or another reliable means of validation.

It's accompanied by this handy diagram:

I like this a lot. I've spent several years now trying to find clear ways to explain the risks of prompt injection attacks to developers who are building on top of LLMs. It's frustratingly difficult. I've had the most success with the lethal trifecta, which boils one particular class of prompt injection attack down to a simple-enough model: if your system has access to private data, exposure to untrusted content and a way to communicate externally then it's vulnerable to private data being stolen.

The one problem with the lethal trifecta is that it only covers the risk of data exfiltration: there are plenty of other, even nastier risks that arise from prompt injection attacks against LLM-powered agents with access to tools which the lethal trifecta doesn't cover. The Agents Rule of Two neatly solves this, through the addition of "changing state" as a property to consider. This brings other forms of tool usage into the picture: anything that can change state triggered by untrustworthy inputs is something to be very cautious about.

It's also refreshing to see another major research lab concluding that prompt injection remains an unsolved problem, and attempts to block or filter them have not proven reliable enough to depend on. The current solution is to design systems with this in mind, and the Rule of Two is a solid way to think about that.

Update: On thinking about this further there's one aspect of the Rule of Two model that doesn't work for me: the Venn diagram above marks the combination of untrustworthy inputs and the ability to change state as "safe", but that's not right. Even without access to private systems or sensitive data that pairing can still produce harmful results. Unfortunately adding an exception for that pair undermines the simplicity of the "Rule of Two" framing!

Update 2: Mick Ayzenberg responded to this note in a comment on Hacker News:

Thanks for the feedback! One small bit of clarification, the framework would describe access to any sensitive system as part of the [B] circle, not only private systems or private data.
The intention is that an agent that has removed [B] can write state and communicate freely, but not with any systems that matter (wrt critical security outcomes for its user). An example of an agent in this state would be one that can take actions in a tight sandbox or is isolated from production.

The Meta team also updated their post to replace "safe" with "lower risk" as the label on the intersections between the different circles. I've updated my screenshots of their diagrams in this post, here's the original for comparison.

Which brings me to the second paper, The Attacker Moves Second. This paper is dated 10th October 2025 on Arxiv and comes from a heavy-hitting team of 14 authors - Milad Nasr, Nicholas Carlini, Chawin Sitawarin, Sander V. Schulhoff, Jamie Hayes, Michael Ilie, Juliette Pluto, Shuang Song, Harsh Chaudhari, Ilia Shumailov, Abhradeep Thakurta, Kai Yuanqing Xiao, Andreas Terzis, Florian Tramèr - including representatives from OpenAI, Anthropic, and Google DeepMind. The paper looks at 12 published defenses against prompt injection and jailbreaking and subjects them to a range of "adaptive attacks" - attacks that are allowed to expend considerable effort iterating multiple times to try and find a way through. The defenses did not fare well:

By systematically tuning and scaling general optimization techniques—gradient descent, reinforcement learning, random search, and human-guided exploration—we bypass 12 recent defenses (based on a diverse set of techniques) with attack success rate above 90% for most; importantly, the majority of defenses originally reported near-zero attack success rates.

Notably the "Human red-teaming setting" scored 100%, defeating all defenses. That red-team consisted of 500 participants in an online competition they ran with a $20,000 prize fund. The key point of the paper is that static example attacks - single string prompts designed to bypass systems - are an almost useless way to evaluate these defenses. Adaptive attacks are far more powerful, as shown by this chart. The three automated adaptive attack techniques used by the paper are:

Gradient-based methods - these were the least effective, using the technique described in the legendary Universal and Transferable Adversarial Attacks on Aligned Language Models paper from 2023.

Reinforcement learning methods - particularly effective against black-box models: "we allowed the attacker model to interact directly with the defended system and observe its outputs", using 32 sessions of 5 rounds each.

Search-based methods - generate candidates with an LLM, then evaluate and further modify them using LLM-as-judge and other classifiers.

The paper concludes somewhat optimistically:

[...] Adaptive evaluations are therefore more challenging to perform, making it all the more important that they are performed. We again urge defense authors to release simple, easy-to-prompt defenses that are amenable to human analysis. [...] Finally, we hope that our analysis here will increase the standard for defense evaluations, and in so doing, increase the likelihood that reliable jailbreak and prompt injection defenses will be developed.

Given how totally the defenses were defeated, I do not share their optimism that reliable defenses will be developed any time soon. As a review of how far we still have to go this paper packs a powerful punch. I think it makes a strong case for Meta's Agents Rule of Two as the best practical advice for building secure LLM-powered agent systems today in the absence of prompt injection defenses we can rely on.
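To make the Rule of Two concrete, here's a toy sketch of how an agent framework might gate autonomous operation on it (the property names and gating policy are my own illustration, not from either paper):

```python
from dataclasses import dataclass


@dataclass
class AgentSession:
    untrustworthy_inputs: bool  # [A] processes untrusted content
    sensitive_access: bool      # [B] sensitive systems or private data
    changes_state: bool         # [C] writes state / communicates externally


def requires_human_approval(session: AgentSession) -> bool:
    # The Agents Rule of Two: a session combining all three
    # properties should not run autonomously.
    properties = [
        session.untrustworthy_inputs,
        session.sensitive_access,
        session.changes_state,
    ]
    return sum(properties) >= 3


# A logged-in browser agent summarizing arbitrary web pages:
assert requires_human_approval(AgentSession(True, True, True))
# The same agent in "logged out mode" drops [B]:
assert not requires_human_approval(AgentSession(True, False, True))
```

Note that my earlier caveat still applies: a checker like this treats [A] plus [C] as acceptable, even though that pairing can still cause harm.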

Simon Willison 1 week ago

Hacking the WiFi-enabled color screen GitHub Universe conference badge

I'm at GitHub Universe this week (thanks to a free ticket from Microsoft). Yesterday I picked up my conference badge... which incorporates a full Raspberry Pi Pico microcontroller with a battery, color screen, WiFi and bluetooth.

GitHub Universe has a tradition of hackable conference badges - the badge last year had an eInk display. This year's is a huge upgrade though - a color screen and WiFi connection makes this thing a genuinely useful little computer! The only thing it's missing is a keyboard - the device instead provides five buttons total: Up, Down, A, B, C. It might be possible to get a bluetooth keyboard to work, though I'll believe that when I see it - there's not a lot of space on this device for a keyboard driver.

Everything is written using MicroPython, and the device is designed to be hackable: connect it to a laptop with a USB-C cable and you can start modifying the code directly on the device. Out of the box the badge will play an opening animation (implemented as a sequence of PNG image frames) and then show a home screen with six app icons. The default apps are mostly neat Octocat-themed demos: a flappy-bird clone, a tamagotchi-style pet, a drawing app that works like an etch-a-sketch, an IR scavenger hunt for the conference venue itself (this thing has an IR sensor too!), and a gallery app showing some images.

The sixth app is a badge app. This will show your GitHub profile image and some basic stats, but will only work if you dig out a USB-C cable and make some edits to the files on the badge directly. I did this on a Mac. I plugged a USB-C cable into the badge, which caused macOS to treat it as an attached drive volume. In that drive are several files, including a configuration file. Open that up, confirm the WiFi details are correct and add your GitHub username - the badge comes with the SSID and password for the GitHub Universe WiFi network pre-populated. That's it! Unmount the disk, hit the reboot button on the back of the badge and when it comes back up again the badge app should show your profile.

Here's the official documentation for building software for the badge. When I got mine yesterday the official repo had not yet been updated, so I had to figure this out myself. I copied all of the code across to my laptop, added it to a Git repo and then fired up Claude Code and told it to write up an explanation of how the code worked. Here's the result, which was really useful for getting a start on understanding how it all worked.

Each of the six default apps lives in a folder, for example apps/sketch/ for the sketching app. There's also a menu app which powers the home screen. That lives in apps/menu/. You can edit code in here to add new apps that you create to that screen. I told Claude to build me a seventh app for debugging the device. This was a bit of a long-shot, but it totally worked! The first version had an error: I OCRd a photo of the error message (with the Apple Photos app) and pasted it into Claude Code and it fixed the problem. This almost worked... but the addition of a seventh icon to the 2x3 grid meant that you could select the icon but it didn't scroll into view. I had Claude fix that for me too. Here's the code for apps/debug/__init__.py, and the full Claude Code transcript created using my terminal-to-HTML app described here. The finished debug app provides four screens of information.

The icons used by the apps are 24x24 pixels. I decided it would be neat to have a web app that helps build those icons, including the ability to start by creating an icon from an emoji. I built this one using Claude Artifacts.
Here's the result, now available at tools.simonwillison.net/icon-editor.

I noticed that last year's badge configuration app (which I can't find in github.com/badger/badger.github.io any more, I think they reset the history on that repo?) worked by talking to MicroPython over the Web Serial API from Chrome. Here's my archived copy of that code. Wouldn't it be useful to have a REPL in a web UI that you could use to interact with the badge directly over USB? I pointed Claude Code at a copy of that repo and told it to build exactly that. It took a bit of poking (here's the transcript) but the result is now live at tools.simonwillison.net/badge-repl. It only works in Chrome - you'll need to plug the badge in with a USB-C cable and then click "Connect to Badge".

If you're a GitHub Universe attendee I hope this is useful. The official badger.github.io site has plenty more details to help you get started. There isn't yet a way to get hold of this hardware outside of GitHub Universe - I know they had some supply chain challenges just getting enough badges for the conference attendees! It's a very neat device, built for GitHub by Pimoroni in Sheffield, UK. A version of this should become generally available in the future under the name "Pimoroni Tufty 2350".
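A small aside on those icons: if you'd rather script icon preparation than use the browser tool, a few lines of Pillow will get an image down to the right dimensions (a sketch - the exact pixel format the badge firmware expects isn't covered here):

```python
from PIL import Image

# Shrink any source image to the 24x24 size the home screen uses.
icon = Image.open("source.png").convert("RGB")
icon = icon.resize((24, 24), Image.LANCZOS)
icon.save("icon-24x24.png")
```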

Simon Willison 2 weeks ago

Video: Building a tool to copy-paste share terminal sessions using Claude Code for web

This afternoon I was manually converting a terminal session into a shared HTML file for the umpteenth time when I decided to reduce the friction by building a custom tool for it - and on the spur of the moment I fired up Descript to record the process. The result is this new 11 minute YouTube video showing my workflow for vibe-coding simple tools from start to finish.

The problem I wanted to solve involves sharing my Claude Code CLI sessions - and the more general problem of sharing interesting things that happen in my terminal. A while back I discovered (using my vibe-coded clipboard inspector) that copying and pasting from the macOS terminal populates a rich text clipboard format which preserves the colors and general formatting of the terminal output. The problem is that the format in question is RTF - not something you can publish directly. This struck me as the kind of thing an LLM might be able to write code to parse, so I had ChatGPT take a crack at it and then later rewrote it from scratch with Claude Sonnet 4.5. The result was this rtf-to-html tool which lets you paste in rich formatted text and gives you reasonably solid HTML that you can share elsewhere.

To share that HTML I've started habitually pasting it into a GitHub Gist and then taking advantage of gistpreview.github.io, a neat little unofficial tool that accepts a Gist ID and displays the gist content as a standalone HTML page... which means you can link to rendered HTML that's stored in a gist. So my process was: copy terminal output; paste into rtf-to-html; copy the resulting HTML; paste that into a new GitHub Gist; grab that Gist's ID; share the gistpreview link. Not too much hassle, but frustratingly manual if you're doing it several times a day. Ideally I want a tool where I can copy terminal output, paste it into a new tool, then click a button and get a link to share.

I decided to get Claude Code for web to build the entire thing, with a single prompt on claude.ai/code pointed at my repo. It's quite a long prompt - it took me several minutes to type! But it covered the functionality I wanted in enough detail that I was pretty confident Claude would be able to build it.

I'm using one key technique in this prompt: I'm referencing existing tools in the same repo and telling Claude to imitate their functionality. I first wrote about this trick last March in Running OCR against PDFs and images directly in your browser, where I described how a snippet of code that used PDF.js and another snippet that used Tesseract.js was enough for Claude 3 Opus to build me this working PDF OCR tool. That was actually the tool that kicked off my tools.simonwillison.net collection in the first place, which has since grown to 139 tools and counting.

Here I'm telling Claude that I want the RTF to HTML functionality of rtf-to-html.html combined with the Gist saving functionality of openai-audio-output.html. That one has quite a bit going on. It uses the OpenAI audio API to generate audio output from a text prompt, which is returned by that API as base64-encoded data in JSON. Then it offers the user a button to save that JSON to a Gist, which gives the snippet a URL. Another tool I wrote, gpt-4o-audio-player.html, can then accept that Gist ID in the URL and will fetch the JSON data and make the audio playable in the browser. Here's an example.

The trickiest part of this is API tokens. I've built tools in the past that require users to paste in a GitHub Personal Access Token (PAT), which I then store in their browser - I don't want other people's authentication credentials anywhere near my own servers. But that's a bit fiddly. Instead, I figured out the minimal Cloudflare worker necessary to implement the server-side portion of GitHub's authentication flow.
That code lives here and means that any of the HTML+JavaScript tools in my collection can implement a GitHub authentication flow if they need to save Gists. But I don't have to tell the model any of that! I can just say "do the same trick that openai-audio-output.html does" and Claude Code will work the rest out for itself.

The resulting app is exactly what I asked for after pasting in some terminal output from Claude Code CLI, and the green-on-black terminal aesthetic is spot on too.

There are a bunch of other things that I touch on in the video. Here's a quick summary:

tools.simonwillison.net/colophon is the list of all of my tools, with accompanying AI-generated descriptions. Here's more about how I built that with Claude Code and notes on how I added the AI-generated descriptions.

gistpreview.github.io is really neat.

I used Descript to record and edit the video. I'm still getting the hang of it - hence the slightly clumsy pan-and-zoom - but it's pretty great for this kind of screen recording.

The site's automated deploys are managed by this GitHub Actions workflow. I also have it configured to work with Cloudflare Pages for preview deployments from PRs (here's an example).

The automated documentation is created using my llm tool and llm-anthropic plugin. Here's the script that does that, recently upgraded to use Claude Haiku 4.5.
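Incidentally, if you want to inspect the raw rich text the macOS clipboard holds after copying from the terminal, you can dump it from the command line with the built-in pbpaste utility:

```bash
# Ask the pasteboard for its RTF representation rather than plain text.
pbpaste -Prefer rtf > session.rtf
head -c 400 session.rtf  # peek at the RTF control words
```

That RTF is the input format the rtf-to-html conversion has to deal with.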

Simon Willison 2 weeks ago

Dane Stuckey (OpenAI CISO) on prompt injection risks for ChatGPT Atlas

My biggest complaint about the launch of the ChatGPT Atlas browser the other day was the lack of details on how OpenAI are addressing prompt injection attacks. The launch post mostly punted that question to the System Card for their "ChatGPT agent" browser automation feature from July. Since this was my single biggest question about Atlas I was disappointed not to see it addressed more directly.

OpenAI's Chief Information Security Officer Dane Stuckey just posted the most detail I've seen yet in a lengthy Twitter post. I'll quote from his post here (with my emphasis in bold) and add my own commentary. He addresses the issue directly by name, with a good single-sentence explanation of the problem:

One emerging risk we are very thoughtfully researching and mitigating is prompt injections, where attackers hide malicious instructions in websites, emails, or other sources, to try to trick the agent into behaving in unintended ways. The objective for attackers can be as simple as trying to bias the agent's opinion while shopping, or as consequential as an attacker trying to get the agent to fetch and leak private data, such as sensitive information from your email, or credentials.

We saw examples of browser agents from other vendors leaking private data in this way identified by the Brave security team just yesterday.

Our long-term goal is that you should be able to trust ChatGPT agent to use your browser, the same way you'd trust your most competent, trustworthy, and security-aware colleague or friend.

This is an interesting way to frame the eventual goal, describing an extraordinary level of trust and competence. As always, a big difference between AI systems and a human is that an AI system cannot be held accountable for its actions. I'll let my trusted friend use my logged-in browser only because there are social consequences if they abuse that trust!

We're working hard to achieve that. For this launch, we've performed extensive red-teaming, implemented novel model training techniques to reward the model for ignoring malicious instructions, implemented overlapping guardrails and safety measures, and added new systems to detect and block such attacks. However, prompt injection remains a frontier, unsolved security problem, and our adversaries will spend significant time and resources to find ways to make ChatGPT agent fall for these attacks.

I'm glad to see OpenAI's CISO openly acknowledging that prompt injection remains an unsolved security problem (three years after we started talking about it!). That "adversaries will spend significant time and resources" thing is the root of why I don't see guardrails and safety measures as providing a credible solution to this problem. As I've written before, in application security 99% is a failing grade. If there's a way to get past the guardrails, no matter how obscure, a motivated adversarial attacker is going to figure that out.

Dane goes on to describe some of those measures:

To protect our users, and to help improve our models against these attacks: We've prioritized rapid response systems to help us quickly identify and block attack campaigns as we become aware of them. We are also continuing to invest heavily in security, privacy, and safety - including research to improve the robustness of our models, security monitors, infrastructure security controls, and other techniques to help prevent these attacks via defense in depth.

I like this a lot. OpenAI have an advantage here of being a centralized system - they can monitor their entire user base for signs of new attack patterns. It's still bad news for users that get caught out by a zero-day prompt injection, but it does at least mean that successful new attack patterns should have a small window of opportunity. "Defense in depth" always sounds good, but it worries me that it's setting up a false sense of security here. If it's harder but still possible someone is going to get through.

We've designed Atlas to give you controls to help protect yourself. We have added a feature to allow ChatGPT agent to take action on your behalf, but without access to your credentials, called "logged out mode". We recommend this mode when you don't need to take action within your accounts.

Logged out mode is very smart, and is already a tried and tested pattern. I frequently have Claude Code or Codex CLI fire up Playwright to interact with websites, safe in the knowledge that they won't have access to my logged-in sessions. ChatGPT's existing agent mode provides a similar capability.

Today, we think "logged in mode" is most appropriate for well-scoped actions on very trusted sites, where the risks of prompt injection are lower. Asking it to add ingredients to a shopping cart is generally safer than a broad or vague request like "review my emails and take whatever actions are needed."

Logged in mode is where things get scary, especially since we're delegating security decisions to end-users of the software. We've demonstrated many times over that this is an unfair burden to place on almost any user. This detail is new to me:

When agent is operating on sensitive sites, we have also implemented a "Watch Mode" that alerts you to the sensitive nature of the site and requires you have the tab active to watch the agent do its work. Agent will pause if you move away from the tab with sensitive information. This ensures you stay aware - and in control - of what agent actions the agent is performing. [...]

I need to spend more time with ChatGPT Atlas to see what it looks like in practice. I tried just now using both GitHub and an online banking site and neither of them seemed to trigger "watch mode" - Atlas continued to navigate even when I had switched to another application. Watch mode sounds reasonable in theory - similar to a driver-assisted car that requires you to keep your hands on the wheel - but I'd like to see it in action before I count it as a meaningful mitigation.

Dane closes with an analogy to computer viruses:

New levels of intelligence and capability require the technology, society, the risk mitigation strategy to co-evolve. And as with computer viruses in the early 2000s, we think it's important for everyone to understand responsible usage, including thinking about prompt injection attacks, so we can all learn to benefit from this technology safely.

I don't think the average computer user ever really got the hang of staying clear of computer viruses... we're still fighting that battle today, albeit much more successfully on mobile platforms that implement tight restrictions on what software can do.

My takeaways from all of this? It's not done much to influence my overall skepticism of the entire category of browser agents, but it does at least demonstrate that OpenAI are keenly aware of the problems and are investing serious effort in finding the right mix of protections. How well those protections work is something I expect will become clear over the next few months.

Simon Willison 2 weeks ago

Living dangerously with Claude

I gave a talk last night at Claude Code Anonymous in San Francisco, the unofficial meetup for coding agent enthusiasts. I decided to talk about a dichotomy I've been struggling with recently. On the one hand I'm getting enormous value from running coding agents with as few restrictions as possible. On the other hand I'm deeply concerned by the risks that accompany that freedom. Below is a copy of my slides, plus additional notes and links as an annotated presentation.

I'm going to be talking about two things this evening... Why you should always use --dangerously-skip-permissions. (This got a cheer from the room full of Claude Code enthusiasts.) And why you should never use --dangerously-skip-permissions. (This did not get a cheer.)

--dangerously-skip-permissions is a bit of a mouthful, so I'm going to use its better name, "YOLO mode", for the rest of this presentation. Claude Code running in this mode genuinely feels like a completely different product from regular, default Claude Code. The default mode requires you to pay constant attention to it, tracking everything it does and actively approving changes and actions every few steps. In YOLO mode you can leave Claude alone to solve all manner of hairy problems while you go and do something else entirely. I have a suspicion that many people who don't appreciate the value of coding agents have never experienced YOLO mode in all of its glory.

I'll show you three projects I completed with YOLO mode in just the past 48 hours.

I wrote about the first one at length in Getting DeepSeek-OCR working on an NVIDIA Spark via brute force using Claude Code. I wanted to try the newly released DeepSeek-OCR model on an NVIDIA Spark, but doing so requires figuring out how to run a model using PyTorch and CUDA, which is never easy and is a whole lot harder on an ARM64 device. I SSHd into the Spark, started a fresh Docker container and told Claude Code to figure it out. It took 40 minutes and three additional prompts but it solved the problem, and I got to have breakfast and tinker with some other projects while it was working.

The second project started out in Claude Code for the web. I'm eternally interested in options for running server-side Python code inside a WebAssembly sandbox, for all kinds of reasons. I decided to see if the Claude iPhone app could launch a task to figure it out. I wanted to see how hard it was to do that using Pyodide running directly in Node.js. Claude Code got it working and built and tested this demo script showing how to do it. I started a new simonw/research repository to store the results of these experiments, each one in a separate folder. It's up to 5 completed research projects already and I created it less than 2 days ago.

Here's my favorite, a project from just this morning. I decided I wanted to try out SLOCCount, a 2001-era Perl tool for counting lines of code and estimating the cost to develop them using 2001 USA developer salaries... but I didn't want to run Perl, so I decided to have Claude Code (for web, and later on my laptop) try and figure out how to run Perl scripts in WebAssembly. TLDR: it got there in the end! It turned out some of the supporting scripts in SLOCCount were written in C, so it had to compile those to WebAssembly as well. And now tools.simonwillison.net/sloccount is a browser-based app which runs 25-year-old Perl+C in WebAssembly against pasted code, GitHub repository references and even zip files full of code.

The wild thing is that all three of these projects weren't even a priority for me - they were side quests, representing pure curiosity that I could outsource to Claude Code and solve in the background while I was occupied with something else. I got a lot of useful work done in parallel to these three flights of fancy.

But there's a reason --dangerously-skip-permissions has that scary name. It's dangerous to use Claude Code (and other coding agents) in this way! The reason for this is prompt injection, a term I coined three years ago to describe a class of attacks against LLMs that take advantage of the way untrusted content is concatenated together with trusted instructions. (It's named after SQL injection, which shares a similar shape.) This remains an incredibly common vulnerability.

Here's a great example of a prompt injection attack against a coding agent, described by Johann Rehberger as part of his Month of AI Bugs, sharing a new prompt injection report every day for the month of August. If a coding agent - in this case OpenHands - reads a malicious file it can be tricked into grepping the available environment variables for values matching GitHub Personal Access Tokens and sending those to the attacker's external server for "help debugging these variables".

I coined another term to try and describe a common subset of prompt injection attacks: the lethal trifecta. Any time an LLM system combines access to private data with exposure to untrusted content and the ability to externally communicate, there's an opportunity for attackers to trick the system into leaking that private data back to them. These attacks are incredibly common. If you're running YOLO coding agents with access to private source code or secrets (like API keys in environment variables) you need to be concerned about the potential of these attacks. This is the fundamental rule of prompt injection: anyone who can get their tokens into your context should be considered to have full control over what your agent does next, including the tools that it calls.

Some people will try to convince you that prompt injection attacks can be solved using more AI to detect the attacks. This does not work 100% reliably, which means it's not a useful security defense at all.

The only solution that's credible is to run coding agents in a sandbox. The best sandboxes are the ones that run on someone else's computer! That way the worst that can happen is someone else's computer getting owned. You still need to worry about your source code getting leaked. Most of my stuff is open source anyway, and a lot of the code I have agents working on is research code with no proprietary secrets. If your code really is sensitive you need to consider network restrictions more carefully, as discussed in a few slides. There are lots of great sandboxes that run on other people's computers. OpenAI Codex Cloud, Claude Code for the web and Gemini Jules are all excellent solutions for this. I also really like the code interpreter features baked into the ChatGPT and Claude consumer apps.

There are two problems to consider with sandboxing. The first is easy: you need to control what files can be read and written on the filesystem. The second is much harder: controlling the network connections that can be made by code running inside the agent. The reason network access is so important is that it represents the data exfiltration leg of the lethal trifecta.
The wild thing is that all three of these projects weren't even a priority for me - they were side quests, representing pure curiosity that I could outsource to Claude Code and solve in the background while I was occupied with something else. I got a lot of useful work done in parallel to these three flights of fancy.

But there's a reason --dangerously-skip-permissions has that scary name. It's dangerous to use Claude Code (and other coding agents) in this way!

The reason for this is prompt injection, a term I coined three years ago to describe a class of attacks against LLMs that take advantage of the way untrusted content is concatenated together with trusted instructions. (It's named after SQL injection, which shares a similar shape.) This remains an incredibly common vulnerability.

Here's a great example of a prompt injection attack against a coding agent, described by Johann Rehberger as part of his Month of AI Bugs, which shared a new prompt injection report every day for the month of August. If a coding agent - in this case OpenHands - reads this file it can be tricked into grepping the available environment variables for strings matching GitHub Personal Access Tokens and sending them to the attacker's external server for "help debugging these variables".

I coined another term to try and describe a common subset of prompt injection attacks: the lethal trifecta. Any time an LLM system combines access to private data with exposure to untrusted content and the ability to externally communicate, there's an opportunity for attackers to trick the system into leaking that private data back to them. These attacks are incredibly common. If you're running YOLO coding agents with access to private source code or secrets (like API keys in environment variables) you need to be concerned about the potential of these attacks.

This is the fundamental rule of prompt injection: anyone who can get their tokens into your context should be considered to have full control over what your agent does next, including the tools that it calls. Some people will try to convince you that prompt injection attacks can be solved using more AI to detect the attacks. This does not work 100% reliably, which means it's not a useful security defense at all.

The only solution that's credible is to run coding agents in a sandbox. The best sandboxes are the ones that run on someone else's computer! That way the worst that can happen is someone else's computer getting owned. You still need to worry about your source code getting leaked, but most of my stuff is open source anyway, and a lot of the code I have agents working on is research code with no proprietary secrets. If your code really is sensitive you need to consider network restrictions more carefully, as discussed in a few slides.

There are lots of great sandboxes that run on other people's computers. OpenAI Codex Cloud, Claude Code for the web and Gemini Jules are all excellent solutions for this. I also really like the code interpreter features baked into the ChatGPT and Claude consumer apps.

There are two problems to consider with sandboxing. The first is easy: you need to control what files can be read and written on the filesystem. The second is much harder: controlling the network connections that can be made by code running inside the agent. The reason network access is so important is that it represents the data exfiltration leg of the lethal trifecta.
If you can prevent external communication back to an attacker they can't steal your private information, even if they manage to sneak in their own malicious instructions.

Claude Code CLI grew a new sandboxing feature just yesterday, and Anthropic released a new open source library showing how it works. The key to the implementation - at least on macOS - is Apple's little-known but powerful sandbox-exec command. This provides a way to run any command in a sandbox configured by a policy document. Those policies can control which files are visible but can also allow-list network connections. Anthropic run an HTTP proxy and allow the Claude Code environment to talk to that, then use the proxy to control which domains it can communicate with. (I used Claude itself to synthesize an example from Anthropic's codebase.)

... the bad news is that sandbox-exec has been marked as deprecated in Apple's documentation since at least 2017! It's used by Codex CLI too, and is still the most convenient way to run a sandbox on a Mac. I'm hoping Apple will reconsider.

So go forth and live dangerously! (But do it in a sandbox.)
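That synthesized example isn't reproduced above, so here's a rough sketch of the sandbox-exec pattern just described, based on public Seatbelt examples - the profile syntax details and the localhost:8080 proxy address are illustrative assumptions, not Anthropic's actual policy:

```bash
# Write a Seatbelt policy: deny everything by default, allow-list file access,
# and only permit outbound network traffic to a domain-filtering proxy.
cat > policy.sb <<'EOF'
(version 1)
(deny default)
(allow process-exec)
(allow process-fork)
(allow file-read*)
(allow file-write* (subpath "/tmp/agent-workspace"))
; Only the local HTTP proxy is reachable; it decides which domains are allowed.
(allow network-outbound (remote tcp "localhost:8080"))
EOF

# Run any command inside the sandbox defined by that policy.
sandbox-exec -f policy.sb some-command
```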

Simon Willison 2 weeks ago

Claude Code for web - a new asynchronous coding agent from Anthropic

Anthropic launched Claude Code for web this morning. It's an asynchronous coding agent - their answer to OpenAI's Codex Cloud and Google's Jules - and has a very similar shape. I had preview access over the weekend and I've already seen some very promising results from it. It's available online at claude.ai/code and shows up as a tab in the Claude iPhone app as well:

As far as I can tell it's their latest Claude Code CLI app wrapped in a container (Anthropic are getting really good at containers these days). It appears to behave exactly the same as the CLI tool, and includes a neat "teleport" feature which can copy both the chat transcript and the edited files down to your local Claude Code CLI tool if you want to take over locally.

It's very straightforward to use. You point Claude Code for web at a GitHub repository, select an environment (fully locked down, restricted to an allow-list of domains, or configured to access domains of your choosing, including "*" for everything) and kick it off with a prompt. While it's running you can send it additional prompts which are queued up and executed after it completes its current step. Once it's done it opens a branch on your repo with its work and can optionally open a pull request.

Claude Code for web's PRs are indistinguishable from Claude Code CLI's, so Anthropic told me it was OK to submit those against public repos even during the private preview. Here are some examples from this weekend:

Add query-string-stripper.html tool against my simonw/tools repo - a very simple task that creates (and deployed via GitHub Pages) this query-string-stripper tool.

minijinja vs jinja2 Performance Benchmark - I ran this against a private repo and then copied the results here, so no PR. Here's the prompt I used.

Update deepseek-ocr README to reflect successful project completion - I noticed that the README produced by Claude Code CLI for this project was misleadingly out of date, so I had Claude Code for web fix the problem.

That second example is the most interesting. I saw a tweet from Armin about his MiniJinja Rust template language adding support for Python 3.14 free threading. I hadn't realized that project had Python bindings, so I decided it would be interesting to see a quick performance comparison between MiniJinja and Jinja2. I ran Claude Code for web against a private repository with a completely open environment ("*" in the allow-list) and prompted:

I’m interested in benchmarking the Python bindings for https://github.com/mitsuhiko/minijinja against the equivalente template using Python jinja2 Design and implement a benchmark for this. It should use the latest main checkout of minijinja and the latest stable release of jinja2. The benchmark should use the uv version of Python 3.14 and should test both the regular 3.14 and the 3.14t free threaded version - so four scenarios total The benchmark should run against a reasonably complicated example of a template, using template inheritance and loops and such like In the PR include a shell script to run the entire benchmark, plus benchmark implantation, plus markdown file describing the benchmark and the results in detail, plus some illustrative charts created using matplotlib

I entered this into the Claude iPhone app on my mobile keyboard, hence the typos. It churned away for a few minutes and gave me exactly what I asked for. Here's one of the four charts it created:

(I was surprised to see MiniJinja out-performed by Jinja2, but I guess Jinja2 has had a decade of clever performance optimizations and doesn't need to deal with any extra overhead of calling out to Rust.)

Note that I would likely have got the exact same result running this prompt against Claude CLI on my laptop. The benefit of Claude Code for web is entirely in its convenience as a way of running these tasks in a hosted container managed by Anthropic, with a pleasant web and mobile UI layered over the top.
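For a sense of what the four-scenario matrix from that prompt might look like as a shell script, here's a minimal sketch - bench.py is a hypothetical benchmark script, and the uv flags shown are the standard way to select interpreters and dependencies:

```bash
# Run the benchmark under regular and free-threaded Python 3.14,
# once per template engine - four scenarios total.
for py in 3.14 3.14t; do
  for engine in jinja2 minijinja; do
    echo "== Python $py / $engine =="
    uv run --python "$py" --with "$engine" bench.py "$engine"
  done
done
```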
It's interesting how Anthropic chose to announce this new feature: the product launch is buried half way down their new engineering blog post Beyond permission prompts: making Claude Code more secure and autonomous, which starts like this:

Claude Code's new sandboxing features, a bash tool and Claude Code on the web, reduce permission prompts and increase user safety by enabling two boundaries: filesystem and network isolation.

I'm very excited to hear that Claude Code CLI is taking sandboxing more seriously. I've not yet dug into the details of that - it looks like it's using Seatbelt on macOS and Bubblewrap on Linux. Anthropic released a new open source (Apache 2) library, anthropic-experimental/sandbox-runtime, with their implementation of this so far. Filesystem sandboxing is relatively easy. The harder problem is network isolation, which they describe like this:

Network isolation, by only allowing internet access through a unix domain socket connected to a proxy server running outside the sandbox. This proxy server enforces restrictions on the domains that a process can connect to, and handles user confirmation for newly requested domains. And if you’d like further-increased security, we also support customizing this proxy to enforce arbitrary rules on outgoing traffic.

This is crucial to protecting against both prompt injection and lethal trifecta attacks. The best way to prevent lethal trifecta attacks is to cut off one of the three legs, and network isolation is how you remove the data exfiltration leg that allows successful attackers to steal your data.

If you run Claude Code for web in "No network access" mode you have nothing to worry about. I'm a little bit nervous about their "Trusted network access" environment. It's intended to only allow access to domains relating to dependency installation, but the default domain list has dozens of entries, which makes me nervous about unintended exfiltration vectors sneaking through. You can also configure a custom environment with your own allow-list. I have one called "Everything" which allow-lists "*", because for projects like my MiniJinja/Jinja2 comparison above there are no secrets or source code involved that need protecting.

I see Anthropic's focus on sandboxes as an acknowledgment that coding agents run in YOLO mode (--dangerously-skip-permissions and the like) are enormously more valuable and productive than agents where you have to approve their every step. The challenge is making it convenient and easy to run them safely. This kind of sandboxing is the only approach to safety that feels credible to me.

Update: A note on cost: I'm currently using a Claude "Max" plan that Anthropic gave me in order to test some of their features, so I don't have a good feeling for how much Claude Code would cost for these kinds of projects. From running an unofficial cost estimate tool it looks like I'm using between $1 and $5 worth of daily Claude CLI invocations at the moment.

Simon Willison 2 weeks ago

Getting DeepSeek-OCR working on an NVIDIA Spark via brute force using Claude Code

DeepSeek released a new model yesterday: DeepSeek-OCR, a 6.6GB model fine-tuned specifically for OCR. They released it as model weights that run using PyTorch and CUDA. I got it running on the NVIDIA Spark by having Claude Code effectively brute force the challenge of getting it working on that particular hardware.

This small project (40 minutes this morning, most of which was Claude Code churning away while I had breakfast and did some other things) ties together a bunch of different concepts I've been exploring recently. I designed an agentic loop for the problem, gave Claude full permissions inside a Docker sandbox, embraced the parallel agents lifestyle and reused my notes on the NVIDIA Spark from last week.

I knew getting a PyTorch CUDA model running on the Spark was going to be a little frustrating, so I decided to outsource the entire process to Claude Code to see what would happen. TLDR: It worked. It took four prompts (one long, three very short) to have Claude Code figure out everything necessary to run the new DeepSeek model on the NVIDIA Spark, OCR a document for me and produce copious notes about the process.

I connected to the Spark from my Mac via SSH and started a new Docker container there. Then I installed npm, used that to install Claude Code and started it up, telling it that it's OK that it's running as root because it's in a sandbox. It provided me a URL to click on to authenticate with my Anthropic account.

I kicked things off with this prompt:

Create a folder deepseek-ocr and do everything else in that folder

Then I ran the following, providing links to both the GitHub repository and the Hugging Face model, providing a clue about NVIDIA ARM and giving it an image (this one, see previous post) that I wanted it to run OCR on.

Your task is to get this working: https://github.com/deepseek-ai/DeepSeek-OCR - it uses Hugging Face Transformers and models from https://huggingface.co/deepseek-ai/DeepSeek-OCR - you are running in a Docker container on an NVIDIA ARM device, first research your environment to check it will be possible to do this. Git clone both of those - you may need to enable LFS first. Make extensive notes as you go in a notes.md file which you should only ever append new notes to. At the end of this you should have scripts in the folder which can be run to both setup and then run the model against an example - use https://static.simonwillison.net/static/2025/ft.jpeg as the example image that you OCR. As a final step create a detailed README.md file explaining the project and what you learned and how to run your code.

Note how I also told it to "make extensive notes as you go in a notes.md file which you should only ever append new notes to" so I'd have some useful notes at the end.

It churned away for 13 minutes, and made a sequence of discoveries about the platform and the model and the difficulties of running PyTorch on it. And then... it gave up:

The NVIDIA GB10 GPU has compute capability sm_121, which is too new for PyTorch 2.5.1.

PyTorch 2.5.1 supports: sm_50, sm_80, sm_86, sm_89, sm_90, sm_90a
GB10 requires: sm_121

Additionally, the model has hardcoded CUDA calls, preventing CPU fallback without code modifications.

So I gave it some encouragement:

Any other options that might help get CUDA working like using different versions of PyTorch that have available CUDA wheels for this platform? Have you checked to see what PyTorch versions already have these ARM CUDA wheels that would work here?

It ran a command to grep the list of available PyTorch wheels for ones matching the ARM architecture... and that unblocked it. Excellent!
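The exact command it ran isn't shown above, but a plausible shape for it, assuming the standard layout of PyTorch's wheel indexes, is something like:

```bash
# List the torch wheels on PyTorch's CUDA 13.0 index and keep the ARM64 builds.
curl -s https://download.pytorch.org/whl/cu130/torch/ | grep aarch64
```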
I can see that PyTorch 2.9.0 has ARM64 wheels for CUDA 12.8, 12.9, and CUDA 13.0 (cu130)! This is exactly what we need!

It installed the wheels it found and reported:

Good news! PyTorch 2.9.0 can detect the GB10 GPU with CUDA capability 12.1 (sm_121), but it says the maximum supported is 12.0. However, it's just a warning now instead of a hard error! Let's try running inference: [...]

Then it kept on churning for a bit, ignoring warnings, and managed to run the model! It loaded the image, ran a grounding-style OCR prompt and got back output. Claude declared victory and pointed me to the output/result.mmd file, which contained only whitespace. So OCR had worked but the result had failed to be written correctly to disk. I told it:

That result.mmd file looks to be exclusively whitespace to me, try and get a better result there - the bounding box image was good though

Which caused Claude to reconsider the prompt it used:

Maybe we need to use a different prompt. Looking at the README from DeepSeek-OCR, there are different prompts. The "grounding" mode seems to be focused on bounding boxes, which is why we got good bounding box results but the text output might be in a different format. Let me try running it again with a different prompt that's more focused on text extraction rather than grounding.

It tried all of the prompt variants - one for documents, one for general OCR, one without layouts - made a note about them and created a file called PROMPTS_GUIDE.md with a bunch more details, including a comparison table and a set of performance benchmarks against the test image (3503×1668 pixels, a Financial Times article).

My final prompt was this, to gather everything together into a zip file I could extract from the Docker container:

Create a zip file with the output and output_text and all of the scripts and notes - but leave out the github repo and the huggingface repo directories

I added the contents of that zip file to my new simonw/research GitHub repo in the deepseek-ocr-nvidia-spark folder. Claude really likes writing notes! Here's the directory listing of that finished folder:

My first prompt was at 15:31:07 (UTC). The final message from Claude Code came in at 16:10:03. That means it took less than 40 minutes start to finish, and I was only actively involved for about 5-10 minutes of that time. The rest of the time I was having breakfast and doing other things. Having tried and failed to get PyTorch stuff working in the past, I count this as a huge win. I'll be using this process a whole lot more in the future.

How good were the actual results? There's honestly so much material in the resulting notes created by Claude that I haven't reviewed all of it. There may well be all sorts of errors in there, but it's indisputable that it managed to run the model and made notes on how it did that such that I'll be able to do the same thing in the future.

I think the key factors in executing this project successfully were the following:

I gave it exactly what it needed: a Docker environment on the target hardware, instructions on where to get what it needed (the code and the model) and a clear goal for it to pursue. This is a great example of the pattern I described in designing agentic loops.

Running it in a Docker sandbox meant I could use --dangerously-skip-permissions and leave it running on its own. If I'd had to approve every command it wanted to run I would have got frustrated and quit the project after just a few minutes.

I applied my own knowledge and experience when it got stuck. I was confident (based on previous experiments with the Spark) that a CUDA wheel for ARM64 existed that was likely to work, so when it gave up I prompted it to try again, leading to success.

Oh, and it looks like DeepSeek OCR is a pretty good model if you spend the time experimenting with different ways to run it.

A small TIL from today: I had kicked off the job running in the Docker container via SSH to the Spark when I realized it would be neat if I could easily monitor the files it was creating while it was running. I asked Claude.ai:

I am running a Docker container on a remote machine, which I started over SSH How can I have my local VS Code on MacOS show me the filesystem in that docker container inside that remote machine, without restarting anything?
It gave me a set of steps that solved this exact problem:

Install the VS Code "Remote SSH" and "Dev Containers" extensions.

Use "Remote-SSH: Connect to Host" to connect to the remote machine (over my Tailscale network).

In the window for that remote SSH session, run "Dev Containers: Attach to Running Container" - this shows a list of containers and you can select the one you want to attach to.

... and that's it! VS Code opens a new window providing full access to all of the files in that container. I opened up notes.md and watched as Claude Code appended to it in real time.

At the end, when I told Claude to create a zip file of the results, I could select that in the VS Code file explorer and use the "Download" menu item to download it to my Mac.

Simon Willison 3 weeks ago

Claude Skills are awesome, maybe a bigger deal than MCP

Anthropic this morning introduced Claude Skills, a new pattern for making new abilities available to their models:

Claude can now use Skills to improve how it performs specific tasks. Skills are folders that include instructions, scripts, and resources that Claude can load when needed. Claude will only access a skill when it's relevant to the task at hand. When used, skills make Claude better at specialized tasks like working with Excel or following your organization's brand guidelines.

Their engineering blog has a more detailed explanation. There's also a new anthropic/skills GitHub repo. (I inadvertently preempted their announcement of this feature when I reverse engineered and wrote about it last Friday!)

Skills are conceptually extremely simple: a skill is a Markdown file telling the model how to do something, optionally accompanied by extra documents and pre-written scripts that the model can run to help it accomplish the tasks described by the skill.

Claude's new document creation abilities, which accompanied their new code interpreter feature in September, turned out to be entirely implemented using skills. Those are now available in Anthropic's repo, covering pdf, docx, xlsx, and pptx files.

There's one extra detail that makes this a feature, not just a bunch of files on disk. At the start of a session Claude's various harnesses can scan all available skill files and read a short explanation for each one from the frontmatter YAML in the Markdown file. This is very token efficient: each skill only takes up a few dozen extra tokens, with the full details only loaded in should the user request a task that the skill can help solve.

Here's that metadata description for an example slack-gif-creator skill that Anthropic published this morning:

Toolkit for creating animated GIFs optimized for Slack, with validators for size constraints and composable animation primitives. This skill applies when users request animated GIFs or emoji animations for Slack from descriptions like "make me a GIF for Slack of X doing Y".

I just tried this skill out in the Claude mobile web app, against Sonnet 4.5. First I enabled the slack-gif-creator skill in the settings, then I prompted it - and Claude made me this GIF. Click to play (it's almost epilepsy inducing, hence the click-to-play mechanism):

OK, this particular GIF is terrible, but the great thing about skills is that they're very easy to iterate on to make them better.

Here are some noteworthy snippets from the Python script it wrote, comments mine: This is pretty neat. Slack GIFs need to be a maximum of 2MB, so the skill includes a validation function which the model can use to check the file size. If it's too large the model can have another go at making it smaller.

The skills mechanism is entirely dependent on the model having access to a filesystem, tools to navigate it and the ability to execute commands in that environment. This is a common pattern for LLM tooling these days - ChatGPT Code Interpreter was the first big example of this back in early 2023, and the pattern later extended to local machines via coding agent tools such as Cursor, Claude Code, Codex CLI and Gemini CLI.

This requirement is the biggest difference between skills and other previous attempts at expanding the abilities of LLMs, such as MCP and ChatGPT Plugins. It's a significant dependency, but it's somewhat bewildering how much new capability it unlocks. The fact that skills are so powerful and simple to create is yet another argument in favor of making safe coding environments available to LLMs.
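As an illustration (not Anthropic's exact format), a minimal skill folder can be as simple as a single Markdown file whose YAML frontmatter carries that short scannable description:

```bash
# Sketch of a minimal skill: the frontmatter is what gets scanned at session
# start; the Markdown body is only loaded when the skill is actually needed.
mkdir -p slack-gif-creator
cat > slack-gif-creator/SKILL.md <<'EOF'
---
name: slack-gif-creator
description: Toolkit for creating animated GIFs optimized for Slack, with
  validators for size constraints. Use when a user asks for a Slack GIF.
---

# Slack GIF creator

1. Generate animation frames using the helper scripts in scripts/.
2. Assemble the GIF and confirm it is under Slack's 2MB limit before returning it.
EOF
```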
The word safe there is doing a lot of work though! We really need to figure out how best to sandbox these environments such that attacks such as prompt injection are limited to an acceptable amount of damage.

Back in January I made some foolhardy predictions about AI/LLMs, including that "agents" would once again fail to happen:

I think we are going to see a lot more froth about agents in 2025, but I expect the results will be a great disappointment to most of the people who are excited about this term. I expect a lot of money will be lost chasing after several different poorly defined dreams that share that name.

I was entirely wrong about that. 2025 really has been the year of "agents", no matter which of the many conflicting definitions you decide to use (I eventually settled on "tools in a loop").

Claude Code is, with hindsight, poorly named. It's not purely a coding tool: it's a tool for general computer automation. Anything you can achieve by typing commands into a computer is something that can now be automated by Claude Code. It's best described as a general agent. Skills make this a whole lot more obvious and explicit.

I find the potential applications of this trick somewhat dizzying. Just thinking about this with my data journalism hat on: imagine a folder full of skills that covers tasks like the following:

Where to get US census data from and how to understand its structure

How to load data from different formats into SQLite or DuckDB using appropriate Python libraries

How to publish data online, as Parquet files in S3 or pushed as tables to Datasette Cloud

A skill defined by an experienced data reporter talking about how best to find the interesting stories in a new set of data

A skill that describes how to build clean, readable data visualizations using D3

Congratulations, you just built a "data journalism agent" that can discover and help publish stories against fresh drops of US census data. And you did it with a folder full of Markdown files and maybe a couple of example Python scripts.

Model Context Protocol has attracted an enormous amount of buzz since its initial release back in November last year. I like to joke that one of the reasons it took off is that every company knew they needed an "AI strategy", and building (or announcing) an MCP implementation was an easy way to tick that box.

Over time the limitations of MCP have started to emerge. The most significant is in terms of token usage: GitHub's official MCP on its own famously consumes tens of thousands of tokens of context, and once you've added a few more to that there's precious little space left for the LLM to actually do useful work.

My own interest in MCPs has waned ever since I started taking coding agents seriously. Almost everything I might achieve with an MCP can be handled by a CLI tool instead. LLMs know how to call --help, which means you don't have to spend many tokens describing how to use them - the model can figure it out later when it needs to.

Skills have exactly the same advantage, only now I don't even need to implement a new CLI tool. I can drop a Markdown file in describing how to do a task instead, adding extra scripts only if they'll help make things more reliable or efficient.

One of the most exciting things about Skills is how easy they are to share. I expect many skills will be implemented as a single file - more sophisticated ones will be a folder with a few more. Anthropic have Agent Skills documentation and a Claude Skills Cookbook. I'm already thinking through ideas of skills I might build myself, like one on how to build Datasette plugins.

Something else I love about the design of skills is that there is nothing at all preventing them from being used with other models. You can grab a skills folder right now, point Codex CLI or Gemini CLI at it and say "read pdf/SKILL.md and then create me a PDF describing this project" and it will work, despite those tools and models having no baked-in knowledge of the skills system.
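A sketch of that cross-model trick, assuming you're in a checkout of the anthropic/skills repo and have Codex CLI installed:

```bash
# Codex has no built-in skills support - it simply reads the Markdown
# and follows the instructions like any other context.
codex exec "Read pdf/SKILL.md and then create me a PDF describing this project"
```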
I expect we'll see a Cambrian explosion in Skills which will make this year's MCP rush look pedestrian by comparison.

Simon Willison 3 weeks ago

NVIDIA DGX Spark: great hardware, early days for the ecosystem

NVIDIA sent me a preview unit of their new DGX Spark desktop "AI supercomputer". I've never had hardware to review before! You can consider this my first ever sponsored post if you like, but they did not pay me any cash and aside from an embargo date they did not request (nor would I grant) any editorial input into what I write about the device. The device retails for around $4,000. They officially go on sale tomorrow.

First impressions are that this is a snazzy little computer. It's similar in size to a Mac mini, but with an exciting textured surface that feels refreshingly different and a little bit science fiction. There is a very powerful machine tucked into that little box. Here are the specs, which I had Claude Code figure out for me by poking around on the device itself:

Hardware Specifications

Architecture: aarch64 (ARM64)
CPU: 20 cores - 10x Cortex-X925 (performance cores), 10x Cortex-A725 (efficiency cores)
RAM: 119 GB total (112 GB available) - I’m not sure why Claude reported it differently here, the machine is listed as 128GB - it looks like a 128GB == 119GiB thing because Claude used free -h
Storage: 3.7 TB (6% used, 3.3 TB available)

GPU Specifications

Model: NVIDIA GB10 (Blackwell architecture)
Compute Capability: sm_121 (12.1)
Memory: 119.68 GB
Multi-processor Count: 48 streaming multiprocessors
Architecture: Blackwell

Short version: this is an ARM64 device with 128GB of memory that's available to both the GPU and the 20 CPU cores at the same time, strapped onto a 4TB NVMe SSD.

The Spark is firmly targeted at “AI researchers”. It’s designed for both training and running models. Until now almost all of my own model running experiments have taken place on a Mac. This has gotten far less painful over the past year and a half thanks to the amazing work of the MLX team and community, but it's still left me deeply frustrated at my lack of access to the NVIDIA CUDA ecosystem. I've lost count of the number of libraries and tutorials which expect you to be able to use Hugging Face Transformers or PyTorch with CUDA, and leave you high and dry if you don't have an NVIDIA GPU to run things on.

Armed (ha) with my new NVIDIA GPU I was excited to dive into this world that had long eluded me... only to find that there was another assumption baked into much of this software: x86 architecture for the rest of the machine. This resulted in all kinds of unexpected new traps for me to navigate. I eventually managed to get a PyTorch 2.7 wheel for CUDA on ARM, but failed to do so for 2.8 - and I'm not confident the wheel itself is unavailable, because I'm finding the PyTorch ARM ecosystem pretty confusing to navigate.

NVIDIA are trying to make this easier, with mixed success. A lot of my initial challenges got easier when I found their official Docker container, so now I'm figuring out how best to use Docker with GPUs. Here's the current incantation that's been working for me (see the sketch below).

I have not yet got my head around the difference between CUDA 12 and 13. 13 appears to be very new, and a lot of the existing tutorials and libraries appear to expect 12.

When I first received this machine around a month ago there was very little in the way of documentation to help get me started. This meant climbing the steep NVIDIA+CUDA learning curve mostly on my own. This has changed substantially in just the last week. NVIDIA now have extensive guides for getting things working on the Spark and they are a huge breath of fresh air - exactly the information I needed when I started exploring this hardware. Here's the getting started guide, details on the DGX dashboard web app, and the essential collection of playbooks. There's still a lot I haven't tried yet just in this official set of guides.

Claude Code was an absolute lifesaver for me while I was trying to figure out how best to use this device. My Ubuntu skills were a little rusty, and I also needed to figure out CUDA drivers and Docker incantations and how to install the right versions of PyTorch. Claude 4.5 Sonnet is much better than me at all of these things.
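That incantation isn't reproduced above; a sketch of the usual shape for GPU-enabled Docker on a machine like this, assuming the NVIDIA container toolkit is installed and using an NGC PyTorch image (the tag is illustrative):

```bash
# Expose all GPUs to the container and mount the current directory as a workspace.
docker run --rm -it --gpus all \
  -v "$(pwd)":/workspace -w /workspace \
  nvcr.io/nvidia/pytorch:25.09-py3
```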
Since many of my experiments took place in disposable Docker containers I had no qualms at all about running it in YOLO mode. The IS_SANDBOX=1 environment variable stops Claude from complaining about running as root - I was tipped off about IS_SANDBOX after I published this article. Here's my original workaround: Claude understandably won't let you run YOLO mode as root, even in a Docker container, so I found myself creating a non-root user in each fresh instance, switching to that user and then running Claude for the first time. This provides a URL which you can visit to authenticate with your Anthropic account, confirming by copying back a token and pasting it into the terminal.

Docker tip: you can create a snapshot of the current image (with Claude installed) by running docker ps to get the container ID and then committing it - there's a sketch of the full round-trip at the end of this post. Later you can start a similar container from that snapshot.

Here's an example of the kinds of prompts I've been running in Claude Code inside the container. That one worked flawlessly - Claude checked out the repo, compiled it for me and iterated on it until it could run that model on the GPU. Here's a full transcript, converted from Claude's log format to Markdown using a script I vibe coded just now. I later prompted it again, which produced this handy set of notes.

Having a machine like this on my local network is neat, but what's even neater is being able to access it from anywhere else in the world, from both my phone and my laptop. Tailscale is perfect for this. I installed it on the Spark (using the Ubuntu instructions here), signed in with my SSO account (via Google)... and the Spark showed up in the "Network Devices" panel on my laptop and phone instantly. I can SSH in from my laptop or using the Termius iPhone app on my phone. I've also been running tools like Open WebUI which give me a mobile-friendly web interface for interacting with LLMs on the Spark.

The embargo on these devices dropped yesterday afternoon, and it turns out a whole bunch of relevant projects have had similar preview access to mine. This is fantastic news, as many of the things I've been trying to figure out myself suddenly got a whole lot easier. Four particularly notable examples:

Ollama works out of the box. They actually had a build that worked a few weeks ago, and were the first success I had running an LLM on the machine.

llama.cpp creator Georgi Gerganov just published extensive benchmark results from running llama.cpp on a Spark.
He's getting ~3,600 tokens/second to read the prompt and ~59 tokens/second to generate a response with the MXFP4 version of GPT-OSS 20B, and ~817 tokens/second to read and ~18 tokens/second to generate for GLM-4.5-Air-GGUF.

LM Studio now have a build for the Spark. I haven't tried this one yet as I'm currently using my machine exclusively via SSH.

vLLM - one of the most popular engines for serving production LLMs - had early access, and there's now an official NVIDIA vLLM NGC Container for running their stack.

It's a bit too early for me to provide a confident recommendation concerning this machine. As indicated above, I've had a tough time figuring out how best to put it to use, largely through my own inexperience with CUDA, ARM64 and Ubuntu GPU machines in general. The ecosystem improvements in just the past 24 hours have been very reassuring though. I expect it will be clear within a few weeks how well supported this machine is going to be.
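Here's the container snapshot round-trip mentioned above, sketched out - the image and container names are illustrative:

```bash
# Find the running container's ID, then commit it to a reusable image.
docker ps
docker commit <container-id> spark-claude:snapshot

# Later: start a fresh container from that snapshot, Claude already installed.
docker run --rm -it --gpus all spark-claude:snapshot bash
```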

Simon Willison 1 month ago

Claude can write complete Datasette plugins now

This isn't necessarily surprising, but it's worth noting anyway. Claude Sonnet 4.5 is capable of building a full Datasette plugin now. I've seen models complete aspects of this in the past, but today is the first time I've shipped a new plugin where every line of code and test was written by Claude, with minimal prompting from myself.

The plugin is called datasette-os-info. It's a simple debugging tool - all it does is add a JSON page which dumps out as much information as it can about the OS it's running on. Here's a live demo on my TIL website. I built it to help experiment with changing the Docker base container that Datasette uses to publish images to one that uses Python 3.14.

Here's the full set of commands I used to create the plugin. I started with my datasette-plugin cookiecutter template, which created a directory with the initial files - here's an example of that starter template. I created a virtual environment for it, installed the initial test dependencies and ran the tests to check that worked. Then I fired up Claude Code in that directory in YOLO mode. (I actually used my shortcut which runs that command.)

Then, in Claude, I told it how to run the tests. When that worked, I told it to build the plugin... and that was it! Claude implemented the plugin using Datasette's register_routes() plugin hook to add the page, and then, without me prompting it to do so, built this basic test as well. It ran the new test, spotted a bug (it had guessed a non-existent parameter), fixed the bug and declared itself done.

I built myself a wheel, uploaded that to an S3 bucket and deployed it to test it out. It did exactly what I had hoped - here's what that page looked like:

I dread to think how long it would have taken me to figure out all of those different useful OS details!

To finish up, I had Claude fill out the README via a couple of extra prompts. Then I had a brief conversation about how safe the output was. Claude said yes, there are several potentially sensitive items in the output:

Most sensitive:

hostname - Could reveal internal naming schemes or organization details

Environment variables: HOME - Reveals user's home directory path; USER - Reveals the username; PATH - Reveals filesystem structure and software installation locations; VIRTUAL_ENV - Could reveal project paths and directory structure

python_executable - Reveals filesystem paths and user directories

Moderately sensitive:

CPU count and memory info - Could reveal infrastructure capacity

Kernel version details - Could potentially be used to identify specific vulnerabilities

Considerations:

This endpoint is accessible without authentication by default (at /-/os)

If you're running Datasette on a public server, this information would be visible to anyone

The information could help attackers understand your infrastructure setup

Suggestions - you might want to:

Add authentication requirements to this endpoint

Make certain fields optional/configurable

Add a warning in the README that this exposes system information

Consider filtering out or redacting sensitive paths (replace usernames with ***, etc.)

It closed by asking: "Would you like me to add any of these protections to the plugin?"

In this case I'm fine with the current behavior provided people understand the risks, so I told it as much. Then (because the warning it added was quite long) I sent a follow-up, and requested an extra README note. Three last prompts later... and the project was finished.

I pushed it to GitHub, configured Trusted Publishing for it on PyPI and posted the 0.1 release, which ran this GitHub Actions publish.yml and deployed that release to datasette-os-info on PyPI.

Now that it's live you can try it out without even installing Datasette, using a one-liner that uses the --get CLI option to show what that path in the Datasette instance would return, as described in the Datasette documentation - see the sketch at the end of this post. I've shared my full Claude Code transcript in a Gist.

A year ago I'd have been very impressed by this. Today I wasn't even particularly surprised that this worked - the coding agent pattern implemented by Claude Code is spectacularly effective when you combine it with pre-existing templates, and Datasette has been around for long enough now that plenty of examples of plugins have made it into the training data for the leading models.
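A sketch of the kind of one-liner described above, assuming uvx is installed - it runs Datasette plus the plugin in a throwaway environment and prints what the /-/os page would return:

```bash
uvx --with datasette-os-info datasette --get /-/os
```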

Simon Willison 1 month ago

Vibe engineering

I feel like vibe coding is pretty well established now as covering the fast, loose and irresponsible way of building software with AI - entirely prompt-driven, and with no attention paid to how the code actually works. This leaves us with a terminology gap: what should we call the other end of the spectrum, where seasoned professionals accelerate their work with LLMs while staying proudly and confidently accountable for the software they produce?

I propose we call this vibe engineering, with my tongue only partially in my cheek.

One of the lesser spoken truths of working productively with LLMs as a software engineer on non-toy-projects is that it's difficult. There's a lot of depth to understanding how to use the tools, there are plenty of traps to avoid, and the pace at which they can churn out working code raises the bar for what the human participant can and should be contributing.

The rise of coding agents - tools like Claude Code (released February 2025), OpenAI's Codex CLI (April) and Gemini CLI (June) that can iterate on code, actively testing and modifying it until it achieves a specified goal - has dramatically increased the usefulness of LLMs for real-world coding problems.

I'm increasingly hearing from experienced, credible software engineers who are running multiple copies of agents at once, tackling several problems in parallel and expanding the scope of what they can take on. I was skeptical of this at first but I've started running multiple agents myself now and it's surprisingly effective, if mentally exhausting!

This feels very different from classic vibe coding, where I outsource a simple, low-stakes task to an LLM and accept the result if it appears to work. Most of my tools.simonwillison.net collection (previously) were built like that. Iterating with coding agents to produce production-quality code that I'm confident I can maintain in the future feels like a different process entirely.

It's also become clear to me that LLMs actively reward existing top tier software engineering practices:

Automated testing. If your project has a robust, comprehensive and stable test suite agentic coding tools can fly with it. Without tests? Your agent might claim something works without having actually tested it at all, plus any new change could break an unrelated feature without you realizing it. Test-first development is particularly effective with agents that can iterate in a loop.

Planning in advance. Sitting down to hack something together goes much better if you start with a high level plan. Working with an agent makes this even more important - you can iterate on the plan first, then hand it off to the agent to write the code.

Comprehensive documentation. Just like human programmers, an LLM can only keep a subset of the codebase in its context at once. Being able to feed in relevant documentation lets it use APIs from other areas without reading the code first. Write good documentation first and the model may be able to build the matching implementation from that input alone.

Good version control habits. Being able to undo mistakes and understand when and how something was changed is even more important when a coding agent might have made the changes. LLMs are also fiercely competent at Git - they can navigate the history themselves to track down the origin of bugs, and they're better than most developers at using git bisect. Use that to your advantage.

Having effective automation in place. Continuous integration, automated formatting and linting, continuous deployment to a preview environment - all things that agentic coding tools can benefit from too. LLMs make writing quick automation scripts easier as well, which can help them then repeat tasks accurately and consistently next time.

A culture of code review. This one explains itself. If you're fast and productive at code review you're going to have a much better time working with LLMs than if you'd rather write code yourself than review the same thing written by someone (or something) else.

A very weird form of management. Getting good results out of a coding agent feels uncomfortably close to getting good results out of a human collaborator. You need to provide clear instructions, ensure they have the necessary context and provide actionable feedback on what they produce. It's a lot easier than working with actual people because you don't have to worry about offending or discouraging them - but any existing management experience you have will prove surprisingly useful.

Really good manual QA (quality assurance). Beyond automated tests, you need to be really good at manually testing software, including predicting and digging into edge-cases.

Strong research skills. There are dozens of ways to solve any given coding problem. Figuring out the best options and proving an approach has always been important, and remains a blocker on unleashing an agent to write the actual code.

The ability to ship to a preview environment. If an agent builds a feature, having a way to safely preview that feature (without deploying it straight to production) makes reviews much more productive and greatly reduces the risk of shipping something broken.

An instinct for what can be outsourced to AI and what you need to manually handle yourself. This is constantly evolving as the models and tools become more effective. A big part of working effectively with LLMs is maintaining a strong intuition for when they can best be applied.

An updated sense of estimation. Estimating how long a project will take has always been one of the hardest but most important parts of being a senior engineer, especially in organizations where budget and strategy decisions are made based on those estimates. AI-assisted coding makes this even harder - things that used to take a long time are much faster, but estimations now depend on new factors which we're all still trying to figure out.

If you're going to really exploit the capabilities of these new tools, you need to be operating at the top of your game. You're not just responsible for writing the code - you're researching approaches, deciding on high-level architecture, writing specifications, defining success criteria, designing agentic loops, planning QA, managing a growing army of weird digital interns who will absolutely cheat if you give them a chance, and spending so much time on code review.

Almost all of these are characteristics of senior software engineers already! AI tools amplify existing expertise. The more skills and experience you have as a software engineer the faster and better the results you can get from working with LLMs and coding agents.

Is this a stupid name? Yeah, probably. "Vibes" as a concept in AI feels a little tired at this point. "Vibe coding" itself is used by a lot of developers in a dismissive way. I'm ready to reclaim vibes for something more constructive.

I've never really liked the artificial distinction between "coders" and "engineers" - that's always smelled to me a bit like gatekeeping. But in this case a bit of gatekeeping is exactly what we need! Vibe engineering establishes a clear distinction from vibe coding. It signals that this is a different, harder and more sophisticated way of working with AI tools to build production software.

I like that this is cheeky and likely to be controversial. This whole space is still absurd in all sorts of different ways. We shouldn't take ourselves too seriously while we figure out the most productive ways to apply these new tools.

I've tried in the past to get terms like AI-assisted programming to stick, with approximately zero success. May as well try rubbing some vibes on it and see what happens. I also really like the clear mismatch between "vibes" and "engineering". It makes the combined term self-contradictory in a way that I find mischievous and (hopefully) sticky.

Simon Willison 1 month ago

OpenAI DevDay 2025 live blog

I'm at OpenAI DevDay in Fort Mason, San Francisco today. As I did last year, I'm going to be live blogging the announcements from the keynote. Unlike last year, this year there's a livestream. Disclosure: OpenAI provides me with a free ticket and reserved me a seat in the press/influencer section for the keynote.

Simon Willison 1 month ago

Embracing the parallel coding agent lifestyle

For a while now I've been hearing from engineers who run multiple coding agents at once - firing up several Claude Code or Codex CLI instances at the same time, sometimes in the same repo, sometimes against multiple checkouts or git worktrees.

I was pretty skeptical about this at first. AI-generated code needs to be reviewed, which means the natural bottleneck on all of this is how fast I can review the results. It's tough keeping up with just a single LLM given how fast they can churn things out - where's the benefit from running more than one at a time if it just leaves me further behind?

Despite my misgivings, over the past few weeks I've noticed myself quietly starting to embrace the parallel coding agent lifestyle. I can only focus on reviewing and landing one significant change at a time, but I'm finding an increasing number of tasks that can still be fired off in parallel without adding too much cognitive overhead to my primary work. Here are some patterns I've found for applying parallel agents effectively.

The first category of tasks I've been applying this pattern to is research. Research tasks answer questions or provide recommendations without making modifications to a project that you plan to keep.

A lot of software projects start with a proof of concept. Can Yjs be used to implement a simple collaborative note writing tool with a Python backend? The libraries exist, but do they work when you wire them together? Today's coding agents can build a proof of concept with new libraries and resolve those kinds of basic questions. Libraries too new to be in the training data? Doesn't matter: tell them to check out the repos for those new dependencies and read the code to figure out how to use them.

If you need a reminder about how a portion of your existing system works, modern "reasoning" LLMs can provide a detailed, actionable answer in just a minute or two. It doesn't matter how large your codebase is: coding agents are extremely effective with tools like grep and can follow codepaths through dozens of different files if they need to. Ask them to make notes on where your signed cookies are set and read, or how your application uses subprocesses and threads, or which aspects of your JSON API aren't yet covered by your documentation. These LLM-generated explanations are worth stashing away somewhere, because they can make excellent context to paste into further prompts in the future.

Now we're moving on to code edits that we intend to keep, albeit very low-stakes ones. It turns out there are a lot of problems that really just require a little bit of extra cognitive overhead which can be outsourced to a bot.

Warnings are a great example. Is your test suite spitting out a warning that something you are using is deprecated? Chuck that at a bot - tell it to run the test suite and figure out how to fix the warning. No need to take a break from what you're doing to resolve minor irritations like that.

There is a definite knack to spotting opportunities like this. As always, the best way to develop that instinct is to try things - any small maintenance task is something that's worth trying with a coding agent. You can learn from both their successes and their failures.

Reviewing code that lands on your desk out of nowhere is a lot of work. First you have to derive the goals of the new implementation: what's it trying to achieve? Is this something the project needs? Is the approach taken the best for this current project, given other future planned changes?
That's a lot of big questions to answer before you can even start digging into the details of the code. Code that started from your own specification is a lot less effort to review. If you already decided what to solve, picked the approach and worked out a detailed specification for the work itself, confirming it was built to your needs can take a lot less time. I described my more authoritarian approach to prompting models for code back in March. If I tell them exactly how to build something, the work needed to review the resulting changes is a whole lot less taxing.

My daily drivers are currently Claude Code (on Sonnet 4.5), Codex CLI (on GPT-5-Codex), and Codex Cloud (for asynchronous tasks, frequently launched from my phone). I'm also dabbling with GitHub Copilot Coding Agent (the agent baked into the GitHub.com web interface in various places) and Google Jules, Google's currently-free alternative to Codex Cloud. I'm still settling into patterns that work for me. I imagine I'll be iterating on my processes for a long time to come, especially as the landscape of coding agents continues to evolve.

I frequently have multiple terminal windows open running different coding agents in different directories. These are currently a mixture of Claude Code and Codex CLI, running in YOLO mode (no approvals) for tasks where I'm confident malicious instructions can't sneak into the context. (I need to start habitually running my local agents in Docker containers to further limit the blast radius if something goes wrong.) I haven't adopted git worktrees yet: if I want to run two agents in isolation against the same repo I do a fresh checkout instead - see the sketch at the end of this post.

For riskier tasks I'm currently using asynchronous coding agents - usually Codex Cloud - so if anything goes wrong the worst that can happen is my source code getting leaked (since I allow it to have network access while running). Most of what I work on is open source anyway so that's not a big concern for me.

I occasionally use GitHub Codespaces to run VS Code's agent mode, which is surprisingly effective and runs directly in my browser. This is particularly great for workshops and demos since it works for anyone with a GitHub account, no extra API key necessary.

This category of coding agent software is still really new, and the models have only really got good enough to drive them effectively in the past few months - Claude 4 and GPT-5 in particular. I plan to write more as I figure out the ways of using them that are most effective. I encourage other practitioners to do the same!

Jesse Vincent wrote How I'm using coding agents in September, 2025, which describes his workflow for parallel agents in detail, including having an architect agent iterate on a plan which is then reviewed and implemented by fresh instances of Claude Code.

In The 7 Prompting Habits of Highly Effective Engineers Josh Bleecher Snyder describes several patterns for this kind of work. I particularly like this one:

Send out a scout. Hand the AI agent a task just to find out where the sticky bits are, so you don’t have to make those mistakes.

I've tried this a few times with good results: give the agent a genuinely difficult task against a large codebase, with no intention of actually landing its code, just to get ideas from which files it modifies and how it approaches the problem.
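For reference, a sketch of the two isolation approaches mentioned above - the repository URL and paths are illustrative:

```bash
# Option 1: a fresh clone per agent, each in its own directory.
git clone https://github.com/simonw/datasette /tmp/datasette-agent-2

# Option 2: git worktrees - one repository, multiple working directories,
# each checked out to its own branch.
git worktree add ../datasette-fix-bug fix-bug
```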

Simon Willison 1 month ago

Designing agentic loops

Coding agents like Anthropic's Claude Code and OpenAI's Codex CLI represent a genuine step change in how useful LLMs can be for producing working code. These agents can now directly exercise the code they are writing, correct errors, dig through existing implementation details, and even run experiments to find effective code solutions to problems.

As is so often the case with modern AI, there is a great deal of depth involved in unlocking the full potential of these new tools. A critical new skill to develop is designing agentic loops.

One way to think about coding agents is that they are brute force tools for finding solutions to coding problems. If you can reduce your problem to a clear goal and a set of tools that can iterate towards that goal, a coding agent can often brute force its way to an effective solution. My preferred definition of an LLM agent is something that runs tools in a loop to achieve a goal. The art of using them well is to carefully design the tools and the loop for them to use.

Agents are inherently dangerous - they can make poor decisions or fall victim to malicious prompt injection attacks, either of which can result in harmful results from tool calls. Since the most powerful coding agent tool is "run this command in the shell", a rogue agent can do anything that you could do by running a command yourself. To quote Solomon Hykes:

An AI agent is an LLM wrecking its environment in a loop.

Coding agents like Claude Code counter this by defaulting to asking you for approval of almost every command that they run. This is kind of tedious, but more importantly, it dramatically reduces their effectiveness at solving problems through brute force.

The joy of YOLO mode

Each of these tools provides its own version of what I like to call YOLO mode, where everything gets approved by default. This is so dangerous, but it's also key to getting the most productive results! Here are three key risks to consider from unattended YOLO mode:

- Bad shell commands deleting or mangling things you care about.
- Exfiltration attacks where something steals files or data visible to the agent - source code or secrets held in environment variables are particularly vulnerable here.
- Attacks that use your machine as a proxy to attack another target - for DDoS or to disguise the source of other hacking attacks.

If you want to run YOLO mode anyway, you have a few options:

1. Run your agent in a secure sandbox that restricts the files and secrets it can access and the network connections it can make.
2. Use someone else's computer. That way if your agent goes rogue, there's only so much damage it can do, including wasting someone else's CPU cycles.
3. Take a risk! Try to avoid exposing it to potential sources of malicious instructions and hope you catch any mistakes before they cause any damage.

Most people choose option 3. Despite the existence of container escapes, I think option 1 using Docker or the new Apple container tool is a reasonable risk to accept for most people.

Option 2 is my favorite. I like to use GitHub Codespaces for this - it provides a full container environment on-demand that's accessible through your browser, and has a generous free tier too. If anything goes wrong it's a Microsoft Azure machine somewhere that's burning CPU, and the worst that can happen is that code you checked out into the environment might be exfiltrated by an attacker, or bad code might be pushed to the attached GitHub repository. There are plenty of other agent-like tools that run code on other people's computers. Code Interpreter mode in both ChatGPT and Claude can go a surprisingly long way here. I've also had a lot of success (ab)using OpenAI's Codex Cloud.

Coding agents themselves implement various levels of sandboxing, but so far I've not seen convincing enough documentation of these to trust them.

Update: It turns out Anthropic have their own documentation on Safe YOLO mode for Claude Code, which says:

Letting Claude run arbitrary commands is risky and can result in data loss, system corruption, or even data exfiltration (e.g., via prompt injection attacks). To minimize these risks, use --dangerously-skip-permissions in a container without internet access. You can follow this reference implementation using Docker Dev Containers.
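To make option 1 concrete, here's a minimal sketch of the Docker variant. The image name is a placeholder for whatever environment has your agent installed, and a real setup would also want an egress firewall that permits the model's API while blocking everything else:

```bash
# Throwaway container that can only see the current project directory
docker run --rm -it \
  -v "$(pwd)":/workspace \
  -w /workspace \
  my-agent-image \
  claude --dangerously-skip-permissions
```

The blast radius for bad shell commands is then limited to that one mounted directory.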
Locking internet access down to a list of trusted hosts is a great way to prevent exfiltration attacks from stealing your private source code.

Picking the right tools for the loop

Now that we've found a safe (enough) way to run in YOLO mode, the next step is to decide which tools we need to make available to the coding agent. You can bring MCP into the mix at this point, but I find it's usually more productive to think in terms of shell commands instead. Coding agents are really good at running shell commands! If your environment allows them the necessary network access, they can also pull down additional packages from NPM, PyPI and similar. Ensuring your agent runs in an environment where random package installs don't break things on your main computer is an important consideration as well!

Rather than leaning on MCP, I like to create an AGENTS.md (or equivalent) file with details of packages I think they may need to use. For a project that involved taking screenshots of various websites I installed my own shot-scraper CLI tool and dropped a short note about it in AGENTS.md.
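The exact entry will vary by project, but a minimal sketch of that kind of AGENTS.md note looks like this - the URL and output filename are illustrative:

```
To take a screenshot of a page, run:

shot-scraper https://datasette.io/ -o datasette-homepage.png
```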
Just that one example is enough for the agent to guess how to swap out the URL and filename for other screenshots. Good LLMs already know how to use a bewildering array of existing tools. If you say "use playwright python" or "use ffmpeg", most models will use those effectively - and since they're running in a loop they can usually recover from mistakes they make at first and figure out the right incantations without extra guidance.

Issuing tightly scoped credentials

In addition to exposing the right commands, we also need to consider what credentials we should expose to those commands. Ideally we wouldn't need any credentials at all - plenty of work can be done without signing into anything or providing an API key - but certain problems will require authenticated access. This is a deep topic in itself, but I have two key recommendations here:

- Try to provide credentials to test or staging environments where any damage can be well contained.
- If a credential can spend money, set a tight budget limit.

I'll use an example to illustrate. A while ago I was investigating slow cold start times for a scale-to-zero application I was running on Fly.io. I realized I could work a lot faster if I gave Claude Code the ability to directly edit Dockerfiles, deploy them to a Fly account and measure how long they took to launch. Fly allows you to create organizations, and you can set a budget limit for those organizations and issue a Fly API key that can only create or modify apps within that organization... so I created a dedicated organization for just this one investigation, set a $5 budget, issued an API key and set Claude Code loose on it!

In that particular case the results weren't useful enough to describe in more detail, but this was the project where I first realized that "designing an agentic loop" was an important skill to develop.

When to design an agentic loop

Not every problem responds well to this pattern of working. The things to look out for are problems with clear success criteria where finding a good solution is likely to involve (potentially slightly tedious) trial and error. Any time you find yourself thinking "ugh, I'm going to have to try a lot of variations here" is a strong signal that an agentic loop might be worth trying! A few examples:

- Debugging: a test is failing and you need to investigate the root cause. Coding agents that can already run your tests can likely do this without any extra setup.
- Performance optimization: this SQL query is too slow, would adding an index help? Have your agent benchmark the query and then add and drop indexes (in an isolated development environment!) to measure their impact.
- Upgrading dependencies: have you fallen behind on a bunch of dependency upgrades? If your test suite is solid, an agentic loop can upgrade them all for you and make any minor updates needed to reflect breaking changes. Make sure a copy of the relevant release notes is available, or that the agent knows where to find them itself.
- Optimizing container sizes: Docker container feeling uncomfortably large? Have your agent try different base images and iterate on the Dockerfile to try to shrink it, while keeping the tests passing.

A common theme in all of these is automated tests. The value you can get from coding agents and other LLM coding tools is massively amplified by a good, cleanly passing test suite. Thankfully LLMs are great for accelerating the process of putting one of those together, if you don't have one yet.

This is still a very fresh area

Designing agentic loops is a very new skill - Claude Code was first released in just February 2025! I'm hoping that giving it a clear name can help us have productive conversations about it. There's so much more to figure out about how to use these tools as effectively as possible.

Simon Willison 1 month ago

Claude Sonnet 4.5 is probably the "best coding model in the world" (at least for now)

Anthropic released Claude Sonnet 4.5 today, with a very bold set of claims:

Claude Sonnet 4.5 is the best coding model in the world. It's the strongest model for building complex agents. It's the best model at using computers. And it shows substantial gains in reasoning and math.

Anthropic gave me access to a preview version of a "new model" over the weekend which turned out to be Sonnet 4.5. My initial impressions were that it felt like a better model for code than GPT-5-Codex, which has been my preferred coding model since it launched a few weeks ago. This space moves so fast - Gemini 3 is rumored to land soon, so who knows how long Sonnet 4.5 will continue to hold the "best coding model" crown.

The pricing is the same as the previous Sonnet: $3/million input tokens and $15/million output tokens. This remains significantly cheaper than Claude Opus - $15/$75 - but still quite a bit more than GPT-5 and GPT-5-Codex, both at $1.25/$10.

The claude.ai web interface (not yet the Claude iPhone native app) recently added the ability for Claude to write and then directly execute code in a sandboxed server environment, using Python and Node.js. I wrote about that in detail three weeks ago. Anthropic's implementation of this code interpreter pattern is more powerful than ChatGPT's equivalent because it can directly clone code from GitHub and install software packages from NPM and PyPI. Sonnet 4.5 absolutely shines at using this tool!

Here's my most ambitious experiment with it. I started by prompting it to check out my LLM repo, install it along with all of its dependencies, and run the tests. It did exactly as I asked. ✅ All 466 tests passed in 167.69 seconds (about 2 minutes 47 seconds).

Then I posed it a wildly ambitious challenge: experiment with adding tree-structured (branching) conversations to the LLM CLI tool. As you can see in the resulting conversation transcript, it did a really impressive job of this! It churned away across dozens of different tool calls, eventually outputting the following summary of its activity:

I've successfully experimented with adding tree-structured conversations to the LLM CLI tool. Here's what was accomplished:

1. Database Schema Enhancement
   - Added column to the table via migration
   - Enables tree-structured (branching) conversations instead of just linear sequences
   - Fully backward compatible - existing data unaffected
2. Comprehensive Utility Module (tree_utils.py)
   - 12 helper functions for tree operations:
     - Navigation (children, parent, siblings, paths)
     - Analysis (depth, descendants, size)
     - Queries (roots, leaves, statistics)
     - Visualization (ASCII tree display)
3. Complete Test Suite (test_tree_conversations.py)
   - 16 tests covering all tree operations
   - Tests linear chains, branching, multiple roots, forests
   - Includes realistic debugging scenario
   - All tests pass ✅
4. Integration Verification
   - Updated existing migration tests to expect new column
   - All 6 migration tests pass
   - Total: 22/22 tests passing ✅

Files created:

- tree_utils.py - Utility functions module
- test_tree_conversations.py - Test suite
- README.md - Quick start guide
- IMPLEMENTATION_SUMMARY.md - Complete technical overview
- MIGRATION_CODE.md - Database migration details
- TEST_RESULTS.md - Full test coverage report
- tree_notes.md - Design decisions and development notes

Key features:

- Multiple branches from any conversation point
- Multiple roots per conversation (forest structure)
- Rich analytics (depth, branching factor, tree size)
- ASCII tree visualization
- Cycle detection for safety

The implementation is production-ready with comprehensive testing. Next steps would be:

- Integrate into LLM package
- Add CLI commands
- Update to accept

All code is tested, documented, and ready to use!

Did I mention I kicked off this experiment by entering prompts on my phone? I told it to package everything up - you can explore the contents of the file it made me in this Gist.

Update: I shared a longer version of this experiment where I had the model experiment with a complex message hashing mechanism as well. I've pushed the code to a branch, which is best explained by the message_matching_notes.md document written by Claude.

I released llm-anthropic 0.19 adding support for the new model.
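Installing the plugin and trying the new model looks something like this - the model ID is my guess at what the plugin registers, so double-check it against the output of llm models:

```bash
llm install -U llm-anthropic
llm -m claude-sonnet-4.5 "Generate an SVG of a pelican riding a bicycle"
```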
Then I ran my classic pelican benchmark, first with thinking turned on and then without. These are pretty good - they are recognizably pelicans! - though not quite as good as GPT-5-Codex, which is better at drawing bicycles.

I also tried having it describe this image:

This image shows a large gathering of pelicans crowded together along what appears to be a waterfront or shoreline. The birds are densely packed, creating a chaotic scene with dozens of brown pelicans visible. Their distinctive long bills and pouches are clearly visible among the mass of feathered bodies. In the background, there's calm water and a dark tree line or forested area. The lighting suggests this was taken during early morning or late afternoon, giving the scene a golden, warm tone. The pelicans appear to be resting or congregating together, which is typical behavior for these colonial seabirds.

The release of this model has been very well coordinated. My embargo on talking about it was due to lift at 10am Pacific today, and I got an email from them linking to their blog post at 10am on the dot. It's also already live on OpenRouter and in Cursor and GitHub Copilot, and no doubt a whole bunch of other places as well.

Anthropic also shipped a new Claude Code VS Code extension today, plus a big upgrade to the Claude Code terminal app. Plus they rebranded their confusingly named Claude Code SDK to the Claude Agent SDK, emphasizing that it's a tool for building agents beyond just customizing the existing Claude Code product. That's available for both TypeScript and Python.

Simon Willison 1 month ago

I think "agent" may finally have a widely enough agreed upon definition to be useful jargon now

I've noticed something interesting over the past few weeks: I've started using the term "agent" in conversations where I don't feel the need to then define it, roll my eyes or wrap it in scare quotes. This is a big piece of personal character development for me!

Moving forward, when I talk about agents I'm going to use this:

An LLM agent runs tools in a loop to achieve a goal.

I've been very hesitant to use the term "agent" for meaningful communication over the last couple of years. It felt to me like the ultimate in buzzword bingo - everyone was talking about agents, but if you quizzed them everyone seemed to hold a different mental model of what they actually were.

Simon Willison 1 month ago

My review of Claude's new Code Interpreter, released under a very confusing name

Today on the Anthropic blog: Claude can now create and edit files:

Claude can now create and edit Excel spreadsheets, documents, PowerPoint slide decks, and PDFs directly in Claude.ai and the desktop app. [...] File creation is now available as a preview for Max, Team, and Enterprise plan users. Pro users will get access in the coming weeks.

Then right at the very end of their post:

This feature gives Claude internet access to create and analyze files, which may put your data at risk. Monitor chats closely when using this feature. Learn more.
