Containerizing Agents
Simon Willison has been writing about using parallel coding agents ( blog ), and his post encouraged me to write down about my current workflow, which involves both parallelism, containerization, and web browsers. I’m spoiled by (and helped build) sketch.dev’s agent containerization , so, when I need to use other agents as well, I wrote a shell script to containerize them "just so." My workflow is that I run , and I find myself in a web browser, in the same git repo I was in, but now in a randomly named branch, in a container, in . The first pane is the agent, but there are other panes doing other stuff. When I'm done, I've got a branch to work with, and I merge/rebase/cherry-pick. Let's break up the pieces: First, my shell script is in my favorite shell scripting language, dependency-less python3. Python3 has the advantage of not requiring you to think about and is sufficiently available. Second, I have a customized Dockerfile with the dependencies my projects need. I don't minimize the container; I add all the things I want. Browsers, playwright, subtrace, tmux, etc. Third, U cross-mount my git repo itself into the container, and create a worktree inside the container. From the outside, this work tree is going to look "prunable", but that causes no harm, and there’s a new branch that corresponds to the agent’s worktree. I like worktrees more than remotes because they’re in the same namespace; you don’t need to "fetch" or "push" across them. It’s easy to lose changes when the container exits; I commit automatically on exit. It's also easy to lose the worktree if something calls on your behalf, but recovery is possible with and some fiddling. Fourth, I run tmux inside the container so that opening a shell in the container is as simple as opening a new pane. (Somehow, is too rich.) I'm used to sketch.dev's terminal pane to do the little git operation, take a look at a diff, run a server... tmux helps. Fifth, networking magic with Tailscale. I publish ports 8000-9999 (and 11111) on my tailnet, using the same randomly generated name as I've used for my container and my branch. You're inevitably working on a web app, and you inevitably need to actually look at it, and Docker networking is doable, but you have to pre-declare exposed ports, and avoid conflicts, and ... it's just not great for this use case. There are other solutions (ngrok, SSH port forwarding), but I already use Tailscale, so this works nicely. I originally started with tsnsrv , but then vibe-coded a custom thing that supports port ranges. is the userland networking library here, and the agents do a fine job one-shotting this stuff. Sixth, I use to expose my to my browser over the tailnet network. I'm used to having a browser-tab per agent, and this gives me that. (Terminal-based agents feel weird to me. Browsers are great at scrolling, expand/collapse widgets, cut and paste, word wrap of text, etc.) Seventh, I vibe-coded a headless browser tool called , which wraps the excellent chromedp library, which remote-controls a headless Chrome over it's debugging protocol. Getting the MCPs configured for playwright was finnicky, especially across multiple agents, and I'm experimenting with this command line tool to do the same. As I’ve written about before . Using agents in containers gives me two things I value: Isolation for parallel work. The agents can start processes and run tests and so forth without conflicting on ports or files. A bit more security. Even the Economist has now picked up on the Lethal Trifecta (or Simon Willison's original) . By explicitly choosing which environment variables I forward, and not sharing my cookies and my SSH keys, I’m exerting some control over what data and capabilities are exposed to the agent. We’re still playing with fire (can you break out of Colima? Sure! Can you edit my git repo? Sure! Break into my tailnet? Sorta.), but it’s a smaller, more controlled burn. If you want to try my nonsense, https://github.com/philz/ctr-agent . First, my shell script is in my favorite shell scripting language, dependency-less python3. Python3 has the advantage of not requiring you to think about and is sufficiently available. Second, I have a customized Dockerfile with the dependencies my projects need. I don't minimize the container; I add all the things I want. Browsers, playwright, subtrace, tmux, etc. Third, U cross-mount my git repo itself into the container, and create a worktree inside the container. From the outside, this work tree is going to look "prunable", but that causes no harm, and there’s a new branch that corresponds to the agent’s worktree. I like worktrees more than remotes because they’re in the same namespace; you don’t need to "fetch" or "push" across them. It’s easy to lose changes when the container exits; I commit automatically on exit. It's also easy to lose the worktree if something calls on your behalf, but recovery is possible with and some fiddling. Fourth, I run tmux inside the container so that opening a shell in the container is as simple as opening a new pane. (Somehow, is too rich.) I'm used to sketch.dev's terminal pane to do the little git operation, take a look at a diff, run a server... tmux helps. Fifth, networking magic with Tailscale. I publish ports 8000-9999 (and 11111) on my tailnet, using the same randomly generated name as I've used for my container and my branch. You're inevitably working on a web app, and you inevitably need to actually look at it, and Docker networking is doable, but you have to pre-declare exposed ports, and avoid conflicts, and ... it's just not great for this use case. There are other solutions (ngrok, SSH port forwarding), but I already use Tailscale, so this works nicely. I originally started with tsnsrv , but then vibe-coded a custom thing that supports port ranges. is the userland networking library here, and the agents do a fine job one-shotting this stuff. Sixth, I use to expose my to my browser over the tailnet network. I'm used to having a browser-tab per agent, and this gives me that. (Terminal-based agents feel weird to me. Browsers are great at scrolling, expand/collapse widgets, cut and paste, word wrap of text, etc.) Seventh, I vibe-coded a headless browser tool called , which wraps the excellent chromedp library, which remote-controls a headless Chrome over it's debugging protocol. Getting the MCPs configured for playwright was finnicky, especially across multiple agents, and I'm experimenting with this command line tool to do the same. Isolation for parallel work. The agents can start processes and run tests and so forth without conflicting on ports or files. A bit more security. Even the Economist has now picked up on the Lethal Trifecta (or Simon Willison's original) . By explicitly choosing which environment variables I forward, and not sharing my cookies and my SSH keys, I’m exerting some control over what data and capabilities are exposed to the agent. We’re still playing with fire (can you break out of Colima? Sure! Can you edit my git repo? Sure! Break into my tailnet? Sorta.), but it’s a smaller, more controlled burn.