Posts in Docker (18 found)
blog.philz.dev 1 month ago

Containerizing Agents

Simon Willison has been writing about using parallel coding agents ( blog ), and his post encouraged me to write up my current workflow, which involves parallelism, containerization, and web browsers. I'm spoiled by (and helped build) sketch.dev's agent containerization, so when I needed to use other agents as well, I wrote a shell script to containerize them "just so." My workflow is that I run the script, and I find myself in a web browser, in the same git repo I was in, but now in a randomly named branch, in a container, in tmux. The first pane is the agent, but there are other panes doing other stuff. When I'm done, I've got a branch to work with, and I merge/rebase/cherry-pick. Let's break up the pieces: First, my shell script is in my favorite shell scripting language, dependency-less python3. Python3 has the advantage of not requiring you to think about dependency management and is sufficiently widely available. Second, I have a customized Dockerfile with the dependencies my projects need. I don't minimize the container; I add all the things I want: browsers, playwright, subtrace, tmux, etc. Third, I cross-mount my git repo itself into the container, and create a worktree inside the container. From the outside, this worktree is going to look "prunable", but that causes no harm, and there's a new branch that corresponds to the agent's worktree. I like worktrees more than remotes because they're in the same namespace; you don't need to "fetch" or "push" across them. It's easy to lose changes when the container exits, so I commit automatically on exit. It's also easy to lose the worktree if something prunes worktrees on your behalf, but recovery is possible with some fiddling. Fourth, I run tmux inside the container so that opening a shell in the container is as simple as opening a new pane. I'm used to sketch.dev's terminal pane for doing the little git operations, taking a look at a diff, running a server... tmux helps. Fifth, networking magic with Tailscale.
I publish ports 8000-9999 (and 11111) on my tailnet, using the same randomly generated name as I've used for my container and my branch. You're inevitably working on a web app, and you inevitably need to actually look at it. Docker networking is doable, but you have to pre-declare exposed ports, and avoid conflicts, and... it's just not great for this use case. There are other solutions (ngrok, SSH port forwarding), but I already use Tailscale, so this works nicely. I originally started with tsnsrv, but then vibe-coded a custom thing that supports port ranges. Tailscale's userland networking library does the heavy lifting here, and the agents do a fine job one-shotting this stuff. Sixth, I expose my tmux session to my browser over the tailnet. I'm used to having a browser tab per agent, and this gives me that. (Terminal-based agents feel weird to me. Browsers are great at scrolling, expand/collapse widgets, cut and paste, word wrap of text, etc.) Seventh, I vibe-coded a headless browser tool, which wraps the excellent chromedp library, which remote-controls a headless Chrome over its debugging protocol. Getting the MCPs configured for playwright was finicky, especially across multiple agents, and I'm experimenting with this command-line tool to do the same. As I've written about before, using agents in containers gives me two things I value: Isolation for parallel work. The agents can start processes and run tests and so forth without conflicting on ports or files. A bit more security. Even the Economist has now picked up on the Lethal Trifecta (or Simon Willison's original). By explicitly choosing which environment variables I forward, and not sharing my cookies and my SSH keys, I'm exerting some control over what data and capabilities are exposed to the agent. We're still playing with fire (can you break out of Colima? Sure! Can you edit my git repo? Sure! Break into my tailnet? Sorta.), but it's a smaller, more controlled burn.
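The core trick (a random name shared by the container, the branch, and the tailnet hostname, plus a cross-mounted repo) can be sketched in dependency-less Python. This is not the author's actual script: the image name, mount path, and environment variable are assumptions for illustration.

```python
import random
import string

def random_name(prefix="agent"):
    """Short random name shared by the container, the branch, and the tailnet host."""
    suffix = "".join(random.choices(string.ascii_lowercase + string.digits, k=6))
    return f"{prefix}-{suffix}"

def docker_run_command(repo_path, name, image="my-agent-image"):
    """Assemble a docker run invocation that cross-mounts the git repo.
    The container's entrypoint is expected to run `git worktree add` for
    a branch named after the container."""
    return [
        "docker", "run", "-d",
        "--name", name,
        "-v", f"{repo_path}:/host-repo",   # cross-mount the repo itself
        "-e", f"BRANCH_NAME={name}",       # branch name == container name
        image,
    ]

name = random_name()
cmd = docker_run_command("/home/me/src/myproject", name)
print(" ".join(cmd))
```

Because the same random string names the container, the branch, and the tailnet host, finding "which agent is this?" is a single lookup in any of the three namespaces.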
If you want to try my nonsense, https://github.com/philz/ctr-agent .

1 view
Karan Sharma 1 month ago

State of My Homelab 2025

For the past five years, I have maintained a homelab in various configurations. This journey has served as a practical exploration of different technologies, from Raspberry Pi clusters running K3s to a hybrid cloud setup and eventually a cloud-based Nomad setup. Each iteration provided valuable lessons, consistently highlighting the operational benefits of simplicity. This article details the current state of my homelab. A primary motivation for this build was to dip my toes into “actual” homelabbing—that is, maintaining a physical server at home. The main design goal was to build a dedicated, reliable, and performant server that is easy to maintain. This led me to move away from complex container orchestrators like Kubernetes in favor of a more straightforward Docker Compose workflow. I will cover the hardware build, software architecture, and the rationale behind the key decisions. After considerable research, I selected components to balance performance, power efficiency, and cost. The server is designed for 24/7 operation in a home environment, making noise and power consumption important considerations. My previous setups involved Kubernetes and Nomad, but the operational overhead proved unnecessary for my use case. I have since standardized on a Git-based, Docker Compose workflow that prioritizes simplicity and transparency. The core of the system is a Git repository that holds all configurations. Each service is defined as a self-contained “stack” in its own directory. The structure is organized by machine, making it easy to manage multiple environments. This modular approach allows me to manage each application’s configuration, including its Compose file and any related files, as an independent unit. Deployments are handled by a custom script, with a command runner providing a convenient interface. The process is fundamentally simple: sync, then execute. Each machine’s connection settings are defined in its own config file, which can also contain hooks for custom actions.
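The sync-then-execute deploy flow can be sketched in Python. This is not the author's actual script: the hostname, user, paths, and the rsync/ssh command shapes are assumptions (only the machine name floyd-homelab-1 comes from the post).

```python
import shlex

# Hypothetical per-machine settings; the post keeps these in per-machine config files.
MACHINES = {
    "floyd-homelab-1": {"host": "homelab.internal", "user": "karan",
                        "remote_dir": "/opt/stacks"},
}

def deploy_commands(machine, stack, action="up"):
    """Build the two steps the post describes: rsync the stack's directory
    from the local Git repo, then run the matching compose action remotely."""
    m = MACHINES[machine]
    target = f"{m['user']}@{m['host']}"
    sync = f"rsync -az stacks/{machine}/{stack}/ {target}:{m['remote_dir']}/{stack}/"
    compose = {
        "up": "docker compose up -d",
        "restart": "docker compose restart",
        "down": "docker compose down",
        "destroy": "docker compose down --volumes",  # also removes persistent volumes
    }[action]
    remote = f"cd {m['remote_dir']}/{stack} && {compose}"
    execute = f"ssh {target} {shlex.quote(remote)}"
    return [sync, execute]

for c in deploy_commands("floyd-homelab-1", "immich"):
    print(c)
```

Keeping the deploy logic this small is the point: everything interesting lives in Git, and the script only moves files and runs Compose.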
The command runner makes daily operations trivial. This system provides fine-grained control over deployments, with support for starting, stopping, and restarting stacks, plus a destroy action that also removes persistent volumes. To keep the system consistent, I follow a few key patterns. The homelab comprises three distinct machines to provide isolation and redundancy. This distributed setup isolates my home network from the public internet and ensures that critical public services remain online even if the home server is down for maintenance. The following is a breakdown of the services, or “stacks,” running on each machine. A few key services that are central to the homelab are detailed further in the next section. I came across Technitium DNS after seeing a recommendation from @oddtazz, and it has been a revelation. For anyone who wants more than just basic ad blocking from their DNS server, it’s a game-changer. It serves as both a recursive and authoritative server, meaning I don’t need a separate recursive resolver to resolve from root hints. The level of configuration is incredible—from DNSSEC, custom zones, and SOA records to fine-grained caching control. The UI is a bit dated, but that’s a minor point for me given the raw power it provides. It is a vastly underrated tool for any homelabber who wants to go beyond Pi-hole or AdGuard Home. For a long time, I felt that monitoring a homelab meant spinning up a full Prometheus and Grafana stack. Beszel is the perfect antidote to that complexity. It provides exactly what I need for basic node monitoring—CPU, memory, disk, and network usage—in a simple, lightweight package. It’s incredibly easy to set up and provides a clean, real-time view of my servers without the overhead of a more complex system. For a simple homelab monitoring setup, it’s hard to beat. While Beszel monitors the servers from the inside, Gatus watches them from the outside. Running on an independent Hetzner VM, its job is to ensure my services are reachable from the public internet.
It validates HTTP status codes, response times, and more. This separation is crucial; if my entire home network goes down, Gatus is still online to send an alert to my phone. It’s the final piece of the puzzle for robust monitoring, ensuring I know when things are broken even if the monitoring service itself is part of the outage. Data integrity and recoverability are critical. My strategy is built on layers of redundancy and encryption. I chose BTRFS for its modern features. The two 4TB drives are mirrored in a RAID 1 array, providing redundancy against a single drive failure. The entire array is encrypted using LUKS2, with the key stored on the boot SSD for automatic mounting. This protects data at rest in case of physical theft or drive disposal. The appropriate mount options are set in fstab. RAID does not protect against accidental deletion, file corruption, or catastrophic failure. My backup strategy follows the 3-2-1 rule. Daily, automated backups are managed by systemd timers. Backups are encrypted and sent to Cloudflare R2, providing an off-site copy. R2 was chosen for its zero-cost egress, which is a significant advantage for restores. The backup script covers critical application data and the Docker Compose configurations. Each backup run reports its status to a healthchecks.io endpoint, which sends a push notification on failure. I must say I appreciate its generous free tier, which is more than sufficient for my needs. This homelab represents a shift in philosophy from exploring complexity to valuing simplicity and reliability. The upfront hardware investment of ~$1,200 is offset by eliminating recurring cloud hosting costs and providing complete control over my data and services. For those considering a homelab, my primary recommendation is to start with a simple, well-understood foundation. A reliable machine with a solid backup strategy is more valuable than a complex, hard-to-maintain cluster. The goal is to build a system that serves your needs, not one that you serve.
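The backup-plus-healthcheck pattern described above can be sketched as follows. The post does not name its backup tool, so restic is an assumption here, as are the R2 bucket URL and the healthchecks.io check UUID.

```python
HEALTHCHECK_URL = "https://hc-ping.com/your-check-uuid"  # hypothetical healthchecks.io check

def backup_plan(paths, repo="s3:https://ACCOUNT_ID.r2.cloudflarestorage.com/homelab-backups"):
    """Build a restic-style backup command plus the healthchecks.io ping URLs.
    (restic, the bucket name, and the account ID are assumptions, not the
    author's actual configuration.)"""
    cmd = ["restic", "-r", repo, "backup", *paths]
    return cmd, HEALTHCHECK_URL, HEALTHCHECK_URL + "/fail"

cmd, ok_url, fail_url = backup_plan(["/data/immich", "/opt/stacks"])
print(" ".join(cmd))
# A systemd-timer-driven wrapper would run cmd, then GET ok_url on success
# or fail_url on a non-zero exit status, so silence itself becomes an alert.
```

The separate `/fail` endpoint is what turns a crashed backup into a push notification rather than a silent gap in the schedule.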
CPU: The Ryzen 5 7600X provides a strong price-to-performance ratio. Its 6 cores offer ample headroom for concurrent containerized workloads and future experimentation.
Storage: The boot drive is a 500GB NVMe for fast OS and application performance. The primary storage consists of two 4TB HDDs in a BTRFS RAID 1 configuration. To mitigate the risk of correlated failures, I chose drives from different manufacturers (WD and Seagate) purchased at different times.
RAM: 32GB of DDR5-6000 provides sufficient memory for a growing number of services without risking contention.
Case & PSU: The ASUS Prime AP201 is a compact MicroATX case with a clean aesthetic suitable for a home office. The Corsair SF750 (80+ Platinum) PSU was chosen for its efficiency and to provide capacity for a future GPU for local LLM or transcoding workloads.
Sync: copies the specified stack’s directory from the local Git repository to a target directory on the remote machine.
Execute: runs the appropriate Docker Compose command on the remote machine.
Data Persistence: Instead of using Docker named volumes, I use host bind mounts. All persistent data for a service is stored in a dedicated directory on the host. This makes backups and data management more transparent.
Reverse Proxy Network: The Caddy stack defines a shared Docker network. Other stacks that need to be exposed to the internet are configured to join this network. This allows Caddy to discover and proxy them without exposing their ports on the host machine. I have written about this pattern in detail in a previous post.
Port Exposure: Services behind the reverse proxy use the expose directive in their Compose files to make ports available to Caddy within the Docker network. I avoid binding ports directly on the host unless absolutely necessary.
floyd-homelab-1 (Primary Server): The core of the homelab, running on the AMD hardware detailed above.
It runs data-intensive personal services (e.g., Immich, Paperless-ngx) and is accessible only via the Tailscale network.
floyd-pub-1 (Public VPS): A small cloud VPS that hosts public-facing services requiring high availability, such as DNS utilities, analytics, and notification relays.
floyd-monitor-public (Monitoring VPS): A small Hetzner VM running Gatus for health checks. Its independence ensures that I am alerted if the primary homelab or home network goes offline.
Actual: A local-first personal finance and budgeting tool.
Caddy: A powerful, enterprise-ready, open source web server with automatic HTTPS.
Gitea: A Git service for personal projects.
Glance: A dashboard for viewing all my feeds and data in one place.
Immich: A photo and video backup solution, directly from my mobile phone.
Karakeep: An app for bookmarking everything, with AI-based tagging and full-text search.
Owntracks: A private location tracker for recording my own location data.
Paperless-ngx: A document management system that transforms physical documents into a searchable online archive.
Silverbullet: A Markdown-based knowledge management and note-taking tool.
Caddy: Reverse proxy for the services on this node.
Beszel-agent: The agent for the Beszel monitoring platform.
Caddy: Reverse proxy for the services on this node.
Cloak: A service to securely share sensitive text with others.
Doggo: A command-line DNS Client for Humans, written in Golang.
Ntfy: A self-hosted push notification service.
prom2grafana: A tool to convert Prometheus metrics to Grafana dashboards and alert rules using AI.
Umami: A simple, fast, privacy-focused alternative to Google Analytics.
Checksumming: Protects against silent data corruption.
Copy-on-Write: Enables instantaneous, low-cost snapshots.
Transparent Compression: Compression saves space without significant performance overhead.
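The reverse-proxy-network and port-exposure patterns above can be sketched as the shape of two Compose files, modeled here as Python dicts since the original YAML did not survive extraction. The network name, image tags, and bind-mount path are assumptions.

```python
import json

PROXY_NET = "caddy_net"  # hypothetical name; the post doesn't state it

# The Caddy stack declares the shared network and is the only stack
# that binds ports on the host.
caddy_stack = {
    "services": {"caddy": {"image": "caddy:2",
                           "ports": ["80:80", "443:443"],
                           "networks": [PROXY_NET]}},
    "networks": {PROXY_NET: {"driver": "bridge"}},
}

# Other stacks join it as an external network and only `expose` their port,
# so Caddy can reach them with no host binding. Data lives in a host bind
# mount rather than a named volume, keeping backups transparent.
gitea_stack = {
    "services": {"gitea": {"image": "gitea/gitea:latest",
                           "expose": ["3000"],
                           "volumes": ["/data/gitea:/data"],
                           "networks": [PROXY_NET]}},
    "networks": {PROXY_NET: {"external": True}},
}

print(json.dumps(gitea_stack, indent=2))
```

Note the asymmetry: the proxy stack creates the network, every other stack marks it `external: True` and joins it.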

0 views
Kix Panganiban 2 months ago

Cutting the cord on TV and movie subscriptions

In 2025, there's no longer a single subscription you can pay for to watch any new movie or TV show that comes out. Netflix, Disney, HBO, and even Apple now push you to pay a separate subscription just to watch that one new show that everyone's talking about -- and I'm sick of it. Thanks to a friend of mine, I recently got intrigued by the idea of seedboxing again. In a nutshell, instead of paying for 5 different streaming services, you pay a single fee to have someone in an area with lax torrenting laws host a VPS for you -- where you can run a torrent client and a Plex server, download content, and stream it to your devices. I tried a few seedbox services, but the pricing didn't really work for me. And since I'm in the Philippines, many of them suffer from high latency, and even raw download speeds can be spotty. So I put my work hat on and decided to try spinning up my own media server, and I chose this stack: https://github.com/Rick45/quick-arr-Stack For people just getting into home media servers like myself, this stack can essentially be run as-is, with a few modifications to the env values as necessary. (For Windows users running this on WSL like me, you'll need to change all containers using host networking to bridge networking instead, and expose all ports one by one. Most of them only need one port, except for the Plex container, which lists them here.) Once it's up, you get: The quick-arr-Stack repo has a much longer and more thorough explanation of each component, as well as how to configure them. Once it's all up and running -- you now have access to any TV show or movie that you want, without paying ridiculous subscription fees to all those streaming apps!
Deluge -- the torrent client.
Plex Media Server -- this should be obvious unless you don't know what Plex is; it hosts all your downloaded content and allows you to access it via the Plex apps or through a web browser.
Radarr -- a tool for searching for movies, presented in a much nicer interface than manually searching for individual torrents.
Sonarr -- a tool for searching for TV shows; Radarr is actually a fork of Sonarr, not the other way around.
Prowlarr -- converts search requests from Radarr and Sonarr into torrent downloads in Deluge.
Bazarr -- automatically downloads subtitles for your downloaded media.
Torrenting copyrighted material is illegal in most jurisdictions. That should be obvious. Check your local laws to make sure you're not breaking any. The stack includes an optional VPN client, which you could use if you want to be less detectable. You'll need to configure the right torrent trackers in Prowlarr. Some are great for movies, some for TV shows, and there are different ones for anime. There doesn't seem to be a single tracker that does it all. Even then, some trackers might not work. For example, 1337x's Cloudflare firewall is blocking Prowlarr. Not all movies and TV shows will be easy to find, so if you're looking for some obscure media, you might need to go with a Usenet indexer. This setup requires a pretty stable internet connection (with headroom for both your torrenting and your regular use), and tons of storage. Depending on how much media you're downloading, you'll probably need to delete watched series consistently or use extremely large drives. Diagnosing issues (Prowlarr can't see Sonarr! Plex isn't updating! Downloads aren't appearing in Deluge!) requires some understanding of Docker containers, Linux, and a bit of command-line work. It's certainly not impossible, but might be off-putting for beginners.
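The WSL workaround mentioned above (host networking to bridge networking, with ports listed one by one) can be sketched as a small transform over a Compose service definition. The port number is illustrative; the real Plex container publishes several more.

```python
def to_bridge_networking(service, ports):
    """Return a copy of a Compose service definition rewritten from host
    networking to bridge networking: drop network_mode and publish each
    port explicitly, host:container, as the WSL workaround requires."""
    svc = dict(service)
    svc.pop("network_mode", None)
    svc["ports"] = [f"{p}:{p}" for p in ports]
    return svc

# Plex's main port; consult the container's docs for the full list.
plex = {"image": "plexinc/pms-docker", "network_mode": "host"}
print(to_bridge_networking(plex, [32400]))
```

The same transform applies to every service in the stack; most need only a single port, which keeps the change mechanical.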

0 views
ryansouthgate.com 2 months ago

A secure & efficient Node/npm in Docker setup for frontend development

Intro
I’ve been searching for a secure and efficient Node in Docker setup for a little while now. And I think I’ve found it! A couple of years ago, I’d install the latest LTS version of Node/npm on my Windows machine and be done with it. In my day job, I could be working on one of 4 different front end projects, and have my own front end projects I’d work on in my spare time.

0 views
//pauls dev blog 7 months ago

How To Secure Your Docker Environment By Using a Docker Socket Proxy

In my previous article, I talked about Docker security tips that are often overlooked while working with Docker. One critical flaw was mounting the Docker Socket directly into containers, which can be a major vulnerability. I recommended using a Docker Socket Proxy, a simple but powerful solution. In this article, we will dive deeper into this topic. We will explore the Docker Socket Proxy and learn why exposing the socket directly to a container is a dangerous practice. We will use the proxy to harden our container infrastructure. Additionally, we will make use of the newly deployed Docker Socket Proxy in a containerized application (Portainer) to communicate with the Docker API in a safe way. Let’s make our containers smarter and safer. In any Docker environment, the Docker Socket refers to a special file within the Linux OS that allows direct communication with the Docker daemon. The Docker daemon is the core process that manages every container on the host system. This means that every application (or container) that has access to the Docker Socket can fully manage containers (e.g., create, stop, delete, or modify them), which results in full control over the entire Docker environment. If the Docker Socket is mounted inside a Docker container (which many containers like Portainer want for convenience), we are giving that container full control over the Docker daemon. Now, if an attacker gains access to our container, they can do serious damage (see the list of risks below). This also works if the Docker socket is mounted as read-only, because attackers can often find ways to execute commands through it. In contrast to the raw Docker Socket, the Docker Socket Proxy acts as a middleware between our containers and the Docker Socket. This results in not exposing the Docker Socket directly to all our containers.
With the Docker Socket Proxy, we gain several capabilities (listed below), and using one in our Docker environment introduces several benefits. To set up our own Docker Socket Proxy, we will use the LinuxServer.io version of the Docker Socket Proxy to restrict the API access. To set it up, we first need to create a Compose file. In this Compose file, we use several environment variables that permit the Docker Socket Proxy to do certain tasks. If we want to allow other functionalities, we can find the setting in this list and add it to our environment variables. Before starting the Docker Socket Proxy, we have to create an external network, because it was specified as one in the Compose file. We can do this with the docker network create command. Now that we have the network created, we can start the Compose file with the Docker Socket Proxy. Assuming that the Docker Socket Proxy is running, we can start to configure and deploy Portainer with the Docker Socket Proxy instead of directly accessing the Docker Socket. To do this, we have to create a second Compose file for Portainer. In this Compose file, there are two very important settings that differ from a normal Portainer deployment with direct access to the Docker Socket: 1. The command uses a TCP connection to the socket proxy container. 2. The network that is used is the previously created external network. To start the Portainer Docker container, we run the Compose file. After some seconds, the command should be finished, and Portainer is started. To access Portainer, we can switch to our browser and open the configured URL. In this very short tutorial, we learned how to set up the Docker Socket Proxy container, use it within a Portainer instance to securely access the Docker API, and allow minimal exposure. This should add an extra layer of trust and security. But why is this secure?
So, in conclusion, we can say that using a Docker Socket Proxy is a much safer alternative to the normal approach where we expose the Docker socket directly. Additionally, it is very easy to do, and we should take the extra step to secure our environment. Let's all make our environments more secure 😊🚀! I hope you find this tutorial helpful and can and will use the Docker Socket Proxy in your Docker environment. Also, I would love to hear your ideas and thoughts about the Docker Socket Proxy. If you can provide more information or have other important settings to set, don’t hesitate to comment here. If you have any questions, please write them down below. I try to answer them if possible. Feel free to connect with me on Medium, LinkedIn, Twitter, BlueSky, and GitHub.
With access to the Docker Socket, an attacker can:
Delete and modify every container in our Docker environment
Create new compromised Docker containers that run harmful software
Escape from the container to take over the host
With the Docker Socket Proxy, we can:
Limit all actions that the container can do by granting only permissions that are necessary
Allow only specific containers to use the proxy instead of exposing the full socket to everyone
Run a small, well-defined proxy container that is controlled and can be used to access the Docker API
Its benefits:
It minimizes the risk by restricting which actions can be done by applications.
It prevents direct access to the Docker daemon, which reduces attack surfaces in our environment.
It allows controlled and secure automation while still maintaining security.
Why is this secure?
✅ It prevents full Docker socket exposure. This restricts API access, so Portainer (or other containers) only gets the permissions it needs.
✅ It reduces the attack surface, because no direct socket access is possible and sensitive API actions are blocked.
✅ It works in Swarm mode, because it uses an overlay network for secure communication between containers.
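The shape of the two Compose files discussed above can be sketched as Python dicts (the original YAML did not survive extraction). The service and network names are assumptions, and the environment list is a minimal illustration of the socket-proxy convention (sections enabled with 1, disabled with 0); a real Portainer deployment typically needs more API sections enabled than shown here.

```python
# Compose file 1: the socket proxy. Only this container mounts the real
# socket, and it mounts it read-only.
socket_proxy_stack = {
    "services": {
        "socket-proxy": {
            "image": "lscr.io/linuxserver/socket-proxy:latest",
            "volumes": ["/var/run/docker.sock:/var/run/docker.sock:ro"],
            "environment": ["CONTAINERS=1", "IMAGES=1", "POST=0"],
            "networks": ["socket_proxy_net"],
        }
    },
    "networks": {"socket_proxy_net": {"external": True}},
}

# Compose file 2: Portainer. The two key differences from a normal
# deployment: it talks TCP to the proxy instead of mounting the socket,
# and it joins the previously created external network.
portainer_stack = {
    "services": {
        "portainer": {
            "image": "portainer/portainer-ce:latest",
            "command": "-H tcp://socket-proxy:2375",
            "networks": ["socket_proxy_net"],
        }
    },
    "networks": {"socket_proxy_net": {"external": True}},
}
```

The crucial property is visible in the dicts themselves: the Portainer service has no `volumes` entry at all, so even a fully compromised Portainer never touches the socket file.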

0 views
Asher Falcon 9 months ago

docker deployment

Recently I found myself wanting to deploy a docker-compose file to a server. I wanted to automate this process as much as possible, so I decided to write a shell script to do it. This post will go through the process of writing a shell script to deploy a docker-compose file to a server.
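A deploy script like the one the post describes can be sketched as two commands. This is not the author's script: the host, user, and remote directory are placeholders, and it assumes the server already has Docker and SSH access configured.

```python
import shlex

def deploy(compose_file, host, user="deploy", remote_dir="/srv/app"):
    """Build the two commands a minimal deploy script needs: copy the
    Compose file to the server, then pull images and (re)start the stack."""
    target = f"{user}@{host}"
    copy = f"scp {shlex.quote(compose_file)} {target}:{remote_dir}/docker-compose.yml"
    remote = f"cd {remote_dir} && docker compose pull && docker compose up -d"
    run = f"ssh {target} {shlex.quote(remote)}"
    return [copy, run]

for c in deploy("docker-compose.yml", "example.com"):
    print(c)
```

Because `docker compose up -d` is idempotent, re-running the same two commands doubles as both first deploy and update.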

0 views
//pauls dev blog 9 months ago

How To Self-Host FreshRSS Reader Using Docker

Have you ever thought about self-hosting your own RSS reader? Yes? That's good, you're in the right place. In this guide, we will walk through everything you need to know to self-host your own version of FreshRSS, a self-hosted, lightweight, easy-to-work-with, powerful, and customizable RSS reader. RSS stands for Really Simple Syndication and is a tool that can help us aggregate and read content from multiple websites in one place, updated whenever new posts are published on any of the websites added as RSS feeds. RSS readers can be used to stay updated and get the latest posts from blogs (including mine), news sites, or podcasts without visiting each site and without an algorithm filtering these sites "for our interests". The news/articles are simply ordered chronologically. This can save time because all the content we want is available in one place, and often it is usable offline, so you can read without an active internet connection. Another important feature of an RSS reader is privacy, because the RSS reader pulls the content directly from the website, which avoids tracking and ads. Unfortunately, not every website supports RSS feeds containing the complete article, but we can avoid being tracked for those that do. Nowadays, many different kinds of RSS readers exist, but most of them are hosted by different providers (see the list at the end). As usual with externally hosted services, much of the main content is blocked behind a paywall. Because of that, I decided to host my very own RSS reader. My choice was FreshRSS because it is lightweight, easy to work with, very powerful, and can be customized to my needs. Additionally, it is a multi-user application where I can invite my colleagues/friends or share lists for anonymous reading. Another cool feature is that there is an API for clients, including mobile apps!
Internally, FreshRSS natively supports basic web scraping based on XPath and JSON documents for websites that do not publish any RSS (Atom) feed. To follow every step within this tutorial and have a running FreshRSS service in the end, we need a running Docker or Docker Swarm environment with a configured load balancer like Traefik. If we want to use Docker in Swarm mode to host our services, we can follow this article to set it up: After using that article to set up our Docker Swarm, we should install and configure a Traefik load balancer (or something similar: NGINX, Caddy, etc.), which will grant our services Let's Encrypt SSL certificates and forward traffic to them within our Docker Swarm environment. To learn about this (and other important services) we could follow this tutorial: If we don't want to host our own Docker Swarm but still want to use Docker and Traefik to host FreshRSS, I have created the following tutorial, which shows how to set up a Traefik proxy that can be used to grant Let's Encrypt SSL certificates to services running in Docker containers: Before installing FreshRSS, we have to create an env file which will store every environment variable that is used (for our specific use case) in our Docker container. For more information about the available settings, see this GitHub reference. Now that we have everything properly set up, we can install FreshRSS on our server using either a simple deployment on one server or a Docker Swarm deployment. FreshRSS will be installed with Docker Compose. The Compose file contains the service name and all mandatory/important settings for Traefik to have a unique URL and an SSL certificate. To install FreshRSS within your Docker Swarm, you can paste the following code into your docker-compose.yml, which will be explained afterward.
Line 4: The rolling release, which has the same features as the official git branch, is used.
Line 5 - 8: This sets up log rotation for the container, with 10 megabytes as a maximum for a log file and a maximum of three log files.
Line 9 - 11: To persist all container data, two persistent volumes are used (data/extensions).
Line 12 - 13: Set the used network to my Traefik network.
Line 15 - 17: The service will only be deployed to a Docker Swarm node if a specific node label is true. This can be achieved by executing a node update command before deploying the docker-compose.yml to the stack.
Line 18 - 29: Set up a standard configuration for a service deployed on Port 80 in a Docker Swarm with Traefik and Let's Encrypt certificates. In Lines 22 and 25 a URL is registered for this service: rss.paulsblog.dev
Line 30: Enables using the env file.
Line 31 - 34: Configure environment variables used by FreshRSS, like the timezone (Line 32), the cronjob to update feeds (Line 33), and the production version (Line 34).
Line 36 - 46: Set environment variables which automatically pass arguments to the FreshRSS command-line install script. These are only used at the very first run. This means that if you want to change them, you have to delete the FreshRSS service and the volume, then restart the service.
Line 47 - 54: The used volumes and networks are defined, because this has to be done in Compose files.
As we will deploy this service into our Docker Swarm using an env file, we have to prepend a command that renders a new Compose file with all variables from the env file interpolated. To have a simple command, we pipe the output to the docker stack deploy command. If you do not have a running Docker Swarm, you can use a plain Docker Compose file instead: There are only two differences between this and the Docker Swarm Compose file. The first is in Lines 5 - 7, where a hostname, a container name, and the restart policy are set.
The other change is that the labels are removed from the deploy keyword and moved to the top level of the Compose file. This is done because deploy is only used in a Docker Swarm environment, while labels can also be used in a simple Docker environment. To start the container simply use: After deploying our own instance of FreshRSS we can switch to our favorite web browser, open https://rss.paulsblog.dev (or the domain you used), and log in with the credentials specified in the file (admin:freshrss123456). We should see something similar to this: The first thing we have to do is configure the FreshRSS instance. To do this, we press the cogwheel in the top right corner to open the settings: Now we have to do two things: On the main page, you can click the small + icon next to Subscription Management to open the following page: Before adding a feed we should create a category (e.g. Blogs). Then we can add an RSS feed for our most visited/interesting blog (-> https://www.paulsblog.dev/rss ). Pressing Add will open the settings page of the RSS feed, where we can configure it even further: On this page, we can add or change the description, check whether the feed is working, and set or update the category. If we press Submit we can go back to the article overview and will see all articles that were fetched from the RSS feed: Clicking on an article will show the complete article to read (if the author of the website has enabled full articles in the RSS feed; otherwise there is only a small preview). For example, on my page, the last article in the RSS feed before publishing this one could be read completely in FreshRSS: As mentioned earlier, we can access our self-hosted FreshRSS service from a mobile device. For Android users, I would recommend Readrops ( https://github.com/readrops/Readrops ) to access our FreshRSS. 
Readrops can be found in the Google Play Store and is completely free to use: https://play.google.com/store/apps/details?id=com.readrops.app As I am not an iOS user, I do not know which app is best, so you will have to find one in this wonderful list in the FreshRSS GitHub . Congratulations — we should now have our own FreshRSS instance installed (at least I do). If you want to improve the user experience or do more with your FreshRSS instance you can: This is the end of this tutorial. Once you finish setting up your personal FreshRSS instance, share your experience with me. I would love to read it! Also, if you want to give feedback about this tutorial or know any alternative self-hosted RSS readers, please comment here and explain the differences. And as always, if you have any questions, please ask them in the comments. I will answer them if possible. Feel free to connect with me on  Medium ,  LinkedIn ,  Twitter , and  GitHub . NewsBlur : NewsBlur is a "free" RSS reader that can be used on the web, iPad/iPhone, and Android and can follow 64 feeds. Like many others, it has a premium account that adds more "features". Inoreader : Inoreader is another "free" RSS reader that can be used to follow websites and read articles on their website. If you have an active subscription, you can also use their iOS or Android apps. With Inoreader you can follow 150 feeds. Feedly : Feedly is also a "free" RSS reader which offers a free plan that can be used on the web and iOS/Android and allows you to follow up to 100 feeds and organize them into 3 folders. Change the password for our admin. This is done in Account -> Profile Adjust the theme we want to use for our instance. This is done in Configuration -> Display Add my blog to your FreshRSS instance: https://www.paulsblog.dev/rss Add Krebs On Security Add LinuxServer.io Add BleepingComputer Invite your friends to share it
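As a rough illustration of the environment-file step from earlier in the tutorial — the variable names follow the FreshRSS Docker image documentation, and all values (timezone, schedule, domain) are placeholders, not the author's actual file:

```
TZ=Europe/Berlin          # timezone used inside the container
CRON_MIN=1,31             # minutes at which the built-in cron refreshes feeds
FRESHRSS_ENV=production   # run the production flavor
DOMAIN=rss.example.dev    # hypothetical variable interpolated into the Traefik labels
```

The Compose file then references these variables, so the same file can be reused across environments by editing only the env file.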

//pauls dev blog 9 months ago

How To Move Docker's Data Directory To Free Up Disk Space

Working with services like Docker, which store large amounts of data in locations that often do not have enough space, can be a very frustrating experience, especially if we are running out of disk space on our Linux server. Recently, I had a case where the partition on a host using Docker was running very low on disk space. This time, I had deployed so many containers on the host that the storage on the root partition (~15GB) was nearly full. Luckily, I had an additional disk mounted at with plenty of disk space. So I was searching for a simple and effective solution to my problem, which was moving Docker's default storage location to another directory on the disk mounted at to ensure that all Docker services can operate without problems and to prevent data loss. In this article, I want to explain all the steps I took to safely relocate Docker's data directory, so that everyone can do this without breaking their containerized applications if this problem ever occurs. When Docker is installed, its default behavior is to store container images, volumes, and other runtime data (logs) in , which will grow significantly over time because of: When hosting Linux servers, the root partition often has limited space because it should only contain the filesystem and configuration files. This results in Docker using up the remaining space very quickly, leading to slowdowns, crashes, or failures when launching new Docker containers. The solution is to move Docker's data directory to avoid these problems. Before we can move Docker's data directory we should stop the Docker service to prevent any data loss while moving the files. Running this command will ensure that no active processes are using Docker’s current data directory while we move it. Unfortunately, this will stop all services running on our host. If we do this in production we have to plan it in a maintenance window! 
After the Docker service is stopped we can add (or set) a setting value in the configuration file, which is normally located at . If we cannot find it there we should create it and add the following configuration: Now, to ensure that every already created container, all images, and all volumes keep working, we have to copy the contents of (the old path) to the new location using the following command: In this command, the flag preserves the permissions, symbolic links, and all corresponding metadata. The is used to not copy data from other mounted filesystems, and the property is used to simply copy the directory contents without creating an extra subdirectory. The next step is restarting the Docker service: This command will restart the Docker service and instruct Docker to use the new storage location which we defined in the . Finally, we should confirm that Docker is using the new directory and is working correctly. To check this, we can use the following command: This command should return the new directory path we set: Then, we should test that our containers, images, and volumes are working by simply listing them: As a last step, we can check the containers by accessing them. For example, if we host a website we can simply access it with the browser. In some cases moving Docker's data directory is not possible, and we can use different techniques to free up disk space and move files. The most important command for freeing up disk space while using Docker is: Docker provides built-in commands to remove unused data and reclaim space: This built-in Docker command removes every stopped container, all unused networks, dangling images, and the whole build cache (it won't delete the volumes!) . We should use it with caution, as it will delete every image that is not linked to a running Docker container! If is too dangerous, we can remove the individual parts separately. 
Remove unused networks: Remove unused volumes: Remove dangling images: As an alternative to moving the entire Docker directory, we can use Linux bind mounts to selectively move certain large parts of the Docker directory, like images or volumes. To do this, we simply create a Linux bind mount using the new path as the source and the Docker directory as the destination: This solution is very flexible because we are able to manage the storage without affecting the Docker runtime. Not having enough disk space in a Docker environment can lead to serious problems, as I learned in this "incident" . By freeing up disk space regularly and moving Docker's data directory to a separate large partition, we can keep the system and our containerized applications running. I hope this article gave you an easy step-by-step guide to apply to your Docker environment if your Docker data directory is consuming too much space. But remember, keeping an eye on your storage needs and proactively managing Docker's data, for example by enabling logrotate, will help you prevent these unexpected problems. As a best practice, I would recommend cleaning up unused Docker resources regularly and monitoring disk usage daily to avoid running into problems! To do this, we could use Grafana, Netdata, Checkmk (Nagios), Zabbix, MinMon, or simply a cron job. However, I would love to hear your feedback about this tutorial. Furthermore, if you have any alternative approaches, please comment here and explain what you have done. Also, if you have any questions, please ask them in the comments. I will try to answer them if possible. Feel free to connect with me on  Medium ,  LinkedIn ,  Twitter ,  BlueSky , and  GitHub . Log files and container data Persistent volumes containing application data (databases, files, etc.) Docker images and layers
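Pulling the relocation steps above together, here is a minimal sketch. The paths are assumptions (a real run needs root and a stopped Docker daemon), and the copy flags are demonstrated on scratch directories so their effect is visible without touching /var/lib/docker:

```shell
# /etc/docker/daemon.json would point Docker at the new location,
# e.g. (path is an assumption):  { "data-root": "/mnt/docker-data" }
#
# The copy itself: -a preserves permissions, symlinks, and metadata;
# -x stays on one filesystem; "src/." copies the directory's
# contents without creating an extra subdirectory.
src=$(mktemp -d); dst=$(mktemp -d)   # stand-ins for /var/lib/docker and the new disk
mkdir -p "$src/volumes"
echo 'data' > "$src/volumes/app.txt"
cp -ax "$src/." "$dst/"
cat "$dst/volumes/app.txt"           # prints: data
```

After the real copy, starting Docker again and running `docker info --format '{{ .DockerRootDir }}'` should report the new path.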

./techtipsy 11 months ago

The IPv6 situation on Docker is good now!

Good news, everyone! Doing IPv6 networking stuff on Docker is actually good now! I’ve recently started reworking my home server setup to be more IPv6 compatible, and as part of that I learned that during the summer of 2024 Docker shipped an update that eliminated a lot of the configuration and tweaking previously necessary to support IPv6. There is no need to change the daemon configuration any longer; it just works on Docker Engine v27 and later. If your host has a working IPv6 setup and you want to listen on port 80 on both IPv4 and IPv6, then you don’t have to do anything special. However, the container will only have an IPv4 address internally. You can verify it by listing all the Docker networks via and running for the one associated with your container. For services like that log the source IP address, this is problematic, as every incoming IPv6 request will be logged with the Docker network gateway IP address, such as . If you want the container to have both an IPv4 and an IPv6 address within the Docker network, you can create a new network and enable IPv6 in it. There are situations where it’s handy to have a static IP address for a container within the Docker network. If you need help coming up with a unique local IPv6 address range, you can use this tool. If you choose the host network driver, your container will operate within the same networking space as the container host. If the host handles both IPv4 and IPv6 networking, then your container will happily operate with both. However, due to reduced network isolation, this has some security implications that you must take into account. If you want your container to only accept connections on select interfaces, such as a Wireguard connection, then you will need to specify the IP addresses in the section. Here’s one example with both IPv4 and IPv6. I’ve given up on Podman. Before doing things the IPv6 way, Podman was functional for the most part, requiring a few tweaks to get things working. 
I have not managed to get Podman to play fair with IPv6. No matter what I did, I could not get it to listen to certain ports and access my services, the ports would always be filtered out. I’m genuinely happy to see that the IPv6 support has gotten better with Docker, and I hope that this short introduction helps those out there looking to do things the IPv6 way with containers.
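The "unique local IPv6 address range" generation mentioned above can be sketched in a few lines — this is roughly what such tools do per RFC 4193 (fd00::/8 plus 40 random bits gives a /48; subnet 0 of it is shown here as a /64):

```shell
# Draw 5 random bytes and format them as the ULA global ID.
raw=$(od -An -N5 -tx1 /dev/urandom | tr -d ' \n')
p1=$(echo "$raw" | cut -c1-2)
p2=$(echo "$raw" | cut -c3-6)
p3=$(echo "$raw" | cut -c7-10)
echo "fd${p1}:${p2}:${p3}::/64"    # e.g. fd3a:9c41:07b2::/64
```

The resulting range could then feed something like `docker network create --ipv6 --subnet fd3a:9c41:7b2::/64 ip6net` — `--ipv6` and `--subnet` are the documented flags for an IPv6-enabled user-defined network (the name `ip6net` is a placeholder).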

//pauls dev blog 1 year ago

How To Setup Docker Registry In Kubernetes Using Traefik v2

More than a year ago I created a tutorial on How To Install A Private Docker Container Registry In Kubernetes: In that tutorial, I used Traefik to expose the Docker Registry, which allows accessing the registry through HTTPS with a proper TLS certificate. Unfortunately, the tutorial does not work anymore because the of the Kubernetes resources used by Traefik has changed. Instead of using , all IngressRoutes now have to use . Because of this, I recreated the Docker Container Registry setup tutorial in a simplified form. For more information or explanations, refer to my previous tutorial . First, we create a namespace that we use for our registry: To add the PVC we create a file and add: Then let's deploy the file to our Kubernetes cluster using: Before deploying our Docker Container Registry we should create a secret that we use for authentication when pushing/pulling. To simplify this step I created a script that can be used for this purpose. Create a new file and add: To generate the files execute: This will create two files ( and ) in your destination folder ( ) with the needed strings. Now, to install and deploy the registry we will use Helm, the Kubernetes package manager. First, we will add the Helm repository and create a which will contain our specific data. 1. Create : 2. Add and Update Helm Repository 3. Install Docker Registry Now we can add the Traefik IngressRoute which will expose the Docker Container Registry. To do this we have to create a file called and add: This file can then be easily applied to our Kubernetes cluster by executing: The last step is to test whether the Docker Container Registry is working. To check this we can simply download any Docker image and push it to our newly set-up container registry. 
First, we pull an image from Docker Hub by executing: Then we tag the image with a custom name, adding our Docker Registry domain name as a prefix: Then we have to log in to our Docker Container Registry: Now, we can try to push our personal NGINX image to our private container registry by executing: If there are no errors, our Kubernetes Docker Container Registry is working and we can start using it. This simple update to my previously written tutorial, " How To Install A Private Docker Container Registry In Kubernetes ", should help you deploy a private Docker Container Registry in your Kubernetes cluster with a newer version of Traefik. If you need further information on how to apply this tutorial or have any questions, please ask them in the comments. I will try to answer them if possible. Feel free to connect with me on  Medium ,  LinkedIn ,  Twitter , and  GitHub .
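For reference, a hedged sketch of what the updated IngressRoute might look like — the names, namespace, host, port, and certificate resolver below are placeholders; the relevant point is the newer `apiVersion: traefik.io/v1alpha1` API group, which replaced the older `traefik.containo.us/v1alpha1` one:

```yaml
apiVersion: traefik.io/v1alpha1   # the newer Traefik CRD API group
kind: IngressRoute
metadata:
  name: docker-registry           # placeholder name
  namespace: registry             # placeholder namespace
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`registry.example.dev`)   # placeholder domain
      kind: Rule
      services:
        - name: docker-registry
          port: 5000
  tls:
    certResolver: letsencrypt     # assumes a resolver with this name exists
```

Applying a manifest like this with `kubectl apply -f` is what wires the registry service to Traefik under the new API version.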

fasterthanli.me 2 years ago

Extra credit

We’ve achieved our goals already with this series: we have a web service written in Rust, built into a Docker image with nix, with a nice dev shell, that we can deploy to fly.io . But there’s always room for improvement, and so I wanted to talk about a few things we didn’t bother doing in the previous chapters. When we ran our app locally, we signed up for a MaxMind account, and had to download and unpack the “Country” GeoLite2 database manually.

fasterthanli.me 2 years ago

Generating a docker image with nix

There it is. The final installment. Over the course of this series, we’ve built a very useful Rust web service that shows us colored ASCII art cats, we’ve packaged it with docker, and we’ve deployed it to https://fly.io . We did all that without using at all, and then in the last few chapters, we’ve learned to use , and now it’s time to say goodbye, along with this whole-ass :

Karan Sharma 3 years ago

Logging on Nomad with Vector

Application orchestrators like Nomad , Kubernetes, etc., allow you to deploy multiple applications on the same host. Logs are stored on the underlying node wherever the applications run. However, it’s pretty common to treat these instances as ephemeral if they’re part of an autoscaling group. Depending on the node’s availability to search these logs is therefore not practical, as the node can go down at any time. Moreover, in most cases, access to these nodes is limited to cluster administrators, not the maintainers (developers/application owners). In such cases, a log shipping agent must ship and aggregate logs from all cluster nodes and store them centrally (in Elasticsearch, ClickHouse, Loki, or similar). I’d recommend reading this excellent post by Adrian, who has explained how to set up a Vector logging pipeline for applications running with the task driver and ship the logs to Loki. For applications running with the task driver, Nomad piggybacks on the Docker daemon for configuring logging options. The Docker daemon supports many logging options; in my experience, the log driver works reliably well. However, this post is about tasks not using but any other driver (e.g. and ). Nomad doesn’t provide many logging configuration options for these drivers. The biggest issue is that Nomad writes the application’s stream to the log directory as-is, without annotating any metadata about the task. This means that if you have multiple applications running on one host, the log shipping agent will not be able to identify which application’s logs are being ingested. Consider this example. We’re running a simple web server using the driver: Once the alloc is running, we can find its IP address using: On sending an HTTP request using , we can see the logs that this webserver generated: Nomad stores the logs inside the application’s allocation directory, inside . To see the logs for the above allocation ID, we can use: As you can see, these logs are precisely what the command generates. 
Ideally, Nomad would have enriched these logs with metadata about the allocation ID, job name, namespace, the node it’s running on, etc., as noted in this GitHub issue . However, since that’s not yet available, I brainstormed a few different options: The first approach was to run as a sidecar next to the main task. This is the simplest option to begin with. Vector can be independently configured to add metadata for the logs collected from the allocation directory of the group. However, as with every sidecar deployment, there’s a lot of extra resource usage. For every 10 different groups, reserving resources for 10 Vector agents quickly eats up the available CPU/memory of the underlying node. A more critical downside, though, was asking every developer to also configure a Vector sidecar job. Keeping all these configs in sync to ensure they’re unified across namespaces is another headache. For these reasons, I discarded this option early on. However, if your deployment scale (the number of managed applications) is relatively small, this is actually not a bad idea. My next option was to listen to the Nomad events stream and generate a “vector” configuration template to collect logs and enrich them with metadata from the events stream. I developed v0.1 of nomad-vector-logger based on this concept. Since I’ve written a wrapper to collect events from Nomad using nomad-events-sink , it was relatively straightforward to extend it to generate a config. However, after testing in prod for a few days, I noticed that relying on the stream is unreliable. Nomad events are not WebSocket based (as of yet). It’s a simple long-polling mechanism which sends events to a Go channel as and when they happen. What happens when you miss an event? What happens when you run , which clears the events index? These were some of the challenges I faced with this v0.1 approach. There needs to be some sort of “reconcile” mechanism that periodically runs. 
A reconciliation loop that lists all allocations using the HTTP API can help whenever there are missing events. I also posted about the above program in Vector’s Discord group (they’re super active and helpful folks) and discussed this daemon with them. They suggested a simpler alternative: generating a CSV of running allocations instead of a config. Vector has support for Enrichment Tables , which means that it can “look up” a particular row in a CSV file and enrich the log event with the information found there. This seemed like a super cool idea, and I developed v0.2 using it. Super thankful to the Vector maintainers for giving me this idea! However, this approach had a few “subtle” drawbacks: The final v0.3.0 solution, which IMHO fixed all the above issues, was: Now that we’ve covered a few different approaches and the pros/cons of each, let’s see how works. Essentially, is meant to be deployed inside a Nomad cluster as a job. A system job in Nomad runs on each node. Whenever a new node gets added to the cluster, Nomad’s scheduler automatically schedules a copy of this program on the new node. This is the equivalent of a “DaemonSet” in K8s. uses Nomad’s HTTP API to query each node’s running allocations. Once it gets the list, it adds it to an internal map and signals that a config should be generated. The final config that is templated out looks like this: For people unfamiliar with Vector, it’s essentially doing 2 things: The Vector pipeline will send this event to another “transformer” which can further process the log event (e.g. parsing it as or JSON) and then finally send it to an upstream sink like Loki/Elasticsearch. Here’s an example of the before/after of the log line shown earlier in this post: Perfect! We’ve annotated the same log event with Nomad metadata, and Vector will be able to identify these logs. 
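The generated config isn't reproduced here, but its two jobs — tail the allocation log files and stamp Nomad metadata onto each event — might look roughly like this in Vector's config language (paths and field values are illustrative, not the actual template):

```toml
# Tail Nomad's per-allocation log files.
[sources.nomad_logs]
type = "file"
include = ["/opt/nomad/data/alloc/*/alloc/logs/*"]

# Stamp each event with metadata for its allocation.
[transforms.nomad_meta]
type = "remap"
inputs = ["nomad_logs"]
source = '''
.nomad.alloc_id  = "d5b21a2b"      # illustrative values; the real
.nomad.job_name  = "http-server"   # template fills these per alloc
.nomad.namespace = "default"
'''
```

A sink section (Loki, Elasticsearch, etc.) would then consume `nomad_meta` as its input.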
If you’re interested in a complete setup for deploying this to Nomad, take a look at the dev setup , which contains a Nomad jobspec to deploy as a sidecar with as the main task. Hope this post helped you start configuring a logging pipeline for applications running with non-Docker task drivers. doesn’t support live-reloading if the CSV file changes. has support for watching a config file for changes or sending a to reload. However, that only works for Vector’s own config files. Since the CSV file is an external file, Vector cannot watch it for changes. I came up with an ugly bash script hack that compared the hash of the file in a loop and, if it changed, sent a to Vector. All I can say is: it’s ugly, but it works . If you wish to see it, it’s available here in all its glory. The most significant issue was the chance of losing logs for the initial 10-20s of a new allocation. The above shell script had a because can be a bit CPU intensive to keep running frequently. Vector sees a new allocation and starts ingesting events. It tries to look up the CSV row by the allocation ID, but doesn’t find it yet in the CSV file, complains about it, and drops the log event. Thus, I had to drop the CSV idea in search of a more reliable approach. For people interested in this approach, you can check out the branch here . Skip the Nomad events stream. Since I have to build a reconciliation loop anyway, listening to events is just extra work without tangible benefits. I used a background goroutine to periodically refresh the list of allocations running on that node. Even if I fetch this list only once every 30s or so, that’s OK because Vector will start ingesting logs once the config gets generated. It will start reading the file from the beginning . So logs aren’t lost even if the config is templated much later than when the alloc began running. I added support for delaying the removal of an allocation from the file. 
If an allocation is stopped (e.g., a new version is deployed or the job is restarted), the program doesn’t immediately remove the allocation from the config file. The user can set a delay period which works like a cooling-down period, during which one can assume that Vector will have finished sending all logs to the upstream. This matters when the application generates logs faster than the upstream sink can accept them (e.g. if the upstream Elasticsearch gets slow). If we removed the allocation immediately whenever it stops, there’s a probability that Vector wouldn’t have read the file to the end . This cooling period helps ensure that doesn’t happen. It is not fool-proof, but it should cover most cases unless the upstream sink is dead for many hours. Get logs from a “file” source . The file path comes from (where all the logs for the task are located) It adds a JSON object with relevant keys.
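The hash-compare reload hack described above might look roughly like this — details are assumed from the post's description; the real script loops with a sleep and signals Vector on change:

```shell
f=$(mktemp)                          # stand-in for the enrichment CSV
echo 'alloc-1,job-a' > "$f"
last=$(cksum "$f" | cut -d' ' -f1)   # remember the current checksum

echo 'alloc-2,job-b' > "$f"          # ...the file changes on disk...
now=$(cksum "$f" | cut -d' ' -f1)

if [ "$now" != "$last" ]; then
  echo "changed"                     # real script: kill -HUP "$VECTOR_PID"
fi
```

Run on the two contents above, the checksums differ and `changed` is printed; in the real loop that branch would SIGHUP Vector instead.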

Fredrik Meyer 4 years ago

How to connect to localhost from a Docker container

At work I recently spent a lot of time trying to get a running Docker container to connect to a service running on . The setup is this: we have a Docker container that runs inside a Docker bridge network together with other Docker containers it can communicate with. On the host machine there is a service which listens on . For the purposes of this blog post, let’s just assume this service’s job is to stream messages to some server. I wanted one of the containers to connect to the service running on . In this post I will explain how I got it to work (only on Linux). The picture is something like this: Docker containers inside the same bridge network can easily communicate with each other by using their container ID as the host name. However, Docker container A cannot easily talk to something on on the host machine. There is even someone on StackOverflow saying that this is not possible: Without network: host you can’t go back from a container to the host. Only host to container. This is the main ideology behind containers. They are isolated for both stability and security reasons. This is a truth with modifications. The way Docker manages its networking is by suitably modifying . Which means that we can do the same! We are going to use to route traffic to the bridge gateway inside a container to on the host machine (the next steps are thanks to this very helpful StackOverflow answer). The IP of the bridge gateway can be found by running . To avoid worrying about the actual IP, you can start the container with . This will map the host name to the IP of the bridge gateway. First we have to find the name of the Docker bridge network. This is not the same as the ID you see when you do , but almost. If you use the default bridge network, the name should be . If you made your own network, the name will be (see the original code ). You can also get the name of the bridge by running . 
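The naming rule just described is easy to check: a user-defined network's bridge interface is `br-` plus the first 12 characters of the network ID (the default bridge is plain `docker0`). A tiny sketch, using a made-up network ID:

```shell
network_id="9f6b8b1a2c3d4e5f6a7b"   # hypothetical; a real one would come
                                    # from `docker network ls --no-trunc -q`
echo "br-$(echo "$network_id" | cut -c1-12)"   # prints: br-9f6b8b1a2c3d
```

The printed name is what you would see in `ip addr` on the host for that network.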
You can set the bridge name yourself when you create the network with the following command ( source ): (just remember that it can be at most 15 characters long). We need to allow routing from the bridge to localhost: Note that this has some mild security implications . The next step is to use to forward traffic destined for the bridge gateway to localhost. Once this is done, you can connect to the service on on the host machine by calling . Happy coding!

Can ELMA 4 years ago

Using ngrok in Docker

You can run multiple services as a single app with . If you are using ngrok in your development environment, you may also want to make it a Docker service. Should you? Probably not. But if you want to, this article shows how to do it. First of all, create a folder named . You should then create a symbolic link in that folder that references the actual ngrok executable. For instance: In the ngrok folder, you now have a symbolic link to the ngrok executable. Next, you should create a configuration file for ngrok. In the same ngrok folder, create a file like the one below: Then define your ngrok service in the file: Did you notice the section in the above file? You can find more about it in this article . You can access it on . You may also need to access the ngrok service from other services. Moreover, you may want to get the current ngrok address dynamically. Below is a sample Python snippet I wrote for this: It will return the public URL of the running ngrok instance according to the HTTP protocol you specified.
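The symlink step might look like this — the executable path is an assumption; point it at wherever your real ngrok binary lives:

```shell
mkdir -p ngrok                            # the folder holding the ngrok service files
ln -sf /usr/local/bin/ngrok ngrok/ngrok   # assumed binary location
readlink ngrok/ngrok                      # prints: /usr/local/bin/ngrok
```

The Docker build context can then pick up the link (or the file it points to) without hard-coding a machine-specific path in the Compose file.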

Can ELMA 4 years ago

Docker profiles: Scenario based services

Sometimes you need some services only in certain scenarios. Docker Compose has a handy feature to achieve this: profiles. Here's a sample file for the examples in the following sections. Profiles are assigned to two of the services in the above example. The service is assigned to the profile, while the service is assigned to the profile . Unlike the other services, the and services do not have profiles, so they always start. But the and services will only be started when you activate their profiles. The below command would start the and services by default. The following command would also start the service along with the and services, because we enabled the profile: The below command would enable multiple profiles, and , at once and start the and services besides the and services. You can also use the environment variable to specify which profiles to enable:
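A hedged reconstruction of the kind of Compose file the post describes — service names and images are placeholders; `profiles` is the documented Compose key:

```yaml
services:
  web:
    image: nginx:alpine     # no profile: always starts
  db:
    image: postgres:16      # no profile: always starts
  debug:
    image: busybox
    profiles: ["debug"]     # starts only when the "debug" profile is active
  seed:
    image: busybox
    profiles: ["seed"]      # starts only when the "seed" profile is active
```

With a file like this, `docker compose up` starts only web and db, `docker compose --profile debug up` adds debug, and `COMPOSE_PROFILES=debug,seed docker compose up` enables both extra services via the environment variable.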

ptrchm 4 years ago

Docker on macOS Without Performance Problems

The app I’m currently working on runs on Docker. It’s a medium-sized Rails monolith with a bunch of resource-heavy dependencies (Postgres, PostGIS, ElasticSearch, Redis, Next.js and a few more). Docker helps us make sure the whole stack works for everyone, every time. For those using Linux, it works without noticeable downsides. But on macOS, despite every possible performance tweak, Docker had been a huge pain. A MacBook with Docker will always run hot, the battery will drain in less than an hour, the fan speed is high enough for the laptop to take off, and I need an external SSD to fit all the images.
