Latest Posts (20 found)
Kaushik Gopal 2 weeks ago

Go with monthly AI subscriptions friends

Go with monthly AI subscriptions, friends. I can't remember where I read this tip, but given how fast the AI labs' models move, it's smarter to stick with a monthly plan instead of locking into an annual one, even if the annual price looks more attractive. I hit a DI issue on Android and was too lazy to debug it myself, so I pointed two models at it. GPT Codex gave me the cleanest, correct fix. Claude Sonnet 4.5 found a fix, but it wasn't idiomatic and was pretty aggressive with the changes. A month ago, I wouldn't have bothered with anything other than the Claude models for coding. Today, Codex clearly feels ahead. Google is about to ship its next Gemini model and, from what I'm hearing, it's going to be absurdly good. In these wonderfully unstable times, monthly subscriptions are the way to go.

Kaushik Gopal 2 weeks ago

Firefox + UbO is still better than Brave, Edge or any Chromium-based solution

I often find myself replying to claims that Brave, Edge, or other Chromium browsers effectively achieve the same privacy standards as Firefox + uBlock Origin (uBO). This is simply not true. Brave and other Chromium browsers are constrained by Google's Manifest V3. Brave works around this by patching Chromium and self-hosting some MV2 extensions, but it is still swimming upstream against the underlying engine. Firefox does not have these MV3 constraints, so uBlock Origin on Firefox retains more powerful, user-controllable blocking than MV3-constrained setups like Brave + uBO Lite.

Brave is an excellent product and what I used for a long time. But the comparison often ignores structural realities. There are important nuances that make Firefox the more future-proof platform for privacy-conscious users.

The core issue is Manifest V3 (MV3), Google's new extension architecture for Chromium (the engine Chrome, Brave, and Edge are built on). Under Manifest V2, blockers like uBO used the blocking version of the `webRequest` API (`webRequest` + `webRequestBlocking`) to run their own code on each network request and decide whether to cancel, redirect, or modify it. MV3 deprecates that blocking path for normal extensions and replaces it with the `declarativeNetRequest` (DNR) API: extensions must declare a capped set of static rules in advance, and the browser enforces those rules without running extension code per request. This preserves basic blocking but, as uBO's developer documents, removes whole classes of filtering capabilities uBO relies on. And Google is forcing this change by deprecating MV2. Yeah, shitty.

To get around the problem, Brave is effectively swimming upstream against its own engine. It does this in two ways:

- Native patching: It implements ad-blocking (Shields) natively in C++/Rust within the browser core to bypass extension limitations.
- Manual extension hosting: Brave now has to manually host and update specific Manifest V2 extensions (like uBO and AdGuard) on its own servers to keep them alive as Google purges them from the store.

They wrote a great post about this too. Brave is doing a great job, but it is operating with a sword of Damocles hanging over it. The team must manually patch a hostile underlying engine to maintain functionality that Firefox simply provides out of the box.

A lot of people also say: wait, we now have "uBlock Origin Lite" that does the same thing and is even more lightweight! It is "lite" for a reason. You are not getting the same blocking safeguards. uBO Lite is a stripped-down version necessitated by Google's API restrictions. As detailed in the uBlock Origin FAQ, the "Lite" version falls short in the following ways:

- No on-demand list updates: uBO Lite compiles filter lists into the extension package. The resulting declarative rulesets are refreshed only when the extension itself updates, so you cannot trigger an immediate filter-list or malware-list update from within the extension.
- No "Strict Blocking": uBO Lite does not support uBlock Origin's strict blocking modes or its per-site dynamic matrix. With full uBO on Firefox, my setup defines a custom, per-site rule set that ensures Facebook never sees my activity on other sites. uBO Lite does not let me express or maintain that kind of custom policy; I have to rely entirely on whatever blocking logic ships with the extension.
- No dynamic filtering: You lose the advanced matrix to block specific scripts or frames per site.
- Limited element picker: "Pointing and zapping" items requires specific, permission-gated steps rather than being seamless.
- No custom filters: You cannot write your own rules to block nearly anything, from annoying widgets to entire domains.

uBlock Origin is widely accepted as the most effective content blocker available. Its creator, gorhill, has explicitly stated that uBlock Origin works best on Firefox. So while using a browser like Brave is better than using Chrome or other browsers that lack a comprehensive blocker, it is not equivalent to Firefox + uBlock Origin. Brave gives you strong, mostly automatic blocking on a Chromium base that is ultimately constrained by Google's MV3 decisions. Firefox + uBlock Origin gives you a full-featured, user-controllable blocker on an engine that is not tied to MV3, which matters if you care about long-term, maximum control over what loads and who sees your traffic.
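To make the dynamic-filtering loss concrete: full uBO lets you write per-site rules in its `source destination type action` format. A sketch of the Facebook policy described above (my own approximation of the rule syntax; check uBO's wiki for exact semantics):

```
* facebook.com * block
* fbcdn.net * block
facebook.com facebook.com * noop
```

The first two rules block Facebook domains everywhere; the `noop` rule carves out an exception so Facebook itself still works. uBO Lite has no surface for expressing rules like these.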

Kaushik Gopal 3 weeks ago

Cognitive Burden

A common argument I hear against AI tools: "It doesn't do the job better or faster than me, so why am I using this again?" Simple answer: cognitive burden. My biggest unlock with AI was realizing I could get more done, not because I was faster, but because I wasn't racking my brain with needless tedium. Even if it took longer or needed more iterations, I'd finish less exhausted. That was the aha moment that sold me.

Simple example: when writing a technical post 1, I start with bullet points. Sometimes there's a turn of phrase or a bit of humor I enjoy, and I'll throw those in too. Then a custom agent trained on my writing generates a draft in my voice. After it drafts, I still review every single word. A naysayer might ask: "Well, if you're reviewing every single word anyway, why not just write the post from scratch?" Because it's dramatically easier and more enjoyable not to grind through and string together a bunch of prepositions to draft the whole post. I've captured the main points and added my creative touch; the AI handles the rest. With far less effort, I can publish more quickly — not due to raw speed, but because it's low‑touch and I focus only on what makes it uniquely me. Cognitive burden ↓.

About two years ago I pushed back on our CEO in a staff meeting: "Most of the time we engineers waste isn't in writing the code. It's the meetings, design discussions, working with PMs, fleshing out requirements — that's where we should focus our AI efforts first." 2 I missed the same point. Yes, I enjoy crafting every line of code and I'm not bogged down by that process per se, but there's a cognitive tax to pay. I'd even say I could still build a feature faster than some LLMs today (accounting for quality and iterations) before needing to take a break and recharge. Now I typically have 3–4 features in flight (with requisite docs, tests, and multiple variants to boot). Yes, I'm more productive. And sure, I'm probably shipping faster.
But that's correlation, not causation. Speed is a byproduct. The real driver is less cognitive burden, which lets me carry more. What's invigorated me further as a product engineer is that I'm spending a lot more time on actually building a good product. It's not that I don't know how to write every statement; it's just… no longer interesting. Others feel differently. Great! To each their own. For me, that was the aha moment that sold me on AI. Reducing cognitive burden made me more effective; everything else followed. I still craft the smaller personal posts from scratch. I do this mostly because it helps evolve my thinking as I write each word down — a sort of muscle memory formed over the years of writing here.  ↩︎ In hindsight, maybe not one of my finest arguments especially given my recent fervor. To be fair, while I concede my pushback was wrong, I don't think leaders then had the correct reasoning fully synthesized.  ↩︎

Kaushik Gopal 4 weeks ago

Standardize with ⌘ O ⌘ P to reduce cognitive load

There are a few apps on macOS in the text manipulation category that I end up spending a lot of time on. For example: Obsidian (for notes), Zed (text editor + IDE lite), Android Studio & IntelliJ (IDE++), Cursor (IDE + AI), etc. All these apps have two types of commands that I frequently use:

- Open a specific file or note
- Open the command palette (or find any action menu)

But by default, these apps use ever so slightly different shortcuts. One might use ⌘ P, another might use ⌘ ⇧ P, etc. I've found it incredibly helpful to take a few minutes and make these specific keyboard shortcuts the same everywhere. So now I use:

- ⌘ O – Open a file/note
- ⌘ P – Open the command palette (or equivalent action menu)

This small change has reduced cognitive load significantly. I no longer have to think about which app I'm in, and what the shortcut is for that specific app. Muscle memory takes over, and I can just get things done faster. Highly recommended!
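As one illustration, in Zed this remap is a few lines in `keymap.json`. The action names below are from memory and may not match your Zed version exactly — verify them via Zed's keymap docs or its "open keymap" command:

```json
[
  {
    "context": "Workspace",
    "bindings": {
      "cmd-o": "file_finder::Toggle",
      "cmd-p": "command_palette::Toggle"
    }
  }
]
```

Android Studio and IntelliJ expose the equivalent under Preferences → Keymap, and Obsidian under Settings → Hotkeys.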

Kaushik Gopal 1 month ago

Claude Skills: What's the Deal?

Anthropic announced Claude Skills and my first reaction was: "So what?" We already have AGENTS.md files, slash commands, nested instructions, and even MCPs. What's new here? But if Simon W thinks this is a big deal, then pelicans be damned; I must be missing something. So I dissected every word of Anthropic's eng. blog post to find what I missed. I don't think the innovation is what Skills does or achieves, but rather how it does it. This continues their push on context engineering as the next frontier.

Skills are simple markdown files with YAML frontmatter. What makes them different is the idea of progressive disclosure:

> Progressive disclosure is the core design principle that makes Agent Skills flexible and scalable. Like a well-organized manual that starts with a table of contents, then specific chapters, and finally a detailed appendix, skills let Claude load information only as needed:

So here's how it works:

1. Scan at startup: Claude scans available Skills and reads only their YAML descriptions (name, summary, when to use).
2. Build a lightweight index: This creates a catalog of capabilities with minimal token cost; think dozens of tokens per skill.
3. Load on demand: The full content of a Skill only gets injected into context when Claude's reasoning determines it's relevant to the current task.

This dynamic context loading mechanism is very token efficient; that's the interesting development here. In this token-starved AI economy, that's 🤑. Other solutions aren't as good in this specific way:

- Throw everything into AGENTS.md: You could add all the info directly and agents would load it at session start (✓ auto-discovered and loaded). The problem: loading everything fills up your context window fast, and your model starts outputting garbage unless you adopt other strategies (✗ static: all context loaded upfront, bloating the context window at scale). Not scalable.
- Nested AGENTS.md files: Place an AGENTS.md in each subfolder and agents read the nearest file in the tree. This splits context across folders and solves token bloat (✓ scoped to directories). But it's not portable across directories and creates an override behavior instead of true composition (✗ overrides behavior, not composition).
- Referenced files: Place instructions in separate files and reference them in AGENTS.md. This fixes the portability problem vs the nested approach (✓ organized and modular). But when referenced, the full content still loads statically (✗). Feels closest to Skills, but lacks the JIT loading mechanism.
- Slash commands (or their Codex equivalent): These let you provide organized, hyper-specific instructions to the LLM. You can even script sequences of actions, just like Skills (✓ powerful and procedural). The problem: these aren't auto-discovered. You must manually invoke them, which breaks agent autonomy (✗).
- MCPs: Skills handle 80% of MCP use cases with 10% of the complexity (✓ access to external data sources; ✗ heavyweight, vendor lock-in, overkill for procedural knowledge). You don't need a network protocol if you can drop a markdown file that says "to access the GitHub API, use this tool with these credentials." To be quite honest, I've never been a big fan of MCPs. I think they make a lot of sense for inter-service communication, but more often than not they're overkill.

Token-efficient context loading is the innovation. Everything else you can already do with existing tools. If this gets adoption, it could replace slash commands and simplify MCP use cases. I keep forgetting: this is for the Claude product generally (not just Claude Code), which is cool.

Skills is starting to solve the larger problem: "How do I give my agent deep expertise without paying the full context cost upfront?" That's an architectural improvement definitely worth solving, and Skills looks like a good attempt.
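For reference, a Skill is just a folder containing a SKILL.md whose YAML frontmatter is what gets indexed at startup. A minimal sketch — the skill name and body here are invented for illustration:

```markdown
---
name: changelog-writer
description: Drafts release changelogs from merged PRs. Use when asked to summarize changes for a release.
---

# Changelog writer

1. List changes since the last tag: `git log --oneline $(git describe --tags --abbrev=0)..HEAD`
2. Group commits into Features / Fixes / Chores.
3. Write one line per entry, past tense, referencing PR numbers.
```

Only the frontmatter (a few dozen tokens) is loaded upfront; the body below it is pulled into context on demand.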

Kaushik Gopal 1 month ago

Cargo Culting

If you're a software engineer long enough, you will meet some graybeards who throw out-of-left-field phrases to convey software wisdom. For example, you should know if you're yak-shaving or bike-shedding, and when that's even a good thing. A recent HN article 1 reminded me of another nugget – Cargo Culting (or Cargo Cult Programming).

Cargo Culting: ritualizing a process without understanding it. In the context of programming: the practice of applying a design pattern or coding style blindly without understanding the reasons behind it.

I'm going to take this opportunity to air one of my personal cargo-culting pet peeves, sure to kick up another storm: making everything small. When I get PR feedback saying "this class is too long, split this!", I get ready to launch into a tirade: you're confusing small with logically small – ritualizing line count without understanding cohesion. You can make code small by being terse: removing whitespace, cramming logic into one-liners, using clever shorthand 2. But you've just made it harder to read. A function that does one cohesive thing beats multiple smaller functions scattered across files.

As the parable goes, after the end of the Second World War, indigenous tribes believed that air delivery of cargo would resume if they carried out the proper rituals, such as building runways, lighting fires next to them, and wearing headphones carved from wood while sitting in fabricated control towers. While amusing on the surface, there's sadness if you dig into the history and contributing factors (value dominance, language & security barriers). I don't think that's reason to avoid the term altogether. We as humans sometimes have to embrace our dark history, acknowledge our wrongs and build kindness in our hearts. We cannot change our past, but we can change our present and future.

The next time someone on your team ritualizes a pattern without understanding it, you'll know what to call it. Who comes up with these terms anyway?
Now that you're aware of the term, you'll realize that the original article's use of the term cargo-cult is weak at best. In HN style, the comments were quick to call this out.  ↩︎ You know exactly what I'm thinking of, fellow Kotlin programmers.  ↩︎

Kaushik Gopal 1 month ago

ExecPlans – How to get your coding agent to run for hours

I've long maintained that the biggest unlock with AI coding agents is the planning step. In my previous post, I describe how I use a directory and ask the agent to diligently write down its tasks before and during execution. Most coding agents now include this as a feature; Cursor, for example, made it explicit recently. While that all felt validating, on a plane ride home I watched OpenAI's DevDay. One of the most valuable sessions was Shipping with Codex. Aaron Friel — credited with record-long sessions and token output — walked through his process and the idea of "ExecPlans." It felt similar at first, but I quickly realized this was some god-level planning.

He said OpenAI would release his PLANS.md soon, but I couldn't wait. On that flight, with janky wifi, I rebuilt what I could from the talk and grew my baby plan into something more mature — and I was already seeing better results. I pinged Aaron on Bluesky for the full doc, and he very kindly shared the PR that's about to get merged, with detailed information. My god, this thing is a work of art. Aaron clearly spent a lot of time honing it. I've tried it on two PRs so far, and it's working fantastically. I still need to put it through its paces on some larger work projects, but I feel comfortable preemptively calling it the gold standard for planning. I've made a few small tactical tweaks to how I use it:

- I instruct the agent to write plans to a dedicated directory (works across coding agents)
- In my AGENTS.md I tell agents to put temporary plans there (which I've gitignored)
- I keep the master PLANS.md Aaron shared alongside it

This is really a big unlock, folks. Try it now. The latest PLANS.md can be found in Aaron's PR. Use it as a template in your folder. Then instruct your agent via AGENTS.md to always write an ExecPlan when working on complex tasks. I highly recommend you go watch Aaron's part of the talk, Shipping with Codex. I'll update this post once it's merged or if anything changes.

Update: I've been using this for the last few days (~8 PRs so far) and on average I've definitely gotten my agents to run for much longer successfully (longest was about ~1 hour, but frequently >30 minutes). This is the way.
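The AGENTS.md wiring can be a short section like this. The paths and file names below are my own convention for illustration, not Aaron's:

```markdown
## Planning

For any non-trivial task, write an ExecPlan before writing code:

1. Read docs/PLANS.md and follow its format exactly.
2. Write the plan to .plans/<task-name>.md (.plans/ is gitignored).
3. Keep the plan updated while executing; mark steps done as you go.
```

Because every major coding agent reads AGENTS.md (or can be pointed at it), this one snippet makes the ExecPlan habit portable across tools.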

Kaushik Gopal 1 month ago

Job Displacement with AI — Software Engineers → Conductors

Engineers won't be replaced by tools that do their tasks better; they'll be replaced by systems that make those tasks nonessential. Sangeet Paul Choudary wrote an insightful piece on AI-driven job displacement and a more transformative way to think about it:

> To truly understand how AI affects jobs, we must look beyond individual tasks to comprehend AI's impact on our workflows and organizations. The task-centric view sees AI as a tool that improves how individual tasks are performed. Work remains structurally unchanged. AI is simply layered on top to improve speed or lower costs. …In this framing, the main risk is that a smarter tool might replace the person doing the task. The system-centric view, on the other hand, looks at how AI reshapes the organization of work itself. It focuses on how tasks fit into broader workflows and how their value is determined by the logic of the overall system. In this view, even if tasks persist, the rationale for grouping them into a particular job, or even performing them within the company, may no longer hold once AI changes the system's structure.

If we adopt a system-centric view, how does the role of a software engineer evolve? 1 I've had a notion for some time — the role will transform into a software "conductor", in the sense of music conductors:

> conducting is the art of directing the simultaneous performance of several players or singers by the use of gesture

The tasks a software conductor must master differ from those of today's software engineer. Here are some of the shifts I can think of: The craft is knowing exactly how much detail to provide in prompts: too little and models thrash; too much and they overfit or hallucinate constraints. You'll need to write spec-grade prompts that define interfaces, acceptance criteria, and boundaries — chunking work into units atomic enough for clear execution yet large enough to preserve context.
Equally critical: recognizing when to interrupt and redirect — catching drift early and steering with surgical edits rather than expensive reruns or loops. You’ll need to design systems that AI can both navigate and extend elegantly. This means clear module boundaries with explicit interfaces, descriptive naming that models can infer purpose from, and tests that double as executable specs. The goal: systems where AI agents can make surgical changes quickly and efficiently without cascading tech debt. We’re moving from building one solution to exploring many simultaneously. This unlocks three levels of experimentation: Feature variants — Build competing product approaches in parallel. One agent implements phone-only authentication while another builds traditional email/password. Both ship behind feature flags. Let users decide which wins. Implementation variants — Build the same feature with different architectures. Redis caching on path A, SQLite on path B. Run offline benchmarks and online canaries to measure which performs better under real load. Personalized variants — Stop looking for a single winner. The most radical shift: each user might get their own variant. Not just enterprise vs consumer, but individual-level personalization where the system learns what works for you specifically. Power users get keyboard shortcuts and dense information; casual users get guided flows with progressive disclosure. Users who convert on social proof see testimonials; analytical users see feature comparisons. AI makes the economics work — what was prohibitively expensive (maintaining thousands of personalized codepaths manually) becomes viable when AI generates, tests, and synchronizes variants automatically. The skill: running rigorous evals, measuring trade-offs with metrics, and orchestrating the complexity of multiple live variants. Every API call has a price, a latency budget, and quality trade-offs. 
You’ll need to master arbitrage between expensive reasoning models and cheaper models, knowing when to leverage MCPs, local tools, or cloud APIs. Learn how models approach refactors differently from new features or bug fixes, then tune prompts, context windows, and routing strategies accordingly. You’ll need to build golden test sets, trace model runs, classify failure modes, and treat evals like unit tests. Evaluation frameworks with baseline datasets, regression suites, and automated canaries that catch quality drift before production become non-negotiable. Without observability, you can’t iterate safely or validate that changes actually improve outcomes. Framework fluency loses value when AI handles syntax. What matters is depth in three areas: Core computer science fundamentals — Not because AI doesn’t know them, but because you need to verify AI made the right trade-offs for your specific constraints. AI might use quicksort when your dataset is always 10 items. It might optimize a function that runs once a day while missing the N+1 query in your hot path — where you loop through 1000 users making a database call for each instead of batching. Your value is code review with context: catching when AI optimizes for the wrong thing, knowing when simple beats clever, and spotting performance cliffs before they ship. Product judgment — Knowing which problem to solve, not just how to solve it. AI can build any feature you describe, but it can’t tell you whether that feature matters. Understanding user needs, prioritizing ruthlessly, and recognizing when you’re overbuilding becomes the bottleneck. Domain expertise — Deep knowledge of your problem space — whether it’s payments, healthcare, logistics, or graphics. AI can write generic code, but it struggles with domain-specific edge cases, regulations, and the unwritten rules experts know. The more niche your expertise, the harder you are to replace. These are the skills that matter for the next three years. 
But I don't have a crystal ball beyond that. At the pace AI is evolving, even conductors might become a role that AI plays better. The orchestration itself could be automated, leaving us asking the same questions about the next evolution. For now, learning to conduct is how we stay relevant. Companies will change how they ship too; but the nearer shift is the individual's role, so that's my focus for this post.  ↩︎

Kaushik Gopal 1 month ago

Sorting Prompts - LLMs are not wrong, you just caught them mid-thought

Good sensemaking processes iterate. We develop initial theories and note some alternative ones. We then take those theories and stack up the evidence for one against the other (or others). Even while doing that, we keep an eye out for other possible explanations to test. When new explanations stop appearing and we feel that the evidence pattern increasingly favors one idea significantly over another, we call it a day. LLMs are no different. What is often deemed a "wrong" response is merely a first pass at describing the beliefs out there. And the solution is the same: iterate the process. What I've found specifically is that pushing it to do a second pass, without putting a thumb on the scale, almost always leads to a better result. To do this I use what I call "sorting statements" that try to do a variety of things. Mike Caulfield is someone who cares about the veracity of information. The entire post is fascinating and has painted LLM search results in a new way for me. I now have a Raycast Snippet which expands to this: Already I'm seeing much better results.

Kaushik Gopal 1 month ago

Build your own /init command like Claude Code

Build your own `/init` command. Claude's `/init` makes it easy to add clear repo instructions. Build your own and use it with any agent to add or improve on an existing AGENTS.md. Here's the one I came up with.

Claude Code really nailed the onboarding experience for agentic coding. Open it, type `/init`, and you get a CLAUDE.md that delivers better results than a repo without proper system instructions (or an AGENTS.md). It's a clever way to ramp up on a repo fast. As I wrote last time, it hits one of the three levers for successful AI coding - seeding the right context. Even Codex CLI now comes with a built-in init prompt.

There's no secret 1 sauce: `/init` is just a strong prompt that writes (or improves) an instructions file. Here's the prompt, per folks who've reverse‑engineered it: You can write your own and get the same result. I use a custom init command on new repos to get up and running fast 2. I tweaked it to work across different coding agents and sprinkled in a few tips I collected along the way. It should create a relevant AGENTS.md; if one exists, it updates it. Save this prompt as a custom command and use it with any tool — Gemini CLI, Codex, Amp, Firebender, etc. You aren't stuck with any single tool. One more tip: a reasoning model works best for these types of commands.

I must say: the more time I spend with these tools, the more "emperor‑has‑no‑clothes" moments I have. Some of the ways these things work are deceptively simple.

Claude does a few other things, like instructing its inner agent tools (BatchTool & GlobTool) to collect related files and existing instruction files as context for generating or updating the result. But the prompt is the meat.  ↩︎ I used this prompt when I vibe‑engineered a maintainable Firefox add‑on.  ↩︎
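I won't reproduce the reverse-engineered prompt here, but the general shape of a custom init command file looks something like this — everything below is illustrative, not Claude's actual prompt or mine:

```markdown
---
description: Create or update this repo's AGENTS.md
---

Explore this repository, then write (or update) AGENTS.md at the root.
Capture only repo-specific facts:

- Exact build, test, and lint commands (verify each one runs first)
- High-level architecture: key directories and how they relate
- Conventions actually present in the code (naming, error handling, test style)
- Anything surprising a new engineer would trip over

Keep it short. No generic advice.
```

Drop a file like this into whatever custom-command directory your agent supports, and every tool gets the same onboarding behavior.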

Kaushik Gopal 1 month ago

Three important things to get right for successful AI Coding

I often hear AI coding feels inconsistent or underwhelming. I’m surprised by this because more often than not, I get good results. When working with any AI agent ( or any LLM tool ), there are really just three things that drive your results: This may sound discouragingly obvious, but being deliberate about these three (every time you send a request to Claude Code, ChatGPT etc.) makes a noticeable difference. …and it’s straightforward to get 80% of this right. LLMs are pocket‑sized world knowledge machines. Every time you work on a task, you need to trim that machine to a surgical one that’s only focused on the task at hand. You do this by seeding context. The simplest way to do this, especially for AI Coding: There are many other ways, and engineering better context delivery is fast becoming the next frontier in AI development 2 . Think of prompts as specs, not search queries. For example: ‘Write me a unit test for this authentication class’ 🙅‍♂️. Instead of that one‑liner, here’s how I would start that same prompt: I use a text‑expansion snippet, , almost every single time. It reminds me to structure any prompt: This structure forces you to think through the problem and gives the AI what it needs to make good decisions. Writing detailed prompts every single time gets tedious. So you might want to create “ command ” templates. These are just markdown files that capture your detailed prompts. People don’t leverage this enough. If your team maintains a shared folder of commands that everyone iterates on, you end up with a powerful set of prompts you can quickly reuse for strong results. I have commands like , , , etc. AI agents hit limits: context windows fill up, attention drifts, hallucinations creep in, results suffer. Newer models can run hours‑long coding sessions, but until that’s common, the simpler fix is to break work into discrete chunks and plan before coding. Many developers miss this. 
I can’t stress how important it is, especially when you’re working on longer tasks. My post covers this; it was the single biggest step‑function improvement in my own AI coding practice. Briefly, here’s how I go about it: One‑shot requests force the agent to plan and execute simultaneously — which rarely produces great results. If you were to submit these as PRs to your colleagues for review, how would you break them up? You wouldn’t ship 10,000 lines, so don’t do that with your agents either. Plan → chunk → execute → verify. So the next time you’re not getting good results, ask yourself these three things: I wrote a post about this btw, on consolidating these instructions for various agents and tools.  ↩︎ Anthropic’s recent post on “context engineering” is a good overview of techniques.  ↩︎ the context you provide the prompt you write executing in chunks System rules & agent instructions : This is basically your file where you briefly explain what the project is, the architecture, conventions used in the repository, and navigation the project 1 . Tooling : Lot of folks miss this, but in your AGENTS.md, explicitly point to the commands you use yourself to build, test and verify. I’m a big fan of maintaining a single with the most important commands, that the assistant can invoke easily from the command line. Real‑time data ( MCP ): when you need real-time data or connect to external tools, use MCPs. People love to go on about complex MCP setup but don’t over index on this. For e.g. instead of a github MCP just install the cli command let the agent run these directly. You can burn tokens if you’re not careful with MCPs. But of course, for things like Figma/JIRA where there’s no other obvious connection path, use it liberally. Share the high‑level goal and iterate with the agent Don’t write code in this session; use it to tell the agent what it’s about to do. 
- Once you’re convinced, ask the agent to write the plan in detailed markdown in your folder
- Reset context before you start executing
- Spawn a fresh agent, load . Implement only that task, verify & commit.
- Reset or clear your session. Proceed to and repeat.

- Am I providing all the necessary context?
- Is my prompt a clear spec?
- Am I executing in small, verifiable chunks?
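The “command” templates described earlier are just markdown files that capture a detailed, reusable prompt. Here is a hypothetical sketch of one; the filename, folder, and checklist contents are invented for illustration and are not the author’s actual templates:

```markdown
<!-- hypothetical example: commands/write-tests.md -->
Write unit tests for the class or function I point you at.

Before writing any code:
- Read the existing tests in this module and match their style and naming.
- List the behaviors you plan to cover and wait for my confirmation.

Constraints:
- Use the project's existing test framework and helpers; add no new dependencies.
- Cover the happy path, edge cases, and at least one failure mode.
- Run the test suite and report the results before declaring the task done.
```

Invoking a template like this in your agent expands to the full prompt, so you get the benefit of a detailed spec without retyping it each time.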

Kaushik Gopal 2 months ago

Vibe-engineering a Firefox add-on: Container Traffic Control

I wanted to test a simple claim: you can ship maintainable software by vibe-coding end to end. I set strict constraints: In about a day 1 I had a working Firefox add-on I could submit for review. The code meets my bar for readability and long‑term change. Even the icon came from an image model 2 . Introducing Container Traffic Control . Install and source • Install: Firefox add-on listing • Code: github.com/kaushikgopal/ff-container-traffic-control It’s in vogue to share horror stories of decimated vibe-coded repos 3 . But I’m convinced that with the right fundamentals, you can vibe-code a codebase you’d comfortably hand to another engineer. This was my experiment to vet my feelings on the subject. Granted, this was a small and arguably very simple repository, but I’ve also seen success with moderately larger codebases personally. It comes down to scrupulous pruning : updating system instructions, diligent prompting, and code review. I plan to write much more about this later, but let’s talk about some of the mechanics of how it went: I didn’t write a single line of JavaScript by hand. When I needed changes, better structure, reusable patterns, small refactors — I asked the agent. The goal throughout was simple: keep the codebase readable and maintainable. It now has a lot of the things we consider important for a decent codebase: The best part: most of this came together over two days 4 . Some example pull requests from the repository with the exact prompt I used and the plan that was generated: Here’s the very first prompt I used to generate the guts of the code: I captured my prompts but wasn’t diligent about surfacing them in pull requests; here are a few I did capture: The code is open source, so go ahead and check it out . In my last post, How to Firefox , I covered “Privacy power-up: Containers” 5 . “Containers” let you log in to multiple Gmail accounts without separate browser profiles. Add Total Cookie Protection and you get strong isolation. 
That’s great, but managing it automatically gets tedious fast. Examples: Added these test cases: I realized while writing this post that I should probably have these exact use cases tested, so I did just that… as I continued to flesh out this post. You can’t achieve this level of control with default containers unless you micromanage every case — and even then, some are impossible. I tried various add-ons but kept hitting cases that just wouldn’t work . So I built my own. I also prefer how this add-on asks you to set up rules: Overall, I enjoyed the experiment. I’ve been happily using my add-on , and I feel confident that if I needed to make changes, I could do it in what I consider a maintainable codebase. Stay tuned for my tips on how you can use AI coding more constructively. vibe-coding vs vibe-engineering Simon Willison started using the term vibe-engineering for precisely vibe-coding with this level of rigor. I’m trying to adopt this more. The bulk took a few hours; the rest was tweaks between other work.  ↩︎ Google’s new 🍌 model .  ↩︎ which I don’t for a second deny exist.  ↩︎ Honestly, the work put together was probably a few hours. I was issuing commands mostly on the side and going about my business, coming back later when I had time to tweak and re-instruct.  ↩︎ I’ve since updated the post to point to my new Firefox add-on.  ↩︎

- New platform (I haven’t built a browser extension/add-on)
- Language I’m no longer proficient in (JavaScript)
- Zero manual code editing

- Tests (for the important parts)
- Well-organized code
- Clear, useful logging
- Code comments (uses a style called space-shuttle style programming , which I think is increasingly valuable with vibe-coding)

Here’s a PR where midway I captured a major feature change: the original version of the add-on used a very different way of capturing the rules. It wasn’t as intuitive, so I decided to change it up. This was more a fun one where I asked it to critique the code as an HN reader would.
Some good suggestions came out of it, but the explicit persona callout didn’t generate anything helpful in this specific case.

- Keep searches in one container, but open result links in my default container.
- From work Gmail, clicking a GitHub link: if it’s , open in Work; if it’s , open in Personal.
- In Google Docs (Personal), clicking a Sheets or Drive link should stay in Personal — even though my default for Sheets is Work.
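The routing the examples above describe is, at its core, a first-match-wins list of URL rules. The add-on itself is JavaScript, but the logic can be sketched language-agnostically in shell. Everything here is invented for illustration: the patterns, container names, and function are not the add-on’s actual rule syntax, and real rules can also depend on which container the click originated from, which plain URL matching alone can’t express.

```shell
#!/bin/sh
# First-match-wins container routing (illustrative sketch only).
container_for() {
  case "$1" in
    *github.com/acme-corp/*)          echo "Work" ;;      # org repos go to Work
    *github.com/*)                    echo "Personal" ;;  # everything else on GitHub
    *docs.google.com/spreadsheets/*)  echo "Work" ;;
    *)                                echo "Default" ;;
  esac
}

container_for "https://github.com/acme-corp/backend"
container_for "https://github.com/someone/some-repo"
container_for "https://example.com"
```

The order of the patterns matters: the more specific rule must come before the catch-all, exactly like the “work GitHub vs personal GitHub” example above.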

Kaushik Gopal 2 months ago

A terminal command that tells you if your USB-C cable is bad

Update: now includes macOS Tahoe support. Apple slightly altered the system command for Tahoe. You have a drawer full of USB cables. Half are junk that barely charge your phone. The other half transfer data at full speed. But which is which? Android Studio solved this. Recent versions warn you when you connect a slow cable to your phone: I wanted this for the command line. So I “built” 1 , a script to check your USB connections. The script parses macOS’s 2 command, which produces a dense, hard-to-scan raw output: With a little bit of scripting, the output becomes much cleaner: When I connect my Pixel 3 : The first version was a bash script I cobbled together with AI. It worked, but was a mess to maintain. Because I let AI take the wheel, even minor tweaks like changing output colors were difficult. Second time around, I decided to vibe-code again but asked AI to rewrite the entire thing in Go . I chose Go because I felt I could structure the code more legibly and tweaks would be easier to follow. Go also has the unique ability to compile a cross-platform binary, which I can run on any machine. But perhaps the biggest reason: it took me a grand total of 10 minutes to have AI rewrite the entire thing. I was punching through my email actively as Claude was chugging on the side. Two years ago, I wouldn’t have bothered with the rewrite, let alone creating the script in the first place. The friction was too high. Now, small utility scripts like this are almost free to build. That’s the real story. Not the script, but how AI changes the calculus of what’s worth our time. yes, vibe coded. Shamelessly, I might add.  ↩︎ previous versions of macOS used   ↩︎ I had Claude pull specs for the most common Pixel phones. I’ll do the same for iPhones if I ever switch back.  ↩︎
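The core of a script like this is just parsing the dense output of macOS’s USB report. On many macOS versions that command is `system_profiler SPUSBDataType` (the post notes Tahoe changed it, so treat the exact command as version-dependent). The sketch below parses a hand-written approximation of that output rather than calling the real command, so it runs anywhere; the awk logic is illustrative, not the actual script:

```shell
#!/bin/sh
# Parse "Speed:" lines out of system_profiler-style output.
# The sample text is a hand-written approximation of the real report.
sample='
USB 3.1 Bus:
  Pixel 8:
    Speed: Up to 480 Mb/s
  Portable SSD:
    Speed: Up to 10 Gb/s
'
printf '%s\n' "$sample" | awk -F': ' '
  /:$/ && $0 !~ /Bus/ { sub(/^ +/, ""); sub(/:$/, ""); dev = $0 }  # remember device name
  /Speed:/            { print dev " -> " $2 }                      # report its link speed
'
```

A device negotiating “Up to 480 Mb/s” on a USB 3 bus is the tell-tale sign of a charge-only or USB 2 cable, which is exactly the signal the script surfaces.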

Kaushik Gopal 4 months ago

reclaiming em-en dashing back from AI and lowercasing

AI is transforming our tools, our writing, and — apparently now — our sense of typographic originality. But there are two quirks of my writing that now get me side-eye from friends: I know, i know. nobody likes the guy who says he liked the band before they got famous. but here we are. I was lowercasing and em-en dashing before AI and i’d like to claim this back please. my friends roll their eyes when i try. even the polite ones. so, since i can’t get a word in at dinner, i’ll do it here. I need to build some credibility before I attempt to explain myself. one of the advantages of writing this blog for some time now: I can do a quick 1 and show you some early mention of these “quirks”: The earliest post I have here using an em dash is from 2017. might I remind you, ChatGPT was introduced to this world in 2022 . And now the more egregious one that Sam Altman is attempting to rob me of 2 : listen folks, — & – are beautiful characters that the english language provides us. Strunk & White, who wrote a once-seminal guide, have this to say: A dash is a mark of separation stronger than a comma, less formal than a colon, and more relaxed than parentheses… we cannot let the AI overlords claim these characters as theirs. I have a proposal: Put spaces around your em and en dashes. you see when AI generates text, it tends to do the typographically standard thing—not pad the dash with spaces, like i just demonstrated. but if you did pad it with spaces — it becomes a stylistic choice, and adds visual rhythm to boot. we can now use em and en dashes and prove that ChatGPT didn’t write this! I’ve mentioned my rsi before and even the crazy keyboard hacks to help me with it. one important one is mapping caps-lock to escape. if i have to capitalize my characters often, that’s my pinky painfully stretching to one of the ⇧ (shift) keys while typing the character with the other hand. no bueno for me.
i now implore you: let’s normalize not treating people who type in all lowercase as AI zealots proving their humanness. yeah, i know this one is a stretch. but my fingers will click-clack their way through the shame.

- em and en dashes
- predominant lowercasing

very cool git trick to search your history, by the way. yes, I was desperate to find the evidence.  ↩︎ The eagle eye will notice that I sometimes use proper punctuation. this is purely accidental or most likely my Mac auto-correcting my i’s.  ↩︎

Kaushik Gopal 4 months ago

Getting Into Flow State with Agentic Coding

I recently found myself in a deep state of flow while coding — the kind where time melts away and you gain real clarity about the software you’re building. The difference this time: I was using Claude Code primarily. If my recent posts are any indication, I’ve been experimenting a lot with AI coding — not just with toy side projects, but high-stakes production code for my day job. I have a flow that I think works pretty well. I’m documenting it here as a way to hone my own process and, hopefully, benefit others as well. Skeptics vs cynics Many of my friends and colleagues are understandably skeptical about AI’s role in development. That’s ok. That’s actually good. We should be skeptical of anything that’s upending our field with such ferocity. We just shouldn’t be cynical 1 . You know what’ll definitely get you out of the flow? Having to constantly repeat basic instructions to your agent. These are all very important instructions that you shouldn’t have to repeat to your agent every single time. Invest a little upfront in your master ai instructions file. It makes a big difference and gets you up and coding quickly with any agent. Of course, I also recommend consolidating your ai instructions to a single source of truth , so you’re not locked in with any single vendor. Update: Check out my post ExecPlans . I now use the ExecPlans approach from OpenAI’s Aaron Friel for this. I won’t lie. The idea of this step did not exactly spark joy for me. There are times when I plan my code out in a neat list, but rarely. More often than not, I peruse the code, formulate the plan in my head and let the code guide me. Yeah, that didn’t work at all with the agents. Two phrases you’ll hear thrown around a lot: “Context is king” and “Garbage in, garbage out.” This planning phase is how you provide high-quality context to the agent and make sure it doesn’t spit garbage out. After I got over my unease and explicitly spent time on this step, I started seeing dramatically better results.
Importantly, it also allows you to pause and have the agent pick up the work at any later point of your session, without missing a beat. You might be in the flow, but if your agent runs out of tokens, it will lose its flow. Here’s my process: This is the exact prompt I use: I have this prompt saved as a “plan-tasks.md” command in my folder so I don’t have to type this all the time. What follows now is a clean back-and-forth conversation with the agent that will culminate with the agent writing down the task plan. Sometimes I may not understand or agree with the task or sequencing, so I’ll ask for an explanation and adapt the plan. Other times I’m certain a task isn’t needed and I’ll ask for it to be removed. This is the most crucial step in the entire process . A note on software engineering experience If the future of software engineering is going to be AI reliant, I believe this planning step is what will distinguish the senior engineers from the junior ones. This is where your experience shines: you can spot the necessary tasks, question the unnecessary ones, challenge the agent’s assumptions, and ultimately shape a plan that is both robust and efficient. I intently prune these tasks and want them to look as close to the sequence I would actually follow myself. To reiterate: the key here is to ask for the plan to be written so that another agent can execute it autonomously . After committing all the plans (in your first PR), the real fun begins and we implement . You would think my process now is to spawn multiple agents in parallel and ask them to go execute the other plans as well. Maybe one day, but I’m not there yet. Instead, I start to parallelize the work around a single task. Taking as an example, I’ll spawn three agents simultaneously: Notice that with this approach, the likelihood of the agents causing merge conflicts is incredibly slim. Yet, all three tasks are crucial for delivering high-quality software. This is typically when I’ll go grab a quick ☕. 
When I’m back, I’ll tab through the agents, but I spend most of my time with the implementer. As code is being written, I’ll pop open my IDE and review it, improving my own understanding of the surrounding codebase. Sometimes I might realize there’s a better way to structure something, or a pattern I’d missed previously. I don’t hesitate to stop the agents, refine the plan, and set them off again. Eventually, the agents complete their work. I commit the code as a checkpoint (I’m liberal with my s and use them as checkpoints). I run the tests, and more often than not, something fails. This isn’t a setback; it’s part of the process. I analyze whether the code was wrong or the test was incomplete, switch to that context, and fix it. Refactor aggressively This is also where I get aggressive with refactoring. It’s a painful but necessary habit for keeping the codebase clean. I still lead the strategy, devising the plan myself, and then direct the agent to execute it or validate my approach. While it usually agrees (which isn’t always comforting 🙄), it can sometimes spot details I might have otherwise missed. Once I’m satisfied and the checks pass, I do a final commit and push up a draft PR. I review the code again on GitHub, just as I would for any other PR. Seeing the diff in a different context often helps me spot mistakes I might have missed in the IDE. If it’s a small change, I’ll make it myself. For anything larger, I’ll head back to the agents. All the while, I’m very much in the flow. I may be writing less of the boilerplate code, but I’m still doing a lot of “code thinking”. I’m also not “vibe coding” by any stretch. If anything, I’m spending more time thinking about different approaches to the same problem and working to find a better solution. In this way, I’m producing better code. I’ve been building like this for the past few weeks and it’s been miraculously effective. Deep work beats busywork, every time.
Let agents handle the gruntwork, so you can stay locked in on what matters most. lest we get swept by this tide vs surfing smoothly over it  ↩︎ I absolutely love deconstructing a feature into multiple stacked PRs .  ↩︎

- set the stage
- plan with the agent (no really 🤮)
- spawn your agents
- verify and refactor
- the final review

All my plans live in an folder. I treat them as fungible; as soon as a task is done, the plan is deleted. Say I’m working on a ticket and intend to implement it with a stack 2 of 4 PRs. Each of those PRs should have a corresponding plan file: , , etc. The plan (and PR) should be discrete and implement one aspect of the feature or ticket. Of course, I make the agent write all the plans and store them in the folder. ( I came this far in my career without being so studious about jira tasks and stories; I wasn’t about to start now ).

- Agent 1: The Implementer. Executes the core logic laid out in the plan.
- Agent 2: The Tester. Writes meaningful tests for the functionality, assuming the implementation will be completed correctly.
- Agent 3: The Documenter. Updates my project documentation (typically in ) with the task being executed.
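The plans-folder rhythm described above can be sketched in shell. The folder name, plan file names, and commit messages here are illustrative assumptions (not the author’s exact layout), and the actual agent work is stubbed out as a comment:

```shell
#!/bin/sh
# Sketch of the plan -> chunk -> execute -> verify loop, one plan per PR.
set -e
cd "$(mktemp -d)"
git init -q .
git config user.email dev@example.com
git config user.name dev

mkdir -p plans
echo '# Task 1: extract the auth client'  > plans/plan-1.md
echo '# Task 2: add unit tests'           > plans/plan-2.md

for plan in plans/plan-*.md; do
  echo "executing $plan in a fresh agent session"
  # ...spawn a fresh agent here, load $plan, implement, run tests...
  git add -A
  git commit -q -m "checkpoint: $plan"
  rm "$plan"   # plans are fungible: delete once the task is done
done
```

Each loop iteration mirrors one discrete PR in the stack: fresh context, one plan, a verifiable checkpoint commit, then the plan file is discarded.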

Kaushik Gopal 4 months ago

Introducing “shorts” for Henry (my Hugo blog engine/theme)

Introducing “shorts” for Henry (my Hugo blog engine/theme). Often, I find myself wanting to post a quick thought or note without the ceremony of a full-blown “blog post”. That’s usually when I’d post to Bluesky , X , Threads , or Mastodon . But as I’ve said before , I prefer investing in a feed I control. With Henry, I can effortlessly post a quick thought, and share it from a feed I own.

Kaushik Gopal 4 months ago

🦊 How to Firefox

Chrome finally pulled the trigger on the web’s best ad-blocker, uBlock Origin . Now that Chrome has hobbled uBO, Firefox — my beloved — 1 is surging again. I want to do my part to convince you to switch to Firefox and show you how I use it. Let’s get through the important talking points, in case you need a quick copy-paste to convince a friend. This section can be quick. Here’s a github link to the source code of the Firefox browser. You can clone the repo, pop open your favorite AI code assistant and start asking questions about your browser - the most important app you use. What libraries does their Android app use? libs.versions.toml boom! Also 8.11.1 on android gradle plugin? not bad Firefox. Their license allows you to fork and distribute alternative versions . Vibe code a whole new browser. Most of the web today is enshittified with a cesspool of ads, popups, cookie notices, and tracking scripts. Our primary defense has been ad-blockers, with the most powerful being uBlock Origin. uBO relies on community-curated filter lists that play a cat-and-mouse game to zap known ads, trackers, and other digital sludge. But with Chrome controlling the web, Google followed through on its promise to kneecap uBO with Manifest V3, effectively blocking the full version from its extension store. Sure, there’s uBlock Origin “Lite” now, which does the same thing, right? Nope: uBlock Origin Lite != uBlock Origin.

- Filter lists update only when the extension updates; no fetching up-to-date lists from servers (this is a big one!)
- No custom filters, so no element picker which allows you to point and zap
- Many filters are dropped at conversion time due to MV3’s limited filter syntax
- No strict-blocked pages
- No per-site switches
- No dynamic filtering
- No importing external lists

Did you know, uBlock Origin works best on Firefox . Why not just use the real thing then? My browsing experience is beautiful because I have most of the shit-bits blocked away. On my Pixel too.
With Firefox for Android, you get seamless sync of tabs, bookmarks, passwords between browser and phone 2 . Let’s face it, Safari between Mac and iPhone is a sublime experience. We can get that with Firefox. Here’s something the iPhone isn’t getting anytime soon: honest-to-god browser extensions that you use on your desktop, also on your phone. Which means… you can run uBlock Origin on Android, completely unnerfed . Safari has extensions, but they still require an App Store review for distribution on Apple platforms. They also just got a version of the uBO “Lite” extension. But… Firefox doesn’t look as clean and minimal as Safari. You can claw the vertical tabs out of my cold dead Arc hands! This is what my Firefox browser looks like: It only takes about five minutes and a browser restart to get this look. I’ll walk you through my setup now, from essential add-ons to privacy tweaks and a few “nice-to-have” extras. Nerd Alert This is my setup. I’m a nerd, so I find joy in tinkering. You don’t need to do all of this, but a few small tweaks can give you a massively better browser. Think of uBO as a powerful, wide-spectrum filter for the web. It uses community-maintained lists to block ads, trackers, cookie notices, and other digital sludge before it ever loads. Your browser stays faster and cleaner. It can be confusing to know which filter lists to enable. I follow the advice of a uBO wizard on Reddit , and these settings alone make the web 90% better. Check the same boxes, and you’re good to go. Custom Filters are an exclusive uBO superpower The “My filters” tab is where you can write your own rules to block nearly anything, from annoying widgets to entire domains. For the truly privacy-conscious, uBO can block all outgoing traffic to specific domains, like Facebook. 3 In the past, Firefox’s Facebook Container add-on helped by isolating your Facebook activity. 
But if any other site embedded a Facebook widget or tracker, your data could still leak to Meta’s servers, fingerprinting you even if you never visit Facebook directly. With a custom uBO rule, you can sever that connection entirely from non-Facebook sites. This is a level of control other browsers don’t offer. The other line you see there? That one-liner blocks all those “Sign in with Google?” pop-ups. This granular control is only possible with the full uBlock Origin, not the “Lite” version found on other browsers. If you want to go deeper, this video is a great showcase of its advanced capabilities. Firefox now includes Total Cookie Protection (TCP) by default. This automatically isolates cookies to the site that created them, giving each site its own “cookie jar”. Importantly, it means sites aren’t allowed to read cookies from other sites’ jars. This stops trackers from following you across the web. Firefox first piloted this feature as the Multi-Account Containers (MAC) add-on. But with TCP, the containers feature is somewhat redundant for basic anti-tracking . However, the container technology is still incredibly useful if you want to seamlessly manage multiple online identities. Instead of juggling separate browser profiles, you can use “Containers” to stay logged into two different Gmail accounts (e.g., “Work” and “Personal”) in the same browser window, with zero overlap. The old MAC add-on made this possible, but it was really clunky to set up. It’s actually much easier to do this today without installing an additional add-on. That’s it. This works without the extra MAC add-on because the Container concept is baked natively into Firefox. So with the above config tweaks, you enable the default built-in containers and whenever you click the new tab button, you can choose which container to open. You can now open Gmail in these containers and it’s as if you’re opening an entirely new browser. But what about links?
A work link (like Datadog or Sentry) clicked from your email in a Work container might open in the default container and use the wrong Google account. You could right-click the link and say “Open in Container >” but that gets old fast. Wouldn’t it be nice if you could give Firefox a set of rules and say, always open these websites or URLs in these containers? and everything just worked magically? I created the Firefox add-on Container Traffic Control to help with that exact requirement. It’s open source and pretty well documented + tested. This combination of the native config tweak and my add-on provides a simple, but more powerful multi-profile setup (than even MAC). These are also not essential, but they add a nice layer of polish. Firefox is famously customizable via . Besides the container tweak, I only use one other: I’m collecting in this section, some of the cooler Firefox features that’ll make you wonder why every browser doesn’t have them: The web can still be beautiful. You just need the right tools to see it. Go download Firefox and make your web beautiful again. If you try this setup or have suggestions, let me know in the comments. No, goddammit, AI didn’t write this post.  ↩︎ For the few who have reached this point of the article and furiously questioned why I don’t just use Zen browser or Libre.  ↩︎ You can of course send outgoing traffic from Meta owned websites so Threads etc. still work.  ↩︎ Seriously, try the apostrophe trick. It’s a game-changer for keyboard navigation.  ↩︎

- 100% open-source
- Un-enshittify the web
- Android users rejoice
- Customize to your heart’s content
- Dark Reader : For a consistent, customizable dark mode on every site.
- Stylus : To apply custom CSS. I use it to force my on code blocks.
- Return YouTube Dislike : Does what it says on the tin.
- Obsidian Web Clipper : To save notes and clippings directly to Obsidian, from desktop or mobile.
- Auto Tab Discard : Suspends background tabs to save RAM. A holdover from my RAM-strapped MacBook days, but it still does its job silently.

- New tabs open next to your current tab, not at the end. You catch that Mr. Gruber?
- Type and start typing for quick find (vs ⌘F). But dig this: and Firefox will only match text for hyperlinks 4
- If an obnoxious site disables right-click, just hold Shift and Firefox will bypass it and show it to you. No add-ons required.
- URL bar search shortcuts: for bookmarks, for open tabs, for history

Kaushik Gopal 4 months ago

AI Programming Paradigms: A Timeline

A developer podcast host recently said they only use AI for autocomplete. This shocked me. That’s two generations behind today’s state of the art. This is how the field is evolving: This used to be one of the biggest selling points of certain IDEs like JetBrains & Visual Studio 1 . The first wave of AI changed this. IDEs now predict entire classes and logic blocks, not just keywords. They excel because they feed surrounding code context to the LLM for relevant suggestions. GitHub deserves credit for kicking off the revolution with their Copilot offering. Here’s an early YouTube video of mine from June 2022 2 showing it in action. Autocomplete keeps advancing. Cursor (today’s most popular AI IDE) has “Tab” . JetBrains, the pre-AI autocomplete champion, is building Mellum (an LLM built for code completion). This paradigm is alive and well, but it’s become table stakes. Most developers use it, but it’s far from the frontier. When ChatGPT took the world by storm, a new paradigm emerged. Conversational coding, where you chat with the AI and pair program together. Unlike the autocomplete era, where you trust the AI’s suggestions, here you direct the AI , give it context, and nudge it toward better solutions. This is arguably what most devs use today and envision when they hear “AI programming”. It feels magical, productivity jumps are real, and it’s hard to think of ever going backwards. Cursor leads this charge: chat with your IDE and have it make code changes on the fly. The challenge with conversational coding though is IDE lock-in. Android developers love Android Studio, iOS developers love Xcode. Asking them to switch to Cursor is a big ask. There are workarounds: Firebender offers Cursor-like functionality to existing IDEs, JetBrains has Junie (though they’re late to the party), and Xcode developers… well, they wait for Apple. This constraint is why the next paradigm is starting to shine. This is the bleeding edge.
Agentic coding works anywhere you can run commands. Pop open a terminal, fire up Claude Code , and let it work independently. The AI comes up with a plan, confirms the plan, makes changes, runs tests, fixes errors — all through your existing CLI tools. Claude dominates here. The TUI feels retro but works brilliantly for terminal-heavy development. You give the agent a task. It runs commands, checks results, iterates. The feedback loop is tight: ask, execute, verify, repeat. The space is exploding. Google’s Gemini CLI runs on Gemini 2.5 Pro with aggressive pricing 3 . Cursor too has a similar take that they call background agent mode. There are also agentic tools outside of the terminal like Jules (Google) and Codex (OpenAI) that run cloud agents with GUIs. But Claude nails the terminal workflow imho. The core idea for agentic coding is the same: let the AI act on its own, not just suggest. So what’s next? My guess: we’ll move from mono-agentic coding to multi-sub-agent workflows. Claude can already do this today 4 . Think of a chess master playing a (simul)taneous exhibition match: multiple agents building features in parallel, coordinating to avoid merge hell, and you at the center, reviewing each PR in its own tab. The field evolves at breakneck speed. By the time you read this, this might feel like a mainstay. Revolutionary becomes quaint in two years. Today’s cutting edge is tomorrow’s baseline. Ah, the million-dollar question. Where does vibe coding fit? Andrej Karpathy (AI Programming Yoda) coined this term . But vibe coding isn’t a paradigm, it’s a style. It’s about trusting the AI to make big decisions with minimal guidance. You can apply it to any paradigm above, but the vibe gets strong in collaborative and agentic modes. If you’re a programmer wondering where to invest today: don’t get too attached to one specific tool. Try them all liberally. Start getting conversant.
It’s too early to say which tool or paradigm will dominate, but AI programming is inevitable and will probably just be what we call programming. not that one. see intellisense .  ↩︎ Over 2.5 years ago! This is what I mean by “2 generations behind” (in AI development time, each year seems to bring a fundamental shift in how we interact with these tools).  ↩︎ Google is offering 60 requests per minute and 1,000 requests per day completely free . For context, Claude’s Pro plan at $20/mo gives you roughly 10–40 prompts every 5 hours.  ↩︎ instruct it specifically to “use sub-agents for this workflow”.  ↩︎

- Super autocomplete : AI predicts code, not just keywords.
- Conversational Coding : Chat with your IDE, direct the AI, iterate together.
- Agentic Coding : AI acts independently - runs commands, checks work, iterates.
- Simul Agentic ? : Multiple sub-agents, parallel workflows, working together.

Kaushik Gopal 4 months ago

Keep your AGENTS.md in sync - One Source of Truth for AI Instructions

Getting useful results from your AI assistant often hinges on providing the right instructions. Yet most developers take this step casually, then wonder why their AI outputs are mediocre.

One of the simplest ways 1 to do this is with your master instructions, or AGENTS.md, file. In this file, you provide persistent rules that shape every interaction you have with the agent. You need to keep pruning and tweaking these instructions to get the best results. If, like me, you use multiple tools (Claude, Cursor, Gemini CLI, Codex, Firebender 2 ), then updating the same rules for each of these tools quickly becomes untenable. In this post, I’ll show you how I consolidate it all and edit it in one place, while keeping it in sync everywhere else.

Every AI tool wants its own special instruction file. Good news: most coding tools have since consolidated around AGENTS.md as the standard, so it makes sense to use that now. Start by creating the master instructions file.

Generating your first file: Claude Code’s /init command generates an excellent starting template for your project by analyzing your project structure and creating sensible defaults. But if you don’t use Claude Code, I wrote a more recent post on how you can do it yourself pretty easily with a custom command.

You now have a centralized AGENTS.md: you edit in one place, and all tools stay synchronized.

Nested AGENTS.md support: most AI tools have also started to support “nested” AGENTS.md files. This means you can have specific instructions for the agent based on the module or folder it’s operating in, which keeps you from having an overly stuffed parent AGENTS.md that unnecessarily fills up your context. Coding agents know to pick up the additional context from nested folders only when it’s relevant.

While more often than not you’ll follow the above to set up agent instructions at the project level, certain agents also let you set up instructions at the user level.
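As for the actual syncing, one low-tech approach is symlinks: the tool-specific file names all point at a single AGENTS.md. A minimal sketch, assuming your tools follow symlinks (most do); the demo sandboxes itself in a temp dir, and CLAUDE.md / GEMINI.md are just examples of tool-specific names, so substitute whichever files your tools read:

```shell
# One source of truth: tool-specific instruction files become symlinks
# to AGENTS.md, so editing AGENTS.md updates everything at once.
set -e
demo=$(mktemp -d) && cd "$demo"   # sandbox for the demo; use your repo root for real
printf '# Project instructions\n' > AGENTS.md
for alias in CLAUDE.md GEMINI.md; do
  ln -sf AGENTS.md "$alias"       # -f replaces any stale copy or old symlink
done
```

Git stores symlinks as symlinks, so on platforms that support them, teammates cloning the repo get the same wiring for free.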
User-level instructions are essentially your personal preferences. In addition to the main system instructions, I also like organizing my AI-related “assets” in an .ai folder.

Does it actually help? The short answer: yes. While I haven’t done rigorous A/B testing on each rule or instruction, the improvements are noticeable. Without instructions, AI tools default to generic patterns, and I’ve found myself having to repeat basic instructions more often. Folks online also seem to agree on this.

Want to see what context your AI is actually using? Try asking it directly which instruction files it has loaded. Different tools have varying levels of transparency here: Claude and Cursor are generally forthcoming about loaded context, while others may be more opaque. Use this feedback to refine your instructions, removing redundancy and clarifying ambiguous rules.

For reference, the instruction file each tool traditionally reads:

- Claude Code : CLAUDE.md
- Cursor : .cursor/rules and previously .cursorrules
- Codex CLI , OpenCode : AGENTS.md
- Gemini CLI : GEMINI.md
- Amp : AGENT.md 3

Changelog:

- 2025-10-12: removing artifacts .ai/rules, .ai/checkpoints, .ai/docs
  - the notion of ExecPlans makes more sense than checkpoints
  - AGENTS.md should have instructions, README.md should have docs
  - rules should be incorporated into the AGENTS.md file
- 2025-09-18: change to AGENTS.md as more agents consolidate around it
  - Codex supports AGENTS.md
  - Gemini supports AGENTS.md
  - Gemini in Android Studio supports AGENTS.md
  - Cursor supports AGENTS.md
  - VSCode supports AGENTS.md

1. I wrote a later post distilling the three things that matter most for getting good AI coding results btw.  ↩︎
2. for the IntelliJ users  ↩︎
3. conveniently they own the domain https://agents.md and have graciously been trying to rally everyone behind this  ↩︎
