Latest Posts (20 found)
Kaushik Gopal 2 weeks ago

Go with monthly AI subscriptions friends

Go with monthly AI subscriptions, friends. I can't remember where I read this tip, but given how fast the AI labs' models move, it's smarter to stick with a monthly plan instead of locking into an annual one, even if the annual price looks more attractive. I hit a DI issue on Android and was too lazy to debug it myself, so I pointed two models at it. GPT Codex gave me the cleanest, correct fix. Claude Sonnet 4.5 found a fix, but it wasn't idiomatic and was pretty aggressive with the changes. A month ago, I wouldn't have bothered with anything other than the Claude models for coding. Today, Codex clearly feels ahead. Google is about to ship its next Gemini model and, from what I'm hearing, it's going to be absurdly good. In these wonderfully unstable times, monthly subscriptions are the way to go.

Kaushik Gopal 2 weeks ago

Firefox + UbO is still better than Brave, Edge or any Chromium-based solution

I often find myself replying to claims that Brave, Edge, or other Chromium browsers effectively achieve the same privacy standards as Firefox + uBlock Origin (uBO). This is simply not true. Brave and other Chromium browsers are constrained by Google's Manifest V3. Brave works around this by patching Chromium and self-hosting some MV2 extensions, but it is still swimming upstream against the underlying engine. Firefox does not have these MV3 constraints, so uBlock Origin on Firefox retains more powerful, user-controllable blocking than MV3-constrained setups like Brave + uBO Lite.

Brave is an excellent product and what I used for a long time. But the comparison often ignores structural realities. There are important nuances that make Firefox the more future-proof platform for privacy-conscious users.

The core issue is Manifest V3 (MV3), Google's new extension architecture for Chromium (the engine Chrome, Brave, and Edge are built on). Under Manifest V2, blockers like uBO used the blocking version of the `webRequest` API (`webRequest` + `webRequestBlocking`) to run their own code on each network request and decide whether to cancel, redirect, or modify it. MV3 deprecates that blocking path for normal extensions and replaces it with the `declarativeNetRequest` (DNR) API: extensions must declare a capped set of static rules in advance, and the browser enforces those rules without running extension code per request. This preserves basic blocking but, as uBO's developer documents, removes whole classes of filtering capabilities uBO relies on. And Google is forcing this change by deprecating MV2. Yeah, shitty.

To get around the problem, Brave is effectively swimming upstream against its own engine. It does this in two ways:

- Native patching: It implements ad-blocking (Shields) natively in C++/Rust within the browser core to bypass extension limitations.
- Manual extension hosting: Brave now has to manually host and update specific Manifest V2 extensions (like uBO and AdGuard) on its own servers to keep them alive as Google purges them from the store.

They wrote a great post about this too. Brave is doing a great job, but it is operating with a sword of Damocles hanging over it. The team must manually patch a hostile underlying engine to maintain functionality that Firefox simply provides out of the box.

A lot of people also say: wait, we now have "uBlock Origin Lite" that does the same thing and is even more lightweight! It is "lite" for a reason. You are not getting the same blocking safeguards. uBO Lite is a stripped-down version necessitated by Google's API restrictions. As detailed in the uBlock Origin FAQ, the "Lite" version falls short in the following ways:

- No on-demand list updates: uBO Lite compiles filter lists into the extension package. The resulting declarative rulesets are refreshed only when the extension itself updates, so you cannot trigger an immediate filter-list or malware-list update from within the extension.
- No "Strict Blocking": uBO Lite does not support uBlock Origin's strict blocking modes or its per-site dynamic matrix. With full uBO on Firefox, my setup defines a custom, per-site rule set that ensures Facebook never sees my activity on other sites. uBO Lite does not let me express or maintain that kind of custom policy; I have to rely entirely on whatever blocking logic ships with the extension.
- No dynamic filtering: You lose the advanced matrix to block specific scripts or frames per site.
- Limited element picker: "Pointing and zapping" items requires specific, permission-gated steps rather than being seamless.
- No custom filters: You cannot write your own rules to block nearly anything, from annoying widgets to entire domains.

uBlock Origin is widely accepted as the most effective content blocker available. Its creator, gorhill, has explicitly stated that uBlock Origin works best on Firefox. So while using a browser like Brave is better than using Chrome or other browsers that lack a comprehensive blocker, it is not equivalent to Firefox + uBlock Origin. Brave gives you strong, mostly automatic blocking on a Chromium base that is ultimately constrained by Google's MV3 decisions. Firefox + uBlock Origin gives you a full-featured, user-controllable blocker on an engine that is not tied to MV3, which matters if you care about long-term, maximum control over what loads and who sees your traffic.
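To make the dynamic-filtering loss concrete: full uBO lets you write per-site rules in its `source destination type action` format. A sketch of the Facebook policy described above (my own approximation of the rule syntax; check uBO's wiki for exact semantics):

```
* facebook.com * block
* fbcdn.net * block
facebook.com facebook.com * noop
```

The first two rules block Facebook domains everywhere; the `noop` rule carves out an exception so Facebook itself still works. uBO Lite has no surface for expressing rules like these.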

Kaushik Gopal 3 weeks ago

Cognitive Burden

A common argument I hear against AI tools: "It doesn't do the job better or faster than me, so why am I using this again?" Simple answer: cognitive burden. My biggest unlock with AI was realizing I could get more done, not because I was faster, but because I wasn't racking my brain with needless tedium. Even if it took longer or needed more iterations, I'd finish less exhausted. That was the aha moment that sold me.

Simple example: when writing a technical post 1, I start with bullet points. Sometimes there's a turn of phrase or a bit of humor I enjoy, and I'll throw those in too. Then a custom agent trained on my writing generates a draft in my voice. After it drafts, I still review every single word. A naysayer might ask: "Well, if you're reviewing every single word anyway, why not just write the post from scratch?" Because it's dramatically easier and more enjoyable not to grind through and string together a bunch of prepositions to draft the whole post. I've captured the main points and added my creative touch; the AI handles the rest. With far less effort, I can publish more quickly — not due to raw speed, but because it's low‑touch and I focus only on what makes it uniquely me. Cognitive burden ↓.

About two years ago I pushed back on our CEO in a staff meeting: "Most of the time we engineers waste isn't in writing the code. It's the meetings, design discussions, working with PMs, fleshing out requirements — that's where we should focus our AI efforts first." 2 I missed the same point. Yes, I enjoy crafting every line of code and I'm not bogged down by that process per se, but there's a cognitive tax to pay. I'd even say I could still build a feature faster than some LLMs today (accounting for quality and iterations) before needing to take a break and recharge. Now I typically have 3–4 features in flight (with requisite docs, tests, and multiple variants to boot). Yes, I'm more productive. And sure, I'm probably shipping faster.
But that's correlation, not causation. Speed is a byproduct. The real driver is less cognitive burden, which lets me carry more. What's invigorated me further as a product engineer is that I'm spending a lot more time on actually building a good product. It's not that I don't know how to write every statement; it's just… no longer interesting. Others feel differently. Great! To each their own. For me, that was the aha moment that sold me on AI. Reducing cognitive burden made me more effective; everything else followed. I still craft the smaller personal posts from scratch. I do this mostly because it helps evolve my thinking as I write each word down — a sort of muscle memory formed over the years of writing here.  ↩︎ In hindsight, maybe not one of my finest arguments especially given my recent fervor. To be fair, while I concede my pushback was wrong, I don't think leaders then had the correct reasoning fully synthesized.  ↩︎

Kaushik Gopal 4 weeks ago

Standardize with ⌘ O ⌘ P to reduce cognitive load

There are a few apps on macOS in the text manipulation category that I end up spending a lot of time on. For example: Obsidian (for notes), Zed (text editor + IDE lite), Android Studio & IntelliJ (IDE++), Cursor (IDE + AI), etc. All these apps have two types of commands that I frequently use:

- Open a specific file or note
- Open the command palette (or find any action menu)

But by default, these apps use ever so slightly different shortcuts. One might use ⌘ P, another might use ⌘ ⇧ P, etc. I've found it incredibly helpful to take a few minutes and make these specific keyboard shortcuts the same everywhere. So now I use:

- ⌘ O – Open a file/note
- ⌘ P – Open the command palette (or equivalent action menu)

This small change has reduced cognitive load significantly. I no longer have to think about which app I'm in, and what the shortcut is for that specific app. Muscle memory takes over, and I can just get things done faster. Highly recommended!
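As one illustration, in Zed this remap is a few lines in `keymap.json`. The action names below are from memory and may not match your Zed version exactly — verify them via Zed's keymap docs or its "open keymap" command:

```json
[
  {
    "context": "Workspace",
    "bindings": {
      "cmd-o": "file_finder::Toggle",
      "cmd-p": "command_palette::Toggle"
    }
  }
]
```

Android Studio and IntelliJ expose the equivalent under Preferences → Keymap, and Obsidian under Settings → Hotkeys.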

Kaushik Gopal 1 month ago

Claude Skills: What's the Deal?

Anthropic announced Claude Skills and my first reaction was: "So what?" We already have AGENTS.md files, slash commands, nested instructions, and even MCPs. What's new here? But if Simon W thinks this is a big deal, then pelicans be damned; I must be missing something. So I dissected every word of Anthropic's eng. blog post to find what I missed. I don't think the innovation is what Skills does or achieves, but rather how it does it. This continues their push on context engineering as the next frontier.

Skills are simple markdown files with YAML frontmatter. What makes them different is the idea of progressive disclosure:

> Progressive disclosure is the core design principle that makes Agent Skills flexible and scalable. Like a well-organized manual that starts with a table of contents, then specific chapters, and finally a detailed appendix, skills let Claude load information only as needed:

So here's how it works:

1. Scan at startup: Claude scans available Skills and reads only their YAML descriptions (name, summary, when to use).
2. Build a lightweight index: This creates a catalog of capabilities with minimal token cost; think dozens of tokens per skill.
3. Load on demand: The full content of a Skill only gets injected into context when Claude's reasoning determines it's relevant to the current task.

This dynamic context loading mechanism is very token efficient; that's the interesting development here. In this token-starved AI economy, that's 🤑. Other solutions aren't as good in this specific way:

- Throw everything into AGENTS.md: You could add all the info directly and agents would load it at session start (✓ auto-discovered and loaded). The problem: loading everything fills up your context window fast, and your model starts outputting garbage unless you adopt other strategies (✗ static: all context loaded upfront, bloating the context window at scale). Not scalable.
- Nested AGENTS.md files: Place an AGENTS.md in each subfolder and agents read the nearest file in the tree. This splits context across folders and solves token bloat (✓ scoped to directories). But it's not portable across directories and creates an override behavior instead of true composition (✗ overrides behavior, not composition).
- Referenced files: Place instructions in separate files and reference them in AGENTS.md. This fixes the portability problem vs the nested approach (✓ organized and modular). But when referenced, the full content still loads statically (✗). Feels closest to Skills, but lacks the JIT loading mechanism.
- Slash commands (or their Codex equivalent): These let you provide organized, hyper-specific instructions to the LLM. You can even script sequences of actions, just like Skills (✓ powerful and procedural). The problem: these aren't auto-discovered. You must manually invoke them, which breaks agent autonomy (✗).
- MCPs: Skills handle 80% of MCP use cases with 10% of the complexity (✓ access to external data sources; ✗ heavyweight, vendor lock-in, overkill for procedural knowledge). You don't need a network protocol if you can drop a markdown file that says "to access the GitHub API, use this tool with these credentials." To be quite honest, I've never been a big fan of MCPs. I think they make a lot of sense for inter-service communication, but more often than not they're overkill.

Token-efficient context loading is the innovation. Everything else you can already do with existing tools. If this gets adoption, it could replace slash commands and simplify MCP use cases. I keep forgetting: this is for the Claude product generally (not just Claude Code), which is cool.

Skills is starting to solve the larger problem: "How do I give my agent deep expertise without paying the full context cost upfront?" That's an architectural improvement definitely worth solving, and Skills looks like a good attempt.
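For reference, a Skill is just a folder containing a SKILL.md whose YAML frontmatter is what gets indexed at startup. A minimal sketch — the skill name and body here are invented for illustration:

```markdown
---
name: changelog-writer
description: Drafts release changelogs from merged PRs. Use when asked to summarize changes for a release.
---

# Changelog writer

1. List changes since the last tag: `git log --oneline $(git describe --tags --abbrev=0)..HEAD`
2. Group commits into Features / Fixes / Chores.
3. Write one line per entry, past tense, referencing PR numbers.
```

Only the frontmatter (a few dozen tokens) is loaded upfront; the body below it is pulled into context on demand.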

Kaushik Gopal 1 month ago

Cargo Culting

If you're a software engineer long enough, you will meet some graybeards who throw out-of-left-field phrases to convey software wisdom. For example, you should know if you're yak-shaving or bike-shedding, and when that's even a good thing. A recent HN article 1 reminded me of another nugget – Cargo Culting (or Cargo Cult Programming).

Cargo Culting: ritualizing a process without understanding it. In the context of programming: the practice of applying a design pattern or coding style blindly without understanding the reasons behind it.

I'm going to take this opportunity to air one of my personal cargo-culting pet peeves, sure to kick up another storm: making everything small. When I get PR feedback saying "this class is too long, split this!", I get ready to launch into a tirade: you're confusing small with logically small – ritualizing line count without understanding cohesion. You can make code small by being terse: removing whitespace, cramming logic into one-liners, using clever shorthand 2. But you've just made it harder to read. A function that does one cohesive thing beats multiple smaller functions scattered across files.

As the parable goes, after the end of the Second World War, indigenous tribes believed that air delivery of cargo would resume if they carried out the proper rituals, such as building runways, lighting fires next to them, and wearing headphones carved from wood while sitting in fabricated control towers. While amusing on the surface, there's sadness if you dig into the history and contributing factors (value dominance, language & security barriers). I don't think that's reason to avoid the term altogether. We as humans sometimes have to embrace our dark history, acknowledge our wrongs and build kindness in our hearts. We cannot change our past, but we can change our present and future.

The next time someone on your team ritualizes a pattern without understanding it, you'll know what to call it. Who comes up with these terms anyway?
Now that you're aware of the term, you'll realize that the original article's use of the term cargo-cult is weak at best. In HN style, the comments were quick to call this out.  ↩︎ You know exactly what I'm thinking of, fellow Kotlin programmers.  ↩︎

Kaushik Gopal 1 month ago

ExecPlans – How to get your coding agent to run for hours

I've long maintained that the biggest unlock with AI coding agents is the planning step. In my previous post, I describe how I use a directory and ask the agent to diligently write down its tasks before and during execution. Most coding agents now include this as a feature; Cursor, for example, made it explicit recently. While that all felt validating, on a plane ride home I watched OpenAI's DevDay. One of the most valuable sessions was Shipping with Codex. Aaron Friel — credited with record-long sessions and token output — walked through his process and the idea of "ExecPlans." It felt similar at first, but I quickly realized this was some god-level planning.

He said OpenAI would release his PLANS.md soon, but I couldn't wait. On that flight, with janky wifi, I rebuilt what I could from the talk and grew my baby plan into something more mature — and I was already seeing better results. I pinged Aaron on Bluesky for the full doc, and he very kindly shared the PR that's about to get merged, with detailed information. My god, this thing is a work of art. Aaron clearly spent a lot of time honing it. I've tried it on two PRs so far, and it's working fantastically. I still need to put it through its paces on some larger work projects, but I feel comfortable preemptively calling it the gold standard for planning. I've made a few small tactical tweaks to how I use it:

- I instruct the agent to write plans to a dedicated directory (works across coding agents)
- In my AGENTS.md I tell agents to put temporary plans there (which I've gitignored)
- I keep the master PLANS.md Aaron shared alongside it

This is really a big unlock, folks. Try it now. The latest PLANS.md can be found in Aaron's PR. Use it as a template in your folder. Then instruct your agent via AGENTS.md to always write an ExecPlan when working on complex tasks. I highly recommend you go watch Aaron's part of the talk, Shipping with Codex. I'll update this post once it's merged or if anything changes.

Update: I've been using this for the last few days (~8 PRs so far) and on average I've definitely gotten my agents to run for much longer successfully (longest was about ~1 hour, but frequently >30 minutes). This is the way.
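The AGENTS.md wiring can be a short section like this. The paths and file names below are my own convention for illustration, not Aaron's:

```markdown
## Planning

For any non-trivial task, write an ExecPlan before writing code:

1. Read docs/PLANS.md and follow its format exactly.
2. Write the plan to .plans/<task-name>.md (.plans/ is gitignored).
3. Keep the plan updated while executing; mark steps done as you go.
```

Because every major coding agent reads AGENTS.md (or can be pointed at it), this one snippet makes the ExecPlan habit portable across tools.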

Kaushik Gopal 1 month ago

Job Displacement with AI — Software Engineers → Conductors

Engineers won't be replaced by tools that do their tasks better; they'll be replaced by systems that make those tasks nonessential. Sangeet Paul Choudary wrote an insightful piece on AI-driven job displacement and a more transformative way to think about it:

> To truly understand how AI affects jobs, we must look beyond individual tasks to comprehend AI's impact on our workflows and organizations. The task-centric view sees AI as a tool that improves how individual tasks are performed. Work remains structurally unchanged. AI is simply layered on top to improve speed or lower costs. …In this framing, the main risk is that a smarter tool might replace the person doing the task. The system-centric view, on the other hand, looks at how AI reshapes the organization of work itself. It focuses on how tasks fit into broader workflows and how their value is determined by the logic of the overall system. In this view, even if tasks persist, the rationale for grouping them into a particular job, or even performing them within the company, may no longer hold once AI changes the system's structure.

If we adopt a system-centric view, how does the role of a software engineer evolve? 1 I've had a notion for some time — the role will transform into a software "conductor", in the sense of music conductors:

> conducting is the art of directing the simultaneous performance of several players or singers by the use of gesture

The tasks a software conductor must master differ from those of today's software engineer. Here are some of the shifts I can think of: The craft is knowing exactly how much detail to provide in prompts: too little and models thrash; too much and they overfit or hallucinate constraints. You'll need to write spec-grade prompts that define interfaces, acceptance criteria, and boundaries — chunking work into units atomic enough for clear execution yet large enough to preserve context.
Equally critical: recognizing when to interrupt and redirect — catching drift early and steering with surgical edits rather than expensive reruns or loops. You’ll need to design systems that AI can both navigate and extend elegantly. This means clear module boundaries with explicit interfaces, descriptive naming that models can infer purpose from, and tests that double as executable specs. The goal: systems where AI agents can make surgical changes quickly and efficiently without cascading tech debt. We’re moving from building one solution to exploring many simultaneously. This unlocks three levels of experimentation: Feature variants — Build competing product approaches in parallel. One agent implements phone-only authentication while another builds traditional email/password. Both ship behind feature flags. Let users decide which wins. Implementation variants — Build the same feature with different architectures. Redis caching on path A, SQLite on path B. Run offline benchmarks and online canaries to measure which performs better under real load. Personalized variants — Stop looking for a single winner. The most radical shift: each user might get their own variant. Not just enterprise vs consumer, but individual-level personalization where the system learns what works for you specifically. Power users get keyboard shortcuts and dense information; casual users get guided flows with progressive disclosure. Users who convert on social proof see testimonials; analytical users see feature comparisons. AI makes the economics work — what was prohibitively expensive (maintaining thousands of personalized codepaths manually) becomes viable when AI generates, tests, and synchronizes variants automatically. The skill: running rigorous evals, measuring trade-offs with metrics, and orchestrating the complexity of multiple live variants. Every API call has a price, a latency budget, and quality trade-offs. 
You’ll need to master arbitrage between expensive reasoning models and cheaper models, knowing when to leverage MCPs, local tools, or cloud APIs. Learn how models approach refactors differently from new features or bug fixes, then tune prompts, context windows, and routing strategies accordingly. You’ll need to build golden test sets, trace model runs, classify failure modes, and treat evals like unit tests. Evaluation frameworks with baseline datasets, regression suites, and automated canaries that catch quality drift before production become non-negotiable. Without observability, you can’t iterate safely or validate that changes actually improve outcomes. Framework fluency loses value when AI handles syntax. What matters is depth in three areas: Core computer science fundamentals — Not because AI doesn’t know them, but because you need to verify AI made the right trade-offs for your specific constraints. AI might use quicksort when your dataset is always 10 items. It might optimize a function that runs once a day while missing the N+1 query in your hot path — where you loop through 1000 users making a database call for each instead of batching. Your value is code review with context: catching when AI optimizes for the wrong thing, knowing when simple beats clever, and spotting performance cliffs before they ship. Product judgment — Knowing which problem to solve, not just how to solve it. AI can build any feature you describe, but it can’t tell you whether that feature matters. Understanding user needs, prioritizing ruthlessly, and recognizing when you’re overbuilding becomes the bottleneck. Domain expertise — Deep knowledge of your problem space — whether it’s payments, healthcare, logistics, or graphics. AI can write generic code, but it struggles with domain-specific edge cases, regulations, and the unwritten rules experts know. The more niche your expertise, the harder you are to replace. These are the skills that matter for the next three years. 
But I don't have a crystal ball beyond that. At the pace AI is evolving, even conductors might become a role that AI plays better. The orchestration itself could be automated, leaving us asking the same questions about the next evolution. For now, learning to conduct is how we stay relevant. Companies will change how they ship too; but the nearer shift is the individual's role, so that's my focus for this post.  ↩︎

Kaushik Gopal 1 month ago

Sorting Prompts - LLMs are not wrong, you just caught them mid-thought

Good sensemaking processes iterate. We develop initial theories and note some alternative ones. We then take those theories and stack up the evidence for one against the other (or others). Even while doing that, we keep an eye out for other possible explanations to test. When new explanations stop appearing and we feel that the evidence pattern increasingly favors one idea significantly over another, we call it a day. LLMs are no different. What is often deemed a "wrong" response is merely a first pass at describing the beliefs out there. And the solution is the same: iterate the process. What I've found specifically is that pushing it to do a second pass, without putting a thumb on the scale, almost always leads to a better result. To do this I use what I call "sorting statements" that try to do a variety of things. Mike Caulfield is someone who cares about the veracity of information. The entire post is fascinating and has painted LLM search results in a new way for me. I now have a Raycast Snippet which expands to this: Already I'm seeing much better results.

Kaushik Gopal 1 month ago

Build your own /init command like Claude Code

Build your own `/init` command. Claude's `/init` makes it easy to add clear repo instructions. Build your own and use it with any agent to add or improve on an existing AGENTS.md. Here's the one I came up with.

Claude Code really nailed the onboarding experience for agentic coding. Open it, type `/init`, and you get a CLAUDE.md that delivers better results than a repo without proper system instructions (or an AGENTS.md). It's a clever way to ramp up on a repo fast. As I wrote last time, it hits one of the three levers for successful AI coding - seeding the right context. Even Codex CLI now comes with a built-in init prompt.

There's no secret 1 sauce: `/init` is just a strong prompt that writes (or improves) an instructions file. Here's the prompt, per folks who've reverse‑engineered it: You can write your own and get the same result. I use a custom init command on new repos to get up and running fast 2. I tweaked it to work across different coding agents and sprinkled in a few tips I collected along the way. It should create a relevant AGENTS.md; if one exists, it updates it. Save this prompt as a custom command and use it with any tool — Gemini CLI, Codex, Amp, Firebender, etc. You aren't stuck with any single tool. One more tip: a reasoning model works best for these types of commands.

I must say: the more time I spend with these tools, the more "emperor‑has‑no‑clothes" moments I have. Some of the ways these things work are deceptively simple.

Claude does a few other things, like instructing its inner agent tools (BatchTool & GlobTool) to collect related files and existing instruction files as context for generating or updating the result. But the prompt is the meat.  ↩︎ I used this prompt when I vibe‑engineered a maintainable Firefox add‑on.  ↩︎
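I won't reproduce the reverse-engineered prompt here, but the general shape of a custom init command file looks something like this — everything below is illustrative, not Claude's actual prompt or mine:

```markdown
---
description: Create or update this repo's AGENTS.md
---

Explore this repository, then write (or update) AGENTS.md at the root.
Capture only repo-specific facts:

- Exact build, test, and lint commands (verify each one runs first)
- High-level architecture: key directories and how they relate
- Conventions actually present in the code (naming, error handling, test style)
- Anything surprising a new engineer would trip over

Keep it short. No generic advice.
```

Drop a file like this into whatever custom-command directory your agent supports, and every tool gets the same onboarding behavior.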

Kaushik Gopal 1 month ago

Three important things to get right for successful AI Coding

I often hear AI coding feels inconsistent or underwhelming. I’m surprised by this because more often than not, I get good results. When working with any AI agent ( or any LLM tool ), there are really just three things that drive your results: This may sound discouragingly obvious, but being deliberate about these three (every time you send a request to Claude Code, ChatGPT etc.) makes a noticeable difference. …and it’s straightforward to get 80% of this right. LLMs are pocket‑sized world knowledge machines. Every time you work on a task, you need to trim that machine to a surgical one that’s only focused on the task at hand. You do this by seeding context. The simplest way to do this, especially for AI Coding: There are many other ways, and engineering better context delivery is fast becoming the next frontier in AI development 2 . Think of prompts as specs, not search queries. For example: ‘Write me a unit test for this authentication class’ 🙅‍♂️. Instead of that one‑liner, here’s how I would start that same prompt: I use a text‑expansion snippet, , almost every single time. It reminds me to structure any prompt: This structure forces you to think through the problem and gives the AI what it needs to make good decisions. Writing detailed prompts every single time gets tedious. So you might want to create “ command ” templates. These are just markdown files that capture your detailed prompts. People don’t leverage this enough. If your team maintains a shared folder of commands that everyone iterates on, you end up with a powerful set of prompts you can quickly reuse for strong results. I have commands like , , , etc. AI agents hit limits: context windows fill up, attention drifts, hallucinations creep in, results suffer. Newer models can run hours‑long coding sessions, but until that’s common, the simpler fix is to break work into discrete chunks and plan before coding. Many developers miss this. 
I can’t stress how important it is, especially when you’re working on longer tasks. My post covers this; it was the single biggest step‑function improvement in my own AI coding practice. Briefly, here’s how I go about it: One‑shot requests force the agent to plan and execute simultaneously — which rarely produces great results. If you were to submit these as PRs to your colleagues for review, how would you break them up? You wouldn’t ship 10,000 lines, so don’t do that with your agents either. Plan → chunk → execute → verify. So the next time you’re not getting good results, ask yourself these three things: I wrote a post about this btw, on consolidating these instructions for various agents and tools.  ↩︎ Anthropic’s recent post on “context engineering” is a good overview of techniques.  ↩︎ the context you provide the prompt you write executing in chunks System rules & agent instructions : This is basically your file where you briefly explain what the project is, the architecture, conventions used in the repository, and navigation the project 1 . Tooling : Lot of folks miss this, but in your AGENTS.md, explicitly point to the commands you use yourself to build, test and verify. I’m a big fan of maintaining a single with the most important commands, that the assistant can invoke easily from the command line. Real‑time data ( MCP ): when you need real-time data or connect to external tools, use MCPs. People love to go on about complex MCP setup but don’t over index on this. For e.g. instead of a github MCP just install the cli command let the agent run these directly. You can burn tokens if you’re not careful with MCPs. But of course, for things like Figma/JIRA where there’s no other obvious connection path, use it liberally. Share the high‑level goal and iterate with the agent Don’t write code in this session; use it to tell the agent what it’s about to do. 
- Once you’re convinced, ask the agent to write the plan in detailed markdown in your folder
- Reset context before you start executing
- Spawn a fresh agent, load . Implement only that task, verify & commit.
- Reset or clear your session. Proceed to and repeat.

- Am I providing all the necessary context?
- Is my prompt a clear spec?
- Am I executing in small, verifiable chunks?
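The “command” templates described earlier are just markdown files that capture a detailed, reusable prompt. Here is a hypothetical sketch of one; the filename, folder, and checklist contents are invented for illustration and are not the author’s actual templates:

```markdown
<!-- hypothetical example: commands/write-tests.md -->
Write unit tests for the class or function I point you at.

Before writing any code:
- Read the existing tests in this module and match their style and naming.
- List the behaviors you plan to cover and wait for my confirmation.

Constraints:
- Use the project's existing test framework and helpers; add no new dependencies.
- Cover the happy path, edge cases, and at least one failure mode.
- Run the test suite and report the results before declaring the task done.
```

Invoking a template like this in your agent expands to the full prompt, so you get the benefit of a detailed spec without retyping it each time.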

Kaushik Gopal 2 months ago

Vibe-engineering a Firefox add-on: Container Traffic Control

I wanted to test a simple claim: you can ship maintainable software by vibe-coding end to end. I set strict constraints: In about a day 1 I had a working Firefox add-on I could submit for review. The code meets my bar for readability and long‑term change. Even the icon came from an image model 2 . Introducing Container Traffic Control . Install and source • Install: Firefox add-on listing • Code: github.com/kaushikgopal/ff-container-traffic-control It’s in vogue to share horror stories of decimated vibe-coded repos 3 . But I’m convinced that with the right fundamentals, you can vibe-code a codebase you’d comfortably hand to another engineer. This was my experiment to vet my feelings on the subject. Granted, this was a small and arguably very simple repository, but I’ve also seen success with moderately larger codebases personally. It comes down to scrupulous pruning : updating system instructions, diligent prompting, and code review. I plan to write much more about this later, but let’s talk about some of the mechanics of how it went: I didn’t write a single line of JavaScript by hand. When I needed changes, better structure, reusable patterns, small refactors — I asked the agent. The goal throughout was simple: keep the codebase readable and maintainable. It now has a lot of the things we consider important for a decent codebase: The best part: most of this came together over two days 4 . Some example pull requests from the repository with the exact prompt I used and the plan that was generated: Here’s the very first prompt I used to generate the guts of the code: I captured my prompts but wasn’t diligent about surfacing them in pull requests; here are a few I did capture: The code is open source, so go ahead and check it out . In my last post, How to Firefox , I covered “Privacy power-up: Containers” 5 . “Containers” let you log in to multiple Gmail accounts without separate browser profiles. Add Total Cookie Protection and you get strong isolation. 
That’s great, but managing it automatically gets tedious fast. Examples: Added these test cases: I realized while writing this post that I should probably have these exact use cases tested, so I did just that… as I continued to flesh out this post. You can’t achieve this level of control with default containers unless you micromanage every case — and even then, some are impossible. I tried various add-ons but kept hitting cases that just wouldn’t work . So I built my own. I also prefer how this add-on asks you to set up rules: Overall, I enjoyed the experiment. I’ve been happily using my add-on , and I feel confident that if I needed to make changes, I could do it in what I consider a maintainable codebase. Stay tuned for my tips on how you can use AI coding more constructively. vibe-coding vs vibe-engineering Simon Willison started using the term vibe-engineering for precisely vibe-coding with this level of rigor. I’m trying to adopt this more. The bulk took a few hours; the rest was tweaks between other work.  ↩︎ Google’s new 🍌 model .  ↩︎ which I don’t for a second deny exist.  ↩︎ Honestly, the work put together was probably a few hours. I was issuing commands mostly on the side and going about my business, coming back later when I had time to tweak and re-instruct.  ↩︎ I’ve since updated the post to point to my new Firefox add-on.  ↩︎

- New platform (I haven’t built a browser extension/add-on)
- Language I’m no longer proficient in (JavaScript)
- Zero manual code editing

- Tests (for the important parts)
- Well-organized code
- Clear, useful logging
- Code comments (uses a style called space-shuttle style programming , which I think is increasingly valuable with vibe-coding)

Here’s a PR where midway I captured a major feature change: the original version of the add-on used a very different way of capturing the rules. It wasn’t as intuitive, so I decided to change it up. This was more a fun one where I asked it to critique the code as an HN reader would.
Some good suggestions came out of it, but the explicit persona callout didn’t generate anything helpful in this specific case.

- Keep searches in one container, but open result links in my default container.
- From work Gmail, clicking a GitHub link: if it’s , open in Work; if it’s , open in Personal.
- In Google Docs (Personal), clicking a Sheets or Drive link should stay in Personal — even though my default for Sheets is Work.
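The routing the examples above describe is, at its core, a first-match-wins list of URL rules. The add-on itself is JavaScript, but the logic can be sketched language-agnostically in shell. Everything here is invented for illustration: the patterns, container names, and function are not the add-on’s actual rule syntax, and real rules can also depend on which container the click originated from, which plain URL matching alone can’t express.

```shell
#!/bin/sh
# First-match-wins container routing (illustrative sketch only).
container_for() {
  case "$1" in
    *github.com/acme-corp/*)          echo "Work" ;;      # org repos go to Work
    *github.com/*)                    echo "Personal" ;;  # everything else on GitHub
    *docs.google.com/spreadsheets/*)  echo "Work" ;;
    *)                                echo "Default" ;;
  esac
}

container_for "https://github.com/acme-corp/backend"
container_for "https://github.com/someone/some-repo"
container_for "https://example.com"
```

The order of the patterns matters: the more specific rule must come before the catch-all, exactly like the “work GitHub vs personal GitHub” example above.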

Kaushik Gopal 2 months ago

A terminal command that tells you if your USB-C cable is bad

Update: now includes macOS Tahoe support. Apple slightly altered the system command for Tahoe. You have a drawer full of USB cables. Half are junk that barely charge your phone. The other half transfer data at full speed. But which is which? Android Studio solved this. Recent versions warn you when you connect a slow cable to your phone: I wanted this for the command line. So I “built” 1 , a script to check your USB connections. The script parses macOS’s 2 command, which produces a dense, hard-to-scan raw output: With a little bit of scripting, the output becomes much cleaner: When I connect my Pixel 3 : The first version was a bash script I cobbled together with AI. It worked, but was a mess to maintain. Because I let AI take the wheel, even minor tweaks like changing output colors were difficult. Second time around, I decided to vibe-code again but asked AI to rewrite the entire thing in Go . I chose Go because I felt I could structure the code more legibly and tweaks would be easier to follow. Go also has the unique ability to compile a cross-platform binary, which I can run on any machine. But perhaps the biggest reason: it took me a grand total of 10 minutes to have AI rewrite the entire thing. I was punching through my email actively as Claude was chugging on the side. Two years ago, I wouldn’t have bothered with the rewrite, let alone creating the script in the first place. The friction was too high. Now, small utility scripts like this are almost free to build. That’s the real story. Not the script, but how AI changes the calculus of what’s worth our time. yes, vibe coded. Shamelessly, I might add.  ↩︎ previous versions of macOS used   ↩︎ I had Claude pull specs for the most common Pixel phones. I’ll do the same for iPhones if I ever switch back.  ↩︎
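The core of a script like this is just parsing the dense output of macOS’s USB report. On many macOS versions that command is `system_profiler SPUSBDataType` (the post notes Tahoe changed it, so treat the exact command as version-dependent). The sketch below parses a hand-written approximation of that output rather than calling the real command, so it runs anywhere; the awk logic is illustrative, not the actual script:

```shell
#!/bin/sh
# Parse "Speed:" lines out of system_profiler-style output.
# The sample text is a hand-written approximation of the real report.
sample='
USB 3.1 Bus:
  Pixel 8:
    Speed: Up to 480 Mb/s
  Portable SSD:
    Speed: Up to 10 Gb/s
'
printf '%s\n' "$sample" | awk -F': ' '
  /:$/ && $0 !~ /Bus/ { sub(/^ +/, ""); sub(/:$/, ""); dev = $0 }  # remember device name
  /Speed:/            { print dev " -> " $2 }                      # report its link speed
'
```

A device negotiating “Up to 480 Mb/s” on a USB 3 bus is the tell-tale sign of a charge-only or USB 2 cable, which is exactly the signal the script surfaces.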

Kaushik Gopal 4 months ago

reclaiming em-en dashing back from AI and lowercasing

AI is transforming our tools, our writing, and — apparently now — our sense of typographic originality. But there are two quirks of my writing that now get me side-eye from friends: I know, i know. nobody likes the guy who says he liked the band before they got famous. but here we are. I was lowercasing and em-en dashing before AI and i’d like to claim this back please. my friends roll their eyes when i try. even the polite ones. so, since i can’t get a word in at dinner, i’ll do it here. I need to build some credibility before I attempt to explain myself. one of the advantages of writing this blog for some time now: I can do a quick 1 and show you some early mention of these “quirks”: The earliest post I have here using an em dash is from 2017. might I remind you, ChatGPT was introduced to this world in 2022 . And now the more egregious one that Sam Altman is attempting to rob me of 2 : listen folks, — & – are beautiful characters that the english language provides us. Strunk & White, who wrote a once-seminal guide, have this to say: A dash is a mark of separation stronger than a comma, less formal than a colon, and more relaxed than parentheses… we cannot let the AI overlords claim these characters as theirs. I have a proposal: Put spaces around your em and en dashes. you see when AI generates text, it tends to do the typographically standard thing—not pad the dash with spaces, like i just demonstrated. but if you did pad it with spaces — it becomes a stylistic choice, and adds visual rhythm to boot. we can now use em and en dashes and prove that ChatGPT didn’t write this! I’ve mentioned my rsi before and even the crazy keyboard hacks to help me with it. one important one is mapping caps-lock to escape. if i have to capitalize my characters often, that’s my pinky painfully stretching to one of the ⇧ (shift) keys while typing the character with the other hand. no bueno for me.
i now implore you: let’s normalize not treating people who type in all lowercase as AI zealots proving their humanness. yeah, i know this one is a stretch. but my fingers will click-clack their way through the shame.

- em and en dashes
- predominant lowercasing

very cool git trick to search your history, by the way. yes, I was desperate to find the evidence.  ↩︎ The eagle eye will notice that I sometimes use proper punctuation. this is purely accidental or most likely my Mac auto-correcting my i’s.  ↩︎

Kaushik Gopal 4 months ago

Getting Into Flow State with Agentic Coding

I recently found myself in a deep state of flow while coding — the kind where time melts away and you gain real clarity about the software you’re building. The difference this time: I was using Claude Code primarily. If my recent posts are any indication, I’ve been experimenting a lot with AI coding — not just with toy side projects, but high-stakes production code for my day job. I have a flow that I think works pretty well. I’m documenting it here as a way to hone my own process and, hopefully, benefit others as well. Skeptics vs cynics Many of my friends and colleagues are understandably skeptical about AI’s role in development. That’s ok. That’s actually good. We should be skeptical of anything that’s upending our field with such ferocity. We just shouldn’t be cynical 1 . You know what’ll definitely get you out of the flow? Having to constantly repeat basic instructions to your agent. These are all very important instructions that you shouldn’t have to repeat to your agent every single time. Invest a little upfront in your master ai instructions file. It makes a big difference and gets you up and coding quickly with any agent. Of course, I also recommend consolidating your ai instructions to a single source of truth , so you’re not locked in with any single vendor. Update: Check out my post ExecPlans . I now use the ExecPlans approach from OpenAI’s Aaron Friel for this. I won’t lie. The idea of this step did not exactly spark joy for me. There are times when I plan my code out in a neat list, but rarely. More often than not, I peruse the code, formulate the plan in my head and let the code guide me. Yeah, that didn’t work at all with the agents. Two phrases you’ll hear thrown around a lot: “Context is king” and “Garbage in, garbage out.” This planning phase is how you provide high-quality context to the agent and make sure it doesn’t spit garbage out. After I got over my unease and explicitly spent time on this step, I started seeing dramatically better results.
Importantly, it also allows you to pause and have the agent pick up the work at any later point of your session, without missing a beat. You might be in the flow, but if your agent runs out of tokens, it will lose its flow. Here’s my process: This is the exact prompt I use: I have this prompt saved as a “plan-tasks.md” command in my folder so I don’t have to type this all the time. What follows now is a clean back-and-forth conversation with the agent that will culminate with the agent writing down the task plan. Sometimes I may not understand or agree with the task or sequencing, so I’ll ask for an explanation and adapt the plan. Other times I’m certain a task isn’t needed and I’ll ask for it to be removed. This is the most crucial step in the entire process . A note on software engineering experience If the future of software engineering is going to be AI reliant, I believe this planning step is what will distinguish the senior engineers from the junior ones. This is where your experience shines: you can spot the necessary tasks, question the unnecessary ones, challenge the agent’s assumptions, and ultimately shape a plan that is both robust and efficient. I intently prune these tasks and want them to look as close to the sequence I would actually follow myself. To reiterate: the key here is to ask for the plan to be written so that another agent can execute it autonomously . After committing all the plans (in your first PR), the real fun begins and we implement . You would think my process now is to spawn multiple agents in parallel and ask them to go execute the other plans as well. Maybe one day, but I’m not there yet. Instead, I start to parallelize the work around a single task. Taking as an example, I’ll spawn three agents simultaneously: Notice that with this approach, the likelihood of the agents causing merge conflicts is incredibly slim. Yet, all three tasks are crucial for delivering high-quality software. This is typically when I’ll go grab a quick ☕. 
When I’m back, I’ll tab through the agents, but I spend most of my time with the implementer. As code is being written, I’ll pop open my IDE and review it, improving my own understanding of the surrounding codebase. Sometimes I might realize there’s a better way to structure something, or a pattern I’d missed previously. I don’t hesitate to stop the agents, refine the plan, and set them off again. Eventually, the agents complete their work. I commit the code as a checkpoint (I’m liberal with my s and use them as checkpoints). I run the tests, and more often than not, something fails. This isn’t a setback; it’s part of the process. I analyze whether the code was wrong or the test was incomplete, switch to that context, and fix it. Refactor aggressively This is also where I get aggressive with refactoring. It’s a painful but necessary habit for keeping the codebase clean. I still lead the strategy, devising the plan myself, and then direct the agent to execute it or validate my approach. While it usually agrees (which isn’t always comforting 🙄), it can sometimes spot details I might have otherwise missed. Once I’m satisfied and the checks pass, I do a final commit and push up a draft PR. I review the code again on GitHub, just as I would for any other PR. Seeing the diff in a different context often helps me spot mistakes I might have missed in the IDE. If it’s a small change, I’ll make it myself. For anything larger, I’ll head back to the agents. All the while, I’m very much in the flow. I may be writing less of the boilerplate code, but I’m still doing a lot of “code thinking”. I’m also not “vibe coding” by any stretch. If anything, I’m spending more time thinking about different approaches to the same problem and working to find a better solution. In this way, I’m producing better code. I’ve been building like this for the past few weeks and it’s been miraculously effective. Deep work beats busywork, every time.
Let agents handle the gruntwork, so you can stay locked in on what matters most. lest we get swept by this tide vs surfing smoothly over it  ↩︎ I absolutely love deconstructing a feature into multiple stacked PRs .  ↩︎

- set the stage
- plan with the agent (no really 🤮)
- spawn your agents
- verify and refactor
- the final review

All my plans live in an folder. I treat them as fungible; as soon as a task is done, the plan is deleted. Say I’m working on a ticket and intend to implement it with a stack 2 of 4 PRs. Each of those PRs should have a corresponding plan file: , , etc. The plan (and PR) should be discrete and implement one aspect of the feature or ticket. Of course, I make the agent write all the plans and store them in the folder. ( I came this far in my career without being so studious about jira tasks and stories; I wasn’t about to start now ).

- Agent 1: The Implementer. Executes the core logic laid out in the plan.
- Agent 2: The Tester. Writes meaningful tests for the functionality, assuming the implementation will be completed correctly.
- Agent 3: The Documenter. Updates my project documentation (typically in ) with the task being executed.
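The plans-folder rhythm described above can be sketched in shell. The folder name, plan file names, and commit messages here are illustrative assumptions (not the author’s exact layout), and the actual agent work is stubbed out as a comment:

```shell
#!/bin/sh
# Sketch of the plan -> chunk -> execute -> verify loop, one plan per PR.
set -e
cd "$(mktemp -d)"
git init -q .
git config user.email dev@example.com
git config user.name dev

mkdir -p plans
echo '# Task 1: extract the auth client'  > plans/plan-1.md
echo '# Task 2: add unit tests'           > plans/plan-2.md

for plan in plans/plan-*.md; do
  echo "executing $plan in a fresh agent session"
  # ...spawn a fresh agent here, load $plan, implement, run tests...
  git add -A
  git commit -q -m "checkpoint: $plan"
  rm "$plan"   # plans are fungible: delete once the task is done
done
```

Each loop iteration mirrors one discrete PR in the stack: fresh context, one plan, a verifiable checkpoint commit, then the plan file is discarded.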

Kaushik Gopal 4 months ago

Introducing “shorts” for Henry (my Hugo blog engine/theme)

Introducing “shorts” for Henry (my Hugo blog engine/theme). Often, I find myself wanting to post a quick thought or note without the ceremony of a full-blown “blog post”. That’s usually when I’d post to Bluesky , X , Threads , or Mastodon . But as I’ve said before , I prefer investing in a feed I control. With Henry, I can effortlessly post a quick thought, and share it from a feed I own.

Kaushik Gopal 4 months ago

🦊 How to Firefox

Chrome finally pulled the trigger on the web’s best ad-blocker, uBlock Origin . Now that Chrome has hobbled uBO, Firefox — my beloved — 1 is surging again. I want to do my part to convince you to switch to Firefox and show you how I use it. Let’s get through the important talking points, in case you need a quick copy-paste to convince a friend. This section can be quick. Here’s a github link to the source code of the Firefox browser. You can clone the repo, pop open your favorite AI code assistant and start asking questions about your browser - the most important app you use. What libraries does their Android app use? libs.versions.toml boom! Also 8.11.1 on android gradle plugin? not bad Firefox. Their license allows you to fork and distribute alternative versions . Vibe code a whole new browser. Most of the web today is enshittified with a cesspool of ads, popups, cookie notices, and tracking scripts. Our primary defense has been ad-blockers, with the most powerful being uBlock Origin. uBO relies on community-curated filter lists that play a cat-and-mouse game to zap known ads, trackers, and other digital sludge. But with Chrome controlling the web, Google followed through on its promise to kneecap uBO with Manifest V3, effectively blocking the full version from its extension store. Sure, there’s uBlock Origin “Lite” now, which does the same thing, right? Nope: uBlock Origin Lite != uBlock Origin.

- Filter lists update only when the extension updates; no fetching up-to-date lists from servers (this is a big one!)
- No custom filters, so no element picker which allows you to point and zap
- Many filters are dropped at conversion time due to MV3’s limited filter syntax
- No strict-blocked pages
- No per-site switches
- No dynamic filtering
- No importing external lists

Did you know, uBlock Origin works best on Firefox . Why not just use the real thing then? My browsing experience is beautiful because I have most of the shit-bits blocked away. On my Pixel too.
With Firefox for Android, you get seamless sync of tabs, bookmarks, passwords between browser and phone 2 . Let’s face it, Safari between Mac and iPhone is a sublime experience. We can get that with Firefox. Here’s something the iPhone isn’t getting anytime soon: honest-to-god browser extensions that you use on your desktop, also on your phone. Which means… you can run uBlock Origin on Android, completely unnerfed . Safari has extensions, but they still require an App Store review for distribution on Apple platforms. They also just got a version of the uBO “Lite” extension. But… Firefox doesn’t look as clean and minimal as Safari. You can claw the vertical tabs out of my cold dead Arc hands! This is what my Firefox browser looks like: It only takes about five minutes and a browser restart to get this look. I’ll walk you through my setup now, from essential add-ons to privacy tweaks and a few “nice-to-have” extras. Nerd Alert This is my setup. I’m a nerd, so I find joy in tinkering. You don’t need to do all of this, but a few small tweaks can give you a massively better browser. Think of uBO as a powerful, wide-spectrum filter for the web. It uses community-maintained lists to block ads, trackers, cookie notices, and other digital sludge before it ever loads. Your browser stays faster and cleaner. It can be confusing to know which filter lists to enable. I follow the advice of a uBO wizard on Reddit , and these settings alone make the web 90% better. Check the same boxes, and you’re good to go. Custom Filters are an exclusive uBO superpower The “My filters” tab is where you can write your own rules to block nearly anything, from annoying widgets to entire domains. For the truly privacy-conscious, uBO can block all outgoing traffic to specific domains, like Facebook. 3 In the past, Firefox’s Facebook Container add-on helped by isolating your Facebook activity. 
But if any other site embedded a Facebook widget or tracker, your data could still leak to Meta’s servers, fingerprinting you even if you never visit Facebook directly. With a custom uBO rule, you can sever that connection entirely from non-Facebook sites. This is a level of control other browsers don’t offer. The other line you see there? That one-liner blocks all those “Sign in with Google?” pop-ups. This granular control is only possible with the full uBlock Origin, not the “Lite” version found on other browsers. If you want to go deeper, this video is a great showcase of its advanced capabilities. Firefox now includes Total Cookie Protection (TCP) by default. This automatically isolates cookies to the site that created them, giving each site its own “cookie jar”. Importantly, it means sites aren’t allowed to read cookies from other sites’ jars. This stops trackers from following you across the web. Firefox first piloted this feature as the Multi-Account Containers (MAC) add-on. But with TCP, the containers feature is somewhat redundant for basic anti-tracking . However, the container technology is still incredibly useful if you want to seamlessly manage multiple online identities. Instead of juggling separate browser profiles, you can use “Containers” to stay logged into two different Gmail accounts (e.g., “Work” and “Personal”) in the same browser window, with zero overlap. The old MAC add-on made this possible, but it was really clunky to set up. It’s actually much easier to do this today without installing an additional add-on. That’s it. This works without the extra MAC add-on because the Container concept is baked natively into Firefox. So with the above config tweaks, you enable the default built-in containers and whenever you click the new tab button, you can choose which container to open. You can now open Gmail in these containers and it’s as if you’re opening an entirely new browser. But what about links?
A work link (like Datadog or Sentry) clicked from your email in a Work container might open in the default container and use the wrong Google account. You could right-click the link and say “Open in Container >” but that gets old fast. Wouldn’t it be nice if you could give Firefox a set of rules and say, always open these websites or URLs in these containers? and everything just worked magically? I created the Firefox add-on Container Traffic Control to help with that exact requirement. It’s open source and pretty well documented + tested. This combination of the native config tweak and my add-on provides a simple, but more powerful multi-profile setup (than even MAC). These are also not essential, but they add a nice layer of polish. Firefox is famously customizable via . Besides the container tweak, I only use one other: I’m collecting in this section, some of the cooler Firefox features that’ll make you wonder why every browser doesn’t have them: The web can still be beautiful. You just need the right tools to see it. Go download Firefox and make your web beautiful again. If you try this setup or have suggestions, let me know in the comments. No, goddammit, AI didn’t write this post.  ↩︎ For the few who have reached this point of the article and furiously questioned why I don’t just use Zen browser or Libre.  ↩︎ You can of course send outgoing traffic from Meta owned websites so Threads etc. still work.  ↩︎ Seriously, try the apostrophe trick. It’s a game-changer for keyboard navigation.  ↩︎

- 100% open-source
- Un-enshittify the web
- Android users rejoice
- Customize to your heart’s content
- Dark Reader : For a consistent, customizable dark mode on every site.
- Stylus : To apply custom CSS. I use it to force my on code blocks.
- Return YouTube Dislike : Does what it says on the tin.
- Obsidian Web Clipper : To save notes and clippings directly to Obsidian, from desktop or mobile.
- Auto Tab Discard : Suspends background tabs to save RAM. A holdover from my RAM-strapped MacBook days, but it still does its job silently.

- New tabs open next to your current tab, not at the end. You catch that Mr. Gruber?
- Type and start typing for quick find (vs ⌘F). But dig this: and Firefox will only match text for hyperlinks 4
- If an obnoxious site disables right-click, just hold Shift and Firefox will bypass it and show it to you. No add-ons required.
- URL bar search shortcuts: for bookmarks, for open tabs, for history

Kaushik Gopal 4 months ago

AI Programming Paradigms: A Timeline

A developer podcast host recently said they only use AI for autocomplete. This shocked me. That’s two generations behind today’s state of the art. This is how the field is evolving: This used to be one of the biggest selling points of certain IDEs like JetBrains & Visual Studio 1 . The first wave of AI changed this. IDEs now predict entire classes and logic blocks, not just keywords. They excel because they feed surrounding code context to the LLM for relevant suggestions. GitHub deserves credit for kicking off the revolution with their Copilot offering. Here’s an early YouTube video of mine from June 2022 2 showing it in action. Autocomplete keeps advancing. Cursor (today’s most popular AI IDE) has “Tab” . JetBrains, the pre-AI autocomplete champion, is building Mellum (an LLM built for code completion). This paradigm is alive and well, but it’s become table stakes. Most developers use it, but it’s far from the frontier. When ChatGPT took the world by storm, a new paradigm emerged. Conversational coding, where you chat with the AI and pair program together. Unlike the autocomplete era, where you trust the AI’s suggestions, here you direct the AI , give it context, and nudge it toward better solutions. This is arguably what most devs use today and envision when they hear “AI programming”. It feels magical, productivity jumps are real, and it’s hard to think of ever going backwards. Cursor leads this charge: chat with your IDE and have it make code changes on the fly. The challenge with conversational coding though is IDE lock-in. Android developers love Android Studio, iOS developers love Xcode. Asking them to switch to Cursor is a big ask. There are workarounds: Firebender offers Cursor-like functionality to existing IDEs, JetBrains has Junie (though they’re late to the party), and Xcode developers… well, they wait for Apple. This constraint is why the next paradigm is starting to shine. This is the bleeding edge.
Agentic coding works anywhere you can run commands. Pop open a terminal, fire up Claude Code , and let it work independently. The AI comes up with a plan, confirms the plan, makes changes, runs tests, fixes errors — all through your existing CLI tools. Claude dominates here. The TUI feels retro but works brilliantly for terminal-heavy development. You give the agent a task. It runs commands, checks results, iterates. The feedback loop is tight: ask, execute, verify, repeat. The space is exploding. Google’s Gemini CLI runs on Gemini 2.5 Pro with aggressive pricing 3 . Cursor too has a similar take that they call background agent mode. There are also agentic tools outside of the terminal like Jules (Google) and Codex (OpenAI) that run cloud agents with GUIs. But Claude nails the terminal workflow imho. The core idea for agentic coding is the same: let the AI act on its own, not just suggest. So what’s next? My guess: we’ll move from mono-agentic coding to multi-sub-agent workflows. Claude can already do this today 4 . Think of a chess master playing a (simul)taneous exhibition match: multiple agents building features in parallel, coordinating to avoid merge hell, and you at the center, reviewing each PR in its own tab. The field evolves at breakneck speed. By the time you read this, this might feel like a mainstay. Revolutionary becomes quaint in two years. Today’s cutting edge is tomorrow’s baseline. Ah, the million-dollar question. Where does vibe coding fit? Andrej Karpathy (AI Programming Yoda) coined this term . But vibe coding isn’t a paradigm, it’s a style. It’s about trusting the AI to make big decisions with minimal guidance. You can apply it to any paradigm above, but the vibe gets strong in collaborative and agentic modes. If you’re a programmer wondering where to invest today: don’t get too attached to one specific tool. Try them all liberally. Start getting conversant.
It’s too early to say which tool or paradigm will dominate, but AI programming is inevitable and will probably just be what we call programming. not that one. see intellisense .  ↩︎ Over 2.5 years ago! This is what I mean by “2 generations behind” (in AI development time, each year seems to bring a fundamental shift in how we interact with these tools).  ↩︎ Google is offering 60 requests per minute and 1,000 requests per day completely free . For context, Claude’s Pro plan at $20/mo gives you roughly 10–40 prompts every 5 hours.  ↩︎ instruct it specifically to “use sub-agents for this workflow”.  ↩︎

- Super autocomplete : AI predicts code, not just keywords.
- Conversational Coding : Chat with your IDE, direct the AI, iterate together.
- Agentic Coding : AI acts independently - runs commands, checks work, iterates.
- Simul Agentic ? : Multiple sub-agents, parallel workflows, working together.

Kaushik Gopal 4 months ago

Keep your AGENTS.md in sync - One Source of Truth for AI Instructions

Getting useful results from your AI assistant often hinges on providing the right instructions. Yet most developers take this step casually, then wonder why their AI outputs are mediocre.

One of the simplest ways 1 to do this is with your master instructions, or AGENTS.md, file. In this file, you provide persistent rules that shape every interaction you have with the agent. You need to keep pruning and tweaking these instructions to get the best results. If, like me, you use multiple tools (Claude, Cursor, Gemini CLI, Codex, Firebender 2 ), then updating the same rules for each of these tools quickly becomes untenable. In this post, I’ll show you how I consolidate it all and edit it in one place, while keeping it in sync everywhere else.

Every AI tool wants its own special instruction file. Good news: most coding tools have since consolidated around AGENTS.md as the standard, so it makes sense to use that now. Start by creating the master instructions file.

Generating your first file: Claude Code’s /init command generates an excellent starting template for your project by analyzing your project structure and creating sensible defaults. But if you don’t use Claude Code, I wrote a more recent post on how you can do it yourself pretty easily with a custom command.

You now have a centralized AGENTS.md: you edit in one place, and all tools stay synchronized.

Nested AGENTS.md support: most AI tools have also started to support “nested” AGENTS.md files. This means you can have specific instructions for the agent based on the module or folder it’s operating in, which keeps you from having an overly stuffed parent AGENTS.md that unnecessarily fills up your context. Coding agents know to pick up the additional context from nested folders only when it’s relevant.

While more often than not you’ll follow the above to set up agent instructions at the project level, certain agents also let you set up instructions at the user level.
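As for the actual syncing, one low-tech approach is symlinks: the tool-specific file names all point at a single AGENTS.md. A minimal sketch, assuming your tools follow symlinks (most do); the demo sandboxes itself in a temp dir, and CLAUDE.md / GEMINI.md are just examples of tool-specific names, so substitute whichever files your tools read:

```shell
# One source of truth: tool-specific instruction files become symlinks
# to AGENTS.md, so editing AGENTS.md updates everything at once.
set -e
demo=$(mktemp -d) && cd "$demo"   # sandbox for the demo; use your repo root for real
printf '# Project instructions\n' > AGENTS.md
for alias in CLAUDE.md GEMINI.md; do
  ln -sf AGENTS.md "$alias"       # -f replaces any stale copy or old symlink
done
```

Git stores symlinks as symlinks, so on platforms that support them, teammates cloning the repo get the same wiring for free.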
User-level instructions are essentially your personal preferences. In addition to the main system instructions, I also like organizing my AI-related “assets” in an .ai folder.

Does it actually help? The short answer: yes. While I haven’t done rigorous A/B testing on each rule or instruction, the improvements are noticeable. Without instructions, AI tools default to generic patterns, and I’ve found myself having to repeat basic instructions more often. Folks online also seem to agree on this.

Want to see what context your AI is actually using? Try asking it directly which instruction files it has loaded. Different tools have varying levels of transparency here: Claude and Cursor are generally forthcoming about loaded context, while others may be more opaque. Use this feedback to refine your instructions, removing redundancy and clarifying ambiguous rules.

For reference, the instruction file each tool traditionally reads:

- Claude Code : CLAUDE.md
- Cursor : .cursor/rules and previously .cursorrules
- Codex CLI , OpenCode : AGENTS.md
- Gemini CLI : GEMINI.md
- Amp : AGENT.md 3

Changelog:

- 2025-10-12: removing artifacts .ai/rules, .ai/checkpoints, .ai/docs
  - the notion of ExecPlans makes more sense than checkpoints
  - AGENTS.md should have instructions, README.md should have docs
  - rules should be incorporated into the AGENTS.md file
- 2025-09-18: change to AGENTS.md as more agents consolidate around it
  - Codex supports AGENTS.md
  - Gemini supports AGENTS.md
  - Gemini in Android Studio supports AGENTS.md
  - Cursor supports AGENTS.md
  - VSCode supports AGENTS.md

1. I wrote a later post distilling the three things that matter most for getting good AI coding results btw.  ↩︎
2. for the IntelliJ users  ↩︎
3. conveniently they own the domain https://agents.md and have graciously been trying to rally everyone behind this  ↩︎
