Latest Posts (20 found)
Robin Moffatt 2 days ago

Interesting links - February 2026

Phew, what a month! February may be shorter, but that hasn't diminished the wealth of truly interesting posts I've found to share with you this month.

Robin Moffatt 1 month ago

Reflections of a Developer on LLMs in January 2026

Funnily enough, Charles Dickens was writing about late 18th-century Europe rather than the state of AI and LLMs in 2026, but here goes: It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of light, it was the season of darkness, it was the spring of hope, it was the winter of despair.

For the last few weeks I've been coming back to this quotation, again and again.

It is the best of times (so far) for AI—you can literally describe an idea for a program or website, and it's generated for you. Hallucinations are becoming fewer. This is so much more than simply guessing the next word. Honestly, it's a sufficiently advanced technology that really is indistinguishable from magic (with apologies to Arthur C. Clarke). Whether I'd call this the age of wisdom…I'm not sure yet ;)

But at the same time…it is the worst of times, the age of foolishness, the season of darkness. Bot-farms spewing divisive nonsense all over social media no longer need to copy and paste their false statements in a way that's easily spotted; instead they can write custom text at scale whilst still giving the illusion of a real person behind the fake accounts. Combine human greed with the speed at which LLMs can generate content and you have an infinite flow of slop spurting all over the internet like a farmer's muck spreader gone awry at scale. AI voice agents are getting better and are being used to scam people with realistic and targeted calls that would previously have been uneconomical to make at the scale necessary to reap a reward. AI-generated pictures are being used to create hoaxes and flood social media with dangerous rage-baiting.

It might be the best & worst of times, but that doesn't mean you have to pick sides. Having lived through the advent of cloud computing to where it is now, I can see real parallels in how developers in the tech industry are approaching it. Some, particularly vendors & VCs, are "all in". Others believe it's a fad or a straight-up con and will give you a list of reasons why that is. Both extremes are utterly and completely wrong.

If you're the kind of Bare Metalsson character who believed the cloud was nonsense and took pride in racking your own servers (each of which has its own cute name), you're probably also burying your head in the sand when it comes to using LLMs, with similar cries of derision. And, just as running a homelab with servers and containers named after Star Wars characters is fun but you wouldn't use the same approach at work, refusing to acknowledge that AI today has the potential to make you more productive as a developer starts to look somewhat childish or irresponsible. Just because AI sometimes makes shit up, it doesn't mean that AI is never a useful tool for the right job. Strikingly, in the last month or two the list of jobs for which you can use it has suddenly grown drastically. The online chatter has moved from "omg you wouldn't let an LLM code for you" to "omg how do we review all these PRs", because guess what: all of a sudden people are letting an LLM generate code for them.

AI, and specifically LLMs, are a valuable tool for developers, and one that we need to recognise if we're not to get left behind. Picture a Capuchin monkey sat on its haunches using a stone to crack open a nut. Rudimentary, but effective. Would we as developers use a stone when we needed a hammer to bang in a nail?
No, that would be stupid—we use the right tool for the job, of course. The hammer is an evolution of the crude stone, and we use it because it's the best tool for the job. But once the hammer drill came along, would we cling to our manual hammer when we've got a nail to bang into a brick wall? Again, no, that would be stupid. We want to use the best tool for the job.

The same evolution of tooling is happening in AI. LLMs are a tool. Magical, bamboozling, hilariously-wrong-at-times tools; but ones that are evolving not over centuries or decades, but weeks and months.

Some people fundamentally object to LLMs on principle, citing their use of resources, or their threat to mankind. Personally, I believe that cat is out of the bag, the horse has bolted the stables…we're way past that. Pandora's box is open, and you and I are not shutting it. What I would observe is that if you're working in IT and you're not already adopting AI and understanding what it can (and can't) do for you, you might find yourself with a lot more time to discuss these opinions alongside the hansom cab drivers who figured that the motor engine was a fad and stuck with their horses. Put somewhat more confrontationally: you may as well be against the internet, or the combustion engine, or atomic energy. All have awful uses and implications; all also serve a role that cannot be overstated. What LLMs are enabling is truly of seismic impact, and I cannot fathom a path forward in which they do not continue to be central to how we do things with computers.

Not convinced by my reasoning above? How about these folk:

"You can't let the slop and cringe deny you the wonder of AI. This is the most exciting thing we've made computers do since we connected them to the internet. If you spent 2025 being pessimistic or skeptical on AI, why not give the start of 2026 a try with optimism and curiosity?"

Not a fan of DHH? How about Charity Majors:

"This year was for AI what 2010 was for the cloud: the year when AI stopped being satellite, experimental tech and started being the mainstream, foundational technology. At least in the world of developer tools. It doesn't mean there isn't a bubble. Of COURSE there's a fucking bubble. Cloud was a bubble. The internet was a bubble. Every massive new driver of innovation has come with its own frothy hype wave. But the existence of froth doesn't disprove the existence of value."

Or Sam Newman:

"To those of you who are deeply pessimistic around the use of AI in software delivery, the old quote from John Maynard Keynes comes to mind: 'The market can remain irrational longer than you can remain solvent'."

For a considered look at the uses of LLMs, Bryan Cantrill wrote an excellent RFD: Using LLMs at Oxide. Read the above linked articles, and also check out Scott Werner's post "The Only Skill That Matters Now", which puts it even more clearly into focus, with a nice analogy about how "skating to the puck" is no longer a viable strategy. The long and short of it is that the rate of change in AI means you have no idea where the puck will even be.

I read an article a while back, which I found again here, in which a hospital consultant described their view of LLMs thus: "Think of it as the most brilliant, talented, often drunk intern you could imagine." This was in May 2023 (eons ago, in LLM years). As an end user of LLMs, I think this mental model really does work. As a senior+ developer, think of an LLM as a very eager junior developer working for you.
They're fresh-eyed and bushy-tailed, and goddamnit they talk too much, don't listen enough, and make stupid mistakes. But…you give them a job to do, point them in the right direction, and iterate with them under close supervision…and suddenly you're finding yourself a lot more productive. Tutored well, a junior developer becomes a force-multiplier, a mini-me.

A common instinct amongst inexperienced senior+ developers tasked with looking after a junior can unfortunately be "I've not got time to show them this, I'll do it myself". As any decent developer knows, that's a short-sighted and flawed way of developing others (as well as oneself). Mentoring and teaching and nurturing juniors is one step back, two steps forward. And…the same goes for an LLM. Do you have to keep telling them the same thing more than once? Yes. Do they write code that drives you into fits of rage with its idiocy and overcomplexity? Yes. Do they improve each time, and ultimately give you more time to think about the bigger picture of system design and implementation? Yes.

I'm not suggesting—as some may take from this analogy—that we actually replace junior developers with AI. After all, junior developers learn, and in time become the senior developers who know when Claude is talking bollocks—that pipeline matters. Rather, I'm trying to characterise how one may look at the tool and one's interactions with it. I am also leaving wide open the question of what the impact of AI on junior developers themselves could actually be. The consequences for the software industry are likely to be vast. Commenting on this is beyond my experience—and there is plenty being written about it elsewhere.

Working with Claude Code over the past few weeks really has got me convinced that we've now taken a step forward, where time invested in learning how to use it (because there is a learning curve) is time well spent. Previously, using an LLM was not much more than typing into a chat box (or various cargo-culting "prompt engineering" techniques). Now you have to learn about context windows, the magical file called CLAUDE.md, and prompting to get the most out of it for coding—and that's ok. Some tools are simple (pick up a hammer and hit something) and others require more understanding (I'm not using a chainsaw anytime soon without training on it first).

Junior developers are humans. They get tired, they need rest breaks, they need feeding, and at some point they want to go home. LLMs, on the other hand, will keep on going so long as you keep feeding them tokens. The impact of this on you as their boss is substantial. You might task your junior developer with a piece of work and they'll return to you later that day, perhaps with a few interruptions to clarify a point. Claude Code, on the other hand, is like an eager puppy, bounding back and forth demanding your attention, often every minute or so. I'm still trying to work out how to balance the dopamine hit of each interaction bringing another astounding chunk of functionality delivered, against the impact the rapid context switching has on my brain. Interacting with Claude Code feels a bit like the hit we get from scrolling short video feeds.
One more prompt…one more video… Because the feedback loop is so fast, it's also very easy to get drawn down a rabbit hole of changes and either end up on a side-quest from one's intended task, or lose sight of the big picture and end up meandering aimlessly through some Frankenstein-like development path that feels fruitful because of the near-instantaneous results, but which is ultimately flawed.

That's it. Go fuck around, and find out. Exciting things are happening. Yes, the hype and BS are real and nauseating; but that doesn't stop it being true. If you're interested in the F'ing around and what I Found Out, have a look at the companion post to this one: Cosplaying as a webdev with Claude Code in January 2026.

Robin Moffatt 1 month ago

Cosplaying as a webdev with Claude Code in January 2026

In which Claude and [A]I play at being webdevs. For some reflections on the bigger picture of AI as a productivity tool for developers, have a look at the companion post to this one.

I used to speak at a lot of conferences and meetups, and published my talks on a site called Notist (noti.st). It's free to use, but you could pay for bells and whistles including a custom domain, which I duly did: talks.rmoff.net. My background is databases and SQL; I can spell HTML (see, I just did) and am aware of CSS and can fsck about in the Chrome devtools to fiddle with the odd detail…but basically frontend webdev is completely beyond me. That meant I was more than happy to pay someone else to host my talks for me on an excellent platform. This was a few years ago, and the annual renewal of the plan was starting to bite—over £100 for what was basically static content that I barely ever changed (I've only done three talks since 2021). So I decided to see what Claude Code/Opus 4.5 could do, and signed up for the £18/month "Pro" plan.

The way Claude Code works is nothing short of amazing. You use natural language to tell it what to do…and it does it. I started off by prompting it with a description of what I wanted: essentially, to take my existing content on noti.st and rebuild it as a self-hosted site. Claude Code then poked around the two sites and asked me some questions (did I want to import all content, what kind of style, etc), and then spat out a Python script to do a one-time ingest of all the content from noti.st. After seeking permission it ran the Python script and debugged the errors that were thrown, until it was happy it had a verbatim copy of the data. Along the way it would report in on what it was doing and I could steer it—much the same way you would a junior developer. For example, on noti.st a slide deck's PDF is exploded out into individual images so that a user can browse it online. This meant a crap-ton of images which I didn't care about, but Claude Code assumed I would, so started grabbing them.

Claude then proceeded to build and populate a site to run locally. There were plenty of mistakes, as well as plenty of yak-shaving ("hmm, can you move this bit to there, and change the shade of that link there"). This can be part of the danger with Claude. It will never roll its eyes and sigh at you when you ask for the hundredth amendment to your original spec, so it's easy to get sucked into endless fiddling and tweaking. I found I quickly burnt through my Pro token allowance, which actually served well as a gatekeeper on my time, forcing me to step back until the tokens were refreshed. After four early mornings/late nights around my regular work, I cut over my DNS, and you can see the results at https://talks.rmoff.net/.

The key things that Claude Code did that I'd not been able to get ad hoc chat sessions (or even Cursor) to do last year include:

- Planning out a full project like this one, from the overview down to every detail
- Talking the talk (writing the code) and walking the walk (independently running the code, fixing errors, evaluating logic problems, etc)
- Rapidly iterating over design ideas, including discussing them and not just responding one-way to instructions
- Discussing deployment options, including working through challenges given the cumulative size of the PDFs
- Explaining and building and executing and testing the deployment framework

Before the sceptics jump in, my point is not that I couldn't theoretically have done this without Claude. It's that it took, cumulatively, perhaps eight hours—and half of that will have been learning how to effectively interact with Claude.
It's that it's a single terminal into which one types; that's it. No explosion of tabs. No rabbit-holes of threads trying to figure this stuff out. One place. That fixes its own errors. That writes code that you could never have done without a serious investment of time.

Would I apply for a frontend engineering job? Heck no! Does my new site stand up to scrutiny? Probably not. Will real frontend devs look at the code and be slightly sick in their mouths? Perhaps. Does this weaken my point? Not in the slightest! £18-worth of Claude Code (less, if you pro-rata it over the month) and I've saved myself an ongoing annual bill of £100, built a custom website that looks exactly as I want it and has exactly the functionality that I want—oh, and it was a fuck-ton of fun to build too :) Would I go back? Not whilst I have access to Claude ;)

I realise that in reading this the choler will be rising in some seasoned software engineers. After all, who is this data engineer poncing around pretending to build websites? And that's perhaps the crux of it: I'm a data engineer, branching out into something I couldn't do before, courtesy of Claude. I would definitely use Claude to help me write SQL queries and generate DDL, but I'd be damned if I'd put my name to a pull request with a single byte of code that I couldn't explain—because that's my job. I like Oxide's words here: "However powerful they may be, LLMs are but a tool, ultimately acting at the behest of a human. Oxide employees bear responsibility for the artifacts we create, whatever automation we might employ to create them." So I can have fun building a website that's just my personal site, and it's only on me if it fails. But if I'm writing code as a professional for my job, it's on me to make sure that it's code I can put my name to.

There is a lot written about Claude Code. Some of it cargo-culting, some of it frothy-hype nonsense. Below I've listed out some of the 'top tips' that I'd be passing on to a colleague getting into Claude Code from scratch tomorrow.

If you're doing any kind of webdev work, follow Kris Jenkins' tip and use Playwright so that Claude can "see" as it develops. You can manually take screenshots and paste those into Claude too if you want (including ones you've annotated with observations and instructions), but in general, and particularly for regression testing, Playwright is an excellent addition. Because this is Claude, you don't need to actually know how to configure Playwright or run its tests, or anything like that. You just tell Claude: "Use Playwright to test the changes". And it does. Oh, and it'll install it for you if you don't have it already.

Claude will sometimes ask for permission to do something, or tell you it's finished its current task. If you've got it sat in a terminal window behind your other work you may not realise this, so adding a sound notification can be useful—in your Claude Code settings file, include a hook (see the sketch at the end of this section). Obviously, you can waste a lot of time customising it to use just the right sound effect from your favourite 1980s arcade game.

Depending on how you pay for Claude (fixed plans, or per API call) you'll discover sooner or later that it can be quite expensive. You can include the cost of the current session in the status line by adding a statusLine entry to the same config file as above (again, see the sketch below). There are also tools that use the Claude log data to calculate usage and break it down in different ways, which can help you optimise your use of it.
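As a rough illustration, here's what those two settings might look like together—a minimal sketch of a ~/.claude/settings.json. The sound files, the jq one-liner, and the assumption that the status-line input JSON carries a cost field are mine, not from the original post; the macOS afplay command would need swapping out on other platforms:

```json
{
  "hooks": {
    "Notification": [
      {
        "hooks": [
          { "type": "command", "command": "afplay /System/Library/Sounds/Glass.aiff" }
        ]
      }
    ],
    "Stop": [
      {
        "hooks": [
          { "type": "command", "command": "afplay /System/Library/Sounds/Funk.aiff" }
        ]
      }
    ]
  },
  "statusLine": {
    "type": "command",
    "command": "jq -r '\"Session cost: $\" + (.cost.total_cost_usd | tostring)'"
  }
}
```

The statusLine command receives session metadata as JSON on stdin, hence piping it through jq.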
Different Claude models (Opus, Sonnet, Haiku) cost different amounts, and you can optimise your spend by learning a bit about their relative strengths. I found that asking Claude itself was useful: using Opus (the most capable model), you can describe what you're going to want it to do, and ask which model it would recommend. Like all of this LLM malarkey, none of it is absolute, but I found its recommendations useful (i.e. the models it recommended were cheaper and did achieve what I needed them to). Think of it as having different pairs of running shoes in your closet—different ones are going to be suited to different tasks. You're not going to wear your $200 carbon-plate running shoes to kick a ball around the park, are you?

Go read up on things like:

- Context windows—what the LLM knows about what you're doing
- Context rot—the more that's in the LLM's context window, the less effective it can sometimes become
- CLAUDE.md—where Claude makes a note of what it is you're building and core principles, toolsets, etc. You can get a lot of value by spending some time on this so that you can restart your session when you need to (e.g. to clear the context window) without having Claude 'forget' too much of the basics of what you've told it. Work with Claude on this file—literally say "look at your CLAUDE.md; I have to keep telling you to do x—how can you remember it better?". If you give it permission, it'll then go and update its own file.
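For illustration, here's a minimal sketch of the kind of CLAUDE.md just described. The project details and conventions here are hypothetical, loosely based on the talks-site project from earlier in this post:

```markdown
# Talks site

Static site serving my conference talks, imported one-time from noti.st.

## Core principles
- Plain HTML/CSS; no frontend frameworks.
- Slide PDFs are served as-is; never resize or re-render them.

## Toolset
- Build locally before committing.
- Use Playwright to test any visual change.

## Things I keep having to repeat
- Use British English spelling in all copy.
- Ask before adding any new dependency.
```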
Use plan mode and accept-change (shift-tab) judiciously. If you just YOLO it and accept changes without seeing the plan, you'll often end up with a very busy fool going in the wrong direction. Claude is your servant (for now) and it's up to you to boss it around firmly as needs be. Watch out for Claude spinning its wheels—if you see it repeatedly trying to fix something and getting stuck, you might be burning a ton of tokens on something that it's misunderstood or that doesn't actually matter.

I've been experimenting with a few non-coding examples too, pairing Claude with basic-memory and an Obsidian vault:

- Proofreading my blog (here's the prompt, if you're interested; PRs welcome 😉). I also have a Raycast AI Preset to do this, but am finding myself more and more reaching for Claude's terminal window. It works well because I write my blog posts in Asciidoc, which Claude can read and edit directly (if I ask it to). Claude can also help you write the prompt/skill—I gave it verbatim some feedback I got from a real human being on the initial version of this post, and it distilled that into what it ought to check for next time and updated its skill. Neat.
- Planning a holiday. Iteratively building up with Claude a spec that captures the requirements of the holiday, it can then help with itineraries and checklists, discuss areas, etc etc. As with the coding project above, having one window with which to interact is really powerful.
- Acting as a running coach. Plugging in Garmin and Strava data via MCP, I can capture all of my running and health info, and discuss with Claude planned workouts, even weaving in notes from past physio appointments. Obviously I am not following it blindly, but as an exercise (geddit?!) in integration and LLMs, it's pretty fun.

This post may well have you spitting coffee into your cornflakes, I realise that. For some reflections on the bigger picture of AI as a productivity tool for developers, have a look at the companion post to this one.

Robin Moffatt 1 month ago

Interacting with Developers on Reddit

LLMs are rapidly changing how we use the internet. Remember just a few years ago, when you'd search for something on Google and scroll through the results like some kind of Neanderthal? Heck, you might even click through to page 2 if you were feeling spicy. These days—and, knowing how this stuff ages, I should perhaps be less broad than "these days" and say just "in January 2026"—Google's AI Overview at the top of the results has got pretty good for basic stuff, making looking at the actual search results less necessary. That's if folk even get to Google, when they've got an LLM close at hand to answer any and every question that they throw at it (regardless of whether it's a lazy "how do you spell irony" or the somewhat more LLM-appropriate "ELI5 nuclear fusion").

These factors mean that marketing teams at vendors are seeing their site traffic drop off the proverbial cliff 📉. And if you can't get folk onto your site to convince them to buy your product, you have to reach them elsewhere. One of those ways is to go to where they are, and for developers that includes Reddit. This has a dual benefit, because not only do you interact with developers in their natural habitat, but you populate the forums (subreddits, known as "subs") that are then scraped and used to train the LLMs—thus hopefully influencing the output of future generations of LLMs with the message you're trying to take to developers. So what pitfalls await such an effort? Can you actually market to developers on Reddit?

First, MQLs. ✨Look at me with all the fancy acronyms!✨ Marketing Qualified Leads are what you and I become once we've handed over contact details and sent some signal that we're worth tapping up by the sales team. Maybe you got your badge scanned at a conference booth, or put your email address into a form to download an ebook. This kind of marketing is a gazillion miles from what I'm talking about on Reddit. Move along here…no MQLs for you…

The next obvious way to reach developers on Reddit is to pay for their eyeballs. I've seen good ads on Reddit, and plenty of awful ones. What some companies don't realise is that how you advertise to developers on Reddit is very different from how you advertise to executives in the back of Forbes. Developers can smell a vendor at ten paces, and will scroll away rapidly at the hint of one. Memes, yes. On-brand corporate messaging, hell no.

I hint at this above, but Reddit is a fairly unique place. Reddit is the best place. Reddit is the worst place. People are horrid, people are mean. People are also warm and welcoming. Reddit is not LinkedIn. Reddit is not just another forum. Reddit is loosely governed, with wildly different attitudes prevailing between subs. Some are buttoned up and well behaved, whilst others barely manage to pull a pair of pants on in the morning before sitting down at their laptop. One comment I saw—which had the child in me spitting coffee all over my monitor—is fairly typical. Would you use that language in front of your grandmother? No, of course not, but we're on Reddit here.

If you're looking at Reddit as a "channel" for your "26-Q1 Awareness campaign", you've not read the room. If you've read the room and continue with it anyway…well…you deserve every downvote and flame that you will get. Oh, and if you're the kind of bottom-feeding marketing agency offering Reddit astroturfing as a service, well, you're the reason we can't have nice things. Reddit is a real place for real humans to gather and interact, as humans.
For good reason, subs usually have a strong and visceral immune-system response to what they see as spam. Your "organic awareness drive" is their spam. Your "sharing a helpful doc" is their spam. Your "customer success story" is their spam. And on Reddit, you play by their rules—just the same as you would when reaching developers through any grassroots community interaction. Any good DevRel professional already knows this. It's instinctive, and it's DevRel 101. We're not here to sell; we're here to educate and inform.

- Be genuine. Be helpful. Answer questions that aren't to do with your product.
- Be patient. You're building a relationship, not trying to close a deal.
- Be thick-skinned. Not everyone will like you, and that's ok.
- Don't be a shill. Oh, you're "super excited" about this product? Oh really…?
- Don't try to sell. Gross.
- Don't drive-by link-drop. Stay for the conversation—and the flames, if the link isn't welcomed in the sub you're sharing it with.
- Don't even mention your product unless it actually makes sense in the context of the discussion. And even then…don't mention it every time.
- And, jfc, for the love of whatever is holy to you…do NOT post AI slop.

Where are your users at? That's the sub you want to be in. Perhaps there are several; be in all of them. Lurk, get a feel for the discussions, decide where you want to interact. If there's no sub, then perhaps you aren't looking hard enough. There's usually a sub for everything (and I mean…everything 😳). If there really isn't, then you can start one. Unlike StackOverflow, Reddit is not on the decline, so starting a sub can be a good idea if you're prepared to put the work in to look after it. If you find there's a larger sub with a significant subset of discussions involving your community getting lost in the noise, maybe that's an indicator that there might be demand for a dedicated sub.

Reddit subs are a bit like areas of a city: you get pristine ones that are tightly controlled and well kept, and you get slovenly ones with no active mods and lots of low-effort posts. If you find a sub that's gone to seed, you can apply to become a mod. Being a mod doesn't mean you get god powers to shill your company or silence competitors. This is about community, remember? If you can help a sub thrive, you help the community, and a healthy community can only be good for your company too.

Robin Moffatt 1 month ago

Alternatives to MinIO for single-node local S3

In late 2025 the company behind MinIO decided to abandon it to pursue other commercial interests. As well as upsetting a bunch of folk, it also put the cat amongst the pigeons for many software demos that relied on MinIO to emulate S3 storage locally, not to mention build pipelines that used it for validating S3 compatibility. In this blog post I'm going to look at some alternatives to MinIO. Whilst MinIO is a lot more than 'just' a glorified tool for emulating S3 when building demos, my focus here is on what makes the simplest replacement. In practice that means the following:

- Must have a Docker image. So many demos are shipped as Docker Compose, and no-one likes brewing their own Docker images unless really necessary.
- Must provide S3 compatibility. The whole point of MinIO in these demos is to stand in for writing to actual S3.
- Must be free to use, with a strong preference for an Open Source (per the OSI definition) licence, e.g. Apache 2.0.
- Should be simple to use for a single-node deployment.
- Should have a clear and active community and/or commercial backer. Any fule can vibe-code some abandon-ware slop, or fork a project in a fit of enthusiasm—but MinIO stood the test of time until now, and we don't want to be repeating this exercise in six months' time.
- Bonus points for excellent developer experience (DX), smooth configuration, good docs, etc.

What I'm not looking at is, for example, multi-node deployments, distributed storage, production support costs, GUI capabilities, and so on. That is, this blog post is not aimed at folk who were using MinIO as self-managed S3 in production. Feel free to leave a comment below though if you have useful things to add in this respect :)

My starting point is a very simple Docker Compose stack: DuckDB to read and write Iceberg data that's stored on S3, provided by MinIO to start with. You can find the code here. The Docker Compose is pretty straightforward:

- DuckDB, obviously, along with an Iceberg REST Catalog
- MinIO (local S3 storage)
- mc, the MinIO CLI, used to automagically create a bucket for the data

When I insert data into DuckDB, it ends up in Iceberg format on S3—here, in MinIO. In each of the samples I've built you can run a script to verify this. Let's now explore the different alternatives to MinIO, and how easy each is to switch in. I've taken the above project and tried to re-implement it with as few changes as possible. I've left the mc client in place, since that's no big deal to replace if you want to rip out MinIO completely (s3cmd, the AWS CLI, etc etc). A sketch of the kind of baseline stack I started from is below.
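For reference, here's a minimal sketch of that baseline Compose stack. Image tags, credentials, bucket name, and port numbers are illustrative assumptions, not taken from the actual repo:

```yaml
services:
  minio:
    image: minio/minio
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: admin
      MINIO_ROOT_PASSWORD: password
    ports:
      - "9000:9000"   # S3 API
      - "9001:9001"   # web console

  mc:
    image: minio/mc
    depends_on:
      - minio
    # Create the bucket automagically on startup
    entrypoint: >
      /bin/sh -c "
      mc alias set local http://minio:9000 admin password &&
      mc mb --ignore-existing local/warehouse
      "

  iceberg-rest:
    image: apache/iceberg-rest-fixture
    environment:
      CATALOG_WAREHOUSE: s3://warehouse/
      CATALOG_IO__IMPL: org.apache.iceberg.aws.s3.S3FileIO
      CATALOG_S3_ENDPOINT: http://minio:9000
      AWS_ACCESS_KEY_ID: admin
      AWS_SECRET_ACCESS_KEY: password
      AWS_REGION: us-east-1
    ports:
      - "8181:8181"
```

Swapping MinIO out then becomes, in the best case, a matter of replacing the minio service (and the endpoint/credentials that point at it) with the alternative in question.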
S3Proxy

- 💾 Example Docker Compose
- ✅ Docker image (5M+ pulls)
- ✅ Licence: Apache 2.0
- ✅ S3 compatibility
- Ease of config: 👍👍

Very easy to implement, and seems like a nice lightweight option.

RustFS

- 💾 Example Docker Compose
- ✅ Docker image (100k+ pulls)
- ✅ Licence: Apache 2.0
- ✅ S3 compatibility
- Ease of config: ✅✅

RustFS also includes a GUI.

SeaweedFS

- 💾 Example Docker Compose
- ✅ Docker image (5M+ pulls)
- ✅ Licence: Apache 2.0
- ✅ S3 compatibility
- Ease of config: 👍

The quickstart is useful for getting bare-minimum S3 functionality working. (That said, I still just got Claude to do the implementation…). Overall there's not too much to change; a fairly straightforward switch-out of Docker images, but the auth does need its own config file (which, as with Garage, I inlined in the Docker Compose). SeaweedFS comes with its own basic UI, which is handy. The SeaweedFS website is surprisingly sparse, and at a glance you'd be forgiven for missing that it's an OSS project, since there's a "pricing" option, the title of the front page is "SeaweedFS Enterprise", and there's no GitHub link that I could find! But an OSS project it is, and a long-established one: SeaweedFS has had S3 support since its 0.91 release in 2018. You can also learn more about SeaweedFS from these slides, including a comparison chart with MinIO.

CloudServer

- 💾 Example Docker Compose
- ✅ Docker image (also outdated ones on Docker Hub with 5M+ pulls)
- ✅ Licence: Apache 2.0
- ✅ S3 compatibility
- Ease of config: 👍

Formerly known as S3 Server, CloudServer is part of a toolset called Zenko, published by Scality. It drops in to replace MinIO pretty easily, but I did find it slightly tricky at first to disentangle the set of names (CloudServer/Zenko/Scality) and work out what the actual software I needed to run was. It also feels slightly odd that the docs link to an outdated Docker image.

Garage

- 💾 Example Docker Compose
- ✅ Docker image (1M+ pulls)
- ✅ Licence: AGPL
- ✅ S3 compatibility
- Ease of config: 😵

I had to get a friend to help me with this one. As well as the main container, I needed another to do the initial configuration, as well as a TOML config file, which I've inlined in the Docker Compose to keep things concise. Could I have sat down and RTFM'd to figure it out myself? Yes. Do I have better things to do with my time? Also yes. So, Garage does work, but gosh…it is not a drop-in replacement in terms of code changes. It requires different plumbing for initialisation, and that's not simple either. Excellent for production hygiene…overkill for local demos, and in fact somewhat of a hindrance, TBH.

Apache Ozone

- 💾 Example Docker Compose
- ✅ Docker images (1M+ pulls)
- ✅ Licence: Apache 2.0
- ✅ S3 compatibility
- Ease of config: 😵

Ozone was spun out of Apache Hadoop (remember that?) in 2020, having been initially created as part of the HDFS project back in 2015. It does work as a replacement for MinIO, but it is not a lightweight alternative; neither I nor Claude could figure out how to deploy it with fewer than four nodes. It gives off heavy Hadoop vibes, and I wouldn't be rushing to adopt it for my use case here.

There were a couple of other options whose installation instructions I took one look at and noped right out of! Ozone (above) is heavyweight enough; I'm sure they're great at what they do, but they are not lightweight containers to slot into my Docker Compose stack for local demos.

Everyone loves a bake-off chart, right?

- S3Proxy (gaul/s3proxy; Git repo): single contributor (Andrew Gaul)
- RustFS (Git repo): fancy website, but not much detail about the company
- SeaweedFS (Git repo): single contributor (Chris Lu); Enterprise option available
- Zenko CloudServer (Git repo): Scality (commercial company); 5M+ Docker pulls (of an outdated version)
- Garage (Git repo): NGI/NLnet grants
- Apache Ozone (Git repo): Apache Software Foundation

(Docker pulls is a useful signal but not an absolute one, given that a small number of downstream projects using an image in a frequently-run CI/CD pipeline could easily distort the figure.)

I got side-tracked into writing this blog because I wanted to update a demo in which MinIO was currently used. So, having tried them out, which of the options will I actually use?

- SeaweedFS—yes.
- S3Proxy—yes.
- RustFS—maybe, but it's a very new project and an alpha release.
- CloudServer—yes, maybe?
Honestly, I was put off by it being part of a suite, and worried I'd need to understand other bits of it to use it—probably unfounded, though.

- Garage—no; the config is too complex for what I need.
- Apache Ozone—lol, no.

I mean to cast no shade on the options against which I've not recorded a yes; they're probably excellent projects, just not focussed on my primary use case (simple, easy-to-configure, single-node local S3).

A few parting considerations to bear in mind when choosing a replacement for MinIO:

- Governance. Whilst all the projects are OSS, only Ozone is owned by a foundation (the ASF). All the others could, in theory, change their licence at the drop of a hat (just like MinIO did).
- Community health. What's the "bus factor"? A couple of the projects above have a very long and healthy history—but from a single contributor. If they were to abandon the project, would someone in the community fork and continue to actively develop it?

Robin Moffatt 3 months ago

Using Graph Analysis with Neo4j to Spot Astroturfing on Reddit

Reddit is one of the longer-standing platforms on the internet, bringing together folk to discuss, rant, grumble, and troll others on all sorts of topics, from Kafka to data engineering to nerding out over really bright torches to grumbling about the state of the country—and a whole lot more. As a social network it's a prime candidate for using graph analysis to examine how people interact—and, in today's post, to hunt down some sneaky shills ;-)

I've loaded data for several subs into Neo4j, a graph database. Whilst an RDBMS is great for digging into specific users or posts, aggregate queries, and so on, graph excels at complex pattern matching and recursive relationships. It's a case of the best tool for the job; you can do recursive SQL instead of graph, it's just a lot more complicated. Plus, the graphical tools I'll show below are designed to be used with Neo4j or other property graph databases.

In Neo4j the nodes (or vertices) are user, subreddit, comment, and post. The edges (or relationships) are how these interact. For example:

- a user [node] authored [edge] a post [node]
- a user [node] posted in [edge] a subreddit [node]

These relationships can be analysed independently, or combined. Let's familiarise ourselves with graph visualisations and queries. In an RDBMS we use SQL to describe the data that we want a query to return. Neo4j uses Cypher, which looks a bit like SQL but describes graph relationships. You can write a query to show the user nodes, which Neo4j's built-in visualisation tool then displays; you can add predicates, such as matching on a particular node property (the username, in this example); and you can also look at the raw data. If we zoom in a bit on the query results, we'll also see the edges that have been defined, indicating a relationship between some of the nodes. Let's build on the predicate query to find my username (rmoff) and any users that I've interacted with.
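A minimal sketch of what these Cypher queries might look like—the label names, property name, and relationship type are my assumptions about the data model, not necessarily the actual schema:

```cypher
// All user nodes
MATCH (u:User) RETURN u;

// Predicate on a node property
MATCH (u:User {name: 'rmoff'}) RETURN u;

// A user, plus every user they've interacted with
MATCH (me:User {name: 'rmoff'})-[r:INTERACTED_WITH]-(other:User)
RETURN me, r, other;
```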
Thanks!" Astroturfer B: "Gosh, well I really love my ACME Corp’s Toolbelt 2000, it is really good, and I’ve been very happy with it. Such good value too!" One of the cornerstones of Reddit is the account handle—whilst you can choose to identify yourself (as I do - ), you can also stay anonymous and be known to the world as something like . This means that what one might do on LinkedIn (click on the person’s name, figure out their company affiliation) often isn’t an option. This is where graph analysis comes in, because it’s great at both identifying and visualising patterns in behaviour that are not so easy to spot otherwise. Poking around one of the subreddits using betweenness analysis I spotted this set of three users highlighted: The accounts picked up here are key to the particular activity on the sub; but that in itself isn’t suprising. You often get key members of a community who post the bulk of the content. But, digging into these particular accounts I saw this significant pattern. The three users are shown as orange boxes; posts are blue and comments are green: It’s a nice little network of one user posting with another commenting—how helpful! To share the work they each take turns writing new posts and replying to others. Each post generally has one and only one comment, usually from one of the others in the group. You can compare this to a sub in which there is much more organic interaction. is a good example of this: Most users tend to just post replies, some only contribute new posts, and so on. Definitely not the nicely-balanced to-and-fro on the unnamed sub above ;) a user [node] authored [edge] a post [node] a user [node] posted in [edge] a subreddit [node] For example, genuine user A asks " what’s the best tool for embedding this nail into a piece of wood ". Genuine user B suggests " well, a hammer, DUUUHHH " (this is Reddit, after all). The Astroturfer comes along and says " What a great question! I’ve been really happy with ACME Corp’s Screwdriver! If you hold it by the blade you’ll find the handle makes a perfect tool for hitting nails. " Astroturfer A: "Hey guys! I’m building a house and looking for recommendations for the best value toolkit out there. Thanks!" Astroturfer B: "Gosh, well I really love my ACME Corp’s Toolbelt 2000, it is really good, and I’ve been very happy with it. Such good value too!"

Robin Moffatt 3 months ago

Stumbling into AI: Part 6—I've been thinking about Agents and MCP all wrong

Ever tried to hammer a nail in with a potato? Nor me, but that's what I've felt like I've been attempting to do when trying to really understand agents, as well as to come up with an example agent to build. As I wrote about previously, citing Simon Willison, an LLM agent runs tools in a loop to achieve a goal. Unlike building ETL/ELT pipelines, these were new concepts that I was struggling to fit to an even semi-plausible real-world example. That's because I was thinking about it all wrong.

For the last cough 20 cough years I've built data processing pipelines, either for real or as examples based on my previous experience. It's the same pattern, always:

- Data comes in
- Data gets processed
- Data goes out

Maybe we fiddle around with the order of things (ELT vs ETL), maybe a particular example focusses more on one particular point in the pipeline—but all the concepts remain pleasingly familiar. All I need to do is figure out what goes in the boxes. I've even extended this to be able to wing my way through talking about applications and microservices (kind of). We get some input; we make something else happen. Somewhat stretching beyond my experience, admittedly, but it's still the same principle: when this thing happens, make a computer do that thing.

Perhaps I'm too literal, perhaps I'm cynical after too many years of vendor hype, or perhaps it's just how my brain is wired—but I like concrete, tangible, real examples of something. So when it comes to agents, particularly given where we're at in the current hype-cycle, I really wanted some actual examples on which to build my understanding. In addition, I wanted to build some of my own. But where to start? I had a mental model—literally what I sketched out on a piece of paper as I tried to think about what real-world example could go in each box to make something plausible. But this is where I got stuck, and spun my proverbial wheels for several days. Every example I could think of ended up with me uttering, exasperated…but why would you do it like that?

My first mistake was focussing on the LLM bit as needing to do something to the input data. I had a whole bunch of interesting data sources (like river levels, for example) but my head blocked on "but that's numbers, what can you get an LLM to do with those?!". The LLM bit of an agent, I mistakenly thought, demanded unstructured input data for it to make any sense. After all, if the input is structured, why aren't we just processing it with a regular process—no need for magic fairy dust here. This may also have been an over-fitting of an assumption based on my previous work using an LLM to summarise human-input data in a conference keynote.

The tool bit baffled me just as much. With hindsight, the exact problem turned out to be the solution. Let me explain… Whilst there are other options, in many cases an agent calling a tool is going to do so using MCP. Thus, grabbing the dog firmly by the tail and proceeding to wag it, I went looking for MCP servers. Looking down a list of hosted MCP servers that I found, I saw that there were only about a half-dozen that were open, including GlobalPing, AlphaVantage, and CoinGecko. Flummoxed, I cast around for an actual use of one of these with an unstructured data source. Oh jeez…are we really going to do the 'read a stream of tweets and look up the stock price/crypto-token' thing again?
The mistake I made was this: I'd focussed on the LLM bit of the agent definition:

an LLM agent runs tools in a loop to achieve a goal

Actually, what an agent is about is this:

[…] runs tools

The LLM bit can do fancy LLM stuff—but it's also there to just invoke the tool(s) and decide when they've done what they need to do. A tool is quite often just a wrapper on an API. So what we're saying is: with MCP, we have a common interface to APIs. That's…all. We can define agents to interact with systems, and the way they interact is through a common protocol: MCP. When we load a web page, we don't concern ourselves with what Chrome is doing, and unless we stop and think about it we don't think about the TCP and HTTP protocols being used. It's just the common way of things talking to each other. And that's the idea with MCP, and thus tool calling from agents. (Yes, there are other ways you can call tools from agents, but MCP is the big one at the moment.)

Given this reframing, it makes sense why there are so few open MCP servers. If an MCP server is there to offer access to an API, who leaves their API open for anyone to use? Well, read-only data providers like CoinGecko and AlphaVantage, perhaps. In general though, the really useful thing we can do with tools is change the state of systems. That's why any SaaS platform worth its salt is rushing to provide an MCP server. Not to jump on the AI bandwagon per se, but because if this is going to be the common protocol by which things get automated with agents, you don't want to be offering Betamax when everyone else has VHS. SaaS platforms will still provide their APIs for direct integration, but they will also provide MCP servers. There's also no reason, in theory, why applications developed within an organisation wouldn't offer MCP either.

If that sounds anticlimactic—well, no, not really. It actually makes a bunch of sense to me. I personally also like it a lot from a SQL-first, not-really-a-real-coder point of view. Let me explain. If you want to build a system that responds to something happening by interacting with another external system, you have two choices now:

1. Write custom code to call the external system's API. Handle failures, retries, monitoring, etc. If you want to interact with a different system, you now need to understand that system's API, work out how to call it, and write new code to do so.
2. Write an agent that responds to the thing that happened, and have it call the tool. The agent framework now standardises handling failures, retries, and all the rest of it. If you want to call a different system, the agent stays pretty much the same; the only thing that you change is the MCP server and tool that you call (see the sketch below).

You could write custom code—and there are good examples of where you'll continue to. But you no longer have to.
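To make option 2 concrete, here's a minimal sketch in Python—not any particular framework's API, just an illustration of "an LLM running tools in a loop". The tool, the llm() callable, and the message format are all assumptions for the purpose of the example:

```python
import json

# A 'tool' is often just a thin wrapper over an API call. This one is a
# stub: a real implementation would call the Slack API here.
def send_slack_message(channel: str, text: str) -> str:
    return f"message sent to {channel}"

TOOLS = {"send_slack_message": send_slack_message}

def run_agent(goal: str, llm) -> str:
    """Run tools in a loop until the LLM decides the goal is met.

    `llm` is a hypothetical callable that takes the conversation so far
    plus the available tools, and returns either a final answer or a
    request to invoke a tool.
    """
    messages = [{"role": "user", "content": goal}]
    while True:
        reply = llm(messages, tools=TOOLS)
        if reply["type"] == "final":
            return reply["content"]
        # The LLM has asked for a tool invocation: run it, then feed the
        # result back into the loop so the LLM can decide what's next.
        result = TOOLS[reply["tool"]](**reply["arguments"])
        messages.append({"role": "tool", "content": json.dumps(result)})
```

Swap Slack for any other system and the loop doesn't change; only the tool (or, with MCP, the server you point it at) does.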
For Kafka folk, my analogy here would be data integration with Kafka Connect. Kafka Connect provides the framework that handles all of the sticky and messy things about data integration (scale, error handling, types, connectivity, restarts, monitoring, schemas, etc etc etc). You just use the appropriate connector with it and configure it. Different system? Just swap out the connector. You want to re-invent the wheel and re-solve a solved problem? Go ahead; maybe you're special. Or maybe NIH is real ;P

So…what does an actual agent look like now, given this different way of looking at it? Sure, the LLM could do a bunch of clever stuff with the input. But it can also just take our natural-language expression of what we want to happen, and make it so. Agents can use multiple tools, from multiple MCP servers.

Confluent launched Streaming Agents earlier this year. They're part of the fully-managed Confluent Cloud platform and provide a way to run agents like the one I've described above, driven by events in a Kafka topic. Is this over-engineered? Do you even need an agent? Why not just write a bit of bespoke code? You can. Maybe you should. But…don't forget failure conditions. And restarts. And testing. And scaling. All these things are taken care of for you by Flink.

Although having the runtime considerations taken care of for you is nice, let's not forget another failure vector which LLMs add into the mix: talking shite—sorry, "hallucinations". Compared to a lump of Python code which either works or doesn't, LLMs keep us on our toes by sometimes confidently doing the wrong thing. However, how do we know it's wrong? Our Python program might crash, or throw a nicely-handled error, but left to its own devices an AI agent will happily report that everything worked even if it actually made up a parameter for a tool call that doesn't exist. There are mitigating steps we can take, but it's important to recognise the trade-offs between the approaches.

Permit me to indulge this line of steel-manning, because I think I might even have a valid argument here. Let's say we've built the simplistic agent above that sends a Slack message when a data point is received. Now we want to enhance it to also include information about the weather forecast. Our streaming agent changes just by amending the prompt and adding a new tool (a couple of DDL statements, defining the MCP server and its tools), whilst the bespoke application might need what looks like a seemingly-innocuous small addition of its own. But consider what that looks like in practice: figuring out the new API, new lines of code to handle calling it, its failures, and so on. Oh, and whilst you're at it, don't introduce any bugs into the bespoke code. And remember to document the change. Not insurmountable, and probably a good challenge if you like that kind of thing. But is it as straightforward as literally changing the prompt in an agent to use an additional tool, and letting it figure the rest out (courtesy of MCP)?

Let's not gloss over the reality too much here, though: whilst adding a new tool call into the agent is definitely easier and less prone to introducing code errors, LLMs are by their nature non-deterministic—meaning that we still need to take care with the prompt and the tool invocation to make sure that the agent is still doing what it's designed to do. You wouldn't be wrong to argue that at least the non-agent route (coding API invocations directly into your application) can actually be tested and proved to work.
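Under the same assumptions as the earlier sketch, the change on the agent side really is small—register one more (stub) tool and amend the natural-language goal; the loop itself is untouched:

```python
# Another stub tool: a real implementation would call a weather API
# (or, with MCP, a tool exposed by a weather MCP server).
def get_weather_forecast(location: str) -> str:
    return f"forecast for {location}: ..."

TOOLS["get_weather_forecast"] = get_weather_forecast

# The only other change is to the goal itself:
goal = ("When a river-level reading arrives, send a Slack message to "
        "#alerts summarising it, and include tomorrow's weather forecast "
        "for the gauge's location.")
```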
You can. Maybe you should. But…don’t forget failure conditions. And restarts. And testing. And scaling. All these things are taken care of for you by Flink.

Although having the runtime considerations taken care of for you is nice, let’s not forget another failure vector which LLMs add into the mix: hallucinations (or, less politely, talking shite). Compared to a lump of Python code which either works or doesn’t, LLMs keep us on our toes by sometimes confidently doing the wrong thing. However, how do we know it’s wrong? Our Python program might crash, or throw a nicely-handled error, but left to its own devices an AI Agent will happily report that everything worked even if it actually made up a parameter for a tool call that doesn’t exist. There are mitigating steps we can take, but it’s important to recognise the trade-offs between the approaches.

Permit me to indulge this line of steel-manning, because I think I might even have a valid argument here. Let’s say we’ve built the above simplistic agent that sends a Slack message when a data point is received. Now we want to enhance it to also include information about the weather forecast. Conceptually, the agent barely changes: our streaming agent above needs nothing more than an amended prompt and a new tool (just DDL statements, defining the MCP server and its tools). The bespoke application, meanwhile, might have a seemingly-innocuous small addition, along the lines of the sketch below.
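Here’s a hypothetical sketch of that "seemingly-innocuous small addition" to the bespoke version: fetching a forecast before posting to Slack. The Open-Meteo endpoint is real, but the coordinates and message format are purely illustrative.

import requests

def get_forecast(latitude=53.8, longitude=-1.5):
    """Fetch current weather from the Open-Meteo API.
    Coordinates here are illustrative (roughly Leeds, UK)."""
    resp = requests.get(
        "https://api.open-meteo.com/v1/forecast",
        params={"latitude": latitude,
                "longitude": longitude,
                "current_weather": "true"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["current_weather"]

# ...and in the consumer loop, the "innocuous" addition:
#   weather = get_forecast()
#   text = f"New reading: {message.value} ({weather['temperature']}°C outside)"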
But consider what this looks like in practice: figuring out the API, new lines of code to handle calling it, failures, and so on. Oh, and whilst you’re at it, don’t introduce any bugs into the bespoke code. And remember to document the change. Not insurmountable, and probably a good challenge if you like that kind of thing. But is it as straightforward as literally changing the prompt in an agent to use an additional tool, and letting it figure the rest out (courtesy of MCP)?

Let’s not gloss over the reality too much here though; whilst adding a new tool call into the agent is definitely easier and less prone to introducing code errors, LLMs are by their nature non-deterministic—meaning that we still need to take care with the prompt and the tool invocation to make sure that the agent is still doing what it’s designed to do. You wouldn’t be wrong to argue that at least the non-Agent route (of coding API invocations directly into your application) can actually be tested and proved to work.

There are different types of AI Agent—the one I’ve described is a tools-based one. As I mentioned above, its job is to run tools. The LLM provides the natural language interface with which to invoke the tools. It can also, optionally, do additional bits of magic:

Process [unstructured] input, such as summarising or extracting key values from it

Decide which tool(s) need calling in order to achieve its aim

But at the heart of it, it’s about the tool that gets called. That’s where I was going wrong with this. That’s the bit I needed to think differently about :)
0 views
Robin Moffatt 3 months ago

Tech Radar (Nov 2025) - data blips

The latest Thoughtworks TechRadar is out. Here are some of the more data-related ‘blips’ (as they’re called on the radar) that I noticed. Each item links to the blip’s entry where you can read more information about Thoughtworks’ usage of and opinions on it.

Databricks Assistant
Apache Paimon
Delta Sharing
Naive API-to-MCP conversion
Standalone data engineering teams
Text to SQL

0 views
Robin Moffatt 4 months ago

Blog Writing for Developers - slides

A presentation about effective blog writing for developers, covering why to blog, what to write about, and how to structure your content. This presentation covers:

Why developers should blog
What topics to write about
How to structure and write effective content
Tools and platforms for technical writing
Using AI in the writing process

No video, but you can listen to the recording here (or download it for offline listening). Apologies for the voice quality—I was getting over a bad cold! 🤧 The presentation is built from AsciiDoc source using reveal.js. You can find the source here.

0 views
Robin Moffatt 4 months ago

Stumbling into AI: Part 5—Agents

A short series of notes for myself as I learn more about the AI ecosystem as of Autumn [Fall] 2025. The driver for all this is understanding more about Apache Flink’s Flink Agents project, and Confluent’s Streaming Agents.

I started off this series—somewhat randomly, with hindsight—looking at Model Context Protocol (MCP). It’s a helper technology to make things easier to use and provide a richer experience. Next I tried to wrap my head around Models—mostly LLMs, but also with an addendum discussing other types of model too. Along the lines of MCP, Retrieval Augmented Generation (RAG) is another helper technology that on its own doesn’t do anything but combined with an LLM gives it added smarts. I took a brief moment in part 4 to try and build a clearer understanding of the difference between ML and AI.

So whilst RAG and MCP combined make for a bunch of nice capabilities beyond models such as LLMs alone, what I’m really circling around here is what we can do when we combine all these things: Agents! But…what is an Agent, both conceptually and in practice? Let’s try and figure it out.

Let’s begin with Wikipedia’s definition:

In computer science, a software agent is a computer program that acts for a user or another program in a relationship of agency.

We can get more specialised if we look at Wikipedia’s entry for an Intelligent Agent:

In artificial intelligence, an intelligent agent is an entity that perceives its environment, takes actions autonomously to achieve goals, and may improve its performance through machine learning or by acquiring knowledge.

Citing Wikipedia is perhaps the laziest ever blog author’s trick, but I offer no apologies 😜. Behind all the noise and fuss, this is what we’re talking about: a bit of software that’s going to go and do something for you (or your company) autonomously.

LangChain have their own definition of an Agent, explicitly identifying the use of an LLM:

An AI agent is a system that uses an LLM to decide the control flow of an application.

The blog post from LangChain as a whole gives more useful grounding in this area and is worth a read. In fact, if you want to really get into it, the LangChain Academy is free and the Introduction to LangGraph course gives a really good primer on Agents and more.

Meanwhile, the Anthropic team have a chat about their definition of an Agent. In a blog post Anthropic differentiates between Workflows (that use LLMs) and Agents:

Workflows are systems where LLMs and tools are orchestrated through predefined code paths. Agents, on the other hand, are systems where LLMs dynamically direct their own processes and tool usage, maintaining control over how they accomplish tasks.

Independent researcher Simon Willison also uses the LLM word in his definition:

An LLM agent runs tools in a loop to achieve a goal.

He explores the definition in a recent blog post, "I think 'agent' may finally have a widely enough agreed upon definition to be useful jargon now", in which Josh Bickett’s meme demonstrates how much of a journey this definition has been on. That there’s still discussion and ambiguity nearly two years after this meme was created is telling.

My colleague Sean Falconer knows a lot more about this than I do. He was a guest on a recent podcast episode in which he spells things out:

[Agentic AI] involves AI systems that can reason, dynamically choose tasks, gather information, and perform actions as a more complete software system. [1]
[Agents] are software that can dynamically decide its own control flow: choosing tasks, workflows, and gathering context as needed. Realistically, current enterprise agents have limited agency […]. They’re mostly workflow automations rather than fully autonomous systems. [2]

In many ways […] an agent [is] just a microservice. [3]

A straightforward software Agent might do something like:

Order more biscuits when there are only two left

The pseudo-code looks something like the sketch below.
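The original code block doesn’t survive here, so the following is an illustrative Python stand-in; the stock-check and ordering functions are hypothetical.

def biscuit_stock_level():
    """Hypothetical helper: returns how many packets are left."""
    return 2  # pretend we checked the cupboard / a stock API

def order_biscuits(quantity):
    """Hypothetical helper: places an order via the supplier's API."""
    print(f"Ordering {quantity} packets of biscuits")

# Plain old deterministic agent, run periodically; no LLM in sight.
if biscuit_stock_level() <= 2:
    order_biscuits(quantity=10)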
We take this code, stick it on a server and leave it to run. One happy Agent, done.

An AI Agent could look more like the next sketch.
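Again, a hypothetical sketch rather than the original: the same job, but with an LLM deciding the control flow and invoking the ordering tool itself. call_llm_with_tools stands in for whichever LLM API and tool-calling mechanism you use.

def call_llm_with_tools(prompt, tools):
    """Hypothetical stand-in for an LLM call with tool-calling enabled.
    The LLM decides whether (and how) to invoke the tools it is given."""
    raise NotImplementedError("wire up your LLM provider here")

# The behaviour is expressed in natural language rather than code;
# the LLM works out when and how to call the ordering tool.
call_llm_with_tools(
    prompt=("Check how many packets of biscuits are left. "
            "If there are two or fewer, order enough to take us "
            "back up to ten. Reply with what you did."),
    tools=["check_biscuit_stock", "order_biscuits"],  # e.g. MCP tools
)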
Other examples of AI Agents include:

Coding Agents. Everyone’s favourite tool (when used right). It can reason about code, it can write code, it can review PRs. One of the trends that I’ve noticed recently (October 2025) is the use of Agents to help with some of the up-front jobs in software engineering (such as data modelling and writing tests), rather than full-blown code that’s going to ship to production. That’s not to say that coding Agents aren’t being used for that, but by using AI to accelerate certain tasks whilst retaining human oversight (a.k.a. HITL) it makes it easier to review the output rather than just trusting to luck that reams and reams of code are correct. There’s a good talk from Uber on how they’re using AI in the development process, including code conversion and testing.

Travel booking. Perhaps you tell it when you want to go, the kind of vacation you like, and what your budget is; it then goes and finds where it’s nice at that time of year, figures out travel plans within your budget, and either proposes an itinerary or even books it for you. Another variation could be you tell it where, and then it integrates with your calendar to figure out the when. This is a canonical example that is oft-cited; I’d be interested if anyone can point me to an actual implementation of it, even if just a toy one.

I saw this in a blog post from Simon Willison that made me wince, but am leaving the above in anyway just to serve as an example of the confusion/hype that exists in this space. Agentic comes from Agent plus -ic, the latter meaning of, relating to, or characterised by. So Agentic AI is simply AI that is characterised by an Agent, or Agency. Contrast that to AI that’s you sat at the ChatGPT prompt asking it to draw pictures of a duck dressed as a clown. Nothing Agentic about that—just a human-led and human-driven interaction. "AI Agents" becomes a bit of a mouthful with the qualifier, so much of the current industry noise is simply around "Agents". That said, "Agentic AI" sounds cool, so gets used as the marketing term in place of "AI" alone.

So we’ve muddled our way through to some kind of understanding of what an Agent is, and what we mean by Agentic AI. But how do we actually build one? All we need is an LLM (such as access to the API for OpenAI or Claude), something to call that API (there are worse choices than !), and a way to call external services (e.g. MCP servers) if the LLM determines that it needs to use them. So in theory we could build an Agent with some lines of bash, some API calls, and a bunch of sticky-backed plastic. This is a grossly oversimplified example (and is missing elements such as memory)—but it hopefully illustrates what we’re building at the core of an Agent.

On top of this goes all the general software engineering requirements of any system that gets built (suitable programming language and framework, error handling, LLM output validation, guard rails, observability, tests, etc etc).

The other nuance that I’ve noticed is that whilst the above simplistic sketch is 100% driven by an LLM (it decides what tools to call, it decides when to iterate), there are plenty of cases where an Agent is to some degree rules-driven. So perhaps the LLM does some of the autonomous work, but then there’s a bunch of good ol' if/else statements in there too. This is also borne out by the notion of "Workflows" when people talk about Agents. An Agent doesn’t wake up in the morning and set out on its day serving only to fulfill its own goals and enrichment. More often than not an Agent is going to be tightly bound into a pre-defined path with a limited range of autonomy.

What if you want to actually build this kind of thing for real? That’s where tools like LangGraph and LangChain come in. Here’s a notebook with an example of an actual Agent built with these tools. LlamaIndex is another framework, with details of building an Agent in their docs.

As we build up from the so-simple-it-is-laughable strawman example of an Agent above, one of the features we’ll soon encounter is the concept of memory. The difference between a crappy response and a holy-shit-that’s-magic response from an LLM is often down to context. The richer the context, the better a chance it has at generating a more accurate output. So if an Agent can look back on what it did previously, determining what worked well and what didn’t, perhaps even taking into account human feedback, it can then generate a more successful response the next time. You can read a lot more about memory in this chapter of Agentic Design Patterns by Antonio Gulli. This blog post from "The BIG DATA guy" is also useful: Agentic AI, Agent Memory, & Context Engineering.

This diagram from Generative Agents: Interactive Simulacra of Human Behavior (J.S. Park, J.C. O’Brien, C.J. Cai, M.R. Morris, P. Liang, M.S. Bernstein) gives a good overview of a much richer definition of an Agent’s implementation. The additional concepts include memory (discussed briefly above), planning, and reflection. Also check out Paul Iusztin’s talk from QCon London 2025 on The Data Backbone of LLM Systems. Around the 35-minute mark he goes into some depth around Agent architectures.

Just as you can build computer systems as monoliths (everything done in one place) or microservices (multiple programs, each responsible for a discrete operation or domain), you can also have one big Agent trying to do everything (probably not such a good idea) or individual Agents each good at their particular thing that are then hooked together into what’s known as a Multi-Agent System (MAS). Sean Falconer’s family meal planning demo is a good example of a MAS. One Agent plans the kids' meals, one the adults' meals, another combines the two into a single plan, and so on.

Human-in-the-loop (HITL) is a term you’ll come across, referring to the fact that Agents might be pretty good, but they’re not infallible. In the travel booking example above, do we really trust the Agent to book the best holiday for us? Almost certainly we’d want—at a minimum—the option to sign off on the booking before it goes ahead and sinks £10k on an all-inclusive trip to Bognor Regis. Then again, we’re probably happy enough for an Agent to access our calendars without asking permission, and whether they need permission or not to create a meeting is up to us and how much we trust them. When it comes to coding, having an Agent write code, test it, fix the broken tests, compare it to a spec, and iterate is really neat. On the other hand, letting it decide to run …less so 😅.

Every time an Agent requires HITL, it reduces its autonomy and/or responsiveness to situations. As well as simply using smarter models that make fewer mistakes, there are other things that an Agent can do to reduce the need for HITL, such as using guardrails to define acceptable parameters. For example, an Agent is allowed to book travel but only up to a defined threshold. That way the user gets to trade off convenience (no HITL) with risk (unintended first-class flight to Hawaii).

📃 Generative Agents: Interactive Simulacra of Human Behavior
🎥 Paul Iusztin - The Data Backbone of LLM Systems - QCon London 2025
📖 Antonio Gulli - Agentic Design Patterns
📖 Sean Falconer - https://seanfalconer.medium.com/

0 views
Robin Moffatt 5 months ago

Stumbling into AI: Part 4—Terminology Tidy-up (and a little rant)

Having looked at MCP, Models, and RAG, I realised that I’ve been mentally skirting around something that I don’t really understand, so I’m going to expose myself to some ridicule here and try to understand better: what’s the difference between AI and ML? Aren’t they just the same? OK, we’re doing this, are we? I thought AI was just ✨magic✨? And ML was the thing that got data scientists mad stacks ten years ago before everyone realised you couldn’t do shit without good data and processes? To me, a layperson in this space, watching it from the sidelines, AI and ML have been interchangeable. In fact, you’d get conferences and conference tracks titled "AI/ML"—because it’s all kind of the same thing anyway, right? This is, of course, factually incorrect and presumably infuriating to anyone actually working in the field. The whole purpose of this blog series has been for me to at least reduce the number of unknown unknowns in my knowledge in this space—to build up a mental map of the different areas and terms so that I at least know where to go and look when encountering something that I know I don’t know. With that framing in mind, this is roughly how I understand the terms AI and ML:

0 views
Robin Moffatt 5 months ago

Stumbling into AI: Part 3—RAG

A short series of notes for myself as I learn more about the AI ecosystem as of September 2025. The driver for all this is understanding more about Apache Flink’s Flink Agents project, and Confluent’s Streaming Agents . Having poked around MCP and Models , next up is RAG. RAG has been one of the buzzwords of the last couple of years, with any vendor worth its salt finding a way to crowbar it into their product. I’d been sufficiently put off it by the hype to steer away from actually understanding what it is. In this blog post, let’s fix that—because if I’ve understood it correctly, it’s a pattern that’s not scary at all. First up: RAG stands for Retrieval-Augmented Generation . Put another way, it’s about Generation (using LLMs, like we saw before and like you use every day), in which the prompt given to the LLM is Augmented by the Retrieval of additional context.
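To make the pattern concrete, here’s a minimal sketch in Python. The embed, search_vector_store, and call_llm functions are hypothetical stand-ins for whatever embedding model, vector database, and LLM you use; the point is just the shape: retrieve, augment, generate.

def embed(text):
    """Hypothetical stand-in: turn text into an embedding vector."""
    raise NotImplementedError

def search_vector_store(query_vector, top_k=3):
    """Hypothetical stand-in: return the most similar stored documents."""
    raise NotImplementedError

def call_llm(prompt):
    """Hypothetical stand-in for an LLM completion call."""
    raise NotImplementedError

def rag_answer(question):
    # Retrieval: find documents relevant to the question...
    docs = search_vector_store(embed(question))
    # Augmentation: add them to the prompt as extra context...
    context = "\n\n".join(docs)
    prompt = f"Using this context:\n{context}\n\nAnswer: {question}"
    # Generation: the LLM answers with the added context in hand.
    return call_llm(prompt)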

0 views
Robin Moffatt 5 months ago

Stumbling into AI: Part 2—Models

A short series of notes for myself as I learn more about the AI ecosystem as of September 2025. The driver for all this is understanding more about Apache Flink’s Flink Agents project, and Confluent’s Streaming Agents. Having poked around MCP and got a broad idea of what it is, I want to next look at Models. What used to be as simple as "I used AI" actually boils down into several discrete areas, particularly when one starts looking at using LLMs beyond writing a rap about Apache Kafka in the style of Monty Python and using it to build agents (like the Flink Agents that prompted this exploration in the first place). This is what it’s all about right here. Large Language Models (LLMs) are what piqued the interest of nerds outside the academic community in 2023 and the broader public a year or so later. What used to be a "OMFG have you seen this" moment is now somewhat passé. Of course I can ask my computer to write my homework assignment for me. Of course I can use my phone to explain the nuances of the leg-before-wicket rule in Cricket. Without a model, the whole AI sandcastle collapses. There are many dozens of LLMs. The most well-known ones are grouped into families and include GPT, Claude, and Gemini. Within these there are different models, such as GPT-5, Claude 4.1, and so on. Often these models themselves have variants, specific to certain tasks like writing software code, generating images, or understanding audio. The big companies behind the models include:

0 views
Robin Moffatt 5 months ago

Stumbling into AI: Part 1—MCP

A short series of notes for myself as I learn more about the AI ecosystem as of September 2025. The driver for all this is understanding more about Apache Flink’s Flink Agents project, and Confluent’s Streaming Agents . The first thing I want to understand better is MCP. For context, so far I’ve been a keen end-user of LLMs, for generating images , proof-reading my blog posts, and general lazyweb stuff like getting it to spit out the right syntax for a bash one-liner. I use Raycast with its Raycast AI capabilities to interact with different models, and I’ve used Cursor to vibe-code some useful (and less useful , fortunately never deployed) functionality for this blog. But what I’ve not done so far is dig any further into the ecosystem beyond. Let’s fix that! So, what is MCP?

0 views
Robin Moffatt 6 months ago

Interesting links - August 2025

Not got time for all this? I’ve marked 🔥 for my top reads of the month :)

🔥 Ben Rogojan (a.k.a. SeattleDataGuy) has a great list of 5 Things in Data Engineering That Still Hold True After 10 Years (guess what: data modelling matters, if you start with crap data you’ll end with crap data, and so on…).
Veronika Durgin shares some good tips for building resilient data pipelines.
Some good pointers for why you might want to modernise your data platform, and how to pick your stack if you do so.
🔥 Aleksandr Klein has a thoughtful post about The Mythic Journey of Data Quality Maturity

0 views
Robin Moffatt 6 months ago

Kafka to Iceberg - Exploring the Options

You’ve got data in Apache Kafka. You want to get that data into Apache Iceberg. What’s the best way to do it? Perhaps invariably, the answer is: IT DEPENDS. But fear not: here is a guide to help you navigate your way to choosing the best solution for you 🫵. I’m considering three technologies in this blog post:

0 views
Robin Moffatt 7 months ago

Connecting Apache Flink SQL to Confluent Cloud Kafka broker

This is a quick blog post to remind me how to connect Apache Flink to a Kafka topic on Confluent Cloud. You may wonder why you’d want to do this, given that Confluent Cloud for Apache Flink is a much easier way to run Flink SQL. But, for whatever reason, you’re here and you want to understand the necessary incantations to get this connectivity to work. There are two versions of this connectivity - with, and without, using the Schema Registry for Avro. First off, you need to get two things: The address of your Confluent Cloud broker An API key pair with authorisation to access the topic that you want to read/write

0 views
Robin Moffatt 7 months ago

Interesting links - July 2025

Not got time for all this? I’ve marked 🔥 for my top reads of the month :) First up, allow me a shameless plug for my blog posts this month:

Writing to Apache Iceberg on S3 using Kafka Connect with Glue catalog
Keeping your Data Lakehouse in Order: Table Maintenance in Apache Iceberg

🔥 Building Streaming Data Pipelines, Part 2: Data Processing and Enrichment with Flink SQL (see also Part 1)

0 views
Robin Moffatt 7 months ago

Keeping your Data Lakehouse in Order: Table Maintenance in Apache Iceberg

Plus the data file itself for the table (with its record_count and file_size_in_bytes recorded in the metadata): s3://warehouse/rmoff/customers/data/00000-0-e3b7a202-2481-4d9f-9b7c-9830908a425a-0-00001.parquet. After a few more changes to the data on the table, what started off as five files in the bucket is now ten times that:

0 views
Robin Moffatt 8 months ago

Writing to Apache Iceberg on S3 using Kafka Connect with Glue catalog

Without wanting to mix my temperature metaphors, Iceberg is the new hawtness, and getting data into it from other places is a common task. I wrote previously about using Flink SQL to do this , and today I’m going to look at doing the same using Kafka Connect. Kafka Connect can send data to Iceberg from any Kafka topic. The source Kafka topic(s) can be populated by a Kafka Connect source connector (such as Debezium), or a regular application producing directly to it. I’m going to use AWS’s Glue Data Catalog, but the sink also works with other Iceberg catalogs. Kafka Connect is a framework for data integration, and is part of Apache Kafka. There is a rich ecosystem of connectors for getting data in and out of Kafka, and Kafka Connect itself provides a set of features that you’d expect for a resilient data integration platform, including scaling, schema handling, restarts, serialisation, and more. The Apache Iceberg connector for Kafka Connect was originally created by folk at Tabular and has subsequently been contributed to the Apache Iceberg project (via a brief stint on a Databricks repo following the Tabular acquisition).

2 views