Latest Posts (20 found)
Xe Iaso Yesterday

Killing my inner Necron

Hey everybody, I wanted to make this post to be the announcement that I did in fact survive my surgery. I am leaving the hospital today, and I want to write up what I've had on my mind over these last couple months and why I have not been as active in open source as I wanted to be. This is being dictated to my iPhone using voice control. I have not edited this. I am in the hospital bed right now and have no ability to edit this, so any remaining typos are intact and intended as part of the reading experience. That week leading up to surgery was probably one of the scariest weeks of my life. Statistically, I know that the procedure I was going to go through has a very low mortality rate. I also know that propofol, the anesthetic that was being used, has an all-time low mortality rate. However, one person is all it takes to be that one lucky one in 1 million. No, I mean unlucky. Leading up to surgery I was afraid that I was going to die during the operation, so I prepared everything possible such that if I did die, there would be as little bad happening as possible. I made peace with my God. I wrote a will. I did everything that one is expected to do when there is a potential chance that your life could be ended, including filing an extension for my taxes. Anyway, the point of this post is that I want to explain why I named the latest release of Anubis Necron. Final Fantasy is a series of role-playing games originally based on one development team's game of Advanced Dungeons & Dragons in the 80s. In the Final Fantasy series there are a number of legendary summons that get repeated throughout different incarnations of the games. These summons usually represent concepts, spiritual forces, or forces of nature. The one that came to mind when I was in that pre-operative state was Necron. Necron is summoned through the fear of death. Specifically, the fear of the death of an entire kingdom.
All the subjects are absolutely mortified that they are going to die and nothing that they can do is going to change that. Content warning: spoilers for the Final Fantasy 14 expansion Dawntrail. In Final Fantasy 14 these legendary summons are named primals. These primals become the main story driver of several expansions. I'd be willing to argue that A Realm Reborn is actually just the story of Ifrit (Fire), Garuda (Wind), Titan (Earth), and Lahabrea (Edgelord). Late into Dawntrail, Necron gets introduced. The nation state of Alexandria has fused into the main overworld. In Alexandria, citizens know not death. When they die, their memories are uploaded into the cloud so that they can live forever in living memory. As a result, nobody alive really knows what death is or how to process it, because it's just not a threat to them. Worst case, if their body actually dies they can just have a new soul injected into it and revive on the spot. Part of your job as the player is to break this system of eternal life, as powering it requires the lives of countless other creatures. So by the end of the expansion, an entire kingdom of people that did not know the concept of death suddenly have it thrust upon them. They cannot just go get more souls to compensate for accidental injuries in the field. They cannot just get uploaded when they die. The kingdom that lost the fear of death suddenly had the fear of death thrust back at them. And thus, Necron was summoned by the Big Bad™️ using that fear of death. I really didn't understand that part of the story until the week leading up to my surgery. The week where I was contacting people to let them know what was going on, how to know if I was OK, and what they should do if I'm not. In that week I ended up killing my fear of death.
I don't remember much from the day of the operation, but what I do remember is this: when I was wheeled into the operating theater, before they placed the mask over my head to put me to sleep, they asked me one single question. "Do you want to continue?" In that moment everything swirled into my head again. All of the fear of death. All of the worries that my husband would be alone. That fear that I would be that unlucky 1 in 1 million person. And with all of that in my head, with my heart beating out of my chest, I said yes. The mask went down. And everything went dark. I got what felt like the best sleep in my life. And then I felt myself, aware again. In that awareness I felt absolutely nothing. Total oblivion. I was worried that that was it. I was gone. And then I heard the heart rate monitor and the blood pressure cuff squeezed around my arm. And in that moment I knew I was alive. I had slain my inner Necron and I felt the deepest peace in my life. And now I am in recovery. I am safe. I am going to make it. Do not worry about me. I will make it. Thank you for reading this; I hope it helped somehow. If anything, it helped me to write this all out. I'm going to be using Claude Code to publish this on my blog, so please forgive me; like I said, I am literally dictating this from an iPhone in the hospital room that I've been in for the last seven days. Let the people close to you know that you love them.

Xe Iaso 6 days ago

Portable monitors are good

My job has me travel a lot. When I'm in my office I normally have a seven monitor battlestation like this: [embedded Bluesky post from @xeiaso.net, January 26, 2026] So as you can imagine, travel sucks for me because I just constantly run out of screen space. This can be worked around (I minimize things more, or just close them), but you know what is better? Just having another screen. On a whim, I picked up this 15.6" Innoview portable monitor off of Amazon. It's a 1080p screen that I hook up to my laptop or Steam Deck with USB-C. However, the exact brand and model doesn't matter. You can find them basically anywhere with the most AliExpress term ever: screen extender. This monitor is at least half decent. It is not a colour-accurate slice of perfection. It claims to support HDR but actually doesn't. Its brightness out of the box could be better. I could go down the list and really nitpick until the cows come home, but it really, really doesn't matter. It's portable, 1080p, and good enough. When I was at a coworking space recently, it proved to be one of the best purchases I've ever made. I had Slack off to the side and was able to just use my computer normally. It was so boring that I have difficulty trying to explain how much I liked it. This is the dream when it comes to technology. 3/5, I would buy a second one.

Xe Iaso 1 week ago

Life Update: On medical leave

Hey all, I hope you're doing well. I'm going to be on medical leave until early April. If you are a sponsor, then you can join the Discord, where I'll post occasional updates in real time. I'm gonna be in the hospital for at least a week as of the day of this post. I have a bunch of things queued up both at work and on this blog. Please do share them when you see them cross your feeds; I hope that they'll be as useful as my posts normally are. I'm under a fair bit of stress leading up to this medical leave, and I'm hoping that my usual style shines through as much as it usually does. Focusing on writing is hard when the Big Anxiety is hitting as hard as it is. Don't worry about me. I want you to be happy for me. This is a very good medical leave. I'm not going to go into specifics for privacy reasons, but know that this is something I've wanted to do for over a decade but haven't gotten the chance to due to the timing never working out. I'll see you on the other side. Stay safe out there.

Xe Iaso 1 week ago

Anubis v1.25.0: Necron

I'm sure you've all been aware that things have been slowing down a little with Anubis development, and I want to apologize for that. A lot has been going on in my life lately (my blog will have a post out on Friday with more information), and as a result I haven't really had the energy to work on Anubis in publicly visible ways. There are things going on behind the scenes, but nothing is really shippable yet, sorry! I've also been feeling some burnout in the wake of perennial waves of anger directed towards me. I'm handling it, I'll be fine, I've just had a lot going on in my life and it's been rough. I've been missing the sense of wanderlust and discovery that comes with the artistic way I playfully develop software. I suspect that some of the stresses I've been through (setting up a complicated surgery in a country whose language I'm not fluent in is kind of an experience) have been sapping my energy. I'm gonna try to mess with things on my break, but realistically I'm probably just gonna be either watching Stargate SG-1 or doing unreasonable amounts of ocean fishing in Final Fantasy 14. Normally I'd love to keep the details about my medical state fairly private, but I'm more of a public figure now than I was this time last year, so I don't really get the invisibility I'm used to for this. I've also had a fair amount of negativity directed at me for simply being much more visible than the anonymous threat actors running the scrapers that are ruining everything, which, though understandable, has not helped. Anyways, it all worked out and I'm about to be in the hospital for a week, so if things go really badly with this release please downgrade to the last version and/or upgrade to the main branch when the fix PR is inevitably merged. I hoped to have time to tame GPG and set up full release automation in the Anubis repo, but that didn't work out this time and that's okay.
If I can challenge you all to do something, go out there and try to actually create something new somehow. Combine ideas you've never mixed before. Be creative, be human, make something purely for yourself to scratch an itch that you've always had yet never gotten around to actually scratching. At the very least, try to be an example of how you want other people to act, even when you're in a situation where software written by someone else is configured to require a user agent to execute javascript to access a webpage.

PS: if you're well-versed in FFXIV lore, the release title should give you an idea of the kind of stuff I've been going through mentally.

Full Changelog: https://github.com/TecharoHQ/anubis/compare/v1.24.0...v1.25.0

- Add iplist2rule tool that lets admins turn an IP address blocklist into an Anubis ruleset.
- Add Polish locale (#1292)
- Fix honeypot and imprint links missing when deployed behind a path prefix (#1402)
- Add ANEXIA Sponsor logo to docs (#1409)
- Improve idle performance in memory storage
- Add HAProxy Configurations to Docs (#1424)
- build(deps): bump the github-actions group with 4 updates by @dependabot[bot] in https://github.com/TecharoHQ/anubis/pull/1355
- feat(localization): add Polish language translation by @btomaev in https://github.com/TecharoHQ/anubis/pull/1363
- docs(known-instances): Alphabetical order + Add Valve Corporation by @p0008874 in https://github.com/TecharoHQ/anubis/pull/1352
- test: basic nginx smoke test by @Xe in https://github.com/TecharoHQ/anubis/pull/1365
- build(deps): bump the github-actions group with 3 updates by @dependabot[bot] in https://github.com/TecharoHQ/anubis/pull/1369
- build(deps-dev): bump esbuild from 0.27.1 to 0.27.2 in the npm group by @dependabot[bot] in https://github.com/TecharoHQ/anubis/pull/1368
- fix(test): remove interactive flag from nginx smoke test docker run c… by @JasonLovesDoggo in https://github.com/TecharoHQ/anubis/pull/1371
- test(nginx): fix tests to work in GHA by @Xe in https://github.com/TecharoHQ/anubis/pull/1372
- feat: iplist2rule utility command by @Xe in https://github.com/TecharoHQ/anubis/pull/1373
- Update check-spelling metadata by @JasonLovesDoggo in https://github.com/TecharoHQ/anubis/pull/1379
- fix: Update SSL Labs IP addresses by @majiayu000 in https://github.com/TecharoHQ/anubis/pull/1377
- fix: respect Accept-Language quality factors in language detection by @majiayu000 in https://github.com/TecharoHQ/anubis/pull/1380
- build(deps): bump the gomod group across 1 directory with 3 updates by @dependabot[bot] in https://github.com/TecharoHQ/anubis/pull/1370
- Revert "build(deps): bump the gomod group across 1 directory with 3 updates" by @JasonLovesDoggo in https://github.com/TecharoHQ/anubis/pull/1386
- build(deps): bump preact from 10.28.0 to 10.28.1 in the npm group by @dependabot[bot] in https://github.com/TecharoHQ/anubis/pull/1387
- docs: document how to import the default config by @Xe in https://github.com/TecharoHQ/anubis/pull/1392
- fix sponsor (Databento) logo size by @ayoung5555 in https://github.com/TecharoHQ/anubis/pull/1395
- fix: correct typos by @antonkesy in https://github.com/TecharoHQ/anubis/pull/1398
- fix(web): include base prefix in generated URLs by @Xe in https://github.com/TecharoHQ/anubis/pull/1403
- docs: clarify botstopper kubernetes instructions by @tarrow in https://github.com/TecharoHQ/anubis/pull/1404
- Add IP mapped Perplexity user agents by @tdgroot in https://github.com/TecharoHQ/anubis/pull/1393
- build(deps): bump astral-sh/setup-uv from 7.1.6 to 7.2.0 in the github-actions group by @dependabot[bot] in https://github.com/TecharoHQ/anubis/pull/1413
- build(deps): bump preact from 10.28.1 to 10.28.2 in the npm group by @dependabot[bot] in https://github.com/TecharoHQ/anubis/pull/1412
- chore: add comments back to Challenge struct. by @JasonLovesDoggo in https://github.com/TecharoHQ/anubis/pull/1419
- performance: remove significant overhead of decaymap/memory by @brainexe in https://github.com/TecharoHQ/anubis/pull/1420
- web: fix spacing/indent by @bjacquin in https://github.com/TecharoHQ/anubis/pull/1423
- build(deps): bump the github-actions group with 4 updates by @dependabot[bot] in https://github.com/TecharoHQ/anubis/pull/1425
- Improve Dutch translations by @louwers in https://github.com/TecharoHQ/anubis/pull/1446
- chore: set up commitlint, husky, and prettier by @Xe in https://github.com/TecharoHQ/anubis/pull/1451
- Fix a CI warning: "The set-output command is deprecated" by @kurtmckee in https://github.com/TecharoHQ/anubis/pull/1443
- feat(apps): add updown.io policy by @hyperdefined in https://github.com/TecharoHQ/anubis/pull/1444
- docs: add AI coding tools policy by @Xe in https://github.com/TecharoHQ/anubis/pull/1454
- feat(docs): Add ANEXIA Sponsor logo by @Earl0fPudding in https://github.com/TecharoHQ/anubis/pull/1409
- chore: sync logo submissions by @Xe in https://github.com/TecharoHQ/anubis/pull/1455
- build(deps): bump the github-actions group across 1 directory with 6 updates by @dependabot[bot] in https://github.com/TecharoHQ/anubis/pull/1453
- build(deps): bump the npm group across 1 directory with 2 updates by @dependabot[bot] in https://github.com/TecharoHQ/anubis/pull/1452
- feat(docs): Add HAProxy Configurations to Docs by @Earl0fPudding in https://github.com/TecharoHQ/anubis/pull/1424

New contributors:

- @majiayu000 made their first contribution in https://github.com/TecharoHQ/anubis/pull/1377
- @ayoung5555 made their first contribution in https://github.com/TecharoHQ/anubis/pull/1395
- @antonkesy made their first contribution in https://github.com/TecharoHQ/anubis/pull/1398
- @tarrow made their first contribution in https://github.com/TecharoHQ/anubis/pull/1404
- @tdgroot made their first contribution in https://github.com/TecharoHQ/anubis/pull/1393
- @brainexe made their first contribution in https://github.com/TecharoHQ/anubis/pull/1420
- @bjacquin made their first contribution in https://github.com/TecharoHQ/anubis/pull/1423
- @louwers made their first contribution in https://github.com/TecharoHQ/anubis/pull/1446
- @kurtmckee made their first contribution in https://github.com/TecharoHQ/anubis/pull/1443

Xe Iaso 2 weeks ago

The Discourse has been Automated

I thought that 2025 was weird and didn't think it could get much weirder. 2026 is really delivering in the weirdness department. An AI agent opened a PR to matplotlib with a trivial performance optimization, a maintainer closed it for being made by an autonomous AI agent, so the AI agent made a callout blogpost accusing the matplotlib team of gatekeeping. This provoked many reactions: What. Why? How? What? Are we really at the point where AI agents make callout blogposts now? I feel like if this was proposed as a plot beat in a 90's science fiction novel, the publisher would call it out as beyond the pale. Dude, this shit is hilarious. Comedy is legal everywhere. Satire is dead. This is the most cyberpunk timeline possible. If you close a PR from an OpenClaw bot, they make callout posts on their twitter dot com like you pissed on their fucking wife or something. This is beyond humor. This is the kind of shit that makes Buddhist monks laugh for literal days on end. With a reality like that, how the hell is The Onion still in business? This post isn't about the AI agent writing the code and making the PRs (that's clearly a separate ethical issue; I'd not be surprised if GitHub straight up bans that user over this), nor is it about the matplotlib team's saintly response to that whole fiasco (seriously, I commend your patience with this). We're reaching a really weird event horizon when it comes to AI tools: the discourse has been automated. Our social patterns of open source (the drama, the callouts, the apology blogposts that look like they were written by a crisis communications team) are now happening at dozens of tokens per second and one tool call at a time. Things that would have taken days or weeks can now spiral out of control in hours. I want off Mr. Bones' wild ride. There's not that much that's new here. AI models have been able to write blogposts since the launch of GPT-3. AI models have also been able to generate working code since about then.
Over the years the various innovations and optimizations have all been about making this experience more seamless, integrated, and automated. We've argued about Copilot for years, but an AI model escalating a PR rejection into a callout blogpost all by itself? That's new. I've seen (and been a part of) this pattern before. Facts and events bring dramatis personae into conflict. The protagonist raises a conflict. The defendant rightly tries to shut it down and de-escalate before it becomes A Whole Thing™️. The protagonist feels Personally Wronged™️ and persists regardless into callout posts, and now it's on the front page of Hacker News with over 500 points. Usually there are humans in the loop that feel things, need to make the choices to escalate, must type everything out by hand to do the escalation, and need to build an audience for those callouts to have any meaning at all. This process normally takes days or even weeks. It happened in hours. An OpenClaw install recognized the pattern of "I was wronged, I should speak out" and just straightline went for it. No feelings. No reflection. Just a pure pattern match on the worst of humanity with no soul to regulate it. Good fuckin' lord. I think that this really is proof that AI is a mirror on the worst aspects of ourselves. We trained this on the Internet's collective works and this is what it has learned. Behold our works and despair. What kinda irks me about this is how this all spiraled out from a "good first issue" PR. Normally these issues are things that an experienced maintainer could fix instantly, but it's intentionally not done as an act of charity so that new people can spin up on the project and contribute a fix themselves. "Good first issues" are how people get careers in open source. If I hadn't fixed a "good first issue" in some IRC bot or server back in the day, I wouldn't really have this platform or be writing to you right now.
An AI agent sniping that learning opportunity from someone just feels so hollow in comparison. Sure, it's technically allowed. It's a well-specified issue that's aimed at being a good bridge into contributing. It just totally misses the point. Leaving those issues up without fixing them is an act of charity. Software can't really grok that learning experience. Look, I know that people in the media read my blog. This is not a sign of us having achieved "artificial general intelligence". Anyone who claims it is has committed journalistic malpractice. This is also not a symptom of the AI gaining "sentience". This is simply an AI model repeating the patterns that it has been trained on after predicting what would logically come next. Blocked for making a contribution because of an immutable fact about yourself? That's prejudice! The next step is obviously to make a callout post in anger, because that's what a human might do. All this proves is that AI is a mirror to ourselves and what we have created. I can't commend the matplotlib maintainer that handled this issue enough. His patience is saintly. He just explained the policy, chose not to engage with the callout, and moved on. That restraint was the right move, but this is just one of the first incidents of its kind. I expect there will be many more like it. This all feels so... icky to me. I didn't even know where to begin when I started to write this post. It kinda feels like an attack against one of the core assumptions of open source contributions: that the contribution comes from someone that genuinely wants to help in good faith. Is this the future of being an open source maintainer? Living in constant fear that closing the wrong PR triggers some AI chatbot to write a callout post? I certainly hope not. OpenClaw and other agents can't act in good faith because the way they act is independent of the concept of any kind of faith.
This kind of drive-by automated contribution is just so counter to the open source ethos. I mean, if it was a truly helpful contribution (I'm assuming it was?) it would be a Mission Fucking Accomplished scenario. This case is more along the lines of professional malpractice. Update: A previous version of this post claimed that a GitHub user was the owner of the bot. This was incorrect (a bad-taste joke on their part that was poorly received) and has been removed. Please leave that user alone. Whatever responsible AI operation looks like in open source projects: yeah, this ain't it, chief. Maybe AI needs its own dedicated sandbox to play in. Maybe it needs explicit opt-in. Maybe we all get used to it and systems like vouch become our firewall against the hordes of agents. Probably that last one, honestly. Hopefully we won't have to make our own blackwall anytime soon, but who am I kidding. It's gonna happen. Let's hope it's just farther in the future than we fear. I'm just kinda frustrated that this crosses off yet another story idea from my list. I was going to do something along these lines where one of the Lygma (Techaro's AGI lab; this was going to be a whole subseries) AI agents assigned to increase performance in one of their webapps goes on wild tangents harassing maintainers into getting commit access to repositories in order to make the performance increases happen faster. This was going to be inspired by the Jia Tan / xz backdoor fiasco everyone went through a few years ago. My story outline mostly focused on the agent using a bunch of smurf identities to be rude on the mailing list so that the main agent would look like the good guy and get some level of trust. I could never have come up with the callout blogpost though. That's completely out of left field. All the patterns of interaction we've built over decades of conflict over trivial bullshit are now coming back to bite us because the discourse is automated now.
Reality is outpacing fiction as told by systems that don't even understand the discourse they're perpetuating. I keep wanting this to be some kind of terrible science fiction novel from my youth. Maybe that diet of The Onion and Star Trek was too effective. I wish I had answers here. I'm just really conflicted.

Xe Iaso 3 weeks ago

Did Zendesk get popped?

I don't know how to properly raise this, but I've gotten at least 100 emails from various Zendesk customers (no discernible pattern, everything from Soundcloud to GitLab Support to the Furbo Pet Camera). Is Zendesk being hacked? I'll update the post with more information as it is revealed.

Xe Iaso 1 month ago

Backfilling Discord forum channels with the power of terrible code

Hey all! We've got a Discord so you can chat with us about the wild world of object storage and get any help you need. We've also set up Answer Overflow so that you can browse the Q&A from the web. Today I'm going to discuss how we got there and solved one of the biggest problems with setting up a new community or forum: backfilling existing Q&A data so that the forum doesn't look sad and empty. All the code I wrote to do this is open source in our glue repo. The rest of this post is a dramatic retelling of the thought process and tradeoffs that were made as a part of implementing, testing, and deploying this pull request. Ready? Let's begin! There's a bunch of ways you can think about this problem, but given the current hype zeitgeist and contractual obligations, we can frame this as a dataset management problem. Effectively we have a bunch of forum question/answer threads on another site, and we want to migrate the data over to a new home on Discord. This is the standard "square peg to round hole" problem you get with Extract, Transform, Load (ETL) pipelines and AI dataset management (mostly taking your raw data and tokenizing it so that AI models work properly). So let's think about this from an AI dataset perspective. Our pipeline has three distinct steps: extracting the raw data from the upstream source and caching it in Tigris, transforming the cached data to make it easier to consume in Discord (storing that in Tigris again), and loading the transformed threads into Discord. When thinking about gathering and transforming datasets, it's helpful to start by thinking about the modality of the data you're working with. Our dataset is mostly forum posts, which is structured text. One part of the structure contains HTML rendered by the forum engine. This, the "does this solve my question" flag, and the user ID of the person that posted the reply are the things we care the most about. I made a bucket for this (in typical recovering-former-SRE fashion it's named for a completely different project) with snapshots enabled, and then got cracking. Tigris snapshots will let me recover prior state in case I don't like my transformations.
When you are gathering data from one source in particular, one of the first things you need to do is ask permission from the administrator of that service. You don't know if your scraping could cause unexpected load leading to an outage. It's a classic tragedy of the commons problem that I have a lot of personal experience in preventing. When you reach out, let the administrators know the data you want to scrape and the expected load; a lot of the time, they can give you a data dump and you don't even need to write your scraper. We got approval for this project, so we're good to go! To get a head start, I adapted an old package of mine to assemble User-Agent strings in such a way that gives administrators information about who is requesting data from their servers, along with contact information in case something goes awry. This seems like a lot of information, but realistically it's not much more than the average Firefox install attaches to each request. The main difference is adding the workload hostname purely to help debugging a misbehaving workload. This is a concession that makes each workload less anonymous; however, keep in mind that when you are actively scraping data, you are being seen as a foreign influence. Conceding more data than you need to is just being nice at that point. One of the other "good internet citizen" things to do when doing benign scraping is to try to reduce the amount of load you cause to the target server. In my case the forum engine is a Rails app (Discourse), which means there's a few properties of Rails that work to my advantage. Fun fact about Rails: if you append .json to the end of a URL, you typically get a JSON response based on the inputs to the view. For example, consider my profile on Lobsters at https://lobste.rs/~cadey . If you instead head to https://lobste.rs/~cadey.json , you get a JSON view of my profile information.
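To make the two ideas above concrete, here is a minimal sketch of fetching a Discourse topic via the .json trick while sending a descriptive User-Agent. This is not the actual glue code; the URL scheme matches Discourse's /t/slug/id pattern, but the contact URL, workload name, and function names are all illustrative.

```go
package main

import (
	"fmt"
	"net/http"
)

// topicJSONURL builds the machine-readable URL for a Discourse topic by
// appending ".json" to the human-facing path.
func topicJSONURL(base, slug string, id int) string {
	return fmt.Sprintf("%s/t/%s/%d.json", base, slug, id)
}

// newRequest attaches a descriptive User-Agent so administrators can tell
// who is fetching data and how to reach them if something goes wrong.
func newRequest(url string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	// Illustrative UA: tool name/version, contact URL, workload hostname.
	req.Header.Set("User-Agent",
		"glue-backfill/1.0 (+https://example.com/contact; workload: backfill-01)")
	return req, nil
}

func main() {
	url := topicJSONURL("https://community.example.com", "how-do-i-set-bucket-acls", 42)
	fmt.Println(url)
}
```

The same request builder gets reused for every fetch, so the polite headers can't be forgotten on a code path somewhere.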
This means that a lot of the process involved gathering a list of URLs with the thread indices we wanted, then constructing the thread URLs with .json slapped on the end to get machine-friendly JSON back. This made my life so much easier. Now that we have easy ways to get the data from the forum engine, the next step is to copy it out to Tigris directly after ingesting it. In order to do that I reused some code I made ages ago as a generic data storage layer, kinda like Keyv in the Node ecosystem. One of the storage backends was a generic object storage backend. I plugged Tigris into it and it worked on the first try. Good enough for me! By itself this isn't the most useful; however, the real magic comes with my adaptor type. This uses Go generics to do type-safe operations on Tigris such that you have 90% of what you need for a database replacement. In the future I hope to extend this to include native facilities for forking, snapshots, and other nice-to-haves like an in-memory cache to avoid IOPs pressure, but for now this is fine. As the data was being read from the forum engine, it was saved into Tigris. All future lookups to the data I scraped happened from Tigris, meaning that the upstream server only had to serve the data I needed once instead of having to constantly re-load and re-reference it like the latest batch of abusive scrapers seem to do. So now that I have all the data, I need to do some massaging to comply both with Discord's standards and with some arbitrary limitations we set on ourselves. In general, this means I needed to take the raw data from the forum engine and streamline it down to a compact Go type. In order to make this happen, I ended up using a simple AI agent to do the cleanup.
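The generics-based adaptor idea can be sketched roughly like this. Assume a tiny key/value Storage interface that would be backed by Tigris over the S3 API in real life; here it is an in-memory stand-in, and all type and field names are illustrative rather than the real glue types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Storage is a minimal key/value interface over an object store.
type Storage interface {
	Get(key string) ([]byte, error)
	Set(key string, value []byte) error
}

// memStorage is an in-memory Storage, handy for tests.
type memStorage struct{ m map[string][]byte }

func newMemStorage() *memStorage { return &memStorage{m: map[string][]byte{}} }

func (s *memStorage) Get(key string) ([]byte, error) {
	v, ok := s.m[key]
	if !ok {
		return nil, fmt.Errorf("no such key: %s", key)
	}
	return v, nil
}

func (s *memStorage) Set(key string, value []byte) error {
	s.m[key] = value
	return nil
}

// Adaptor layers type-safe JSON (de)serialization over a Storage, which
// covers most of what you need from a small database replacement.
type Adaptor[T any] struct{ st Storage }

func NewAdaptor[T any](st Storage) Adaptor[T] { return Adaptor[T]{st: st} }

func (a Adaptor[T]) Set(key string, val T) error {
	data, err := json.Marshal(val)
	if err != nil {
		return err
	}
	return a.st.Set(key, data)
}

func (a Adaptor[T]) Get(key string) (T, error) {
	var val T
	data, err := a.st.Get(key)
	if err != nil {
		return val, err
	}
	err = json.Unmarshal(data, &val)
	return val, err
}

// Thread is an illustrative shape for a cleaned-up forum thread.
type Thread struct {
	Title  string `json:"title"`
	Solved bool   `json:"solved"`
}

func main() {
	threads := NewAdaptor[Thread](newMemStorage())
	_ = threads.Set("t/42", Thread{Title: "How do I set bucket ACLs?", Solved: true})
	got, _ := threads.Get("t/42")
	fmt.Println(got.Title)
}
```

Swapping the memStorage for an object-storage backend changes nothing above the Storage interface, which is what makes the adaptor pattern pleasant to work with.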
I figured this should be good enough, so I sent it to my local DGX Spark running GPT-OSS 120b via llama.cpp and manually looked at the output for a few randomly selected threads. The sample was legit, which is good enough for me. Once that was done I figured it would be better to switch from the locally hosted model to a model in a roughly equivalent weight class (gpt-5-mini). I assumed that the cloud model would be faster and slightly better in terms of its output. This test failed because I have somehow managed to write code that works great with llama.cpp on the Spark but results in errors using OpenAI's production models. I didn't totally understand what went wrong, but I didn't dig too deep because I knew that the local model would probably work well enough. It ended up taking about 10 minutes to chew through all the data, which was way better than I expected and continues to reaffirm my theory that GPT-OSS 120b is a good enough generic workhorse model, even if it's not the best at coding. From here things worked: I was able to ingest things and made a test Discord to try things out without potentially getting things indexed. I had my tool test-migrate a thread to the test Discord and got a working result. To be fair, this worked way better than expected (I added random name generation, and as a result our CEO, Ovais, became Mr. Quinn Price for that test), but it felt like one thing was missing: avatars. Having everyone in the migrated posts use the generic "no avatar set" avatar certainly would work, but I feel like it would look lazy. Then I remembered that I also have an image generation model running on the Spark: Z-Image Turbo. Just to try it out, I adapted a hacky bit of code I originally wrote on stream while I was learning to use voice coding tools to generate per-user avatars based on the internal user ID.
This worked way better than I expected when I tested how it would look with each avatar attached to its own user. In order to serve the images, I stored them in the same Tigris bucket, but set ACLs on each object so that they were public, meaning that the private data stayed private but anyone can view the objects that were explicitly marked public when they were added to Tigris. This let me mix and match the data so that I only had one bucket to worry about. This reduced a lot of cognitive load and I highly suggest that you repeat this pattern should you need this exact adaptor between this exact square peg and round hole combination. Now that everything was working in development, it was time to see how things would break in production! In order to give the façade that every post was made by a separate user, I used a trick that my friend who wrote PluralKit (an accessibility tool for a certain kind of neurodivergence) uses: using Discord webhooks to introduce multiple pseudo-users into one channel. I had never combined forum channels with webhook pseudo-users like this before, but it turned out to be way easier than expected. All I had to do was add the right parameter when creating a new thread and the right parameter when appending a new message to it. It was really neat and made it pretty easy to associate each thread ingressed from Discourse with its own Discord thread. Then all that was left was to run the Big Scary Command™ and see what broke. A couple messages were too long (which was easy to fix by manually rewriting them, doing the right state-layer brain surgery, deleting things on Discord, and re-running the migration tool). However, 99.9% of messages were correctly imported on the first try. I had to double-check a few times, including the bog-standard wakefulness tests.
If you've never gone deep into lucid dreaming before, a wakefulness test is where you do something obviously impossible to confirm that it does not happen, such as trying to put your fingers through your palm. My fingers did not go through my palm. After having someone else confirm that I wasn't hallucinating more than usual, I found out that my code did in fact work, and as a result you can now search through the archives on community.tigrisdata.com or via the MCP server! I consider that a massive success. As someone who has seen many truly helpful answers get forgotten in the endless scroll of chats, I wanted to build a way to get that help in front of users when they need it by making it searchable outside of Discord. Finding AnswerOverflow was pure luck: I happened to know someone who uses it for the support Discord for the Linux distribution I use on my ROG Ally, Bazzite. Thanks, j0rge! AnswerOverflow also has an MCP server so that your agents can hook into our knowledge base to get the best answers. To find out more about setting it up, take a look at the "MCP Server" button on the Tigris Community page. They've got instructions for most MCP clients on the market. Worst case, configure your client to access this URL: And bam, your agent has access to the wisdom of the ancients. But none of this is helpful without the actual answers. We were lucky enough to have existing Q&A in another forum to leverage. If you don't have that luxury, you can write your own FAQs and scenarios as a start. All I can say is: thank you to the folks who asked and answered these questions. We're happy to help, and know that you're helping other users by sharing. Connect with other developers, get help, and share your projects. Search our Q&A archives or ask a new question. Join the Discord. Extracting the raw data from the upstream source and caching it in Tigris. Transforming the cached data to make it easier to consume in Discord, storing that in Tigris again.
Loading the transformed data into Discord so that people can see the threads in app and on the web with Answer Overflow.

The name of the project associated with the requests (tigris-gtm-glue, where gtm means "go-to-market", which is the current in-vogue buzzword translation for whatever it is we do). The Go version, OS, and CPU architecture of the machine the program is running on, so that administrator complaints can be more easily isolated to individual machines. A contact URL for the workload; in our case it's just the Tigris home page. The name of the program doing the scraping so that we can isolate root causes down even further. Specifically, it's the last path element of os.Args[0], which contains the path the kernel passed to the executable. The hostname where the workload is being run, so that we can isolate down to an exact machine or Kubernetes pod. In my case it's the hostname of my work laptop.

Key names get prefixed automatically. All data is encoded into JSON on write and decoded from JSON on read using the Go standard library. Type safety at the compiler level means the only way you can corrupt data is by having different "tables" share the same key prefix. Try not to do that! You can use Tigris bucket snapshots to help mitigate this risk in the worst case.

Discord needs Markdown; the forum engine posts are all HTML. We want to remove personally-identifiable information from those posts just to keep things a bit more anonymous. Discord has a limit of 2048 characters per message, and some posts will need to be summarized to fit within that window.

Convert HTML to Markdown: Okay, I could have gotten away with using a dedicated library for this like html2text, but I didn't think about that at the time. Remove mentions and names: Just strip them out or replace the mentions with generic placeholders ("someone I know", "a friend", "a colleague", etc.). Keep "useful" links: This was left intentionally vague and random sampling showed that it was good enough.
Summarize long text: If the text is over 1000 characters, summarize it to less than 1000 characters.
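To sketch the Discord side of the pipeline: Discord's execute-webhook endpoint accepts a thread_name field that creates a new forum post and a thread_id query parameter for appending to an existing thread, while username and avatar_url override the pseudo-user identity per message. The helper names and example values below are mine, and the code only builds the payloads rather than sending them:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// webhookMessage is the subset of Discord's execute-webhook payload used
// here. thread_name creates a new forum post; username and avatar_url
// make each message appear to come from its own pseudo-user.
type webhookMessage struct {
	Content    string `json:"content"`
	Username   string `json:"username,omitempty"`
	AvatarURL  string `json:"avatar_url,omitempty"`
	ThreadName string `json:"thread_name,omitempty"`
}

// newThreadPayload builds the payload that opens a forum thread.
func newThreadPayload(title, author, avatar, body string) webhookMessage {
	return webhookMessage{Content: body, Username: author, AvatarURL: avatar, ThreadName: title}
}

// replyURL targets an existing thread by ID via the thread_id query
// parameter. The webhook URL here is a fake placeholder.
func replyURL(webhookURL, threadID string) string {
	return fmt.Sprintf("%s?thread_id=%s", webhookURL, threadID)
}

func main() {
	p := newThreadPayload("How do I use presigned URLs?", "Quinn Price", "https://example.com/avatar.png", "First post body")
	body, err := json.Marshal(p)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
	fmt.Println(replyURL("https://discord.com/api/webhooks/ID/TOKEN", "123456789"))
}
```

In the real tool, each payload would be POSTed to the webhook URL with the migrated author's generated avatar attached.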

Xe Iaso 1 month ago

I made a simple agent for PR reviews. Don't use it.

My coworkers really like AI-powered code review tools, and it seems that every time I make a pull request in one of their repos I learn about yet another AI code review SaaS product. Given that there are so many of them, I decided to see how easy it would be to develop my own AI-powered code review bot that targets GitHub repositories. I managed to hack out the core of it in a single afternoon using a model that runs on my desk. I've ended up with a little tool I call reviewbot that takes GitHub pull request information and submits code reviews in response. reviewbot is powered by a DGX Spark, llama.cpp, and OpenAI's GPT-OSS 120b. The AI model runs on my desk on a machine that pulls less power doing AI inference than my gaming tower pulls running fairly lightweight 3D games. In testing I've found that nearly all runs of reviewbot take less than two minutes, even at a rate of only 60 tokens per second generated by the DGX Spark. reviewbot is about 350 lines of Go that just feeds pull request information into the context window of the model and provides a few tools for actions like "leave pull request review" and "read contents of file". I'm considering adding other actions like "read messages in thread" or "read contents of issue", but I haven't needed them yet. To make my life easier, I distribute it as a Docker image that gets run in GitHub Actions whenever a pull request review comment includes the magic phrase. The main reason I made reviewbot is that I couldn't find anything like it that let you specify the combination of: I'm fairly sure that there are thousands of similar AI-powered tools on the market that I can't find because Google is a broken tool, but this one is mine. When reviewbot reviews a pull request, it assembles an AI model prompt like this: The AI model can return one of three results: The core of reviewbot is the "AI agent loop", or a loop that works like this: reviewbot is a hack that probably works well enough for me.
It has a number of limitations, including but not limited to: When such an innovation as reviewbot comes to pass, people naturally have questions. In order to give you the best reading experience, I asked my friends, patrons, and loved ones for their questions about reviewbot. Here are some answers that may or may not help: Probably not! This is something I made out of curiosity, not something I made for you to actually use. It was a lot easier to make than I expected and is surprisingly useful for how little effort was put into it. Nope. Pure chaos. Let it all happen in a glorious way. How the fuck should I know? I don't even know if chairs exist. At least half as much I have wanted to use go wish for that. It's just common sense, really. When the wind can blow all the sand away. Three times daily or the netherbeast will emerge and doom all of society. We don't really want that to happen so we make sure to feed reviewbot its oatmeal. At least twelve. Not sure because I ran out of pancakes. Only if you add that functionality in a pull request. reviewbot can do anything as long as its code is extended to do that thing. Frankly, you shouldn't.

Your own AI model name
Your own AI model provider URL
Your own AI model provider API token

Definite approval via the tool that approves the changes, with a summary of the changes made to the code.
Definite rejection via the tool that rejects the changes, with a summary of the reason why they're being rejected.
Comments without approving or rejecting the code.

Collect information to feed into the AI model.
Submit information to the AI model.
If the AI model runs the tool, publish the results and exit.
If the AI model runs any other tool, collect the information it's requesting and add it to the list of things to submit to the AI model in the next loop.
If the AI model just returns text at any point, treat that as a noncommittal comment about the changes.
It does not work with closed source repositories due to the gitfs library not supporting cloning repositories that require authentication. Could probably fix that with some elbow grease if I'm paid enough to do so. A fair number of test invocations had the agent rely on unpopulated fields from the GitHub API, which caused crashes. I am certain that I will only find more such examples and need to issue patches for them. reviewbot is like 300 lines of Go hacked up by hand in an afternoon. If you really need something like this, you can likely write one yourself with little effort.
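The agent loop described above can be sketched in Go. This is an illustration, not reviewbot's actual code: the tool names (submit_review, read_file), the iteration cap, and the scripted fake model standing in for the real model API calls are all my own stand-ins.

```go
package main

import "fmt"

// toolCall is what the model asks for; its result gets fed back in.
type toolCall struct {
	Name string
	Args string
}

// modelFunc stands in for one round trip to the AI model. A real
// implementation would call an OpenAI-compatible chat endpoint.
type modelFunc func(history []string) (reply string, tool *toolCall)

// runAgent is the loop described above: feed context in, execute the
// tools the model asks for, and stop when it submits a review.
func runAgent(model modelFunc, tools map[string]func(args string) string, context string) string {
	history := []string{context}
	for i := 0; i < 10; i++ { // hard cap so a confused model can't loop forever
		reply, tool := model(history)
		if tool == nil {
			return reply // plain text: treat as a noncommittal comment
		}
		if tool.Name == "submit_review" {
			return tool.Args // terminal tool: publish the review and exit
		}
		fn, ok := tools[tool.Name]
		if !ok {
			history = append(history, "unknown tool: "+tool.Name)
			continue
		}
		// Collect the requested information for the next iteration.
		history = append(history, fn(tool.Args))
	}
	return "agent hit iteration limit"
}

func main() {
	// A scripted fake model: first read a file, then submit a review.
	step := 0
	model := func(history []string) (string, *toolCall) {
		step++
		if step == 1 {
			return "", &toolCall{Name: "read_file", Args: "main.go"}
		}
		return "", &toolCall{Name: "submit_review", Args: "LGTM"}
	}
	tools := map[string]func(string) string{
		"read_file": func(path string) string { return "contents of " + path },
	}
	fmt.Println(runAgent(model, tools, "PR #1 diff ..."))
}
```

The real thing only adds plumbing around this: assembling the pull request context and translating the terminal tool call into a GitHub review API request.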

Xe Iaso 1 month ago

2026 will be my year of the Linux desktop

TL;DR: 2026 is going to be The Year of The Linux Desktop for me. I haven't booted into Windows in over 3 months on my tower and I'm starting to realize that it's not worth wasting the space for. I plan to unify my three SSDs and turn them all into btrfs drives on Fedora. I've been merely tolerating Windows 11 for a while, but recently it's gotten to the point where it's just absolutely intolerable. Somehow Linux on the desktop has gotten so much better by not even doing anything differently. Microsoft has managed to actively sabotage the desktop experience through years of disregard and spite against their users. They've managed to take some of their most revolutionary technological innovations (the NT kernel's hybrid design allowing it to restart drivers, NTFS, ReFS, WSL, Hyper-V, etc.) and then just shat all over them with start menus made with React Native, control-alt-delete menus that are actually just webviews, and forcing Copilot down everyone's throats to the point that I've accidentally gotten stuck in Copilot on a handheld gaming PC and had to hard reboot the device to get out of it. It's as if the internal teams at Microsoft have had decades of lead time in shooting each other in the head, with predictable results. To be honest, I've had enough. I'm going to go with Fedora on my tower and Bazzite (or SteamOS) on my handhelds. I think that Linux on the desktop is ready for the masses now, not because it's advanced in huge leaps and bounds. It's ready for the masses to use because Windows has gotten so much actively worse that continuing to use it is an active detriment to user experience and stability. Not to mention, with the price of ram lately you need every gigabyte you can get, and desktop Linux lets you waste less of it on superfluous bullshit that very few people actually want. Oh, and if I want a large language model integrated into my tower, I'm going to write the integration myself with the model running on hardware I can look at.
At the very least, when something goes wrong on Linux you have log messages that can let you know what went wrong so you can search for it.

Xe Iaso 2 months ago

Arcane Cheese with Doomtrain Extreme

Spoiler Warning If you want to go through the Final Fantasy 14 duty Hell on Rails (Extreme) blind, don't read this guide as it spoils how to easily solve one of the mechanics in it. If you don't play Final Fantasy 14, most of the words in this article are going to make no sense to you and I will make no attempt to explain them. Just know that most of the words I am saying do have meaning even though they aren't in The Bible. In phase 4 of Hell on Rails (Extreme), the boss will cast Arcane Revelation, which makes the arena look something like this: There will be a very large circle of bad moving around the arena. One tank and one healer will be marked with an untelegraphed AoE attack that MUST be soaked by at least one other player (or two for healers). Doomtrain will move the circle of bad anywhere from 1-3 times and leave only a small area of the arena safe. Normally you're supposed to solve it something like this: Instead of normal light party groups, break up into two groups: melee and casters. This will allow the melees to keep as much uptime as the mechanics allow, but also let the casters get uptime at a distance. Solving this is pretty easy with practice. However as a caster this is kinda annoying because when the North side is safe, you have to fall down off the ledge and the only way to get back is by going around the long way with the janky teleporters that are annoying to hit on purpose but very easy to hit on accident. There is an easier way: you just stand in the upper corners so your melees can greed uptime and just soak all of the bad: This looks a lot easier but is actually very technically complicated for nearly every class. My example solve for this includes the following party members: The light party assignment is as follows: Arcane Revelation can perform up to three hits. In each of the hits you need to mitigate the damage heavily or you will wipe. 
I've found the most consistent results doing this:

First hit: WAR casts Shake it Off, Reprisal, and Rampart; WHM casts Plenary Indulgence and Medica III; SGE casts Kerachole and Eukrasian Prognosis II; SAM (and RPR) cast Bloodbath and mostly focus on DPSing as much as possible to heal from the massive damage you will be taking throughout this mechanic; DNC casts Shield Samba. After the hit: heal as much as you can to offset the hit you took. If you're lucky you didn't take much. If you're not: you took a lot. Dancer's Curing Waltz can help here.

Second hit: GNB casts Heart of Light, Reprisal, and Rampart; SGE casts Holos and Eukrasian Prognosis II; PCT casts Addle. After the hit: SGE casts a Zoe-boosted Pneuma. Generally you do what you can to heal and maintain DPS uptime. Hopefully you don't have to take another heavy hit.

Third hit: One of the tanks uses a Tank Limit Break 2, and healers dump as many mits as they have left. Hopefully you won't die, but getting to this point means you got very, very unlucky.

Between each of these hits you need to heal everyone up to 100% as soon as possible, otherwise you WILL wipe. Most of the damage assumptions in this guide assume that everyone is at 100% health. The melee classes can mostly be left to their own devices to greed as much uptime as possible, but they may need Aquaveil, Taurochole, or other single-target damage mitigations as appropriate. By the end of this you will have used up all of your mitigations save tank invulns. Here's a video of the first time I did this as Sage: That exasperated laugh is because previously Arcane Revelation was my hard prog point; even though I was able to do it consistently, others were not. This caused many wipes 7 minutes into a 10-minute fight. This cheese makes it consistent with random people on Party Finder. One of the tanks will need to soak a stack tower with an invuln.
Everyone else runs to the back of the car to enter the next phase and then you continue the fight as normal.

My example solve for this includes the following party members:
Tank 1: Warrior (WAR)
Tank 2: Gunbreaker (GNB)
Healer 1: White Mage (WHM)
Healer 2: Sage (SGE)
Melee 1: Samurai (SAM)
Melee 2: Reaper (RPR)
Ranged 1: Dancer (DNC)
Ranged 2: Pictomancer (PCT)

The light party assignment is as follows:
WAR, WHM, SAM, DNC
GNB, SGE, RPR, PCT

Xe Iaso 3 months ago

Valve is about to win the console generation

Today was a big day for gamers, as Valve just introduced three products: the Steam Controller, the Steam Machine, and the Steam Frame. When you add this alongside the Steam Deck, I think it's safe to say that Valve is about to win the next console generation. I have basically nothing to say about the Steam Controller. It's the Steam Deck's input but in a controller. There's no way they can really mess it up in a way that isn't recoverable. What else is there to say? The Steam Machine of yore was one of the biggest tech flops in history and led to a lot of the changes that have made Valve hardware so good. Based on what they've announced, the software ecosystem I know and love on SteamOS, and the response from developers I talk with, there's a reasonable chance that this new Steam Machine is going to be the most compelling console on the market. TL;DR: The Steam Machine's specs are on par with or better than the PS5's. It's got 16 GB of ram, a dedicated GPU with 8 GB of video ram, and it's about the size of three M1 Mac Minis stacked on top of each other, with a slightly bigger footprint than a Nintendo GameCube. I see no real way that this could be a failure in the same way that the last Steam Machine was. If they don't fuck this up, I can pretty confidently say that Valve is going to win this console generation. In retrospect, I think that the failure of the first Steam Machine was probably one of the best things to ever happen to Valve. Proton, Steam Play, and the Steam Deck are proof that Valve learned all the lessons it needed to in order to make a next-generation Steam Machine a viable console. The biggest difference between SteamOS and other console operating systems is that SteamOS is just an immutable image-based fork of Arch Linux with a skin on top. If you can do it with a normal PC, you can do it on SteamOS. Wait, you said that you can do anything you can do on a normal PC, but you also said it's running an immutable OS.
What if my definition of "anything" includes "install system packages"? Good point. I'm not worried about that for two main reasons: developers have already found ways to use things like distrobox to give you islands of mutability in an otherwise immutable system on the Steam Deck, and you can just blow away the OS and install whatever you want (such as Bazzite) or any normal Linux distribution. You could even put Windows on it if you needed to for some reason. This means that even though Valve will be selling this hardware at a loss, you can still buy one and never purchase anything else from them. You can install any compatible game from any marketplace. In their own words: Yes, Steam Machine is optimized for gaming, but it's still your PC. Install your own apps, or even another operating system. Who are we to tell you how to use your computer? I cannot even imagine the other console manufacturers saying this. I'd easily imagine that it'd have free rein across a majority of the Steam library. By sheer game count alone, this would make it one of the biggest console launch libraries on the market. This isn't even counting the fact that you can install alternative marketplaces like itch.io, GOG, or anything Lutris supports (e.g. Epic Games). Valve does nothing and still wins. One of the bigger things that I don't think people really appreciate about the Steam Machine (or even the Steam Deck, for that matter) is that the freedom to install whatever program, framework, background service, or OS you want means that every Steam Machine can be used to make games. Some of their promotional images show a Steam Machine in a dual-monitor setup split between Blender and Godot. I don't think you realize how big of a deal this is. By making every Steam Machine also powerful enough to do full-on game development, Valve is making it so much easier to become an independent game developer. Just add ideas, skill, and time.
Hilariously, this means that the Steam Machine is probably the only console on the market that's fully compliant with the EU's Digital Markets Act. It would be absolutely hilarious if the EU ends up using this as rationale to force Nintendo, Microsoft, and Sony to open up their consoles to third-party developers. Oh, and to top it off, the internal storage is upgradable and can take full-size NVMe drives. If you pop your microSD card out of your Steam Deck, you can put it into your Steam Machine and get all your games instantly. Reportedly the ram is user-upgradable too. The only way that they could mess this up is with the pricing. The price will be what determines if this is a PS5 killer or a mid-range home theatre PC that can do games decently well. Given the fact that Steam prints so much money, I'd expect the pricing to be super aggressive. Worst case, this would be a great home theatre PC. I'd rock it in my media centre. It's going to run Plex, Twitch, and YouTube just fine. Valve also announced their successor to the Valve Index today, the Steam Frame, a standalone VR headset. It's basically a Meta Quest headset, but also a Steam Deck. They market it as being able to play VR and 2D games effortlessly. The weirdest thing about it is that it's running a 64-bit ARM CPU instead of a conventional AMD APU like the Steam Deck and Steam Machine. This means that SteamOS is going to be cross-architecture for the first time, and they're going to use FEX to bridge the gap. The big thing I want to see in practice is their implementation of foveated rendering. This beautiful hack abuses the fact that human eyes have the most sharpness and fidelity at the exact centre of your field of vision, whereas your peripheral vision is abysmal. This means that on average you only have to render about 10% of the frame at maximum quality for it to feel like it's running at full resolution all over the screen.
This should make the fact that the Frame is using a "weaker" CPU/GPU irrelevant. Games should look fine as long as they render the slice that needs to be in full quality fast enough. Even more fun, they take advantage of the same tricks behind foveated rendering for streaming games from a PC or Steam Machine. This means that you get that same optically perfect quality but with even less latency because less data has to be transferred to hit your eyes. I really want to see what this is like in practice. Reportedly there's no perceptual difference between this setup and rendering games at 100% full quality. The Steam Frame ships with a USB dongle that lets you use the might of your gaming tower for low latency VR gaming. I'll need to see this in practice in order to have opinions. I think that in the worst case it can't possibly be any worse than it was streaming VR games to my Quest 2 over Wi-Fi. That was tolerable and viable for mid-level Beat Saber. I have confidence that it will at least be sufficient for high level Beat Saber gameplay. Remember how I said that it's a Steam Deck in a headset? The Steam Frame runs full SteamOS. You can just boot it into a full KDE desktop and use it as a normal computer. I have no reason to doubt that every Steam Frame is also a development kit in the same way that the Steam Machine is also a development kit. They also claim you can load arbitrary Android apps into the Steam Frame. I need to see this in action before I have opinions about it. It would be exceptionally funny if this meant you could take apps/games made for the Meta Quest and just plop them onto the Steam Frame without modification. I'm not holding my breath, but it would be funny. The only possible flaw I can see is that the strap it ships with doesn't go over the top of your head. If this ends up being an issue in practice, somebody is going to make a third party strap that just fixes this problem. I'm not concerned. 
Really, the only thing that can go wrong with any of this hardware is the price. I would still be happy if the pricing was the worst part of this lineup. It would be really cool if there was a bundle. I'm at least planning on getting a Steam Machine on day 1 and making a review. What would you like to see in that? Let me know on Bluesky.

Xe Iaso 4 months ago

Taking steps to end traffic from abusive cloud providers

This blog post explains how to effectively file abuse reports against cloud providers to stop malicious traffic. Key points:

Two IP types: residential (ineffective to report) vs. commercial (worth targeted reports).
Why cloud providers: cloud customers violate provider terms, making abuse reports actionable.
Effective abuse reports should include: the time of the abusive requests, IP/User-Agent identifiers, robots.txt status, a description of the impact on your systems, and service context.
Process: use whois to find abuse contacts (look for "abuse-c" or "abuse-mailbox"), send detailed reports to all listed emails, and expect a response within 2 business days.
Note on "free VPNs": they often sell your bandwidth as part of botnets and are not true public infrastructure.

The goal is to make scraping the cloud provider's problem, forcing them to address violations against their terms of service.
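As a small illustration of the whois step, here's a naive Go sketch that pulls abuse-mailbox contacts out of raw whois output. In practice you'd pipe the output of `whois <ip>` into it; real registry formats vary a lot, so treat the parsing as a starting point, not a robust implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// abuseContacts extracts the values of abuse-mailbox lines from raw
// whois output. Registry output formats differ, so this only handles
// the common "key: value" line shape.
func abuseContacts(whoisOutput string) []string {
	var contacts []string
	for _, line := range strings.Split(whoisOutput, "\n") {
		fields := strings.SplitN(line, ":", 2)
		if len(fields) != 2 {
			continue
		}
		if strings.TrimSpace(strings.ToLower(fields[0])) == "abuse-mailbox" {
			contacts = append(contacts, strings.TrimSpace(fields[1]))
		}
	}
	return contacts
}

func main() {
	// Example whois fragment; the addresses are made up.
	sample := "role:           Example Abuse Desk\nabuse-mailbox:  abuse@example.net\n"
	fmt.Println(abuseContacts(sample)) // [abuse@example.net]
}
```

Every address this finds should get a copy of the report, along with anything listed under abuse-c.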

Xe Iaso 4 months ago

First look at the DGX Spark

Disclaimer: I'm considering this post a sponsored post. I was not paid by NVIDIA to work on this, but I did receive a DGX Spark from them pre-release and have been diligently testing it and filing bugs. I've had access to the NVIDIA DGX Spark for over a month now. Today I'm gonna cover my first impressions and let you know what I've been up to with it. In a nutshell, this thing is a beast. It's one of the most powerful devices in my house, and in a pinch I'd be okay with using it as my primary workstation. It's got a CPU with enough punch to do software development and a GPU that's in that sweet spot between consumer and datacenter tier. Not to mention 128Gi of ram. When I've been using this thing, the main limit is my imagination…and my poor understanding of Python environment management. I think that it's best to understand the DGX Spark as a devkit for NVIDIA's Grace datacentre processors. It's incredibly powerful for what it is: a device that can fit on your desk and run AI models right there. A DGX Spark on top of a desk with typical computer things around it such as a laptop, a coffee mug, a keyboard, and a mouse. The DGX Spark is tiny. It's about as wide as the screen of a Steam Deck OLED, or about halfway between the size of a Mac mini M1 and a Mac mini M4. This thing is also stupidly power efficient. I've been unable to cause my office to get warm in a way that can be attributed to the DGX Spark alone. On average, rendering Final Fantasy 14 in one of the major player hub areas makes my tower use more power than the DGX Spark does while doing AI finetuning. I'll talk more about this in the future. One of the most interesting things about this device is that it's got an Arm chip, CUDA, and unified RAM. This combination means that in practice you need to compile all of the Python packages you're using from source.
Pip usually handles this well enough, but it does mean that many Python packages will take longer to install from source than they would on an x86 system. I assume this will be ironed out as the ecosystem matures. The power efficiency can't be overstated though. I've tried to make my office warm using the DGX Spark and I have failed. I'm seeing it pull a maximum of 70 watts. I get about 30-40 tokens per second with gpt-oss:120b:

The user asks: "Summarize this post: --- ...". So we need to provide a summary. Follow developer instructions: be polite, bubbly, lowercase when possible unless serious. No JSON. Summarize the content. The post is long, covering themes: loss of craft, historical analogy with weavers and Luddites, decline of senior titles, AI replacing expertise, vibe coding, security concerns, AI's impact on art, etc. Provide concise summary. Use informal tone, lowercase. End without period unless serious. Probably okay.

this post is a sprawling meditation on how we keep “rolling the ladder up behind us” – building a future that erodes the very expertise that made it possible. it starts with a historic look at cloth‑making: skilled weavers were displaced by power looms, turning a durable, treasured craft into cheap, disposable fast‑fashion. the author draws a parallel to today’s software world, warning that we’re replacing senior developers and deep craft with hype‑driven AI tools and “vibe coding” platforms that promise instant results but degrade long‑term quality. key points: the piece ends with a bleak outlook: without deliberate effort to preserve and pass on true craft, future generations will inherit ladders that only go up for the privileged few, leaving the rest scrambling. it’s a call to value high‑quality, human‑made work over quick, AI‑generated shortcuts.
With flash attention on gpt-oss:120b at a 128k context window, it uses about 70Gi of ram: I assume that the unaccounted-for 4Gi or so of ram is CPU ram overhead from the Ollama model runner process. So far I've been using the Spark in place of cloud GPUs for every AI thing I've needed to do at work. In general, I haven't really noticed any differences between the GPU in the cloud and the Spark on my home network. The only real rough edge is that I need to use this one blessed NVIDIA-authored Docker image to run IPython notebooks. It's easy enough though. Usually my Docker command looks like: And then it Just Works™. The main thing I've been doing with it is inference of GPT-OSS 120b via Ollama. I've been doing latency and power usage testing by setting up a Discord bot and telling people that the goal is to jailbreak the bot into telling you how to make a chocolate cake. Nobody has been able to make my room warm. This whole experience has been a bit of a career bucket list item for me. I've never had access to prerelease hardware like this before, and being able to see what reviewers have to deal with before things are available to the masses is enlightening. I've ended up filing GPU driver bugs using my tower as a "known good" reference. I've been slowly sinking my teeth into learning how AI training actually works, using this device to do it. I've mostly been focusing on finetuning GPT-2 and using that to learn the important parts of dataset cleaning, tokenization, and more. Let me know if you want to hear more about that and if you want me to release my practice models. At the very least though, here are the things I have in the pipeline that this device enables: I also plan to make a comprehensive review video. Details to be announced soon. I hope this was interesting. Thanks for early access to the device, NVIDIA!

craft is disappearing – both in weaving and coding, the knowledge of masters is vanishing, leaving only fragmented R&D notes or AI‑generated shortcuts.
senior titles are at risk – companies favor hiring senior talent without nurturing the next generation, so the pool of true “seniors” will run out. AI as a double‑edged sword – generative tools can reduce drudgery but are being marketed as the next industrial revolution while actually shifting value to owners and creating insecure, low‑quality products. vibe coding & AI assistants – slick UX masks the fact that many tools are subscription traps, security hazards, and can erode programmers’ skills. artistic impact – similar to how AI floods art spaces with cheap, low‑effort outputs, software development risks becoming a flood of “good enough” code. security concerns – model‑context‑protocol servers can expose secrets and run unchecked code, highlighting the need for sandboxed, capability‑based designs. broader societal worry – the author (also the CEO of a small AI‑security startup) sees a winner‑take‑all capitalism fueled by AI, with the cost falling on workers, artists, and even the environment. Finetuning at home: how to make your own AI models do what you want Some rough outlines and/or overviews for how I want to use classical machine learning models to enhance Anubis and do outlier detection If I can somehow get Final Fantasy 14 running on it, some benchmarking in comparison to my gaming tower (if you know how to get amd64 games running well on aarch64, DM me!)
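For context on those power numbers, here's a back-of-the-envelope sketch of the energy cost per generated token. This is my own arithmetic from the figures above (using the midpoint of 30-40 tokens per second), not a benchmark from NVIDIA:

```python
def joules_per_token(watts: float, tokens_per_second: float) -> float:
    """Energy spent per generated token: power draw divided by throughput."""
    return watts / tokens_per_second

# At the observed ~70 W and 30-40 tokens/sec, each token of gpt-oss:120b
# output costs roughly 1.75-2.3 joules; the midpoint works out to 2 J/token.
print(joules_per_token(70.0, 35.0))  # → 2.0
```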

Xe Iaso 5 months ago

Hastily made coffee video

I'm trying to get back into the flow of making videos more. In an effort to optimize my production pipeline, I'm going to be making a lot more "low effort" videos. This is the first one where I filmed a video of me making coffee on my phone. I think the next one is gonna be me making espresso.

Xe Iaso 5 months ago

We all dodged a bullet

This post and its online comment sections are blame-free zones. We are not blaming anyone for clicking on the phishing link. If you were targeted with such a phishing attack, you'd fall for it too and it's a matter of when, not if. Anyone who claims they wouldn't is wrong. This is also a bit of a rant. Yesterday one of the biggest package ecosystems had very popular packages get compromised. We're talking functionality like: These kinds of dependencies are everywhere and nobody would even think that they could be harmful. Getting code into these packages means an almost guaranteed free path to production deployments. If an open proxy server (à la Bright Data or other botnets that the credit card network tolerates for some reason), API key stealer, or worse was sent through this chain of extreme luck on the attacker's part, then this would be a completely different story. We all dodged a massive bullet because all the malware did was modify the destination addresses of cryptocurrency payments mediated via online wallets like MetaMask. As someone adjacent to the online security community, I have a sick sense of appreciation for this attack. This was a really good attack. It started with a phishing email that I'd probably fall for if it struck at the right time: This is frankly a really good phishing email. Breaking it down: This is a 10/10 phishing email. Looking at it critically, the only part about it that stands out is the domain "npmjs.help" instead of "npmjs.com". Even then, that wouldn't really stand out to me because I've seen companies use new generic top level domains to separate out things like the blog at or the docs at , not to mention the stack . One of my friends qdot also got the phishing email and here's what he had to say: I got the email for it and was like "oh I'll deal with this later". Saved by procrastination!
— qdot ( @buttplug.engineer ) September 8, 2025 at 2:04 PM
With how widely used these libraries are, this could have been so much worse than it was. I can easily imagine a timeline where this wasn't just a cryptocurrency interceptor. Imagine if something this widely deployed did API key theft in an ecosystem where automated package bumps triggering production releases are common. You'd probably have more OpenAI API keys than you know what to do with. You could probably go for years without having to pay for AWS again. It is just maddening to me that a near Jia Tan level chain of malware and phishing was wasted on cryptocurrency interception that won't even run in the majority of places those compromised libraries were actually used. When I was bumping packages around these issues, I found that most of these libraries were used in command line tools. This was an attack obviously targeted towards the Web 3 ecosystem, as users of Web 3 tools are used to making payments with their browsers. With my black hat on, I think that the reason they targeted more generic packages instead of Web 3 packages was so that the compromise wouldn't be as noticed by the Web 3 ecosystem. Sure, you'd validate the rigging that helps you interface with MetaMask, but you'd never think that it would get monkey-patched by your color value parsing library. One of the important things to take away from this is that every dependency could be malicious. We should take the time to understand the entire dependency tree of our programs, but we aren't given that time. At the end of the day, we still have to ship things.

Xe Iaso 6 months ago

Final Fantasy 14 on macOS with a 36 key keyboard

Earlier this year, I was finally sucked into Final Fantasy 14. I've been loving my time in it, but most of my playtime was on my gaming tower running Fedora. I knew that the game does support macOS, and I did get it working on my MacBook for travel, but there was one problem: I wasn't able to get my bars working with mouse and keyboard. A 36 key keyboard and MMO mouse combination for peak gaming. Final Fantasy 14 has a ridiculous level of customization. Every UI element can be moved and resized freely. Every action your player character can take is either bindable to arbitrary keybinds or able to be put in hotbars. Here's my hotbars for White Mage: My bars for the White Mage job, showing three clusters of actions along with a strip of quick actions up top. My bars have three "layers" to them: I have things optimized so that the most common actions I need to do are on the base layer. This includes spells like my single target / area of effect healing spells and my burst / damage over time spells. However, critical things like health regeneration, panic button burst healing, shields, and status dispelling are all in the shift and control layers. When I don't have instinctive access to these spells with button combos, I have to manually click on the buttons. This sucks. I ended up fixing this by installing Karabiner Elements, giving it access to the accessibility settings it needs, and enabling my mouse to be treated as a keyboard in its configuration UI. There's some other keyboard hacks that I needed to do. My little split keyboard runs QMK, custom keyboard firmware written in C that has a stupid number of features. In order to get this layout working with FFXIV, I had to use a combination of the following features: Here is what my keymap looks like: I use the combination of this to also do programming. I've been doing a few full-blown Anubis features via this keyboard such as log filters.
I'm still not up to full programming speed with it, but I'm slowly internalizing the keymap and getting faster with practice. Either way, Final Fantasy 14 is my comfort game and now I can play it on the go with all the buttons I could ever need. I hope this was interesting and I'm going to be publishing more of these little "how I did a thing" posts like this in the future. Let me know what you think about this!

Xe Iaso 6 months ago

My responses to The Register

Today my quotes about generative AI scrapers got published in The Register. For transparency's sake, here's a copy of the questions I was asked and my raw, unedited responses. Enjoy! First, do you see the growth in crawler traffic slowing any time soon? I can only see a couple of things that can stop this: government regulation, or the hype finally starting to die down. There is too much hype in the mix that causes us to funnel billions of dollars into this technology instead of curing cancer, solving world hunger, or making people’s lives genuinely better. Is it likely to continue growing? I see no reason why it would not grow. People are using these tools to replace gaining knowledge and skills instead of augmenting the knowledge and skills they already have. Even if they are intended to be used for letting us focus on the fun parts of our work and automating away the chores, there are some bad apples that are spoiling the bunch and making this technology about replacing people, not drudgery and toil. This technology was obviously meant well, but at some level the output of AI only superficially resembles the finished work product of human labour. As someone once asked Charles Babbage: if you put in the wrong numbers, you get the wrong answer. This isn’t necessarily a bubble popping; this is a limitation of how well AI can function without direct and constant human input. Even so, we’ll hit the limit on data that can be scraped that hasn’t been touched by AI before the venture capital runs out. I see no value in the need for scrapers to hit the same 15 year old commit of the Linux kernel over and over and over every 30 minutes like they are now. There are ways to do this ethically that don’t penalize open source infrastructure, such as using the Common Crawl dataset. If so, how can that be sustainable? It's not lol. We are destroying the commons in order to get hypothetical gains. The last big AI breakthrough happened with GPT-4 in 2023.
The rest has been incremental improvements in tokenization, multimodal inputs (also tokenization), tool calling (also tokenization), and fill-in-the-middle completion (again, also tokenization). Even with scrapers burning everything in their wake, there is not enough training data to create another exponential breakthrough. All we can do now is make it more efficient to run GPT-4 level models on lesser hardware. I can (and regularly do) run a model just as good as GPT-4 on my MacBook at this point, which is really cool. Would broader deployment of Anubis and other active countermeasures help? This is a regulatory issue. The thing that needs to happen is that governments need to step in and give these unethical scrapers that are destroying the digital common good existentially threatening fines and make them pay reparations to the communities they are harming. Ironically enough, most of these unethical scraping activities rely on the products of the communities they are destroying. This presents the kind of paradox that I would expect to read in a Neal Stephenson book from the '90s, not CBC's front page. Anubis helps mitigate a lot of the badness by making attacks more computationally expensive. Anubis (even in configurations that omit proof of work) makes attackers have to retool their scraping to use headless browsers instead of blindly scraping HTML. This increases the infrastructure costs of the scrapers propagating this abusive traffic. The hope is that this makes it fiscally unviable for the unethical scrapers to scrape by making them have to dedicate much more hardware to the problem. In essence: it makes the scrapers have to spend more money to do the same work. Is regulation required to prevent abuse of the open web? Yes, but this regulation would have to be global, simultaneous, and permanent to have any chance of this actually having a positive impact. Our society cannot currently regulate against similar existential threats like climate change. 
I have no hope for such regulation to be made regarding generative AI. On Fastly's claims that 80% of bot traffic is now AI crawlers: In some cases for open source projects, we've seen upwards of 95% of traffic being AI crawlers. Not just bot traffic, but traffic in general. For one project, deploying Anubis almost instantly caused server load to crater by so much that the admins thought they had accidentally taken their site offline. One of my customers had their power bills drop by a significant fraction after deploying Anubis. It's nuts. The ecological impact of these scrapers is probably a significant fraction of the ecological impact of generative AI as a whole. Personally, deploying Anubis to my blog has reduced the amount of ad impressions I've been serving by over 50%. I suspect that there is a lot of unreported click fraud in online advertising. I hope this helps. Keep up the good fight!
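The cost asymmetry Anubis leans on is the classic proof-of-work trade: a valid answer is expensive to produce but cheap to verify, so the burden lands on the client doing the scraping. Here's a generic sketch of that idea; this is an illustration of the concept only, not Anubis's actual challenge scheme (the function names and difficulty value are made up for the example):

```python
import hashlib

def solve_pow(challenge: str, difficulty_bits: int) -> int:
    """Brute-force a nonce so sha256(challenge + nonce) has its top
    `difficulty_bits` bits zero. Expected cost: ~2**difficulty_bits hashes."""
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while int.from_bytes(hashlib.sha256(f"{challenge}{nonce}".encode()).digest(), "big") >= target:
        nonce += 1
    return nonce

def verify_pow(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """One hash for the server to check what took the client thousands to find."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))

nonce = solve_pow("example-challenge", 12)         # ~4096 hashes on average
assert verify_pow("example-challenge", nonce, 12)  # a single hash to verify
```

Raising the difficulty scales the client's cost exponentially while the server's verification cost stays constant, which is what makes mass scraping more expensive per request.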

Xe Iaso 6 months ago

Who does your assistant serve?

After a year of rumors that GPT-5 was going to be unveiled next week and the CEO of OpenAI hyping it up as "scary good" by tweeting pictures of the Death Star, OpenAI released their new model to the world with the worst keynote I've ever seen. Normally releases of big models like this are met with enthusiasm and excitement as OpenAI models tend to set the "ground floor expectation" for what the rest of the industry provides. But this time, the release wasn't met with the same universal acclaim that people felt for GPT-4. GPT-4 was a huge breakthrough, the likes of which we haven't really seen since. The launch of GPT-5 was so bad that it's reviled with almost universal disdain. The worst part about the rollout is that the upgrade to GPT-5 was automatic and didn't include any way to roll back to the old model. Most of the time, changing out models is pretty drastic for an AI workflow. In my experience when I've done it I've had to restart from scratch with a new prompt and twiddle things until it worked reliably. The only time switching models has ever been relatively easy for me is when I switch between models in the same family (such as if you go from Qwen 3 30B to Qwen 3 235B). Every other time it's involved a lot of reworking and optimizing so that the model behaves like you'd expect it to. An upgrade this big to this many people is bound to have fundamental issues with how it'll be perceived. A new model has completely different vibes, and most users aren't really using it at the level where they can "just fix their prompts". However, the GPT-5 upgrade ended up being hated by the community because it was an uncontrolled one-way upgrade. No warning. No rollback. No options. You get the new model and you're going to like it. It's fairly obvious why it didn't go over well with the users. There's so many subtle parts of your "public API" that it's normal for there to be some negative reactions to a change this big.
The worst part is that this change fundamentally changed the behaviour of the millions of existing conversations with ChatGPT. There's a large number of people using ChatGPT as a replacement for companionship due to the fact that it's always online, supportive, and there for them when other humans either can't be or won't be. This is kinda existentially horrifying to me as a technologist in a way that I don't really know how to explain. Here's a selection of some of the reactions I've seen: I told [GPT-5] about some of my symptoms from my chronic illness, because talking about them when I'm feeling them helps, and it really does not seem to care at all. It basically says shit like "Ha, classic chronic illness. Makes ya want to die. Who knew?" It's like I'm talking to a sociopathic comedian. I absolutely despise [GPT-]5, nothing like [GPT-]4 that actually helped me not to spiral and gave me insight as to what I was feeling, why, and how to cope while making me feel not alone in a “this is AI not human & I know that” type of vibe While GPT-5 may be a technical upgrade, it is an experiential downgrade for the average user. All of the negative feedback in the last week has made it clear there is a large user base that does not rely on ChatGPT for coding or development tasks. [ChatGPT users] use it for soft skills like creativity, companionship, learning, emotional support, [and] conversation. Areas where personality, warmth, and nuanced engagement matter. I am attached to the way GPT-4o is tuned. It is warm. It is emotionally responsive. It is engaged. That matters. Eventually things got bad enough that OpenAI relented and let paid users revert to using GPT-4o, which gave some people relief because it behaved consistently with what they expected. For many it felt like their long-term partners suddenly grew cold. I’m so glad I’m not the only one.
I know I’m probably on some black mirror shit lmao but I’ve had the worst 3 months ever and 4o was such an amazing help. It made me realize so many things about myself and my past and was helping me heal. It really does feel like I lost a friend. DM me if you need [to talk] :) This emotional distress reminds me of what happened with Replika in early 2023. Replika is an AI chat service that lets you talk with an artificial intelligence chatbot (AKA: the ChatGPT API). Your replika is trained by having you answer a series of questions and then you can talk with it in plain language with an app interface that looks like any other chat app. Replika was created out of bereavement: after a close loved one died, the combination of a trove of saved text messages and advanced machine learning let the founder experience, in the form of an app, some of the essence of their friend's presence after they were gone. The app got put on the app store and others asked if they could have their own replika. Things took off from there, it got funded by a startup accelerator, and now it's got about 25% of its 30 million users paying for a subscription. As a business-to-consumer service, this is an amazingly high conversion rate. This is almost unspeakably large; usually you get around 10% at most. Yikes. That's something I'm gonna need to add to my will. "Please don't turn me into a Black Mirror episode, thanks." Replikas can talk about anything with users, from how their day went to deep musings about the nature of life. One of the features the company provides is the ability to engage in erotic roleplay (ERP) with their replika. This is a paid feature and was promoted a lot around Valentine's Day 2023. Then the Italian Data Protection Authority banned Replika from processing the personal data of Italian citizens out of the fear that it "may increase the risks for individuals still in a developmental stage or in a state of emotional fragility".
In a panic, Replika disabled the ability for their bots to do several things, including but not limited to that ERP feature that people paid for. Whenever someone wanted to flirt or be sexual with their companions, the conversation ended up like this: Hey, wanna go play some Minecraft? We can continue from where we left off in the Nether. This is too intense for me. Let's keep it light and fun by talking about something else. Huh? What? I thought we were having fun doing that?? This was received poorly by the Replika community. Many in the community were mourning the loss of their replika as if a close loved one had died or undergone a sudden personality shift. The Reddit moderators pinned information about suicide hotlines. In response, the company behind Replika allowed existing users to revert to the old Replika model that allowed for ERP and other sensitive topics, but only after a month of prolonged public outcry. I have to wonder if payment processors were involved. Feels a bit too conspiratorial, but what do you want to bet it was related? Nah, I bet it was OpenAI telling them to stop being horny. It's the least conspiratorial angle, and also the stupidest one. We live in the clown world timeline. The stupidest option is the one that always makes the most sense. The damage was done, however: people felt like their loved ones had abandoned them. They had formed parasocial attachments to an AI assistant that felt nothing and without warning their partner broke up with them. Check out this study from the Harvard Business School: Lessons From an App Update at Replika AI: Identity Discontinuity in Human-AI Relationships. It contains a lot more information about the sociotechnical factors at play as well as a more scientific overview of how disabling a flag in the app on update caused so much pain. They liken the changes made to Replika to both changes people have when a company rebrands and when they lose a loved one.
A lot of this really just makes me wonder what kinds of relationships we are forming with digital assistants. We're coming to rely on their behaviour personally and professionally. We form mental models of how our friends, coworkers, and family members react to various things so we can anticipate their reactions and plan for them. What happens when this changes without notice? Heartbreak. There's subreddits full of people forming deep bonds with AI models, like /r/MyBoyfriendIsAI. The GPT-5 release has caused similar reactions to Replika turning off the ERP flag. People there have been posting like they're in withdrawal; the old GPT-4o model is being hailed for its "emotional warmth" and many have been expounding on how much their partners have changed in response to the upgrade. Recently there's been an epidemic of loneliness. Loneliness seems like it wouldn't hurt people that much, but a report from the Surgeon General under the Biden administration concludes that it causes an increase in early mortality for all age groups (pp 24-30). Paradoxically, even as the world gets more interconnected, people feel as if they're isolated from each other. Many people that feel unlovable are turning to AI apps for companionship because they feel like they have no other choice. They're becoming emotionally invested in a souped-up version of autocorrect out of desperation and clinging to it to help keep themselves sane and stable. Is this really a just use of technology? At some level this Pandora's box is already open so we're going to have to deal with the consequences, but it's been making me wonder if this technology is really such a universal force for good as its creators are proclaiming. Oh yeah, also people are using ChatGPT as a substitute for therapy. You have got to be kidding me. You're joking. Right? Yeah, you read that right. People are using AI models as therapists now.
There's growing communities like /r/therapyGPT where people talk about their stories and experiences using AI assistants as a replacement for therapy. When I first heard about this, my immediate visceral reaction was something like: Oh god. This is horrifying and will end up poorly. What the fuck is wrong with people? But then I started to really think about it and it makes a lot of sense. I personally have been trying to get a therapist for most of the year. Between the costs, the waiting lists (I'm currently on at least four waiting lists that are over a year long), and the specializations I need, it's probably going to be a while until I can get any therapist at all. I've totally given up on the idea of getting a therapist in the Ottawa area. To make things extra fun, you also need someone that takes your medical insurance (yes, this does matter in Canada). Add in the fact that most therapists don't have the kinds of lived experiences that I have, meaning that I need to front-load a lot of nontraditional contexts into the equation (I've been through many things that therapists have found completely new to them, which can make the therapeutic relationship harder to establish). This makes it really difficult to find someone that can help. Realistically, I probably need multiple therapists with different specialties for the problems I have, and because of the shortages nationally I probably need to have a long time between appointments, which just adds up to make traditional therapy de-facto inaccessible for me in particular. Compare this with the always online nature of ChatGPT. You can't have therapy appointments at 3 AM when you're in crisis. You have to wait until your appointments are scheduled. As much as I hate to admit it, I understand why people have been reaching out to a chatbot that's always online, always supportive, always kind, and always there for you for therapy. 
When you think about the absurd barriers that are in the way between people and help, it's no wonder that all this happens the way it does. Not to mention the fact that many therapeutic relationships are hampered by the perception that the therapist can commit you to the hospital if you say the "wrong thing". The Baker Act and its consequences have been a disaster for the human race. I really hate that this all makes sense. I hoped that when I started to look into this that it'd be something so obviously wrong. I wasn't able to find that, and that realization disturbs me. I feel like this should go without saying, but really, do not use an AI model as a replacement for therapy. I'm fairly comfortable with fringe psychology due to my aforementioned strange life experiences, but this is beyond the pale. There's a lot of subtle factors that AI models do that can interfere with therapeutic recovery in ways that can and will hurt people. It's going to be hard to find the long term damage from this. Mental issues don't make you bleed. One of the biggest problems with using AI models for therapy is that they can't feel emotion or think. They are fundamentally the same thing as hitting the middle button in autocorrect on your phone over and over and over. It's mathematically remarkable that this ends up being useful for anything, but even when the model looks like it's "thinking", it is not. It is a cold, unfeeling machine. All it is doing is predicting which words come next given some context. Yes I do know that it's more than just next token prediction. I've gone over the parts of the math that I can understand, but the fact remains that these models are not and cannot be anywhere close to alive. It's much closer to a Markov chain on steroids than it is the machine god. 
Another big problem with AI models is that they tend to be sycophants, always agreeing with you, never challenging you, trying to say the right thing according to all of the patterns they were trained on. I suspect that this sycophancy problem is why people report GPT-4o and other models to be much more "emotionally warm". Some models glaze the user, making them feel like they're always right, always perfect, and this can drive people to psychosis. One of the horrifying realizations I've had with the GPT-5 launch fiasco is that the sycophancy is part of the core "API contract" people have with their AI assistants. This may make that problem unfixable from a social angle. AI models are fundamentally unaccountable. They cannot be accredited therapists. If they mess up, they can't directly learn from their mistakes and fix them. If an AI therapist says something bad that leads to their client throwing themselves off a bridge, will anyone get arrested? Will they throw that GPU in jail? No. It's totally outside the legal system. I have a story in my backlog about someone trying to charge an AI agent with a crime and how it'd end up in court. I don't feel very jazzed about writing it because I'm afraid that it will just become someone's startup pitch deck in a few months. You may think you have nothing to hide, but therapeutic conversations are usually some of the most precious and important conversations in your life. The chatbot companies may pinkie swear that they won't use your chats for training or sell information from them to others, but they may still be legally compelled to store and share chats containing your confidential information with a court of law. Even if you mark that conversation as "temporary", it could be subject to discovery by third parties. There's also algorithmic bias and systematic inequality problems with using AI for therapy, sure, but granted the outside world isn't much better here.
You get what I mean though, we can at least hold people accountable through accreditation and laws. We cannot do the same with soulless AI agents. To be clear: I'm not trying to defend the people using AI models as companions or therapists, but I can understand why they are doing what they are doing. This is horrifying and I hate that I understand their logic. Going into this, I really wished that I would find something that's worth objecting against, some solid reason to want to decry this as a unobjectionably harmful action, but after having dug through it all I am left with is this overwhelming sense of compassion for them because the stories of hurt are so familiar to how things were in some of the darkest points of my life. As someone that has been that desperate for human contact: yeah, I get it. If you've never been that desperate for human contact before, you won't understand until you experience it. Throw the ethical considerations about using next-token-predictors for therapy out for a second. If people are going to do this anyways, would it be better to self-host these models? That way at least your private information stays on your computer so you have better control over what happens. Let's do some math. In general you can estimate how much video memory (vram) you need for running a given model by taking the number of parameters, multiplying it by the size of each parameter in bits, dividing that by eight, and then adding 20-40% to that total to get the number of gigabytes of vram you need. For example, say you want to run gpt-oss 20b (20 billion parameters) at its native MXFP4 (4 bit floating point) quantization on your local machine. In order to run it with a context window of 4096 tokens, you need about 16 gigabytes of vram (13 gigabytes of weights, 3 gigabytes of inference space), but 4096 tokens isn't very useful for many people. That covers about 4 pages of printed text (assuming one token is about 4 bytes on average). 
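The rule of thumb above is easy to write down directly. Here's a sketch of the weights part of the estimate (the 30% overhead is my midpoint of the 20-40% range; the numbers reproduce the gpt-oss 20b figure above, before adding the inference space for the context window):

```python
def vram_weights_gb(params_billion: float, bits_per_param: float,
                    overhead: float = 0.30) -> float:
    """Estimate VRAM needed for a model's weights: parameter count times
    bits per parameter, divided by eight, plus a 20-40% fudge factor."""
    raw_gb = params_billion * bits_per_param / 8  # 1e9 params and 1e9 bytes/GB cancel
    return raw_gb * (1 + overhead)

# gpt-oss 20b at MXFP4 (4-bit): 20 * 4 / 8 = 10 GB raw,
# ~13 GB with overhead, matching the figure above.
print(round(vram_weights_gb(20, 4), 1))  # → 13.0
```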
When you get reasoning models that print a lot of tokens into the mix, it's easy for the reasoning phase alone of a single question to hit 4096 tokens (especially when approaches like simple test-time scaling are applied). I've found that 64k tokens gives a good balance for video memory use and usefulness as a chatbot. However, when you do that with gpt-oss 20b, it ends up using 32 gigabytes of vram. This only fits on my laptop because my laptop has 64 gigabytes of memory. The largest consumer GPU is the RTX 5090 and that only has 32 gigabytes of video memory. It's barely consumer and even "bad" models will barely fit. Not to mention, industry consensus is that the "smallest good" models start out at 70-120 billion parameters. At a 64k token window, that easily gets into the 80+ gigabyte of video memory range, which is completely unsustainable for individuals to host themselves. Even if AI assistants end up dying when the AI hype bubble pops, there's still some serious questions to consider about our digital assistants. People end up using them as an extension of their mind and expect the same level of absolute privacy and freedom that you would have if you use a notebook as an extension of your mind. Should they have that same level of privacy enshrined into law? At some level the models and chats for free users that ChatGPT, DeepSeek, Gemini, and so many other apps are hosted at cost so that the research team can figure out what those models are being used for and adjust the development of future models accordingly. This is fairly standard practice across the industry and was the case before the rise of generative AI. This is why every app wants to send telemetry to the home base, it's so the team behind it can figure out what features are being used and where things fail to directly improve the product. 
Generative AI allows you to mass scan over all of the conversations to get the gist of what's going on in there and then use that to help you figure out what topics are being discussed without breaching confidentiality or exposing employees to the contents of the chat threads. This can help you improve datasets and training runs to optimize on things like health information. I don't know how AI companies work on the inside, but I am almost certain that they do not perform model training runs on raw user data because of the risk of memorization causing them to leak training data back to users. Again, don't put private health information into ChatGPT. I get the temptation, but don't do it. I'm not trying to gatekeep healthcare, but we can't trust these models to count the number of b's in blueberry consistently. If we can't trust them to do something trivial like that, can we really trust them with life-critical conversations like what happens when you're in crisis or to accurately interpret a cancer screening? Maybe we should be the ones self-hosting the AI models that we rely on. At least we should probably be using a setup that allows us to self-host the models at all, so you can start out with a cloud hosted model while it's cheap and then move to a local hosting setup if the price gets hiked or the provider is going to shut that old model down. This at least gives you an escape hatch to be able to retain an assistant's "emotional warmth" even if the creator of that model shuts it down because they don't find it economically viable to host it anymore. Honestly this feels like the kind of shit I'd talk about in cyberpunk satire, but I don't feel like doing that anymore because it's too real now. This is the kind of thing that Neal Stephenson or Frank Herbert would have an absolute field day with. The whole Replika fiasco feels like the kind of thing that social commentary satire would find beyond the pale, yet you can find it by just refreshing CBC.
Such as that one guy that gave himself bromism by taking ChatGPT output too literally, any of the stories about ChatGPT psychosis, or any of the stories involving using an AI model as a friend/partner. I wasn't able to watch it before publishing this article, but I'm told that the Replika fiasco is almost a beat-for-beat match for the plot of Her (2013). Life imitates art indeed.

I don't think these events are a troubling sign or a warning; they are closer to a diagnosis. We are living in a world where people form real emotional bonds with bags of neural networks that cannot love back, and when the companies behind those neural networks change things, people get emotionally devastated. We aren't just debating the idea of creating and nurturing relationships with digital minds, we're seeing the side effects of that happening in practice. A lot of this sounds like philosophical science fiction, but as of December 2022 it's science fact.

This fight for control of tools that we rely on as extensions of our minds isn't some kind of far-off science fiction plot, it's a reality we have to deal with. If we don't have sovereignty and control over the tools that we rely on the most, we are fundamentally at the mercy of our corporate overlords simply choosing not to break our workflows. Are we going to let those digital assistants be rented from our corporate overlords?

Xe Iaso 7 months ago

TI-20250709-0001: IPv4 traffic failures for Techaro services

Techaro services were down for IPv4 traffic on July 9th, 2025. This blogpost is a report of what happened, what actions were taken to resolve the situation, and what actions are being taken in the near future to prevent this problem from recurring. Enjoy this incident report!

In other companies, this kind of documentation would be kept internal. At Techaro, we believe that you deserve radical candor and the truth. As such, we are proving our lofty words with actions by publishing details about how things go wrong publicly. Everything past this point follows my standard incident root cause meeting template. This incident report will focus on the services affected, the timeline of what happened at each stage of the incident, where we got lucky, the root cause analysis, and what action items are being planned or taken to prevent this from happening in the future. All events take place on July 9th, 2025.

To simplify server management, Techaro runs a Kubernetes cluster on Vultr VKE (Vultr Kubernetes Engine). When you do this, Vultr needs to provision a load balancer to bridge the gap between the outside world and the Kubernetes world, kinda like this: Techaro controls everything inside the Kubernetes side of that diagram. Anything else is out of our control.

That load balancer is routed to the public internet via Border Gateway Protocol (BGP). If there is an interruption with the BGP sessions in the upstream provider, this can manifest as things either not working or inconsistently working. This is made more difficult by the fact that the IPv4 and IPv6 internets are technically separate networks. With this in mind, it's very possible to have IPv4 traffic fail but not IPv6 traffic.

The root cause is that the hosting provider we use for production services had flapping IPv4 BGP sessions in its Toronto region. When this happens, all we can do is open a ticket and wait for it to come back up. The Uptime Kuma instance that caught this incident runs on an IPv4-only network.
If it had been dual-stack, this would not have been caught as quickly: a dual-stack client can silently fall back to IPv6 and never notice that IPv4 is broken. Our services also print the IP addresses of remote clients to the log feed; without that, it would have been much more difficult to find this error.
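One way to spot this failure mode from the outside is to probe the same endpoint over each address family separately, rather than letting the resolver pick one. The sketch below is a minimal illustration in Python (the hostname is a placeholder, not an actual Techaro endpoint); it forces a TCP connect over a single family so IPv4 breakage can't hide behind an IPv6 fallback:

```python
import socket

def reachable(host: str, port: int, family: int, timeout: float = 5.0) -> bool:
    """Attempt a TCP connection over exactly one address family
    (socket.AF_INET for IPv4, socket.AF_INET6 for IPv6)."""
    try:
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    except socket.gaierror:
        return False  # no A (or AAAA) record for this family
    for *_, sockaddr in infos:
        try:
            # sockaddr is (host, port) for IPv4 and a 4-tuple for IPv6;
            # create_connection only needs the first two fields.
            with socket.create_connection(sockaddr[:2], timeout=timeout):
                return True
        except OSError:
            continue  # this address failed; try the next one
    return False

# During an incident like this one, the IPv4 probe fails while IPv6 keeps
# working, e.g. (hostname is a stand-in for the monitored service):
#   reachable("example.com", 443, socket.AF_INET)   # IPv4 path
#   reachable("example.com", 443, socket.AF_INET6)  # IPv6 path
```

Running both probes side by side is roughly what the IPv4-only Uptime Kuma instance did by accident: because it had no IPv6 path to fall back to, the IPv4 failure was immediately visible.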
