Latest Posts (20 found)

The Laziest Generation

by Ibrahim Diallo

Ibrahim talks about house prices in the US, how it's only getting worse, and the perception from previous generations that kids today are somehow lazy because they can't afford a house before the age of 40.

Being the father of 2 young people, this worries me too. Despite this post being US-centric, the script is the same here in the UK. Unless my kids' generation come out of school on a 6 figure salary, they don't have a hope in hell of buying a decent house. To put that in context, here in the UK a £100,000 salary puts you in the top 3% of earners. In the late 90s a house would cost around 4x a person's salary on average. Today it's 8x. So most can forget about saving for a deposit. Instead younger generations will have to rely on inheritance, which will only push the age at which people buy houses even later in life. Something has to give.

Thanks for reading this post via RSS. RSS is ace, and so are you. ❤️ You can reply to this post by email, or leave a comment.

0 views
iDiallo Today

The Laziest Generation

I don't understand why this generation can't afford a home. When my grandfather was 18, he had already saved enough money from his paper route and various odd jobs to buy his first home. By the time my father turned 26, he was already married, had his first child, and was moving into his first home. We lived frugally, and our parents taught us the value of spending wisely. Today's man-children, at the ripe old age of 40, still cannot afford a home. Yet they have no problem eating out every day, going to the movies, and buying popcorn and avocado toast. Add in subscriptions to ten different services they barely use, and that's money thrown out the window. They don't see the correlation between their spending habits and their inability to buy a home or save money in the first place. I understand that my grandfather's house only cost $12,000, and my father bought his for $50,000. Mine was much more expensive, I paid $150,000, and that house is now worth a million. I understand that by the time my children are 26, it will probably be worth $10 million. If they start saving now, they'll have a shot. But I can tell they will choose reckless spending over saving, and I simply do not understand this generation. Last week I found a flyer wedged into my front door from a real estate agent in the neighborhood. On it was a list of homes she had sold, each entry showing a picture of the house and its sale price. The cheapest was $970k. For her, this was a record of her work. "Hire me and I'll sell your house," a calling card of bragging rights. For me, it was a nightmare. I don't live in an affluent neighborhood, yet somehow all the homes are worth a million dollars. Thirteen years ago, a colleague of mine bought hers in this same neighborhood for around $200k. It was a savvy investment. If she sells now, she'll get at least five times what she paid. 
While that price was reasonable at the time, meaning you could dedicate a third of your salary to your mortgage, at a million dollars, you're paying far more. That's between $7,000 and $10,000 per month. Good luck finding a job that pays three times that. To satisfy that requirement, you'd need to earn $250k to $360k a year. Cutting back on avocado toast or prepping your own meals won't save you nearly enough. If you squint and stretch your imagination, maybe it's possible to afford these homes, not by cutting back, but by finding new sources of income. But what about the next generation? My kids. When they're in their 20s and 30s, how much will houses cost? If we continue at this pace, the wooden houses in this neighborhood are going to cost at least $10 million each. And we'll call the next generation even lazier. Maybe we'll tell them they're splurging on water bottles. "Back in my day, we drank tap water." Or maybe they're not using Grok enough to come up with a smarter financial strategy. I don't think this is sustainable. The only way forward may be for everything to collapse first. See you at the homeless camp where we'll all end up.

0 views
Unsung Today

The curious case of the disappearing Polish S

Speaking of remastering (and diacritics), I grabbed my older Medium deep dive called The curious case of the disappearing Polish S, and put it on the new site. It looks so much better than on Medium and while I was at it, I’ve redone all the visuals, and updated it a little bit. It’s still probably one of my favourite bugs I’ve encountered. I hope you enjoy! #bug deep dives #keyboard #localization #marcin wichary #text editing

0 views
ava's blog Yesterday

enduring the heat wave in germany

I live in an apartment that first gets heated up on one side before noon, then later from the other side. My kitchen is especially hot each year because it has a huge bay window with no shutters installed. My strategies for keeping cool have been to air out everything at night, and if possible draw in and circulate air via a fan during some of it. Then as soon as the sun is coming up, closing windows, lowering the existing outside shutters so the sun can’t heat up the glass or insides, and always keeping the kitchen door closed so the heat is contained within. I avoid opening the windows during the day to not let heat in, except if I really need fresh air or the humidity is too high. Humidity is the thing that is wrecking us the most in this, which is why it is often futile to ask people elsewhere how they deal with these high temperatures when those people live in very dry climates. The humidity messes with your body’s ability to exude heat, and in worst case, results in the wet bulb effect . That is also why even people from hotter countries can suddenly struggle elsewhere (like Europe), together with the angle at which sunlight hits Earth at that area being different (a lower sun angle spreads the same amount of energy over a larger area, making it feel cooler, while a higher angle concentrates energy on a smaller area, increasing warmth). This is why fans with water cooling and tips like hanging a wet T-shirt in front of a fan, constantly misting yourself or wearing wet clothes etc. can sort of backfire and make your home a bit more unbearable, depending on the circumstances. I also have a fan with water cooling with optional cooling bricks/batteries, and it’s currently on because we hang out in front of it, but I’m mindful of when I turn that mode on and for how long. In the next few weeks, we are planning to add sun protection foil to some windows, and when the extreme demand is over in fall, I’ll buy a Midea Porta Split and install it in the living room. 
Good tips in general, some summarized from above: Hydrate a lot, even before you are actually thirsty. Stay inside if possible. Keep the added humidity to a minimum. Know what you are trying to do with drinks and showers. Cool drinks and showers offer relief, but can make you heat up after. Hot beverages and showers can make everything feel cooler after and help you sweat. I like both, depending on the situation. Wrap ice packs or similar stuff in a towel and put them under your feet or in your armpits. If possible, lower shutters so the sun cannot heat up the interior and the glass. Maybe install sun protection foil on windows (most are plant-friendly). I’ve also seen others provisionally use those reflecting covers for cars on their windows, or aluminium foil. Make sure that if it’s behind the glass, the heat won’t be trapped and make the glass crack, so preferably attach it on the outside. Sunscreen, wide breathable and covering clothing, sun umbrellas and hats. During fall/winter, maybe during Black Friday sales, get a portable split cooling system. Portables do not need structural changes to the building, which is why they tend to be allowed in rental units as they can be removed without a trace and aren’t in use all year. Shitty landlords might get mad to see it in your window, but in many countries, there already is positive case law about them and the usual AC dismissals don’t apply to them. Set out flat bowls of water in the shade for wild animals and refill them. Consider different ones for different sizes (a flat one with stone pebbles for insects, a relatively flat but water-only one for hedgehogs etc., one bird bath…). Use cool tiles and cooling mats for pets. Keep an eye out for baby birds who flee their overheated nests too early; maybe you can save some of them. Especially birds who live in attics and roofs are dying right now (swifts etc.). If possible and you can plan the shipment, avoid deliveries. Keep water around for delivery personnel. 
Eat smaller snacks and portions spread out throughout the day instead of big meals so your body doesn’t heat up as much during digestion.

Don’t: Leave the windows open all day. Let the sun heat up your interior, if possible; try at least covering windows with blankets if there are no shutters. Set out water for animals where it heats up drastically, or in a container where they might become trapped and drown. Walk your dog when the ground is heated up - asphalt burns happen quickly past 25 degrees Celsius. Fall for scalpers, scammers and increased prices for ACs and fans who are using the current demand and availability issues for profit. The Porta Split I mean to get can be bought for 550-750 Euro under normal circumstances, now during the heat wave, prices have exploded to over 1.4k. Only buy that if it is an emergency. Think fans or ACs can make you sick. This is a widely held belief especially in older generations in Germany at least, together with the myth that any wind can cause a cold and stiff neck. It is bullshit. It’s a big reason why this country is not prepared for this heat and there’s a 20% adoption rate for ACs here. Think you need to keep the fan off or not buy one at all because of the electricity bill. The increase is lower for newer models and for the few days you need to use it (more) (for now). You are also not meaningfully contributing to climate change with this increased energy use. Like, come on, they wanna build entire data centers eating away gigawatts, your heat protection is not the issue here.

Still, all of these tend to be hyperindividualistic solutions, just like when Covid happened, and we need more widespread, structural solutions. Not everyone can stay home; many people still have to work and commute. You might tell people to hydrate as much as possible, but their work doesn’t offer free (or extra) water to them, and many places like restaurants and cafés still don’t. 
We tell people to invest in ACs and fans, but landlords and workplaces don’t want to install any, forbid the use, or don’t cover the price of these things. It’s like heat management is still an incredibly personal thing where everyone has to feel like they are fending for themselves, investing their own money into stockpiling resources and tech, and utilizing the privilege to avoid a lot of the heat by working from home/working inside, taking time off, calling in sick and so on. More collective heat management can look like: Free water in establishments everywhere, and drinking fountains spread throughout cities, with signs pointing to the next one. Designating libraries, community centers, schools, transit hubs and big shops like huge supermarkets as cooling centers during heat waves. Keeping trees, bushes, grass etc. intact and adding more. They help keep cities cooler, together with reflective roofs and lighter pavements. Legally mandating landlords to install ACs in rental units, especially ones directly below the roof (attic/loft/penthouse apartments), and cover specific windows in protective foil or external shutters. Requiring new(er) buildings to have specific insulation that helps in summer as well as winter, ventilation strategies, ACs, etc. and updating building codes so new housing remains habitable during prolonged heat waves, even without continuous air conditioning. More shaded areas in crowded places, waiting spots (public transportation), shaded pathways between major destinations. Rollout of functioning and resilient AC in all public transportation, hospitals, schools, universities, elderly homes etc. Extending opening hours into the early morning and late evening during extreme heat, with closure inbetween (or at the bare minimum, siestas). Temperature thresholds that trigger additional protections or suspension of certain work or studies. 
Preparing railroads, normal roads and other parts of the public from the intense heat effects or making them more heat resistant; otherwise you risk bent rails, melting bitumen etc. Distributing fans or subsidizing cooling equipment where appropriate. Strengthening electrical grids to cope with increased cooling demand, subsidizing electricity costs during declared heat emergencies, expanding renewable generation to reduce the emissions associated with increased cooling needs. And likely more I forgot. Yes, people will cry that this costs soooo much money. But remember that we have no problem investing that money into wars, AI, data centers, expensive proprietary software licenses, politicians’ money schemes and making billionaires richer. Landlords need to invest the rent into the property instead of enriching themselves and getting other people to pay off their mortgage. These aren’t one-time events, it will continue to get worse. Earlier in the year, longer, higher. Many people and animals will die. Everyone has to start preparing and learning from it now, and stop buying into the bullshit that “it was hot when I was a child too, we are just complaining more!!1!”. Your government is failing you if they are not acting now, and it is intentional, as the heat affects vulnerable and powerless groups the most. Make sure you check on old, sick, disabled people and people you know who take medication that makes them more vulnerable to the sun and/or heat. For example, diuretics, beta blockers, anticholinergics, and some antidepressants and stimulants. Published 27 Jun, 2026

0 views
Kev Quirk Yesterday

3D Printers are actually very useful

I recently started getting into 3D printing, but so far I've spent most of my time getting set up and learning the ropes. I've now completed my first little project with the 3D printers and I'm really happy with the result. As a biker of many years I have a number of helmets lying around. This is because you're supposed to replace a helmet every 5 years, because the protective foam inside degrades over time. So I have a small collection of lids and nothing to really do with them. So, I decided to print myself some helmet stands and mount them in my office. There were 3 helmets I wanted to display. A reasonably well rated helmet stand on Amazon costs around £11. So for the 3 lids, I'd be looking at £33 (~$45). Instead of handing over 33 of my finest pounds to Jeff Bezos, I decided to have a nose on Maker World and found this helmet stand that was very well rated. So I downloaded the files and set my printers to work, and a day or so later I had these little beauties. They feel really solid and have no problem holding a helmet on the wall. Better yet, they only cost me around £2.50 ($3.30) each in filament (~750g of filament in total), so way cheaper than the Amazon option. Today I finally had time to mount the lids to the wall, and I think they look great! Sure, I could have pissed about making toy dragons or whatever, but I think these are a far better use of my 3D printers, and really why I bought them. I'm so glad that the printed results are good enough to be useable. I already have some ideas of things I want to create next, but I'm going to have to start familiarising myself with FreeCAD for that project. We'll see how that goes...

0 views
Ahead of AI Yesterday

Using Local Coding Agents

Many people reached out to me in the past asking about my local agent stack as well as how I set up my local agent stack. So, I thought it might be useful to put together a little tutorial on how to set up a local (coding) agent using open-source tools and open-weight LLMs. Figure 1: Overview of the local stack, that is, a coding agent harness that uses a local model hosted through an inference engine / runtime server. This article is a tutorial on setting up a production-ready coding agent with a fully local stack. We will use a locally served LLM together with a local coding harness that can read files, make edits, run commands, and verify changes as shown in the figure above. Here, we can think of the LLM as the engine that provides the reasoning and code generation. And the surrounding harness provides the operating environment that allows the LLM to do meaningful coding work in our local projects. Why local? For many coding workflows, a local setup is an interesting alternative to proprietary services such as GPT in Codex or Opus in Claude Code. The local setup is transparent, inspectable, and free to run apart from hardware and electricity costs. It also stays fully under your control, and you can modify the coding harness in any way you like. Plus, it’s a lot of fun! By the way, in case you want a bit more background information on coding agent harnesses, I covered the core components of coding agents (and building a coding agent from scratch for learning purposes) here: I have to admit that I still primarily alternate between Codex and Claude Code as my daily drivers, for now (and just to keep up with the new tooling and functions that are constantly being added). Also, the plan limits (especially for Codex) are still so generous that I haven’t had to worry about costs so far. However, I’ve been using local solutions for a while, too, to test things and because it somehow gives me joy to have and use a fully local setup (versus proprietary services). 
Either way, local solutions become more and more attractive each day. One aspect is the costs. If you have the hardware, they are practically free to run. And then there’s, of course, the privacy angle. For example, for organizing and processing my receipts, I’d be more comfortable with a local model ingesting them rather than sending the data over to OpenAI or Anthropic. (Then, if we keep in mind that Anthropic was recently throttling their flagship model’s performance for LLM research , proprietary services may become more restrictive over time, and it’s maybe a good idea to be comfortable with open-weight alternatives as a backup.) And there are many, many additional reasons and use cases like that. Your motivations for using local LLMs and coding harnesses may include: Predictable, fixed costs if you reach your subscription plan limits, and immunity to API price changes. Reproducibility; sometimes it’s nice if a model is upgraded (e.g., GPT 5.4 -> GPT 5.5 -> GPT 5.6) and it solves all your queries more reliably. However, this can also break existing workflows. Offline use in the classic airplane flight scenario with slow or no internet, or when going on a coding/writing retreat in the cabin in the woods w/o a Starlink subscription. And there are probably several others. So, in this article, we will set up and use popular harnesses like Codex and Claude Code with open-weight models and investigate whether using a model-specific harness (like Qwen-Code for Qwen3.6) brings any additional benefits. (Of course, there are many more harnesses like OpenCode, Cline, Pi, and Noumena Code, but I thought that most people already have muscle memory with either Codex or Claude Code, which makes switching to open-weight models a bit smoother). Most coding agent harnesses follow similar principles and have more or less the same features and functionality. 
However, the implementation details may differ, and certain LLMs have usually been primarily optimized for a specific harness. Of course, many open-weight LLMs like GLM 5.2, for example, would run in Claude Code, etc. However, if an LLM developer also develops a coding harness, it is somewhat safe to assume that their model is optimized for their own harness first (while also supporting others).

Here, I am primarily going to use Qwen3.6 with the Qwen-Code coding client. However, I will also go over other options for using a local LLM with other agent harnesses, for example, Claude Code, Codex, and the increasingly popular Cline, but more on that later.

The reasons why I am primarily using Qwen-Code when working with Qwen models are:

1. it is open-source, like Codex ( https://github.com/openai/codex ) but unlike Claude Code;
2. Qwen models have been specifically optimized for the Qwen-Code harness (more information below);
3. I can run both Codex (with the latest GPT model) and Qwen-Code with a local Qwen model side by side on the same machine without having to switch manually back and forth between models.

Regarding the second point in the list above, that Qwen models work better in Qwen-Code, Nvidia’s Polar: Agentic RL on Any Harness at Scale paper (May 2026) has a benchmark showing that the Qwen3.5-4B base model has the best coding performance in said Qwen-Code harness (both before and after their Polar-RL training), which I included below.

Figure 2: Qwen model performance in different coding harnesses via Polar: Agentic RL on Any Harness at Scale ( https://arxiv.org/abs/2605.24220 )

The benchmark in the table above is for an older Qwen3.5 model, and I am assuming that the latest Qwen3.6 models are even further optimized to do well in Qwen-Code specifically. However, Pi ( https://github.com/earendil-works/pi ) also seems to be a very interesting candidate that I need to play around with in the future. 
By the way, Qwen3.6 35B-A3B is about 22 GB to download, requires roughly 30-40 GB of RAM, and runs pretty swiftly on both a Mac Mini with M4 and a DGX Spark. Based on the recent benchmarks shared by Cohere earlier in June, it is currently the best local model in its size class. Figure 3: Cohere benchmark from North Mini Code report published in June ( https://huggingface.co/blog/CohereLabs/introducing-north-mini-code ) As seen above, Qwen3.6 35B-A3B dominates all but one benchmark in this size class. However, that being said, Qwen Code is a general harness and also supports other types of models. For instance, we could also connect North Mini Code or Gemma 4 in Qwen Code. Figure 4: Yes, Qwen3.6 35B-A3B is a really good model! (Via x.com/pupposandro/status/2064707907489272147/) Architecture-wise, the Qwen3.6 35B-A3B model has hybrid attention similar to Qwen3-Coder and Qwen3.5. I wrote more about it in Beyond Standard LLMs . Figure 5: Qwen3.6 architecture and fact sheet from my LLM gallery . Alternatively, if you don’t want to use Qwen3.6, Cohere’s North Mini Code is probably the most interesting, capable alternative at this size class right now. I will go over this model in the next local LLM setup section as well. Figure 6: North Mini Code architecture and fact sheet from my LLM gallery . No matter what agent harness we use (Qwen-Code, Codex, or Claude Code), we have to set up a local LLM, such as Qwen3.6 35B-A3B, first. There are several options like Ollama, LM Studio, vLLM, SGLang, MLX, etc to serve models locally. You know from my Build A Large Language Model (From Scratch) and Build A Reasoning Model (From Scratch) projects that I like to code these myself. Implementing a model from scratch has the benefits that we understand the whole stack, plus we can modify and further train and fine-tune it. 
However, here, we just look for a model serving framework that has been super optimized for inference speed and resource needs since we don’t plan to do any training or fine-tuning at this point. (We could, as an extra step, convert and import our own from-scratch fine-tuned model into these efficient serving stacks, but this is out of the scope for this article.)

For this tutorial, we will use Ollama as our efficient model serving engine because it’s relatively easy to install and use from the command line across different operating systems (although LM Studio also added a non-GUI client, but I am less familiar with it). By the way, I am not affiliated with any of the tools mentioned in this article, but one nice thing about Ollama is that they also optionally support open-weight models hosted in the cloud, including the currently strongest open-weight model, GLM 5.2, which is too large to run locally on consumer hardware. (The cloud models are not free, of course, but have similar subscription plans as ChatGPT and Claude; it’s still nice though that this option exists to conveniently test the latest state-of-the-art open-weight models “locally.”)

Anyways, setting up Ollama is pretty straightforward, and you can find the official macOS/Linux/Windows download instructions on their download page. After installing, I recommend downloading a model for a quick test run. For instance, on macOS, we can use the Ollama app to download models directly via the GUI:

Figure 7: Using the Ollama app to find and download models

Otherwise, this can be done on the command line as well via the ollama pull command, for example, with the qwen3.6:35b-mlx model tag.

By the way, the above-mentioned qwen3.6:35b-mlx is a model using Apple’s Metal performance shaders, i.e., optimized for Macs with Apple silicon chips. I highly recommend using *-mlx versions of models when working on Macs (if available).

Figure 8: Prefer the MLX version when using a Mac (with an Apple Silicon chip). 
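The command-line download step described above can also be scripted. Below is a minimal sketch, assuming the Ollama CLI is installed and on the PATH. The MLX tag is the one named in the text; the non-MLX tag `qwen3.6:35b` and the helper names `pick_model_tag` and `pull_model` are hypothetical placeholders for illustration, so check the Ollama model library for the exact tag before using it.

```python
import shutil
import subprocess

MLX_TAG = "qwen3.6:35b-mlx"  # tag named in the text above
GENERIC_TAG = "qwen3.6:35b"  # ASSUMPTION: hypothetical non-MLX tag, verify before use


def pick_model_tag(system: str, machine: str) -> str:
    """Return the MLX tag on Apple silicon Macs, otherwise the generic tag.

    `system` and `machine` are the values returned by platform.system()
    and platform.machine().
    """
    if system == "Darwin" and machine == "arm64":
        return MLX_TAG
    return GENERIC_TAG


def pull_model(tag: str) -> None:
    """Download a model through the Ollama CLI (equivalent to `ollama pull <tag>`)."""
    if shutil.which("ollama") is None:
        raise RuntimeError("ollama CLI not found on PATH; install Ollama first")
    subprocess.run(["ollama", "pull", tag], check=True)
```

Calling `pull_model(pick_model_tag(platform.system(), platform.machine()))` (after `import platform`) would then fetch the variant that matches the current machine.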
On a Linux machine, use the non-MLX version of the model instead.

Then, to make sure that it works, you can either use the GUI again or launch Ollama from the command line.

Figure 9: Running Ollama in the terminal. You can exit this session via the /bye command.

As mentioned before, the currently best alternative to this Qwen3.6 35B-A3B model is North Mini Code 1.0 of similar size.

Figure 10: North Mini Code 1.0 as an alternative to Qwen3.6 35B A3B.

Before deciding on whether to use an LLM as a local coding agent, it’s usually not a bad idea to run a quick speed and quality assessment. Here, for the speed assessment, I would look for tokens/sec performance. Additionally, I’d also make sure this stays stable for (very) long contexts, which is what we are usually dealing with during agentic coding workflows (as opposed to simpler chatbots). Of course, we don’t want the memory cost to explode either.

You could run my ollama_speed_memory_bench.py script to do a quick check. In a nutshell, it sends different prompts (ranging from 1k to 50k words) to an Ollama model and asks it to generate up to 8k tokens by default. It reports simple statistics like prefill speed from Ollama’s prompt evaluation metrics, generation speed from output-token timing, and memory use from the Ollama process plus NVIDIA GPU memory when available.

For example, to evaluate the model on macOS, if you downloaded or cloned the scripts from https://github.com/rasbt/local-coding-agent-evals , we can run the script, which takes about 5 minutes. On Linux, we can run it in the same way with the non-MLX model.

Note that this assumes that you already downloaded the respective model as explained in the previous section. Also, depending on your system, if you have less than 30 GB RAM, you may have to use a smaller model like gemma4:e2b, which uses up to about 8 GB RAM on long contexts. (Of course, there are also many smaller models, but in my experience, they make pretty bad local coding agents.) 
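To make the prefill and generation speed numbers more concrete, here is a minimal sketch of how they can be derived from a single Ollama response. This is not the ollama_speed_memory_bench.py script itself, just an illustration under a few assumptions: the server runs at Ollama's default local address, the request is non-streaming, and the helper names `generate` and `speeds` are made up for this example. The response fields (`prompt_eval_count`, `prompt_eval_duration`, `eval_count`, `eval_duration`) come from Ollama's REST API, which reports durations in nanoseconds.

```python
import json
import urllib.request

# Ollama's default local endpoint (assumes a locally running server).
OLLAMA_URL = "http://localhost:11434/api/generate"


def generate(model: str, prompt: str, max_tokens: int = 8192) -> dict:
    """Send one non-streaming generation request to a local Ollama server."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        # num_predict caps the number of generated tokens; very long prompts
        # may additionally need a larger "num_ctx" option to avoid truncation.
        "options": {"num_predict": max_tokens},
    }
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


def speeds(result: dict) -> tuple[float, float]:
    """Return (prefill tok/sec, generation tok/sec) from an Ollama response."""
    prefill = result["prompt_eval_count"] / (result["prompt_eval_duration"] / 1e9)
    generation = result["eval_count"] / (result["eval_duration"] / 1e9)
    return prefill, generation
```

Repeating `speeds(generate(model, prompt))` for prompts of increasing length gives a rough picture of whether the generation speed stays stable for long contexts. Memory use is not covered by this sketch.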
Note that for models, the RSS RAM report is not super accurate on macOS (especially for mlx model variants that utilize the Metal backend), and I suggest keeping an eye on the activity monitor’s RAM usage for Ollama during the run as well. In this case, the RAM usage fluctuated between 20 - 29 GB. Anyways, the bottom line is that for 50k contexts, the Qwen3.6 and North Mini Code models use up to 30 GB RAM and generate output with about 40 tok/sec on a recent Mac Mini and 30 tok/sec on a DGX. Below is a visual summary of the different runs. Figure 11: Quick speed comparison of the different models on different systems. Note that the macOS RAM consumption is not super accurate there. Also, note that the Qwen 35B-A3B model is faster on Mac than on the DGX Spark (which is the other way around for the Gemma 4 E2B model) thanks to the optimized MLX version. Code to reproduce: https://github.com/rasbt/local-coding-agent-evals Another interesting question is how Qwen 35B-A3B compares to the similarly-sized Cohere North Mini model? If we take similarly quantized models into account (above, I was using the Qwen3.6 default), they are pretty similar, although North Mini is perhaps slightly ahead overall, as shown below. Figure 12: Q4-quantized Qwen3.6 35B vs North Mini Code. Code to reproduce: https://github.com/rasbt/local-coding-agent-evals Anyway, the bottom line is that, in my opinion, anything faster than 20-30 tok/sec is pretty reasonable for local agent work. This is about the same speed as GPT 5.5 with “high” reasoning . In this case, both models clear the bar easily. By the way, personally, I run my agents almost exclusively on my DGX Spark because I don’t want my Mac Mini to get too hot and I want to have the RAM available for other tasks. Of course, there are always ways to optimize this more with different frameworks (other than Ollama), quantizations, MTP, and so on. 
However, Ollama is a good plug & play all-rounder with minimal setup time that connects easily to various coding agent frameworks and makes it super simple to swap and try out different models.
After checking that the model is fast enough for convenient local work, I recommend doing a quick modeling performance assessment. Sure, there are many standardized benchmarks out there we could take a look at and even run ourselves. Usually, you can find the numbers for relevant benchmarks in the model’s technical report or model hub page. I also find it useful to look at a relative comparison with other models on https://artificialanalysis.ai/models/ .
Figure 13: Benchmark from https://artificialanalysis.ai/models/ . Average performance (top), coding performance (center), agentic performance (bottom).
Based on the figure above, we can see that Qwen3.6 35B-A3B is much more capable than the Gemma 4 E4B and E2B models, for example. Note that the Artificial Analysis Intelligence Index numbers keep changing over time as they swap benchmarks and update the weighting, so there are no “absolute” numbers we could use as a reference point for deciding which model is “good enough”. Rather, I would compare a new, interesting model to a model you used before as an anchor or reference point.
Beyond standard benchmarks, I would also curate a personal set of tasks that are relevant to you to quickly check whether the model is suitable for the type of work you want it to perform. Below are the outputs of a reasoning- and code-related set of questions that also test the tool-calling capabilities of the models. Here, the model returns the tool call but doesn’t execute the code itself. For instance, we can say that gets the conceptual debugging and security-review tasks right, but still struggles with agentic judgment around “what file/action first” tasks. is usable but not fully reliable for autonomous tool use.
But a harness that constrains actions, adds retries, and maybe gives stronger project context could make it pretty usable. On the other hand, failing is a strong signal that it is less suitable for this kind of tool-use reasoning, even if it is fast. Note that the failures are not just formatting issues. It looks like it chooses the wrong tool, asks for clarification when enough context is present, etc. I would probably not use it as a coding-agent model beyond very narrow or heavily constrained tasks. Now, after this lengthy preamble setting up a local LLM, let’s get back to the main topic, the coding agent harness. As mentioned at the beginning of this article, we will use the qwen-code ( https://github.com/QwenLM/qwen-code ) harness, as Qwen models have been optimized for it. Figure 14: Next, we are trying to connect the locally served model to the coding agent harness. If you are familiar with Claude Code, it’s basically the same thing but fully open-source. However, I will also go over how to connect the local Qwen3.6 model to Codex and Claude Code in the next sections. Note that coding harnesses are much more capable than LLMs by themselves. This is where I recommend being more careful about what you are running and where. For instance, when trying new (coding) agents, I like to Do an audit of the (open-source) agent code base first. Run it on separate hardware (e.g., my DGX Spark) or a separate user account and/or virtual environment on my machine at the very least. Regarding the audit, I recommend looking for data sharing/egress and the default blast radius when it comes to file permissions, as well as some baseline robustness to prompt injection. The figure below attempts to summarize the main points. Figure 15: Practical audit checklist before running an installed coding agent harness. Similar concerns apply to the local model serving engine (e.g., Ollama) as well. 
However, coding agents require even more attention as they can directly read data from your machine and manipulate files. To do a basic audit, I recommend the following:
1. Clone the repo:
2. Ask a trusted agent you used before (like GPT 5.5 in Codex or Opus 4.8 in Claude Code) to review it with a focused prompt. Something like the following:

You are auditing ./qwen-code before I install or run the agent on my machine.
Focus only on practical local-machine risk from the installed agent and the code paths that create it:
- install scripts and package lifecycle hooks
- shell command execution by the agent
- file read/write boundaries at runtime
- secret handling and environment-variable inheritance
- how repo files, project instructions, and tool output can influence the agent
- MCP, plugin, extension, or tool integrations
- network calls and telemetry
- update mechanisms after installation
- terminal escape/output handling
- data egress and data residency
Ignoring internet downloads that are strictly required for installation, check whether the installed agent can send prompts, files, telemetry, logs, identifiers, or metadata to remote servers when I use a local model through Ollama. Ignore cloud-model configurations. Do not infer risk from the project owner alone. Identify concrete endpoints, SDKs, default providers, environment variables, config defaults, and docs that control network behavior, including any endpoints operated in foreign countries or by third-party companies. Do not do broad style review. Do not refactor.
Produce:
- high-risk findings with file/line references
- medium-risk concerns
- network/data-egress findings, including any foreign, third-party, or China-linked endpoints or defaults
- commands I should avoid running until reviewed
- settings or environment variables that reduce local-machine risk
- a short recommendation: safe to test in sandbox, safe to use, or do not run
For each item, say whether it is expected behavior for a coding agent or inherently riskier than Codex or Claude Code.

Below is a summary of the main findings (because the full report may be a bit boring and too long for this article):
1. Local execution. Qwen Code can run shell commands on our machine through its shell tool but there are strict approval controls unless permissive modes such as are enabled. This is expected for a coding agent, and it’s actually what makes it useful in practice. But of course it becomes risky if run unsandboxed or with a full environment containing secrets.
2. Data egress. Even with local Ollama, Qwen Code can send usage telemetry and metadata to Alibaba/Aliyun endpoints unless usage statistics and telemetry are disabled (more on that below). This is riskier than a local-only setup because model prompts may stay local, but session IDs, tool metadata, model info, and local base URL metadata can still leave the machine. But again, this is also common among all kinds of tools (yes, Codex and Claude do that as well).
3. File and secret boundaries. Workspace files are readable by default, while writes generally require approval and include some overwrite protections. This is good and standard agent practice.
4. Prompt injection surfaces. Repo instructions, tool output, MCP tools, extensions, and project config can influence the agent’s behavior. Prompt injection attacks can be reduced via the approval gates mentioned above.
This is normal for coding agents, but untrusted repos should be treated as hostile by default because they can steer the agent toward reading files, running commands, or sending data through approved tools. Regarding the main privacy concerns in point 2, most of it is fixable via a custom with the following contents: The setting is a tradeoff. Security fixes will not be installed automatically, but I prefer having explicit control over when updates happen instead of letting the tool pull and apply new code in the background. By the way, cline ( https://github.com/Cline/Cline ), Codex ( https://github.com/openai/codex ), and Claude Code have similar telemetry data sharing defaults that would need to be disabled explicitly. (Note that Claude Code doesn’t have an official open-source version of their codebase, which makes trusting it even trickier, and it does seem to send data to both Anthropic and Datadog.) Either way, overall, it seems Qwen-Code follows standard practices, and as of this writing, there is no particular concern that is non-standard for coding agents. If we accept the reported findings and risks (personally, I didn’t see any red flags), we can now proceed with the installation and hook up our local Qwen3.6-35B-A3B model to Qwen Code (and Codex and Claude Code in the next sections). As mentioned before, I preferably experiment with and run coding agents, which can read and edit local files, on a separate machine (in my case a DGX Spark, but it could also be a separate Mac or Linux workstation). Alternatively, I would run it in a VM or set up a separate macOS or Linux user account as a practical middle ground. (I heard from some friends that they also rent servers for that, like Linode or Heroku, for tinkering purposes. 
However, instead of the monthly hosting costs for a somewhat capable machine, I would probably rather get a relatively cheap $200-500 hardware box, or even an old retired laptop, and run a local harness and then use a stronger open-weight model hosted in the cloud via Ollama cloud models, OpenRouter, etc if you are looking for alternatives to GPT or Claude.) Anyways, let’s install Qwen-Code. The listed options include, e.g., However, running the commands above assumes that the published artifacts match the code we just reviewed in the GitHub repo. If we are extra careful/paranoid, we can also build it ourselves from the GitHub repo. Be warned, this is more manual/messier though (I recommend executing them one at a time instead of copy & pasting the whole block into the terminal): After completing the installation, we can now launch the Qwen-Code client via the qwen command from the terminal to complete the setup and connect to the locally served LLM. For this, after running the qwen command, we select “Custom Provider”, as shown below. Figure 16: Choose “Custom Provider,” which lets us connect the Ollama LLM. Ollama uses the OpenAI API standard. So, next, we follow the on-screen setup guide and choose the “OpenAI-compatible” option. Figure 17: Since Ollama follows the OpenAI API standard, we choose “OpenAI-compatible” here. Next, we need to provide the API endpoint of the running Ollama application that serves our local LLM. Usually that’s the local address by default. We enter (including the /v1) since that’s the OpenAI-compatible base URL. Figure 18: Configure Qwen Code to use Ollama’s local OpenAI-compatible endpoint, . Next, we enter as our custom provider. Figure 19: Enter as the API key placeholder for the local custom provider. Next, we can select the available models. These are the ones that we downloaded via . You can enter only a single model or multiple ones separated by commas. You can double-check the list of downloaded models via . 
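As a quick sanity check of the endpoint and model names used in the setup steps above, here is a small sketch that asks an OpenAI-compatible server for its model list. It assumes that Ollama is running on its default local address, `http://localhost:11434`, with the OpenAI-compatible API under `/v1`. That default is an assumption about your installation rather than a value taken from this article, and the helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama's default local address plus the OpenAI-compatible /v1 prefix
DEFAULT_BASE_URL = "http://localhost:11434/v1"


def models_url(base_url: str) -> str:
    """Return the model-listing URL for an OpenAI-compatible base URL."""
    return base_url.rstrip("/") + "/models"


def parse_model_ids(payload: dict) -> list[str]:
    """Extract model ids from an OpenAI-style {"data": [{"id": ...}]} payload."""
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]


def list_local_models(base_url: str = DEFAULT_BASE_URL, timeout: float = 5.0) -> list[str]:
    """Query the running server; raises URLError if nothing is listening."""
    with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
        return parse_model_ids(json.load(resp))


if __name__ == "__main__":
    # The setup wizard accepts multiple model names separated by commas
    print(",".join(list_local_models()))
```

If the call fails with a connection error, the server is most likely not running or is listening on a different address than the one entered in the setup wizard.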
By the way, you can always add more models easily later (I’ll explain after completing the setup). Figure 20: Select the local Ollama models that Qwen Code should make available through the custom provider. We are almost done! In step 5/6, we of course select “Enable thinking” mode, which will result in higher token usage but the better resulting problem-solving capabilities are worth it. Figure 21: Enable thinking mode for the local model provider. And that’s basically it. Step 6 is basically a review step that we can confirm by pressing “Enter”. Congratulations, you should now have a working fully-local LLM workflow set up. The usage is pretty much similar to Claude Code, where you can use / commands for various functionality. E.g., you can switch models via the command, as shown below. Figure 22: Use to switch models. By the way, as I mentioned before, it’s relatively easy to add new models from ollama. Once you pull a new model via , you can add it as a new entry in . Here, just copy & paste an existing entry into the file and change the “id” and “name” to that of the Ollama model name. Figure 23: We can add new ollama models by editing the config file. Here, is the name of the ollama model name, e.g., . By the way, to update the qwen-code tool once in a while, if we used the git clone & local build route, we can pull a recent GitHub snapshot and update it as follows: Now that we have a fully working, local coding agent, the question is: how well does it perform, and is it actually good enough for my tasks? Of course, there are benchmarks for this, but in my opinion, nothing beats trying it for yourself on some of your workflow. In other words, this basically means using it for a day or two to decide whether it meets your bar. I also recommend compiling a small set of tasks that reflect your common coding agent usage. 
And if you come upon a particularly challenging one when working on a given project, it may not be a bad idea to add it to this set to evaluate future models. As an example of what I mean, I shared a relatively small, simple, and general set of tasks we can use to test the agents here on GitHub: https://github.com/rasbt/local-coding-agent-evals/tree/main/agent-problem-pack . This is basically an extension of the tasks from the Local LLM Setup section. The details on how to run these are in the GitHub README: https://github.com/rasbt/local-coding-agent-evals/tree/main/agent-problem-pack#quick-start-running-benchmarks-manually . Below is the outcome for the different LLMs tested in Qwen-Code. Figure 24: Small local agent capability benchmark using Qwen-Code. Code to reproduce: https://github.com/rasbt/local-coding-agent-evals As we can see, both the Qwen3.6 and North Mini Code 35B-A3B models solve 4 out of 5 of these problems. Gemma 4 E2B fails a lot. Out of curiosity, I also added the a bit older Nemotron 3 Nano model. It has a similar size and compute performance as the aforementioned Qwen and North models, and it performs similarly well. Figure 25: Nemotron 3 Nano architecture overview from my LLM Gallery After setting up the local coding agent (and the article exceeding 5000 words), this would probably be a reasonable place to stop. However, as a bonus, I also thought it might be interesting to add brief Codex and Claude Code notes for completeness. Unfortunately, as far as I know, the Codex UI does not support non-OpenAI models, but we can use the Codex CLI to run our Ollama models. If you haven’t installed the OpenAI Codex CLI yet, you can get and install it analogously to qwen-code from their open-source GitHub directory: https://github.com/openai/codex (Yes, the Codex CLI is open source!) I will spare you the lengthy listing of the commands and recommend checking the repo’s README instead for the official instructions. 
(Cloning the repo and running an audit similar to qwen-code is not a bad idea here, as well.) Then, once installed, there are multiple ways to enable local model use. In my opinion, the most convenient way is to set up a separate config (inside the existing folder) with some default options: Figure 26: Set up a separate Ollama profile for Codex for convenience. Then, we can still use to launch the regular “Codex with GPT 5.5” mode and use our Ollama model via . Figure 27: Launch Codex using a local Ollama model. When rerunning the test cases from the Agent Capability Assessment section, to my surprise, Qwen3.6 does actually perform better via Codex compared to its “native” Qwen-Code coding harness, as shown below. Figure 28: Small local agent capability benchmark in Codex. Even though this is just a small set of benchmarks, it suggests that using Codex as the universal coding agent harness may not be such a bad idea after all. Of course, there is also the popular Claude Code agent harness that we could use as a harness around our local LLMs. While very popular and capable, this is probably my least favorite option for local setups because the codebase is proprietary. That also means we cannot readily inspect and/or disable Anthropic’s data logging practices. To set it up, if you don’t have Claude Code already installed on your machine, I suggest checking the official docs for recommended installation commands: https://code.claude.com/docs/en/quickstart . Claude Code itself does not expose the same local-provider configuration path as Codex. However, Ollama provides an integration via : https://docs.ollama.com/integrations/claude-code I.e., we can execute to run the Claude Code harness with an Ollama model. By the way, this also works for codex via , but I personally prefer the route we discussed earlier, as it gives me a bit more insight and control about how things works etc. Figure 29: Claude Code with a local Qwen3.6 model through Ollama. 
However, as a user, it feels like Claude Code takes much longer to come up with a solution. It probably has a much higher token usage. So, below, I additionally looked at the token usage of all three harnesses. As we can see, Claude Code uses by far the most tokens on average, and Codex the least.
Figure 30: Average token usage of the three harnesses for different LLMs. Code to reproduce: https://github.com/rasbt/local-coding-agent-evals
When it comes to the little agent capability assessment benchmark, the Qwen and North Mini Code models also get 5/5, and even the small Gemma 4 model does ok! Interestingly, we can also see that the token usage is largely driven by the harness, not the LLM itself. I.e., the three LLMs that are capable of solving (almost) all 5 tasks all use roughly the same number of tokens (e.g., Qwen3.6 uses roughly the same number of tokens as North Mini Code and Nemotron 3 Nano when used inside Claude Code). Only Gemma 4 uses fewer tokens, but it also fails almost all tasks, likely because of insufficient tool-calling capabilities that cause the tasks to stop early. For reference, below is again the summarized task-success rate.
Figure 31: Summarized task success rates.
Anyway, the takeaway here is that if more tokens help the model-harness combination to solve more (and more complex) problems, great! But if two harnesses have an equal task success rate, the one that uses 50% fewer tokens (e.g., Codex over Claude Code) is a huge win, because tasks will run roughly twice as fast. However, the big caveat here is that task correctness is a necessary criterion, but it doesn’t measure code quality and readability, which are hard to assess automatically.
PS: I tried to analyze why Claude Code uses more tokens, and it seems that the difference mainly comes from input tokens rather than output tokens. In other words, Claude is not writing twice as much.
The logs suggest that Claude is repeatedly feeding more context back into the model across turns, including previous messages, tool calls, command outputs, and file contents. For example, one Claude run used about 578k input tokens but only about 4.5k output tokens across 25 turns. So the likely explanation is that Claude’s harness accumulates or accounts for a larger prompt-side history during multi-step agent runs. So far, all the setups we discussed assumed that we were running the local LLM on the same machine as the coding harness. However, what if we developed some trust in the coding agent harness and want to use it on our main Mac while the model itself is hosted on a different machine, e.g., a DGX Spark? In my opinion, the best (or most convenient) setup is an SSH tunnel from the Mac to the DGX. First, I suggest quitting Ollama on the Mac or changing the to something else below. Assuming we quit the Ollama app on the Mac, check that the following returns an empty output to indicate that Ollama is not available: Then run the following command on that Mac in a terminal window on the Mac side: That command means that we open an SSH connection to as user , which you need to adjust to whatever your username and machine name are. Then, the command forwards the Mac’s local port to on the DGX because of . Note that this is the Ollama address. The terminal running will look like it is hanging. That is normal. Keep it open while you use Qwen Code, Codex, or Claude Code. Press to stop the tunnel. So after it is running, use this on your Mac to see if the Mac can indeed access the ollama models from the DGX: If that returns the DGX models, your Mac tools can use the DGX Ollama server as if it were local. Then, just use Qwen Code and Codex just like above. For Claude via , the key is that the Mac-side command must see the tunneled endpoint. If needed: We focused on Qwen Code, Codex, and Claude Code because they are the most direct fit for coding-agent workflows. 
OpenClaw and Hermes are also capable, but they are broader agent harnesses. They are better suited when you want one agent to coordinate across tools, apps, browsers, terminals, and longer-running workflows. For coding work, I recommend starting with Qwen Code, Codex, or Claude Code first (and there are also many other interesting coding harnesses like OpenCode, Cline, Pi, and Noumena Code). And I would treat OpenClaw and Hermes as interesting follow-up options for things beyond coding rather than the first baseline for this local coding-agent setup. This was a long article with lots of information and configuration. If there are a few main takeaways, I’d say that it’s not the mechanistic setup pipeline but rather the considerations when running coding agents locally. That is, the most important part is not getting one specific tool installed, but understanding the model-serving layer, the agent harness, the permission model, and how to evaluate whether the setup actually solves coding tasks reliably. Of course, GPT 5.5 and Opus 4.8 are currently better than smaller open-weight models that run on a Mac or DGX Spark. But the newer Mixture-of-Experts models in the 30-35B range (such as Qwen3.6, North Mini Code, and Nemotron 3 Nano) are all very, very capable and really sufficient for a lot of tasks. And yes, they run with the same token speed as GPT 5.5 through a Pro subscription, so it should not necessarily slow down your workflows. The main consideration when setting up local agents, besides the model itself, is also which harness we want to use. The common perception is that models are usually optimized more for a specific harness than others (e.g., Qwen3.6 may work better in Qwen Code than Claude Code, for example). Based on the small agent assessment, this may not necessarily be true, though (this is only a very small benchmark, so take it with a big grain of salt). 
So, if you are more comfortable with a different harness that you have a lot of muscle memory with, like Codex and Claude Code, maybe it’s not a bad idea to just stick the model into that one and give it a try! Anyways, I hope the article was useful, and it got you interested in doing some tinkering with open-weight models. They are becoming more capable by the day, and it’s for some inexplicable reason just fun to run models locally. If you want to try the benchmarks yourself, the code and small evaluation tasks used in this article are available here: https://github.com/rasbt/local-coding-agent-evals Also, my Build a Reasoning Model (From Scratch) book has now gone to print and started shipping. I wanted to post a picture, but it will be 3 more days until it arrives. Build a Reasoning Model (From Scratch) If you liked my previous Build a Large Language Model (From Scratch) book, this is essentially a sequel implementing inference-time scaling techniques and reinforcement learning algorithms from scratch. And if you want to support future long-form articles like this one, consider becoming a paid subscriber . It helps me keep writing these independent deep dives and sharing the accompanying code, figures, and experiments. Figure 1: Overview of the local stack, that is, a coding agent harness that uses a local model hosted through an inference engine / runtime server. This article is a tutorial on setting up a production-ready coding agent with a fully local stack. We will use a locally served LLM together with a local coding harness that can read files, make edits, run commands, and verify changes as shown in the figure above. Here, we can think of the LLM as the engine that provides the reasoning and code generation. And the surrounding harness provides the operating environment that allows the LLM to do meaningful coding work in our local projects. Why local? 
For many coding workflows, a local setup is an interesting alternative to proprietary services such as GPT in Codex or Opus in Claude Code. The local setup is transparent, inspectable, and free to run apart from hardware and electricity costs. It also stays fully under your control, and you can modify the coding harness in any way you like. Plus, it’s a lot of fun! By the way, in case you want a bit more background information on coding agent harnesses, I covered the core components of coding agents (and building a coding agent from scratch for learning purposes) here: 1. Intro I have to admit that I still primarily alternate between Codex and Claude Code as my daily drivers, for now (and just to keep up with the new tooling and functions that are constantly being added). Also, the plan limits (especially for Codex) are still so generous that I haven’t had to worry about costs so far. However, I’ve been using local solutions for a while, too, to test things and because it somehow gives me joy to have and use a fully local setup (versus proprietary services). Either way, local solutions become more and more attractive each day. One aspect is the costs. If you have the hardware, they are practically free to run. And then there’s, of course, the privacy angle. For example, for organizing and processing my receipts, I’d be more comfortable with a local model ingesting them rather than sending the data over to OpenAI or Anthropic. (Then, if we keep in mind that Anthropic was recently throttling their flagship model’s performance for LLM research , proprietary services may become more restrictive over time, and it’s maybe a good idea to be comfortable with open-weight alternatives as a backup.) And there are many, many additional reasons and use cases like that. Your motivations for using local LLMs and coding harnesses may include: Predictable, fixed costs if you reach your subscription plan limits, and immunity to API price changes. 
Reproducibility; sometimes it’s nice if a model is upgraded (e.g., GPT 5.4 -> GPT 5.5 -> GPT 5.6) and it solves all your queries more reliably. However, this can also break existing workflows. Offline use in the classic airplane flight scenario with slow or no internet, or when going on a coding/writing retreat in the cabin in the woods w/o a Starlink subscription. it is open-source, like Codex ( https://github.com/openai/codex ) but unlike Claude Code; Qwen models have been specifically optimized for the Qwen-Code harness (more information below); I can run both Codex (with the latest GPT model) and Qwen-Code with a local Qwen model side by side on the same machine without having to switch manually back and forth between models. Figure 3: Cohere benchmark from North Mini Code report published in June ( https://huggingface.co/blog/CohereLabs/introducing-north-mini-code ) As seen above, Qwen3.6 35B-A3B dominates all but one benchmark in this size class. However, that being said, Qwen Code is a general harness and also supports other types of models. For instance, we could also connect North Mini Code or Gemma 4 in Qwen Code. Figure 4: Yes, Qwen3.6 35B-A3B is a really good model! (Via x.com/pupposandro/status/2064707907489272147/) Architecture-wise, the Qwen3.6 35B-A3B model has hybrid attention similar to Qwen3-Coder and Qwen3.5. I wrote more about it in Beyond Standard LLMs . Figure 5: Qwen3.6 architecture and fact sheet from my LLM gallery . Alternatively, if you don’t want to use Qwen3.6, Cohere’s North Mini Code is probably the most interesting, capable alternative at this size class right now. I will go over this model in the next local LLM setup section as well. Figure 6: North Mini Code architecture and fact sheet from my LLM gallery . 3. Local LLM Setup No matter what agent harness we use (Qwen-Code, Codex, or Claude Code), we have to set up a local LLM, such as Qwen3.6 35B-A3B, first. 
There are several options like Ollama, LM Studio, vLLM, SGLang, MLX, etc to serve models locally. You know from my Build A Large Language Model (From Scratch) and Build A Reasoning Model (From Scratch) projects that I like to code these myself. Implementing a model from scratch has the benefits that we understand the whole stack, plus we can modify and further train and fine-tune it. However, here, we just look for a model serving framework that has been super optimized for inference speed and resource needs since we don’t plan to do any training or fine-tuning at this point. (We could, as an extra step, convert and import our own from-scratch fine-tuned model into these efficient serving stacks, but this is out of the scope for this article.) For this tutorial, we will use Ollama as our efficient model serving engine because it’s relatively easy to install and use from the command line across different operating systems (although LM Studio also added a non-GUI client, but I am less familiar with it). By the way, I am not affiliated with any of the tools mentioned in this article, but one nice thing about Ollama is that they also optionally support open-weight models hosted in the cloud, including the currently strongest open-weight model, GLM 5.2, which is too large to run locally on consumer hardware. (The cloud models are not free, of course, but have similar subscription plans as ChatGPT and Claude; it’s still nice though that this option exists to conveniently test the latest state-of-the-art open-weight models “locally.”) Anyways, setting up Ollama is pretty straightforward, and you can find the official macOS/Linux/Windows download instructions on their download page. After installing, I recommend downloading a model for a quick test run. 
For instance, on macOS, we can use the ollama app to download models directly via the GUI: Figure 7: Using the Ollama app to find and download models Otherwise, this can be done on the command line as well via By the way, the above-mentioned qwen3.6:35b-mlx is a model using Apple’s Metal performance shaders, i.e., optimized for Macs with Apple silicon chips. I highly recommend using *-mlx versions of models working on Macs (if available). Figure 8: Prefer the MLX version when using a Mac (with an Apple Silicon chip). On a Linux machine, use the non-MLX version: Then, to make sure that it works, you can either use the GUI again or launch Ollama from the command line. Figure 9: Running Ollama in the terminal. You can exit this session via the command. As mentioned before, the currently best alternative to this Qwen3.6 35B-A3B model is North Mini Code 1.0 of similar size. Figure 10: North Mini Code 1.0 as an alternative to Qwen3.6 35B A3B. 4. Simple Speed Performance Assessment Before deciding on whether to use an LLM as a local coding agent, it’s usually not a bad idea to run a quick speed and quality assessment. Here, for the speed assessment, I would look for tokens/sec performance. Additionally, I’d also make sure this stays stable for (very) long contexts, which is what we are usually dealing with during agentic coding workflows (as opposed to simpler chatbots). Of course, we also don’t want the memory cost to explode either. You could run my ollama_speed_memory_bench.py script to do a quick check. In a nutshell, it sends different prompts (ranging from 1k to 50k words) to an Ollama model and asks it to generate up to 8k tokens by default. It reports simple statistics like prefill speed from Ollama’s prompt evaluation metrics, generation speed from output-token timing, and memory use from the Ollama process plus NVIDIA GPU memory when available. 
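As a rough illustration of the prefill and generation speeds mentioned above, here is a minimal sketch (not the author's benchmark script) that derives tokens/sec from the timing fields Ollama returns with a generate response. It assumes the `prompt_eval_count`, `prompt_eval_duration`, `eval_count`, and `eval_duration` fields, with durations in nanoseconds. The helper names, the 20 tok/sec default, and the sample numbers are hypothetical.

```python
def tokens_per_second(count: int, duration_ns: int) -> float:
    """Convert a token count and a duration in nanoseconds to tokens/sec."""
    if duration_ns <= 0:
        return 0.0
    return count / (duration_ns / 1e9)


def summarize_speed(response: dict, min_gen_tps: float = 20.0) -> dict:
    """Summarize prefill and generation speed from an Ollama-style response dict.

    min_gen_tps is an assumed "fast enough" bar, not an official threshold.
    """
    prefill = tokens_per_second(
        response.get("prompt_eval_count", 0),
        response.get("prompt_eval_duration", 0),
    )
    generation = tokens_per_second(
        response.get("eval_count", 0),
        response.get("eval_duration", 0),
    )
    return {
        "prefill_tps": prefill,
        "generation_tps": generation,
        "fast_enough": generation >= min_gen_tps,
    }


# Hypothetical numbers for illustration only (not a measurement).
example = {
    "prompt_eval_count": 50_000,
    "prompt_eval_duration": 100_000_000_000,  # 100 seconds in nanoseconds
    "eval_count": 8_000,
    "eval_duration": 200_000_000_000,  # 200 seconds in nanoseconds
}
summary = summarize_speed(example)
print(summary)
```

Keeping the conversion in a small pure function makes it easy to reuse the same summary for several context lengths and compare whether generation speed stays stable as the prompt grows.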
For example, to evaluate the model on macOS, if you downloaded or cloned the scripts from https://github.com/rasbt/local-coding-agent-evals , we can run the following, which takes about 5 minutes: On Linux, we can run: Note that this assumes that you already downloaded the respective model as explained in the previous section. Also, depending on your system, if you have less than 30 GB RAM, you may have to use a smaller model like gemma4:e2b, which uses up to about 8 GB RAM on long contexts. (Of course, there are also many smaller models, but in my experience, they make pretty bad local coding agents.) Note that the RSS RAM report is not super accurate on macOS (especially for mlx model variants that utilize the Metal backend), and I suggest keeping an eye on the activity monitor’s RAM usage for Ollama during the run as well. In this case, the RAM usage fluctuated between 20 and 29 GB. Anyways, the bottom line is that for 50k contexts, the Qwen3.6 and North Mini Code models use up to 30 GB RAM and generate output at about 40 tok/sec on a recent Mac Mini and 30 tok/sec on a DGX Spark. Below is a visual summary of the different runs. Figure 11: Quick speed comparison of the different models on different systems. Note that the macOS RAM consumption is not super accurate there. Also, note that the Qwen 35B-A3B model is faster on Mac than on the DGX Spark (which is the other way around for the Gemma 4 E2B model) thanks to the optimized MLX version. Code to reproduce: https://github.com/rasbt/local-coding-agent-evals Another interesting question is how Qwen 35B-A3B compares to the similarly-sized Cohere North Mini model. If we take similarly quantized models into account (above, I was using the Qwen3.6 default), they are pretty similar, although North Mini is perhaps slightly ahead overall, as shown below. Figure 12: Q4-quantized Qwen3.6 35B vs North Mini Code.
Code to reproduce: https://github.com/rasbt/local-coding-agent-evals Anyway, the bottom line is that, in my opinion, anything faster than 20-30 tok/sec is pretty reasonable for local agent work. This is about the same speed as GPT 5.5 with “high” reasoning. In this case, both models clear the bar easily. By the way, personally, I run my agents almost exclusively on my DGX Spark because I don’t want my Mac Mini to get too hot and I want to have the RAM available for other tasks. Of course, there are always ways to optimize this more with different frameworks (other than Ollama), quantizations, MTP, and so on. However, Ollama is a good plug & play all-rounder with minimal setup time that connects easily to various coding agent frameworks and where it’s super simple to swap and try out different models. 5. Simple Benchmark Performance Assessment After checking that the model is fast enough for convenient local work, I recommend doing a quick modeling performance assessment. Sure, there are many standardized benchmarks out there we could take a look at and even run ourselves. Usually, you can find the numbers for relevant benchmarks in the model’s technical report or model hub page. I also find it useful to look at a relative comparison with other models on https://artificialanalysis.ai/models/ . Figure 13: Benchmark from https://artificialanalysis.ai/models/ . Average performance (top), coding performance (center), agentic performance (bottom). Based on the figure above, we can see that Qwen3.6 35B-A3B is much more capable than the Gemma 4 E4B and E2B models, for example. Note that the Artificial Analysis Intelligence Index numbers keep changing over time as they swap benchmarks and update the weighting, so there are no “absolute” numbers we could use as a reference point for deciding which model is “good enough”. Rather, I would compare a new, interesting model to a model you used before as an anchor or reference point.
Beyond standard benchmarks, I would also curate a personal set of tasks that are relevant to you to do a quick check whether this model is even suitable for any type of work that you might want it to perform. Below are the outputs of a reasoning- and code-related set of questions that also test the tool calling capabilities of the models. Here, the model returns the tool call but doesn’t execute the code itself. For instance, we can say that gets the conceptual debugging and security-review tasks right, but still struggles with agentic judgment around “what file/action first” tasks. is usable but not fully reliable for autonomous tool use. But a harness that constrains actions, adds retries, and maybe gives stronger project context could make it pretty usable. On the other hand, failing is a strong signal that it is less suitable for this kind of tool-use reasoning, even if it is fast. Note that the failures are not just formatting issues. It looks like it chooses the wrong tool, asks for clarification when enough context is present, etc. I would probably not use it as a coding-agent model beyond very narrow or heavily constrained tasks. 6. Agent Code Base Audit Now, after this lengthy preamble setting up a local LLM, let’s get back to the main topic, the coding agent harness. As mentioned at the beginning of this article, we will use the qwen-code ( https://github.com/QwenLM/qwen-code ) harness, as Qwen models have been optimized for it. Figure 14: Next, we are trying to connect the locally served model to the coding agent harness. If you are familiar with Claude Code, it’s basically the same thing but fully open-source. However, I will also go over how to connect the local Qwen3.6 model to Codex and Claude Code in the next sections. Note that coding harnesses are much more capable than LLMs by themselves. This is where I recommend being more careful about what you are running and where. 
For instance, when trying new (coding) agents, I like to Do an audit of the (open-source) agent code base first. Run it on separate hardware (e.g., my DGX Spark) or a separate user account and/or virtual environment on my machine at the very least. Figure 15: Practical audit checklist before running an installed coding agent harness. Similar concerns apply to the local model serving engine (e.g., Ollama) as well. However, coding agents require even more attention as they can directly read data from your machine and manipulate files. To do a basic audit, I recommend the following: Clone the repo: Ask a trusted agent you used before (like GPT 5.5 in Codex or Opus 4.8 in Claude Code) to review it with a focused prompt. Something like the following: install scripts and package lifecycle hooks shell command execution by the agent file read/write boundaries at runtime secret handling and environment-variable inheritance how repo files, project instructions, and tool output can influence the agent MCP, plugin, extension, or tool integrations network calls and telemetry update mechanisms after installation terminal escape/output handling data egress and data residency high-risk findings with file/line references medium-risk concerns network/data-egress findings, including any foreign, third-party, or China-linked endpoints or defaults commands I should avoid running until reviewed settings or environment variables that reduce local-machine risk a short recommendation: safe to test in sandbox, safe to use, or do not run Local execution Qwen Code can run shell commands on our machine through its shell tool but there are strict approval controls unless permissive modes such as are enabled. This is expected for a coding agent, and it’s actually what makes it useful in practice. But of course it becomes risky if run unsandboxed or with a full environment containing secrets. 
Data egress Even with local Ollama, Qwen Code can send usage telemetry and metadata to Alibaba/Aliyun endpoints unless usage statistics and telemetry are disabled (more on that below). This is riskier than a local-only setup because model prompts may stay local, but session IDs, tool metadata, model info, and local base URL metadata can still leave the machine. But again, this is also common among all kinds of tools (yes, Codex and Claude do that as well). File and secret boundaries Workspace files are readable by default, while writes generally require approval and include some overwrite protections. This is good and standard agent practice. Prompt injection surfaces Repo instructions, tool output, MCP tools, extensions, and project config can influence the agent’s behavior. Prompt injection attacks can be reduced via the approval gates mentioned above. This is normal for coding agents, but untrusted repos should be treated as hostile by default because they can steer the agent toward reading files, running commands, or sending data through approved tools.
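One of the risks named above, running an agent with a full environment containing secrets, can be reduced by launching the harness with a deliberately small environment. The following is a minimal sketch under stated assumptions: the allowlist, the secret-name markers, and the `scrubbed_env` helper are all hypothetical and not part of Qwen Code, Codex, or Claude Code. The launch command in the final comment is a placeholder.

```python
import os

# Hypothetical allowlist: only these variables are passed on to the agent.
ALLOWED_VARS = ("PATH", "HOME", "LANG", "TERM", "SHELL", "USER")

# Hypothetical markers: names containing these are dropped even if allowlisted.
SECRET_MARKERS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "CREDENTIAL")


def scrubbed_env(env, allowed=ALLOWED_VARS, markers=SECRET_MARKERS):
    """Return a copy of env restricted to allowlisted, non-secret-looking names."""
    clean = {}
    for name, value in env.items():
        if name not in allowed:
            continue  # not explicitly allowed, so it is not inherited
        if any(marker in name.upper() for marker in markers):
            continue  # defensive second check in case the allowlist grows
        clean[name] = value
    return clean


# Illustration with a made-up environment (no real secrets involved).
demo = {"PATH": "/usr/bin", "HOME": "/home/me", "OPENAI_API_KEY": "not-a-real-key"}
print(scrubbed_env(demo))

# To launch a harness with the reduced environment, something like this could work:
# subprocess.run(["your-agent-command"], env=scrubbed_env(os.environ))
```

An allowlist is used instead of a blocklist because it fails closed: a newly added secret variable is excluded by default. This complements, but does not replace, running the agent on separate hardware or under a separate user account.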

iDiallo Yesterday

All Chinese Models Will Be Illegal in 3... 2... 1...

The Washington Post reported that the US government will decide who can use state-of-the-art LLMs . After the ban of Fable and the limitations coming to ChatGPT 5.6, what's next? My bet is Chinese models. For all of Anthropic's doomsaying and propping up of their secret model Mythos, several open-weight models have proven capable of similar feats, and at a fraction of the cost. DeepSeek rocked the AI world in December 2024 with their initial release, nearly sending shockwaves through American stock markets. Last year, I looked into getting a BYD electric car. At the price they were selling for, I figured that even with a 100% tariff slapped on top, it would still be a bargain. Then I discovered that not only is there a steep import tariff, you simply cannot register the car in the United States. The car itself is illegal. According to reviews from people who actually own one, it's a fantastic vehicle that would outcompete most cars on the US market. Because of that, the US simply banned it. So what does this mean for large language models? If we're now told that state-of-the-art LLMs are too dangerous for the general public, what happens to Chinese models that are equally powerful? People will start flocking to DeepSeek and zAI. The quality matches OpenAI and Anthropic, the models are open-weight, and the cost is dramatically lower. The logical next step, if you're a DC lobbyist on retainer for a San Francisco AI lab, is to ban them. We don't live in rational times. The only path to an IPO for Anthropic and OpenAI is to kick the ladder out from under everyone else and get Washington to call it "safety policy." Download the models while you still can, because once the regulation drops, owning a local copy of DeepSeek might just make you a dissident.

Unsung Yesterday

Noise as information and information as noise

In 1982, the videogame Yars’ Revenge for the Atari 2600 needed to show a “neutral zone” in the middle of the screen. The console was so primitive – an entire great book was written about this – that it didn’t have any video memory. Any cheap effect would do, even random noise… but something as simple as generating noise was also too much for the underpowered system. So the creator of the game decided to do something that in any other situation would mean at the very least trouble, if not a downright security disaster. He crossed the wires and output on screen… the game’s own source code: The source code looked noisy enough, and the problem was solved. (Somewhat recently, Retro Game Mechanics Explained analyzed it carefully in this YouTube video, to make sure it’s not just a myth.) A similar approach was used in a Nintendo GameCube game Metroid Prime, at a moment when the protagonist’s visor needed to appear disrupted. It was two decades later, but the team still bounced off of hardware limitations, this time around memory: The GameCube only has 24MB of RAM, so every texture has to be carefully considered. If we used a low resolution texture (64x64) to save memory the “static” would be blurry and not crisp. One engineer on the team came up with a great idea: what if we just use the memory holding the Metroid Prime code itself! We quickly tried it out and it looked amazing. When you see Samus’s visor affected by electrical “noise” in game, you’re actually seeing the bits and bytes of the Metroid Prime software code itself being rendered on the screen. Turns out machine code is sufficiently random to work great as a static noise texture! 
This is how it looked: A few years later, in 2008, people working on Xbox 360 were testing a new interface for their entire console. It was called NXE – New Xbox Experience – and in the bottom-right corner it showed delightful ripples: …or, not just delightful. While NXE was tested internally, the ripples actually encoded the serial number of the console, to prevent leaks . Apparently, it was built specifically so that Microsoft only needed just two images to find out the entire serial number. A less surreptitious version of this idea exists today – for example, setting up a new Apple Watch shows a pretty pattern… …that also happens to encode enough information to identify the specific one watch. It really appears to be nothing more than an obfuscated QR Code, and “boy, have they patented it .” I know concealing a message inside another message is called steganography . I don’t think all of these fall under that umbrella, and I don’t even know all the above can be called “hacks.” I just thought they were interesting examples of information masquerading as noise, and noise pretending to be information. #games #graphics #hacks #security #youtube


Premium: Notes From The Bubble, Volume 1

It’s been an incredibly long few weeks, and as a result my previously-planned Hater’s Guide just isn’t possible within what little time I have left in this week, which is why I’m starting an ongoing series — Notes From The Bubble — where I’m going to dig into the various stories that have stood out to me in the last few weeks and what they mean for the greater tech ecosystem. It’ll be my weapon of choice going forward for the (few) weeks where a greater narrative is taking longer to pull together than usual. I also think it’s time for something a little more light-hearted after a few hundred thousand words of deeply-researched financial nightmare fuel. As serious as the tech industry’s descent into cargo cultism has become, it’s really important to laugh at how disordered and goofy everybody has become as they realize that we’re flat out of hypergrowth ideas . Every time you see something stupid, desperate, ridiculous or disconnected from reality, know it’s a symptom of the greater fear that AI isn’t the next big thing, and that everything is an attempt to put off accepting that truth or, alternatively, create another hype cycle so we can avoid talking about it. I know this all sounds a little reductive, but look at the current state of the tech industry. Meta is creating a Polymarket competitor . Snap is launching its third generation of AR glasses that nobody wants , I assume to compete with Meta’s AI glasses that are exclusively owned by influencers and people that should be banned from public restrooms. Microsoft has gone from loving OpenAI to loving Anthropic to loving open source LLMs and decrying the idea that any one company could control the entire AI ecosystem, somehow missing that Microsoft is the largest AI infrastructure provider in the world and is the reason that this industry exists. Google invested $75 million in movie studio A24 as part of some sort of nebulous AI partnership that will likely result in very little actually happening.  
Oh, and you can now watch Instagram on your TV . This is the modern tech industry: a series of cobbled-together ideas pushed out by also-rans with massive monopolies and talent suffocated by executives that haven’t had a human experience in decades. Can you imagine Satya Nadella or Mark Zuckerberg buying something from a hardware store? Do you think they know how to use a vending machine? When did any of these people last pay a bill, or worry about anything other than shareholder value and stock-based compensation? How often do you think Sundar Pichai actually uses Google, Google Docs, or any other products blighted with a Gemini pop-up?  Today’s newsletter will be a longer-form column, a series of thoughts on the current state of the tech industry. Welcome…to Notes From The Bubble.

Stratechery 2 days ago

2026.26: Summer Vibes

Welcome back to This Week in Stratechery! As a reminder, each week, every Friday, we’re sending out this overview of content in the Stratechery bundle; highlighted links are free for everyone. Additionally, you have complete control over what we send to you. If you don’t want to receive This Week in Stratechery emails (there is no podcast), please uncheck the box in your delivery settings. On that note, here were a few of our favorites this week. This week’s Stratechery video is on Anthropic’s Safety Superpower. A Vibe Coding Adventure. It is thrilling to be an analyst in the age of AI, particularly because the questions seem so weighty. Are software companies doomed? Will white-collar work exist in a decade? Might chip policy lead to war in the Taiwan Strait? All valid! And, at the same time, fretting about the future can foreclose an appreciation of how incredibly awesome this technology is, and that the possibilities really are endless. You can do anything — even organize your garage. That might sound silly, but technology, for all of its importance, is also fun, and I’m having a blast. — Ben Thompson Apple in Europe (but not Siri AI). It was a footnote to Apple’s announcements at WWDC two weeks ago, but as expected, the now-fully-functional Apple Intelligence products — aka Siri AI — will not be released in Europe because of the company’s ongoing battle with European regulators over the Digital Markets Act. On Dithering Tuesday, Ben and Gruber had a great 15-minute discussion about how maddening the situation continues to be, but I also appreciated the end of Ben’s Daily Update on Tuesday, which covered the same topic and explained why Apple’s own policies may well be what creates the long-term competitive changes the EU hopes to see. — Andrew Sharp A Midsummer Mailbag on Sharp Tech. Every time a major holiday approaches, we try to celebrate on Sharp Tech with an extended mailbag that, thanks to the listeners, tends to be a lot of fun. 
Ben and I did that again for this week’s episode , and in addition to thoughts on the future of the memory chip market and more of Ben’s experience with vibe coding, we hit questions on our daily caffeine intake, Sam Altman’s PR strategy, data centers in the ocean, and how to improve international soccer. Come for both substance and pre-vacation goofiness, and whether you’re traveling next week or not, happy 4th of July!  — AS Apple Price Increases, Apple Intelligence and the E.U. — Apple is (finally) raising prices, but they’re not shipping Siri AI to the E.U. Memory Chips and China, Microsoft and Chinese Models — The big three memory makers may come to regret opening up the door to Chinese memory makers; Microsoft, meanwhile, is very incentivized to use Chinese models. My Vibe Coding Adventure, The App and the Experience, Ten Takeaways — My experience and reflections on vibe coding an app that I plan on actually using regularly. An Interview with Figma CEO Dylan Field About Design and AI — An interview with Figma CEO Dylan Field about building Figma, and why he believes AI gives the company a tailwind. Hopes, Fears, and the Wizards — A window into Washington Wizards fandom during a very big week, after a very long decade. No Siri for EU Price Hikes Embedded Memories: The Next Generation Party Building and Xi’s Dominance; Memory Chips and ASML Accusations; Germany’s Puzzling Push for Plaza Accords Draft Week Winners and Losers, Miami Gets Giannis and Boston Gets Awkward, Micah Nori and a Blazers Experiment A Summer Break Mailbag: Memory Mania, Vibe Coding, Mafia PR, Caffeine Intake, Garages, and How to Fix Soccer


New iPad

After my last post , I pulled the trigger and went with an iPad Pro 11” with Apple Pencil Pro and Magic Keyboard case. Thankfully I got it the night before the massive Apple price hikes (although it still cost way too much). I gotta say, I love this thing! Obviously it’s a huge upgrade, I jumped forward 6 years in tech from my last iPad. The form factor is much nicer as well, the 12.9” was simply too big. 11” is perfect for getting work done, sketching, gaming and using it as an e-reader. I’m planning to sell off my Kindle Oasis and Supernote Nomad, the new iPad has easily replaced both. In addition to the iPad, I super splurged and grabbed a new lens for my Sony camera. Both purchases are in preparation for our trip to China in August. My goal is to pack light, since we’ll be traveling with two kids. The iPad replaces the need for a computer + e-reader + game console (hey, it’s a long trip)! The camera lens is significantly smaller and less bulky than my other lenses, increasing the likelihood I’ll carry the camera and snap more photos. I’ve already tested out photo editing on the iPad with Pixelmator Pro and the RAW files from my Sony. The experience is excellent, especially with the Apple Pencil in the mix. The M5 processor rips through any task I throw at it (it’s funny my iPad is now significantly more powerful than my MacBook Pro). Outside of our trip, I expect my traditional computers (desktop + laptops) will see a lot less usage. At this stage in my life, the iPad does 90% of what I need. For example, my entire blog publishing flow is now possible on this tablet. I can connect my SD card, edit photos with Pixelmator Pro, write the post and upload a draft with iA Writer, then attach the photos and publish via the Micro Blog website (yes, I changed again in preparation for the trip). I’m excited to use this setup in the “field”. I’ll have to find a nice cafe in Baotou to write and edit photos from 😜.

Ankur Sethi 2 days ago

Your analytics are lying to you

Alistair Davidson writes about migrating a form-heavy web application from a React SPA to a traditional server-rendered HTML-first website. The entire article is worth reading, but I want to draw attention to this bit about analytics (emphasis mine):

The results? When we launched, the number of people completing the form doubled. The analytics people didn’t even know where these users were coming from. Of course, your javascript-based analytics package doesn’t see the users you are bouncing because of javascript failures. It was a flood! We also saw my “keep a backend session, never lose user data” approach pay off. In one case, someone completed a form a month after starting it.

Web analytics are fragile. They fail in so many ways that making product decisions based wholly on your Google Analytics or Plausible data is folly of the highest degree. Here's a subset of all the reasons your analytics package undercounts or miscounts visitors:

Network errors prevent your analytics script from loading.
Ad-blockers and tracking prevention block your script from loading (enabled by default on many browsers today).
A JavaScript error in an unrelated part of the page prevents the analytics script from working correctly.
The user loses network connectivity before the analytics script can send data to the server.
The user gets impatient and bounces off your website before the page can load fully and start collecting data.
Too much JavaScript on the page causes the browser tab to crash (a common issue on low-end devices).
The analytics script is blocked by a DNS rule, corporate proxy, firewall, or VPN.
The user has disabled JavaScript.
The user's browser has limited or no support for JavaScript (Opera Mini still has more than half a million downloads on Android, and it's still widely used in Africa).
The user is accessing your content using a service that strips JavaScript (e.g. an RSS reader, a web archiving tool, Telegram Instant View, AMP, a read-later service, or a bookmarking service).
You only test your app in Chrome, so you don't realize that your website is entirely broken in Firefox and Safari.

Web analytics can only give you an approximation of what your web traffic looks like. Even when they work correctly, they paint an incomplete picture. As I said in my post about share buttons, the number one referrer for pages on this website is "Direct/none". It's impossible for Plausible to figure out where those users are coming from. Further, my server logs report three times as much traffic as my Plausible dashboard over a seven-day window. Some of this might be bot traffic and thus irrelevant, but I know for a fact that a large chunk of this traffic comes from RSS readers. Plausible will never have insight into these users. My point is, if you rely on your analytics dashboard to make product decisions, you're excluding a large chunk of potential users who simply don't show up in your graphs. You might be missing out on serving thousands of potential users because you can't see them in your data. These are users who want to sign up for your newsletter, buy your app, subscribe to your service. These are human beings you could help, whose lives you could improve. I'm not saying that analytics are completely useless. They can and should have a place in your decision-making process. Just don't treat analytics data as gospel, because there will always be massive blind spots in what it tells you. To get a real understanding of how users experience your products, test them on real devices under real conditions as much as possible. And as always, get out there and talk to your users.
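The gap between server logs and a JavaScript-based dashboard can be explored with a rough pass over the access log. The sketch below is only an illustration, assuming log lines in the common "combined" format where the user agent is the last quoted field. The hint substrings, the function names, and the sample lines are all hypothetical, and real feed readers and bots identify themselves in many more ways.

```python
import re
from collections import Counter

# Hypothetical substrings used to guess the kind of client from its user agent.
FEED_READER_HINTS = ("feedly", "newsblur", "miniflux", "freshrss", "rss")
BOT_HINTS = ("bot", "crawler", "spider")

# In the combined log format, the user agent is the last quoted field on the line.
USER_AGENT_PATTERN = re.compile(r'"([^"]*)"\s*$')


def classify(line: str) -> str:
    """Guess whether a log line came from a feed reader, a bot, or something else."""
    match = USER_AGENT_PATTERN.search(line)
    user_agent = match.group(1).lower() if match else ""
    if any(hint in user_agent for hint in FEED_READER_HINTS):
        return "feed_reader"
    if any(hint in user_agent for hint in BOT_HINTS):
        return "bot"
    return "other"


def summarize(lines) -> Counter:
    """Count requests per guessed client category."""
    return Counter(classify(line) for line in lines)


# Made-up sample lines for illustration (documentation IP range, fake agents).
sample = [
    '203.0.113.5 - - [26/Jun/2026:10:00:00 +0000] "GET /feed.xml HTTP/1.1" 200 512 "-" "Miniflux/2.1"',
    '203.0.113.6 - - [26/Jun/2026:10:00:01 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (compatible; ExampleBot/1.0)"',
    '203.0.113.7 - - [26/Jun/2026:10:00:02 +0000] "GET /post HTTP/1.1" 200 2048 "https://example.com/" "Mozilla/5.0 (X11; Linux x86_64) Firefox/140.0"',
]
counts = summarize(sample)
print(counts)
```

Requests in the `feed_reader` category are ones a script-based analytics package would never see. Substring matching on user agents is crude and easy to fool, so the counts are a rough estimate at best.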

David Bushell 2 days ago

ARIA, anti-patterns, and you

Please take a minute to understand what ARIA is and is not. ARIA and especially the ARIA Authoring Practices Guide (APG) are commonly misunderstood. I read an article the other day that had this facepalm moment: And with modern LLM agents, turning a spec into working code is surprisingly fast. Point the agent at the APG pattern, describe your component’s markup, and get a solid first draft you can refine and test. This is worrying, and the use of “LLM agents” isn’t the worst part! The APG is not a how-to guide of ‘best practices’ for building accessible websites. It exists to demonstrate how the ARIA specification should work in theory — regardless of support and regardless of whether more accessible, non-ARIA patterns exist (they do). As Eric Bailey notes — The guide was originally authored to help demonstrate ARIA’s capabilities. As a result, its code examples near-exclusively, overwhelmingly, and disproportionately favor ARIA. What I Wish Someone Told Me When I Was Getting Into ARIA - Eric Bailey — which makes sense, because: Browser and assistive technology developers can thus utilize code in this guide to help assess the quality of their support for ARIA 1.2. Read Me First - ARIA Authoring Practices Guide (APG) Even if ARIA was fully supported ( it’s not ) the APG still wouldn’t be a ‘best practice’ guide. ‘Best practice’ is not using ARIA at all. If you can use a native HTML element or attribute with the semantics and behavior you require already built in , instead of re-purposing an element and adding an ARIA role, state or property to make it accessible, then do so . 2.1 First Rule of ARIA Use - Using ARIA, W3C APG exists in a vacuum to show off the ARIA spec. The button example includes this code, for crying out loud! I’m unaware of any circumstance where should ever be used over a . Before you tell me you can’t edit your React component library, do the web a favour and delete your codebase. 
In fairness, the button example has a “Read This First” disclosure — and guess what: they use a element and not the disclosure pattern because the APG isn’t best practice. It’s hard to blame developers for misusing ARIA and the APG. I’ve been confused myself. As W3C documentation goes, APG is rather sexy. It’s a useful resource if you understand why it exists. Misuse of ARIA has made the web less accessible. Increased ARIA usage on pages was associated with higher detected errors. The more ARIA attributes that were present, the more detected accessibility errors could be expected. The WebAIM Million - WebAIM Avoid ARIA wherever possible. Don’t point a freaking LLM at the APG! I can’t believe I’m saying this but use Google’s slop if you absolutely refuse to learn/code yourself. Apparently OpenAI is throwing ARIA at the web and seeing what sticks. Ahhh! I don’t know anymore, take some pride in your expertise? P.S. name an assistive technology that isn’t a screen reader. Ain’t easy, is it? So don’t be casually punctuating with the word “test” like it’s some get-out-of-jail-free card for your dubious practice and advice. “Overview of Digital Accessibility Technologies” by Declan Chidlow is a great help if you want to win this game at parties. Thanks for reading! Follow me on Mastodon and Bluesky. Subscribe to my Blog and Notes or Combined feeds.

Kev Quirk 2 days ago

📝 2026-06-26 13:59: My first rather large #3DPrinting project. Can anyone work out what they are? (No they're...

My first rather large #3DPrinting project. Can anyone work out what they are? (No they're not abstract Starship Enterprises) Thanks for reading this post via RSS. RSS is ace, and so are you. ❤️ You can reply to this post by email , or leave a comment .

Anton Zhiyanov 2 days ago

Solod v0.2: Networking, new targets, friendlier interop

Solod ( So ) is a system-level language with Go syntax, zero runtime, and a familiar standard library. It's designed for two main audiences: The previous version (v0.1) focused on porting core Go stdlib packages and providing convenient C interop. At the end of that post, I said the next release would focus on networking, concurrency, or both. Now, networking is here — the v0.2 release I'm sharing today includes support for TCP, UDP, and Unix domain sockets. Concurrency is still planned for the future, so for now, servers handle one connection at a time. This release also lets you compile So to more targets, like 32-bit platforms, WebAssembly, and bare metal. And C interop is even smoother! Networking • TCP server • TCP client • Deadlines • IP addresses • Targets • Interop • Stdlib • Wrapping up The main feature in v0.2 is the package. It's a simplified version of Go's package which supports the three most commonly used transports: The API mirrors Go closely, so most of it will feel familiar. The big difference is that So has no goroutines, so there's no concurrent server support — you accept and serve connections sequentially. More on that in a moment. Let's build a classic: an echo server that accepts a connection, reads a message, and sends it back. If you've written a TCP server in Go, this should look familiar — , an loop, and / on the connection. The only thing missing is a : without goroutines, each connection is handled to completion before moving on to the next . The client starts the connection using , then uses to send a request and to get the reply: UDP and Unix domain sockets work in a similar way. For UDP, an unconnected socket uses to get data and the sender's address, and to send a reply. For Unix sockets, there are (stream) and (datagram). By default, , , and are blocking. In Go, you'd typically use goroutines and contexts to prevent getting stuck forever. 
Since that's not available in So (yet), every connection and listener supports deadlines instead: , , and are available on , , , and listener types. When the deadline passes, any pending call fails with . If you don't set a deadline, a blocked call will wait forever. This isn't concurrency, but it's enough to keep a single-threaded server responsive. Along with , v0.2 ports Go's package, which provides small, allocation-free value types for IP addresses. represents an IP address, combines an IP address with a port, and is an IP with a prefix length (a CIDR block): These are simple value types that don't use any heap allocation, which fits well with So's explicit-memory approach. The package also provides and functions to help you work with strings. Solod compiles to plain C, which (in theory) means it can target anything a C compiler can. Because of this, v0.2 adds new targets: Here's the complete toolchain you need to build a freestanding binary using : A large part of the standard library ( , , , , , , , and more) works just fine in freestanding mode. For more details, check out the freestanding guide . A bunch of smaller changes make Solod nicer to write. Three new directives for low-level work, all documented in the interop guide : works with variables, constants, types, and functions. You can use it on multiple lines, and the attributes will stack. For example, will combine with . Type aliases . So now supports Go-style type aliases: Numeric C types . The package now includes named types for C's numeric types — , , , , , , and others. When you declare an extern function, you can use the actual C types in its signature instead of trying to guess the correct fixed-width Go type for your platform. Third-party packages . You can now add external So packages using or by vendoring, and you can organize your own code into multiple modules. So doesn't have a real package ecosystem yet, but it's a good start. Better diagnostics . 
By default, panic messages report the C file and line. Pass to report the original So source location instead: There's also an optional flag that adds nil-pointer checks when accessing struct fields and calling interface methods. This way, if there's a bad dereference, the program will panic cleanly instead of causing a segmentation fault. Both options are off by default to keep the generated code more readable. Beyond and , v0.2 adds a few more packages: And a small but handy update to memory management: now reclaims the last allocation if you give it the matching pointer. It's a minor optimization, but it means a quick alloc/free pair on an arena no longer wastes space. Stdlib documentation With v0.2, Solod has evolved from just "command-line tools and C glue" into something you can actually use on a network — like a TCP or UDP server, a small protocol client, or a Unix-socket daemon. The new targets (32-bit, WASM, freestanding) mean the same code can now run in more places, even down to bare metal. The big thing that's still missing is concurrency. A server that handles requests one at a time works for some tasks, but a real network service needs to manage many connections at once. That's the obvious goal for v0.3 — adding some kind of concurrency, along with the stdlib packages that support it. If you're interested, take a look at So's readme — it has everything you need to get started. Or try So online without installing anything. Go developers who want low-level control and zero-cost C interop without having to learn Zig or Odin. C developers who like Go's style. TCP (networks , , ) via , , and , with the and types. UDP (networks , , ) via , (a connected socket), and (an unconnected socket with / ). Unix domain sockets ( for streams, for datagrams) via , , , and . 32-bit platforms . The compiler and stdlib now work correctly on 32-bit platforms, where and pointers are narrower. WebAssembly (WASI) . 
You can compile a So program to and run it under any WASI runtime. Freestanding mode . So programs can run on bare-metal systems without any C standard library. No libc means no malloc, but you can use instead. — hex encoding and decoding, including for hexdump-style output. — generating and parsing UUIDs (v4 and v7), with random components from a cryptographically secure source.

Stone Tools 2 days ago

Visual Basic on the PC w/Windows 3.1

If I dig deep into my own heart, really self-reflect, I find I simply don't possess whatever people like Bill Gates and Elon Musk do. I think most of us are content to know we've touched a life or two, helped make someone's existence a bit more pleasant, and can feel gratitude toward the universe for those small miracles. Others seem to know no limit in their acquisition of influence, power, and wealth. For them, it isn't simply enough to guide an industry; they must be the industry. In this zero-sum game, there is no upper limit to their cravings. Before Musk became the first (I'm choking on the word) trillionaire, Gates was the world's richest person for a couple of decades. Like Musk, he crossed a specific monetary milestone back in 1999 as the "first person with a net worth exceeding $100 billion," about $200B in 2026 money. How he earned it and what he did with it have been the subject of any number of documentaries, books, movies, interviews, depositions, and damning rumors. I think the media can agree on at least one point relevant to our discussion today: Bill Gates was hellbent on owning the entire personal computing landscape. He said as much, out loud, on stage, to industry professionals, in front of the press. Jacqui Morby recounted the story on The Computer Chronicles. "Gary (Kildall) got up (at the Rosen Forum panel discussion) and talked about what his plans were for CP/M and where the company was going, and then made a comment, 'Well, this is a very large market, and there's room for lots of companies.' Bill Gates interrupted and said, 'No, there'll only be one company.'" He didn't seem particularly interested in creating innovative things, so much as he wanted to make sure that the innovations of others had a Microsoft response.
While working with Apple to develop software for the original Macintosh, Andy Hertzfeld recalled a story of Gates digging in for system details that didn't really have anything to do with the business applications being built by Microsoft. Shortly thereafter, Windows 1.0 released, much to Steve Jobs's frustration . Jobs wouldn't be the last to feel screwed over by Microsoft "taking" ideas . Another tactic employed by Gates was absorption, the tried and true fast-track to acquiring toys one lacks. Consider the story of Alan Cooper . Coincidentally the idea for a visual application builder "popped into his head" just as HyperCard debuted, in 1987, triggered by Microsoft's announced adoption of DLLs, dynamic link libraries, which provided easy access to core operating system functions to whomever wanted to tap into them. Cooper saw this as a unique foundation upon which to build a kind of "construction set" for the DOS visual shell of your corporate dreams. Don't like the default Windows shell? Build your own! Microsoft engineer Gabe Newell was super impressed with Cooper's demo of the construction set, then called Tripod, and arranged for a demonstration for Gates. From the excellent article, "Something Pretty Right" by Ryan Lucas. "Why can't we do stuff like this?" is very revealing phrasing, IMHO as an armchair psychologist. Give that line to 1,000 actors and you'll get 1,000 unique performances balancing the tension between frustration and longing. As a Very Rich Guy™, there was nothing Gates wanted that he couldn't have. Like someone who pays others to level up their RPG character , US$1M and a contract later, Tripod (renamed Ruby) was his. While Cooper insists that HyperCard had no influence on the creation of Tripod , Gates most certainly was thinking about it. In his article "The 25th Birthday of BASIC" for BYTE Magazine , October 1989 ( Visual Basic would debut in 1991). Ruby was reformulated into something with but a passing resemblance to Tripod . 
Its bespoke scripting language was replaced with a variant of BASIC, and the goal of the program was no longer to build shells on top of the Microsoft DLLs, but to build applications for Microsoft's own shell, Windows 3.0. Visual Basic was born, arguably a more profound product than Cooper's original vision. Credit where it's due, Gates saw potential that Cooper himself couldn't see. A while back, I dug into Apple's HyperCard . Visual Basic gives us an interesting opportunity to look at a similar first-party, visual programming solution from Microsoft's perspective. Like HyperCard , Visual Basic had its own dedicated magazine , and inspired legions of developers long after Microsoft ceased support in 2008. As recently as 2023 , Microsoft has had to issue official statements on their support plans for "classic" Visual Basic, which keeps a huge number of bespoke, legacy applications alive, something HyperCard cannot claim. The Microsoft vs. Apple wars of the day almost necessitated taking sides, but in truth each has something it could learn from the other. Visual Basic 3.0 was the last pure 16-bit application in the line, and was the first version to include robust database capabilities. The true potential of the product was unlocked. This particular OS/application combination is much more in keeping with the spirit of this blog, I feel. There's a lot to learn. When I studied HyperCard , I noted the 1,000 page book that awaited me. Visual Basic ships with 3,000 pages, to say nothing of the wealth of 3rd party publications; an industry unto itself. As a man who recently took another annual step toward that great Blue Screen in the sky, every tick of the second hand gently rattles my bones. For large projects like this I have to consider how quickly I can get up to speed. Well, given the temperament of training books of the day, I suppose the proper first consideration is, "How dumb am I?" 
I refer to myself as a "big dummy" in blog posts, and I stand by that assertion, but I don't like it when others call me dumb. I can handle more complex material, but like I said, I don't have a lot of time. How quickly can I learn Visual Basic? That seems unabsorbably fast. Maybe if I didn't sleep? I think I'd forget everything by Monday. Also by Tuesday. "Proglaming" sounds like fun, but a week is still too fast for my pace. Getting closer. Perfect. Slow enough for an old man to follow; fast enough to finish with time to spare before involuntary admission into a retirement home. If I weren't 40 years too late, I'd throw my own hat into the publishing ring and combine "I'm a big dummy" with "I want to learn this quickly." It's been a long time since I last touched Windows 3.1. It's funny, my memory of it doesn't match my hands-on experience today. I recall it being far uglier, though it still suffers from absurdly large title bars which don't provide much in the way of information or utility. I dig that (VGA mode) powder blue, though. It's handsome if perhaps uninspired, the result of a collaboration between Microsoft and IBM for OS/2's Presentation Manager (which predates Windows 2.0). Their "Joint Development Agreement" gave pretty broad latitude to both companies to use, without licensing fees, code shared between the two companies. I'm not even tangentially familiar with law, but it does read, in part: That gave Windows 2 and 3 a nice glow-up after the flop of Windows 1.0. Initially, even Microsoft had trouble getting their own developers to build Windows applications. I imagine it must have been a huge relief for Gates to have a tool that not only made it easy to build Windows applications, but that could even be an enjoyable experience. Jumping into Visual Basic, the first impression is, "I can do this." It looks approachable. I can't explain what every button in the toolbar does, but some of the basic stuff is as easy to identify as in HyperCard.
Adding a control, like a text field, is a double-click away. The "Properties" panel makes intuitive sense, for tweaking the characteristics of a selected control, something HyperCard lacks. Appending code to a control is as simple as double-clicking its instance in the window. "Properties" is context aware, only showing what can be tweaked on the selected object. For the most part, the industry abandoned this contextual approach. I wonder why? PageMaker was leaning that way with its control panel, and InDesign promptly threw that away in favor of persistent controls for things that aren't even in the current document context. Why do we need text kerning tools on screen when there's not even a text box in the current document, in Affinity, for example? Tools like Figma and Apple's Pages seem to have kept the contextual flame alive, which is nice to see. "Pros want every tool on-screen at all times," a UX consultant once said with a straight face, I guess. The toolbar could stand to be better organized and starts gesturing in the direction of that meme image about Microsoft's love of buttons. They certainly did lean heavily on this UI metaphor crutch, as a catch-all way of cramming in as many features as possible. It's confusing at times (why a "picture box" and also "images"?), but with this version of the program, on this operating system, things haven't gotten completely out of hand yet. We're getting up to speed on the controls and how to interface with them today. Let's consider some nice things about Visual Basic's approach. I am rapidly growing to appreciate the keyboard shortcuts for UI elements, like buttons and sliders. Visual Basic makes it super simple to add a keyboard hook to an on-screen control. Simply label a button with in the confusingly named "caption" property and the following character will become the keyboard shortcut, via . So, an "Exit" button with the "caption" will read and will function identically to a mouse click on that button.
When I say "identically" I do mean identically. The button's built-in method will be triggered, the same as if a mouse had done it. We don't have to worry about bifurcating control logic between keyboard and mouse for such interactions. We're then treated to an amuse bouche of off-kilter things to come. Checkboxes and radio buttons both have an on/off state, where any number of checkboxes can be on/off, but only one radio button in a set can be on. When programming with these controls, checkboxes return a value of or to represent unchecked or checked. Radio buttons return a or boolean on each of the options. For now, we'll file this under "Things That Make Me Give a Skeptical Sideways Glance." After spending a couple of days with it, the built-in text editor is driving me crazy, a "feature" Visual Basic shares with HyperCard ; neither is good. I can excuse a lack of autocomplete, a tool that would debut with Visual Basic 5 , as "Something Yet to be Invented." I cannot excuse the lack of indentation assistance and word-wraps, both already common features in word processors of the day. Microsoft has given us a smidge more than the absolute bare-minimum for a text editor. Keeping code tidy and readable requires significant, diligent effort on my part; it's not coming easily to me. I appreciate the auto-capitalization (though Basic is case-insensitive) and coloring on language keywords, but syntax checking and formatting a line of text the instant I've repositioned the cursor is annoying. Unfinished lines throw up modal dialogs warning me of interpreter troubles, triggered as easily as moving the cursor up or down for a moment. It's unwieldy to sketch out a code block to fill in the details later with those constant interruptions. It would be nice to be able to trigger the parser on-demand. We're learning about the mouse and how to handle mouse events. From a programmatic standpoint, this is pretty basic stuff. 
One of the nice things about the code editor is that the pulldown in the top toolbar surfaces all possible functions for a selected UI element. We don't have to try to remember the exact name and spelling of a function; just pick the one you want to edit and get started. A setting that is theoretically interesting is the default unit of measurement for elements. Until now, I'd never heard of "twips": a "twentieth of a point". Where a point is 72/inch, there are 1,440 twips/inch. Windows used this as a device-independent standardized unit of measure. For on-screen, a conversion to pixels was used, and for print a conversion to printer resolution was used. Any form you design in Visual Basic can be trivially sent to the printer with a simple Basic call, and it will print at the resolution of the printer, not your screen. The coolest trick, though, is "edit and continue." Because the program is being constantly interpreted, not compiled, we can run the program, pause it, modify the code, and continue live execution. This is super handy for iterating solutions to annoying bugs. The Microsoft-faithful have really never known a world without this. The Apple-faithful have had this tantalizing fruit dangled before them a couple of times now, never quite delivering on the promise. I like it. In building out WIMP applications, we need to fill out the "M" part of that acronym. Today we learn how to build menus using the "Menu Design Window." The tool is competent, if a bit inelegant. Initially, it is easy to bang out a rough outline of an application's menu structure without taking one's hands off the keyboard; mouse-free is always a welcome option. Type a menu item, hit , type the next, hit , and the next, etc. Then, apply structure to the menu with the on-screen arrow tools for indentation/reordering elements. Alas, we cannot indent at the time of menu item entry; that hierarchy must be set in a separate step later.
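Returning briefly to the twips unit described above, the arithmetic is easy to check. The following is a small sketch in Go (not Visual Basic) of the conversion the text describes: 20 twips per point and 72 points per inch give 1,440 twips per inch, so a length in twips maps to device dots by scaling with the device's resolution. The function name `twipsToDots` and the 96 dpi screen and 300 dpi printer figures are assumptions chosen for illustration, not values taken from Windows 3.1 or Visual Basic.

```go
package main

import "fmt"

const (
	twipsPerPoint = 20                            // a twip is a twentieth of a point
	pointsPerInch = 72                            // typographic points per inch
	twipsPerInch  = twipsPerPoint * pointsPerInch // 1,440
)

// twipsToDots converts a non-negative length in twips to device dots
// (pixels on a screen, dots on a printer), rounding to the nearest dot.
func twipsToDots(twips, dpi int) int {
	return (twips*dpi + twipsPerInch/2) / twipsPerInch
}

func main() {
	const width = 2880 // a control two inches wide, measured in twips

	// Hypothetical device resolutions, for illustration only.
	fmt.Println("screen dots: ", twipsToDots(width, 96))
	fmt.Println("printer dots:", twipsToDots(width, 300))
}
```

The same two-inch width comes out as a different number of dots on each device, which is the sense in which a form laid out in twips prints at the printer's resolution instead of the screen's.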
One disappointing absence is any kind of relationship between menu elements. Moving a menu item with "submenu" items will not move those submenu elements with it. No "outliner" style editing, ala ThinkTank , here. We also cannot multi-select items to edit them as a group, something we can do with form controls. Slow, patient, one-at-a-time editing of menu items is all we get. To be fair, menus can be programmatically generated, which may honestly be a better option in many ways. That pulls us away from the "Visual" in Visual Basic , though, don't it? The design window also forces its vertical editing into a horizontal view, another "Things That Make Me Give a Skeptical Sideways Glance." The example in the screenshot shows a 3-level menu, and I'm nowhere close to filling that horizontal space. It's wasted screen real estate, made more aggravating by the fact that the menu design window cannot be resized . As I think many in the industry have internalized by now, an editor view should place its primary content front and center, with refining elements playing a supporting role. The menu item properties would be much better served filling the right-hand side of the window, giving the menu itself vertical breathing room on the left. It's one of those things that probably gets better over the years, but is conspicuously half-baked for version 3 of the product. "It's OK, but I expected better by version 3," will be a running theme going forward. Now that I've been at this for a week, the angle of approach to visual programming HyperCard and Visual Basic each take has come into sharper focus. Initially, their superficial similarities led me to expect more direct parity between the two. Both provide a visual toolkit for designing interfaces. Both use a more simplistic language than the core language for each platform. Neither is truly "object oriented" (if that's important to you). Both were killed despite amassing a large, passionate following. 
Even a simple inspection of their toolbars highlights the philosophical difference between the two approaches. Most of the HyperCard toolbox is devoted to drawing pictures, with the controls reduced to buttons and text fields. It is constantly surprising to me how much mileage is squeezed out of such a restricted set of UI controls. Microsoft, on the other hand, offers a toolbar button for each and every thing you might want to add to an application. They take inverted approaches. Where I might add a generic button in HyperCard , then attach a script which invokes the system file browser, Visual Basic gives me a pre-built file browser control to drag into my app. I prefer Visual Basic's approach of "drag out a rectangle to define a control," especially for buttons and text fields; it feels more modern in its UX. HyperCard makes us add controls strictly by pulldown menu, then we have to drag the corners of the button, with no visual indicators, into the new size. Surprisingly awkward. Visual Basic also earns points in offering a grid to snap elements to position, making it much easier than HyperCard to align and scale elements precisely with one another. Gotta do a lot of eyeballin' on the HyperCard side of things; its grid only works in paint mode. Consequently, laying out something like a calculator is much faster and easier in Visual Basic , at the expense (?) of looking exactly like any other Windows program ever made. (Although the demo calculator doesn't look anything like the actual Windows calculator?) Don't get me wrong, conformance to corporate homogeneity may be exactly what you need at times and Visual Basic can generate something "professional looking" in a jiffy. It is, perhaps, devoid of character, but it also creates something a Windows user can look at and trust. Breaking free of those somewhat rigid constraints requires considered effort in Visual Basic , whereas HyperCard practically begs us to go hog wild. 
We're firmly in "learning Basic" land here; the application itself doesn't have a whole lot else to it. The panel for exporting our .exe files is about as barebones as one could imagine. There's a color palette, but I'm not entirely clear why; colors for controls can be set in the Properties palette via its own popup color palette. I should also give a shout out to the built-in Help system. Though I wish it were context aware, there's an absurd amount of information available right there in Windows without having to crack open the 10 pound manual. HyperCard has Balloon Help, which is nice and cute, but also anemic; we only get as much explanation as fits in a couple of sentences. Visual Basic's help system gives lengthy, detailed explanations of topics with code samples, is searchable, is bookmarkable (!), has tutorials for understanding the principles of the program, and more. It's quite good! The last week of my training book gets intense with discussions on make files, database connectivity, MDI (multiple document interface), DDE (dynamic data exchange), interfacing with DLLs, and so on. We've only been building throw-away toy applications so far, and I honestly don't feel the book has mentally equipped me for these hairier discussions. It's a pretty significant cognitive leap from the simplicity I feel the product promised. The long and the short of it is, I'm learning enough Basic to squeak by and get a sense of its tempo and grammar, but as a first-time user I find it more overwhelming than HyperTalk. HyperCard and Visual Basic each come with a 600+ page language reference guide. Microsoft also throws in three more manuals, another 2,400+ pages, for good measure. Its language guide would expand to 1,000+ pages in Visual Basic 4. Brevity is the very soul of cowards, I guess was their stance. Though their language reference guides are similar length, Microsoft's is a far more dense, dry tome. 
Apple spends the first 150 pages talking about "What even is programming?" and the last 150 pages getting into topics outside the scope of HyperTalk; a slim 300 pages to describe the language. Let's examine some concrete examples. Here's how to make the system beep thrice on the click of a button in HyperCard: Here's how to (ostensibly) do that in Visual Basic 3: Full disclosure: this didn't work, even though it is the example given in the "Programmer's Guide." Something is coalescing the three beeps into one. DOSBox-X issue? Because scripts are kind of "embedded" into their respective HyperCard objects, we don't have to disambiguate subroutines with prefixes; any given script is scoped precisely to its associated GUI object. It's the La Croix of object orientation; just a whiff of a hint of that flavor. HyperCard's approach lends itself better to casual tinkering around, but Visual Basic has an edge in surfacing all functions of our application in the code editor. In HyperCard we have to remember which object contains which code block, or hunt through all objects individually, searching for the code we want. Visual Basic's approach requires unique names for all subroutines. This makes it fairly trivial to trigger events across objects. If we want a button to click another button by proxy, we would have to do something like this in HyperTalk: Sometimes I wish HyperTalk would allow dot-syntax for object specifier chains. In Visual Basic, we simply call the uniquely-named function directly: Where HyperTalk takes a gentle, English-like approach to its language, Visual Basic isn't afraid to be far more "programmery." HyperTalk developers can certainly get into their own weeds trying to figure out the precise incantation to sidestep the interpreter and achieve specific goals. Conversely, Visual Basic developers could quickly find themselves in a world of memory management, DLLs, batch files, and make files.
Both developers feel some pain, but one is kind of orthogonal to the other. Your preference may depend on which breed of demon you enjoy slaying. As clearly evidenced by the Voyager series of software and MYST , highly professional software was possible with HyperCard . That said, the upper boundary for Visual Basic feels much higher. As a simple example, with the keyword we can reach in and directly call the Windows Kernel (or any existing) DLL; this of course being the killer feature that triggered Alan Cooper to develop the program in the first place. That's impossible to do out-of-the-box with HyperCard ; it cannot access the Macintosh Toolbox so deftly. Likewise with database data, Visual Basic gives us flexibility in what kind of data to bring in, like dBASE or FoxPro . There may be specialized stacks or XCMDs (plugins) to HyperCard that can assist with these tasks, but nothing native to the program. However, HyperCard provides its own built-in database free of charge, requiring no special effort on the developer's part to leverage it. Building something like an address book is simply a matter of adding some text fields to a card. Those will function like fields in a database by default, and actions like saving/loading user data will happen transparently. Adding search, or something similar, takes a few extra steps, but is conceptually simple through a HyperTalk command like Visual Basic provides a "Data Manager" module, which allows us to create simple Access databases for use as the backbone of the application. This is all explained in detail in the supplemental 300+ page "Visual Basic 3.0 Professional Features, Book 2." Once the database is built, interfacing with its records is straightforward using the "Data Control" tool. When the database is linked in properly, controls like images and text fields can be wired up directly to their corresponding fields in the database schema, called "bound controls." 
The database widget itself provides buttons to step through records and corresponding data will auto-populate the bound layout elements. If "browsing" is the extent of your database needs, you're in good shape. I imagine most will want to do more than that, perhaps adding fields, or doing search queries. You'll want to steel yourself, because it gets gnarly real quick. I'll just say that the book is 300+ pages for a reason, with talk about complex subjects like Dynasets, Snapshots, Tables, the JET engine, SQL queries, and more. It's far more capable than HyperCard , as we can work with multiple databases in our VB application, access remote databases, and more. That power is paired with an equivalent learning curve, one which is thrust upon any developer who needs even a tiny bit more than the drag-and-drop controls provide. Overall, it would be fair to call the IDE "competent." It contains the tools we need, arranged by palette, and makes certain actions (like adding a button) as easy as a double-click. What's not to like? I think what frustrates me about these tools is how they feel like somewhat careless design solutions to their respective problems. Look at the "Properties" palette, for example. This looks, to my eyes, like a developer was told, "The properties for a selected object should be available for editing." The developer iterated them as a literal list, adding some basic editing niceties, like making a color chooser available when a color property is edited. What I find in practice is that the vast majority of the properties go untouched, especially for something like a Form object, and the ones I actually need require scrolling through a long list to find and edit. Later properties in the list, even those which are common to all controls, shift around in position depending on how many properties a given control has. 
I'm constantly having to read through that list, scanning for the "Name" property, which is where we set the programmatic name for the control. It's arguably the most important property , and it's playing peek-a-boo. When I make a new form (a "form" is a window; I don't know why they call it a "form") I have a few things I need to set right off the bat: the size, the title, and the programming reference name. After that, sometimes I want to set the background color. We'll ignore the fact that property names don't make sense; naming conventions had perhaps not yet been firmly established in an era when the terms UI and UX had not yet become common vernacular. From a pure, "What is the user most likely to need?" point of view, this simple alphabetical list is the laziest solution to the design challenge. Fair point, HyperCard's lack of any properties palette was more lazy, but this is version 3 of this product. I frankly (perhaps unfairly) expect more considered effort from a first-party solution. My frustration extends to the main toolbox as well. It's just a bunch of buttons with no organizational structure applied. Tooltips, similar to what we understand today, were introduced with Macintosh System 7 as "Balloon Help" the same year VB3 released, so I can't fault Microsoft for "failing to implement" them in this release. Still, icon-only is a lazy way to handle it, when the goal is to shove as many icons into the toolbar as possible. Asymetrix Toolbook 3 , a similar visual IDE for Windows development, provides more robust, logically arranged tools for the job. Here's the text editor and object properties panels. Note in particular a few things: Visual Basic itself contains a similar contextual help in other parts of the application, like its "Crystal Reports" tool, making its absence in the main app even more frustrating. This kind of haphazard application of tools and controls feels sloppy, which reminds me of something I wanted to talk about. 
While going through the official manuals for Visual Basic, something kept bothering me. I couldn't put my finger on it at first, but once I saw it, my eyes were forever cursed. This is a small grievance, "petty" some would say, "a colossal waste of mental resources" others may scoff. But what's a tech blog without a certain level of pedantry? I'm not above pedantry. Here we see the Visual Basic 3 manual is laid out in Helvetica and Times. Man, I'm already bored. Anyway, beyond the utterly pedestrian font choices (in fairness, they did have to lay out 3,000+ pages of this stuff), something seems "off" about it. In particular, that Helvetica looks malformed, with sloppy kerning and unbalanced strokes. Let's take a closer look. Helvetica Neue doesn't match, and Arial (my original suspect) is ruled out by the end caps on the capital "C". Helvetica Condensed is also not right. Wait, I see what's happening. It's the same issue I have with the user interface, manifested in the manual. This isn't Helvetica Condensed, it's Helvetica physically squashed into a fake condensed version. The richest man in the world couldn't afford to buy a proper condensed font for his company? "Or is this indicative of a deeper issue?" he asked, slipping back into his pop-psychology armchair. It smacks of "good enough," never striving for "great." That kind of sums up my feelings toward Windows and Windows applications of this period. The stuff worked, and had obvious success, but never seemed to be born of thoughtful consideration. Did that inattention to detail come from cost-cutting measures, or perhaps some kind of cultural blindness? Were the deficiencies seen and ignored, or simply not seen at all? And that reminds me of something else I wanted to talk about. In the PBS documentary series, Triumph of the Nerds, Steve Jobs famously said of Microsoft, "They have no taste." I genuinely think Bill Gates could not understand the meaning of Jobs's accusation.
Or rather, he couldn't fathom why "taste" should enter into his calculus whatsoever. Having no taste didn't stop him from becoming the richest man in the world. What does "taste" have to do with stockholder value? When Apple teased with a new release of OS X, "Redmond, start your photocopiers," I think Gates was thinking, "Of course we will. Thanks for the free R&D." He bristled at being publicly chastised for copying , but my read on that is he really wanted to say, "So what if we copy Apple? Why shouldn't we? Look at our success and tell me it hasn't been a good strategy." What Jobs saw as creative bankruptcy, Gates saw as business efficiency. Being asked to frame his success on Jobs's terms ruffled Gates's feathers. Jobs said, and I agree, that innovation means saying "no" to 1000 things before saying "yes." "Process" is that very action. "Process" is the pruning of the possibility space. It's the self-awareness to distinguish "good enough" from "great." It's when you step away from your work, give it the critical stink eye, and apply taste . That's an impossible task if one has no taste to begin with. So what's a tasteless corporation to do? While Microsoft may have not cared too much about process, they had manufacturing down cold. Put in PenPoint OS, out pops Windows for Pen Computing. Put in OS X 10.3, out pops Windows Vista. Put in Java, out pops J++. Put in a Dreamcast, out pops an Xbox. Even today, similar "factory production" charges are levied against them. I'm not suggesting they "stole" ideas so much as they simply seemed content to let others do the hard work of saying "no" 1,000 times. While they may have shortcut the creative process, they still had to learn how to manufacture products. In so doing, they accidentally picked up a little taste along the way, which would lead to pretty good stuff from time to time. 
It's been part of the fabric of the industry for decades, and now the torch of manufacturing tasteless product from the creative work of others has been passed on to generative AI. To scale , no less. The ramifications weigh heavily on my mind, especially when someone publicly calls for the absorption of my work into the generative AI apparatus. I'm both flattered and appalled. On average, how many times do you think I rewrite the introductions to these posts? How many thousands of words have I thrown away to reach something approaching what I wanted to actually say? I tend to rewrite intros 3 or 4 times, and I mean that truly; each rewrite is radically different from the others. In this post alone, I have thrown away some 5,000 words. Some might think those 5,000 words are the cost of the process, but that's not right. They are the process. The unpublished words are the important ones. Those are the words that got me to these words. Knowing that, throw any creative work into the generative wood chipper and it should be obvious why what comes out cannot live up to the original. It's lacking the 1,000 nos. I'm disappointed in the ending of this book. Day 21 comes and goes without even a hint of acknowledgement that we've made it through the gauntlet. At the end of it all, we also haven't built anything of value. Every chapter created little baby programs to illustrate specific concepts; no project built upon a previous project except for a few shallow, superficial glow-ups. Contrast that with HyperCard , where we had a full-fledged address book, with database, search, custom art, and save/load. With Visual Basic , I never felt that same spark I did with HyperCard . Visual Basic seems great for when you have a strong idea of what you want to build. However, its lack of drawing tools and "don't worry about it, I've got you covered" database curtail creative exploration far more than I would have predicted at the beginning of my studies. 
Not having to worry about those details opens up a wider world of "lemme try something real quick" experimentation and iteration. In an ideal product, I'd combine the prototyping strengths of HyperCard with the professional-strength of Visual Basic . Then, later we could swap out the default database with Access, or export the placeholder drawings as image assets for a professional artist to clean up in another revision. I cannot personally find a place for Visual Basic in my heart, but I can absolutely understand why it took off. It filled a major gap in the programming landscape, helping amateurs and pro-ams build tools for themselves, and even throwing a lifeline to a generation of COBOL engineers needing to transition ASAP. Like Apple with HyperCard , that gap was re-opened by the discontinuation of the product, abandoning a whole fleet of developers and, perhaps just as importantly, potential developers. I suppose nothing lasts forever, but these companies are worth multi (choking on the word again) TRILLIONS of US dollars. At valuations like that, with the fealty they demand from us, I consider it a moral imperative for them to provide excellent tools which empower the widest possible breadth of users' skill levels. Not providing such tools is a choice . Considered from another angle, I'll leave you with this open question. What software do Apple and Microsoft provide today that will be spoken of, with the same reverence as HyperCard and Visual Basic, 25 years from now? Ways to improve the experience, notable deficiencies, workarounds, and notes about incorporating the software into modern workflows (if possible). With Visual Basic 3, 2, 1, and DOS 1.0, the applications you build are 16-bit only and are therefore relegated to running only in virtual environments on 64-bit Windows. If this fits your modus operandi, you're in good shape. 
If you're hoping to keep it old-school, but still want the option of running your creation on modern hardware, then you'll want to get Visual Basic 6 up and running in Windows 2000? XP? I tried it in Windows 98SE and it wouldn't launch. VB6 builds 32-bit applications as standalone, compiled executables, can connect to the Internet, and produces builds which run on Windows 10/11. Note that Windows 11 promises to run applications built with VB6 , but does not promise to run VB6 itself. However, I gave it a shot and though there were issues with the install, and the IDE acts a little weird, and it complains on launch about missing OLE files, it did run and I was able to build an executable on Windows 11. For funsies, here's Gates and Jobs demonstrating their respective visual programming environments. Gates giving a subdued demo of the just-announced Visual Basic 1.0 . His voice cracking at 0:33 is adorable . Jobs had just returned to Apple after they bought NeXT, and here he's showing the technology Apple has bet its future on. We know it today as Xcode , but it started life as Interface Builder . The line he drew between components in the demo was called a "binding," something that has conceptually resurfaced in SwiftUI. DOSBox-X 2026.01.02, Windows x64 build. CPU set to Pentium DOS reports as v6.22 Host system folder mounted as drive C:\ holds Windows Windows 3.1, basic installation 1024 x 768, 32K colors under DOS reports total RAM, but Free only reports . Good enough for today, but 16-bit Windows should be able to register 4MB, not just 2MB. A few extra applications for comparative/convenience reasons: Toolbook, Actor, ObjectVision, Acrobat Distiller Visual Basic 3.0 Reports 386 Enhanced Mode enabled Reports free RAM In lieu of tooltips, at the bottom of the current window we have a contextual description of the current tool, much like Bank Street Writer and Lotus 1-2-3 . 
The text editor includes indent/outdent tools, can set our editing font of choice, waits to check syntax until we ask it to, and even includes a simple "build a function" utility to wire up common tasks to common UI events. The properties panel is laid out hierarchically, keeping the most-needed stuff front and center, while demoting less-used options to secondary emphasis. DOSBox-X ran everything smoothly and without issue. I did not install Windows on top of real DOS, though. I relied on DOSBox-X's implementation. This may account for a couple of strange issues, outlined below. I experienced one crash in Visual Basic 3 , when accessing the Help system. Issuing a looped command resulted in only a single system beep. My guess is that something in the emulated environment is suppressing this. I could never get databases to connect, even the ones that ship with Visual Basic , let alone any personal data carried over from previous database explorations. It may be the result of DOSBox-X using an emulated version of . Strangely, I saw it work once and then it stopped working as suddenly as it started and never worked again. An installation of Windows on a proper installation of MS-DOS might fix this problem.

0 views
daniel.haxx.se 2 days ago

A curl mountain movie

One of my favorite visuals for known vulnerabilities in curl is the mountain. It shows how many currently known vulnerabilities were present in the code throughout curl’s history. At the end of June 2026 it looks like this: Over time we get more vulnerabilities reported. Since every flaw has a version range during which the problem existed, and more issues mean more overlapping version ranges, the mountain grows. It changes shape every time we do a release or publish a new vulnerability. At this moment in time, curl version 7.34.0 is the release that contains the highest number of known vulnerabilities: 101. The worst one ever, if you will. Out of a total of 206. The mountain uses different colors for different severity levels of the published vulnerabilities, as the legend in the top-left of the image explains. To illustrate the ever-changing nature of the shape and size, I wrote a script that renders the mountain the way it looked at specific dates in the past up until today. More specifically, the script renders one image for every month since curl started (March 1998). I then turned these 340 individual images into a little movie that shows how it grew into today’s shape. At four months/second. The data for this comes from vuln.pm and the curl git repository. The graph rendering is based on the dashboard scripts. All images were put into a movie with ffmpeg, of course. Several people have asked what happened in 2016 that caused the notable drop. A slope, if you will. If we zoom in on that, we can spot that curl 7.51.0 has eleven fewer vulnerabilities than the version before it. This release was the first one after the 2016 Cure53 code audit, but other than that there is no clear, distinct process change or obvious code change that explains this trend shift. Lots of other graphs show just the ordinary pace and growth in various project areas. It was still fairly early days CI-wise, but we had been running at least a few CI jobs per commit for a few years already by then.
curl was adopted into the OSS-Fuzz project in July 2017, which has helped us find some issues since then, but the drop looks like it happened before that. We had already been analyzing the code regularly on Coverity for a few years. Better tooling? New compiler options? We simply don’t know. As we keep announcing more vulnerabilities going forward, things will continue to change. Maybe I will come back and make another movie in five years?
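The frame count and movie length mentioned above can be sanity-checked with a little arithmetic. The following is only a sketch, not the actual rendering script: the helper name `month_starts` is hypothetical, and the end month (June 2026) is an assumption taken from the date the post gives for the current image.

```python
from datetime import date


def month_starts(first: date, last: date) -> list[date]:
    """Return the first day of every month from `first` to `last`, inclusive.

    Hypothetical helper for illustration; not part of the curl scripts.
    """
    months = []
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        months.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


# One frame per month, from curl's start (March 1998) to the assumed
# end month (June 2026).
frames = month_starts(date(1998, 3, 1), date(2026, 6, 1))

# "Four months/second" means four frames are shown per second.
FRAMES_PER_SECOND = 4
duration_seconds = len(frames) / FRAMES_PER_SECOND

print(len(frames), duration_seconds)
```

Under those assumptions the month count comes out to the 340 images mentioned in the post, which at four frames per second gives a movie of roughly a minute and a half.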

0 views
Manuel Moreale 2 days ago

Anne Lee Steele

This week on the People and Blogs series we have an interview with Anne Lee Steele, whose blog can be found at aleesteele.com. Tired of RSS? Read this in your browser or sign up for the newsletter. People and Blogs is supported by the "One a Month" club members. If you enjoy P&B, consider becoming one for as little as 1 dollar a month. I’m Anne. I’ve spent almost a decade in what I call the ‘open ecosystem’, the first five years as a lurker and participant, the second five as a researcher and facilitator. I’ve done ethnographic studies of OpenStreetMap, was the Community Manager of The Turing Way, and have held a variety of fellowships with organisations ranging from the Internet Society to the Software Sustainability Institute. Outside of all these things, I would call myself an artist-of-sorts, maybe to say that I make art more than I embody the spirit of an artist, per se. But I sometimes throw the title around anyway. I guess I’ll say it: I’m a researcher, facilitator, and artist. I was a big ‘micro’ blogger in my teens, using platforms like Xanga, Livejournal, and Flickr to document my teenage life. Then I inevitably moved to Tumblr alongside many angsty and artsy teenage girls right as Facebook started to take off in parallel, before moving to Instagram (and using it as a kind of ‘blog’ for years). I’ve gone through the inevitable cycles of use then rejection, of deleting and reactivating all my social media accounts. My original Facebook account is gone now. When I started grad school in 2019, I started my blog as a method of sharing more about my life and research when I moved to Geneva. I think it came out of a joint desire for self-expression and for getting out of social media. We now collectively call this platform decay “enshittification”, but I really felt like what I was putting online was performative more than anything else.
The blog felt like shouting into a void, yes, but it was my shout, on my own website, in a void of my own creation. There’s no like button for that. I often just create a new page on Obsidian or VSCode and just start to write. Sometimes it all comes out in one go - sometimes the draft will take years to come to fruition (and yes - I’ll often backdate that post to when it was created, not when it was published). There’s actually a secret draft folder on GitHub that hosts all my drafts in progress. Out of all the creative processes, I find writing the most difficult, but also the most transformative. I rarely enjoy the process, but always feel better, or have more clarity, or understand something or myself better, afterwards. In parallel to writing, I’m very much a power user of are.na, which feels like a more instinctual, affectual, and social form of thinking out loud. I think a lot of the nascent themes contained there eventually end up on the blog in some form or another. I’m a big listener of NTS radio (specifically the Breakfast Show with Flo) and use earth.fm a lot. Sounds really create a space for me - and are a way I stay grounded and aware, no matter where I might be working. If I have control over that, however, I tend to need a big desk for books and papers of all kinds - maybe I need a messy desk in order to have a cleaner mind. I absolutely believe that physical spaces influence creativity. When I’m writing something long form, for example, I’ll usually surround myself physically with books and visual artifacts (photos, sketches, and other detritus) related to the topic, almost like it’s a living altar to the work, or an externalised process of thinking that starts with the visual, then becomes ingested and cognitive. I’ve been on the move for the past year or so, and I’ve really felt its impact on my creativity: in some ways I’m more spontaneous, but less deep and situated with my thinking. I have no doubt that this is because of my perpetual motion.
My website (and blog attached to it) is very simple. I used to host my website on GitHub Pages, but now it’s built on Netlify (very open to alternatives - please reach out!). I used to use Heroku before it was shut down. The whole website and codebase is on GitHub. You can see more about the ethos of the website, and specifically the accessibility and open practices I am aiming to adopt, on this easter egg of a page (which you can find by clicking the sticky note in the footer): aleesteele.com/design Field Notes from my Desktop is the name of both my blog and newsletter these days. I feel all sorts of ways about having a newsletter now – I’ve only sent out one so far. It’s a very different feeling to have a captive audience, some of whom have been subscribed automatically after an event I’ve facilitated, or joined from the web. I don’t know if I like it, but it feels like in the age of information glut, there’s something about the inbox that remains sacred for many folks. I want to respect that, and maybe want to think of it as a seeding process... For my blog, I remain dedicated to maintaining it as is, without any real changes. Looking back, I guess I would have held myself accountable to finishing more blogs in the moment: when that whiff of an idea, or a concept, or an event or reflection has completely captivated me and I feel the need to write about it. Unfortunately I have so many half-finished blogs not because I didn’t like the topic, but rather because it was such a struggle and a slog to finish that I didn’t bring it fully over that crest into fruition. Maybe I should have ritualised it. At the same time, I don’t want to be too hard on myself. I did the best I could at the time. Maybe the newsletter is meant to be the rhythm, and the blog is the burst of free jazz. I have no monetization plan, and currently don’t monetize anything I do for the blog.
In fact, I pay to use Buttondown at the moment, and I’m debating whether to keep doing that (currently $9 USD a month since they changed their membership plan). I pay for my domain, which is £12 GBP annually. Now I’m mixing blogs and newsletters! I’ve been such a periodic reader of a bunch of different things that I tend to save on are.na, that it’s hard to pick out a few. I’ve been a passionate reader of the Marginalian and the Creative Independent for many years. At this point, both are less blog, and more of a wiki-like resource about life, creativity, ecology, and all sorts of topics that make life meaningful and mysterious. I’ve also read and used Open Culture for many years – which is also a blog-of-sorts. I recently learned about The Examined Mind. Two people whose writing practices I look up to are Shannon Mattern, an academic anthropologist turned New York Librarian-of-sorts, and my friend Jonathan Gray, who has a fantastic (and consistent!) public writing process. Both of them are academics with public practices - and while I’m not an academic, I do have a research-informed process. I also love Julian Stodd’s blog on leadership and organisational practices. He’s done a lot of deep and open thinking that I’ve appreciated about the topic, and has stayed loyal to old school WordPress. This question inspires me to bring together a more curated list of blogs I’ve read and followed over the years, and also to recognise the shortfalls of my own ‘blogging’ practice. I have saved blog instances (meaning individual blogs), but I haven’t ‘followed’ a single blogger in a really long time actually. Is that because of the newsletter, or because I don’t use RSS feeds? I’m developing a series of mapping meditations, based on Pauline Oliveros’ Sonic Meditations. If you have a map or a meditation to share, please reach out. I also run internet infrastructure walking tours in London, in an effort to make invisible infrastructures more embodied and playful.
This year, I’ve been experimenting with a monthly walk, and you're welcome to join one using Luma . Now that you're done reading the interview, go check the blog . If you're looking for more content, go read one of the previous 147 interviews . People and Blogs is possible because kind people support it.

0 views
Kev Quirk 2 days ago

📝 2026-06-26 09:57: You know what isn't fun? Living in a 200 year old stone house, with no...

You know what isn't fun? Living in a 200 year old stone house, with no insulation, during a heatwave. There's collapsed animals (not literally) and fans all over the place. 🥵️ Thanks for reading this post via RSS. RSS is ace, and so are you. ❤️ You can reply to this post by email , or leave a comment .

0 views
neilzone 2 days ago

Restoring missing Address Book in Thunderbird 140 menu bar

For some reason, the Address Book tab/pane on Thunderbird’s menu bar had gone missing, and I struggled to find out how to get it back. So, for future me, what resolved it was:

0 views