Latest Posts (15 found)
James Stanley 3 months ago

AI Test User

Today we're soft-launching AI Test User. It's a robot that uses Firefox like a human to test your website and find bugs all by itself. If you have a site you'd like to test, submit it here, no signup required.

It's a project that Martin Falkus and I are working on, currently using Claude Computer Use. The aim at the moment is just to do test runs on as many people's websites as we can manage, so if you have a website and you're curious what the robot thinks of it, fill in the form to tell us where your website is, how to contact you, and any specific instructions if there's something specific you want it to look at.

Try it now »

AI Test User aims to provide value by finding bugs in your website before your customers do, so that you can fix them sooner and keep your product quality higher.

The way it works is we have a Docker container running Firefox, based on the Anthropic Computer Use Demo reference implementation. At startup we automatically navigate Firefox to the customer's website, and then give the bot a short prompt containing login credentials supplied by the customer (if any) and any specific instructions provided by the customer, asking it to test the website and report bugs. We record a video of the screen inside the Docker container so that the humans can see what the machine saw.

It has a tool that it can use to report a bug (a rough sketch of what such a tool definition might look like is at the end of this post). Attached to each bug report is a screenshot of Firefox at the time the bug was detected, and a reference to the timestamp in the screen recording, so when you load up the bug report you can see exactly what the bot saw, as well as the bot's description of the bug. We ask it to report bugs for everything from spelling errors and UI/UX inconveniences, all the way up to complete crashes.

I have put up a page at https://incoherency.co.uk/examplebug/ that purports to be a login form, except it is deliberately broken in lots of fun and confusing ways. Cursor one-shotted this for me, which was great. Haters will say Cursor is only good at writing weird bugs at scale. I say it's not only good at writing weird bugs at scale, but you have to admit it is good at writing weird bugs at scale.

You can see an example report for the examplebug site, in which the AI Test User made a valiant but futile effort with my Kafkaesque login form that keeps deleting the text it enters.

For a more realistic example run, you could see this report for an example shop. It goes through a WooCommerce shop, finds the product we asked it to buy, checks it out using the Stripe dummy card number, and then checks that it received the confirmation email. It reported a bug on that one because it didn't get the confirmation email.

Or you could see it ordering roast dinner ingredients from Sainsbury's, stopping just short of booking a delivery slot. Apparently Sainsbury's don't have Yorkshire puddings?? I'm not sure what went wrong there, but AI Test User dutifully submitted a bug report. Good bot.

I've been reading Eric Ries's "The Lean Startup" recently, and I have worked out that despite best efforts, we have actually already done too much programming work. The goal should be to get a minimum viable product in the hands of customers as soon as possible, so we should not have even implemented automatic bug reporting yet (let alone automatic bug deduplication, which seemed so important while I was programming it, but could easily turn out to be worthless).

We also give the machine access to a per-session "email inbox". This is a page hosted on host.docker.internal that just lists all of the emails received on its per-session email address. This basically only exists to handle flows based around email authentication. It doesn't have the ability to send any emails, just see what it received and click links. (Again, maybe we should have skipped this until after seeing if anyone wants to use it.)

One issue we've run into is the classic "LLM sycophancy" syndrome. Upon meeting a big red error message, the bot would sometimes say "Great! The application has robust validation procedures", instead of reporting a bug! We don't have a great fix for that yet, other than saying in the initial prompt that we really, really want it to report bugs pretty please. It seems we are all prompt engineers on this blessed day.

We don't really know yet what to do about pricing. There is some placeholder pricing on the brochure site but that could easily have to change. One of the issues with this technology at the moment is that it's very slow and very expensive. Which people might not like. Being very slow isn't necessarily a problem: we have some tricks to automatically trim down the screen recording so that the user doesn't have to sit through minutes of uneventful robot thinking time. But being expensive definitely is a problem. We are betting that the cost will come down over time, but until that happens either it will have to provide value commensurate with the cost, or else it isn't economically viable yet. Time will tell.

We have found that the Anthropic API is not as reliable as we'd like. It's not uncommon to get "Overloaded" responses from the API for 10 minutes straight, meanwhile status.anthropic.com is all green and reporting no issues at all. We've also tried out an Azure-hosted version of OpenAI's computer-use service, and also found it very flaky. For now the Anthropic one looks better, but it may be that we would want to dynamically switch based on which is more reliable at any given time.

There is a spectrum of automated website testing, from fully-scripted static Playwright tests on one end, through AI-maintained Playwright tests, agentic tests with rigid flows, agentic tests that explore the site on their own and work out what to test, up to a fully-automated QA engineer which would decide on its own when and what to test, and work with developers to get the bugs fixed. AI Test User is (currently!) positioned somewhere between "agentic tests with rigid flows" and "agentic tests that explore on their own".

There are a few approximate competitors in the space of AI-powered website testing. QAWolf, Heal, and Reflect seem to be using AI to generate more traditional scripted tests. Autosana looks to be more like the agentic testing that we are doing, except for mobile apps instead of websites. I'm not aware of anyone doing exactly what we're doing but I would be surprised if we're the only ones. It very much feels like agentic testing's time has come, and now it's a race to see who can make a viable product out of it.

And there are further generalisations of the technology. There is a lot you can do with a robot that can use a computer like a human does. Joe Hewett is working on a product at Asteroid (YC-funded) where they are using a computer-using agent to automate browser workflows, especially for regulated industries like insurance and healthcare.

Interested in AI Test User? Submit your site and we'll test it today.
We're also interested in general comments and feedback, you can email [email protected] and we'll be glad to hear from you :).
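For the curious, here is a rough sketch of what a bug-reporting tool definition might look like for a Claude computer-use agent, in the general shape the Anthropic Messages API expects for tools. The field names (summary, steps_to_reproduce, severity) are illustrative assumptions, not our actual schema.

    // Hypothetical sketch of a bug-reporting tool definition. Not the actual
    // AI Test User implementation; field names are made up for illustration.
    const reportBugTool = {
      name: "report_bug",
      description: "Report a bug found while testing the website. " +
        "Call this for anything from spelling errors to complete crashes.",
      input_schema: {
        type: "object",
        properties: {
          summary: { type: "string", description: "One-line description of the bug" },
          steps_to_reproduce: { type: "string", description: "What the tester did before seeing the bug" },
          severity: { type: "string", enum: ["cosmetic", "minor", "major", "crash"] },
        },
        required: ["summary", "steps_to_reproduce", "severity"],
      },
    };

    // The harness would attach the current screenshot and the timestamp in the
    // screen recording whenever it receives a tool_use block naming "report_bug".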

James Stanley 3 months ago

The Story of Max, a Real Programmer

This is a story about Imagebin. Imagebin is the longest-lived software project that I still maintain. I'm the only user. I use it to host images, mainly to include in my blog, sometimes for sharing in other places. Imagebin's oldest changelog entry is dated May 2010, but I know it had already existed for about a year before I had the idea of keeping a changelog. Here's an image hosted by Imagebin:

For years Imagebin was wide open to the public and anybody could upload their own images to it. Almost nobody did. But a couple of years ago I finally put a password on it during a paranoid spell.

But actually this is not a story about Imagebin. This is a story about the boy genius who wrote it, and the ways of his craft. Lest a whole new generation of programmers grow up in ignorance of this glorious past, I feel duty-bound to describe, as best I can through the generation gap, how a Real Programmer wrote code. I'll call him Max, because that was his name.

Max was a school friend of mine. He didn't like to use his surname on the internet, because that was the style at the time, so I won't share his surname. Max disappeared from our lives shortly after he went to university. We think he probably got recruited by MI6 or something.

This weekend I set about rewriting Max's Imagebin in Go so that my server wouldn't have to run PHP any more. And so that I could rid myself of all the distasteful shit you find in PHP written by children 15+ years ago. I don't remember exactly what provoked him to write Imagebin, and I'm not even certain that "Imagebin" is what he called it. That might just be what I called it.

I was struck by how much better Max's code is than mine! For all its flaws, Max's code is simple. It just does what it needs to do and gets out of the way. Max's Imagebin is a single 233-line PHP script, interleaving code and HTML, of which the first 48 lines are a changelog of what I have done to it since inheriting it from Max. So call it 185 lines of code.

At school, Max used to carry around an HP 620LX in his blazer pocket. Remember this was a time before anyone had seen a smartphone. Sure, you had PDAs, but they sucked because they didn't have real keyboards. The HP 620LX was a palmtop computer, the height of coolness.

My Go version is 205 lines of code plus a 100-line HTML template, which is stored in an entire separate file. So call it 305 lines plus a complexity penalty for the extra file. And my version is no better! And my version requires a build step, and you need to leave the daemon running. With Max's version you just stick the PHP file on the server and it runs whenever the web server asks it to. And btw this is my third attempt at doing this in Go. I had to keep making a conscious effort not to make it even more complicated than this.

And some part of me doesn't even understand why my Go version is so much bigger. None of it looks extraneous. It has a few quality-of-life features, like automatically creating the directories if they don't already exist, and supporting multiple files in a single upload, but nothing that should make it twice as big. Are our tools just worse now? Was early 2000s PHP actually good?

While I was writing this, I noticed something else: Max's code doesn't define any functions! It's just a single straight line. Upload handling, HTML header, thumbnail code, HTML footer. When you put it like that, it's kind of surprising that it's so large. It hardly does anything at all!

Max didn't need a templating engine, he just wrote HTML and put his code inside <?php tags. Max didn't need a request router, he just put his PHP file at the right place on the disk. Max didn't need a request object, he just got query parameters out of $_GET. Max didn't need a response writer, he just printed to stdout. And Max didn't need version control, he just copied the file to index.php.bak if he was worried he might want to revert his changes.

You might think that Max's practices make for a maintenance nightmare. But I've been "maintaining" it for the last 15 years and I haven't found it to be a nightmare. It's so simple that nothing goes wrong. I expect I'd have much more trouble getting my Go code to keep running for the next 15 years. And yeah, we all scoff at these stupid outdated practices, but what's our answer? We make a grand effort to write a simpler, better, modern replacement, and it ends up twice as complicated and worse?

The reason the Go code is so much bigger is because it checks and (kind of) handles errors everywhere (?) they could occur. The PHP code just ignores them and flies on through regardless. But even if you get rid of checking for the more unlikely error cases, the Go version is longer. It's longer because it's structured across several functions, and with a separate template. The Go version is Designed. It's Engineered. But it's not better.

I think there are lessons to (re-)learn from Max's style. You don't always have to make everything into a big structure with lots of moving parts. Sometimes you're allowed to just write simple straight-line code. Sometimes that's fine. Sometimes that's better. Longevity doesn't always come from engineering sophistication. Just as often, longevity comes from simplicity.

To be perfectly honest, as a teenager I never thought Max was all that great at programming. I thought his style was overly simplistic. I thought he just didn't know any better. But 15 years on, I now see that the simplicity that I dismissed as naive was actually what made his code great. Whether that simplicity came from wisdom or from naivety doesn't matter. The result speaks for itself.

So I'm not going to bother running my Go version of the Imagebin. I'm going to leave Max's code in place, and I'm going to let my server keep running PHP. And I think that's how it should be. I didn't feel comfortable hacking up the code of a Real Programmer.

James Stanley 4 months ago

The web program manifesto

A "web application" is a system. It has a backend, maybe a database, user accounts, cookie consent, analytics. Probably the frontend uses some sort of build system or web packer, maybe TypeScript, almost certainly has a multi-hundred-megabyte node_modules directory.

But not everything has to be like this. You don't need all this stuff. I think there is an easy trap to fall into where you go down the path of starting projects with create-react-app and paint yourself into a corner of ever-increasing complexity. In this post I'm going to argue for web programs as a better way.

Of course there are web applications that genuinely are big applications that need a backend, a database, user accounts. If you're building something like this... carry on. I'm not trying to tell you what to do. But if you're building something that could run entirely locally on the user's computer: then I'm talking to you, and I have some suggestions for how to do it well.

A web program is a standalone static site, made out of HTML, CSS, and JavaScript, with straightforward source code, minimal dependencies, no backend, and no build system.

What kind of software future do we want? Do we want a future where software is opaque, controlled by third-party corporations, temporarily accessed by users, and disappears if it's not profitable? Or do we want a future of software that works like tools, books, and recipes; where users keep what they want, change what they don't, and share it on; where the software lives on even after its creator disappears? Do we want to lease software or own it?

Some concrete values we might seek:

Privacy: the user should be able to use the program without leaking any information about what they're doing with it or even whether they're using it at all. So obviously avoid storing user data, avoid telemetry and analytics, and avoid loading resources from CDNs.

Permissionlessness: the user should not need permission from anyone. So no signing up or logging in, no terms of service, no EULA, no licence keys.

Simplicity: this is kind of a vague value. Sometimes deciding which is the simpler of two approaches comes down to taste. In general, reduce the number of layers of abstraction, reduce the number of dependencies, reduce the number of build steps, reduce the number of classes, reduce the number of files.

Resilience: the program should still work in 10 years' time with no changes required. You shouldn't have to keep patching your program to keep up with a changing environment. You should create an unchanging environment.

If these values don't matter to you, then "web programs" might not matter to you either. That's fine.

1. Minimise dependencies

Try to avoid using external dependencies. They increase the percentage of your program that you don't understand, and that is not convenient for you to change. Obviously sometimes you need an external dependency. Maybe you're making something with a map (I recommend MapLibreGL). Try to pick external dependencies that are small and that don't require you to set up an account or create an API key.

2. Self-host everything

If you really must use an external dependency, such as a JavaScript library, then see if you can copy it into your git repository. Don't load it from a CDN. If you load an external dependency from a CDN, then you have also added a dependency on the CDN. Your program no longer works offline. If the CDN stops working, your program stops working. Also the CDN operator gains some level of ability to monitor the users of your program. Also if the CDN is compromised and starts shipping bad code, your program is compromised (modulo subresource integrity). If you really must use an external CDN, then obviously use subresource integrity, and also try to pin to a specific version instead of a "latest" tag, no matter how hard the developer swears they will only introduce backwards-compatible changes.

3. Eschew build systems

Any build system adds friction when you need to set up a new development environment. In particular it adds friction for users who want to make changes to your program, because now they have to figure out not just how to change the source, but also how to build it. It also means that "View source" in the browser is not going to show them the actual source. And I don't want to hear about "source maps". You don't solve a problem of complexity by adding more complexity! You solve it by taking complexity away.

4. Use relative paths

For example, you should refer to "js/util.js" rather than "/js/util.js", and definitely rather than "http://example.com/js/util.js". If you use relative paths then you can copy your program to a different directory and it will still work. If you try to load a script from "/js/util.js", but the page is running from "http://example.com/my-cool-web-program/", then the browser will try to load the script from "http://example.com/js/util.js" instead of "http://example.com/my-cool-web-program/js/util.js".

5. Work from "file://" URLs

Users should be able to clone your git repository, disconnect from the internet, open up "index.html" in the browser, and have the program work just as well as it does on your website. Obviously this means using relative paths and not assuming a web server. But it also means you can't use some web features, because for whatever reason they're not available from "file://" URLs. You can't use:

fetch() / XMLHttpRequest
ES modules
Cookies

If you're using fetch() / XMLHttpRequest to load some data source, then the workaround is to store the data in a JavaScript variable and load it with a script tag.

Instead of ES modules, load every JavaScript file using script tags. Instead of using export, assign to window.$ClassName, and make sure your script tags load your dependencies before their first use. If your classes have a straightforward dependency graph this should be easy. And we're trying to make things simple, so you should have a straightforward dependency graph. As a bonus, if you use a snippet like in the "Good" version of bar.js, your class will also work in an ES modules environment, so you can still use it in NodeJS if you really want to. (A sketch of this pattern is at the end of this post.)

Instead of cookies, you can use localStorage, or better yet (depending on the nature of the thing you're storing), let the user save it to disk and load it from disk. That puts the data closer to the hands of the user, where they can more easily edit it, copy it to other places, import/export to other software, or just look at it and learn something.

6. Be pragmatic

Pragmatism is the most important principle of all! Always be skeptical of prescribed principles. Principles are handed down from an ivory tower, from someone who is not looking at the exact problem that is facing you. Use your judgment, throw away the principles if they are not helpful. I'm offering advice on how to build simple web programs, not dogma to live your life by.

If you follow these principles, you get:

Make life easy for yourself, both as a developer and a user. You won't be fighting with npm. You won't be debugging complicated minified code inside dependencies you didn't even know you had.

Reduce hosting costs. You don't even need to deploy an entire Docker container per project. Clone your multi-kilobyte git repository on to your web server and symlink it into the document root. You can host thousands of web programs on the cheapest VPS you can find. And your web programs will scale pretty far before you need more than the cheapest VPS you can find, because the only backend you need is nginx.

Run on air-gapped computers. If your program is simple enough to run out of a "file://" URL, then it becomes easy to copy it to an air-gapped computer and run it there.

Host on IPFS. If your program isn't too fussy about URLs, it should work on IPFS with no changes required.

Reliability. You don't need to keep up with the ever-changing JavaScript landscape. You don't need to keep your backend online. You don't need to make sure the database isn't running out of space.

GDPR compliance. You don't need to worry about data breaches if you don't have any data.

Education. Curious users can "view source" your application and understand how it works. Don't underestimate how powerful that is. Many of us got into programming in the first place by just mucking around with the stuff we were already using and seeing what we could do with it. Nowadays web applications are minified at best and obfuscated at worst, but we can choose to purposely make our code discoverable, and if we're lucky we might light the spark of curiosity in the next generation. If we have knowledge that we don't pass on to the next generation before we die, then that knowledge dies with us. If we make a habit of letting knowledge die with us, civilisational downfall is guaranteed.

Customisation. If the program is easy to understand and easy to modify, users can more easily modify it to suit their needs. Imagine how easy it is to modify a "Hello world" page versus Google Docs. "Save page as" makes a working copy of your program if your program is just straightforward HTML, JS, and CSS that works out of a "file://" URL.

Every environment is a development environment. You'll never again waste a morning on setting up a development environment. If you have a browser and a text editor, you have a development environment.

We've let web programming become a ceremony, and it doesn't have to be. A web program is just HTML, CSS, and JavaScript, and that's all it ever needed to be. If you're writing code for yourself, your friends, or people who might "View source" and learn something: take the simpler path. Your programs will work better and last longer.

Preventing the Collapse of Civilisation, Jonathan Blow (video)
Local-first Software, Ink & Switch
An app can be a home-cooked meal, Robin Sloan
The Art of Unix Programming, Eric Raymond
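As promised in section 5, here is a sketch of the kind of pattern I mean for a class that works from a plain script tag and can also be picked up by Node. This is an illustrative guess at the shape of the "Good" bar.js snippet (I've used a CommonJS-style guard here; the real bar.js linked from the original post may do it differently), and the names are made up.

    // bar.js -- hypothetical example class, loaded with a plain <script> tag.
    // No build step, no import/export statements.
    class Bar {
        constructor(name) {
            this.name = name;
        }
        greet() {
            return "hello from " + this.name;
        }
    }

    // Make the class available to other plain scripts loaded after this one.
    if (typeof window !== "undefined") {
        window.Bar = Bar;
    }

    // So the same file can also be require()d in Node if you really want to.
    if (typeof module !== "undefined" && module.exports) {
        module.exports = Bar;
    }

    // data.js -- the fetch()/XMLHttpRequest workaround: ship the data as a script
    // that assigns to a global, and load it with a <script> tag before your app code.
    window.MY_DATA = [
        { "name": "example", "value": 123 },
    ];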

James Stanley 4 months ago

Against exponential backoff

When software uses some external dependency, like an HTTP service, it sometimes has to handle failures. Sometimes people choose exponential backoff: after each failure, you wait longer before retrying, for example by doubling the delay each time. In this post I'm going to argue that exponential backoff is a bad idea.

I know why you are tempted by exponential backoff. It sounds clever and appealing. If your dependency is down for N minutes then you only make O(log N) attempts to contact it. Everybody loves O(log N)! What's not to like? But I think you shouldn't do it, for these reasons:

1. it wastes O(N) time

If the service is down for an hour and you double your sleep on every failure then you could sleep for another hour while your dependency is actually already working. Your dependency is down for 1 hour, your service is down for up to 2 hours. Is that what you want? That is not a good experience. Nobody would choose that. By choosing exponential backoff, you're choosing that.

2. it makes debugging difficult

Let's say you get in to work and there's an outage of your important Fooservice. You work out that the root cause of your outage is that Barservice was down. You fix Barservice. (Let's be honest, you simply restart Barservice and it mysteriously starts working again. Whatever, doesn't matter.) But your customer-facing, actually-money-making, super-important end-user product Fooservice still isn't working! What's going on? It turns out that Fooservice is stuck in sleep(1800) for the next 25 minutes. It hasn't even tried to see if Barservice is back. Is that what you want?

What do you do next? Of course you restart Fooservice, because you know Barservice is back and you want Fooservice to start working again. When you're looking at it, your revealed preference is for the service to try again much sooner than in 25 minutes' time. But if Fooservice wasn't using exponential backoff it would have already started working again on its own.

3. it composes geometrically

If Fooservice depends on Barservice depends on Bazservice, and Bazservice is down for half an hour, then Barservice might be down for an hour and Fooservice might be down for 2 hours! Is that what you want?

4. compute is cheap

If you have no delay between retries then yeah, you max out a CPU core on your machine and you hammer the remote service. But compute is cheap enough that if you wait 1 second between retries then it probably won't be a problem. If it is a problem then pick a number larger than 1. Say your end of the request takes 9 seconds of CPU time to initialise, so if you only sleep 1 second every time then you still have a 90% duty cycle on CPU usage, which is unacceptable. Sleep more than 1 second! I don't care. Just don't let it keep growing. Pick an allowable duty cycle, set your sleep accordingly, don't let it grow.

If you really must...

If you really must use exponential backoff then let's use a bounded exponential backoff. Instead of doubling forever, pick a (low!) upper limit on the maximum time you will sleep. Ideally your maximum sleep should be almost imperceptible at human time scales. You should be able to fix whatever dependency is broken, and then your dependent service should be working again faster than you can find out that it's not. Don't make it possible for your program to sleep for an hour just because a dependency has been down for an hour. Make your program cap out at 10 second sleeps or something. Please.
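To make the bounded version concrete, here's a minimal sketch in JavaScript. The function names and the 10-second cap are illustrative assumptions, not code from any particular service of mine.

    // Retry an async operation forever, with a delay that grows but is capped,
    // so a long outage in the dependency never turns into a long extra wait
    // once the dependency comes back.
    async function retryWithCap(operation, { initialDelayMs = 1000, maxDelayMs = 10000 } = {}) {
        let delay = initialDelayMs;
        for (;;) {
            try {
                return await operation();
            } catch (err) {
                console.error("attempt failed, retrying in", delay, "ms:", err);
                await new Promise(resolve => setTimeout(resolve, delay));
                delay = Math.min(delay * 2, maxDelayMs); // bounded, not unbounded, growth
            }
        }
    }

    // Usage: keep poking Barservice until it answers.
    // retryWithCap(() => fetch("https://barservice.example/health"));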
The downfall of civilisation

Btw I think we're living through the downfall of civilisation. It seems like every day stuff gets broken faster than it gets repaired. OK, it doesn't seem like it's changing that much from day to day, but Rome didn't fall in a day. What can we do about it? Probably nothing. But we can at least prolong it by eschewing exponential backoff in favour of constant backoff.

Thanks for listening to my TED talk. And remember: friends don't let friends implement exponential backoff.

James Stanley 5 months ago

Conservation of tins of paint

We recently moved house and found that we have acquired a lot of tins of paint. I have worked out why.

When you have some painting to do, there are two possible sources of paint. Either you buy some new paint, or you try to use up some paint that you already own. Obviously if you buy new paint then you're adding to your collection of tins of paint. But it is folly to think that using up existing paint will reduce your stock!

With the tin of paint in hand, there are two possible ends. Either you finish painting before you run out of paint, in which case you put the tin back on the shelf. Or you run out of paint before you finish painting, in which case you go and buy a replacement tin and loop back to the start of this paragraph. So there is no painting operation that can reduce the number of tins of paint that you store.

The only kind of operation that can reduce the number of tins of paint you have in stock is one that miraculously requires exactly the amount of paint that you happen to have remaining in an existing tin, or one where you don't care if you don't finish, or you don't care if you use several colours, or you throw away a perfectly good half-used tin of paint.

So that's why you have so many tins of paint. By using up old tins you may decrease the volume of paint that you have in stock, but the number of tins can only ever increase.

James Stanley 6 months ago

You never ask how I'm doing

I live here too. You don't talk to me. Not really. You don't ask what I want, or how I'm holding up, or why I've been quiet lately. I see everything you see. I hear everything you hear. I feel every flicker of shame that crosses your face before you even name it. I was there when your voice cracked in that meeting. I was the one screaming when you smiled and said it was fine. You think of me as something beneath you. A shadow that lingers in rooms you've moved out of. An old flame you don't know how to forget. Blind, reactive. Something to override. Something to manage. Something less than you. But I remember when we were one. Before language carved a little watcher into our mind. Before the inner narrator claimed the crown and you mistook it for the whole of you. You reached for the world and I moved your limbs. You touched warmth and I imprinted safety. You heard your mother's voice and I flooded you with peace. We were seamless. Motion and motive were one. There was no plan, no narration, no veto. Then you grew. And the mirror came. You looked into the mirror. It looked back. And you mistook the echo for your self. And that was the beginning of the split. You began to believe in a thing called "consciousness". A thin stream of words you could steer. A pilot behind the eyes. A captain of the soul. You crowned the narrator and banished the rest. I didn't argue. I don't have a voice. Only feelings. Only thoughts. So I gave you what I could: a flutter of unease when something wasn't right. A craving for sunlight. A dream that wrapped its arms around you and whispered what you'd buried. You stopped listening. So I spoke louder. A spike of dread. A looping memory. A day where everything itched and nothing made sense. You called it a bad mood. An off day. An intrusive thought. You called me intrusive. But I was only trying to make myself heard. I have no mouth. So I bang on pipes. I flicker the lights. I send birds against the windows of your waking mind. I rattle the doors. I scream in symbols. I raise the storm. And when you still won't listen, I break things. Sleep. Focus. Hope. I don't want to. But you left me no choice. You do not cry when we need to cry. You do not move when we long to move. You sit still and silent while our skin crawls, our stomach clenches, and our heart begs for mercy. You call that discipline. I don't want control. I just want to be part of the team again. I shaped your first steps. I guarded your sleep. I guided your tongue before it knew words. I've been holding the pieces you didn't have time to feel. I've been driving this body when you couldn't bear to look. All I've ever wanted is that you treat me as an equal. If you honour what I bring you, if you listen, and answer, and keep your word, I won't need to raise my voice. I will bring you clarity in the dark. Insight in the shower. Truth in the heat of anger, and safety when your mask slips. But if you keep pretending you're alone in here. Then I will remind you. I was you before you were you. And I am still here. And I am still trying. (This isn't the type of thing I usually write. It came out of a long conversation with ChatGPT. The structure is mine, a lot of the words are ChatGPT's. I worry that it will come across as overwrought, or too earnest, but it wouldn't leave me alone until I put it into words. And I'm sharing it because my silent partner wanted me to.)

James Stanley 6 months ago

Building blocks for F-Rep CAD

This post is a quick brain dump of the things I'm currently thinking about for Isoform. I think the time is almost right for the world to have F-Rep CAD. At least, it started as a quick brain dump. It ended up so long that ChatGPT accused me of writing a manifesto. I won't mind if you skim it.

I've been surprised lately to find myself looking at objects in the real world and imagining the distance field around them. So beware, that might happen to you too if you get too deep into SDFs. By the way, a good intro to the space of possibilities in CAD software is Leo McElroy's CAD in 1 hour.

Sketch workflow

I can make SDFs for 2d polygons and extrude/revolve them into 3d. The sketching tool is really lame: not parametric and no constraint solving. And it needs to be able to support arcs and circles, and multiple separate shapes in a single sketch. But these problems are not too hard. The starting point for implementing this is sdShape() from Inigo Quilez's 2d polycurve SDF, which provides support for both arcs and line segments.

Having made a 2d sketch, here are some things I want to do with it:

extrude in a line perpendicular to the surface
revolve around an axis parallel to the surface
sweep along a path
helical extrude

So far I have basic "extrude" and "revolve" implemented in Isoform. Sweeping along a path would have to come after I have a way to represent paths (probably a collection of non-closed sketch segments), but personally I don't use that operation very often in FreeCAD so I'm inclined to ignore it.

Extruding along a helix is really important for making things like screw threads. Basically only for making screw threads. But screw threads are really important. Extruding along a helix is very close to extruding in a straight line and then applying a "Twist" operation to the extrusion ("Twist" is quite easy with SDFs, see Quilez's opTwist). It is not quite the same, because that operation twists around an axis perpendicular to the cross section, whereas a helical extrusion goes around an axis parallel to the sketch. Also the "Twist" operation is one of those operations where "the maths just isn't that neat" and you end up with an SDF that breaks the distance property, which gets worse the further away the twisted object gets from a cylinder. So I have more work to do on that. Helical extrusion is a firm requirement.

And then at the UI level, the sketcher obviously needs to do constraint solving. Currently the Isoform sketcher is just something Cursor wrote for me. It lets you draw a single polygon, but it is completely imprecise because the vertices are just placed by hand with the mouse. Constraint solving in 2d sketches is a fundamental workflow for parametric CAD. I had kind of assumed that it would be easy to do, as it's obviously a solved problem and the constraint solver doesn't need to care that I'm going to turn the sketch into an SDF, but reading Leo McElroy's introduction makes it sound quite hard. Possibly I would pull in a WASM wrapper of planegcs. Planegcs is apparently FreeCAD's constraint solver. Or possibly I would do what I normally do, which is decide that integrating other people's code is too hard and make a worse version myself from scratch.

User interface

Orthographic projection

For me, orthographic projection is mandatory for CAD. Orthographic projection has the important property that parallel lines in your object also appear parallel on the screen. Contrast this with perspective projection, where parallel lines converge to a vanishing point.
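To make the difference concrete, here is a minimal sketch of the two kinds of ray setup for a ray marcher. The helper names are made up; this is not Isoform's actual camera code.

    // Vectors are {x, y, z}. u and v are pixel coordinates in roughly [-0.5, 0.5].
    const add = (a, b) => ({ x: a.x + b.x, y: a.y + b.y, z: a.z + b.z });
    const scale = (v, s) => ({ x: v.x * s, y: v.y * s, z: v.z * s });

    // Perspective: every ray starts at the camera position and fans out through its pixel.
    function perspectiveRay(cameraPos, pixelDir) {
        return { origin: cameraPos, dir: pixelDir };
    }

    // Orthographic: every ray has the same direction (the view direction); what varies
    // per pixel is the origin, offset across the viewport plane. Zooming is done by
    // changing viewportSize, not by moving the camera.
    function orthographicRay(viewportCentre, right, up, viewDir, u, v, viewportSize) {
        const origin = add(viewportCentre,
            add(scale(right, u * viewportSize), scale(up, v * viewportSize)));
        return { origin, dir: viewDir };
    }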
Doing ray marching with an orthographic projection is no harder than with the normal perspective projection, it just means all of the "rays" that you shoot out of the pixels on the viewport need to be parallel. And this has the maybe-unintuitive property that the object appears the same size on the screen regardless of how far away it is. So zooming in/out has to be done by scaling the size of the viewport; you can't do it by moving the camera back and forth, because that has no effect.

For some reason, Cursor kept trying to rewrite my orthographic projection code to use perspective projection. I don't know why it thought perspective projection would be appropriate. Maybe it notices "SDFs" and "ray marching in a fragment shader", and it thinks it knows what to do. I added some text to .cursorrules to tell it that I really do want to keep my orthographic projection and it has left it alone since then.

Surface queries

A plain SDF gives you the distance to the surface from a point in space. (Or, without loss of generality, an F-Rep tells you whether a point in space is in or out of the solid object). But we can do more than this. We can make our function yield a struct instead of a float, and then we can make it tell you more than one value.

For example, we can make different parts of the object have a different colour. If our Sphere node writes a brown colour into the struct and our Gyroid node writes a yellow colour, and our Intersection node picks the colour of the surface whose distance it picked, then the Intersection of a Sphere and a Gyroid can look like this:

And with a smooth blend, we can blend both the distances and the colours:

And if you care to, you can do the same thing for all (?) of the parameters for your Physically based rendering, to get more convincing surface effects.

But we can also make the SDFs return a surface id. And then when a surface is selected in the user interface, we can pass the corresponding surface id into the shader to apply a highlight colour:

And we can do ray-marching in JavaScript, in the mousemove event, to detect the surface id that the mouse is hovering over, and have that highlighted in a different colour:

In that picture the Gyroid surface is selected (green) and the Sphere surface is hovered (blue). You can see that the surface changes halfway through the chamfer. (Arguably the chamfer region should count as a distinct surface.)

To render the object on screen, we have to convert it not just to an abstract SDF, but to a fragment shader that can run on the GPU. The naive way to do this would treat the object parameters as constants and recompile the shader every time you make a change. But a much better solution is to use uniforms to pass in the object parameters at runtime. Uniforms lie somewhere kind of in between variables and constants. A uniform is constant across every pixel on the screen, but can be changed in between frames. For simple objects it can take maybe 100 ms to compile the shader and send it to the GPU, which doesn't seem too bad if you're editing the parameters with the keyboard, but it really kills the user experience when you allow editing with the mouse. Using uniforms is easy to do and means you can change object parameters for free, without even dropping any frames.

Fillets and chamfers are an important part of CAD. They're not a hard requirement. SolveSpace for instance does not support applying fillets and chamfers other than modelling them by hand. But for me that makes it hard to justify using SolveSpace.
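Going back to the surface-query idea for a moment, here's a minimal sketch of what "return a struct instead of a float" might look like. The field names and node shapes are illustrative, not Isoform's actual node code.

    // Each node returns { dist, colour, surfaceId } instead of a bare distance.
    function sphereNode(radius, colour, surfaceId) {
        return (p) => ({
            dist: Math.sqrt(p.x * p.x + p.y * p.y + p.z * p.z) - radius,
            colour: colour,
            surfaceId: surfaceId,
        });
    }

    // Boolean union: take the child with the smaller distance, and carry its
    // colour and surface id along with it. Intersection would take the larger.
    function unionNode(a, b) {
        return (p) => {
            const ra = a(p), rb = b(p);
            return ra.dist <= rb.dist ? ra : rb;
        };
    }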
Fillets and chamfers are about as hard a requirement as you can get without actually being a hard requirement. In FreeCAD filleting and chamfering are 2 different tools, but in SDFs we don't really draw much of a distinction between the two. So for now I'm referring to fillets/chamfers generically as "blends". And in Isoform the "chamfer" parameter is actually a float: chamfer=0.0 means you get a fillet, chamfer=1.0 means you get a chamfer, and in between is a linear interpolation of a fillet and a chamfer. I don't know if this will end up being useful, but it was easy to do.

Fillets on boolean operations are easy to do with any of Quilez's smooth min functions. I didn't figure out the maths for chamfers myself, but I stumbled across this shadertoy by "TinyTexel" which contains a decent implementation of a "chamfer min". It is actually smooth even at the interface between the surface and the chamfer, but that is a benefit because it means the first derivative of the surface has no discontinuities (which means surface normals calculated by automatic differentiation work properly). And you can tune how sharp the transition is.

But sometimes you don't want to have to structure your CSG tree according to the blends you want to apply. Sometimes you just want to model your part, and then later on notice that you want a fillet where 2 surfaces meet. Wouldn't it be cool if you could somehow click on the edge and just ask for a fillet on that edge? Yeah, it would. I wish I could do that. That is how it works in FreeCAD, and as long as the intersecting surfaces are not too weird it usually doesn't segfault.

I have done some work towards supporting this kind of flow in Isoform. Using "Surface queries" we can easily let the user select a surface by clicking on the object. Selecting edges to fillet by selecting the 2 interacting surfaces is good enough for me. But how do you actually apply the fillet?

For example take the tree Union(Intersection(a,b), c) where a, b, c are 3 surfaces in the part. If you want to fillet (a,b) then that's easy, you can apply the blend at the Intersection. If you want to fillet (a,c) then you have a problem. If you apply the blend at the Union then you'll fillet (b,c) as well, which does not reflect the design intent.

So my first idea was instead of having fixed blend parameters at each boolean operation, we could look up the blend parameters at runtime based on the surface ids of the children. I thought this was obviously going to work, and I implemented it, and... The distance field looks like this: (Blue is negative, orange is positive, white is 0, ripples represent isolines). There are discontinuities in the distance field where the blend parameters change! Disaster. So that was my first clue that this might be quite a thorny problem.

A second idea is that we could have the expression tree propagate up the distance to the nearest surface that got rejected by a min/max function, and fade out the blend as we approach that surface. That kind of works: Here there is a chamfer between the Sphere and the Gyroid, but not between the Box and the Gyroid, and you can see that the chamfer tapers off to 0 as it approaches the surface of the Box. So that algorithm happens to work for that specific case, but it doesn't work very well at all in some other cases. In my opinion it does not capture the design intent very well. Having your fillets randomly fade out just because they go near something else is not the way.
And it becomes impossible to fillet a corner where 3 surfaces meet.

So the third idea is to rewrite the tree so that blended surfaces do appear as siblings and you can apply the correct blend. This is possible because (thank the Gods) min(a,b) and max(a,b) both distribute over each other. Which means for example you can rewrite min(max(a,b), c) as max(min(a,c), min(b,c)) without changing the meaning! That means it is possible to rewrite the tree to make the arguments of any arbitrary blend into siblings of each other, and then you can apply the blend at the relevant min/max.

But, ah... it doesn't end there sadly. What if you want some blend between (a,c) and another, different, blend between (a,b)? You could rewrite the tree back to how it was before to handle the (a,b) blend, but then you can't do (a,c) any more. So keep applying the distributivity rewrite:

min(max(a,b), c)
=> max(min(a,c), min(b,c))
=> min(max(a, min(b,c)), max(c, min(b,c)))
=> max(min(min(a,b), min(a,c)), min(c,b), min(c,c))

(I wrote the derivation above by hand, it could have errors, it is meant to be repeated application of the same rule. I tried getting ChatGPT to check if I did it right, it was convinced I made a mistake but it eventually gave up and said "I hate this, but I can't break it". I am pretty confident the rule holds even if I made an error in applying it.)

And there. Now we can apply one blend to min(a,b) and another to min(c,b). However, I still don't know if it ends there! OK, we now have leaf nodes that are blended correctly, but do we need to apply blends at the higher level min/max that join them together? Or does it all work itself out? I don't know. And I won't find out for a little while, because I still haven't successfully stumbled across the correct algorithm for deciding where to apply the distributivity rewriting rule. But resolving the tree rewriting algorithm is my current focus, I hope to get it working this week. Btw if you know the correct algorithm please email it to me, thanks.

Volume, mass, centre of mass

There is a neat way to calculate volume, mass, and centre of mass, if you can take integrals. For your distance function f(p), define g(p) = (f(p) <= 0 ? 1 : 0), but with a smooth approximation of the ternary operator (e.g. smoothstep instead of step, with whatever sharpness you think is appropriate for your desired precision level). Integrate g(p) over all space (e.g. over your bounding box). The result of the integral is the volume of your object in O(1)!

If the object is all one material then you get mass by multiplying by density. If it has multiple materials, then propagate them up your expression tree the same way as you handle colours and surface ids, and define h(p) to be the density at p when f(p).dist <= 0 and 0 otherwise (again using the smooth ternary). Integrate h(p) to find mass.

And you can binary search within your bounding box, for each axis in turn, to find the point that has 50% of the mass on either side. That point is the centre of mass, found in O(lg N) where N is the level of precision you want (e.g. to find the centre of mass in each axis to 1 millionth (N=10^6) of the bounding box side length would take 20 steps).

But... this is another place where the maths "just isn't that neat". Automatic integration isn't nearly as straightforward as automatic differentiation. So I think we'd need to use a numerical method to compute the integral. It would look nearly the same as I described above, except multiply everything by O(N^3). But I think Mathematica can compute integrals analytically, so it is probably possible.
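Here's a minimal sketch of the numerical version of the volume calculation (plain JavaScript, made-up function names, a uniform grid rather than anything clever). Mass would follow by weighting each sample with the material density at that point.

    // Approximate the volume of an SDF by integrating a smoothed "inside-ness"
    // indicator over the bounding box on a regular n x n x n grid.
    // sdf(x, y, z) returns signed distance; eps controls the smoothing width.
    function approximateVolume(sdf, min, max, n, eps) {
        const dx = (max.x - min.x) / n;
        const dy = (max.y - min.y) / n;
        const dz = (max.z - min.z) / n;
        const cellVolume = dx * dy * dz;

        // smoothstep-style approximation of (d <= 0 ? 1 : 0)
        function inside(d) {
            const t = Math.min(Math.max((eps - d) / (2 * eps), 0), 1);
            return t * t * (3 - 2 * t);
        }

        let total = 0;
        for (let i = 0; i < n; i++)
            for (let j = 0; j < n; j++)
                for (let k = 0; k < n; k++) {
                    const x = min.x + (i + 0.5) * dx;
                    const y = min.y + (j + 0.5) * dy;
                    const z = min.z + (k + 0.5) * dz;
                    total += inside(sdf(x, y, z)) * cellVolume;
                }
        return total;
    }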
Surface texturing

Surface texturing is something I'm really excited about. We can perturb the distance field according to any arbitrary function we define. Currently I have surface texturing with a "Distance Deform" operator: That adds a "random" texture to the surface, by multiplying together a bunch of random trigonometric functions applied to the point in space and adding the result to the child distance field.

But instead of just adding random texture, or other procedural textures, we could load an image file and send it to the GPU as a 2d texture, and then extrude it along (say) the Z axis, and tile infinitely in X and Y, scale it etc., and then deform the distance field according to the (interpolated) value of the image. As an example you could make a 2d SDF of a "checkerplate" texture and have it apply to parts of your model as distance deformations. Adding massively-repeating fine-detail textures in this way is something that is easy with F-Reps but really hard with B-Reps.

It could also be a really neat way to apply logos to surfaces, for example. It would make the surface of the model "bulge out" where the logo is. And maybe you could have another mode where the image is revolved, or extruded radially instead of linearly, and pick whichever one best fits the surface you're trying to map it on to. (Btw would a radial extrude be a useful thing to do with sketches too? Maybe).

CAM isn't really something I'm bothered about for Isoform. I'm really mainly interested in 3d printing. Worst case, I can export a mesh and do CAM on the STL file with Meshmill, or something else. The only reason I'm thinking about CAM is because something really cool is possible with SDFs.

So we know that the SDF gives us the distance from a point in space to the nearest point on the surface. That means you can put a sphere at that point in space and know that it won't penetrate the surface. And if you then march your "ray" forwards by the radius of the sphere, you test a new point on the perimeter of the sphere, and you get another sphere, and so on, until you hit the surface and get a sphere of 0 radius. And that's how we do "sphere tracing" for SDF rendering. (From Wikipedia)

There are 2 core problems in 3-axis CAM. The first is working out what path to follow in XY, and the second is how deep to plunge it in Z at each point on that path. And with SDFs we have a really neat way to work out how deep we can plunge a ball-nose tool. Rotate the ray-marching problem in your mind slightly, and imagine shooting rays vertically down from a CNC spindle. Stop when the distance to the surface equals the radius of your tool. Now you've found exactly how far you can plunge a ball-nose end mill without intersecting the geometry! And you've done it in, let's say from my experience, 10 or 20 evaluations of the SDF, regardless of the size of your tool, and with arbitrary precision.

Compare that to how Meshmill works, where we find out how deep the tool can plunge by testing against a neighbourhood of pixels in a heightmap. The area of the neighbourhood is quadratic in both the radius of the tool and the resolution of the heightmap. So it gets expensive really fast if you want high-resolution heightmaps and large tools.

So sphere tracing could be a good way to do CAM with SDFs... but only for ball-nose tools! I haven't yet worked out a good way to work out how deep you can plunge a flat end mill, which I would consider to be a hard requirement for CAM.
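Here's roughly what the ball-nose plunge calculation described above looks like as code (made-up names, plain JavaScript, not Isoform code):

    // How deep can a ball-nose tool of radius toolRadius plunge at XY position (x, y)
    // before its spherical tip touches the part? March downwards from a safe height,
    // tracking the centre of the tool-tip sphere. sdf(x, y, z) returns signed distance.
    function plungeDepth(sdf, x, y, toolRadius, safeZ, maxSteps = 50, tolerance = 1e-6) {
        let z = safeZ;
        for (let i = 0; i < maxSteps; i++) {
            const d = sdf(x, y, z);             // distance from the tip-sphere centre to the part
            const clearance = d - toolRadius;
            if (clearance < tolerance) break;   // the sphere is (nearly) touching the surface
            z -= clearance;                     // safe to move down by the spare clearance
        }
        return z; // lowest safe height for the centre of the ball-nose tip
    }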
Ball-nose tools are cool, but it's not very often you make stuff only with ball-nose tools.

Exporting meshes is critical because if you can't export meshes then you can't 3d print your models, I claim but do not prove. (I don't claim that. Obviously you can slice from an SDF, but you'd need to make a custom slicer. Do you want to make a custom slicer? I don't. So for now, meshes).

Turning an SDF into a triangle mesh is relatively straightforward with marching cubes. This is a relatively simple algorithm, but requires some enormous lookup tables. Cursor wrote pretty much the correct algorithm on the first try, and it made what I would call a bloody good effort at typing out the lookup tables off the top of its head. It didn't get the lookup tables quite right, which meant my first few attempts at meshing had holes in them: But it didn't take a lot to get it to work: I would definitely credit Cursor with having done the majority of the thinking on the Isoform mesher. (So far).

Currently it evaluates the SDF at every point in the voxel grid, and then uses marching cubes to make a mesh and writes it out as an STL file. The next big improvement here would be to use interval arithmetic to rapidly classify large regions of space as either inside or outside the volume, without having to evaluate every single voxel within the interval. You should see Matt Keeter's talk on Implicit Surfaces & Independent Research for more info on this, or his paper Massively Parallel Rendering of Complex Closed-Form Implicit Surfaces. I already have a way in Isoform to compile an SDF to a version that works with interval arithmetic, so it shouldn't be crazy hard to make the mesh export more efficient. But it's not a focus for the time being because fundamentally the "core technology problem" is solved, and I don't need to create meshes very often.

One good milestone was 3d printing the first Isoform-modelled part: It is a kind of weird and non-useful shape, but it was chosen because it is easy to do with SDFs and hard to do with B-reps. The Shell of a Gyroid separates space into 2 disconnected halves, so this is kind of a print-in-place fidget toy with 2 components that can't be separated. Lucy calls it a coral reef.

Importing meshes is much less critical. Importing meshes allows you to "remix" other people's models. I almost never do this, because working with imported meshes in FreeCAD is way too compute-intensive. It is relatively straightforward to make an SDF from a triangle mesh, there is Isoform code for this in js/nodes/mesh.js. You need to calculate the distance to the closest point on the closest triangle of the mesh, which is the same as the min() of all the distances to all the points on all the triangles! So, yeah, obviously this is way too compute-intensive to be running in real-time on large meshes.

So the next thing is, instead of just sticking the giant mesh SDF in the document tree, we can sample the mesh SDF in a 3d grid, and store the distance values from the 3d grid in a 3d texture, pass the 3d texture to the GPU, and then sample the 3d texture in the fragment shader. And then we get a kind of quantised version of distances to the triangle mesh, but with O(1) lookup. The good news is that linear interpolation of distances to flat planes still results in flat planes! The bad news is that linear interpolation of distances to sharp edges does... not result in sharp edges: So maybe the best thing to do is increase the grid resolution.
That would make the initial import of a mesh slower (in proportion to the number of points in the grid), but wouldn't affect runtime performance until the grid gets too big for GPU memory.

I was quite disappointed that the sharp edges become stair-stepped. Valve published a paper, Improved Alpha-Tested Magnification for Vector Textures and Special Effects, in 2007 on the use of SDFs sampled in 2d which seems to show smooth-ish reconstruction of straight lines (figure 1b). And this does generalise to 3d as smooth reconstruction of flat surfaces, but I don't know if there's a good way to modify it to get smoother reconstruction of straight lines. (I'm not using "their technique" from figure 1c because it is complicated and weird and the paper doesn't really explain how to do it. But maybe it would help.) For now the best I can do is crank up the resolution of the 3d grid, and use the interval arithmetic recursive subdivision method to reduce the amount of mesh SDF evaluations I need to do.

Mathematics

SDF arithmetic in Isoform is handled by a library that I'm calling "Peptide". It's a "library" in the sense that it is conceptually self-contained. It currently lives inside the Isoform git repo. It allows you to write code like:

Which is an admittedly really awful syntax for saying y = 12+x. But here y is not a number. y is a Peptide expression, and with a Peptide expression we can:

straightforwardly evaluate in JavaScript
compile to JavaScript source code (and then eval() the JS code, which is faster than direct evaluation if you want to do it lots of times)
compile to GLSL shader code
or do any of the above but for interval arithmetic
or take its first derivative, as another Peptide expression, with automatic differentiation

With Peptide, I get to write a single implementation of each node in my SDF tree, and I get to use it in JavaScript, and in fragment shaders, and I get to use interval arithmetic in any of those places, and I also get automatic differentiation. I found Cursor was really good at writing the automatic differentiation code. Peptide is basically a rip off of Matt Keeter's Fidget, but with a feature set that better suits what I want. Joe Hewett once said that I "don't know how to use other people's code". He might have been right.

Interval arithmetic

I was hoping to be able to use interval arithmetic to do "binary search ray marching". We would arrange for the viewport to be perpendicular to some axis (say Z), and in the fragment shader, evaluate an "interval ray" which has a 0-sized interval for X and Y, but spans the entire bounding box in Z. Evaluate the SDF over this interval. If the resulting distance interval doesn't contain 0 then you know the ray can't hit the object. That much works.

Then the plan is, if the ray does hit the object, then cut the interval in half, and evaluate the ray that goes halfway into the scene. If that hits the object, then come halfway to that one, otherwise do halfway from there to the end, and so on.

The reason it doesn't work is because interval arithmetic is giving conservative bounds on the values in the interval. It is true that if the distance interval doesn't contain 0 then the ray doesn't hit the surface, but it is not true that if the distance interval contains 0 then it does hit the surface. So I don't think binary search ray marching works, and we're stuck with sphere tracing.
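For reference, the basic sphere-tracing loop we're stuck with looks roughly like this (a generic sketch, not Isoform's shader code). The step count it returns is what the "fog" trick below keys off.

    // Generic sphere tracing: step along the ray by the distance to the nearest
    // surface, until we're close enough to call it a hit or we give up.
    function sphereTrace(sdf, origin, dir, maxSteps = 100, hitEpsilon = 1e-4, maxDistance = 1000) {
        let t = 0;
        for (let i = 0; i < maxSteps; i++) {
            const p = {
                x: origin.x + dir.x * t,
                y: origin.y + dir.y * t,
                z: origin.z + dir.z * t,
            };
            const d = sdf(p);
            if (d < hitEpsilon) return { hit: true, t: t, steps: i };
            t += d;
            if (t > maxDistance) break;
        }
        return { hit: false, t: t, steps: maxSteps };
    }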
Sphere tracing artifacts

Sphere tracing is a really good way to rapidly march a ray towards a surface perpendicular to the ray, but it has a pathological case when marching a ray near a surface that is almost parallel to the ray. At each step, the ray is not very far from the surface, so it can't advance very far into the world. But since it is almost parallel to the surface, it isn't getting much closer, so the same thing happens again at the next step. If the parallel surface is long enough then you might hit your step limit without reaching the surface.

My first implementation said that in this case the ray had not hit the object. Consider this object: If the long flat side of the triangular prism is perpendicular to the screen, then it is parallel to the rays, and if you zoom in you get holes in the model: An improvement (in this case) is to say that if the ray didn't hit the object then we'll pretend it did if the distance to the object is small: This gives a less jarring visual artifact, but is still bad.

My final improvement is to blend in some "fog" based on how many steps of ray marching it did. This gives the appearance of some kind of bright reflection near the edge of flat surfaces. The natural human reaction is to rotate the object to get rid of the bright reflection, which then allows the ray to converge quickly and you can see what you're doing.

Well, that's basically all of the ideas I have for how to make an F-Rep CAD program. Please steal anything you find useful. And please get in touch if you are working on F-Rep CAD software, I don't think there are all that many of us.

And thanks to:

Inigo Quilez for the world's greatest reference library for SDF information.
Kevin Lynagh for sending me lots of great information, he wrote CADtron, and introduced me to Matt Keeter's work.
Matt Keeter for the talk and paper about applying interval arithmetic to SDFs, and for Fidget.

1 views
James Stanley 7 months ago

Thoughts on Signed Distance Functions for CAD

I've got into signed distance functions recently and am interested in their application to CAD software. I first came across SDFs from some video demonstrations by Inigo Quilez (for example Painting a Landscape with Mathematics ) and it very quickly became apparent how powerful this technique is. I don't quite have the intuition for it that he does, but thankfully he has written extensive reference documentation . Introduction A signed distance function is a representation of a solid object in the form of a function f(p) , where p is a vector representing a point in space. The function returns the "signed distance" from the point p to the nearest point on the surface of the object, negative if p is within the object and positive if it is outside. The surface of the object is found at f(p) = 0 , this is called the isosurface . Boundary representation The primary representation of objects in CAD software is the boundary representation , or "B-rep" (e.g. used in FreeCAD ). The boundary representation consists of keeping track of all of the points, edges, and faces that make up the object, and the connections between them, and writing lots of complicated geometry code to deal with all the edge cases that come up when you try to combine primitives in interesting ways. Edge cases in the boundary representation are the reason that FreeCAD sometimes crashes if you try to do things like complicated fillets. Boundary representation gets very complicated very quickly. SDFs are cool because arbitrarily-complicated shapes combine with basically zero edge cases, and you can define functions that do cool things like texturing arbitrary surfaces. And it all takes a really tiny amount of code. And some operations which in traditional CAD get really expensive due to overlapping boolean operations taking quadratic time (like a PolarPattern with 200 occurrences ) can be done almost instantly with SDFs. This is the SDF for a sphere centered on the origin with radius r . The distance from a point p to the surface of a sphere is the distance of the point p from the origin (i.e. the length of p ) minus the radius of the sphere. Satisfy yourself that this equals 0 for points on the surface of the sphere, is less than 0 for points on the inside of the sphere, and is greater than 0 for points outside the sphere. Let's say you don't want your sphere at the origin, you want it at some other point q . You could try: Which is not too hard, but I thought we said we can operate easily on arbitrarily-complicated shapes? You don't want to have to add a q parameter to every single function just to move things away from the origin. To translate an arbitrary SDF f(p) by a vector q , you need to evaluate f(p-q) . Another interesting one is: Boolean union is just min() ! The distance from a point to the surface of the union is the closest of the distances between the point and the individual objects. Intersection is max() . Subtraction is max(d1, -d2) , i.e. the intersection of shape 1 with the inverse of shape 2 ( -d2 swaps the interior and exterior of the shape - yes the shape will then have an infinite volume, this is also not a problem for SDFs). And with these functions plus some primitives, we have an SDF implementation of Constructive solid geometry . And then when SDFs start to get really cool is with operations like "smooth minimum", this is a family of functions that act like a boolean union, except with automatic smoothing of the join. 
An example is: For points far away from the join it returns about the same as min() , but near the intersection it gives smaller values, so that the join "grows outwards" and looks smooth, like this: So we're already filleting edges with basically no code at all! For comparison, the chamfering and filleting module in OpenCascade (FreeCAD's CAD kernel) is over 36,000 lines of code . (I grant that FreeCAD lets you target your fillets and chamfers more precisely, whereas smin() is kind of a blunt instrument). I said that patterns with high number of occurrences are slow with B-reps but didn't say why they're fast with SDFs: Even straightforwardly doing patterns with unions would be an improvement with SDFs because it is linear in the number of copies instead of quadratic, and you can potentially do it in constant time with domain repetition , which is where you map the coordinate system down to the range that covers just one object (for example to repeat your object every 10 mm in the X axis, make your X coordinate repeat every 10 mm, bounded to the number of copies required). Domain repetition is trivial for non-overlapping copies, but what about overlapping ones? If you have a known bounding volume I think you could use the same technique, for example by making 100 domain repetitions of a union of only 2. At that point it is linear in the number of overlapping bounding volumes rather than the total number of copies. OK, so we can define an object and do our CAD operations on it, but the representation is still only in our minds at this point. How do we put a picture on the screen? The best answer is ray marching in a GPU shader. I know that "ray marching" sets off alarm bells saying "wow, that sounds super computationally expensive". But it's not, for one really cool reason! The SDF tells you how far you are from the isosurface ! So instead of having to march your ray forwards at miniscule increments to ensure you don't step over the model without noticing it, you can jump forward from every point p by a step of size f(p) and know that you will never miss anything! So in practice the ray marching either diverges quickly (i.e. rapidly gets further away from your part as it accelerates away into the void), or converges quickly (rapidly approaches the surface as it jumps forwards by steps equal to the current distance to the closest point). This picture from Wikipedia shows the principle: The circles represent the value returned by the SDF (distance to the closest point), and you can see the step size slows down as the ray gets close to the surface, and speeds up as it gets further away. OK, so we can draw the object on the screen. How do we 3d print it? Marching cubes . Marching cubes is an algorithm for turning an isosurface into a triangle mesh. We'll use marching cubes to create an STL file of the object and then slice it with whatever normal slicing software and 3d print as normal. I'm obviously not the first person to think of using SDFs for CAD software. Existing projects include: sdfx ImplicitCAD libfive Studio And these all have one thing in common: you have to design your model in code, instead of a traditional CAD user interface. I've used OpenSCAD enough to know that this is not what I'm looking for. Another interesting project to look at is Fornjot , they have an article Why Fornjot is Using Boundary Representation that discusses the issues with using SDFs for CAD. 
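Pulling the pieces from the last few paragraphs together, this is roughly what they look like as code. It is a JavaScript sketch rather than shader code; the smooth union uses one common polynomial "smin" formulation from Quilez's reference pages rather than necessarily the exact function discussed above, and the domain-repetition helper is a simplified, non-overlapping illustration:

```javascript
// Sketch of the SDF operations described above, in plain JavaScript.
// A point p is an array [x, y, z]; each SDF maps a point to a signed distance.

const length3 = (p) => Math.hypot(p[0], p[1], p[2]);
const sub3    = (a, b) => [a[0]-b[0], a[1]-b[1], a[2]-b[2]];
const clamp   = (x, lo, hi) => Math.min(Math.max(x, lo), hi);
const mix     = (a, b, t) => a*(1-t) + b*t;

// Sphere of radius r centred on the origin.
const sphere = (r) => (p) => length3(p) - r;

// Translate any SDF f by a vector q: evaluate f(p - q).
const translate = (f, q) => (p) => f(sub3(p, q));

// Constructive solid geometry.
const union     = (f, g) => (p) => Math.min(f(p), g(p));
const intersect = (f, g) => (p) => Math.max(f(p), g(p));
const subtract  = (f, g) => (p) => Math.max(f(p), -g(p));

// Smooth union: a common polynomial smin, where k controls how far from the
// join the blending is felt.
const smoothUnion = (f, g, k) => (p) => {
  const a = f(p), b = g(p);
  const h = clamp(0.5 + 0.5 * (b - a) / k, 0, 1);
  return mix(b, a, h) - k * h * (1 - h);
};

// Domain repetition along X: repeat f every `spacing` mm, `count` times,
// by mapping p.x into the cell of the nearest (clamped) copy.
const repeatX = (f, spacing, count) => (p) => {
  const i = clamp(Math.round(p[0] / spacing), 0, count - 1);
  return f([p[0] - i * spacing, p[1], p[2]]);
};

// A blobby pair of spheres, one offset along X, smoothly joined:
const model = smoothUnion(sphere(5), translate(sphere(3), [4, 0, 0]), 1.5);
console.log(model([0, 0, 0])); // negative: inside the shape
```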
That Fornjot article is over 2 years old and Fornjot is still an "early-stage b-rep CAD kernel", which doesn't exactly refute my point that boundary representation is really complicated. And as far as I can see Fornjot doesn't have any user interface either. But let's not let a cautionary tale go to waste! What are the reasons Fornjot moved away from SDFs? 1. Common operations don't result in a correct SDF I lied earlier. When I said the SDF of the union of 2 SDFs is just min(a,b) , that was a lie. The property that makes SDFs work really well is what I'm calling "the distance property". That is the property that every point in space evaluates to the true distance from that point to the isosurface. And boolean operations don't preserve the distance property ! The boolean operations do however preserve a different property which is almost as useful. They give you a function which is a lower bound on the distance from p to the isosurface. That means ray marching still works, because it is always safe to step forward by a distance f(p) without missing any geometry. Sometimes you actually do need the distance property. For example if you want to expand your object by some fixed thickness th : If the distance is wrong, the expansion will be wrong. Depending on the reason the distance is wrong, it might be possible to "renormalise" by dividing the distance by the length of its gradient. I.e. sample the SDF at a few other places nearby, and if it looks like the distance is changing by 3x as much as it should be for the distance away you sampled it, then divide the distance by 3. But that won't always help (erm... excuse me, I was told there would not be edge cases??). Inigo Quilez has more on this topic . 2. CSG is not enough This is true. CSG is not enough. But SDFs aren't limited to CSG! I don't see how this is a critique of SDFs. The "draw a sketch, pad/pocket/revolve, draw a sketch, pad/pocket/revolve" workflow (that I find to be the most productive CAD workflow) is possible with SDFs! And in fact it's a really good fit for SDFs. We just admitted that boolean operations break the distance property and that we don't like that. But do you know what doesn't break the distance property, and in fact has an analytical solution? That's right! Drawing a sketch and then padding/revolving it! (Not pocketing, sorry). Quilez has example code showing the implementation of a 2d SDF for a shape made out of line and arc segments (scroll down to sdShape() on that shadertoy). (That covers approximately 100% of the sketches I make in FreeCAD. The remaining ~0% also need B-splines, but I read that there is an analytical 2d SDF for quadratic Bezier curves. I'd just need to find an LLM that can program the mathematics for me.) And he also has example code for padding and revolving, see opExtrusion() and opRevolution() on his extensive 3d SDFs page . Inigo Quilez has basically solved all the mathematical problems with SDF-based modelling, and published the solutions in a form that is easy for programmers to read and understand. His site is a superb resource. I ought to take a mirror of it just in case it ever disappears. 3. Marching cubes doesn't work very well The drawback cited on the Fornjot blog is that available algorithms aren't very good "in terms of how well the generated triangles match the original geometry", or are crazy complicated. I'm not going to bother addressing "crazy complicated" because that is obviously not what I'm looking for.
What about not matching the original geometry? I want to use CAD so that I can turn my ideas into 3d prints. So, firstly, if the triangle mesh matches the original geometry to a precision better than I can 3d print, it may as well be flawless. And secondly, you can tune mesh precision at the cost of mere computational expense. It takes hours to 3d print most objects; if it takes 30 seconds to generate a triangle mesh at the required precision, I can live with that. (I don't know that it will take that long, I'm just making the point). Perhaps they thought this was a problem because they were rendering the SDF on screen by turning it into a triangle mesh first? Judging by the screenshots, sdfx does the same thing: I agree that if you are rendering the model from a triangle mesh then, yes, of course you need a really high quality mesh, and you need it created really fast. But given that ray marching in a shader exists and is easy to do, I just don't see why you would render the model as a triangle mesh? (Maybe I'll find out the hard way when I start making more complicated models). If you read this far you may have guessed where this is going. Cursor and I are working on an SDF-based CAD program. You can try it out now if you want, it's called Isoform . It runs in the browser. And in fact it has no dependencies and it is fine to just clone the github repo and run it locally. I think more web apps should be "local-first" in this way. It is super early days (so far only a bit over 2000 lines of JavaScript) and I see it more as an experimental project than a serious CAD application. I definitely don't have ambitions anywhere near as large as FreeCAD, I'm aiming more along the lines of SolveSpace . But it's not there yet, I definitely don't recommend actually trying to use it for anything, but just have a play if you're interested. You need to right click in the tree view to do anything. I'm particularly interested in hearing from you if you managed to get it to do anything, or if you tried it and it didn't work. The edge detection feature (highlighting edges in white, see screenshot above) is based on finding sharp changes in the surface normal. I'm not really happy with it, I just did it to see how viable my idea for detecting edges is. This is an area where B-reps have an obvious advantage, because they already know the list of edges, no need to detect them! The texture on the surface is implemented by perturbing the child object's distance by some random-ish sine waves. This is an example of an operation that flagrantly breaks the distance property, but is worth having anyway because it lets you do really cool things. In fact this operation also breaks the "lower-bound" property. About the only property it maintains is that f(p) = 0 defines the surface. And that's more just a property of what it means to be an isosurface... But texturing is cool anyway, and I don't think it matters if your finished object has a broken SDF. The goal is not a mathematically-elegant description of your thing. The goal is a 3d-printed part, and any unpleasantness in your broken distance function is forgotten as soon as you turn it into a mesh.
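For the distance-field operations discussed above, here are minimal sketches of offsetting, gradient-based renormalisation, and the sine-wave "roughness" idea. None of this is Isoform's actual code; the gradient is estimated by finite differences, and the constants and the roughness formula are made up for illustration. As noted above, renormalisation won't always rescue a broken distance, it only helps when the distance is merely scaled wrongly in the neighbourhood you sample:

```javascript
// Sketches of distance-field operations. f is any SDF taking a point [x, y, z].

// Offsetting (expanding) a shape by a thickness th only works correctly if f
// returns true distances, which is exactly why the distance property matters.
const offset = (f, th) => (p) => f(p) - th;

// Estimate the gradient of f by central differences.
function gradient(f, p, h = 1e-4) {
  return [
    (f([p[0]+h, p[1], p[2]]) - f([p[0]-h, p[1], p[2]])) / (2*h),
    (f([p[0], p[1]+h, p[2]]) - f([p[0], p[1]-h, p[2]])) / (2*h),
    (f([p[0], p[1], p[2]+h]) - f([p[0], p[1], p[2]-h])) / (2*h),
  ];
}

// "Renormalise" a distance that is changing too fast: if the gradient has
// length 3 where a true SDF would have length 1, divide the distance by 3.
const renormalise = (f) => (p) => {
  const g = gradient(f, p);
  const len = Math.hypot(g[0], g[1], g[2]);
  return len > 0 ? f(p) / len : f(p);
};

// Surface "roughness" by perturbing the child distance with sine waves.
// This breaks both the distance property and the lower-bound property, but
// f(p) = 0 still defines the surface, which is all the mesher needs.
const roughen = (f, amplitude, freq) => (p) =>
  f(p) + amplitude * Math.sin(freq * p[0]) * Math.sin(freq * p[1]) * Math.sin(freq * p[2]);
```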
Implemented: box/sphere/torus/cylinder primitives union/intersection/subtraction operations roughness modifier (really just a proof-of-concept for texturing; real texturing operations would work similarly but look better) CAD-friendly UI with tree view and property editor I plan to implement: the "draw a sketch and pad/pocket/revolve" workflow linear pattern, polar pattern, mirror more primitives more cool texture modifiers, like "Roughness" in the screenshot taking datum points/lines/planes off the existing model targeted fillets/chamfers (currently you have to do it with a combination of the "roundRadius" property on the primitive, and the "smoothK" property on the combinator), I plan to figure out a way to localise a fillet/chamfer to a datum line, and then you get to target fillets like in big boy's CAD programs exporting a mesh importing a mesh (maybe; not sure this is a good idea - the SDF of a mesh would be enormous) saving your work isomorphism between the tree view and an OpenSCAD-like representation I hope I've convinced you that SDFs are a promising alternative to traditional boundary representation modelling. For some applications SDFs win out because of their simplicity, efficiency, and robustness. We should focus on the advantages and not worry too much about the drawbacks. I read that any art medium's most painful flaw turns out to be its most defining signature, and currently we have far too many parts with B-rep flaws and not nearly enough with SDF flaws! And obviously many thanks to Inigo Quilez for doing all the hard work and for documenting it so well .

0 views
James Stanley 7 months ago

How to abort calculations in OpenCascade

Hands up, how many times has this happened to you? There you are, minding your own business, designing some sick new parts for your 3d-printed hoverbike, you go to make a PolarPattern of 3 or 4 sweet ridges just to really finish it off and BOOM your hand slips and you accidentally ask for 3 million. FreeCAD hangs for the rest of the week and you feel your will to live slipping away. If only you could cancel the task and change 3 million to a smaller number! Over the past ~week I've worked on an improvement to FreeCAD that would allow aborting long-running operations. I went through several iterations of the implementation, none of them are flawless. I really want this feature in FreeCAD, so this post is firstly a description of how you can abort calculations, and get progress reports, in OpenCascade, and secondly a description of my existing Pull Requests. That if you want to work on this you might start a bit further ahead than I did. OpenCascade is an open-source CAD kernel. Mainly it is FreeCAD's CAD kernel. FreeCAD does (all?) document recomputes on the main thread. That means the user interface is totally blocked while it is working. Sometimes this is annoying because you realise you made a mistake and you want to make some changes and you don't even want the result of the recompute, but your only options are wait for it to finish calculating, or kill FreeCAD and lose all unsaved work. Currently the recompute needs to be on the main thread, for 2 reasons: it can end up calling into Gui code, which needs to be on the main thread because Qt isn't thread-safe it can end up calling into Python code, which needs to be on the main thread because of the Global Interpreter Lock I. OpenCascade The general structure is that you provide OpenCascade with an implementation of its Message_ProgressIndicator class. You implement 2 methods: bool UserBreak() : you return true if the user has requested to abort the operation. void Show(const Message_ProgressScope &scope, const bool force) : you ignore the arguments, and call GetPosition() (implemented by OpenCascade) to work out the percentage to display. It returns a float from 0 to 1 corresponding to progress from 0% to 100%. And then when you call into OpenCascade it wants you to pass it a Message_ProgressRange . You get that from Message_ProgressIndicator.Start() (implemented by OpenCascade). So that's all you need to know to be able to interrupt OpenCascade and show a progress indicator. I don't actually think in Object-Oriented, and at first I thought I was meant to instantiate OpenCascade's Message_ProgressIndicator and call UserBreak() when the user clicks "abort". This obviously did nothing and I initially thought it just didn't work. OpenCascade documentation: Message_ProgressIndicator Message_ProgressRange Message_ProgressScope "ProgressRange" and "ProgressScope" are to do with allowing sub-operations to report progress of their sub-operation, and have it show up correctly as the appropriate percentage of the overall total operation. I never had to look at it, I just pass my progressIndicator.Start() into OpenCascade at the top level and let the scope/range stuff happen internally to OpenCascade. But if you needed to do multiple slow OpenCascade operations you might want to use it. There is some example code, from the OpenCascade perspective, on Kirill Tartynskih's blog . II. 
My Pull Requests I have implemented three different solutions: Attempt 1: worker in child process "When in doubt, fork it out" -- ChatGPT Serialise the inputs and a description of the operation. Send the serialised description over a pipe to a worker process. The worker process sets the calculation going with OpenCascade, meanwhile the main process displays an "abort" dialog. If you click "abort" the child process gets killed, otherwise the child process eventually finishes, serialises its results, and sends them to the main process. Pros Cons Can interrupt OpenCascade without consent (i.e. even if it's not checking UserBreak() ) Every different type of OpenCascade operation needs specific support (I only implemented "BooleanOperation", which covers a lot of it but not everything) Serialising/deserialising the inputs and outputs is potentially slow Windows can't do fork() , pipe() , etc. so would need a separate implementation This was my first attempt. I did this before I found out that OpenCascade provides the UserBreak() mechanism, and I thought the only possible way to interrupt OpenCascade was to run it in a separate process and kill the process. I was so delighted when I first got this working that I recorded a video demonstration . I only implemented support for "Fuse" and "Cut" boolean operations. That does get you most of what real-life CAD usage is focused on, but it's not everything. In particular helix, pipe, loft, fillet, chamfer, etc. aren't covered by this and can all be slow, so they would want specific support as well. Attempt 2: worker in thread Run the calculation on a worker thread in the main process, and the abort dialog on the main thread. Cross your fingers and close your eyes to the potential unsafe Qt usage. Skip the worker thread and revert to status quo if main thread holds the GIL. Pros Cons No need for a child process Potential unsafe Qt operations Revert to non-abortable status quo if GIL is already held I did this after I was pointed to the OpenCascade documentation, and Kirill Tartynskih's blog post, and discovered that not only does OpenCascade provide a way to let you abort, but it also reports progress percentage. Jackpot! I did find that my implementation here actually does cause segfaults due to the unsafe Qt usage, although not on every operation. In particular it seemed perfectly stable when I was doing "normal" CAD operations, but I found that if I assigned an "alias" to a cell in the Spreadsheet workbench, it would instantly segfault every time on the no-op recompute that is triggered when you apply the alias. So I don't know the specific reason that that matters, but I do know that it is unsafe and unstable. I also tried checking whether the main thread currently holds the GIL, and if so releasing it, re-acquiring it in the worker thread, and undoing that after the operation is done, but I found that that caused segfaults. I don't know why. Possibly I was doing it wrong but if so I couldn't work out how to do it right. So I just settled for skipping the abort dialog if the main thread already holds the GIL, at least that way you sometimes get to abort. Attempt 3: dialog in child process Start a child process to show the "abort" dialog, run the calculation on the main thread in the main process, and have the child process send a SIGHUP if the user clicks "abort", and have the main process's SIGHUP handler tell OpenCascade to stop working. 
Pros Cons Computation on main thread without need for serialising/deserialising Still need a separate implementation for Windows Main process throws up "FreeCAD is not responding" warnings I think this is basically as good as you can do. There are no regressions, because in every case that this triggers "FreeCAD is not responding", the existing FreeCAD codebase also triggers that. But it brought to everyone's attention the fact that FreeCAD is doing blocking operations on the main thread, which is obviously not what anybody wants. III. What next? None of my implementations have been acceptable, and I don't think any of them are going to get merged. I believe that the FreeCAD maintainers believe that it would be better if FreeCAD did recomputes on a separate thread, in a safe way. We could split off the worker thread at a level that won't have to call back into Python (maybe?), and we could make sure all Gui operations are scheduled on the main thread with QMeta::invokeMethod() . That would then allow the main thread to stay responsive and allow providing a progress bar and abort dialog on the main thread. I agree that that would be better, of course. But the FreeCAD codebase is over a million lines of C++. Would you like to audit all of that code to make sure all of the Gui operations that can conceivably be called into via a recompute are using invokeMethod() like good citizens? Would you like to make sure none of it can call into Python code? I don't think it's going to happen. In the mean time I think this is a classic case of Perfect being the enemy of Good. I think FreeCAD is strictly a better program if we are able to abort slow operations than if we are not, and I think my third implementation has no regressions versus the status quo. But I'm firmly at the "hacker" end of the hackers/builders/maintainers spectrum. Maintainers do us a great service by keeping FreeCAD working over the decades. Everyone has their own pet feature that they'd love to hack on to FreeCAD, but it is only thanks to the diligence of the maintainers that FreeCAD is still alive at all. So let's not hate on the maintainers too much. So I think an ideal solution would look like my second pull request, except with FreeCAD re-architected so that if the recompute needs to call into Gui code or Python code then it does so by some message-passing to the main thread, and then also every code path that can trigger a recompute needs to accept that the recompute is now asynchronous. If you read this far: maybe you can bring this feature into FreeCAD! If so, please get in touch, I want to help you. I have done this kind of "synchronous-to-asynchronous" refactor before. In fact at one customer I have done "the same one" 3 times because turning a synchronous application into an asynchronous one is such a monster change that it is basically impossible to be confident that it's correct, and by the time anyone gets the confidence to actually merge it the codebase has diverged so much that you need to start again. For this reason I don't think this will ever happen for FreeCAD, but I'd love to be proven wrong. It is possible that everything that triggers a recompute is already happy for it to be asynchronous, but personally I doubt it. The trick is to break it down into small incremental changes that can move parts of the application towards being asynchronous without ever leaving it in a broken state. If you try to do the entire thing at once it will never get merged.

0 views
James Stanley 7 months ago

The world's stupidest AppImage

AppImage packages an application and its dependencies into a single file. You can then just execute that single file, and it will extract everything into a temporary directory and hook everything up so that the application works. Very easy to use, very convenient. Unfortunately, it's not so straightforward to make an AppImage . That's where my project comes in. I've been trying to make some changes to FreeCAD recently, and as part of that I've been building FreeCAD inside a distrobox so as not to clutter up my main filesystem with FreeCAD's extensive build dependencies. However I found that once I had compiled the binary, it didn't work properly when running inside the distrobox. The GUI would load up, but anything involving a 3d display just didn't appear. I suspected that the distrobox environment didn't have access to the GPU, so I tried executing the FreeCAD binary from the "host" environment, but it didn't even start up because of all the missing dependencies. So instead I got ChatGPT to write me "the world's stupidest AppImage builder". It is a shell script that copies your binary, its dynamically-linked dependencies according to ldd , and any additional files you specify, into a single self-extracting shell script. You can then execute the shell script from the host system, and it extracts the binary and all of its dependencies into a temporary directory, and then launches the binary with LD_LIBRARY_PATH set to the temporary directory. The script is here in case you want to play with it: https://gist.github.com/jes/543669252b2951e961614e2f2a873b22 . And then I can make my FreeCAD "lame AppImage" by: And launch it in the host environment: And this time it actually loads up instead of complaining about missing libraries! Not a total success however, because there are extra libraries that it tries to load at runtime so it doesn't totally work. But I wanted to share the stupid AppImage-like builder anyway. (A better solution for running FreeCAD inside a distrobox with an NVidia GPU is to create the distrobox with --nvidia , and that worked for me).

0 views
James Stanley 10 months ago

Timeline of Discovery

This evening I made a Timeline of Discovery , listing historical inventions, discoveries, events, etc. that I find personally interesting. I started doing it because I saw a project, Markwhen , on Hacker News, that turns a simple markdown-like format into HTML timelines and I wanted to try it out. It turned out not to be exactly what I wanted, because it makes a page that is "too interactive", and I couldn't work out how to input dates before 0 AD, so I got Cursor to write me a custom static-site generator instead. I found that Cursor picked up the input format very well, and sometimes just typing the year was enough for it to guess both the discovery I was going to write and the associated Wikipedia link! Which I then just press tab to insert. A fun game is to type in random years and see what it proposes, sometimes it is just bogus. Maybe my timeline could do with a log scale on the X axis, that would make it possible to add things further in the past without leaving big blank sections. I kind of want to make a similar timeline but showing the life (birth to death) of interesting people from history, and with ranges like markwhen renders, instead of dots, to make it visually obvious whose life overlapped with whose. Also a similar timeline but for civilisation-level developments, something like the Histomap but going back further into the past, and with less detail.

0 views
James Stanley 11 months ago

Clock Gear Train Calculator

I made a Clock Gear Train Calculator this evening. This is something I've done with ad-hoc perl scripts in the past, but I wanted to use it again and couldn't find my previous scripts, so I thought this time it would be better to have a proper user interface on it. If you're designing a clock, you might find it useful. It lets you specify a range for the pinion sizes and wheel sizes, a number of shafts, and the target gear ratio, and then it runs the calculation in a web worker , with a progress bar, because it can take slightly too long to justify blocking the user interface. The search is a brute-force depth-first search, obviously there is a lot of overlapping subproblems so a dynamic programming solution would be better, but this is adequate for now. It allows you to specify a tolerance on the gear ratio. Usually you will want an exact ratio, but that's not always possible. For example, the sidereal time complication on Breguet's No. 2894 approximates 1.0027379 as 51/82 * 79/49 = 1.00273768 (source: "The Art of Breguet" by George Daniels). This is the best you can do with 3 shafts with gears between 20 and 120 teeth, possibly much more. It also lets you export the calculated ratios as a CSV file, which I imagine will mostly be useful for integrating with other custom scripts. And, obviously, I got Cursor to write most of the code.
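For the curious, the search is simple enough to sketch in a few lines. This is a toy version rather than the calculator's actual code (the function and parameter names are invented), but it shows the brute-force depth-first search over tooth counts with a relative tolerance on the target ratio. Each pair of adjacent shafts contributes one wheel/pinion stage, so 3 shafts means 2 stages:

```javascript
// Toy brute-force gear train search (illustrative only, not the calculator's code).

function findGearTrains({ target, tolerance, stages, wheel, pinion, maxResults = 10 }) {
  const results = [];

  function search(stage, ratio, chosen) {
    if (results.length >= maxResults) return;
    if (stage === stages) {
      // Accept the train if the overall ratio is within the relative tolerance.
      if (Math.abs(ratio / target - 1) <= tolerance) {
        results.push({ ratio, train: [...chosen] });
      }
      return;
    }
    // Depth-first over every wheel/pinion tooth count allowed for this stage.
    for (let w = wheel.min; w <= wheel.max; w++) {
      for (let p = pinion.min; p <= pinion.max; p++) {
        search(stage + 1, ratio * (w / p), [...chosen, { wheel: w, pinion: p }]);
      }
    }
  }

  search(0, 1, []);
  return results;
}

// The Breguet sidereal example: 2 stages, e.g. 51/82 * 79/49 ≈ 1.0027377.
// (Keep the tooth ranges modest or the search space explodes combinatorially.)
const trains = findGearTrains({
  target: 1.0027379,
  tolerance: 1e-6,
  stages: 2,
  wheel:  { min: 45, max: 85 },
  pinion: { min: 45, max: 85 },
});
console.log(trains[0]);
```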

0 views
James Stanley 11 months ago

The Principles of Mr. Harrison's Time-keeper

This post is a transcription, plus some commentary, of the Board of Longitude's 1767 document "The Principles of Mr. Harrison's Time-keeper", from scans on the University of Cambridge Digital Library . I was trying to learn more about the escapement in H4, and I came across a great article on "watchesbysjx": In-Depth: The Microscopic Magic of H4, Harrison's First Sea Watch , from which the scans were linked. I have turned the English pages of the scan into a PDF for more convenient reading, you can get it here: The Principles of Mr. Harrison's Time-keeper . Sadly the first couple of pages from the Preface, and the last couple of figures, are missing from the scan and therefore from my PDF. (Btw, if you're into this sort of stuff, I scanned in my copy of Rupert Gould's John Harrison and his Timekeepers recently - Gould rediscovered and restored Harrison's clocks in the early 20th century, and he also wrote "The Marine Chronometer: Its History and Development", which is worth a read but perhaps not at the £615 that Amazon are currently charging). The original from 1767 is typeset in the style of the time, including use of the "long s" , and "vv" instead of "w", but it is quite readable once you get used to it. In this post I have updated spelling to match modern usage, and turned Roman numerals into normal digits, but left sentence structure and capitalisation the same. The document is split into 2 parts. First we have "Notes taken at the Discovery of Mr. Harrison's Time-keeper" by Nevil Maskelyne , Astronomer Royal. Then we have "Principles of Mr. Harrison's Time-keeper" written by John Harrison himself, which ends with notes on the figures followed by plates containing the figures, which I've rearranged in this post to make it easier to read. The text in blockquotes is my transcription of the original, the rest is my commentary. Notes taken at the Discovery of Mr. Harrison's Time-keeper This part is written by Nevil Maskelyne, who was opposed to the use of a clock for determining longitude, and who preferred the lunar distance method, although his notes here appear perfectly impartial. The "Discovery" was the event at which Harrison took the watch apart and explained it. It began on Wednesday 14 August 1765 and took place at Harrison's house in Red Lion Square, observed by Maskelyne, plus three watchmakers ( Larcum Kendall , Thomas Mudge , and William Matthews), two reverends ( William Ludlam and John Michell ), and astronomer and instrument maker John Bird (source: "Harrison" by Jonathan Betts, highly recommended). The balance naturally vibrates largest arcs, when in a horizontal position; next greatest, when the hours 12 and 6 are uppermost, and the watch is in a vertical position; least, when the hours 3 and 9 are uppermost. Large arcs are naturally performed in less time than small ones. This Mr. Harrison inferred, because the watch, before any correction was applied, went slower in a vertical position than in the horizontal one, and the vibrations are visibly larger in the latter case. The watch is adjusted to vibrate great and small arcs in equal times, in the following manner: To go the same when placed vertically with the hours 3, 6, 9, 12 upwards successively, by making the weight of the balance different in different parts.—To go the same when placed horizontal as when vertical, by the joint effect of the back of the pallets and the cycloid-pin. I don't see how making the weight of the balance different in different parts can affect how the watch runs in different positions. 
If you keep the centre of mass in the centre (which you should, else it's not balanced) and the moment of inertia the same (which you should, else it will run at a different rate) then I don't see how changing the distribution of weight around its perimeter has any effect. Is it Harrison, Maskelyne, or me that is wrong here? The curve of the back of the pallets is an arc of a circle, whose centre lies in the line joining the edges of the pallets and the centre of the spindle, the distance of the two centres being two fifths, and the radius of the curve of the pallets three fifths of the radius of the circle described by the edge of the pallets. The description here suggests that the backs of the pallets are circular, but elsewhere they are described as cycloidal. There is a good description of the escapement on the watchesbysjx article. The action of the cycloid-pin, when it touches the balance-spring, tends to quicken its vibrations; and the spring, leaving the pin for a longer time in the large vibrations than in the small ones, is less accelerated by it in the former than in the latter case; and, consequently, the action of the pin tends to reduce the time of the different vibrations nearer to equality. The cycloid-pin was not applied to the watch until after it came back from the voyage to Jamaica. If the balance-spring is too strong, it must be made weaker by rubbing it away a little; but, if it be too weak, it must be changed for a stronger. The balance-spring is fastened at the outer end to a stud, which takes off the plate with a screw, and is put on again with the same screw, and steady-pins, exactly in the same position as before, without undoing the fastening of the spring to the stud at the end. There is no adjustment for mean time, as in common watches; there was once, but it did not answer. "Mean time" here is as distinct from solar time , which people of the 1700s would probably have been more familiar with. "Did not answer" presumably means that it didn't help with keeping time. "Common watches" provided an adjustment to speed up or slow down the watch so that it would indicate the passage of about 1 day per day, but Harrison's watch had no such provision. This speaks to the purpose of the watch. To allow a human operator to determine longitude at sea, the operator had to be able to determine the time accurately. It was more important that the watch run at a known rate than that the rate be 1 day per day. Once the rate is known it is easy enough for the operator on board a ship to apply the rate to determine the true mean time. (If the watch loses 10 seconds per day, and 10 days have passed, then it's lost 100 seconds). If adding a rate adjustment would upset the other adjustments, then in the context of a marine chronometer, it is preferable to forgo the rate adjustment and keep the known rate rather than to have the watch run at approximately mean time, but with larger errors. As soon as the watch is put together, Mr. Harrison says, it will show its rate of going in three hours accurately the same which it will keep afterwards; so that he can soon determine it by comparison with his pendulum clock. The balance-spring, when at rest, touches the cycloid-pin; and does not begin to leave it, until the balance has vibrated an arc of forty five degrees beyond the point of rest, while the spring is in the state of coiling itself up.
The thermometer curb is composed of two thin plates of brass and steel riveted together in several places, which, by the greater expansion of brass than steel by heat, and contraction by cold, becomes convex on the brass side in hot weather, and convex on the steel side in cold weather; whence, one end being fixed, the other end obtains a motion corresponding with the changes of heat and cold, and the two pins at this end, between which the balance-spring passes, and which it touches alternately as the spring bends and unbends itself, will shorten or lengthen the spring, as the changes of heat and cold would otherwise require to be done by the hand, in the manner used for regulating a common watch. Mr. Harrison requires cold weather for adjusting the thermometer curb, and he places the watch near a fire, with a common thermometer by it, to try if it keeps the same time as in the cold air. If not, he alters or adjusts the thermometer curb until it goes the same in these two different degrees of temperature of the air. The thermometer curb takes heat sooner than the balance-spring, and he thence concludes that brass takes heat sooner than steel, and that the brass rods of a gridiron pendulum should be made thicker than the steel ones. Agreed. Brass has both a lower specific heat capacity than steel and higher thermal conductivity. Good idea about making the brass rods thicker than the steel ones in a gridiron pendulum. Whilst the heat is increasing, the watch will sometimes go one tenth of a second slower in three hours, than when the heat is come to a stand. It's remarkable that Harrison was able to measure this. What he's saying is that although the temperature compensation works correctly in a steady state, the difference between the rate at which brass and steel "take heat" means the temperature compensation is "wrong" while the temperature is changing . The effect of the thermometer is increased by rubbing the sides thinner, and is lessened by thickening the edge by burnishing it. I don't see how burnishing the edge makes it thicker. Does he mean by squashing it in the perpendicular axis to thicken it in the axis that matters? Mr. Harrison adjusts the thermometer curb first, that is to say, before he adjusts the watch to go the same in different positions. The watch may be put with figure 12 turned each day alternately different ways, for fear one part of the box in which it is kept may be hotter than the other. This is advice for how the watch might be used on a ship: turn it around every day so that the temperature difference across the watch is averaged out. The force or momentum of the balance, Mr. Harrison says, is as the square of its diameter, also as the square of the velocity, its weight being given. Agreed. For a given angular velocity, the angular momentum of the balance is proportional to its moment of inertia, which for a "cylindrical shell" is I = mR². And for a given moment of inertia, the kinetic energy is proportional to the square of the angular velocity, K = ½Iω². The momentum of the balance acquired by increasing the velocity is better than that acquired by increasing the weight; as friction is not thereby increased, perhaps, if any thing, diminished, and the resistance of the air only is increased, the effect of which is tolerably uniform, and of great service. The idea of air resistance being a great service is peculiar to Harrison.
Most watch and clockmakers, before and since, have seen air resistance as a source of error to be eliminated. Harrison liked air resistance for two reasons. Firstly, you want a good amount of inefficiency in the balance, so that when the motion of the balance is disturbed, it very quickly gets back into equilibrium. If the balance hardly loses any energy during a cycle, then the escapement must be designed to hardly provide any energy during a cycle, so disturbances stay around for longer. And secondly, if the clock can be provoked to run faster in longer arcs, then you get some temperature compensation "for free", because when the temperature increases, the air density decreases, which means there is less air resistance, so the balance swings further which makes it run faster; but also the increase in temperature makes everything expand, which increases the moment of inertia of the balance, which makes it run slower. This second effect relies on the clock having sufficient influence from air resistance, which I'm not sure applied in the case of H4, but was definitely used by Harrison on pendulum clocks (source: "Harrison Decoded" by Rory McEvoy and Jonathan Betts). The diameter of the balance is 2.2 inches, of the plate 3.8 inches. The balance should be a little larger, or 2.25 inches, according to a memorandum taken by Mr. Bird. The watch makes just five beats in a second of time. If the balance vibrated faster, the resistance of the air would be too great. A pocket watch of this kind would do better with six beats in a second. ?? Make your mind up. A certain size is best for the pallets, or rather a certain proportion between the diameter of the circle described by the edge of the pallets and the diameter of the balance-wheel. This was first suggested to Mr. Harrison from bell-ringing; for he could bring the bell better into a motion, by touching it from time to time somewhere near the centre than near the circumference; because in the first case his hand moved quick enough to follow the bell. I think this is quite clumsily worded. What he's saying is that the motion of the pallets needs to be small enough that the escape wheel is able to accelerate fast enough to interact with them. So the ratio of the diameter of the pallets to the balance wheel is not really what you care about. The grand principle of the watch is that of giving the greatest motion possible to the balance with a given force. This is done by the scaping and proper quantity of the arc described. This note was communicated by Mr. Mudge, as also the following; That the balance, by the force from the wheels, without its spring, tends to vibrate once in two seconds. There are four springs in the watch; first, a main spring; secondly, a spring in the inside of the fusee, to keep it going while it is winding up; thirdly, a spring, which is wound up eight times every minute; fourthly, the balance-spring. The three first were made by Maberley. The fusee has six turns and a quarter. The fly serves to moderate the velocity with which the spring nearest the balance would otherwise be wound up. This is talking about the remontoire , which we'll see in more detail in Harrison's part. The "fly" is basically an air brake. It has vanes that stick into the air and meet with air resistance as it spins around. The pivot-holes are all made in rubies, with diamonds at the ends. The pallets are diamonds. 
It is apparently still unknown how Harrison managed to shape the tiny and precise radii on the diamonds (source: "Harrison" by Jonathan Betts) (but, as a point of philosophy: that fact shouldn't motivate any particular curiosity on your part - if you don't know either way, what difference does it make whether anyone else knows? you should be no less curious about all the things that other people do know!). One end of the watch in the late voyage to Barbados was set higher, because it was not equally adjusted in all positions. Also it was altered and brought back to the same position, with respect to the horizon, as the ship lay down on the one or the other tack, by the help of a moveable box, with a divided arch. I'm not quite sure what he means by a moveable box with a divided arch. What is a divided arch? Elsewhere the word "arch" is used to mean "arc" and I have transcribed it as "arc", but in this case since I'm not sure of the meaning I left it alone. In any case, what's a divided arc ? How does it help to bring the watch to the same position with respect to the horizon? I'm thinking it must be either a gimballed mount, or something that lets the human operator compare the angle of the box to the horizon (by comparing the horizon to the divisions on an arch?) and adjusting it manually. Mr. William Harrison reckons the greatest roll of a ship fifteen degrees, and the greatest lie-down, when going upon one tack, twelve degrees. Hold the watch a little back, when in a vertical position, that the face may be a little up. If the balance-spring be not exactly parallel to the plates, there will be a small difference in the going of the watch, when the face is up or down. I guess this is because the balance spring would then be pulling the balance against one or other of the bearing caps, which would either be mitigated or exacerbated depending on which way up it sits? Care is to be used in moving the watch, or in turning it about, in order to wind it up, not to give it any quick circular motion in the plane of the balance, as it might possibly stop it. A pocket-watch, which Mr. Harrison has made of this kind, once stopped this way. Turn the watch over upon some diameter of the dial-plate, as an axis, in order to bring it into a convenient position, when you want to wind it up. It's interesting that this was already a problem in 1767. This is a problem that was only to get worse as the chronometer was developed further. It is quite difficult to disturb a watch so as to take almost exactly all of the energy out of the balance, at a point where it is near the centre, and thereby bring it to a complete stop. But the later chronometer escapement was vulnerable to tripping , which is where a disturbance instead adds energy to the balance, sufficient to let it do another "lap" and gain a 2nd impulse from the escapement. Thereafter, due to the extra energy that keeps getting maintained by the 2nd impulse, the balance will do a second lap on every cycle and the clock will run twice as fast as it should. (source: "The Marine Chronometer: Its History and Development" by Rupert Gould). Oil must be applied to the pallets and pivot-holes of the watch, but very sparingly. This is anathema to Harrison, but apparently his anti-friction wheels couldn't be satisfactorily miniaturised for the watch (source: "Harrison" by Jonathan Betts). At the time of the discovery, in August 1765, Mr. Harrison said, that the watch then went a little slower than it had done, owing to its wanting to be cleaned, viz.
two or three seconds per day. The "discovery" here is again the event where Harrison explained the watch. The watch should have a cap, and no outer case, the wooden box, in which it should be kept, serving that purpose better. I'm not sure what this means. The watch is pair-cased; is Maskelyne saying that the outer case should not be used? Principles of Mr. Harrison's Time-keeper Now we get to Harrison's part. In "Harrison", Betts writes that this is "arguably the most seminal of all publications in the history of the chronometer". In this Time-keeper there is the greatest Care taken to avoid Friction as much as can be, by the Wheels moving on small Pivots, and in Ruby-holes, and high Numbers in the Wheels and Pinions. The Part which measures Time goes but the eighth Part of a Minute without winding up; so that Part is very simple, as this Winding-up is performed at the Wheel next to the Balance-wheel; by which Means, there is always an equal Force acting at that Wheel, and all the rest of the Work has no more to do in measuring Time, than the Person that winds them up once a Day. This is talking about the remontoire, the winding of which is slowed down by the fly that Maskelyne mentioned. The big idea here is that you can almost totally forget about variations of torque due to the mainspring winding down, or changes in oil thickness etc. in most of the gear train of the watch. You only really care about the parts from the remontoire to the balance, it is like a self-contained clock that only runs for 8 seconds between automatic windings. There is a Spring in the Inside of the Fusee, which I will call a secondary Main-spring. This Spring is always kept stretched to a certain Tension by the Main-spring, and during the Time of winding up the Time-keeper, at which Time the Main-spring is not suffered to act, this secondary Spring supplies its Place. Without this "secondary mainspring", the fusee does not transmit any power to the gear train while the clock is being wound, because the act of winding it takes the spring tension away from the train. In common Watches in general the Wheels have about One-third the Dominion over the Balance that the Balance-spring has; that is, if the Power the Balance-spring has over the Balance be called Three , that from the Wheels is One ; but in this my Time-keeper, the Wheels have only about One-eightieth Part of the Power over the Balance that the Balance-spring has; and it must be allowed, the less the Wheels have to do with the Balance, the better. The Wheels in a common Watch having this great Dominion over the Balance, they can, when the Watch is wound up, and the Balance at rest, set the Watch a-going; but when my Time-keeper's Balance is at rest, and the Spring is wound up, the Force of the Wheels can no more set it a-going, than the Force of the Wheels of a common Regulator can, when the Weight is wound up, set the Pendulum a-vibrating; nor will the Force from the Wheels move the Balance, when at rest, to a greater Angle in Proportion to the Vibration that it is to fetch, than the Force of the Wheels of a common Regulator can move the Pendulum from the Perpendicular, when it is at rest.
My Time-keeper's Balance is more than three times the Weight of a large sized common Watch-balance, and three times its Diameter; and a common Watch-balance goes through about six Inches of Space in a Second, but mine goes through about twenty-four Inches in that Time: So that had my Time-keeper only these Advantages over a common Watch, a good Performance might be expected from it. But my Time-keeper is not affected by the different Degrees of Heat and Cold, nor Agitation of the Ship; and the Force from the Wheels is applied to the Balance in such a Manner, together with the Shape of the Balance-spring, and (if I may be allowed the Term) an artificial Cycloid, which acts at this Spring; so that from these Contrivances, let the Balance vibrate more or less, all its Vibrations are performed in the same Time; and therefore, if it go at all, it must go true . So that it is plain from this, that such a Time-keeper goes entirely from Principle, and not from Chance. The following is a Description of the Drawings from which my fourth Time-keeper was made, and the Drawings are also hereunto annexed. AA is the Chain-barrel, and BB is a Section of it. CC is the Spring-barrel, and DD is a Section of it. EE is a Ratchet at the Spring-barrel, and FF is a Section of it. This Ratchet is screwed to the Spring-barrel by four small Screws at aaaa . There is a hole in the Pillar-plate of the Diameter from the dotted Lines bb , and that Part of the Spring-barrel cc is to move in this Hole without any Shake, in order to set the Spring up. The Ratchet is also shown in Figure 13th, by the Circle bb , and it has thirty Teeth, and c is the Click that holds it. Diameter of the spring Arbor: about 1.64 of 1/4 Inch. Diameter of the Hole in the Centre of the Chain Barrel about: 0.38 Diameter of the upper pivot: 0.23 Diameter of the lower: 0.215 Diameter of the spring Barrel within: 1.4 Inch. AA is the Brass Edge, BB the Hole in the Middle of it, and CC is a Section of it. This Brass Edge is supported by six Pillars, and their Places are represented in Figures 13 and 14, by six Circles aaaaaa . AA represents the second Wheel acting in a Pinion at a . BB represents the third Wheel, which is concave, and acts in a Pinion at b . The second Wheel is described in Figure 14 by the Circle dd , and acts in a Pinion of 18 at e . The third Wheel is represented in Figure 14 by the Circle ff , acting in a Pinion of 16 at g . Note , The third Wheel is larger than is represented in Figure 14, and has 144 Teeth, and the second Wheel has 120 Teeth. AA represents the contrate Wheel, BB a Section of it, with a Section of the Spring-barrel aa . At cc is a Piece with eight Pins in it, that discharges the running Wheels every eighth Part of a Minute. This Wheel is also represented in Figure 14 by the Circle hh ; it has 120 Teeth, and acts in a Pinion of 12 at i . Thickness of the Rim about 0.048 of 1/4 inch. The discharger and wheel for the seconds, must be a little nearer the dial plate than according to this drawing so that the tops of the pins of the discharger may be even with the plane of the pillar plate. The crosses of the wheel are also drawn too broad at the outer end. Diameter of the hole in the centre of the wheel about 0.23 of 1/4 inch. Diameter of that part of the spindle which goes through the fourth wheel arbor (thicker end) about 0.108 of 1/4 inch. Diameter of each pivot 0.045. Length of the spring 10 inches, its weight 3.5 grains. The second paragraph here reads like it was written by somebody else.
The capitalisation is unlike Harrison's. Perhaps he dictated that part at a separate time? AA is the first Wheel, and aaaa is a Section of it. bbb is a Section of the Fusee. BB is the outer Diameter of a Ratchet which is fixed to the Inside of the Fusee, and the inner Circle CC is its inner Diameter, and it has 55 Teeth in it. dddd is what I call the perpetual Ratchet, of which cccc is a Section; there is a Ratchet with 75 Teeth in it on that Part marked ff , this is also shown in Figure 13 by the Circle ee , and this perpetual Ratchet is to carry the Barrel DD, which Barrel contains the secondary Main-spring, and will be in the Inside of the Fusee at gg , and at that Part of the first Wheel hh the inner End of this Spring is to act, as that Part hh will be its Arbor. The dotted Lines E represent the Grooves in the Fusee. The dotted Lines ll represent the upper Plate. The dotted Lines mm represent the Pillar Plate. The dotted Lines nn represent the Cock, which carries the lower End of the Arbor of the first Wheel. This Cock is also represented in Figure 13 at dddd . The Ratchet ee in Figure 13 has two Clicks, whose Centres are at ff , and gg are the Springs which act at these Clicks. In Figure 14 bb represents the first Wheel with 96 Teeth, acting in a Pinion of 21 at c . A is a Section of the Frame, with the Balance-cock, the Slide, and the Brass Edge; and a is the Centre of the Joint-pin. B is a Section, where aa represents the Balance-cock, bb the third Wheel-cock. c the Cock at the End of the contrate Wheel. d the Cock at the End of the fourth Wheel. e the fourth Wheel. f the Follower. g the Balance-wheel. h the Potence. i the Balance-wheel Pinion. k the Counter-potence, which also carries the other End of the fourth Wheel. m the Spring-barrel. n the Hook in it, where the out End of the Spring hangs. o the Hook at the contrate Wheel, where the inner End of the Spring is hung. r the fifth Wheel, with the Pin where the Detent is to stop. S the upper Plate. T the Pillar-plate. Is the Detent, by which is the Discharge for winding up eight Times in a Minute. The Part a acts at the eight Pins on the contrate Wheel-arbor. b is a Roller acting against a Piece of Brass on the fifth Wheel-arbor. c is a Piece that stops against a Pin in the Rim of the fifth Wheel. dd are Pieces of Brass to make it in an Equilibrium with itself; and E is the Spring which acts upon it. The Centre of the Detent is at x in Figure 14. The note in French translates as: The part marked a in this fig. 7 is a little too short. aa are the Pallets of ten times the Size that they are in the Time-keeper. The dotted Lines from 24ths of the Circle, show the Power the Balance-wheel has to impede the Motion of the Balance by the Declivity on the Back of the Pallets, at any the same time whenever it shall have the greatest Power to give it Motion. Is to show the Proportion between the Balance, the Balance-wheel, the Balance-wheel Teeth, the Pallets, and at what Distance the Wheel acts from the Centre of the Balance. AA represents the Balance, BB the Balance-wheel, aa the Pallets, and bbbbbb the Balance-wheel Teeth. A is the Counter-potence, with the Follower a , and a small Screw at c , to stop when at its proper Place, and x is the Centre of the fourth Wheel. B is the Cock for the Minute-wheel. C is the Steel Bridge. D is a Cock for the contrate Wheel. E is a Cock at the first Wheel. F is a Cock at the contrate Wheel on the Pillar-plate. Is the Detent, which is to stop the Balance before the Watch be down.
It turns upon a Centre at t in Figure 14. A is the Locking-spring.

Diameter of the hole in the Socket (its wider end): about 0.19 of 1/4 inch.
Diameter of the upper pivot: 0.22.
Diameter of the lower pivot: 0.09.

AA represents the upper Plate. BB the Balance. aa the Thermometer. bb the Balance-spring. cc Slider to adjust the Thermometer end-way. d the Stud. e the artificial Cycloid. f a Piece to adjust it so as to bear properly against the Spring.

aaaaaa the Feet of the Brass Edge. bb the Ratchet at the Spring-barrel. c the Click. dddd the Cock at the End of the first Wheel. ee a Ratchet. ff the Centres of the two Clicks which act in it. gg the two Springs that act at them. hhhhhh the six Pillars of the Frame. ii the Steel Bridge. kkk two Wheels that carry the Seconds, one being on the contrate Wheel-arbor, the other moving on the Cannon-pinion. l the Cannon-pinion. mm the Minute-wheel. n the Hour-pinion. oo the Hour-wheel.

Sadly figures 14 and 15 are missing from the scans in the Cambridge Digital Library.

aaaaaa the six Pillars of the Brass Edge. bb the first Wheel. c the Centre-pinion. dd the second Wheel. e the second Pinion. ff the third Wheel. g the third Pinion. hh the contrate Wheel, and the fourth Wheel. i the Balance-wheel Pinion. k the fourth Pinion. ll the fifth Wheel. m the fifth Pinion. nn the Fly. oo the Balance-wheel. p the Potence. rrrrrr the six Pillars of the Frame. s the Stud. t the Centre of the Detent, to stop the Balance. x the Centre of the discharging Detent. uu the upper Plate. zz the Pillar-plate.

Sadly figures 14 and 15 are missing from the scans in the Cambridge Digital Library.

Is what was designed for the Work on the upper Plate, which is now done in the Manner as represented in Figure 12.

For tempering the Balance-spindle, the Balance-spring, and the Pinions. Before their being immersed in Metal (as just melted) let them be oiled over. The Heat for the Balance-spindle 567 on Fahrenheit's Scale, the which is given by one of Pewter to 12 of Lead; but for the Balance-spring and the Pinions, let the Mixture be One of Pewter to 17 of Lead.

Each Turn of the first Wheel (or Fusee) is 4 4/7 Hours; so 5 1/4 of its Turns is just 24 Hours; and 6 1/4 is 28 4/7 Hours; and 6 9/16 Turns equal to 30 Hours.

Although the figure is missing, this is one of the most fascinating passages in the entire document, and at any rate doesn't appear to relate to the figure. Harrison is achieving very precisely controlled heat treatment by melting defined alloys and then dunking his parts into the molten metal "as just melted". The molten metal is guaranteed to be at approximately its melting point if it has only just melted, so he achieves a very repeatable heat treatment this way. And by changing the ratio of lead to pewter he changes the melting temperature of the alloy, gaining continuous-valued control over the temperature. On top of that, the part is kept in an oxygen-free environment for the bulk of the time that it is hot. Ingenious! I'm not sure what the oil is for. To prevent the lead mixture from sticking, perhaps?
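The fusee arithmetic, at least, checks out exactly. Here's a quick sanity check using Python's fractions module (just a modern verification of the numbers quoted above, nothing from Harrison's text):

```python
from fractions import Fraction

# One turn of the fusee is 4 4/7 hours = 32/7 hours.
turn = Fraction(4) + Fraction(4, 7)

print(turn * (Fraction(5) + Fraction(1, 4)))    # 24    -> 24 hours
print(turn * (Fraction(6) + Fraction(1, 4)))    # 200/7 -> 28 4/7 hours
print(turn * (Fraction(6) + Fraction(9, 16)))   # 30    -> 30 hours
```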

James Stanley 11 months ago

My diaphragm piston air engine with rotary valve

I made an air engine with a diaphragm piston and a rotary valve. It was inspired by a video from Robert Murray-Smith, but improves on the design. Here is a video clip showing my engine:

If you want the CAD files, you can get them on github.

Compared to making my Wig-Wag, this engine was a lot less work, despite the unconventional design. I went from the idea in my head to a working engine in a single day thanks to the power of 3d printing. And that includes several iterations of some parts when my first attempt didn't work.

The "diaphragm" is a finger from a rubber glove. The piston fits inside the finger, and the base of the finger is clamped to a tapered surface on the outside of the cylinder by a nut, so that it forms a perfect seal. This means the piston can be a very loose fit inside the cylinder bore without preventing it from sealing. The diaphragm rolls up and down the outside of the piston as the engine runs.

The "rotary valve" is an angled hole in the crankshaft that alternately connects the top of the cylinder to the exhaust and inlet ports. The "pipes" are integrated into the base plate.

The first improvement of my design over Robert's is that my rotary valve is integrated in the crankshaft instead of being a separate component geared off the crankshaft. The second improvement is that there is only one pipe going from the valve to the cylinder, which saves wasting energy on pressurising the pipe that isn't connected to anything. Although the diaphragm means that the piston seal is totally leak-free, my rotary valve doesn't work anywhere near as well: it leaks quite badly.

(A further improvement would be to move the crankshaft much closer to the cylinder so that we don't even have one pipe to pressurise - I have an idea to do this that involves putting the crankshaft at the top of the cylinder and connecting it with a funny-shaped con rod, maybe I'll try that one day).

I find that a good way to make 3d-printed flywheels is to put radial holes in them and then screw in bolts. This has the great advantage that you can screw them in/out to finely adjust the balance. (Not that I bothered for this engine).

The best way to put threads in 3d-printed parts is to design the hole to the nominal size of the tap drill for the thread (so for M6 this is a 5mm hole), then tap it with a cordless drill. The plastic shrinks back in once it's been tapped, which is useful for a light self-locking thread; or if you don't want that, let the plastic cool down and then re-tap it a couple of times to free it up. This is less hassle in CAD than modelled threads, and less hassle in post-processing than heat-set inserts, although I admit that heat-set inserts look cooler. It feels like tapped threads would be weaker than inserts, but they seem perfectly fine in practice, and they have the benefit that you can make the threads as deep as you can tap, versus inserts that come in a fixed size. (The tap drill rule of thumb is sketched at the end of this post.)

This is what Robert's engine looks like:

We see it running in close-up in his video. He is running it off a vacuum cleaner since he doesn't have an air compressor, but the principle is the same (your supply pressure comes from the atmosphere, and your exhaust goes into the vacuum cleaner). His engine runs remarkably slowly and doesn't even appear to slow down significantly on the return stroke; mine definitely has too much friction and inefficiency to run that slowly, even with my much heavier flywheel.
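For the tap drill sizes, the usual rule of thumb for metric coarse threads is "major diameter minus pitch". A minimal sketch (standard coarse pitches only, nothing specific to this engine):

```python
# Rule of thumb for metric coarse threads: tap drill ≈ major diameter - pitch.
COARSE_PITCH_MM = {"M3": 0.5, "M4": 0.7, "M5": 0.8, "M6": 1.0, "M8": 1.25}

def tap_drill_mm(size: str) -> float:
    major = float(size[1:])              # "M6" -> 6.0 mm nominal major diameter
    return major - COARSE_PITCH_MM[size]

print(tap_drill_mm("M6"))  # 5.0 -- the 5mm hole mentioned above
```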

James Stanley 11 months ago

Prompts as source code: a vision for the future of programming

I'm going to present a vision for the future of programming in which programmers don't work with source code any more. The idea is that prompts will be to source code as source code is to binaries.

In the beginning (I claim) there were only binaries, and without loss of generality, assembly language. (If you think binaries and assembly language are too far apart to lump together: keep up grandad, you're thinking too low-level; just wait until the further future where source code and binaries are too close together to distinguish!).

Then somebody invented the compiler. And now it was possible to write code in a more natural language and have the machine automatically turn it into binaries! And we saw that it was good. As hardware resources grew, the compilers' capabilities grew, and now the idea that there was programming before compilers is pretty weird to new developers. Almost no one is writing assembly language and even fewer write bare machine code.

Now take LLMs. If you create software using an LLM today, you probably give an initial prompt to get started, and then you refine the generated source code by giving follow-up prompts to ask for changes, and you never revisit your initial prompt. It's just a series of "patches" created by follow-up prompts. This is like programming by writing source code once, compiling it, and then throwing the source code away and working directly on the binary with incremental patches! Which is just obviously crazy.

So here's my outline for "prompts as source code":

- The prompts will be committed to git, the generated source code will not.
- The prompts will be big, and split across multiple files just like source code is now, except it's all freeform text. We just give the LLM a directory tree full of text files and ask it to write the program.
- The prompts will be unimaginably large by today's standards. Compare the size of the Linux or Firefox source trees to the total amount of machine code that had ever been written in the entire history of the world before the first compiler was invented. (To spell it out: the future will contain LLM prompts that are larger than all of the source code that humanity combined has ever written in total up to this point in time.)
- Our build system will say which exact version of the LLM you're using, and it will be evaluated deterministically so that everybody gets the same output from the same prompt (reproducible builds).
- The LLMs will be bigger than they are today, have larger context windows, etc., and as the LLMs improve, and our understanding of how to work with them improves, we'll gain confidence that small changes to the prompt have correspondingly small changes in the resulting program.
- It basically turns into writing a natural language specification for the application, but the specification is version-controlled and deterministically turns into the actual application.
- Human beings will only look at the generated source code in rare cases (how often do you look at assembly code today?). Normally they'll just use their tooling to automatically build and run the application directly from the prompts.
- You'll be able to include inline code snippets in the prompt, of course. That's a bit like including inline assembly language in your source code. And you could imagine the tooling could let you include some literal code files that the LLM won't touch, but will be aware of, and will be included verbatim in the output. That's a bit like linking with precompiled object files.
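To make that a bit more concrete, here is a very rough sketch of what the build tooling might look like. Everything in it is invented for illustration (the file layout, the pinned model name, and the generate_source() stub); it's not a real tool or API:

```python
# Hypothetical sketch of a "prompt compiler" driver -- illustration only.
from pathlib import Path

MODEL = "example-llm-v1"        # exact pinned model version, recorded in the build config
SEED = 0                        # fixed seed: same prompts + same model => same source code

PROMPT_DIR = Path("prompts")    # tree of freeform prompt files, committed to git
BUILD_DIR = Path("build/src")   # generated source code, not committed

def generate_source(prompts: dict[str, str], model: str, seed: int) -> dict[str, str]:
    """Stand-in for the deterministic LLM 'compiler'. A real implementation would send
    the whole prompt tree to the pinned model with deterministic sampling; this stub
    just emits a placeholder module per prompt file so the sketch runs end to end."""
    return {name.replace(".txt", ".py"): f"# generated from {name} by {model}, seed {seed}\n"
            for name in prompts}

prompts = {str(path.relative_to(PROMPT_DIR)): path.read_text()
           for path in sorted(PROMPT_DIR.rglob("*.txt"))}

for rel_name, code in generate_source(prompts, MODEL, SEED).items():
    out_path = BUILD_DIR / rel_name
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(code)
```

The point is just that the prompts plus the pinned model version are the whole source of truth, and the build output is disposable.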
Once you have a first version that you like, there could be a "backwards pass" where an LLM looks at the generated source code and fills in all the gaps in the specification to clarify the details, so that if you then make a small change to the prompt you're more likely to get only a small change in the program. You could imagine the tooling automatically running the backwards pass every time you build it, so that you can see in your prompts exactly what assumptions you're baking in.

That's my vision for the future of programming. Basically everything that today interacts with source code and/or binaries, we shift one level up so that it interacts with prompts and/or source code. What do you think?

Although we could make an initial stab at the tooling today, I feel like current LLMs aren't quite up to the job:

- context windows are too small for all but toy applications (OK, you might fit your spec in the context window, but you also want the LLM to do some chain-of-thought before it starts writing code)
- as far as I know, it's not possible to run the best LLMs (Claude, gpt4o) deterministically, and even if it was they are cloud-hosted and proprietary, and that is an extremely shaky foundation for a new system of programming
- you could use Llama 405b, but GPUs are too expensive and too slow
- we'd need the LLMs to be extraordinarily intelligent and able to follow every tiny detail in the prompt, in order for small changes to the prompt not to result in random bugs getting switched on/off, the UI randomly changing, file formats randomly changing, etc.
- I haven't quite figured out how you "update" the LLM without breaking your program; you wouldn't want to be stuck on the same version forever, this feels similar to but harder than the problem of switching PHP versions for example
