Posts in Julia (20 found)

Who Benefited from the Aisuru and Kimwolf Botnets?

Our first story of 2026 revealed how a destructive new botnet called Kimwolf has infected more than two million devices by mass-compromising a vast number of unofficial Android TV streaming boxes . Today, we’ll dig through digital clues left behind by the hackers, network operators and services that appear to have benefitted from Kimwolf’s spread. On Dec. 17, 2025, the Chinese security firm XLab published a deep dive on Kimwolf , which forces infected devices to participate in distributed denial-of-service (DDoS) attacks and to relay abusive and malicious Internet traffic for so-called “residential proxy” services. The software that turns one’s device into a residential proxy is often quietly bundled with mobile apps and games. Kimwolf specifically targeted residential proxy software that is factory installed on more than a thousand different models of unsanctioned Android TV streaming devices. Very quickly, the residential proxy’s Internet address starts funneling traffic that is linked to ad fraud, account takeover attempts and mass content scraping. The XLab report explained its researchers found “definitive evidence” that the same cybercriminal actors and infrastructure were used to deploy both Kimwolf and the Aisuru botnet — an earlier version of Kimwolf that also enslaved devices for use in DDoS attacks and proxy services. XLab said it suspected since October that Kimwolf and Aisuru had the same author(s) and operators, based in part on shared code changes over time. But it said those suspicions were confirmed on December 8 when it witnessed both botnet strains being distributed by the same Internet address at 93.95.112[.]59 . Image: XLab. Public records show the Internet address range flagged by XLab is assigned to Lehi, Utah-based Resi Rack LLC . Resi Rack’s website bills the company as a “Premium Game Server Hosting Provider.” Meanwhile, Resi Rack’s ads on the Internet moneymaking forum BlackHatWorld  refer to it as a “Premium Residential Proxy Hosting and Proxy Software Solutions Company.” Resi Rack co-founder Cassidy Hales told KrebsOnSecurity his company received a notification on December 10 about Kimwolf using their network “that detailed what was being done by one of our customers leasing our servers.” “When we received this email we took care of this issue immediately,” Hales wrote in response to an email requesting comment. “This is something we are very disappointed is now associated with our name and this was not the intention of our company whatsoever.” The Resi Rack Internet address cited by XLab on December 8 came onto KrebsOnSecurity’s radar more than two weeks before that. Benjamin Brundage is founder of Synthient , a startup that tracks proxy services. In late October 2025, Brundage shared that the people selling various proxy services which benefitted from the Aisuru and Kimwolf botnets were doing so at a new Discord server called resi[.]to . On November 24, 2025, a member of the resi-dot-to Discord channel shares an IP address responsible for proxying traffic over Android TV streaming boxes infected by the Kimwolf botnet. When KrebsOnSecurity joined the resi[.]to Discord channel in late October as a silent lurker, the server had fewer than 150 members, including “ Shox ” — the nickname used by Resi Rack’s co-founder Mr. Hales — and his business partner “ Linus ,” who did not respond to requests for comment. Other members of the resi[.]to Discord channel would periodically post new IP addresses that were responsible for proxying traffic over the Kimwolf botnet. 
As the screenshot from resi[.]to above shows, that Resi Rack Internet address flagged by XLab was used by Kimwolf to direct proxy traffic as far back as November 24, if not earlier. All told, Synthient said it tracked at least seven static Resi Rack IP addresses connected to Kimwolf proxy infrastructure between October and December 2025.

Neither of Resi Rack’s co-owners responded to follow-up questions. Both have been active in selling proxy services via Discord for nearly two years. According to a review of Discord messages indexed by the cyber intelligence firm Flashpoint, Shox and Linus spent much of 2024 selling static “ISP proxies” by routing various Internet address blocks at major U.S. Internet service providers.

In February 2025, AT&T announced that effective July 31, 2025, it would no longer originate routes for network blocks that are not owned and managed by AT&T (other major ISPs have since made similar moves). Less than a month later, Shox and Linus told customers they would soon cease offering static ISP proxies as a result of these policy changes.

Shox and Linus, talking about their decision to stop selling ISP proxies.

The stated owner of the resi[.]to Discord server went by the abbreviated username “D.” That initial appears to be short for the hacker handle “Dort,” a name that was invoked frequently throughout these Discord chats.

Dort’s profile on resi dot to.

This “Dort” nickname came up in KrebsOnSecurity’s recent conversations with “Forky,” a Brazilian man who acknowledged being involved in the marketing of the Aisuru botnet at its inception in late 2024. But Forky vehemently denied having anything to do with a series of massive and record-smashing DDoS attacks in the latter half of 2025 that were blamed on Aisuru, saying the botnet by that point had been taken over by rivals. Forky asserts that Dort is a resident of Canada and one of at least two individuals currently in control of the Aisuru/Kimwolf botnet. The other individual Forky named as an Aisuru/Kimwolf botmaster goes by the nickname “Snow.”

On January 2 — just hours after our story on Kimwolf was published — the historical chat records on resi[.]to were erased without warning and replaced by a profanity-laced message for Synthient’s founder. Minutes after that, the entire server disappeared. Later that same day, several of the more active members of the now-defunct resi[.]to Discord server moved to a Telegram channel where they posted Brundage’s personal information, and generally complained about being unable to find reliable “bulletproof” hosting for their botnet.

Hilariously, a user by the name “Richard Remington” briefly appeared in the group’s Telegram server to post a crude “Happy New Year” sketch that claims Dort and Snow are now in control of 3.5 million devices infected by Aisuru and/or Kimwolf. Richard Remington’s Telegram account has since been deleted, but it previously stated its owner operates a website that caters to DDoS-for-hire or “stresser” services seeking to test their firepower.

Reports from both Synthient and XLab found that Kimwolf was used to deploy programs that turned infected systems into Internet traffic relays for multiple residential proxy services. Among those was a component that installed a software development kit (SDK) called ByteConnect, which is distributed by a provider known as Plainproxies.
ByteConnect says it specializes in “monetizing apps ethically and free,” while Plainproxies advertises the ability to provide content scraping companies with “unlimited” proxy pools. However, Synthient said that upon connecting to ByteConnect’s SDK they instead observed a mass influx of credential-stuffing attacks targeting email servers and popular online websites. A search on LinkedIn finds the CEO of Plainproxies is Friedrich Kraft , whose resume says he is co-founder of ByteConnect Ltd. Public Internet routing records show Mr. Kraft also operates a hosting firm in Germany called 3XK Tech GmbH . Mr. Kraft did not respond to repeated requests for an interview. In July 2025, Cloudflare reported that 3XK Tech (a.k.a. Drei-K-Tech) had become the Internet’s largest source of application-layer DDoS attacks . In November 2025, the security firm GreyNoise Intelligence found that Internet addresses on 3XK Tech were responsible for roughly three-quarters of the Internet scanning being done at the time for a newly discovered and critical vulnerability in security products made by Palo Alto Networks. Source: Cloudflare’s Q2 2025 DDoS threat report. LinkedIn has a profile for another Plainproxies employee, Julia Levi , who is listed as co-founder of ByteConnect. Ms. Levi did not respond to requests for comment. Her resume says she previously worked for two major proxy providers: Netnut Proxy Network, and Bright Data. Synthient likewise said Plainproxies ignored their outreach, noting that the Byteconnect SDK continues to remain active on devices compromised by Kimwolf. A post from the LinkedIn page of Plainproxies Chief Revenue Officer Julia Levi, explaining how the residential proxy business works. Synthient’s January 2 report said another proxy provider heavily involved in the sale of Kimwolf proxies was Maskify , which currently advertises on multiple cybercrime forums that it has more than six million residential Internet addresses for rent. Maskify prices its service at a rate of 30 cents per gigabyte of data relayed through their proxies. According to Synthient, that price range is insanely low and is far cheaper than any other proxy provider in business today. “Synthient’s Research Team received screenshots from other proxy providers showing key Kimwolf actors attempting to offload proxy bandwidth in exchange for upfront cash,” the Synthient report noted. “This approach likely helped fuel early development, with associated members spending earnings on infrastructure and outsourced development tasks. Please note that resellers know precisely what they are selling; proxies at these prices are not ethically sourced.” Maskify did not respond to requests for comment. The Maskify website. Image: Synthient. Hours after our first Kimwolf story was published last week, the resi[.]to Discord server vanished, Synthient’s website was hit with a DDoS attack, and the Kimwolf botmasters took to doxing Brundage via their botnet. The harassing messages appeared as text records uploaded to the Ethereum Name Service (ENS), a distributed system for supporting smart contracts deployed on the Ethereum blockchain. As documented by XLab, in mid-December the Kimwolf operators upgraded their infrastructure and began using ENS to better withstand the near-constant takedown efforts targeting the botnet’s control servers. An ENS record used by the Kimwolf operators taunts security firms trying to take down the botnet’s control servers. Image: XLab. 
Because infected systems seek out the Kimwolf control servers via ENS, even if the servers that the botmasters use to control the botnet are taken down, the attackers only need to update the ENS text record to reflect the new Internet address of the control server, and the infected devices will immediately know where to look for further instructions.

“This channel itself relies on the decentralized nature of blockchain, unregulated by Ethereum or other blockchain operators, and cannot be blocked,” XLab wrote.

The text records included in Kimwolf’s ENS instructions can also feature short messages, such as those that carried Brundage’s personal information. Other ENS text records associated with Kimwolf offered some sage advice: “If flagged, we encourage the TV box to be destroyed.”

An ENS record tied to the Kimwolf botnet advises, “If flagged, we encourage the TV box to be destroyed.”

Both Synthient and XLab say Kimwolf targets a vast number of Android TV streaming box models, all of which have zero security protections, and many of which ship with proxy malware built in. Generally speaking, if you can send a data packet to one of these devices you can also seize administrative control over it.

If you own a TV box that matches one of these model names and/or numbers, please just rip it out of your network. If you encounter one of these devices on the network of a family member or friend, send them a link to this story (or to our January 2 story on Kimwolf) and explain that it’s not worth the potential hassle and harm created by keeping them plugged in.
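For readers curious how that resilience mechanism works in practice, here is a minimal sketch of the kind of ENS text-record lookup XLab describes Kimwolf performing to learn its current control server. This is illustrative only: it assumes a web3.py-style client, and the RPC endpoint, ENS name, and record key below are hypothetical placeholders, not values tied to the botnet.

```python
# Minimal sketch (assumptions: web3.py with its built-in ENS support; the
# endpoint, name, and record key are hypothetical placeholders).
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("https://ethereum-rpc.example"))  # any mainnet RPC endpoint

# ENS text records can hold arbitrary strings; a client that re-resolves the
# record periodically will pick up whatever the record's owner last published,
# which is why takedowns of individual control servers have little effect.
record = w3.ens.get_text("example-name.eth", "url")
print(record)
```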

Chris Coyier 3 months ago

Media Diet

📺 Wondla — 10/10 kids show. I was way into it. Post-apoc situation with underground bunkers (apparently Apple loves that theme) where, when the protagonist girl busts out of it, the world is quite different. The premise and payoff in Season 1 was better than the commentary vibe of Season 2, but I liked it all. Apparently there is one more season coming.

🎥 Downton Abbey: The Grand Finale — The darkest of the three movies? Weird. I love spending time in this world though so I was happy to be there. But honestly I was coming off a couple of day beers when I saw it in the theater and it put me in a weird mood and I should probably watch it again normally. How do proper movie critics review movies without their random current moods affecting the review?!

📕 Annie Bot — Sierra Greer is like, what if we turned AI into sex bots? Which honestly feels about 7 minutes away at this point. I’m only like half through it and it’s kinda sexy in that 50-shades kinda way where there is obviously some dark shit coming.

📔 Impossible People — Binge-able graphic novel by Julia Wertz about a redemption arc out of addiction. I’m an absolute sucker for addiction stories. This is very vulnerable and endearing. Like I could imagine having a very complicated friendship with Julia. It doesn’t go down to the absolute bottom of the well like in books like A Million Little Pieces or The Book of Drugs, so I’d say it’s a bit safer for you if you find stuff like that too gut wrenching.


how i use my terminal

this is a whole blog post because it is "outside the overton window"; it usually takes at least a video before people even understand the thing i am trying to describe. so, here's the video: the steps here that tend to surprise people are 0:11 , 0:21 , and 0:41 . when i say "surprise" i don't just mean that people are surprised that i've set this up, but they are surprised this is possible at all. here's what happens in that video: i got annoyed at VSCode a while back for being laggy, especially when the vim plugin was running, and at having lots of keybind conflicts between the editor, vim plugin, terminal, and window management. i tried zed but at the time it was quite immature (and still had the problem of lots of keybind conflicts). i switched to using nvim in the terminal, but quickly got annoyed at how much time i spent copy-pasting filenames into the editor; in particular i would often copy-paste files with columns from ripgrep, get a syntax error, and then have to edit them before actually opening the file. this was quite annoying. what i wanted was an equivalent of ctrl-click in vscode, where i could take an arbitrary file path and have it open as smoothly as i could navigate to it. so, i started using tmux and built it myself. people sometimes ask me why i use tmux. this is why! this is the whole reason! (well, this and session persistence.) terminals are stupidly powerful and most of them expose almost none of it to you as the user. i like tmux, despite its age, bugs, and antiquated syntax, because it's very extensible in this way. this is done purely with tmux config: and this is the contents of : i will not go through the whole regex, but uh. there you go. i spent more time on this than i probably should have. this is actually a trick; there are many steps here. this part is not so bad. tmux again. i also have a version that always opens an editor in the current pane, instead of launching in the default application. for example i use by default to view json files, but to edit them. here is the trick. i have created a shell script (actually a perl script) that is the default application for all text files. setting up that many file associations by hand is a pain. i will write a separate blog post about the scripts that install my dotfiles onto a system. i don't use Nix partly because all my friends who use Nix have even weirder bugs than they already had, and partly because i don't like the philosophy of not being able to install things at runtime. i want to install things at runtime and track that i did so. that's a separate post too. the relevant part is this: this bounces back to tmux. in particular, this is being very dumb and assuming that tmux is running on the machine where the file is, which happens to be the case here. this is not too bad to ensure - i just use a separate terminal emulator tab for each instance of tmux i care about; for example i will often have open one Windows Terminal tab for WSL on my local laptop, one for my desktop, and one for a remote work machine via a VPN. there's actually even more going on here—for example i am translating the syntax to something vim understands, and overriding so that it doesn't error out on the —but for the most part it's straightforward and not that interesting. this is a perl script that scripts tmux to send keys to a running instance of nvim (actually the same perl script as before, so that both of these can be bound to the same keybind regardless of whether nvim is already open or not): well. well. 
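to make that glue concrete, here's a rough python sketch of the same idea (not my actual perl script; the tmux pane target and the ripgrep-style "path:line:col" argument format are assumptions): given a path, either tell an nvim that's already running in a known pane to open it at the right line, or fall back to opening a new pane.

```python
#!/usr/bin/env python3
# rough sketch of the tmux glue described above (the real script is perl).
# assumptions: argument looks like "path:line:col"; nvim, if running, lives in
# the "{last}" tmux pane.
import re
import subprocess
import sys

TARGET = "{last}"  # assumed: the tmux pane where nvim usually lives

m = re.match(r"(?P<path>[^:]+)(?::(?P<line>\d+))?(?::\d+)?$", sys.argv[1])
path, line = m.group("path"), m.group("line") or "1"

# ask tmux what command the target pane is currently running
running = subprocess.run(
    ["tmux", "display-message", "-p", "-t", TARGET, "#{pane_current_command}"],
    capture_output=True, text=True,
).stdout.strip()

if running == "nvim":
    # translate "path:line" into an Ex command and type it into the running editor
    subprocess.run(["tmux", "send-keys", "-t", TARGET, "Escape"])
    subprocess.run(["tmux", "send-keys", "-t", TARGET, f":edit +{line} {path}", "Enter"])
else:
    # no editor yet: launch one in a new pane at the right line instead
    subprocess.run(["tmux", "split-window", "-h", f"nvim +{line} {path}"])
```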
now that you mention it. the last thing keeping me on tmux was session persistence and Ansuz has just released a standalone tool that does persistence and nothing else . so. i plan to switch to kitty in the near future, which lets me keep all these scripts and does not require shoving a whole second terminal emulator inside my terminal emulator, which hopefully will reduce the number of weird mysterious bugs i encounter on a regular basis. the reason i picked kitty over wezterm is that ssh integration works by integrating with the shell, not by launching a server process, so it doesn't need to be installed on the remote. this mattered less for tmux because tmux is everywhere, but hardly anywhere has wezterm installed by default. honestly, yeah. i spend quite a lot less time fighting my editor these days. that said, i cannot in good conscience recommend this to anyone else. all my scripts are fragile and will probably break if you look at them wrong, which is not ideal if you haven't written them yourself and don't know where to start debugging them. if you do want something similar without writing your own tools, i can recommend: hopefully this was interesting! i am always curious what tools people use and how - feel free to email me about your own setup :) 0:00 I start with Windows Terminal open on my laptop. 0:02 I hit ctrl + shift + 5 , which opens a new terminal tab which 's to my home desktop and immediately launches tmux. 0:03 tmux launches my default shell, . zsh shows a prompt, while loading the full config asynchronously 0:08 i use to fuzzy find a recent directory 0:09 i start typing a ripgrep command. zsh autofills the command since i've typed it before and i accept it with ctrl + f . 0:11 i hit ctrl + k f , which tells tmux to search all output in the scrollback for filenames. the filenames are highlighted in blue. 0:12 i hold n to navigate through the files. there are a lot of them, so it takes me a bit to find the one i'm looking for. 0:21 i press o to open the selected file in my default application ( ). tmux launches it in a new pane. note that this is still running on the remote server ; it is opening a remote file in a remote tmux pane. i do not need to have this codebase cloned locally on my laptop. 0:26 i try to navigate to several references using rust-analyzer, which fails because RA doesn't understand the macros in this file. at 0:32 i finally find one which works and navigate to it. 0:38 i hit ctrl + k h , which tells tmux to switch focus back to the left pane. 0:39 i hit n again. the pane is still in "copy-mode", so all the files from before are still the focus of the search. they are highlighted again and tmux selects the next file in search order. 0:41 i hit o , which opens a different file than before, but in the same instance of . 0:43 i hit b , which shows my open file buffers. in particular, this shows that the earlier file is still open. i switch back and forth between the two files a couple times before ending the stream. i don't need a fancy terminal locally; something with nice fonts is enough. all the fancy things are done through tmux, which is good because it means they work on Windows too without needing to install a separate terminal. the editor thing works even if the editor doesn't support remote scripting. nvim does support RPC, but this setup also worked back when i used and . i could have written this such that the fancy terminal emulator scripts were in my editor, not in tmux (e.g. in nvim). 
but again this locks me into the editor; and the built-in terminals in editors are usually not very good.

it's much easier to debug when something goes wrong (vscode's debugging tools are mostly for plugin extension authors and running them is non-trivial). with vim plugins i can just add statements to the lua source and see what's happening. all my keybinds make sense to me! my editor is less laggy. my terminal is much easier to script through tmux than through writing a VSCode plugin, which usually involves setting up a whole typescript toolchain and context-switching into a new project.

fish + zoxide + fzf. that gets you steps 4, 5, and kinda sorta-ish 6.

"builtin functionality in your editor" - fuzzy find, full text search, tabs and windows, and "open recent file" are all commonly supported.

qf, which gets you the "select files in terminal output" part of 6, kinda. you have to remember to pipe your output to it though, so it doesn't work after the fact and it doesn't work if your tool is interactive. note that it hard-codes a vi-like CLI ( ), so you may need to fork it or still add a script that takes the place of $EDITOR. see julia evans' most recent post for more info.

e, which gets you the "translate into something your editor recognizes" part of 8, kinda. i had never heard of this tool until i wrote my own with literally exactly the same name that did literally exactly the same thing, forgot to put it in PATH, and got a suggestion from asking if i wanted to install it, lol.

or or , all of which get you 12, kinda. the problem with this is that they don't all support , and it means you have to modify this whenever you switch editors. admittedly most people don't switch editors that often, lol.

terminals are a lot more powerful than people think! by using terminals that let you script them, you can do quite a lot of things. you can kinda sorta replicate most of these features without scripting your terminal, as long as you don't mind tying yourself to an editor. doing this requires quite a lot of work, because no one who builds these tools thought of these features ahead of time.


theory building without a mentor

NOTE: if you are just here for the how-to guide, click here to skip the philosophizing. Peter Naur wrote a famous article in 1985 called Programming as Theory Building . it has some excellent ideas, such as: programming must be the programmers’ building up knowledge of a certain kind, knowledge taken to be basically the programmers’ immediate possession, any documentation being an auxiliary product. solutions suggested by group B [who did not possess a theory of the program] […] effectively destroyed its power and simplicity. The members of group A [who did possess a theory] were able to spot these cases instantly and could propose simple and effective solutions, framed entirely within the existing structure. the program text and its documentation proved insufficient as a carrier of the most important design ideas i think this article is excellent, and highly recommend reading it in full. however, i want to discuss one particular idea Naur mentions: For a new programmer to come to possess an existing theory of a program it is insufficient that he or she has the opportunity to become familiar with the program text and other documentation. What is required is that the new programmer has the opportunity to work in close contact with the programmers who already possess the theory [...] program revival, that is reestablishing the theory of a program merely from the documentation, is strictly impossible. i do not think it is true that it is impossible to recover a theory of the program merely from the code and docs. my day job, and indeed one of my most prized skills when i interview for jobs, is creating a theory of programs from their text and documentation alone. this blog post is about how i do that, and how you can too. Naur also says in the article: “in a certain sense there can be no question of theory modification, only program modification” i think this is wrong: theory modification is exactly what Ward Cunningham describes as "consolidation" in his 1992 article on Technical Debt . i highly recommend the original article, but the basic idea is that over time, your understanding of how the program should behave changes, and you modify and refactor your program to match that idea. this happens in all programs, but the modification is easier in programs with little technical risk . furthermore, this theory modification often happens unintentionally over time as people are added and removed from teams. as ceejbot puts it : This is Conway’s Law over time. Teams are immutable: adding or removing a person to a team produces a different team. After enough change, the team is different enough that it no longer recognizes itself in the software system it produces. The result is people being vaguely unhappy about software that might be working perfectly well. i bring this up to note that you will never recover the same theory as the original programmers (at least, not without talking to them directly). the most you can do is to recover one similar enough that it does not require large changes to the program. in other words, you are creating a new theory of the program, and may end up having to adapt the program to your new theory. this is useful both when fixing bugs and when adding new features; i will focus on new features because i want to emphasize that these skills are useful any time you modify a program. for a focus on debugging, see Julia Evans' Pocket Guide to Debugging . this post is about creating theories at the "micro" level, for small portions of the program. 
i hope to make a post about the "macro" level in the future, since that's what really lets you start making design decisions about a program. i recently made a PR to neovim , having never worked on neovim before; i'll use that as an example going forward. i highly recommend following along with a piece of code you want to learn more about. if you don't have one in mind, i have hidden all the examples behind a drop-down menu, so you can try to apply the ideas on your own before seeing how i use them. the investigation i did in this blog post was based off neovim commit 57d99a5 . Click here to open all notes. to start off, you need an idea of what change you want to make to the program. almost always, programs are too large for you to get an idea of the whole program at once. instead, you need to focus on theory-building for the parts you care about, and only understand the rest of the program to the extent that the parts you care about interact with it. in my neovim PR, i cared about the command, which opens a file if it isn't loaded, or switches to the relevant buffer if it is. specifically i wanted to extend the "switch to the relevant buffer" part to also respect , so that i could pass it a line number. there are several ways to get started here. the simplest is just finding the relevant part of the code or docs—if you can provoke an error that's related to the part of the code you're changing, you can search for that error directly. often, knowing how execution reaches that state is very helpful, which you can do by getting a backtrace. you can get backtraces for output from arbitrary programs with liberal use of rr , but if you're debugging rustc specifically, there's actually a built-in flag for this, so you can just use . for , this didn't work: it was documented on neovim's site , but i didn't know a -specific error to search for. if this doesn't print an error message, or if it's not possible to get a recording of the program, things are harder. you want to look for something you already know the name of; search for literal strings with that name, or substrings that might form part of a template. for i searched for the literal string , since something needs to parse commands and it's not super common for it to be on its own in a string. that pulled up the following hits: looked promising, so i read the code around there. sometimes triggering the condition is hard, so instead i read the source code to reverse-engineer the stack trace. seeing all possible call sites of a function is instructive in itself, and you can usually narrow it down to only a few callers by skimming what the callers are doing. i highly recommend using an LSP for this part since the advantage comes from seeing all possible callers, not just most, and regex is less reliable than proper name resolution. it turned out that none of the code i found in my search was for itself, but i did find it was in a function named . had only one caller, . that was called by . the doc-comment on mentions that it parses the string, but i am not used to having documentation so i went up one level too far to . at that point, looking at the call site of , i realized i had gone too far because it was passing in the whole string of the Ex command line. 
i found a more relevant part of the code by looking at the uses of in : i got lucky - this was not actually the code i cared about, but the bit i did care about had a similar name, so i found it by searching for : from there i went to the definition of (in ) and found in that file: and from there found that the function i cared about was called . if i had been a little more careful, i could have found sooner with (this time without filtering out hidden files or limiting to the source directory). but this way worked fine as well. do mini experiments: if you see an error emitted in nearby code, try to trigger it so that you verify you're looking in the right place. when debugging, i often use process of elimination to narrow down callers: if an error would have been emitted if a certain code path was taken, or if there would have been more or less logging, i can be sure that code i am looking at was not run. the simplest experiment is just ; it's easy to notice and doesn't change the state of the program, and it can't fail. other experiments could include "adding custom logging" or "change the behavior of the function", which let you perform multiple experiments at once and understand how the function impacts its callers. for more complicated code, i like to use a debugger, which lets you see much more of the state at once. if possible, in-editor debuggers are really nice—vscode, and since recently, zed , have one built-in; for nvim i use nvim-dap-ui . you can also just use a debugger in a terminal. some experiments i like to try: for , i was quite confident i had found the right code, so i didn't bother with any experiments. there are other cases where it's more useful; i made an earlier PR to tmux where there were many different places search happened, so verifying i was looking at the right one was very helpful. specifically i added to the function i thought was the right place, since debug logging in tmux is non-trivial to access. i rarely use a debugger for adding new code; mostly i use it for debugging existing code. programs complicated enough that i need a debugger just to understand control flow usually have a client/server model that also makes them harder to debug, so i don't bother and just read the source code. reading source code is also useful for finding examples of how to use an API. often it handles edge cases you wouldn't know about by skimming, and uses helper functions that make your life simpler. your goal is to make your change as similar to the existing codebase as possible, both to reduce the risk of bugs and to increase the chance the maintainer likes your change. when i write new code, i will usually copy a small snippet from elsewhere in the codebase and modify it to my needs. i try to copy at most 10-15 lines; more than that indicates that i should try to reuse or create a higher-level API. once in , i skimmed the code and found a snippet looked like it was handling existing files: the bug here is not any code that is present; instead it's code that's missing. i had to figure out where was stored and how to process it. so, i repeated a similar process for . this time i had something more to start with - i knew the command structure was named and had type . looking at the definition of showed me what i wanted: looking for , i found (with a helpful comment saying it was responsible for ) which called , and in turn . looking at the callers of i found , which handles . 
has exactly the behavior i wanted for , so i copied its behavior: out of caution, i also looked at the other places in the function that handled , and it's a good thing i did, because i found this wild snippet above: i refactored this into a helper function and then called it from both the original command and my new code in . this works in much the same way.

try to find existing tests by using the same techniques as finding the code you care about. read them; write them using existing examples. tests are also code, after all. test suites usually have better documentation than the code itself, since adding new tests is much more common than modifying any particular section of code; see if you can find the docs. i look for files, and if i don't find them i fall back to skimming the readme. sometimes there is also a in the folder where the tests are located, although these tend to be somewhat out of date.

i care a lot about iteration times, so i try and find how to run individual tests. that info is usually in the README, or sometimes you can figure it out from the test command's output.

run your tests! ideally, create and run your tests before modifying the code so that you can see that they start to pass after your change. tests are extra important when you don't already understand the code, because they help you verify that your new theory is correct. run existing tests as well; run those before you make changes so you know which failures are spurious (a surprisingly high number of codebases have flaky or environment-dependent tests).

i started by looking for existing tests for : fortunately this had results right away and i was able to start adding my new test. had a pointer to which documented and . neovim has very good internal tooling and when my call failed it gave me a very helpful pointer to .

hopefully this was helpful! i am told by my friends that i am unusually good at this skill, so i am interested whether this post was effective at teaching it. if you have any questions, or if you just want to get in contact, feel free to reach out via email.

breaking at a function to make sure it is executed
printing local variables
setting hardware watchpoints on memory to see where something is modified (this especially shines in combination with a time-travel debugger)

programming is theory building. recovering a theory from code and docs alone is hard, but possible.
most programs are too large for you to understand them all at once. decide on your goal and learn just enough to accomplish it.
reading source code is surprisingly rewarding.
match the existing code as closely as you can until you are sure you have a working theory.

DYNOMIGHT 7 months ago

DumPy: NumPy except it’s OK if you’re dum

What I want from an array language is: I say NumPy misses on three of these. So I’d like to propose a “fix” that—I claim—eliminates 90% of unnecessary thinking, with no loss of power. It would also fix all the things based on NumPy, for example every machine learning library. I know that sounds grandiose. Quite possibly you’re thinking that good-old dynomight has finally lost it. So I warn you now: My solution is utterly non-clever. If anything is clever here, it’s my single-minded rejection of cleverness. To motivate the fix, let me give my story for how NumPy went wrong. It started as a nice little library for array operations and linear algebra. When everything has two or fewer dimensions, it’s great. But at some point, someone showed up with some higher-dimensional arrays. If loops were fast in Python, NumPy would have said, “Hello person with ≥3 dimensions, please call my ≤2 dimensional functions in a loop so I can stay nice and simple, xox, NumPy.” But since loops are slow, NumPy instead took all the complexity that would usually be addressed with loops and pushed it down into individual functions. I think this was a disaster, because every time you see some function call like , you have to think: Different functions have different rules. Sometimes they’re bewildering. This means constantly thinking and constantly moving dimensions around to appease the whims of particular functions. It’s the functions that should be appeasing your whims! Even simple-looking things like or do quite different things depending on the starting shapes. And those starting shapes are often themselves the output of previous functions, so the complexity spirals. Worst of all, if you write a new ≤2 dimensional function, then high-dimensional arrays are your problem. You need to decide what rules to obey, and then you need to re-write your function in a much more complex way to— Voice from the back : Python sucks! If you used a real language, loops would be fast! This problem is stupid! That was a strong argument, ten years ago. But now everything is GPU, and GPUs hate loops. Today, array packages are cheerful interfaces that look like Python (or whatever) but are actually embedded languages that secretly compile everything into special GPU instructions that run on whole arrays in parallel. With big arrays, you need GPUs. So I think the speed of the host language doesn’t matter so much anymore. Python’s slowness may have paradoxically turned out to be an advantage , since it forced everything to be designed to work without loops even before GPUs took over. Still, thinking is bad, and NumPy makes me think, so I don’t like NumPy . Here’s my extremely non-clever idea: Let’s just admit that loops were better. In high dimensions, no one has yet come up with a notation that beats loops and indices. So, let’s do this: That’s basically the whole idea. If you take those three bullet-points, you could probably re-derive everything I do below. I told you this wasn’t clever. Suppose that and are 2D arrays, and is a 4D array. And suppose you want to find a 2D array such that . If you could write loops, this would be easy: That’s not pretty. It’s not short or fast. But it is easy! Meanwhile, how do you do this efficiently in NumPy? Like this: If you’re not a NumPy otaku, that may look like outsider art. Rest assured, it looks like that to me too, and I just wrote it. Why is it so confusing? 
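To make the contrast concrete, here is a small stand-in example in plain NumPy (the arrays and the exact problem are hypothetical, not the ones from the post above): the loop version is easy to read, while a vectorized equivalent works but takes thinking about which axes line up.

```python
import numpy as np

# Hypothetical stand-in problem: x and y are 2-D (a stack of vectors each),
# C is 4-D, and we want the 2-D array out with out[i, j] = x[i] @ C[i, j] @ y[j].
rng = np.random.default_rng(0)
ni, nj, m, n = 3, 4, 5, 6
x = rng.standard_normal((ni, m))
y = rng.standard_normal((nj, n))
C = rng.standard_normal((ni, nj, m, n))

# The loop version: slow in Python, but easy to read and obviously correct.
out_loops = np.empty((ni, nj))
for i in range(ni):
    for j in range(nj):
        out_loops[i, j] = x[i] @ C[i, j] @ y[j]

# One vectorized equivalent: compact, but you have to stop and think about
# which axes are being matched up and summed over.
out_vec = np.einsum("im,ijmn,jn->ij", x, C, y)

assert np.allclose(out_loops, out_vec)
```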
At a high level, it’s because and and multiplication ( ) have complicated rules and weren’t designed to work together to solve this particular problem nicely. That would be impossible, because there are an infinite number of problems. So you need to mash the arrays around a lot to make those functions happy. Without further ado, here’s how you solve this problem with DumPy (ostensibly D ynomight N umPy ): Yes! If you prefer, you can also use this equivalent syntax: Those are both fully vectorized. No loops are executed behind the scenes. They’ll run on a GPU if you have one. While it looks magical, the way this actually works is fairly simple: If you index a DumPy array with a string (or a object), it creates a special “mapped” array that pretends to have fewer dimensions. When a DumPy function is called (e.g. or (called with )), it checks if any of the arguments have mapped dimensions. If so, it automatically vectorizes the computation, matching up mapped dimensions that share labels. When you assign an array with mapped dimensions to a , it “unmaps” them into the positions you specify. No evil meta-programming abstract syntax tree macro bytecode interception is needed. When you run this code: This is what happens behind the scenes: It might seem like I’ve skipped the hard part. How does know how to vectorize over any combination of input dimensions? Don’t I need to do that for every single function that DumPy includes? Isn’t that hard? It is hard, but did it already. This takes a function defined using ( JAX ’s version of) NumPy and vectorizes it over any set of input dimensions. DumPy relies on this to do all the actual vectorization. (If you prefer your janky and broken, I heartily recommend PyTorch’s .) But hold on. If already exists, then why do we need DumPy? Here’s why: That’s how you solve the same problem with . (And basically what DumPy does behind the scenes.) I think is one of the best parts of the NumPy ecosystem. The above code seems genuinely better than the base NumPy version. But it still involves a lot of thinking! Why put in the inner and in the outer one? Why are all the axes even though you need to vectorize over the second dimension of ? There are answers, but they require thinking. Loops and indices are better. OK, I did do one thing that’s a little clever. Say you want to create a Hilbert matrix with . In base NumPy you’d have to do this: In DumPy, you can just write: Yes! That works! It works because a acts both like a string and like an array mapped along that string. So the above code is roughly equivalent to: In reality, the choose random strings. (The class maintains a stack of active ranges to prevent collisions.) So in more detail, the above code becomes something like this: To test if DumPy is actually better in practice, I took six problems of increasing complexity and implemented each of them using loops, NumPy, JAX (with ), and DumPy. Note that in these examples, I always assume the input arrays are in the class of the system being used. If you try running them, you’ll need to add some conversions with / / . (Pretending doesn’t exist.) The goal is to create with The goal of this problem is, given a list of vectors and a list of Gaussians parameters, and arrays mapping each vector to a list of parameters, evaluate each corresponding vector/parameter combination. Formally, given 2D , , , and and 3D , the goal is to create with See also the discussion in the previous post . I gave each implementation a subjective “goodness” score on a 1-10 scale. 
I always gave the best implementation for each problem 10 points, and then took off points from the others based on how much thinking they required. According to this dubious methodology and these made-up numbers, DumPy is 96.93877% as good as loops! Knowledge is power! But seriously, while subjective, I don’t think my scores should be too controversial. The most debatable one is probably JAX’s attention score. The only thing DumPy adds to NumPy is some nice notation for indices. That’s it. What I think makes DumPy good is it also removes a lot of stuff. Roughly speaking, I’ve tried to remove anything that is confusing and exists because NumPy doesn’t have loops. I’m not sure that I’ve drawn the line in exactly the right place, but I do feel confident that I’m on the right track with removing stuff. In NumPy, works if and are both scalar. Or if is and is . But not if is and is . Huh? In truth, the broadcasting rules aren’t that complicated for scalar operations like multiplication. But still, I don’t like it, because every time you see , you have to worry about what shapes those have and what the computation might be doing. So, I removed it. In DumPy you can only do if one of or is scalar or and have exactly the same shape. That’s it, anything else raises an error. Instead, use indices, so it’s clear what you’re doing. Instead of this: write this: Indexing in NumPy is absurdly complicated . When you write that could do many different things depending on what all the shapes are. I considered going cold-turkey and only allowing scalar indices in DumPy. That wouldn’t have been so bad, since you can still do advanced stuff using loops. But it’s quite annoying to not be able to write when and are just simple 1D arrays. So I’ve tentatively decided to be more pragmatic. In DumPy, you can index with integers, or slices, or (possibly mapped) s. But only one index can be non-scalar . I settled on this because it’s the most general syntax that doesn’t require thinking. Let me show you what I mean. If you see this: It’s “obvious” what the output shape will be. (First the shape of , then the shape of , then the shape of ). Simple enough. But as soon as you have two multidimensional array inputs like this: Suddenly all hell breaks loose. You need to think about broadcasting between and , orthogonal vs. pointwise indices, slices behaving differently than arrays, and quirks for where the output dimensions go. So DumPy forbids this. Instead, you need to write one of these: They all do exactly what they look like they do. Oh, and one more thing! In DumPy, you must index all dimensions . In NumPy, if has three dimensions, then is equivalent to . This is sometimes nice, but it means that every time you see , you have to worry about how many dimensions has. In DumPy, every time you index an array or assign to a , it checks that all indices have been included. So when you see option (4) above, you know that: Always, always, always . No cases, no thinking. Again, many NumPy functions have complex conventions for vectorization. sort of says, “If the inputs have ≤2 dimensions, do the obvious thing. Otherwise, do some extremely confusing broadcasting stuff.” DumPy removes the confusing broadcasting stuff. When you see , you know that and have no more than two dimensions, so nothing tricky is happening. Similarly, in NumPy, is equivalent to . When both inputs have ≤2 or fewer dimensions, this does the “obvious thing”. (Either an inner-product or some kind of matrix/vector multiplication.) 
Otherwise, it broadcasts or vectorizes or something? I can never remember. In DumPy you don’t have that problem, because it restricts to arrays with one or two dimensions only. If you need more dimensions, no problem: Use indices. It might seem annoying to remove features, but I’m telling you: Just try it . If you program this way, a wonderful feeling of calmness comes over you, as class after class of possible errors disappear. Put another way, why remove all the fancy stuff, instead of leaving it optional? Because optional implies thinking! I want to program in a simple way. I don’t want to worry that I’m accidentally triggering some confusing broadcasting insanity, because that would be a mistake. I want the computer to help me catch mistakes, not silently do something weird that I didn’t intend. In principle, it would be OK if there was a method that preserves all the confusing batching stuff. If you really want that, you can make it yourself: You can use that same wrapper to convert any JAX NumPy function to work with DumPy. Think about math: In two or fewer dimensions, coordinate-free linear algebra notation is wonderful. But for higher dimensional tensors , there are just too many cases, so most physicists just use coordinates. So this solution seems pretty obvious to me. Honestly, I’m a little confused why it isn’t already standard. Am I missing something? When I complain about NumPy, many people often suggest looking into APL -type languages, like A, J, K, or Q. (All single-letter languages are APL-like, except C, D, F, R, T, X, and many others. Convenient, right?) The obvious disadvantages of these are that: None of those bother me. If the languages are better, we should learn to use them and make them do autodiff on GPUs. But I’m not convinced they are better. When you actually learn these languages, what you figure out is that the symbol gibberish basically amounts to doing the same kind of dimension mashing that we saw earlier in NumPy: The reason is that, just like NumPy and , these languages choose align dimensions by position , rather than by name. If I have to mash dimensions, I want to use the best tool. But I’d prefer not to mash dimensions at all. People also often suggest “NumPy with named dimensions” as in xarray . (PyTorch also has a half-hearted implementation .) Of course, DumPy also uses named dimensions, but there’s a critical difference. In xarray, they’re part of the arrays themselves, while in DumPy, they live outside the arrays. In some cases, permanent named dimensions are very nice. But for linear algebra, they’re confusing. For example, suppose is 2-D with named dimensions and . Now, what dimensions should have? ( twice?) Or say you take a singular value decomposition like . What name should the inner dimensions have? Does the user have to specify it? I haven’t seen a nice solution. xarray doesn’t focus on linear algebra, so it’s not much of an issue there. A theoretical “DumPy with permanent names” might be very nice, but I’m not sure how it should work. This is worth thinking about more. I like Julia ! Loops are fast in Julia! But again, I don’t think fast loops matter that much, because I want to move all the loops to the GPU. So even if I was using Julia, I think I’d want to use a DumPy-type solution. I think Julia might well be a better host language than Python, but it wouldn’t be because of fast loops, but because it offers much more powerful meta-programming capabilities. 
I built DumPy on top of JAX just because JAX is very mature and good at calling the GPU, but I’d love to see the same idea used in Julia (“Dulia”?) or other languages.

OK, I promised a link to my prototype, so here it is: It’s just a single file with around 700 lines. I’m leaving it as a single file because I want to stress that this is just something I hacked together in the service of this rant. I wanted to show that I’m not totally out of my mind, and that doing all this is actually pretty easy. I stress that I don’t really intend to update or improve this. (Unless someone gives me a lot of money?) So please do not attempt to use it for “real work”, and do not make fun of my code.

PS. DumPy works out of the box with both and . For gradients, you need to either cast the output to a JAX scalar or use the wrapper.

PPS. If you like this, you may also like einx or torchdim.

Update: Due to many requests, I have turned this into a “real” package, available on PyPi as . You can install it by typing: Or, if you use uv (you should) you can play around with DumPy by just typing this one-liner in your terminal: For example:

Don’t make me think. Run fast on GPUs. Really, do not make me think.

OK, what shapes do all those arrays have? And what does do when it sees those shapes?

Bring back the syntax of loops and indices. But don’t actually execute the loops. Just take the syntax and secretly compile it into vectorized operations. Also, let’s get rid of all the insanity that’s been added to NumPy because loops were slow.

has 4 dimensions has 2 dimensions has 1 dimension has 4 dimensions

They’re unfamiliar. The code looks like gibberish. They don’t usually provide autodiff or GPU execution.

fnands 9 months ago

A quick first look at GPU programming in Mojo

The day has finally arrived. Well actually, the day arrived in February, but who’s counting. The Mojo language has finally publicly released the ability to do GPU programming - if you have a reasonably modern NVIDIA GPU. Luckily for me, I have an RTX 3090, and although it isn’t officially supported, it is basically an A10, which is. Looking at some of the comments on the nightly releases, it does seem that AMD support is on the way as well.

The Modular team publicly released the ability to do GPU programming in Mojo in release 25.1, with further support and documentation in release 25.2. Fun fact: release 25.2 also saw my first (tiny) contribution to the Mojo standard library.

This is a really important step for Mojo, a language that bills itself as a language designed to solve a variety of AI development challenges, which in this day and age basically means programming an increasingly heterogeneous stack of hardware. Today this mostly means GPUs, but there is an explosion of new accelerators like the ones from Cerebras, Groq and SambaNova, not to mention the not-so-new TPU from Google. As DeepSeek showed the world recently: if you’re willing to put the work in, there is a lot more to be squeezed out of current-gen hardware than most people thought.

Now, I don’t think every ML engineer or researcher should be looking for every possible way to get more out of their compute, but there are definitely some wins to be had. As an example, I’m really fascinated by the work of Tri Dao and his collaborators, who work on deeply hardware aware improvements in machine learning, e.g. FlashAttention, which is mathematically equivalent to the attention mechanism that powers all transformer models, but with hardware aware optimizations that take into account the cost of memory access in GPUs. This does make me wonder what other optimizations are out there to be discovered. This however is not easy, as the authors note in the “Limitations and Future Directions” section of the FlashAttention paper:

Our current approach to building IO-aware implementations of attention requires writing a new CUDA kernel for each new attention implementation. This requires writing the attention algorithm in a considerably lower-level language than PyTorch, and requires significant engineering effort. Implementations may also not be transferrable across GPU architectures. These limitations suggest the need for a method that supports writing attention algorithms in a high-level language (e.g., PyTorch), and compiling to IO-aware implementations in CUDA

What makes GPU programming in Mojo interesting is that you don’t need the CUDA toolkit to do so, and it compiles down to PTX, which you can think of as NVIDIA’s version of assembly. If Mojo (and Max in general) can make it easier to write GPU kernels in a more user-friendly language, it could be a game changer.

If you want to get started, there is a guide for getting started with GPU programming in Mojo from Modular (the company behind Mojo), which I strongly recommend. I learn by doing, so I wanted to try to implement something relatively simple using the GPU. The example idea I chose is to transform an RGB image to grayscale, which is an embarrassingly parallel problem without a lot of complexity. I was halfway through writing this post before I realized that there was already an example of how to do grayscale conversion in the Mojo repo, but oh well. I basically just start with what’s in the documentation, but I added another example that I did do myself.
To start, let’s read in an image using mimage, an image processing library I am working on. The image is represented here as a rank three tensor with the dimensions being width, height and channels, and the data type is an unsigned 8-bit integer. In this case we have four channels: red, green, blue and alpha (transparency), the latter being 255 for all pixels.

So what we want to do here is to sum together the RGB values for each pixel, using the weights , and for red, green and blue respectively. If you want to know why we are using these weights, read this article. Now that we have that, let’s define a simple version of the transform we want on CPU. So hopefully that worked! Let’s see if it’s correct. I haven’t implemented image saving in mimage yet, so let’s use the good old Python PIL library to save the image.

Now that we have a working CPU implementation, let’s try to implement the same function on the GPU. But first, let’s check if Mojo can actually find my GPU: Now that we know that Mojo can find our GPU, let’s define the function that will do the actual conversion. This kernel reads a pixel from the input tensor, converts it to grayscale and writes the result to the output tensor. It is parallelized across the output tensor, which means that each thread is responsible for one pixel in the output tensor. As you can see, it takes in as parameters the layout specifications of the input and output tensors, the width and height of the image, and the input and output tensors themselves.

Now, the first slightly awkward thing I had to do was convert the image from a , which is what is returned by , to a , which is the new tensor type that is compatible with GPU programming. I am assuming that will be deprecated in the future. With this new tensor type you can explicitly set which device the tensor should be allocated on. In this case I will allocate it to the CPU, i.e. the host device, and then copy over the data from the old tensor to the new one. Next, we have to move the tensor to the GPU. Now that was easy enough. The next step is to allocate the output grayscale tensor. As we don’t need to copy over the data from the old tensor, we can just allocate it on the GPU immediately. Next, we get the layout tensors for the input and output tensors. The documentation on LayoutTensor is a bit sparse, but it seems to be there to make it easy to reason about memory layouts.

There seem to be two ways to use GPU functions in Mojo. The first is to use the function, which is what I do here. This compiles the GPU kernel into a function which can be called as normal. While this function is being executed on the GPU, the host device will wait until it is finished before moving on. Later in this post I will show the other option, which allows the host device to do other things while waiting for the GPU.

And that’s it! Let’s call the GPU function. Here I will divide the image up into blocks of 32x32 pixels, and then call the function. I have to admit, I have no clue what the best practices are for choosing the block size, so if you know a good rule of thumb, please let me know. I wonder if there is a way to tune these parameters at compile time? Once that is run, we move the grayscale tensor back to the CPU and compare the results. And there we have it! We have successfully converted an image to grayscale using the GPU.
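For reference, the weighted-sum conversion itself can be written on the CPU in NumPy roughly as follows. This is only a sketch, not the post’s Mojo code: the 0.299/0.587/0.114 values are the standard ITU-R BT.601 luma coefficients and are assumed here.

```python
import numpy as np

def to_grayscale(rgba: np.ndarray) -> np.ndarray:
    """Weighted-sum grayscale; expects an array whose last axis holds R, G, B, A."""
    # Standard BT.601 luma weights (assumed here); the alpha channel is ignored.
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    gray = rgba[..., :3].astype(np.float32) @ weights
    return np.clip(np.round(gray), 0, 255).astype(np.uint8)
```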
This is a bit more complex than the grayscale conversion, because we need to handle the different dimensions of the input and output tensors. First let's define some test images to make sure the function is doing what we expect. If this works we should have a downsampled 8x8 image with the same values as the original image. Let's start with a CPU implementation: So it works! This does make some assumptions about the input image, like that its size is a multiple of the downsampling factor. But good enough for a blog post. Now let's try to do the same on the GPU. We again define our output tensor on the GPU, get the layout tensor and move the data from the host device to the GPU. This time we will try the other way of using GPU functions: enqueueing the function(s) to be executed on the GPU. This means the host device will not wait for the GPU to finish the function, but can do other things while the GPU is running. When we call , the host device will wait for the GPU to finish all enqueued functions. This allows for some interesting things, like running the GPU function in parallel with some other code on the host device. This can also be a little bit dangerous if you try to access the GPU memory from the host device while the GPU is still running. Let's try it out: Again, it works! Let's try it on our original image, and downsample it by a factor of 2 and 4. Let's also do a CPU version for comparison, and define the output tensors on the GPU. Now we can call the GPU function. Notice how we can enqueue a second function while the first one is still running. As it does not depend on the first function to finish, it can potentially start running before the first function has finished. Now let's verify the results: Great! We can save these and see what they look like: And as we can see, the images get progressively more blurry the more we downsample. This was my first quick look at GPU programming in Mojo. I feel the hardest thing is conceptually understanding how to properly divide the work between threads, and how to assign the correct numbers of threads, blocks and warps (which I didn't even get into here). I guess the next move is to look up some guide on how to efficiently program GPUs, and to maybe try some more substantial examples. The documentation on GPU programming in Mojo is still a bit sparse, and there aren't many examples out there in the wild to learn from, but I am sure that will change soon. The Modular team did say they are releasing it unpolished so that they can gather some community feedback early. For someone who uses GPUs a lot in my day job, I never really interact with the GPUs at a low level; it's always through PyTorch or JAX or some other layer of abstraction from Python. It's quite fun to have such low-level access to the hardware in a language that doesn't feel that dissimilar from Python. I think this is really where I am starting to see the vision behind Mojo more clearly. I think the shallow take is that Mojo is a faster Python, or basically some ungodly hybrid between Python and Rust, but the more I play with it the more I feel it's a language designed to make programming heterogeneous hardware easier. I don't think it will be the only language like this we'll see, and I am curious to see if other languages based on MLIR will pop up soon, or if some existing languages will adapt. Maybe basing Julia 2.0 off MLIR instead of LLVM is a good next move for the language.
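To recap the downsampling example above in something runnable: the sketch below is a NumPy version of block averaging, which matches the "blurrier as you downsample" behaviour described, but it is an assumption rather than a transcription of the post's Mojo kernel (and it assumes, as the post does, that the image dimensions are a multiple of the factor).

```python
import numpy as np

# Hypothetical NumPy stand-in for the downsampling example: average
# non-overlapping factor x factor blocks of the image.
def downsample(img: np.ndarray, factor: int) -> np.ndarray:
    h, w = img.shape[:2]
    blocks = img.reshape(h // factor, factor, w // factor, factor, -1)
    return blocks.mean(axis=(1, 3)).astype(img.dtype)

img = np.arange(8 * 8, dtype=np.float32).reshape(8, 8, 1)
print(downsample(img, 2).shape)  # (4, 4, 1)
```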
You only need to look at the schematic of Apple silicon chips these days to see which way the wind is blowing: a significant fraction of the chip is dedicated to GPU cores. I think the days when having a GPU attached to your computer was only for specialists are coming to an end, and we might pretty soon be able to safely assume that every modern computer will have at least a decent number of GPU cores available for general-purpose tasks, and not just graphics. Still, I doubt most programmers will ever have to worry about directly programming GPUs themselves, but I am interested to see how libraries take advantage of this fact.

A Room of My Own 1 year ago

Life Without Envy

Jealousy is always a mask for fear: fear that we aren’t able to get what we want; frustration that somebody else seems to be getting what is rightfully ours even if we are too frightened to reach for it. At its root, jealousy is a stingy emotion. It doesn’t allow for the abundance and multiplicity of the universe. Jealousy tells us there is room for only one—one poet, one painter, one whatever you dream of being. Julia Cameron, The Artist's Way I read Life Without Envy: Ego Management for Creative People  by Camille DeAngelis a long time ago, but I never really processed its highlights. Usually, I read on my Kindle and manage my highlights through Readwise. I always have these ambitious plans to revisit my highlights and reflect on them, and while I do that occasionally, most of the time, they just sit there, waiting for the "right" moment. This particular book comes back to me often—not because it was especially brilliant or life-changing. I wouldn’t call it a great book, but it resonated in many ways. It put into words something I’d been feeling for a long time: the belief that my creative work isn’t worth anything until it’s publicly validated by someone else. Or as the author puts it:  I just need to prove myself as soon as possible, and then I’ll be someone important. I’ve struggled with this, especially when it comes to blogging. I wrote about my hesitation before —how much time I spent questioning whether what I wrote had any value or whether people I know would read my posts and judge me. Whether I sound ridiculous. That fear held me back for a long time. What helped me, at least partially, was stepping outside of my comfort zone in other ways. I joined a critique group for fiction writing, where we read our work out loud and received feedback from others. It was terrifying at first—sharing something so personal and opening myself up to critique—but it taught me to detach a little from the fear of judgment. I learned to listen, take what was useful, and leave the rest. Eventually, I moved further outside my comfort zone by sending my manuscript for assessment. When the feedback came back, it was also incredibly helpful. That experience reinforced the importance of putting my work out there. Below, I’ve shared the highlights I took from this book (in italics), along with my own commentary. Life Without Envy: Ego Management for Creative People  by Camille DeAngelis How many times did the hunger for approval win out over curiosity and imagination? We are made to feel that we must always be striving for more. A bigger house, more money, more success, because if you feel complete just as you are, then you’re no longer a cog in the system. So you see, you’re not special. And that is the one great and profound benediction underwriting your entire existence. The idea here is pretty simple: you’re not special—and that’s actually a good thing. Not being inherently "special" frees you from the pressure of living up to extraordinary expectations or comparisons. Instead, it emphasizes that the lack of special status allows you to simply be —to exist, learn, grow, and find meaning without the weight of proving uniqueness. When we let go of trying to be "special," you realize something beautiful: we’re all in this together. Life isn’t about being the best or standing out—it’s about living, growing, and connecting with others. The real gift of life is knowing your worth doesn’t depend on being unique or exceptional. You’re enough just as you are, and that’s more than okay—it’s freeing. 
When we talk about wanting to be “great,” we implicitly set ourselves above others. We see ourselves as chosen where others are not. We often think: I just need to prove myself as soon as possible, and then I’ll be someone important. The frantic desire to produce an amazing work of art as soon as possible so that everyone will hail your genius before any of your contemporaries can edge you out. If I can’t be the best, then I don’t deserve to be here. I’m not a “real” artist until I make my work public. Even commercially successful artists sometimes work under a scarcity mentality: there are only so many artists who can be taken seriously, and I am not one of them. Here’s the saddest part: if all along we’ve been creating from a place of lack, what might we be capable of if we drew from a full well? As the entrepreneur and business coach Marie Forleo says: “Envy is often a clue that there’s something latent in you that needs to be expressed.” If you keep wanting what someone else has, you can’t grow into everything you could be. When you hinge your perception of success or failure on how your work is received, you create your own misery. You may think, “But I don’t think I’m entitled. I know I have to earn it.” Yet if you look back over the past ten or twenty or thirty years, at the various ways in which you may have waited for your life to happen to you, you begin to see that this passivity has been an expectancy: entitlement in a softer guise. So, what can you do? DeAngelis concludes that we don’t need a castle or a faraway destination. Step outside and soak in the natural world—whether it’s a quiet park, a beach, or even your backyard. Shake up your daily routine with something new, no matter how small. And most importantly, set the intention that something extraordinary can happen. Bliss is right here, waiting for us to notice it. Related Post: Recognizing the Scarcity Mentality

Julia Evans 1 year ago

Importing a frontend Javascript library without a build system

I like writing Javascript without a build system and for the millionth time yesterday I ran into a problem where I needed to figure out how to import a Javascript library in my code without using a build system, and it took FOREVER to figure out how to import it because the library’s setup instructions assume that you’re using a build system. Luckily at this point I’ve mostly learned how to navigate this situation and either successfully use the library or decide it’s too difficult and switch to a different library, so here’s the guide I wish I had to importing Javascript libraries years ago. I’m only going to talk about using Javacript libraries on the frontend, and only about how to use them in a no-build-system setup. In this post I’m going to talk about: There are 3 basic types of Javascript files a library can provide: I’m not sure if there’s a better name for the “classic” type but I’m just going to call it “classic”. Also there’s a type called “AMD” but I’m not sure how relevant it is in 2024. Now that we know the 3 types of files, let’s talk about how to figure out which of these the library actually provides! Every Javascript library has a build which it uploads to NPM. You might be thinking (like I did originally) – Julia! The whole POINT is that we’re not using Node to build our library! Why are we talking about NPM? But if you’re using a link from a CDN like https://cdnjs.cloudflare.com/ajax/libs/Chart.js/4.4.1/chart.umd.min.js , you’re still using the NPM build! All the files on the CDNs originally come from NPM. Because of this, I sometimes like to the library even if I’m not planning to use Node to build my library at all – I’ll just create a new temp folder, there, and then delete it when I’m done. I like being able to poke around in the files in the NPM build on my filesystem, because then I can be 100% sure that I’m seeing everything that the library is making available in its build and that the CDN isn’t hiding something from me. So let’s a few libraries and try to figure out what types of Javascript files they provide in their builds! First let’s look inside Chart.js , a plotting library. This library seems to have 3 basic options: option 1: . The suffix tells me that this is a CommonJS file , for using in Node. This means it’s impossible to use it directly in the browser without some kind of build step. option 2: . The suffix by itself doesn’t tell us what kind of file it is, but if I open it up, I see which is an immediate sign that this is an ES module – the syntax is ES module syntax. option 3: . “UMD” stands for “Universal Module Definition”, which I think means that you can use this file either with a basic , CommonJS, or some third thing called AMD that I don’t understand. When I was using Chart.js I picked Option 3. I just needed to add this to my code: and then I could use the library with the global environment variable. Couldn’t be easier. I just copied into my Git repository so that I didn’t have to worry about using NPM or the CDNs going down or anything. A lot of libraries will put their build in the directory, but not always! The build files’ location is specified in the library’s . For example here’s an excerpt from Chart.js’s . I think this is saying that if you want to use an ES Module ( ) you should use , but the jsDelivr and unpkg CDNs should use . I guess is for Node. ’s also says , which according to this documentation tells Node to treat files as ES modules by default. 
I think it doesn’t tell us specifically which files are ES modules and which ones aren’t but it does tell us that something in there is an ES module. is a library for logging into Bluesky with OAuth in the browser. Let’s see what kinds of Javascript files it provides in its build! It seems like the only plausible root file in here is , which looks something like this: This syntax means it’s an ES module . That means we can use it in the browser without a build step! Let’s see how to do that. Using an ES module isn’t an easy as just adding a . Instead, if the ES module has dependencies (like does) the steps are: The reason we need an import map instead of just doing something like is that internally the module has more import statements like , and we need to tell the browser where to get the code for and all of its other dependencies. Here’s what the importmap I used looks like for : Getting these import maps to work is pretty fiddly, I feel like there must be a tool to generate them automatically but I haven’t found one yet. It’s definitely possible to write a script that automatically generates the importmaps using esbuild’s metafile but I haven’t done that and maybe there’s a better way. I decided to set up importmaps yesterday to get github.com/jvns/bsky-oauth-example to work, so there’s some example code in that repo. Also someone pointed me to Simon Willison’s download-esm , which will download an ES module and rewrite the imports to point to the JS files directly so that you don’t need importmaps. I haven’t tried it yet but it seems like a great idea. I did run into some problems with using importmaps in the browser though – it needed to download dozens of Javascript files to load my site, and my webserver in development couldn’t keep up for some reason. I kept seeing files fail to load randomly and then had to reload the page and hope that they would succeed this time. It wasn’t an issue anymore when I deployed my site to production, so I guess it was a problem with my local dev environment. Also one slightly annoying thing about ES modules in general is that you need to be running a webserver to use them, I’m sure this is for a good reason but it’s easier when you can just open your file without starting a webserver. Because of the “too many files” thing I think actually using ES modules with importmaps in this way isn’t actually that appealing to me, but it’s good to know it’s possible. If the ES module doesn’t have dependencies then it’s even easier – you don’t need the importmaps! You can just: If you don’t want to use importmaps, you can also use a build system like esbuild . I talked about how to do that in Some notes on using esbuild , but this blog post is about ways to avoid build systems completely so I’m not going to talk about that option here. I do still like esbuild though and I think it’s a good option in this case. CanIUse says that importmaps are in “Baseline 2023: newly available across major browsers” so my sense is that in 2024 that’s still maybe a little bit too new? I think I would use importmaps for some fun experimental code that I only wanted like myself and 12 people to use, but if I wanted my code to be more widely usable I’d use instead. Let’s look at one final example library! This is a different Bluesky auth library than . Again, it seems like only real candidate file here is . But this is a different situation from the previous example library! 
Let’s take a look at : There’s a bunch of stuff like this in : This syntax is CommonJS syntax, which means that we can’t use this file in the browser at all, we need to use some kind of build step, and ESBuild won’t work either. Also in this library’s it says which is another way to tell it’s CommonJS. Originally I thought it was impossible to use CommonJS modules without learning a build system, but then someone Bluesky told me about esm.sh ! It’s a CDN that will translate anything into an ES Module. skypack.dev does something similar, I’m not sure what the difference is but one person mentioned that if one doesn’t work sometimes they’ll try the other one. For using it seems pretty simple, I just need to put this in my HTML: and then put this in . It seems to Just Work, which is cool! Of course this is still sort of using a build system – it’s just that esm.sh is running the build instead of me. My main concerns with this approach are: I also learned that you can also use to convert a CommonJS module into an ES module, though there are some limitations – the syntax doesn’t work. Here’s a github issue about that . I think the approach is probably more appealing to me than the approach because it’s a tool that I already have on my computer so I trust it more. I haven’t experimented with this much yet though. Here’s a summary of the three types of JS files you might encounter, options for how to use them, and how to identify them. Unhelpfully a or file extension could be any of these 3 options, so if the file is you need to do more detective work to figure out what you’re dealing with. The main difference between CommonJS modules and ES modules from my perspective is that ES modules are actually a standard. This makes me feel a lot more confident using them, because browsers commit to backwards compatibility for web standards forever – if I write some code using ES modules today, I can feel sure that it’ll still work the same way in 15 years. It also makes me feel better about using tooling like because even if the esbuild project dies, because it’s implementing a standard it feels likely that there will be another similar tool in the future that I can replace it with. A lot of the time when I talk about this stuff I get responses like “I hate javascript!!! it’s the worst!!!”. But my experience is that there are a lot of great tools for Javascript (I just learned about https://esm.sh yesterday which seems great! I love esbuild!), and that if I take the time to learn how things works I can take advantage of some of those tools and make my life a lot easier. So the goal of this post is definitely not to complain about Javascript, it’s to understand the landscape so I can use the tooling in a way that feels good to me. Here are some questions I still have, I’ll add the answers into the post if I learn the answer. Here’s a list of every tool we talked about in this post: Writing this post has made me think that even though I usually don’t want to have a build that I run every time I update the project, I might be willing to have a build step (using or something) that I run only once when setting up the project and never run again except maybe if I’m updating my dependency versions. Thanks to Marco Rogers who taught me a lot of the things in this post. I’ve probably made some mistakes in this post and I’d love to know what they are – let me know on Bluesky or Mastodon!
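One practical footnote to the "you need to be running a webserver" point above: for local experiments with ES modules and import maps, Python's built-in static file server is often enough and avoids installing any extra tooling. This is just a generic convenience, not something from the post itself.

```python
# Serve the current directory on http://localhost:8000 so that ES modules
# (and their import maps) load over HTTP instead of file:// URLs.
# Equivalent to running `python3 -m http.server 8000` in a terminal.
from http.server import HTTPServer, SimpleHTTPRequestHandler

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), SimpleHTTPRequestHandler).serve_forever()
```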

Julia Evans 1 year ago

New microblog with TILs

I added a new section to this site a couple weeks ago called TIL (“today I learned”). One kind of thing I like to post on Mastodon/Bluesky is “hey, here’s a cool thing”, like the great SQLite repl litecli , or the fact that cross compiling in Go Just Works and it’s amazing, or cryptographic right answers , or this great diff tool . Usually I don’t want to write a whole blog post about those things because I really don’t have much more to say than “hey this is useful!” It started to bother me that I didn’t have anywhere to put those things: for example recently I wanted to use diffdiff and I just could not remember what it was called. So I quickly made a new folder called /til/ , added some custom styling (I wanted to style the posts to look a little bit like a tweet), made a little Rake task to help me create new posts quickly ( ), and set up a separate RSS Feed for it. I think this new section of the blog might be more for myself than anything, now when I forget the link to Cryptographic Right Answers I can hopefully look it up on the TIL page. (you might think “julia, why not use bookmarks??” but I have been failing to use bookmarks for my whole life and I don’t see that changing ever, putting things in public is for whatever reason much easier for me) So far it’s been working, often I can actually just make a quick post in 2 minutes which was the goal. My page is inspired by Simon Willison’s great TIL blog , though my TIL posts are a lot shorter. This came about because I spent a lot of time on Twitter, so I’ve been thinking about what I want to do about all of my tweets. I keep reading the advice to “POSSE” (“post on your own site, syndicate elsewhere”), and while I find the idea appealing in principle, for me part of the appeal of social media is that it’s a little bit ephemeral. I can post polls or questions or observations or jokes and then they can just kind of fade away as they become less relevant. I find it a lot easier to identify specific categories of things that I actually want to have on a Real Website That I Own: and then let everything else be kind of ephemeral. I really believe in the advice to make email lists though – the first two (blog posts & comics) both have email lists and RSS feeds that people can subscribe to if they want. I might add a quick summary of any TIL posts from that week to the “blog posts from this week” mailing list.

blog.philz.dev 1 year ago

Safari Top, Part 2

I posted recently about getting the top memory-using tabs from Safari. This is the sort of pickle you get into if you're using a laptop with only 8GB of RAM. There are two problems: (1) how to map tabs to process ids and (2) how to get the memory usage of the underlying processes. Once you enable AppleScript works well enough to get the mapping of tabs to process ids, but, crucially for the second problem, was underreporting memory usage. For example, Claude is reportedly using 1GB of memory, but is reporting just 1MB. This led me down a rabbit hole of finding the command, and seeing memory usage more in the 1GB ballpark. I learned about from Julia Evans' blog post and went on a little bit of a detour to try to replicate it. It turns out that to get a "mach port" you need several "Entitlements" like and . So, you make the Rust work, figure out how to and, voila, it still doesn't work . Safari is protected by System Integrity Protection and doesn't allow you to open a mach port to it. So, back at square two, we find out about , find the header files in , and use Python's package. The field seems to match with Activity Monitor says. (The documentation is sparse, and I haven't delved deeper.) The reason to use Python rather than compiling a binary is to avoid a compile or installation step. So, here's the result: Here's the Python code : This time, I converted the AppleScript into "Javascript for Automation" (JXA), and learned that the Script Editor app has an "Open Dictionary" feature which lets you browse what's possible. If you find out how Activity Monitor actually gets the pids of the tabs, let me know!
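The post's actual script isn't reproduced above, but the libproc-via-Python approach it describes looks roughly like the ctypes sketch below. This is a sketch under assumptions, not the author's code: it uses the rusage_info_v0 layout as declared in macOS's <sys/resource.h> and the proc_pid_rusage call from libproc, and the field order should be double-checked against your SDK headers before trusting the numbers.

```python
import ctypes
import os

# Sketch: read a process's physical memory footprint on macOS via libproc.
# Assumes the rusage_info_v0 struct layout from <sys/resource.h>.
class RUsageInfoV0(ctypes.Structure):
    _fields_ = [
        ("ri_uuid", ctypes.c_uint8 * 16),
        ("ri_user_time", ctypes.c_uint64),
        ("ri_system_time", ctypes.c_uint64),
        ("ri_pkg_idle_wkups", ctypes.c_uint64),
        ("ri_interrupt_wkups", ctypes.c_uint64),
        ("ri_pageins", ctypes.c_uint64),
        ("ri_wired_size", ctypes.c_uint64),
        ("ri_resident_size", ctypes.c_uint64),
        ("ri_phys_footprint", ctypes.c_uint64),  # tends to track Activity Monitor's "Memory"
        ("ri_proc_start_abstime", ctypes.c_uint64),
        ("ri_proc_exit_abstime", ctypes.c_uint64),
    ]

RUSAGE_INFO_V0 = 0
libproc = ctypes.CDLL("/usr/lib/libproc.dylib", use_errno=True)

def phys_footprint(pid: int) -> int:
    """Return the physical footprint of `pid` in bytes, or raise OSError."""
    info = RUsageInfoV0()
    if libproc.proc_pid_rusage(pid, RUSAGE_INFO_V0, ctypes.byref(info)) != 0:
        raise OSError(ctypes.get_errno(), "proc_pid_rusage failed")
    return info.ri_phys_footprint

# Example: print this Python process's footprint in MB.
print(phys_footprint(os.getpid()) / 1e6)
```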

Lambda Land 1 year ago

My Top Emacs Packages

If you ask anyone what the best Emacs packages are, you’ll almost definitely hear Magit (the only Git porcelain worth using) and Org Mode (a way to organize anything and everything in plain text) listed as #1 and #2. And they’re right! I use those packages extensively every day. Besides those two powerhouses, there are a handful of packages that make using Emacs a delight. If I had to ever use something else, I would miss these packages most: Jump around your screen crazy fast. Teleport to any character with ~5 key strokes. See https://karthinks.com/software/avy-can-do-anything/ for more reasons why it’s awesome. I almost exclusively rely on and have it bound to . Kind of like a super-charged right-click for Emacs. Works beautifully in dired, when selecting files in the minibuffer. There’s an easy way to make it play well with Avy which is just the best. Eat is a terminal emulator that’s faster almost all the other terminal emulators for Emacs. The only emulator it’s not faster than is Vterm, which is pretty dang speedy. Eat has been more than fast enough for all my needs however. Additionally, it can make a terminal emulator in a particular region , so if you use Eshell, you can get a little terminal emulator for every command you run. Normally, if you run, say, , you see the ugly terminal escape characters printed as text. With Eat, however, those terminal escape characters get interpreted correctly. Interactive programs (e.g. the Julia and Elixir REPLs) work flawlessly with it. Best spellchecking ever. It can spellcheck based off of the fontlock face; I keep this on when I program to get on-the-fly spellchecking of code comments and strings. I keep bound to à la flyspell because it is so darn helpful. Supports checking documents with mixed languages. This is one of the packages I miss most when I’m editing text outside of Emacs. The best way to add citations in Emacs, hands-down. Reads bibtex, inserts in org-mode, LaTeX, whatever. These next packages are all by Daniel Mendler . These packages improve selecting commands, buffers, files, etc. from the and interfaces. These make Emacs insanely ergonomic and excellent. These replace packages like Helm , Ivy/Counsel/Swiper , and Company . In comparison to these packages, Vertico + Consult + Corfu are lighter-weight, faster, less buggy (in my experience; I’ve tried them all!), and work better with other Emacs packages because they follow the default built-in APIs. Lighter-weight, less buggy vertical completing-read interface. Replaces Ivy. Incredibly flexible. Works out-of-the-box with everything that has a interface, so you don’t need special packages to make it play nice. Recommend adding Marginalia as well by the same author to add extra infos. Better than counsel. The live preview is amazing; I use instead of , instead of Swiper. is :fire: for searching large projects with instant previewable results. Pairs well with Embark to save results to a buffer. Lightweight pop-up library. Pairs well with Cape by the same author. See also Orderless which enhances everything from to to the Corfu popup. Vertico + Consult + Orderless + Marginalia + Corfu + Cape + Embark is sometimes called the “minad stack”. Embark and Orderless are both developed by Omar Camarena (oantolin) who frequently collaborates with Daniel Mendler. When I asked Omar on Reddit about the name, Omar replied that “minad stack” is fine; another name they’ve tried for the stack is “iceberg”, which I think is a good name too. 
It’s the new hotness—that said, it’s gotten really really stable over the past two years. If you like these packages, consider sponsoring their maintainers! These are some of my favorite open-source projects and I try to support them when I can. If you like these packages, you might like my Emacs Bedrock starter kit which, unlike many other starter kits, is meant to be a no-nonsense no-fluff no-abstraction bare-bones start for you to fork and tinker with to your liking. The stock configuration only installs one package ( which-key , which is amazing) but includes some extra example configuration. The extras/base.el file includes sample starter configuration for most of the above packages. (I should add to it, come to think of it…) Avy Jump around your screen crazy fast. Teleport to any character with ~5 key strokes. See https://karthinks.com/software/avy-can-do-anything/ for more reasons why it’s awesome. I almost exclusively rely on and have it bound to . Embark Kind of like a super-charged right-click for Emacs. Works beautifully in dired, when selecting files in the minibuffer. There’s an easy way to make it play well with Avy which is just the best. Eat Eat is a terminal emulator that’s faster almost all the other terminal emulators for Emacs. The only emulator it’s not faster than is Vterm, which is pretty dang speedy. Eat has been more than fast enough for all my needs however. Additionally, it can make a terminal emulator in a particular region , so if you use Eshell, you can get a little terminal emulator for every command you run. Normally, if you run, say, , you see the ugly terminal escape characters printed as text. With Eat, however, those terminal escape characters get interpreted correctly. Interactive programs (e.g. the Julia and Elixir REPLs) work flawlessly with it. Jinx Best spellchecking ever. It can spellcheck based off of the fontlock face; I keep this on when I program to get on-the-fly spellchecking of code comments and strings. I keep bound to à la flyspell because it is so darn helpful. Supports checking documents with mixed languages. This is one of the packages I miss most when I’m editing text outside of Emacs. Citar The best way to add citations in Emacs, hands-down. Reads bibtex, inserts in org-mode, LaTeX, whatever. Vertico Lighter-weight, less buggy vertical completing-read interface. Replaces Ivy. Incredibly flexible. Works out-of-the-box with everything that has a interface, so you don’t need special packages to make it play nice. Recommend adding Marginalia as well by the same author to add extra infos. Consult Better than counsel. The live preview is amazing; I use instead of , instead of Swiper. is :fire: for searching large projects with instant previewable results. Pairs well with Embark to save results to a buffer. Corfu Lightweight pop-up library. Pairs well with Cape by the same author. Eat is not the fastest terminal emulator, Vterm is. Thanks to a Redditor who pointed this out.

blog.philz.dev 1 year ago

A Bibliography of Sorts and Some Quotes

These were impactful to me, one way or another. Did you just tell me to... is a classic from @jrecursive. Migrations are a fact of software engineering life, and this, by Manu Cornet , is on point. Julia Evans's comics and zines are a national treasure. I learned some options to ! I learned about CSS! I've shared the post on SQL queries don't start with SELECT many times! Google published The Standard of Code Review which includes the following: In general, reviewers should favor approving a CL once it is in a state where it definitely improves the overall code health of the system being worked on, even if the CL isn’t perfect. I learned a lot from the ggplot2 book . I've followed Jeff Heer's work including Vega Lite, and you could do much worse than reading some of it. Matt Eccleston wrote a blog post on code-centric versus product-goal-centric teams . My friend Dan wrote Effective Typescript . Sometimes, coming up with the right approach to testing a problem is the key to solving the problem. Don't take just my word for it; here's FoundationDB Testing and debugging distributed systems is at least as hard as building them. Unexpected process and network failures, message reorderings, and other sources of non determinism can expose subtle bugs and implicit assumptions that break in reality, which are extremely difficult to reproduce or debug. The consequences of such subtle bugs are especially severe for database systems, which purport to offer perfect fidelity to an unambiguous contract. Moreover, the stateful nature of a database system means that any such bug can result in subtle data corruption that may not be discovered for months. FDB took a radical approach— before building the database itself, we built a deterministic database simulation framework that can simulate a network of interacting processes and a variety of disk, process, network, and request-level failures and recoveries, all within a single physical process. This rigorous testing in simulation makes FDB extremely stable, and allows its developers to introduce new features and releases in a rapid cadence. When I approach a new code base, I look first at what happens on disk (typically, the database schemas) and what happens across the wire (the RPC definitions like protobuf or thrift files, OpenAPI/swagger/typescript schemas, clicking around in the network tab in Chrome). This quote, from The Mythical Man-Month (Brooks) strikes a chord: Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won’t usually need your flowcharts; they’ll be obvious. Tracing. I had the privilege of using Dapper while at Google. X-Trace is similar. The ad-hoc version of this is to add a random identifier to log statements (especially "canonical log lines") and pass those identifiers along across RPC boundaries (e.g., via an HTTP header). Add canonical log lines (thanks, Stripe and Brandur Leach, for the write up) to your system. If you can make it queryable with SQL, you have a lovely analytics system! Nelson Elhage's "What does a cache do?" is a great discussion of why read replicas and caches are different. E-mail me if you have pointers to these! A history of UI library APIs. How did we get from Apple II graphics to React? These libraries seem so dynamic; I'd love to read a history! A picture of the Cauldron visualization for a CDH build. (This is very specific!)
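The "ad-hoc version" of tracing mentioned above is easy to sketch: mint a request ID once, attach it to every log line, and forward it to downstream calls. The snippet below is a minimal, hypothetical Python illustration of that pattern, not code from any of the posts referenced here; the header name and field name are just conventions.

```python
import logging
import uuid

# Attach one request ID to every log line and pass it along to downstream
# services (e.g. via an X-Request-ID header), so related lines can be joined later.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s request_id=%(request_id)s %(message)s",
)
log = logging.getLogger("app")

def handle_request(headers: dict) -> dict:
    request_id = headers.get("X-Request-ID") or uuid.uuid4().hex
    extra = {"request_id": request_id}
    # One "canonical log line" summarizing the request:
    log.info("path=/checkout status=200 duration_ms=42", extra=extra)
    # Forward the same ID when calling other services:
    return {"X-Request-ID": request_id}

handle_request({})
```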

NorikiTech 1 year ago

Writing by hand for side projects

In the past few months I made good progress on my side projects ( DDPub , Typingvania ) compared to the previous period of drought. What changed? I started writing about them by hand. (Every journal is 140 A4 pages and I’m halfway through my third.) I’ve been following a practice called “ morning pages ” created by Julia Cameron. I learned about it by reading her book “ The Artist’s Way .” Previously I tried a similar practice called “freewriting” where you would ask a question and write for a predetermined short amount of time, but it didn’t stick. I’ve been doing the pages for about six months now. It is as simple as is explained on the page above—you sit down, preferably every day, and write down a certain amount straight from your thoughts. For me the right amount is two pages of A4 which takes me about an hour to write. It’s quite a lot and I skip some days, but it works so I keep at it. I use some space to write about my personal life, but over time I also started writing about the projects I’m working on. Usually it takes the form of a narrative around the next steps: Here I am in project right now: … The next two things I want to work on are this and this. How would I implement this approach? I would need to extend this object with such and such fields and add methods that would do this and this. When I transform this data, I’ll store it in this type of structure… I wonder if I could do this instead and rewrite that bit? The same internal monologue would happen if I were sitting in front of the code editor, but it’s hard to fully follow the thread of thought when you’re there trying to write some code at the same time. Writing by hand forces you to complete the thought to the end. There is some recent research showing higher brain activation when writing by hand. I don’t know if it’s that or not, but I often have really neat ideas when writing that I’m sure I would have missed otherwise—and I write them down immediately because I’m writing! Solutions to some architectural or UI problems simply presented themselves to me while I was writing. Clearly I thought about them previously, but I was also receptive during the writing sessions. Another benefit of the morning pages (that I usually do during lunch or even in the evening) is that divorcing the planning from the execution really helps when you are mentally fatigued. I work full time and do my side projects in the evening, and after a full day of work there is sometimes nothing but static in my head. It’s hard to stop working only to sit down and start working again. If I have written my pages, I already have a specific plan of the next steps, what exactly I want to do and how to implement it. It’s easier to sit down and start working on the side project when I’ve already done the hard part (came up with the implementation) and I only need to execute. The progress I make informs my next writing session, and the cycle repeats. The pages may take quite a lot of time, but as I said, they work (for me). I started to work and make progress on ideas that were sitting in my notes for five years or more. If I don’t write the pages, I’m likely to spend the same amount of time just starting at the editor trying to collect my thoughts and understand what to do next, that is if I sit down to work on the project at all. The pages help with that too. I reacknowledge to myself why I’m doing what I’m doing and what result I’m expecting. I’m prone to be distracted and anxious, and the pages give me more focus than I feel otherwise. 
The practice is simple so if you have similar problems as me, I encourage you to try it for a week and see if it makes any difference.

blog.philz.dev 2 years ago

Creating a monorepo out of a multirepo

Inspired by Julia Evans' posts on git , I'm jotting down an obscure trick to combine repos. Sometimes you have multiple repos, and you want to create a monorepo out of them. Perhaps you have a distribution of many components, and it's convenient to across all of them together. The following annotated snippet creates two repos and joins them together. The key insight is that a commit in git is made up of (roughly) a message, pointers to parent commits, and a "tree," the latter of which can be made synthetically by using plumbing commands. This approach is probably overkill for a one time merging of two repos. In that case, create a commit in your second repo that moves everything to a subdirectory, add the second repo as a remote, and merge in that second repo using the . Rendered another way (with from and questionable abuse of ): If you're doing this for real, note that the above will fail spectacularly if you have spaces (and their ilk) in your names. Use or something. Create two repos, a-repo and b-repo, and initialize them with a file and some commits. Initialize the monorepo Add both subrepos as remotes and fetch them. Synthesize a tree listing and create it. Synthesize a commit with all the parents Update our working copy You can see how the tree preserves the subrepo histories Now let's let the A repo change Now we have to redo the merge. We use the same trick, sorta. Since we have that nice "README.md" in the mono repo tree, we want to preserve that. But, when we pull out , we have those and trees, and we want to filter those out. So, here we're abusing to filter them out. The and expression is producing . And, sure enough, voila!
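The annotated shell snippet from the post didn't survive here, so the sketch below shows the same plumbing trick driven from Python. It is an illustration, not the original snippet: it assumes a fresh "monorepo" with the two source repos already added and fetched as remotes "a" and "b", both with a branch called main (the a-repo/b-repo directory names come from the post's own example).

```python
import subprocess

def git(*args, input=None, cwd="monorepo"):
    """Run a git command in the monorepo and return its stripped stdout."""
    res = subprocess.run(["git", *args], cwd=cwd, input=input,
                         check=True, capture_output=True, text=True)
    return res.stdout.strip()

# Hypothetical setup: remotes "a" and "b" already added and fetched.
a_commit = git("rev-parse", "a/main")
b_commit = git("rev-parse", "b/main")
a_tree = git("rev-parse", "a/main^{tree}")
b_tree = git("rev-parse", "b/main^{tree}")

# Synthesize a tree that nests each repo's root tree under a subdirectory...
listing = (f"040000 tree {a_tree}\ta-repo\n"
           f"040000 tree {b_tree}\tb-repo\n")
mono_tree = git("mktree", input=listing)

# ...and a commit whose parents are the tips of both repos, preserving history.
merge = git("commit-tree", mono_tree, "-p", a_commit, "-p", b_commit,
            "-m", "Combine a-repo and b-repo into one monorepo")
git("update-ref", "refs/heads/main", merge)
git("reset", "--hard", "main")  # update the working copy
```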

Lambda Land 2 years ago

Functional Languages Need Not Be Slow

Somewhere in my adolescence I got stuck with the notion that functional languages were slow while languages like C were fast. Now, a good C programmer can eke more performance out of their code than probably anyone else, but the cost you pay to keep your code correct goes exponential as you get closer and closer to the machine. Functional languages abstract a lot away from the machine. Higher-level languages in general abstract away the machine and make code easier to maintain. So I think I had it in my head that functional languages, being far away from the bare metal, must necessarily be slow. It didn’t help that I also thought of functional languages as being necessarily interpreted languages. Turns out, functional languages are just as amenable to compilation as imperative ones. Many popular/well-known functional languages are in fact compiled. (E.g. Haskell, Scala, Rust, Julia, etc.) Moreover, these languages can be just as fast—if not faster—than their more “mainstream” counterparts. I wanted to pit my favorite language (Racket) against a slightly more well-known language (Python) to see how they handled a simple single-threaded program. For good measure I threw Rust, Julia, and JavaScript into the mix for comparison. If you’re impatient, just jump to the results . I wrote the original program in Racket, then had ChatGPT help me rewrite it in Python. ChatGPT did astoundingly well. I eventually rewrote the program to be a little more idiomatic—I wanted it to use a loop instead of tail recursion, as Python is much better with loops than it is with lots and lots of function calls. I also had ChatGPT help me rewrite the program to Rust, Julia, and JavaScript. Impressive—and unsettling. I ran these programs on my M1 Pro MacBook Pro. Here’s what I got: In graphical form: Wow! I did not expect Python to get so pummeled by everything else! It makes sense that Julia with its sweet JIT engine is the fastest. Rust does well—no surprise there. (Note that I’m not counting compilation time here—all the more impressive for Julia!) Racket holds its own though, coming in third place by a wide margin. If you did take Rust’s compile time into account, I think that would make it switch places with Racket. Of course, you use Rust for compile-once-run-lots sorts of scenarios. Are these authoritative? No, of course not. This is a simple synthetic benchmark. I don’t consider myself an expert programmer in any of these languages, so there are likely some performance tweaks that could make the respective language’s benchmark run faster. (Maybe I should in Racket… it’s kind of hard to consider yourself an expert in a language when its creator works down the hall from you though.) That said, I hope this dispels the myth that functional languages are necessarily slow. That is not the case. If Python is fast enough for your use-case, then there’s no reason you shouldn’t consider using Racket on performance grounds. Library support is a different matter: of all the programming that goes on in the world, the majority is probably just gluing libraries together to do the thing you want. This is a good thing: it means we’re not reinventing so many wheels and people are getting good work done. That said, there’s plenty of exciting work for which no library exists! If you find yourself in one of these exciting cases, consider using the tool of maximum power: in this regard, nothing beats Racket for its flexibility and extensibility . I love the Racket Plot library. So easy to use, so powerful.
If you run it inside of DrRacket, the graphs are actually interactive: you can rotate 3D graphs and zoom in on sections of 2D graphs. So neat! Here’s the code I used to generate the above graph:

Hillel Wayne 2 years ago

Notes on Every Strangeloop 2023 Talk I Attended

This is my writeup of all the talks I saw at Strangeloop, written on the train ride back, while the talks were still fresh in my mind. Now that all the talks are online I can share it. This should have gone up like a month ago but I was busy and then sick. Enjoy! Topic: How to define what “success” means to you in your career and then be successful. Mostly focused on psychological maxims, like “put in the work” and “embrace the unknown”. I feel like I wasn’t the appropriate audience for this; it seemed intended for people early in their career. I like that they said it’s okay to be in it for the money. Between the “hurr dur you must be in it for the passion” people and the “hurr durr smash capitalism” people, it’s nice to hear some acknowledgement that money makes your life nice. Topic: the value of “play” (as in “play make believe”, or “play with blocks”) in engineering. Some examples of how play leads to innovation, collaboration, and cool new things. Most of the talk is about the unexpected directions her “play” went in, like how her work in education eventually lead to a series of collaborations with OK Go. I think it was more inspirational than informative, to try to get people to “play” rather than to provide deep insight into the nature of the world. Still pretty fun. (Disclosure, I didn’t actually see this talk live, I watched Zac Hatfield-Dodds rehearse and gave feedback. Also Zac is a friend and we’ve collaborated before on Hypothesis stuff.) Topic: Some of the unexpected things we observe in working LLMs, and some of the unexpected ways they’re able to self-reference themselves. Zac was asked to give the talk at the last minute due to a sickness cancellation by another speaker. Given the time crunch, I think he pulled it together pretty well. Even so it was a bit too technical for me; I don’t know if he was able to simplify it in time for the final presentation. Like most practical talks on AI, intentionally or not he slipped in a few tricks to eke more performance out of an LLM. Like if you ask them to answer a question, and then rate the confidence of the question they asked, they tend to be decently accurate at their confidence. Zac’s also a manager at Anthropic , which gave the whole talk some neat “forbidden knowledge” vibes. (Disclaimer: Douglas is a friend, too.) Topic: Why stack-based languages are an interesting computational model, how they can be Turing-complete, and some of the unusual features you get from stack programming. The first time I’ve seen a stack-based language talk that wasn’t about Forth. Instead, it used his own homegrown stack language so he could just focus on the computer science aspects. The two properties that stick out to me are: My only experience with stack machines is golfscript . Maybe I’ll try to pick up uiua or something. Topic: “Small” generative AI models, like “taking all one-star amazon reviews for the statue of liberty and throwing them into a Markov chain ”. This was my favorite session of the conference. The technical aspects are pretty basic, and it’s explained simply enough that even layfolk should be able to follow. His approach generates dreamy nonsense that should be familiar to anyone who’s played with Markov chains before. And then he pulls out a guitar. The high point was his “Weird Algorithm”, which was like a karaoke machine which replaced the lyrics of songs with corpus selections that matched the same meter and syllables. Like replacing “oh I believe in yesterday” with “this is a really nice Hyundai”. 
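(As an aside, the "throw a corpus into a Markov chain" trick behind that session is only a few lines of code. The sketch below is a generic word-level illustration of the idea, not the speaker's code or corpus.)

```python
import random
from collections import defaultdict

# Minimal word-level Markov chain: map each word to the words that follow it
# in the corpus, then take a random walk to generate dreamy nonsense.
def build_chain(text: str) -> dict:
    words = text.split()
    chain = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        chain[current].append(nxt)
    return chain

def babble(chain: dict, start: str, length: int = 20) -> str:
    out = [start]
    for _ in range(length):
        followers = chain.get(out[-1])
        if not followers:
            break
        out.append(random.choice(followers))
    return " ".join(out)

corpus = "the statue was small and the line was long and the view was fine"
print(babble(build_chain(corpus), start="the"))
```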
I don’t know how funny it’ll be in the video, it might be one of those “you had to be there” things. Topic: The modern pace of tech leaves a lot of software, hardware, and people behind. How can we make software more sustainable, drawn from the author’s experiences living on a boat. Lots of thoughts on this one. The talk was a crash course on all the different kinds of sustainability: making software run on old devices, getting software guaranteed to run on future devices, computing under significant power/space/internet constraints, and everything in between. I think it’s intended as a call to arms for us to think about doing better. I’m sympathetic to the goals of permacomputing; what do I do with the five old phones in my closet? That’s a huge amount of computing power just gone to waste. The tension I always have is how this scales. Devine Lu Levinga is an artistic savant (they made Orca !) and the kind of person who can live on a 200-watt boat for seven years. I’m not willing to give up my creature comforts of central heating and Geforce gaming. Obviously there’s a huge spectrum between “uses less electricity than a good cyclist ” and “buys the newest iPhone every year”, the question is what’s the right balance between sustainability and achievability. There’s also the whole aesthetic/cultural aspect to permacomputing. Devine used images in dithered black/white. AFAICT this is because Hypercard was black/white, lots of retrocomputing fans love hypercard, and there’s a huge overlap between retrocomputing and permacomputing. But Kid Pix is just as old as Hypercard and does full color. It’s just not part of the culture. Nit: at the end Devine discussed how they were making software preservation easier by writing a special VM. This was interesting but the discussion on how it worked ended up going way over time and I had to run to the next session. (Disclaimer: Jesus I’m friends with way too many people on the conference circuit now) Topic: Formal methods is useful to reason about existing legacy systems, but has too high a barrier to entry. Marianne made a new FM language called “Fault” with a higher levels of abstraction. Some discussion of how it’s implemented. This might just be the friendship talking, but Fault looks like one of the more interesting FM languages to come out recently. I’m painfully aware of just how hard grokking FM can be, and anything that makes it more accessible is good. I’ll have to check it out. When she said that the hardest part is output formatting I felt it in my soul. Topic: Lots of “simple” things take years to learn, like Bash or DNS. How can we make it easier for people to learn these things? Four difficult technologies, and different approaches to making them tractable. I consider myself a very good teacher. This talk made me a better one. Best line was “behind every best practice is a gruesome story.” That’ll stick with me. Topic: Randal Munroe (the xkcd guy)’s closing keynote. No deep engineering lessons, just a lot of fun. Before Julia Evans’ talk, Alex did A Long Strange Loop , how it went from an idea to the monolith it is today. Strange Loop was his vision, an eclectic mix of academia, industry, art, and activism. And it drew a diverse crowd because of that. I’ve made many friends at Strangeloop, people like Marianne and Felienne . I don’t know if I’ll run into them at any other conferences, because I don’t think other conferences will capture that lightning in a bottle. I’ll miss them. I also owe my career to Strangeloop. 
Eight years ago they accepted Tackling concurrency bugs with TLA+ , which got me started both speaking and writing formal methods professionally. There’s been some talk about running a successor conference (someone came up with the name “estranged loop”) but I don’t know if it will ever be the same. There are lots of people can run a good conference, but there’s only one person who can run Alex’s conference . Whatever comes next will be fundamentally different. Still good, I’m sure, but different. Stack programs don’t need to start from an empty stack, which means entire programs will naturally compose. Like you can theoretically pipe the output of a stack program into another stack program, since they’re all effectively functions of type . Stack ops are associative: if you chop a stack program into subprograms and pipe them into each other, it doesn’t matter where you make the cuts, you still get the same final stack. That’s really, really cool.

fnands 2 years ago

Mojo 0.5.0 and SIMD

Another month, another Mojo release! It feels like every time I run into a missing feature in Mojo it gets added in the next version: In my first post I complained about a lack of file handling, which was then added soon after in version . For version I ran into the issue that you can’t print Tensors, which has now been added in the release. So this means Mojo has now unlocked everyone’s favourite method of debugging: printing to stdout. In addition to that, Tensors can now also be written to and read from files with and . There have also been a couple of updates to the SIMD type, which led me to ask: How does the SIMD type work in Mojo? For a bit of background, you might have noticed that CPU clock speeds haven’t really increased by much in the last decade or so, but computers have definitely gotten faster. One of the factors that has increased processing speed has been a focus on vectorization through SIMD, which stands for Single Instruction, Multiple Data, i.e. applying the same operation to multiple pieces of data. Modern CPUs come with SIMD registers that allow the CPU to apply the same operation over all the data in that register, resulting in large speedups, especially in cases where you are applying the same operation to multiple pieces of data, e.g. in image processing where you might apply the same operation to the millions of pixels in an image. One of the main goals of the Mojo language is to leverage the ability of modern hardware, both CPUs and GPUs, to execute SIMD operations. There is no native SIMD support in Python, however Numpy does make this possible. Note: SIMD is not the same as concurrency, where you have several different threads running different instructions. SIMD is doing the same operation on different data. Generally, SIMD objects are initialized as , so to create a SIMD object consisting of four 8-bit unsigned integers we would do: And actually, SIMD is so central to Mojo that the builtin type is actually just an alias for : Modern CPUs have SIMD registers, so let’s use the package in Mojo to see what the register width on my computer is: This means we can pack 256 bits of data into this register and efficiently vectorize an operation over it. Some CPUs support AVX-512, with, as the name suggests, 512-bit SIMD registers. Most modern CPUs will apply the same operation to all values in their register in one step, allowing for significant speedup for functions that can exploit SIMD vectorization. In my case, we’ll have to live with 256 bits. This means in this register we can either put 4 64-bit values, 8 32-bit values, 16 16-bit values, or even 32 8-bit values. We can use the utility function to tell us how many 32-bit floating point numbers will fit in our register: One of the new features in Mojo is that SIMD types will default to the width of the architecture, meaning if we call: Mojo will automatically pack 8 32-bit values, or 32 8-bit values into the register. This is equivalent to calling: Operations over SIMD types are quite intuitive. Let’s try adding two SIMD objects together: Additionally, since version 0.5.0, we can also concatenate SIMD objects with : Operations applied to a SIMD object will be applied element-wise to the data in it, if the function is set up to handle this: As far as I can tell, this doesn’t just work automatically.
If I define a function as: Then applying it to a single floating point number works as expected: But trying this on a SIMD object does not: However, if I define a version of the function to take a SIMD object: Then (with the additional specification of the parameter ), it will apply the function to all the values: While still working on single floating point values, as they are just SIMD objects of width one under the hood: I do miss the flexibility of Julia a bit, where you can define one function and then vectorize it with a dot, i.e. if you have a function that operates on scalar values, then calling will apply it element-wise to all values of that vector, and return a vector of the same shape. But for the most part, defining functions just to apply to SIMD values in Mojo doesn’t lose you much generality anyway. To be honest, I was a little bit daunted when I first saw the SIMD datatype in Mojo. I vaguely remember playing around with SIMD in C++, where it can be quite complicated to implement SIMD operations. But in Mojo, it really is transparent and relatively straightforward to get going with SIMD. It is clear that exploiting vectorization is a top priority for the Modular team, and a lot of thought has clearly gone into making it easy to exploit the SIMD capabilities of modern hardware. I might take a look at vectorization vs parallelization in Mojo in the future, and maybe even try my hand at a bit of benchmarking.
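Since the post contrasts Mojo's SIMD-by-default behaviour with Python, where vectorization comes from NumPy rather than the language, here is roughly what the Python side of that comparison looks like. The scalar function is a made-up example, not the one from the post.

```python
import numpy as np

# In plain Python you apply a scalar function one element at a time;
# with NumPy the same expression is applied element-wise over the whole
# array, which NumPy can back with SIMD instructions under the hood.
def scalar_op(x: float) -> float:
    return x * x + 2.0

values = [0.0, 1.0, 2.0, 3.0]
looped = [scalar_op(v) for v in values]   # one element at a time

arr = np.array(values, dtype=np.float32)
vectorized = arr * arr + 2.0              # whole array at once
print(looped, vectorized)
```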

Lambda Land 2 years ago

Towards Fearless Macros

Macros are tricky beasts. Most languages—if they have macros at all—usually include a huge “here there be dragons” warning to warn curious would-be macro programmers of the dangers that lurk ahead. What is it about macros that makes them so dangerous and unwieldy? That’s difficult to answer in general: there are many different macro systems with varying degrees of ease-of-use. Moreover, making macros easy to use safely is an open area of research—most languages that have macros don’t have features necessary to implement macros safely. Hence, most people steer clear of macros. There are many ways to characterize macro systems; I won’t attempt to cover them all here, but here’s the spectrum I’ll be covering in this post: Figure 1: A spectrum of how easy macro systems are to use safely If you’ve done any C programming, you’ve likely run into things like: That bit is a macro—albeit a C macro. These operate just after the lexer: they work on token streams. It’s a bit like textual search-and-replace, though it knows a little bit about the structure of the language (not much: just what’s a token and what’s not) so that you won’t run into problems if you do something like this: because that in the string is not a token—it’s just part of a string. C macros can’t do very much: you scan the token stream for a macro, then fill in the variables to the macro, and then replace the macro and the arguments its consumed with the filled-out template that is the macro definition. This prevents you from doing silly things like replacing something sitting inside of a string literal, but it’s far, far from being safe, as we’ll see in the next section. In contrast to C’s macros, Lisp’s macros are much more powerful. Lisp macros operate after the lexer and the parser have had a go at the source code—Lisp macro operate on abstract syntax trees —or ASTs, which is what the compiler or interpreter works with. Why is this a big deal? The ASTs capture the language’s semantics around precedence, for instance. In C you can write a macro that does unexpended things, like this: The macro didn’t know anything about precedence and we computed the wrong thing. This means that, to use a macro in C, you have to have a good idea of how it’s doing what it’s intended to do. That means C macros are leaky abstractions that prevent local reasoning: you have to consider both the macro definition and where it’s used to understand what’s going on. In contrast, Lisp macros are an improvement because they will rewrite the AST and the precedence you’d expect will be preserved. You can do this, for example: Lisp macros are also procedural macros , meaning you can execute arbitrary code inside of a macro to generate new ASTs. Macros in Lisp and its descendants are essentially functions from AST → AST. This opens up a whole world of exciting possibilities! Procedural macros constitute a “lightweight compiler API”. [ 4 ] “Same except for variable names” is also called alpha-equivalence. This comes from the λ-calculus, which states that the particular choice of variable name should not matter. E.g. \(\lambda x.x\) and \(\lambda y.y\) are the same function in the lambda calculus, just as \(f(x) = x + 2\) and \(g(y) = y + 2\) are the same function in algebra. Lisp macros aren’t without danger—many a Lisp programmer has shot their foot off with a macro. One reason is that Lisp macros are not hygienic —variables in the macro’s implementation may leak into the context of the macro call. 
This means that two Lisp programs that are the same except for different variable names can behave differently: the fact that the macro implementation uses a variable named tmp has leaked through to the user of the macro. This phenomenon is called variable capture, and it exposes this macro as a leaky abstraction! There are ways to mitigate this using gensym, but those are error-prone manual techniques. It makes macro writing feel like you're writing in an unsafe lower-level language. Scheme's macros introduce a concept known as hygiene, which prevents variable capture automatically: the variable that the macro introduces is not the same thing that the variable of the same name in the calling context refers to. This separation of scopes happens automatically behind the scenes, so there's now no chance of accidental variable capture. Breaking hygiene does have some utility in some cases—for example, one might deliberately want a macro to introduce a binding that is visible inside the body of a loop. There are ways around hygiene, but these are not without some problems. For more details see [ 2 ]. If you'd like to know more about hygiene, [ 1 ] is an excellent resource. Since Scheme macros (and Lisp macros more generally) allow running arbitrary Scheme code—including code from other modules—the dependency graph between modules can get so tangled that clean builds of a Scheme codebase are impossible. Racket solves this problem with its phase separation, which puts clear delimiters between when functions and macros are available to different parts of the language. This detangles dependency graphs without sacrificing the expressive power of macros. I wrote a little bit about phase separation; you can read more in the Racket docs as well as Matthew Flatt's paper [ 3 ] on phase separation. Racket also has a robust system for reasoning about where a variable's definition comes from, called a scope set. This notion makes reasoning about where variables are bound sensible. See a blog post as well as Matthew Flatt's paper Binding as Sets of Scopes for more on scope sets. Phase separation and scope sets make Racket macros the safest to use: Racket macros compose sensibly and hide their implementation details, so it is easy to write macros that can be used as if they were built-in language constructs. Racket also goes beyond the syntax-rules form that it inherited from Scheme; Racket's syntax-parse macro-building system makes generating good error messages easy. There's a little bug in the macro we used earlier: it only accepts an identifier (i.e. a variable) as its first argument. We don't have any error checking inside the macro; if we were to call it with something that wasn't an identifier, we'd get an error in terms of the code the macro expands to, not the macro call itself. This isn't good, because the form named in the error doesn't appear in our code at all! We could add some error handling in our macro to manually check that the arguments are identifiers, but that's a little tedious. Racket's syntax-parse helps us out, and now our error is in terms that the macro user will recognize. Much better! There are lots of other things syntax-parse can do that make it easy to write correct macros that generate good error messages—a must for macros that become a part of a library. Many modern languages use macros; I'll only talk about a few more here. If something's missing, that's probably because I didn't want to be exhaustive.
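Julia, which comes up first below, gives a quick way to see hygiene in action. A minimal sketch (my own example, not from the original post):

```julia
# The `tmp` introduced inside the macro is automatically renamed (gensym'd)
# during expansion, so it cannot capture or clobber a `tmp` at the call site.
macro swap(a, b)
    quote
        tmp = $(esc(a))
        $(esc(a)) = $(esc(b))
        $(esc(b)) = tmp
    end
end

x, y, tmp = 1, 2, "untouched"
@swap(x, y)
(x, y, tmp)    # (2, 1, "untouched"): the caller's tmp survives
```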
Julia macros have a lot of nice things going for them: they operate on ASTs and they're hygienic, though the way hygiene is currently implemented is a little strange: all variables get gensym'd automatically (meaning they all get replaced with some generated symbol that won't clash with any possible variable or function name), whether they come from inside the macro or originated from the calling code. Part of the problem is that all variables are represented as simple symbols, which [ 1 ] shows is insufficient to properly implement hygiene. Evidently there is some ongoing work to improve the situation. This is a good example of research ideas percolating into industry languages, I think. Elixir has robust AST macros, and its standard library makes heavy use of them; many "core" Elixir constructs like if, unless, def, defmodule, and others are actually macros that expand to smaller units of Elixir. Elixir actually gets hygiene right! Unlike Julia, variables in Elixir's AST have metadata—including scope information—attached to them. This and other aspects of Elixir's macro system open it up to lots of exciting possibilities. The Nx library brings support for numerical and GPU programming to Elixir, and it works essentially by implementing a custom Elixir compiler in Elixir itself; macros play a big role in this. Me thinking that Elixir is a big mainstream language should tell you something about the languages I spend my time with in my job as a PhD student. I think Elixir macros are really neat—they're the most powerful I've seen in a "big mainstream" language. Rust supports two kinds of macros: macros-by-example and procedural macros. Macros-by-example are a simple pattern-to-pattern transformation; the example in The Rust Book is the vec! macro, which takes a call like vec![1, 2, 3] and expands it into code that creates a Vec and pushes each element onto it. A $( ... )* marker in the template indicates a part that can be repeated, akin to Racket or Scheme's "..." repetition form. Macros-by-example work on the AST, but you can't perform arbitrary computation on the AST. For that, you need procedural macros. Rust's procedural macros (called "proc macros") work on a token stream, and you can perform arbitrary computation, which puts them in a bit of a funny middle ground between C and Lisp. There is a Rust crate (syn) that you can use to parse a Rust token stream into Rust AST, but you don't get any nice source information from the AST nodes, which makes producing good error messages a challenge. I personally find Rust macros to be disappointing. There's a wide variety of macro systems. The best macro systems operate on the AST rather than on a stream of tokens, avoid leaking implementation details through inadvertent variable capture by being hygienic, produce good error messages in terms of the caller's context, and (as a bonus) have good phase separation to keep complex macro systems cleanly separated. Different languages have different features in their macro systems; some languages make it easy to use macros sensibly, while for others macros are a formidable challenge to use properly—make sure you know what your language provides and the trade-offs involved. It turns out you can do a lot with functions. Powerful functional programming languages let you do so much with first-class functions. If you can get access to first-class continuations, as you can in Racket and Scheme, then you can create powerful new programming constructs without having to resort to macros. I also came across the JuliaCon 2019 keynote talk where Steven Johnson explains how many of the things that you can do with macros can be solved just with Julia's type dispatch.
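As a rough illustration of that idea (my own sketch, not an example from the talk): where one might otherwise write a macro to stamp out per-type variants, dispatch lets ordinary functions specialize.

```julia
# One generic name, specialized by dispatch rather than by macro-generated code.
half(x::Integer)        = x ÷ 2        # integer division
half(x::AbstractFloat)  = x / 2        # floating-point division
half(xs::AbstractArray) = half.(xs)    # broadcast over containers

half(7), half(7.0), half([1, 2, 3])    # (3, 3.5, [0, 1, 1])
```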
If you can do something with functions, you probably should: functions are first-class values in most languages these days, and you'll enjoy increased composability, better error messages, and code that is easier for your peers to read and understand. Macros introduce little languages wherever you use them. For simple macros, you might not have any constraints on what you may write under the scope of a macro. As an example, consider a macro that adds a new loop construct to a language by rewriting it to another kind of looping mechanism: you shouldn't have any restriction on what you can write inside the body of the loop. However, more complex macros can impose more restrictions on what can and cannot be written under their lexical extent. These restrictions may or may not be obvious. For example, accidental variable capture limits what can be safely written, and grammatical errors (e.g. using an expression where an identifier was expected) can lead to inscrutable errors. Better macro systems mitigate these problems. It's not enough to just have a macro system that uses ASTs; you need a macro system that makes it easy to write correct macros with clear error messages so they truly feel like natural extensions of the language. Few languages do this right. Macro systems have improved since the 1960s. While Lisp excluded many of the pitfalls of C macros by construction, you still had to use kluges like gensym to manually avoid variable capture. Scheme got rid of that with hygienic macros, and Racket improved matters further by strengthening macro hygiene through scope sets and introducing phase separation, making it much easier to build robust macro-based abstractions. Macros are good—anyone can write macros and experiment with new syntactic constructs. This means development and extension of the language is no longer the sole domain of the language designer and maintainer—library authors can experiment with different approaches to various problems. We see this a lot with Elixir: Elixir's core language is really rather small; most of the magic powering popular libraries like Ecto or Phoenix comes from a choice set of macro abstractions. These and other libraries are free to experiment with novel syntax without fear of cluttering and coupling the core language with bad abstractions that would then need to be maintained in perpetuity. Macros can be powerful when used correctly—something made much easier by modern macro systems.

fnands 2 years ago

Stereo vision and disparity maps (in Julia)

I’ve been working a lot recently with stereo vision and wanted to go through the basics of how disparity is calculated. I’m partially doing this as an excuse to get better at Julia (v1.9.3 used here). You can view the notebook for this blog post on GitHub. In much the same way that we as humans have depth perception by sensing the difference between the images we see with our left and right eyes, we can calculate depth from a pair of images taken from different locations, called a stereo pair. If we know the positions of our cameras, then we can use matching points in our two images to estimate how far away from the camera those points are. Taking a look at the diagram below (from OpenCV): if we have two identical cameras a distance \(B\) apart, with focal length \(f\), we can calculate the distance \(Z\) to an object by using the disparity between where the object appears in the left image (\(x\)) and where it appears in the right image (\(x'\)). In this simple case, the relation between disparity and distance is simply \(\text{disparity} = x - x' = \frac{Bf}{Z}\). If we know \(B\) and \(f\), then we can rearrange this to give us distance as a function of disparity: \(Z = \frac{Bf}{x - x'}\). You might notice that when the disparity is zero, the result is undefined. This is just due to the fact that in this setup the cameras are pointing in parallel, so in principle a disparity of zero should not be possible. The general case is more complicated, but we will focus on this simple setup for now. We can wrap this relation in a function, where the disparity and focal length are measured in pixels and the baseline is measured in centimeters. There is an inverse relation between distance and disparity, so once we have a disparity, it's relatively straightforward to get a distance. But how do we find disparities? We usually represent the disparities for a given pair of images as a disparity map, which is an array with the same dimensions as (one of) your images, but with disparity values for each pixel. In principle, this is a two-dimensional problem, as an object might be matched to a point that has both a horizontal and a vertical shift, but luckily you can always find a transformation to turn this into a one-dimensional problem. The cartoon below illustrates what a disparity map might look like. Above, we calculate the disparity with respect to the right image (you can do it with respect to the left image as well), and as you can see the disparity map tells us how many pixels to the right each object shifted in the left image vs. the right image. For a set of images (taken from the Middlebury Stereo Datasets), the corresponding disparity map can be visualized with darker pixels having lower disparity values and brighter pixels having higher disparity values, meaning the dark objects are far away from the cameras while the bright ones are close. The ground truth disparity shown above is usually calculated from LiDAR or some other accurate method, and our goal is to get as close as possible to those values using only the images themselves. So let's try to calculate disparity for the images above. There are many, many approaches to calculating disparity, but let us begin with the simplest approach we can think of. As a start, let us go through each pixel in the right image and, for that pixel, try to find the most similar pixel in the left image, taking the squared difference between pixel values as our similarity metric.
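The post's original code listings did not survive aggregation, so here is a rough sketch of the naive per-pixel matching just described, roughly along the lines of what the post goes on to do (my own names and code, not the author's):

```julia
# Squared-difference similarity metric between two pixel values.
sqdist(a, b) = (a - b)^2

# For each pixel in a row of the right image, find the most similar pixel in
# the same row of the left image and record how far to the right it sits.
function match_row(left_row, right_row)
    disparities = zeros(Int, length(right_row))
    for i in eachindex(right_row)
        dists = [sqdist(right_row[i], l) for l in left_row]
        best = argmin(dists)
        disparities[i] = max(best - i, 0)   # shift of the best match, in pixels
    end
    return disparities
end
```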
As we are going to be doing the same thing for every row of pixels, we will define a function that does the basic logic and then apply that function to every case. We define a distance metric as the squared difference and, as a test case, recreate the cartoon image from above. Now we can try to match pixels in the right image to pixels in the left image. So how did we do? The toy example works! The top line, which moved more pixels, shows up brighter (i.e. larger disparity values), and the lower line is dimmer. So let's move on to real images. We'll start with the example pair from above, but for simplicity we'll stick to grayscale at first, redefining the matching function slightly. So how did we do? Looking at the predicted disparity, we can see there is some vague resemblance to the input image, but we're still pretty far from the target. A significant problem seems to be erroneous matches, especially in the background. As you can imagine, we are only comparing single-channel pixel values, and it's very likely that we might just find a better match by chance. In grayscale we are only matching pixel intensity, and we have no idea whether something is bright green or bright red. So let's try to improve the odds of a good match by adding colour. A slight improvement! There seem to be fewer random matches in the background, but we're still not that close to the desired outcome. Is there more we can do? The obvious downside of the naive approach above is that it only ever looks at one pixel (in each image) at a time. That's not a lot of information, and also not how we intuitively match objects. Look at the image below: can you guess the best match for the pixel in the row of pixels below it? Given only this information, it's impossible to say whether the green pixel matches the pixels at location 3, 5 or 7. If however I were to give you more context, i.e. a block of say 3x3 pixels, would this make things simpler? In this case there is an unambiguous answer, which is the principle behind block matching. To confirm our idea that more context results in better matches, we can take a quick look at a row of pixels. Given the pixel above, where in the row below do you think this pixel matches? You would guess somewhere in the orange part on the left, right? But exactly which pixel is almost impossible to say. If we now take a block with more context and compare it to the row below, the location of the match becomes more obvious. Calculating the difference metric for each point with different block sizes, we can clearly see that for small block sizes the lowest metric value is ambiguous, while for larger block sizes it becomes much clearer where exactly the best match is. And now we are ready to define our block matching algorithm, much in the way we did our pixel matching algorithm, and see how it does on the full image in comparison to the pixel matching. Now we are getting somewhere! Compared to the earlier results we can start making out the depth of the separate objects like the lamp, bust and camera. There are still a few things we could do to improve our simple algorithm (like only accepting matches whose metric score falls below a certain threshold), but I will leave those as an exercise for the reader. Above we went through a basic introduction to stereo vision and disparity, and built a bare-bones block matching algorithm from scratch.
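Since the original listing is also missing here, a block matcher along these lines might look roughly like this (again my own sketch, not the author's code):

```julia
# Block matching on grayscale images (2-D arrays, e.g. Float64), comparing a
# block around each pixel of the right image against the same row of the left.
function block_match(left, right; block=7, max_disp=64)
    h, w = size(right)
    r = block ÷ 2
    disp = zeros(h, w)
    for y in (1 + r):(h - r), x in (1 + r):(w - r)
        patch = @view right[y-r:y+r, x-r:x+r]
        best_d, best_cost = 0, Inf
        for d in 0:min(max_disp, w - r - x)            # candidate shifts to the right
            cand = @view left[y-r:y+r, (x+d-r):(x+d+r)]
            cost = sum(abs2, patch .- cand)            # sum of squared differences
            if cost < best_cost
                best_cost, best_d = cost, d
            end
        end
        disp[y, x] = best_d
    end
    return disp
end
```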
The above is pretty far away from the state of the art, and there are many more advanced methods for calculating disparity, ranging from relatively simple methods like block matching to deep learning methods. Below are some posts/guides I found informative: Introduction to Epipolar Geometry and Stereo Vision; Stereo Vision: Depth Estimation between object and camera; and Depth Map from Stereo Images.

fnands 2 years ago

A first look at Mojo 🔥

The Mojo programming language was officially released in May, but could only be used through some notebooks in a sandbox. Last week the SDK (version 0.2.1) was released, so I decided to give it a look. Mojo's goal is to "combine the usability of Python with the performance of C", and it bills itself as "the programming language for all AI developers". It's clear that Python is the dominant language when it comes to ML/AI, with great libraries like PyTorch and a few others being the main drivers of that. The problem comes with depth: all the fast libraries in Python are written in a performant language, usually C or C++, which means that if you want to dig into the internals of the tools you are using, you have to switch languages, which greatly raises the barrier to entry for doing so. There are other languages that try to go for the usability of Python while retaining performance, and the first language that comes to mind for me in this respect is Julia. Julia is a pretty neat language, and writing math-heavy, fast code in it feels very elegant, while retaining a very Python-like syntax. Julia is about twenty years younger than Python, and to me it seems like they took the best aspects of Python and Fortran and rolled them into one language, allowing you to have performant and elegant code that is Julia all the way down. Given all this, in a vacuum, Julia would seem like the obvious language to choose when it comes to ML/AI programming. The one major downside of Julia is that it doesn't have the robust ecosystem of libraries that Python has, and unless something major changes, it seems that Python will keep winning. Enter Mojo, a language that (aspires to) keep interoperability with Python, while itself being very performant and allowing you to write code that is Mojo all the way down. Basically, if Mojo achieves its goals then we get to have our cake and eat it: we can keep the great ecosystem of packages that Python brings with it, while getting to write new performant code in a single language. My guess is that if this works out, all the major packages will eventually get rewritten in Mojo, but we can have a transition period where we still get to keep the C/C++ versions of them until this can be done. The people behind Mojo (mostly Chris Lattner) seem to know what they are doing, so I wish them all the best. I wanted to start with something basic, so I thought I would have a look at the first puzzle from the 2022 Advent of Code. Basically, you are given a text file with several lists of numbers representing the amount of calories some elves are carrying (go read up on the Advent of Code if you are unfamiliar, it will make sense then), and you have to find which elves are carrying the most calories. So effectively a little bit of file parsing with some basic arithmetic, i.e. a little puzzle to ease into Mojo. I won't share the input because the creator of the AoC has explicitly asked people not to, but you can download your own and try the code below. At first glance, a lot of Python code will "just work". However, it's clear a lot is still missing; lambda functions, for example, don't work yet. This is likely coming, but for now we have to live without them. So for the first step, let's parse some text files. The first thing I found was that Mojo doesn't have a native way to parse text yet. But luckily, you can just get Python to do it for you! In this case, you have to import Python as a module and call the builtin Python open function.
It’s standard practice in Python to open text files with the "with open(...) as f:" incantation, but this doesn't work in Mojo, so you have to open and close files manually. All in all, it's relatively standard Python, with a couple of caveats. One of the big things is that there is a distinction between Python types and Mojo types, i.e. a Python value is not the same as its Mojo counterpart, so if you want to get the most out of Mojo, you need to cast from the one to the other. Right now there seems to be no direct way to do this conversion, so I had to take a detour through an intermediate type. I tried to keep the Python imports in the parsing function, so that the other functions could be in "pure" Mojo. My first impulse was to create a Python-esque list, but the builtin list in Mojo is immutable, so I had to go for a DynamicVector, which had a strong C++ flavour to it. Once that was done, I was finished with Python for this program and could go forth in pure Mojo. Below you can see I declare functions with fn, while above I used def. Both work in Mojo, but fn functions force you to be strongly typed and enforce some memory-safe behaviour. You can see here the values are all declared as mutable (var); you can also declare immutables with let. This is enforced in fn functions. Other than that, it's a relatively standard loop over a container. Again, relatively straightforward. I'm definitely missing Python niceties like being able to easily sum over a container (you can't call sum in Mojo 😢). To put it all together we create a main function, and notice that we need to indicate that it might raise errors, as we are calling the file-parsing code, which can fail. Mojo feels relatively familiar, but I will also say that when writing "pure" Mojo it feels like writing C with Python syntax. This makes sense given the goals of the language, but it caught me a little off guard; I was expecting something a little closer to Julia, which still feels a lot like Python in most cases. This was not the greatest example to show Mojo off, as Mojo really shines in high-performance environments, so the language didn't really get to stretch its legs here. You can find some more performance-oriented examples on the official Mojo website. I will probably give Mojo another look and try something a bit more suited to the language in the future, maybe when a more mature version of the language drops. I think I've been spoiled by mostly writing in two well-supported languages (Python and C++) for which there are countless reference examples and StackOverflow posts on how to do things. Because Mojo is brand new, there are very few examples to look to for how to do even relatively basic things. For now, if you want to get started, I recommend starting with the exercises on mojodojo.dev.
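For comparison with the Julia that keeps coming up in this post, the same calorie-counting logic is only a few lines there. A sketch of my own (not from the post; the input path is hypothetical):

```julia
# Read the puzzle input: groups of numbers separated by blank lines.
# Sum each group and report the largest total.
function max_calories(path)
    groups = split(read(path, String), "\n\n")
    totals = [sum(parse(Int, line) for line in split(strip(g), "\n")) for g in groups]
    return maximum(totals)
end

# max_calories("input.txt")   # hypothetical input file
```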
